Friday, April 15, 2016

Dirty reads - non locking SELECTs and mysqldump

There is a MySQL instance at work which is the bane of my work life. We put all our new clients on it (an architectural decision made long before I joined the company), so it has ended up hosting thousands of tiny sharded databases. Each of those databases has hundreds of tables, even though the tables themselves are small. It is running an old version of MySQL (we are in the process of getting ready to upgrade). I have a number of scripts that collect information from our databases, including data from the information schema. Whenever my scripts or monitoring tools hit this particular instance, things get problematic. Queries against the information schema become extremely slow, with some taking 20-30 minutes whereas they run very fast on other servers. Monitoring tools start to trip over themselves, and soon Nagios starts alerting about slave lag.

The moment I run a query against the information schema on this particular MySQL instance, with its 2000 databases (around 1 million tables on the server), the monitoring team immediately complains about slave lag alerts showing up on their dashboard. My scripts don't need locking, but SELECT statements do implicit locking. To stop this I've tried using dirty reads in the scripts, but it still doesn't help. My guess is that dirty reads don't work on information_schema because its tables are not InnoDB, and transaction isolation levels only affect InnoDB tables.
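As a sketch of how a script can request dirty reads, the mysql client's `--init-command` option runs a statement right after connecting, so the isolation level is set on the same session that runs the query. The hostname, username, and the example information_schema query below are placeholders, not the actual scripts from this post:

```shell
# Set READ UNCOMMITTED on the connection itself, then run the query on
# that same session. Host/credentials are hypothetical placeholders.
mysql -h db1 -u USERNAME -p \
  --init-command="SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED" \
  -e "SELECT table_schema, COUNT(*) FROM information_schema.tables GROUP BY table_schema"
```

The key point is that the SET statement and the SELECT must share one connection; running them as two separate mysql invocations would set the isolation level on a session that is gone by the time the query runs.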

Dirty reads are what I refer to as non-locking SELECTs. Sometimes it is acceptable to SELECT "old" data that is in the process of being changed by another session whose updates have not yet been committed. The data pages that have not been committed yet are called "dirty pages", and non-locking SELECTs are therefore sometimes called "dirty reads". They have the advantage of being faster because they ignore the locking performed by other DML (DELETEs, UPDATEs, INSERTs) and SELECTs.

The disadvantage is that you don't get the most recent data, and might even get partial data, violating the consistency of the data. Dirty reads can be done at the session level or the transaction level. This is how it is done:

Session Level

SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ;

-- run your queries
SELECT column_1,column_2 FROM TABLE_NAME ;

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ ;

Transaction Level

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ;

START TRANSACTION ;

-- run your queries
SELECT column_1,column_2 FROM TABLE_NAME ;

COMMIT ;
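The transaction-level statements above can be run as one batch over a single connection, which matters because SET TRANSACTION (without the SESSION keyword) only applies to the next transaction started on that session. A sketch, with placeholder host, credentials, and table name:

```shell
# All statements go over one connection via a heredoc, so the
# isolation level applies to the transaction that follows it.
mysql -h db1 -u USERNAME -p <<'SQL'
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
SELECT column_1, column_2 FROM table_name;
COMMIT;
SQL
```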


MySQL dump

mysqldump -u'USERNAME' -p -h'db1' --databases information_schema --skip-lock-tables > information_schema_test.sql

mysqldump -h'db1' -u'USERNAME' -p --databases database1 database2 --lock-tables=false > dump_data.sql
