I've been a fan of SolarWinds' 12 Steps to Query Performance Tuning infographic, which they published for Oracle and SQL Server. However, they didn't publish one for MySQL. Using the same 12-step model as their infographic, I've written my own:
12 Steps to MySQL Query Tuning
Step 1 - Examine the Execution Plan
Make sure you are working with tables, not views
Make certain you know the row count
Review the Access Predicates (key_len, ref columns in EXPLAIN PLAN)
- Is all of a composite index being used, or only part of it?
Are there full table scans? Are there full index scans?
Review the Extra column
- Do you see "Using temporary;" or "Using filesort;" (BAD)
- Do you see "Using index"? (GOOD)
Is the possible_keys column NULL?
Is the value in the rows column high? (BAD)
Does the type column show range?
- Does it traverse a large number of rows in the index leaf nodes?
- You want to keep the scanned index range as small as possible
Examine EXPLAIN EXTENDED
Run SHOW WARNINGS immediately after EXPLAIN EXTENDED to review how the SQL will actually be executed after the optimizer has rewritten it
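A minimal sketch of this step; the orders table, its columns, and the customer_id index are hypothetical:

```sql
-- Review type (avoid ALL), possible_keys, key, key_len, ref, rows, Extra
EXPLAIN SELECT order_id, total
FROM   orders
WHERE  customer_id = 42
ORDER  BY created_at;

-- On MySQL 5.6 and earlier; in 5.7+ plain EXPLAIN includes the extended info
EXPLAIN EXTENDED SELECT order_id, total
FROM   orders
WHERE  customer_id = 42;
SHOW WARNINGS;  -- shows the query as rewritten by the optimizer
```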
Step 2 - Review the filters: JOINs, WHERE predicates, ORDER BY, GROUP BY, HAVING, LIMIT
What is the filtered row count?
Is there a LIMIT clause? Is it applied after the entire table is accessed, or does it cut the work short?
Are there functions in the ORDER BY?
Review the data types to avoid implicit conversion
- truthy/falsey filtering in the Predicates (converting columns to Boolean)
- WHERE !TIMESTAMP (ignores index)
Don't wrap your indexes in expressions in the WHERE predicates
- WHERE DATE(TIMESTAMP) = '2010-10-10' (ignores index)
Should the query be using an INNER JOIN vs. LEFT JOIN?
Is the query unnecessarily filtering out duplicates? (UNION vs. UNION ALL)
Are there bind variables? Using prepared statements?
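To make the index-wrapping point concrete, here is a sketch (hypothetical events table with an index on created_at) of rewriting a function-wrapped predicate as a range predicate:

```sql
-- Wrapping the indexed column in a function ignores the index:
SELECT * FROM events WHERE DATE(created_at) = '2010-10-10';

-- Rewritten as a range predicate so an index on created_at can be used:
SELECT * FROM events
WHERE  created_at >= '2010-10-10 00:00:00'
AND    created_at <  '2010-10-11 00:00:00';
```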
Step 3 - Know the Selectivity of the table data
Is the data skewed resulting in queries that cannot effectively use existing indexes?
Make sure to work with the smallest possible logical set
Can additional WHERE predicates be added?
Know when each predicate is applied when dealing with multiple tables; you want the most selective predicate applied as early as possible
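One quick way to estimate selectivity for a column (table and column names hypothetical); values closer to 1 mean more selective, values near 0 mean the index may not help:

```sql
-- Selectivity = distinct values / total rows
SELECT COUNT(DISTINCT status) / COUNT(*) AS selectivity
FROM   orders;
```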
Step 4 - Gather Table Information
Review Table definitions
Have the stats been updated? Get current stats.
Are there triggers on the table? If so, how often do they fire? What are they doing?
What is the storage engine? Does it allow for row level locking vs. table level locking?
Make sure that datatypes match for the columns that are being JOINED between two tables, including the CHARSET and COLLATION
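The usual commands for gathering this information, sketched against a hypothetical orders table:

```sql
SHOW CREATE TABLE orders;        -- definition, storage engine, charset, collation
SHOW TABLE STATUS LIKE 'orders'; -- row estimate, data/index size, engine
SHOW TRIGGERS LIKE 'orders';     -- triggers that fire on this table
ANALYZE TABLE orders;            -- refresh the index statistics
```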
Step 5 - Evaluate Existing Indexes
Look at Index Definitions
Are there redundant indexes?
Are there composite indexes?
Is there a primary key on the table?
Is the primary key an integer or a char?
Is the primary key unnecessarily large, as in the case of a long VARCHAR primary key?
Are there far too many secondary indexes?
Know the Selectivity of the indexes
Is the cardinality for the indexes high/low?
Could a fulltext index be useful?
Is each column that could be JOINED indexed?
Check to see if covering indexes can be created (avoid duplication and overlapping of indexes)
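A quick way to review the existing indexes and their cardinality (the orders table is hypothetical):

```sql
SHOW INDEX FROM orders;  -- one row per index column; check the Cardinality value

-- A redundant index to look for: an index on (customer_id) alone is
-- already covered by any composite index whose leftmost column is
-- customer_id, e.g. (customer_id, created_at).
```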
Step 6 - Analyze Columns in the WHERE clause and SELECT
Look for SELECT * or scalar functions (the more data brought back, the less optimal it may be to use certain functions)
Look for CASE, CAST, CONVERT
Are there subqueries? Can the subquery be converted to a JOIN?
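A sketch of the subquery-to-JOIN rewrite, with hypothetical customers and orders tables:

```sql
-- Dependent subquery:
SELECT c.name
FROM   customers c
WHERE  EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);

-- Rewritten as a join; DISTINCT guards against one row per matching order:
SELECT DISTINCT c.name
FROM   customers c
JOIN   orders o ON o.customer_id = c.id;
```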
Step 7 - Run the Query and record baseline metrics
Gather average execution times
If you are using Master-Master, Master-Slave or some other type of cluster where each instance has the same data, test running the query on each replica to see if the execution time and/or execution plan changes
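One way to capture baseline timings from the mysql client (the query is hypothetical; profiling is deprecated in later versions but still works where available):

```sql
SET profiling = 1;

-- SQL_NO_CACHE keeps the query cache from skewing the measurement
SELECT SQL_NO_CACHE order_id, total
FROM   orders
WHERE  customer_id = 42;

SHOW PROFILES;  -- wall-clock duration per statement in this session
```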
Step 8 - Adjust/Tune the query
Focus on the most expensive operations first (full table scans, index scans, file sort, temporary table)
Look for high cost steps to eliminate
Consider a covering index, which includes every column that satisfies the query
In the case of ORDER BY and GROUP BY, check to see if you can eliminate "Using temporary;" or "Using filesort;" by properly indexing the table
Try to eliminate the frequent counting of rows by utilizing summary tables
Rewrite subqueries as joins
Seeks vs. scans: which is more expensive?
NOTE: Out-of-date statistics can impact performance. Issue "ANALYZE TABLE"
NOTE: Fragmented tables can impact performance. During a planned maintenance window, issue "OPTIMIZE TABLE"
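A sketch of the covering-index fix for "Using filesort" (table, columns, and index name are hypothetical):

```sql
-- With only an index on (customer_id), this query filters via the index
-- but still sorts the matching rows, so Extra shows "Using filesort":
SELECT customer_id, created_at, total
FROM   orders
WHERE  customer_id = 42
ORDER  BY created_at;

-- A composite covering index filters, provides the sort order, and serves
-- every selected column, so Extra can show "Using index" instead:
ALTER TABLE orders
  ADD INDEX idx_cust_created_total (customer_id, created_at, total);
```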
Step 9 - Re-run the query
Record the results and compare
Only make one change at a time
Step 10 - Consider adjusting indexes
Test using index hints to see if the optimizer is choosing poorly
Look to reduce logical I/O
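A sketch of testing an index hint, using a hypothetical index name; compare the plan and timing with and without the hint:

```sql
EXPLAIN SELECT order_id
FROM   orders FORCE INDEX (idx_created_at)
WHERE  created_at >= '2016-01-01';
```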
Step 11 - Incremental changes
Continue to re-run query and record results after each incremental change
Have a peer review the query with you to see if you are missing something
Step 12 - Engineer out the Stupid
Abuse of wildcards?
Use of views?
Scalar Functions in the query?
Can the query be cached? Does it make sense to use the MySQL query cache or a dedicated caching layer? Consider turning off the MySQL query cache, as the effort to check the cache can severely slow down your queries.
Is the query using a cursor in a stored procedure?
Does the server have enough RAM to keep indexes in memory?
Are the data files stored on fast drives?
Are all the tables in your query using the same storage engine?
Do all your tables have Primary Keys, proper secondary indexes?
Is your table normalized/de-normalized as appropriate?
Is your MySQL server so heavily overloaded that queries are performing poorly?
Has the query been tested on large and small data sets?
Is the query being used on a sharded database where different databases have the same tables but possibly very different data sets?
Is the query doing silly things like overusing OR/UNION when an IN clause may be more appropriate?
Are you forcing the query to do a full table scan by putting wildcard characters (%) on the left side of a LIKE pattern against an indexed column?
Is the table definition using the right data types? Abuse of CHAR(255), VARCHAR(255), BIGINT?
Can a complex query with dependent subqueries be split into multiple queries and/or use transactions?
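Two of the points above sketched as SQL (table and column names hypothetical):

```sql
-- A leading wildcard forces a scan; a trailing wildcard can use the index:
SELECT * FROM customers WHERE name LIKE '%son';   -- cannot use an index on name
SELECT * FROM customers WHERE name LIKE 'john%';  -- can use an index on name

-- OR chains on a single column are often clearer and cheaper as IN:
SELECT * FROM orders WHERE status IN ('new', 'paid', 'shipped');
```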
UPDATE April 2016:
SolarWinds finally came out with an infographic for MySQL:
http://www.solarwinds.com/assets/infographics/dpa-mysql-12-steps.aspx