Tuesday, June 14, 2016

Exporting MySQL results from remote instance to local file system

Why does exporting to a CSV locally from a remote MySQL server have to be so hard?

If you are running MySQL on RDS or don't have access to the file system on the remote server you cannot use the recommend approach to create a CSV file. This would be the recommend way...

SELECT order_id,product_name,qty
FROM orders
WHERE foo = 'bar'
INTO OUTFILE '/tmp/orders.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';

I like all the creative ways people on this post came up with to get over this:

http://stackoverflow.com/questions/356578/how-to-output-mysql-query-results-in-csv-format

If you know your data doesn't have commas or tabs then many of these solutions will work just fine. Such as this Example 1. Example 2 seems to account for even having commas or tabs in your data. But extra quotations in the data might be problematic.

Example 1:
mysql --user=wibble --password wobble -B -e "select * from vehicle_categories;" | sed "s/'/\'/;s/\t/\",\"/g;s/^/\"/;s/$/\"/;s/\n//g" > vehicle_categories.csv

Regex Explanation:
  • s/// means substitute what's between the first // with what's between the second //
  • the "g" at the end is a modifier that means "all instance, not just first"
  • ^ (in this context) means beginning of line
  • $ (in this context) means end of line
So, putting it all together:
s/'/\'/          replace ' with \'
s/\t/\",\"/g     replace all \t (tab) with ","
s/^/\"/          at the beginning of the line place a "
s/$/\"/          at the end of the line place a "
s/\n//g          replace all \n (newline) with nothing

Example 2:

select concat_ws(',',
    concat('"', replace(field1, '"', '""'), '"'),
    concat('"', replace(field2, '"', '""'), '"'),
    concat('"', replace(field3, '"', '""'), '"'))

from your_table where etc;
Explanation:
  1. Replace " with "" in each field --> replace(field1, '"', '""')
  2. Surround each result in quotation marks --> concat('"', result1, '"')
  3. Place a comma between each quoted result --> concat_ws(',', quoted1, quoted2, ...)

I liked the Example 2 which I have listed above. I've created a script that dynamically generates that type of SQL and then I execute that dynamically generated SQL.

My script runs three commands like this:

# Generate the header
mysql -uUSERNAME -pPASSWORD -hHOSTNAME  -s -e "(SELECT GROUP_CONCAT(CONCAT('\"',COLUMN_NAME,'\"'),'' '' SEPARATOR ',')
        FROM INFORMATION_SCHEMA.COLUMNS c WHERE TABLE_NAME = 'MY_TABLE_NAME' AND TABLE_SCHEMA = 'MY_DATABASE_NAME');" > my_report.csv

# Create SQL that will be used to get the data
mysql -uUSERNAME -pPASSWORD -hHOSTNAME -s -e "SET SESSION group_concat_max_len = 1000000;
        (SELECT CONCAT('SELECT CONCAT_WS(\',\', ', GROUP_CONCAT('CONCAT(\'\"\', REPLACE(',column_name, ',\'\"\', \'\"\"\'), \'\"\')'), ') FROM MY_TABLE_NAME')
        FROM information_schema.columns c
        WHERE c.table_schema = MY_DATABASE_NAME
        AND c.TABLE_NAME = 'MY_TABLE_NAME');" > dynamically_generated_code.sql

# Run the above SQL to get the data
mysql -uUSERNAME -pPASSWORD -hHOSTNAME  -s MY_DATABASE_NAME < dynamically_generated_code.sql >> my_report.csv


Explanation:

  1. First query puts the header values for the CSV file and writes them to the CSV file
  2. Second query creates the SQL to pull the data so that it will be in comma separated value format.
  3. The third query executes the SQL that was generated from step 2 and pulls the data for the CSV file and appends it to the file created in step one.

     

No comments:

Post a Comment