Wednesday, July 6, 2016

Two ways to convert to utf8

I'm in the middle of a project to converts lots of legacy tables from latin1 to utf8. These are two ways of converting specific tables to utf8. These commands are the same for utf8mb4, just change the CHARACTER SET and COLLATE for utf8mb4.

Option 1:

       

-- Change the schema default to utf8

ALTER DATABASE databaseA CHARACTER SET utf8 COLLATE utf8_unicode_ci;



-- Convert the defaults for the table to utf8

ALTER TABLE tableA CHARACTER SET = utf8, COLLATE = utf8_unicode_ci;



-- Convert specific column(s) to utf8

ALTER TABLE tableA MODIFY value VARCHAR(40) CHARACTER SET utf8 COLLATE utf8_unicode_ci;



       
 

What is good about this option is it will preserve other character sets on the same table. Maybe you need to keep latin1 on a specific column or you want to have a case sensitive column that uses some special collation, doing the change like this will not touch those other columns.

Option 2:

       

-- Change the schema default to utf8

ALTER DATABASE databaseA CHARACTER SET utf8 COLLATE utf8_unicode_ci;



-- Convert the table and all columns in it to utf8

ALTER TABLE tableA CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;


       
 

This option converts all columns to utf8. This is nice because it will catch all columns so you don't have to do each one separately. However, if you wanted to keep a specific column a specific collation then you would want to use Option 1.


For an excellent primer on utf8 and MySQL this is a great source:

http://mysql.rjweb.org/doc.php/charcoll

No comments:

Post a Comment