UTF-8 and Latin1

Convert a Latin1 database to UTF-8


The procedure below will:

  1. convert an existing latin1-encoded MediaWiki MySQL database to utf8; and
  2. remove any table_name_prefix.


shell> mysqldump --default-character-set=latin1 wiki_database |
 sed > wiki_database-utf8.sql \
  -e 's/`table_name_prefix_/`/' \
  -e 's/SET NAMES latin1/SET NAMES utf8/'
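
To see what the two sed expressions do before running them on a real dump, they can be tried on a few sample lines. This is only a sketch: the sample lines are made up to resemble mysqldump output, and convert_dump is a hypothetical helper name. The prefix expression here keeps the opening backtick so that the quoted table name stays balanced.

```shell
# convert_dump: filter stdin to stdout, stripping the table prefix
# and relabelling the connection character set (hypothetical helper).
convert_dump() {
  sed \
    -e 's/`table_name_prefix_/`/' \
    -e 's/SET NAMES latin1/SET NAMES utf8/'
}

# Made-up sample lines in the style of a mysqldump file.
result=$(convert_dump <<'EOF'
/*!40101 SET NAMES latin1 */;
CREATE TABLE `table_name_prefix_user` (
INSERT INTO `table_name_prefix_user` VALUES (1,'Gaston');
EOF
)

printf '%s\n' "$result"
```

Only the first match on each line is replaced, which is enough for the table name at the start of a statement and avoids touching the same string if it happens to occur later in the row data.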

Next, load the new database:

shell> mysqladmin drop wiki_database
shell> mysqladmin create wiki_database
shell> mysql wiki_database < wiki_database-utf8.sql

NB: For a reason I have not investigated, the "SET NAMES utf8" statement causes the load to fail with:

ERROR 1062 (23000) at line 300: Duplicate entry  for key 1

The error occurs in the values for the math table. I simply deleted the offending line from the dump, that is, the INSERT statement for the math table (line 300 in this case).
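
The deletion can also be done with sed rather than an editor. A minimal sketch, assuming the prefix has already been removed so the table is called math, and assuming the rows sit on INSERT lines that start at the beginning of the line, as mysqldump normally writes them. drop_math_rows is a hypothetical helper name and the sample lines are made up.

```shell
# drop_math_rows: copy stdin to stdout, omitting INSERT lines for the math table.
drop_math_rows() {
  sed -e '/^INSERT INTO `math` VALUES/d'
}

# Made-up sample in the style of a mysqldump file.
cleaned=$(drop_math_rows <<'EOF'
LOCK TABLES `math` WRITE;
INSERT INTO `math` VALUES ('x','y');
UNLOCK TABLES;
EOF
)

printf '%s\n' "$cleaned"
```

On the real dump this would be run as a filter, for example drop_math_rows < wiki_database-utf8.sql > some_new_file.sql, where the output file name is up to you.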


This leaves the math table empty. I have found no harm done by this: creating a brand new MediaWiki (1.12.0) also results in an empty math table. Perhaps there is some safe way to fill the table with the appropriate entries.

Now your MediaWiki database is utf8 too, and you can keep all character_set_% variables set to utf8.
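
To check which values the character_set_% variables actually have, they can be listed from the shell. A small sketch, assuming the mysql client can connect with its default credentials. show_charsets is a hypothetical helper name.

```shell
# show_charsets DB: list the character_set_% variables as seen by a new
# client session connected to database DB (hypothetical helper).
show_charsets() {
  mysql "$1" -e "SHOW VARIABLES LIKE 'character_set_%'"
}
```

It would be called as show_charsets wiki_database. The client-side values in the listing depend on the client's own defaults, so they may differ from what MediaWiki's connection sees.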

Testing the conversion

mysql> create database mies character set = latin1;
shell> mysql --default-character-set=latin1 mies < wikidb.sql
mysql> set names latin1;
mysql> SELECT user_name, user_real_name FROM `cheetah_user` WHERE `user_name` LIKE "%Gaston%" ORDER BY `user_name`;
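
A file-level check is also possible. The sed step only relabels the dump as utf8 and does not transcode any bytes, so the converted dump should already decode cleanly as UTF-8. The sketch below uses iconv for that check. is_valid_utf8 is a hypothetical helper name, and the two throwaway files exist only to show a passing and a failing case.

```shell
# is_valid_utf8 FILE: succeed only if FILE decodes cleanly as UTF-8.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Demonstration on two throwaway files (octal escapes keep printf portable).
good=$(mktemp) && printf 'caf\303\251\n' > "$good"   # "café" encoded as UTF-8
bad=$(mktemp)  && printf 'caf\351\n'     > "$bad"    # "café" encoded as Latin1

is_valid_utf8 "$good" && echo "good: valid UTF-8"
is_valid_utf8 "$bad"  || echo "bad: not valid UTF-8"
rm -f "$good" "$bad"
```

On the real dump, is_valid_utf8 wiki_database-utf8.sql failing would suggest the database held genuine Latin1 bytes, in which case relabelling alone would not be enough.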

