We recently upgraded from MySQL 4 to MySQL 5 and we noticed strange characters in some of the varchar and text fields.

step 1: dump out the contents of the bad field in hex
mysql> select hex(myfieldname) from mytablename where myid=’something’;

step 2: run a php program to print out the chars one at a time
function hex2asc($temp) {
$len = strlen($temp);
for ($i=0;$i<$len;$i+=2) {
$data[$i]=chr(hexdec(substr($temp,$i,2)))." ".substr($temp,$i,2);

}
return $data;
}
$str="HEX GOES HERE";
$data=hex2asc($str);
print_r($data);

?>

step 3: once you’ve identified the bad chars and their hex values, change them in the database

here are some bad chars I found:

– fix apostrophe
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’92′),”‘”);

– fix left single quote
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’93′),”‘”);

– fix right single quote
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’94′),”‘”);

– fix bullets
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’95′),”& #8226;”);

– fix double dash
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’96′),”–”);

– fix triple dash
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’97′),”—”);

— fix supscripted TM
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’99′),”<sup>TM</sup>”);

– replace jacked up apostrophe/single quote
– delete EFs
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’EF’),”");
– replace BFs with single quote
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’BF’),”‘”);
– delete BDs
update mytablename set myfieldname=REPLACE(myfieldname,UNHEX(’BD’),”");

see also: http://www.oreillynet.com/onlamp/blog/2006/01/turning_mysql_data_in_latin1_t.html