Decode Utf8

Ever needed to replace all the multibyte characters in a string with their Latin equivalents? Me neither. This handy little function uses a simple string replace to "decode" the UTF-8 characters in a string and returns the string with each of them replaced by its Latin counterpart where available. If no Latin counterpart exists, an approximation of the character is used. Of course, this may be problematic when it comes to Simplified Chinese, but the user can always set the corresponding value to an empty string, or to their own interpretation.

But PHP already has methods for handling UTF-8! This is correct. PHP has iconv(), which can be used with the //TRANSLIT option, and utf8_decode(). Let's see how well they do.

echo iconv("UTF-8", "ISO-8859-1//TRANSLIT", "мúĺţìбýřę śťřïňğ.");

The result will vary depending on the system it is run on (the iconv implementation and the locale's character set), but it is less than favourable. Let's see how utf8_decode() fares.

echo utf8_decode("мúĺţìбýřę śťřïňğ.");

Once again, the results are less than optimal: characters with no ISO-8859-1 equivalent come back as question marks, and how the rest display depends on the system character set. Now let's try with this array for substitution:


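/**
 * Replace accented chars with their latin equivalents.
 *
 * @param string $string The string to convert
 * @return string The corrected string
 */
function decode_utf8($string)
{
    // Multibyte characters to search for (list not shown here).
    $accented = array(
    );
    // Latin replacements, in the same order (list not shown here).
    $replace = array(
    );
    return str_replace($accented, $replace, $string);
}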
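
The listing above leaves out the contents of the two arrays. As a minimal sketch of how the pieces fit together, assume a map that covers only the characters in the example string on this page. The name decode_utf8_sketch and the short lists are illustrative; they are not the original arrays.

```php
<?php
// Illustrative sketch only: the two lists cover just the characters used in
// the example string, not the full set the original function handles.
function decode_utf8_sketch($string)
{
    // Multibyte characters to search for.
    $accented = array('м', 'ú', 'ĺ', 'ţ', 'ì', 'б', 'ý', 'ť', 'ę', 'ś', 'ř', 'ï', 'ň', 'ğ');

    // Latin replacements, position for position with $accented.
    $replace = array('m', 'u', 'l', 't', 'i', 'b', 'y', 't', 'e', 's', 'r', 'i', 'n', 'g');

    // str_replace() compares raw bytes, so this file must be saved as UTF-8
    // for the literals above to match UTF-8 input.
    return str_replace($accented, $replace, $string);
}
```

Because every replacement is plain ASCII and every search term is a complete multibyte character, one substitution cannot accidentally create a match for a later one.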


Example Usage


echo decode_utf8('мúĺţìбýťę śťřïňğ');


multibyte string

This time the result is as expected. The string "multibyte string" is returned and a calm falls upon the earth.

Feel free to add more characters to the arrays to fit your character set needs.
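
One way to make such additions easier might be to keep each character next to its replacement in a single associative array and pass that to strtr(), rather than maintaining two parallel arrays. This swaps str_replace() for strtr(), so it is a variation on the function above and not the original code. The names decode_utf8_with and $extra are hypothetical, and the default map is deliberately tiny.

```php
<?php
// Hypothetical variation: one associative map instead of two parallel arrays.
function decode_utf8_with($string, array $extra = array())
{
    // A deliberately tiny default map; a real one would be much longer.
    $map = array(
        'ú' => 'u',
        'ř' => 'r',
        'ß' => 'ss',
    );

    // Caller-supplied pairs are added to the defaults and win on conflict.
    // Mapping a character to '' removes it from the result.
    $map = array_merge($map, $extra);

    // strtr() with an array replaces each key found in the string by its value.
    return strtr($string, $map);
}
```

For example, decode_utf8_with('a中b', array('中' => '')) drops the Chinese character and returns "ab", which is the "set the corresponding value to an empty string" option mentioned at the top of the page.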