Benjamin Delespierre

Convert accented characters to their ASCII equivalents in PHP

As a French developer, I often come across non-ASCII characters in user input. To generate clean, search-friendly equivalents, I wrote the following function, which removes the accents while leaving the rest of the string intact.

Example

$str = "À l'île, en été, quelle félicité !";

echo accent2ascii($str); // A l'ile, en ete, quelle felicite !

The function

/**
 * Converts accented characters (àéïöû etc.)
 * to their ASCII equivalents (aeiou etc.)
 *
 * @param  string $str
 * @param  string $charset
 * @return string
 */
function accent2ascii(string $str, string $charset = 'utf-8'): string
{
    $str = htmlentities($str, ENT_NOQUOTES, $charset);

    // Keep only the base letter of each accent entity, e.g. '&eacute;' becomes 'e'
    $str = preg_replace('#&([A-Za-z])(?:acute|cedil|caron|circ|grave|orn|ring|slash|th|tilde|uml);#', '\1', $str);
    $str = preg_replace('#&([A-Za-z]{2})(?:lig);#', '\1', $str); // for ligatures, e.g. 'œ'
    $str = preg_replace('#&[^;]+;#', '', $str); // remove any remaining entities (this includes '&amp;', '&lt;' and '&gt;')

    return $str;
}
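To make the three steps easier to follow, here is a minimal trace of the intermediate values, assuming UTF-8 input and PHP's default HTML 4.01 entity table. The variable names and sample strings are illustrative only and are not part of the original function.

```php
<?php

// Step 1: htmlentities() turns accented characters into named HTML entities.
$entities = htmlentities("À l'île", ENT_NOQUOTES, 'UTF-8');
// $entities === "&Agrave; l'&icirc;le"

// Step 2: keep only the base letter of each accent entity.
$ascii = preg_replace(
    '#&([A-Za-z])(?:acute|cedil|caron|circ|grave|orn|ring|slash|th|tilde|uml);#',
    '\1',
    $entities
);
// $ascii === "A l'ile"

// Step 3: ligature entities keep their two letters ('&oelig;' becomes 'oe').
$ligature = preg_replace(
    '#&([A-Za-z]{2})lig;#',
    '\1',
    htmlentities('cœur', ENT_NOQUOTES, 'UTF-8')
);
// $ligature === "coeur"

// Step 4: every remaining entity is dropped, including '&amp;' produced from a plain '&'.
$leftover = preg_replace(
    '#&[^;]+;#',
    '',
    htmlentities('R&D ©', ENT_NOQUOTES, 'UTF-8')
);
// $leftover === "RD "

echo $ascii, PHP_EOL, $ligature, PHP_EOL, $leftover, PHP_EOL;
```

Note the last step: because htmlentities() also encodes `&`, `<` and `>`, those characters disappear from the result. If they must survive, one option is to decode them again before the final cleanup, but that is a design choice the original function does not make.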

Don't forget to leave a like to encourage me to post more useful PHP snippets.

Top comments (2)

Lars Moelleken
<?php

$str = "À l'île, en été, quelle félicité !";

var_dump(transliterator_transliterate('NFKC; [:Nonspacing Mark:] Remove; NFKC; Any-Latin; Latin-ASCII', $str));

(3v4l.org/H3DAb)

If the "intl" PHP extension is not installed, or you need something language-specific, you can also use this package: github.com/voku/portable-ascii
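One way to handle a missing "intl" extension without adding a dependency is to check for the function at runtime and fall back to the entity-based approach from the article. This is only a sketch: `to_ascii` is a hypothetical helper name, and the transliterator ID is the one used in this comment.

```php
<?php

// Hypothetical helper: prefer intl when available, otherwise use the entity-based fallback.
function to_ascii(string $str): string
{
    if (function_exists('transliterator_transliterate')) {
        $result = transliterator_transliterate(
            'NFKC; [:Nonspacing Mark:] Remove; NFKC; Any-Latin; Latin-ASCII',
            $str
        );
        // transliterator_transliterate() returns false on failure.
        if ($result !== false) {
            return $result;
        }
    }

    // Fallback: same steps as the article's accent2ascii(), assuming UTF-8 input.
    $str = htmlentities($str, ENT_NOQUOTES, 'UTF-8');
    $str = preg_replace('#&([A-Za-z])(?:acute|cedil|caron|circ|grave|orn|ring|slash|th|tilde|uml);#', '\1', $str);
    $str = preg_replace('#&([A-Za-z]{2})lig;#', '\1', $str);

    return preg_replace('#&[^;]+;#', '', $str);
}

echo to_ascii('été'), PHP_EOL;
```

The two paths are not strictly equivalent: the fallback drops characters it has no rule for (and `&`, `<`, `>`), whereas the intl transliterator handles a much wider range of scripts, so results can differ on input beyond simple accented Latin letters.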

Benjamin Delespierre Author

Nice job you did there!
