Do you use Laravel? How does the performance of Transliterator compare with str_slug? And do the converted strings come out the same? Thanks!
Here is the test, 10,000 iterations over 2 strings:
And results:
Laravel's str_slug function performs very well, but the result is not the same.
That’s a great question Lito, which you’ve answered yourself :-)
Because the PHP Transliterator is a wrapper for the native ICU lib in C, I’m not surprised it performs a lot worse than Laravel’s native PHP str_slug.
I’ll take a look at Laravel’s implementation tomorrow. 👍🏻 Very curious how they do it.
For me, anything related to performance is always a MUST. I work with a lot of data and I always need an efficient solution for every problem :)
I've taken a look at Laravel's str_slug. It uses voku/helper/ASCII::to_ascii under the hood.
That lib and function use a quite clever in-memory cache at runtime, in which every character is cached in an array:
github.com/voku/portable-ascii/blo...
So subsequent transforms are much faster, because characters that were already converted don't need to be transformed again.
This is of course highly beneficial for performance.
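To make the caching idea a bit more concrete, here is a minimal sketch of a per-character runtime cache. This is not voku's actual code: the class name CharCache, the misses counter and the injected converter callable are all made up for illustration, and the converter can be anything slow, such as a call into the ICU Transliterator.

```php
<?php
// Hypothetical sketch of a per-character in-memory cache (not voku/portable-ascii's real code).

final class CharCache
{
    /** @var array<string, string> converted characters, keyed by the original character */
    private array $cache = [];

    /** Number of times the (slow) converter actually had to run. */
    public int $misses = 0;

    /** @var callable(string): string converts a single character */
    private $convert;

    public function __construct(callable $convert)
    {
        $this->convert = $convert;
    }

    public function toAscii(string $text): string
    {
        // Split into UTF-8 characters; falls back to an empty list on invalid UTF-8.
        $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];

        $out = '';
        foreach ($chars as $char) {
            if (!isset($this->cache[$char])) {
                // First time we see this character: convert it once and remember the result.
                $this->cache[$char] = ($this->convert)($char);
                $this->misses++;
            }
            $out .= $this->cache[$char];
        }

        return $out;
    }
}

// Usage with the ICU Transliterator, assuming the intl extension is loaded.
if (class_exists('Transliterator')) {
    $icu = Transliterator::create('Any-Latin; Latin-ASCII');
    if ($icu !== null) {
        $ascii = new CharCache(fn (string $c): string => (string) $icu->transliterate($c));
        echo $ascii->toAscii('Crème brûlée'), "\n";
        echo $ascii->toAscii('Crème brûlée'), "\n"; // second call is served entirely from the cache
    }
}
```

One caveat of this sketch: converting character by character can give different output than transliterating the whole string at once, because some ICU rules depend on the neighbouring characters. The cache only lives for the duration of the request or process, which matches the "on runtime" behaviour described above.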
The output difference between my slugify() and voku's to_ascii is explained by the fact that the latter takes a locale into account (English by default).
That being said: my "bonus tip" slugify example was never meant to be production code. It's just another example of what the ICU Transliterator can do. Of course there are other libs out there that do the same kind of thing, and they are perhaps better or faster at it because a lot of development goes into them.
I hope you liked my article anyway, even if it's not directly usable for you. 🤞🏻
Oh! caches 😅
Your article is great! And it is perfect, as the subject says, for understanding how UTF-8 and ASCII conversion works.