DEV Community

UTF-8 in MySQL

Paweł bbkr Pabian on September 24, 2023

This series is supposed to be focused on technical aspects of Unicode and I do not plan to analyze UTF support in various technologies. However for...
 
raiph

I can imagine scenarios where one might want to know whether the rules MySQL uses for accent- and case-(in)sensitive handling definitely match the ones used in Rakudo. Have you ever considered or researched that?

Here's my current thinking/guess:

  • You can't realistically know, right?

  • You could read Rakudo's source code, or inspect roast for accent/case comparison tests, but Rakudo doesn't currently support configuring which version of Unicode it supports, so while you can follow your best practice idea for MySQL (sticking to Unicode 9 handling), you're not going to be able to do the same with Rakudo.

Googling turns up nothing about this, but if anyone has any idea about it, it seems it would be you, and here and now seems to be the best place and time to try to get it into a public space that might turn up in future searches.


Typos:

deafult column collation for ordering / groupping

s/deafult/default/
s/groupping/grouping/

 
Paweł bbkr Pabian • Edited

I have encountered this issue many times. For example, I have a case-insensitive column in a database and want to map it to a Raku / Perl Hash so that the column value is the Hash key. The question always remains: will %hash{ %row{ 'column' }.fc } = %row cause data loss? Is Perl / Raku case folding the same as case insensitivity in the database collation?

That is why I always recommend the underrated WEIGHT_STRING function.

SELECT login, WEIGHT_STRING(login) AS login_fc

%hash{ %row{ 'login_fc' } } = %row;

This gives one source of truth for collation behavior.
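
To illustrate why the database-computed key is safer, here is a minimal Python sketch (not Raku, and not tied to any particular driver). The rows and the `group_by` helper are hypothetical, and the `login_fc` bytes are placeholders rather than real collation weights. It assumes an accent- and case-insensitive collation, under which "resume" and "Résumé" compare equal in the database.

```python
# Hypothetical rows, shaped like the result of:
#   SELECT login, WEIGHT_STRING(login) AS login_fc ...
# The login_fc bytes are placeholders, NOT real collation weights.
rows = [
    {"login": "resume", "login_fc": b"\x00\x01"},
    # Assumed equal to "resume" under an accent- and case-insensitive collation.
    {"login": "Résumé", "login_fc": b"\x00\x01"},
    {"login": "raiph", "login_fc": b"\x00\x02"},
]


def group_by(rows, key_of):
    """Group logins by a key function so that collisions stay visible."""
    groups = {}
    for row in rows:
        groups.setdefault(key_of(row), []).append(row["login"])
    return groups


# Client-side case folding: accents survive, so the two spellings stay apart.
by_casefold = group_by(rows, lambda row: row["login"].casefold())

# Database-side key: the client never has to reproduce the collation rules.
by_weight = group_by(rows, lambda row: row["login_fc"])

print(len(by_casefold), len(by_weight))
```

Here client-side folding yields three keys while the database-supplied key yields two, so a lookup keyed on the folded value would disagree with what the database considers the same login.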

Thanks for spotting typos. Fixed.