みなさん、こんにちは。私はGoogleはそれが吸うのであれば、それは本当に私のせいではないように入力テキストに使用しています。私の日本人はせいぜいまばらであるためか、多分それは、ある。しかし、私はそれをWOLTLABに日本語の文字がどのように管理するかをテストすることをお勧めと思いました。

  • ^^^

    Hello everyone. I am using Google translate to type this so if it sucks, it's really not my fault. Or maybe it is, because my Japanese is spotty at best. But I thought it a good idea to test how WOLTLAB manages Japanese characters.

  • utf8_general_ci is faster when it comes to sorting, since it uses a bunch of shortcuts to circumvent costly comparisons with umlauts.

    utf8_unicode_ci follows the Unicode standard, which yields accurate results and provides excellent support for umlauts. This matters because it affects indices too. For example, utf8_general_ci treats a and ä as the same character, which may cause unwanted collisions. Take the word Apfel (English: apple) and its plural Äpfel (English: apples): MySQL would treat both as the same word even though they are anything but the same. utf8_unicode_ci prevents this and is able to tell them apart.

    The performance gain from choosing utf8_general_ci is almost nonexistent in real-world scenarios, and the collation causes trouble with languages containing non-ASCII characters (such as the German umlauts). It might even cause issues when you rely on UNIQUE indices: they should ensure that a specific key appears only once, but in the example above the index would block either the singular or the plural, whichever is inserted second. Tags for threads would suffer from the same issue, because MySQL would not be able to tell them apart and would treat them as the same.

  • I know.

    The question was: which one do you use for WBB?

    (I prefer utf8_unicode_ci, by the way.)
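
For anyone who wants to see the UNIQUE index collision described above without touching a database, here is a minimal Python sketch. It is not MySQL's collation code: `fold_key` and `insert_unique` are hypothetical helpers made up for this illustration, and stripping accents via `unicodedata` only roughly imitates what an accent-insensitive comparison does. Which MySQL collation actually treats a and ä as equal is best checked against the MySQL manual for your server version.

```python
import unicodedata


def fold_key(text: str) -> str:
    """Hypothetical stand-in for an accent- and case-insensitive comparison key.

    Assumption: decomposing to NFD and dropping combining marks is a rough
    imitation of an accent-insensitive collation, not MySQL's real algorithm.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def insert_unique(index: dict, word: str, key_func) -> bool:
    """Mimic a UNIQUE index: reject a word whose comparison key already exists."""
    key = key_func(word)
    if key in index:
        return False  # blocked, the index considers this a duplicate
    index[key] = word
    return True


# Accent-insensitive keys: singular and plural collide.
accent_insensitive = {}
print(insert_unique(accent_insensitive, "Apfel", fold_key))  # True
print(insert_unique(accent_insensitive, "Äpfel", fold_key))  # False

# Case-insensitive but accent-sensitive keys: both words fit.
accent_sensitive = {}
print(insert_unique(accent_sensitive, "Apfel", str.casefold))  # True
print(insert_unique(accent_sensitive, "Äpfel", str.casefold))  # True
```

The second insert in the first index fails for the same reason a tag called Äpfel could not be created once Apfel exists: both reduce to the same key, so the uniqueness check sees a duplicate.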
