ããã¯ããªã«ãããããŠæžãããã®ïŒ
MySQL 8.0ã®Charset utf8mb4ã§äœ¿ããCollationã«ã€ããŠãã¡ãã£ãšèŠãŠãããããªãšæããŸããŠã
å
·äœçã«ã¯ããMySQL培åºå
¥é 第4çãã®ã11.2 Collationãã«æžãããŠããæåæ¯èŒããã³ãœãŒãã«ã€ããŠèªåã§
確èªããŠã¿ãããšæããŸãã
MySQL培åºå ¥é 第4ç MySQL 8.0察å¿
- äœè :yoku0825,åäº æµ,é¶Žé· é®äž,ãšã¿ããŸãã²ã,æ·±çº æ¥åºæµ·,ïšå±± è£å€§,çç³ æŠå€«,å±±ïš ç±ç«
- çºå£²æ¥: 2020/07/06
- ã¡ãã£ã¢: åè¡æ¬ïŒãœããã«ããŒïŒ
utf8mb4ã§ã®CharsetãšCollation
MySQLã®CharsetãšCollationã«é¢ããããã¥ã¡ã³ãã¯ããã¡ãã§ãã
MySQL :: MySQL 8.0 リファレンスマニュアル :: 10 文字セット、照合順序、Unicode
MySQLã§ã¯è€æ°ã®CharsetïŒæåã»ããïŒã䜿ãããšãã§ãããã®ç°å¢ã§äœ¿çšã§ããCharsetã¯ä»¥äžã§ç¢ºèªã§ããŸãã
mysql> show character set;
ãããŠãCharsetã«ã¯CollationïŒç §åé åºïŒããããæåã®æ¯èŒããœãŒãã«é¢ãã£ãŠããããšã«ãªããŸãã
MySQL :: MySQL 8.0 リファレンスマニュアル :: 10.2 MySQL での文字セットと照合順序
䜿çšã§ããCollationã¯ã以äžã§ç¢ºèªã§ããŸãã
mysql> show collation;
Charsetããã³Collationã¯ããµãŒããŒãããŒã¿ããŒã¹ãããŒãã«ãã«ã©ã ãæååãªãã©ã«ããããã§æå®ããããšã
ã§ããŸãã
MySQL :: MySQL 8.0 リファレンスマニュアル :: 10.3 文字セットと照合順序の指定
ã§ãã©ã®CharsetãCollationã䜿ããã§ããâŠã
ãŸãã¯Charsetã
UnicodeããµããŒããããUTF-8ç³»ã®Charsetãéžã¶ããšã«ãªãã§ãããã
ãšããããutf8mb4
ïŒ4ãã€ãã®UTF-8ãšã³ã³ãŒãã£ã³ã°ïŒã§ããã
MySQL :: MySQL 8.0 リファレンスマニュアル :: 10.9 Unicode のサポート
æšä»ã¯ããŸã䜿ããªããããããŸããããcp932
ãeucjpms
ãªã©ããããŸãã
MySQL :: MySQL 8.0 リファレンスマニュアル :: 10.10.7 アジアの文字セット
ä»åã¯ãutf8mb4
ãæ±ãããšã«ããŸãã
ç¶ããŠCollationã
utf8mb4_bin
ãutf8mb4_general_ci
ãutf8mb4_0900_as_ci
ãutf8mb4_ja_0900_as_cs
ãutf8mb4_ja_0900_as_cs_ks
ãªã©ã
ããããã§ããããããã®èªã¿æ¹ã¯ä»¥äžãèŠããšããããŸãã
MySQL :: MySQL 8.0 リファレンスマニュアル :: 10.3.1 照合の命名規則
ãŸããCollationã®ååã¯é¢é£ä»ããããŠããCharsetã§å§ãŸããŸãã
ç §åé åºåã¯ãé¢é£ä»ããããŠããæåã»ããã®ååã§å§ãŸããéåžžã¯ãä»ã®ç §åé åºç¹æ§ã瀺ã 1 ã€ä»¥äžã®æ¥å°ŸèŸãç¶ããŸãã
ja
ãªã©ã®Localeãå«ãŸããå Žåã¯ãèšèªåºæã®Collationã§ããããšãè¡šããŠããŸãã
èšèªåºæã®ç §åã«ã¯ããã±ãŒã«ã³ãŒããŸãã¯èšèªåãå«ãŸããŸãã
ãã®ããšã®ãµãã£ãã¯ã¹ïŒæ¥å°ŸèŸïŒã¯ã以äžã®æå³ã«ãªããŸãã
_ai
⊠ã¢ã¯ã»ã³ããåºå¥ããªãïŒAccent InsensitiveïŒ_as
⊠ã¢ã¯ã»ã³ããåºå¥ããïŒAccent SensitiveïŒ_ci
⊠倧æåã»å°æåãåºå¥ããªãïŒCase InsensitiveïŒ_cs
⊠倧æåã»å°æåãåºå¥ããïŒCase SensitiveïŒ_ks
⊠ã«ããåºå¥ããïŒKana SensitiveïŒ_bin
⊠ãã€ããª
æ¥æ¬èªã«ã€ããŠèšããšãã¢ã¯ã»ã³ãã¯æž
é³æ¿é³åæ¿é³ãããªã¯å¹³ä»®åçä»®åãããããåºå¥ãããã©ãããšãã
話ã«ãªããŸãã
æ°åãå ¥ã£ãŠããå Žåã¯ãUnicode Collation AlgorithmïŒUnicodeç §åã¢ã«ãŽãªãºã ãUCAïŒã®ããŒãžã§ã³ã瀺ããŠããŸãã
utf8mb4_unicode_520_ci
⊠Unicode Collation Algorithm 5.2.0ã«åºã¥ããŠããutf8mb4_ja_0900_as_cs
⊠Unicode Collation Algorithm 9.0.0ã«åºã¥ããŠãã
ã€ãŸããUCAã®ããŒãžã§ã³ãå ¥ã£ãŠãããã®ã«ã€ããŠã¯ãUnicodeèŠæ Œã«æ²¿ã£ãCollationã ãšããããã§ããã
ãããŸã§ã®å
容ãèžãŸãããšãããšãã°utf8mb4_ja_0900_as_cs_ks
ã ãšä»¥äžã®ãããªè§£éã«ãªããŸãã
- æ¥æ¬èªåºæã®Collation
- UCA 9.0.0ã«åºã¥ããŠãã
- ã¢ã¯ã»ã³ããåºå¥ãã
- 倧æåã»å°æåãåºå¥ãã
- ã«ããåºå¥ãã
ãšãããšãUCAã®ããŒãžã§ã³ãå
¥ã£ãŠããªãCollationã¯ã©ããããã®ããšãããšãUCAã®èŠæ Œã«åŸããªãMySQLç¬èªã®
èŠåã®ãã®ã ãšããããšã«ãªããŸãã
_bin
ã«ã€ããŠã¯ãã€ããªãªã®ã§ãutf8mb4_bin
ãutf8mb4_0900_bin
ã¯ãã€ããªæ¯èŒãšãªããŸãã
ãã®2ã€ã®éãã¯ã以äžã«èšèŒããããŸãã
- _bin (ãã€ããª) ç §åé åºãé€ããã¹ãŠã® Unicode ç §åé åºã«ã€ããŠãMySQL ã¯ããŒãã«æ€çŽ¢ãå®è¡ããŠæåç §åé åºãæ€çŽ¢ããŸãã
- utf8mb4_0900_bin 以å€ã®_bin ç §åé åºã®å Žåãéã¿ã¯ã³ãŒããã€ã³ãã«åºã¥ããå è¡ãããŒããã€ããè¿œå ãããå ŽåããããŸãã
- utf8mb4_0900_bin ã®å Žåãéã¿ã¯ utf8mb4 ãšã³ã³ãŒãã£ã³ã°ãã€ãã§ãã ãœãŒãé åºã¯ utf8mb4_bin ã®å Žåãšåãã§ãããã¯ããã«é«éã§ãã
ã³ãŒããã€ã³ãã§ã®æ¯èŒãããã€ãã§ã®æ¯èŒããšããéãã§ããããŸããutf8mb4_bin
ã¯PAD SPACEã
utf8mb4_0900_bin
ã¯NO PADãšããéãããããŸãïŒç
§åãããå±æ§ã«ã€ããŠã¯åŸè¿°ïŒã
ããšã¯ãutf8mb4_general_ci
ãutf8mb4_unicode_ci
ãšãã£ãCollationã«ã€ããŠã
MySQL :: MySQL 8.0 リファレンスマニュアル :: 10.10.1 Unicode 文字セット
ãããã«ã€ããŠã¯ã以äžã«èšèŒããããŸãã
_general_ci ãš unicode_ci ã®ç §åé åº
xxx_general_ci
ã®æ¹ãxxx_unicode_ci
ãããé«éãªããã§ããã粟床ãäœããªããšãã
Unicode æåã»ããã®å Žåãxxx_general_ci ç §åé åºã䜿çšããŠå®è¡ããæŒç®ã¯ãxxx_unicode_ci ç §åé åºã®ãã®ãããé«éã§ãã ããšãã°ãutf8_general_ci ç §åé åºã®æ¯èŒã¯ãutf8_unicode_ci ã®æ¯èŒãããé«éã§ããã粟床ã¯å°ãäœããªããŸããããã¯ãutf8_unicode_ci ã§æ¡åŒµãªã©ã®ãããã³ã°ããµããŒããããŠããããã§ãã
ããã ãšãã¡ãã£ãšããããããªãã§ããã
以äžã«ããããªèšè¿°ããããŸããã
äžè¬ç §å (xxx_general_ci) ã® BMP æåã®å Žåãéã¿ã¯ã³ãŒããã€ã³ãã§ãã
ã€ãŸããBMPã®ç¯å²ã§ã¯ã³ãŒããã€ã³ãã§ã®æ¯èŒãè¡ãããã§ãã
utf8mb4_general_ci
⊠倧æåã»å°æåãåºå¥ããïŒ_ciïŒãä»ã¯BMPã®ã³ãŒããã€ã³ãã«æ²¿ã£ãŠæ¯èŒãè¡ãããã ããBMPã®ç¯å²å€ïŒ
U+10000`以äžã®æåïŒã¯åºå¥ã§ããªãutf8mb4_unicode_ci
⊠Unicodeæ¡åŒµããµããŒããããã®
ããæžããšutf8mb4_unicode_ci
ã®æ¹ãè¯ãããã§ãããå®éã«ã¯ããããå°ã£ãããšã«ãªãã®ã§äœ¿ããªãã§ãããâŠã
ãŸããCollationã«ã¯PADå±æ§ããããNO PAD
ã®ãã®ã¯æååã®æ«å°Ÿã«ããã¹ããŒã¹ãæåãšããŠæ±ãããŸãã
UCA 9.0.0 以äžã«åºã¥ãç §åã¯ã9.0.0 ããåã® UCA ããŒãžã§ã³ã«åºã¥ãç §åããé«éã§ãã ãŸãã9.0.0 ããåã® UCA ããŒãžã§ã³ã«åºã¥ãç §åã§äœ¿çšããã PAD SPACE ãšã¯å¯Ÿç §çã«ãNO PAD ã®ãããå±æ§ããããŸãã éãã€ããªæååãæ¯èŒããããã«ãNO PAD ç §åé åºã§ã¯ãæååã®æ«å°Ÿã®ã¹ããŒã¹ã¯ä»ã®æåãšåæ§ã«æ±ãããŸãã
èŠããã«ãæåŸã®ã¹ããŒã¹ãæ¯èŒã«å«ãããã©ããããšãã話ã§ãã
ãããŠå
ã«å°ãããã¥ã¡ã³ããåºããŠããŸããŸããããCollationã«ãããæåã®éã¿ã¯WEIGHT_STRING
é¢æ°ã䜿ã£ãŠ
調ã¹ãããšãã§ããŸãã
å®éã®æåã®å€å®ãã©ããªã£ãŠããã®ãã確èªããã«ã¯ããã¡ãã䜿ããšããã§ãããã
ããã©ã«ãã®utf8mb4ã®Collation
Charset utf8mb4
ãéžãã å Žåãããã©ã«ãã®Collationã¯utf8mb4_0900_ai_ci
ãšãªãããã§ãã
mysql> show collation where collation like 'utf8mb4%' and `default` = 'Yes'; +--------------------+---------+-----+---------+----------+---------+---------------+ | Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute | +--------------------+---------+-----+---------+----------+---------+---------------+ | utf8mb4_0900_ai_ci | utf8mb4 | 255 | Yes | Yes | 0 | NO PAD | +--------------------+---------+-----+---------+----------+---------+---------------+ 1 row in set (0.00 sec)
ããããå ã¯ãå®éã«åãããªãã確èªããŠã¿ãŸãããã
ç°å¢
ä»åã®ç°å¢ã«ã€ããŠã
MySQL 8.0.24ã䜿ããŸãããã¡ãã¯ã172.17.0.2ã§åäœããŠãããã®ãšããŸãã
ãŸãã確èªã®ããã®æ å ±ã¯ããã°ã©ã ã§äœæããããšã«ããŸããä»åã¯Pythonã䜿ãããšã«ããŸããã
$ python3 -V Python 3.8.5 $ pip3 -V pip 20.0.2 from /path/to/venv/lib/python3.8/site-packages/pip (python 3.8)
MySQLãžã®ã¢ã¯ã»ã¹ã«ã¯ãMySQL Connector/Pythonã䜿ããŸãã
MySQL :: MySQL Connector/Python Developer Guide
$ pip3 install mysql-connector-python==8.0.24
ããã°ã©ã èªäœã¯ãæåŸã«èŒããããšã«ããŸãã
ä»åæ±ãCollation
ä»åæ±ãCollationã¯ããã¡ãã«ããŸãã
mysql> show collation -> where -> collation like 'utf8mb4_ja%' or -> collation like 'utf8mb4_0900%' or -> collation like 'utf8mb4%bin' or -> collation like 'utf8mb4%general%' or -> collation like 'utf8mb4%unicode%'; +--------------------------+---------+-----+---------+----------+---------+---------------+ | Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute | +--------------------------+---------+-----+---------+----------+---------+---------------+ | utf8mb4_0900_ai_ci | utf8mb4 | 255 | Yes | Yes | 0 | NO PAD | | utf8mb4_0900_as_ci | utf8mb4 | 305 | | Yes | 0 | NO PAD | | utf8mb4_0900_as_cs | utf8mb4 | 278 | | Yes | 0 | NO PAD | | utf8mb4_0900_bin | utf8mb4 | 309 | | Yes | 1 | NO PAD | | utf8mb4_bin | utf8mb4 | 46 | | Yes | 1 | PAD SPACE | | utf8mb4_general_ci | utf8mb4 | 45 | | Yes | 1 | PAD SPACE | | utf8mb4_ja_0900_as_cs | utf8mb4 | 303 | | Yes | 0 | NO PAD | | utf8mb4_ja_0900_as_cs_ks | utf8mb4 | 304 | | Yes | 24 | NO PAD | | utf8mb4_unicode_520_ci | utf8mb4 | 246 | | Yes | 8 | PAD SPACE | | utf8mb4_unicode_ci | utf8mb4 | 224 | | Yes | 8 | PAD SPACE | +--------------------------+---------+-----+---------+----------+---------+---------------+ 10 rows in set (0.01 sec)
å°ãæ¯èŒããŠã¿ã
ãããŸã§ã®èª¬æãèžãŸããŠãäžéšã®Collationãå°ã確èªããŠã¿ãŸãããã
utf8mb4_0900_ai_ci
ãã¢ã¯ã»ã³ãåºå¥ãªãã倧æåã»å°æååºå¥ãªããã§ããããŸããNO PADã§ããããŸãã
mysql> set names utf8mb4 collate utf8mb4_0900_ai_ci; Query OK, 0 rows affected (0.00 sec) mysql> select 'a' = 'A'; +-----------+ | 'a' = 'A' | +-----------+ | 1 | +-----------+ 1 row in set (0.00 sec) mysql> select 'ã' = 'ã'; +---------------+ | 'ã' = 'ã' | +---------------+ | 1 | +---------------+ 1 row in set (0.00 sec) mysql> select 'ã¯' = 'ã°'; +---------------+ | 'ã¯' = 'ã°' | +---------------+ | 1 | +---------------+ 1 row in set (0.00 sec) mysql> select 'a ' = 'A'; +-------------+ | 'a ' = 'A' | +-------------+ | 0 | +-------------+ 1 row in set (0.00 sec) mysql> select 'ð£' = 'ðº'; +-----------+ | '?' = '?' | +-----------+ | 0 | +-----------+ 1 row in set (0.00 sec)
çµæã1
ãªã®ã¯ãTrue
ãè¡šããŠããŸãã
倧æåãå°æåãåºå¥ããããããããšãããããã¯ããšãã°ããåºå¥ããããæ«å°Ÿã®ã¹ããŒã¹ã¯åºå¥ãããŸããã
ð£ãšðºãåºå¥ãããŸãããã
ä»åºŠã¯ãutf8mb4_general_ci
ã«ããŠã¿ãŸãããã
mysql> set names utf8mb4 collate utf8mb4_general_ci; Query OK, 0 rows affected (0.00 sec) mysql> select 'a' = 'A'; +-----------+ | 'a' = 'A' | +-----------+ | 1 | +-----------+ 1 row in set (0.00 sec) mysql> select 'ã' = 'ã'; +---------------+ | 'ã' = 'ã' | +---------------+ | 0 | +---------------+ 1 row in set (0.00 sec) mysql> select 'ã¯' = 'ã°'; +---------------+ | 'ã¯' = 'ã°' | +---------------+ | 0 | +---------------+ 1 row in set (0.00 sec) mysql> select 'a ' = 'A'; +-------------+ | 'a ' = 'A' | +-------------+ | 1 | +-------------+ 1 row in set (0.00 sec) mysql> select 'ð£' = 'ðº'; +-----------+ | '?' = '?' | +-----------+ | 1 | +-----------+ 1 row in set (0.01 sec)
倧æåãå°æåã¯åºå¥ãããŸãããããããšãããããã¯ããšãã°ãã¯åºå¥ãããŸããã
æ«å°Ÿã®ã¹ããŒã¹ã¯åºå¥ãããŠããŸãããð£ãšðºã¯åãæåã«ãªã£ãŠããŸããŸããã
ããã²ãšã€ãutf8mb4_0900_as_cs
ãã¢ã¯ã»ã³ãã倧æåã»å°æåãåºå¥ãããã®ã«ã
mysql> set names utf8mb4 collate utf8mb4_0900_as_cs; Query OK, 0 rows affected (0.00 sec) mysql> select 'a' = 'A'; +-----------+ | 'a' = 'A' | +-----------+ | 0 | +-----------+ 1 row in set (0.00 sec) mysql> select 'ã' = 'ã'; +---------------+ | 'ã' = 'ã' | +---------------+ | 0 | +---------------+ 1 row in set (0.00 sec) mysql> select 'ã¯' = 'ã°'; +---------------+ | 'ã¯' = 'ã°' | +---------------+ | 0 | +---------------+ 1 row in set (0.00 sec) mysql> select 'a ' = 'A'; +-------------+ | 'a ' = 'A' | +-------------+ | 0 | +-------------+ 1 row in set (0.00 sec) mysql> select 'ð£' = 'ðº'; +-----------+ | '?' = '?' | +-----------+ | 0 | +-----------+ 1 row in set (0.00 sec)
ããšã¯ãweight_string
é¢æ°ã§éã¿ãèŠãŠã¿ãŸãããã
mysql> set names utf8mb4 collate utf8mb4_general_ci; Query OK, 0 rows affected (0.00 sec) mysql> select 'a', hex('a'), hex(weight_string('a')); +---+----------+-------------------------+ | a | hex('a') | hex(weight_string('a')) | +---+----------+-------------------------+ | a | 61 | 0041 | +---+----------+-------------------------+ 1 row in set (0.00 sec) mysql> select 'A', hex('A'), hex(weight_string('A')); +---+----------+-------------------------+ | A | hex('A') | hex(weight_string('A')) | +---+----------+-------------------------+ | A | 41 | 0041 | +---+----------+-------------------------+ 1 row in set (0.00 sec) mysql> select 'ã', hex('ã'), hex(weight_string('ã')); +-----+------------+---------------------------+ | ã | hex('ã') | hex(weight_string('ã')) | +-----+------------+---------------------------+ | ã | E38182 | 3042 | +-----+------------+---------------------------+ 1 row in set (0.00 sec) mysql> select 'ã', hex('ã'), hex(weight_string('ã')); +-----+------------+---------------------------+ | ã | hex('ã') | hex(weight_string('ã')) | +-----+------------+---------------------------+ | ã | E38181 | 3041 | +-----+------------+---------------------------+ 1 row in set (0.00 sec) mysql> select 'ð£', hex('ð£'), hex(weight_string('ð£')); +------+----------+-------------------------+ | ? | hex('?') | hex(weight_string('?')) | +------+----------+-------------------------+ | ð£ | F09F8DA3 | FFFD | +------+----------+-------------------------+ 1 row in set (0.00 sec) mysql> select 'ðº', hex('ðº'), hex(weight_string('ðº')); +------+----------+-------------------------+ | ? | hex('?') | hex(weight_string('?')) | +------+----------+-------------------------+ | ðº | F09F8DBA | FFFD | +------+----------+-------------------------+ 1 row in set (0.00 sec)
ããèŠããšãæåãåºå¥ãããããããªãçç±ãããããŸããããŸããutf8mb4_general_ci
ã ãšBMPå€ã®æåã¯\ufffd
ã«
ãªã£ãŠããŸãããã§ãã
ãã£ãšæ¯èŒããŠã¿ã
ã§ã¯ããã£ãšCollationãåºããŠæ¯èŒããŠã¿ãŸãããã
以äžã®Collationã察象ã«ããŸãã
utf8mb4_0900_ai_ci
utf8mb4_0900_as_ci
utf8mb4_0900_as_cs
utf8mb4_ja_0900_as_cs
utf8mb4_ja_0900_as_cs_ks
utf8mb4_bin
utf8mb4_0900_bin
utf8mb4_general_ci
utf8mb4_unicode_ci
utf8mb4_unicode_520_ci
ãããã®Collationã«å¯ŸããŠã以äžã®æåã®çå€æ¯èŒã»å€§å°ãç®åºã¿ãããšæããŸãã
â»ãMySQL培åºå
¥é 第4çãã«æžãããŠãããã什åãé¢ä¿ã®æåã¯æå
ã®ç°å¢ã ãšå
¥åã§ããŸããã§ããâŠ
A
ãša
A
ãšïŒ¡
ïŒå šè§ïŒïŒ¡
ïŒå šè§ïŒãšïœ
ïŒå šè§ïŒã
ãšã
ã
ãšã¢
ã¯
ãšã°
ã°
ãšã±
1
ãšâ
0
ãšã
ïŒæŒ¢æ°åïŒå¹³æ
ãšã»
ð£
ãšðº
ãŸããäžèšã®æåã«å¯ŸããŠweight_string
ã§éã¿ãç®åºããŸãã
çµæã¯ããã¡ãã
æåæ¯èŒã
utf8mb4_unicode_ci
ããã»ãšãã©åºå¥ã§ããŠãªãã§ãããutf8mb4_unicode_520_ci
ã¯ãð£ãšðºã ãåºå¥ã§ããŠããŸããã
è¡šã倧ãããªã£ãã®ã§ç»åã«ããŸããããMarkdownã®ãŸãŸã§ã貌ã£ãŠãããŸãããã
| æ¯èŒ | utf8mb4_0900_ai_ci | utf8mb4_0900_as_ci | utf8mb4_0900_as_cs | utf8mb4_ja_0900_as_cs | utf8mb4_ja_0900_as_cs_ks | utf8mb4_bin | utf8mb4_0900_bin | utf8mb4_general_ci | utf8mb4_unicode_ci | utf8mb4_unicode_520_ci | |:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:| | A = a | â | â | à | à | à | à | à | â | â | â | | A =  | â | â | à | â | â | à | à | à | â | â | |  = ïœ | â | â | à | à | à | à | à | â | â | â | | ã = ã | â | â | à | à | à | à | à | à | â | â | | ã = 㢠| â | â | à | â | à | à | à | à | â | â | | 㯠= ã° | â | à | à | à | à | à | à | à | â | â | | ã° = ã± | â | à | à | à | à | à | à | à | â | â | | 1 = â | â | â | à | à | à | à | à | à | â | â | | 0 = ã | â | â | â | â | â | à | à | à | â | â | | å¹³æ = ã» | â | â | à | à | à | à | à | à | â | â | | ð£ = ðº | à | à | à | à | à | à | à | â | â | à |
倧å°æ¯èŒã
Markdownã§ã
| æ¯èŒ | utf8mb4_0900_ai_ci | utf8mb4_0900_as_ci | utf8mb4_0900_as_cs | utf8mb4_ja_0900_as_cs | utf8mb4_ja_0900_as_cs_ks | utf8mb4_bin | utf8mb4_0900_bin | utf8mb4_general_ci | utf8mb4_unicode_ci | utf8mb4_unicode_520_ci | |:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:| | A comp a | = | = | > | > | > | < | < | = | = | = | | A comp  | = | = | < | = | = | < | < | < | = | = | |  comp ïœ | = | = | > | > | > | < | < | = | = | = | | ã comp ã | = | = | > | > | > | > | > | > | = | = | | ã comp 㢠| = | = | < | = | < | < | < | < | = | = | | 㯠comp ã° | = | < | < | < | < | < | < | < | = | = | | ã° comp ã± | = | < | < | < | < | < | < | < | = | = | | 1 comp â | = | = | < | < | < | < | < | < | = | = | | 0 comp ã | = | = | = | = | = | < | < | < | = | = | | å¹³æ comp ã» | = | = | < | < | < | > | > | > | = | = | | ð£ comp ðº | < | < | < | < | < | < | < | = | = | < |
æåŸã¯ãæåã®éã¿ã
ããã¯ãã¡ãã£ãšèŠããŸãããâŠãMarkdownã§ã
| æåïŒhexïŒ | utf8mb4_0900_ai_ciïŒweightïŒ | utf8mb4_0900_as_ciïŒweightïŒ | utf8mb4_0900_as_csïŒweightïŒ | utf8mb4_ja_0900_as_csïŒweightïŒ | utf8mb4_ja_0900_as_cs_ksïŒweightïŒ | utf8mb4_binïŒweightïŒ | utf8mb4_0900_binïŒweightïŒ | utf8mb4_general_ciïŒweightïŒ | utf8mb4_unicode_ciïŒweightïŒ | utf8mb4_unicode_520_ciïŒweightïŒ | |:----|:----|:----|:----|:----|:----|:----|:----|:----|:----|:----| | A(41) | 1C47 | 1C4700000020 | 1C470000002000000008 | 1C470000002000000008 | 1C470000002000000008 | 000041 | 41 | 0041 | 0E33 | 120F | | a(61) | 1C47 | 1C4700000020 | 1C470000002000000002 | 1C470000002000000002 | 1C470000002000000002 | 000061 | 61 | 0041 | 0E33 | 120F | | (EFBCA1) | 1C47 | 1C4700000020 | 1C470000002000000009 | 1C470000002000000008 | 1C470000002000000008 | 00FF21 | EFBCA1 | FF21 | 0E33 | 120F | | ïœ(EFBD81) | 1C47 | 1C4700000020 | 1C470000002000000003 | 1C470000002000000002 | 1C470000002000000002 | 00FF41 | EFBD81 | FF21 | 0E33 | 120F | | ã(E38182) | 3D5A | 3D5A00000020 | 3D5A000000200000000E | 1FB6000000200000000E | 1FB6000000200000000E00000002 | 003042 | E38182 | 3042 | 1E52 | 2B15 | | ã(E38181) | 3D5A | 3D5A00000020 | 3D5A000000200000000D | 1FB6000000200000000D | 1FB6000000200000000D00000002 | 003041 | E38181 | 3041 | 1E52 | 2B15 | | ã¢(E382A2) | 3D5A | 3D5A00000020 | 3D5A0000002000000011 | 1FB6000000200000000E | 1FB6000000200000000E00000008 | 0030A2 | E382A2 | 30A2 | 1E52 | 2B15 | | ã¯(E381AF) | 3D74 | 3D7400000020 | 3D74000000200000000E | 1FD0000000200000000E | 1FD0000000200000000E00000002 | 00306F | E381AF | 306F | 1E6B | 2B2E | | ã°(E381B0) | 3D74 | 3D74000000200037 | 3D740000002000370000000E0002 | 1FD00000002000370000000E0002 | 1FD00000002000370000000E000200000002 | 003070 | E381B0 | 3070 | 1E6B | 2B2E | | ã±(E381B1) | 3D74 | 3D74000000200038 | 3D740000002000380000000E0002 | 1FD00000002000380000000E0002 | 1FD00000002000380000000E000200000002 | 003071 | E381B1 | 3071 | 1E6B | 2B2E | | 1(31) | 1C3E | 1C3E00000020 | 1C3E0000002000000002 | 1C3E0000002000000002 | 1C3E0000002000000002 | 000031 | 31 | 0031 | 0E2A | 1206 | | â (E291A0) | 1C3E | 1C3E00000020 | 1C3E0000002000000006 | 1C3E0000002000000006 | 1C3E0000002000000006 | 002460 | E291A0 | 2460 | 0E2A | 1206 | | 0(30) | 1C3D | 1C3D00000020 | 1C3D0000002000000002 | 1C3D0000002000000002 | 1C3D0000002000000002 | 000030 | 30 | 0030 | 0E29 | 1205 | | ã(E38087) | 1C3D | 1C3D00000020 | 1C3D0000002000000002 | 1C3D0000002000000002 | 1C3D0000002000000002 | 003007 | E38087 | 3007 | 0E29 | 1205 | | å¹³æ(E5B9B3E68890) | FB40DE73FB40E210 | FB40DE73FB40E210000000200020 | FB40DE73FB40E210000000200020000000020002 | 5E4E5A91000000200020000000020002 | 5E4E5A91000000200020000000020002 | 005E73006210 | E5B9B3E68890 | 5E736210 | FB40DE73FB40E210 | FB40DE73FB40E210 | | ã»(E38DBB) | FB40DE73FB40E210 | FB40DE73FB40E210000000200020 | FB40DE73FB40E2100000002000200000001C001C | FB40DE73FB40E2100000002000200000001C001C | FB40DE73FB40E2100000002000200000001C001C | 00337B | E38DBB | 337B | FB40DE73FB40E210 | FB40DE73FB40E210 | | ð£(F09F8DA3) | 130C | 130C00000020 | 130C0000002000000002 | 130C0000002000000002 | 130C0000002000000002 | 01F363 | F09F8DA3 | FFFD | FFFD | FBC3F363 | | ðº(F09F8DBA) | 1323 | 132300000020 | 13230000002000000002 | 13230000002000000002 | 13230000002000000002 | 01F37A | F09F8DBA | FFFD | FFFD | FBC3F37A |
ã©ã®Collationã䜿ãïŒ
ã©ããªãã§ãããïŒutf8mb4_ja_0900_as_cs_ks
ãéžæããã®ãçŸç¶ã¯ããã®ã§ããããïŒ
ãããã¯ãå²ãåã£ãŠutf8mb4_bin
ãutf8mb4_0900_bin
ãéžã¶ãã§ãããããïŒ
ãªãã±
æåŸã«ãäžèšã®Markdownãäœæããããã°ã©ã ãèŒããŠãããŸãã
å®è¡ãããšãåºåã®äžéšã«Markdownãå«ãŸãããã®ãåŸãããŸãã
$ python3 mysql_collation.py
æ¯èŒããCollationããæåã®çš®é¡ãå€ããŠã¿ããšããããè©Šããã§ãããã
mysql_collation.py
import mysql.connector from mysql.connector import MySQLConnection from mysql.connector.cursor import MySQLCursor connection_configuration: dict = { 'user': 'kazuhira', 'password': 'password', 'host': '172.17.0.2', 'database': 'practice' } collations = [ 'utf8mb4_0900_ai_ci', 'utf8mb4_0900_as_ci', 'utf8mb4_0900_as_cs', 'utf8mb4_ja_0900_as_cs', 'utf8mb4_ja_0900_as_cs_ks', 'utf8mb4_bin', 'utf8mb4_0900_bin', 'utf8mb4_general_ci', 'utf8mb4_unicode_ci', 'utf8mb4_unicode_520_ci' ] comparison_string_pairs = [ ('A', 'a'), ('A', ''), ('', 'ïœ'), ('ã', 'ã'), ('ã', 'ã¢'), ('ã¯', 'ã°'), ('ã°', 'ã±'), ('1', 'â '), ('0', 'ã'), ('å¹³æ', 'ã»'), ('ð£', 'ðº') ] try: with mysql.connector.connect(**connection_configuration) as conn: conn: MySQLConnection = conn with conn.cursor() as cur: cur: MySQLCursor = cur print('=====================================================') print('print utf8mb4 collations.') print('=====================================================') cur.execute(""" show collation where collation like 'utf8mb4_ja%' or collation like 'utf8mb4_0900%' or collation like 'utf8mb4%bin' or collation like 'utf8mb4%general%' or collation like 'utf8mb4%unicode%' """) rows: list = cur.fetchall() sorted_collations = sorted(map(lambda r: r[0], rows), reverse=True) for c in sorted_collations: print(c) print() print('=====================================================') print('print strings equals comparison') print('=====================================================') print('| æ¯èŒ | ' + ' | '.join(collations) + ' |') print('|:----:|:----' + ':|:----'.join(list(map(lambda x: '', collations))) + ':|') for pair in comparison_string_pairs: print(f'| {pair[0]} = {pair[1]} ', end='') for collation in collations: cur.execute("set names utf8mb4 collate %s", (collation,)) cur.execute("select case %s = %s when 1 then 'â' else 'Ã' end", (pair[0], pair[1])) print(f' | {cur.fetchone()[0]}', end='') print(' |') print() print('=====================================================') print('print strings sort comparison') print('=====================================================') print('| æ¯èŒ | ' + ' | '.join(collations) + ' |') print('|:----:|:----' + ':|:----'.join(list(map(lambda x: '', collations))) + ':|') for pair in comparison_string_pairs: print(f'| {pair[0]} comp {pair[1]} ', end='') for collation in collations: cur.execute("set names utf8mb4 collate %s", (collation,)) cur.execute("select %s > %s, %s < %s", (pair[0], pair[1], pair[0], pair[1])) row = cur.fetchone() if row[0] == 0 and row[1] == 0: result = '=' elif row[0] == 1: result = '>' elif row[1] == 1: result = '<' print(f' | {result}', end='') print(' |') print() print('=====================================================') print('print strings weight') print('=====================================================') print('| æåïŒhexïŒ | ' + ' | '.join(map(lambda c: f'{c}ïŒweightïŒ', collations)) + ' |') print('|:----|:----' + '|:----'.join(list(map(lambda x: '', collations))) + '|') characters = [] for pair in comparison_string_pairs: for s in pair: if not s in characters: characters.append(s) for c in characters: cur.execute('select hex(%s)', (c,)) c_hex = cur.fetchone()[0] print(f'| {c}({c_hex})', end='') for collation in collations: cur.execute("set names utf8mb4 collate %s", (collation,)) cur.execute("select hex(weight_string(%s))", (c,)) print(f' | {cur.fetchone()[0]}', end='') print(' |') print() except mysql.connector.Error as err: print(f'VendorError: {err.errno}') print(f'SQLState: {err.sqlstate}') print(f'SQLException: {err.msg}')