CLOVERšŸ€

That was when it all began.

MySQL 8.0恫态LOAD DATA态Parallel Table Import Utilityļ¼ˆMySQL Shellļ¼‰ć€JDBC恧CSVćƒ­ćƒ¼ćƒ‰ć—ć¦ćæ悋

恓悌ćÆ态ćŖć«ć‚’ć—ćŸćć¦ę›øć„ćŸć‚‚ć®ļ¼Ÿ

MySQL恫CSVćƒ­ćƒ¼ćƒ‰ć‚’ć—ć‚ˆć†ćØć—ćŸę™‚ć«ć€LOAD DATA悒ä½æć£ćŸć‚Šć€ćµć¤ć†ć«INSERTꖇ悒ä½æć£ćŸć‚ŠćØć„ćć¤ć‹ę–¹ę³•ćŒ
恂悋ćØę€ć†ć®ć§ć™ćŒć€ć©ć‚Œćć‚‰ć„é•ć†ć‚‚ć®ć ć‚ć†ļ¼ŸćØ恄恆恓ćØ恧試恗恦ćæ悋恓ćØć«ć—ć¾ć—ćŸć€‚

ćŠé”Œ

CSVćƒ•ć‚”ć‚¤ćƒ«ć‚’ē”Øę„ć—ć¦ć€ä»„äø‹ć®ę–¹ę³•ć‚’試恗恦ćæć¾ć™ć€‚

  • LOAD DATAļ¼ˆę­£ē¢ŗ恫ćÆ态LOAD DATA LOCALļ¼‰
  • Parallel Table Import Utilityļ¼ˆMySQL Shellļ¼‰
  • CSV悒čŖ­ćæč¾¼ćæ态INSERTę–‡ć‚’å®Ÿč”Œć™ć‚‹Javaćƒ—ćƒ­ć‚°ćƒ©ćƒ 
    • ć‚·ćƒ³ćƒ—ćƒ«ć«INSERTć‚’å®Ÿč”Œ
    • JDBCć§ć®ćƒćƒƒćƒę›“ę–°
    • rewriteBatchedStatements悒trueć«ć—ć¦ć€ćƒćƒƒćƒę›“ę–°

ćƒ‡ćƒ¼ć‚æ恫ćÆ态Stack Overflowć®ćƒ¦ćƒ¼ć‚¶ćƒ¼ęƒ…å ±ć‚’4ć‚«ćƒ©ćƒ ć®ćæ态50äø‡ä»¶ć ć‘åˆ‡ć‚Šå‡ŗ恗恦CSVćƒ•ć‚”ć‚¤ćƒ«ć«ć—ćŸć‚‚ć®ć‚’ä½æć„ć¾ć™ć€‚

ć‚ć¾ć‚Šćƒ‡ćƒ¼ć‚æ量č‡Ŗ体ćÆå¤§ć—ćŸć“ćØ恌ćŖć„ć®ć§ć™ćŒć€ćć‚“ćŖć«ę°—é•·ć«å¾…ć¦ćŖć‹ć£ćŸć‹ć‚‰ć§ć™ā€¦ć€‚

ęœ€åˆć«ć€ćć‚Œćžć‚Œć®ę–¹ę³•ć«ć¤ć„ć¦ę›øć„ć¦ćŠćć¾ć™ć€‚

LOAD DATA

LOAD DATAćÆć€ćƒ†ć‚­ć‚¹ćƒˆćƒ•ć‚”ć‚¤ćƒ«ć‹ć‚‰MySQLć®ćƒ†ćƒ¼ćƒ–ćƒ«ćøćƒ‡ćƒ¼ć‚æć‚’ćƒ­ćƒ¼ćƒ‰ć™ć‚‹ę–‡ć§ć™ć€‚

SQL ć‚¹ćƒ†ćƒ¼ćƒˆćƒ”ćƒ³ćƒˆ / LOAD DATA ć‚¹ćƒ†ćƒ¼ćƒˆćƒ”ćƒ³ćƒˆ

通åøø恮INSERTꖇ悈悊悂态20å€é€Ÿć„ćć†ć§ć™ć€‚

ćƒ†ć‚­ć‚¹ćƒˆćƒ•ć‚”ć‚¤ćƒ«ć‹ć‚‰ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’ćƒ­ćƒ¼ćƒ‰ć™ć‚‹å “åˆćÆ态LOAD DATA 悒ä½æē”Øć—ć¾ć™ć€‚ 通åøø态恓悌ćÆ INSERT ć‚¹ćƒ†ćƒ¼ćƒˆćƒ”ćƒ³ćƒˆć‚’ä½æē”Øć™ć‚‹å “åˆć‚ˆć‚Šć€20 å€é€ŸććŖć‚Šć¾ć™ć€‚

ęœ€é©åŒ– / INSERT ć‚¹ćƒ†ćƒ¼ćƒˆćƒ”ćƒ³ćƒˆć®ęœ€é©åŒ–

恟恠恗态ć‚Æćƒ©ć‚¤ć‚¢ćƒ³ćƒˆå“ć«ć‚ć‚‹ćƒ•ć‚”ć‚¤ćƒ«ć‚’ćƒ­ćƒ¼ćƒ‰ć™ć‚‹å “åˆļ¼ˆLOAD DATA LOCALļ¼‰ćÆć€ć‚»ć‚­ćƒ„ćƒŖćƒ†ć‚£äøŠć®ę³Øꄏē‚¹ćŒć‚悊态
ćƒ‡ćƒ•ć‚©ćƒ«ćƒˆć§ćÆē„”åŠ¹ć«ćŖć£ć¦ć„ć¾ć™ć€‚ć“ć®ę©Ÿčƒ½ć‚’ä½æć†ćŸć‚ć«ćÆ态ć‚Æćƒ©ć‚¤ć‚¢ćƒ³ćƒˆļ¼ć‚µćƒ¼ćƒćƒ¼äø”ę–¹ć§ęœ‰åŠ¹ć«ć™ć‚‹åæ…č¦ćŒć‚ć‚Šć¾ć™ć€‚

ć‚»ć‚­ćƒ„ćƒŖćƒ†ć‚£ćƒ¼ / LOAD DATA LOCAL ć®ć‚»ć‚­ćƒ„ćƒŖćƒ†ć‚£ćƒ¼äøŠć®č€ƒę…®äŗ‹é …

Parallel Table Import Utility

Parallel Table Import UtilityćÆ态MySQL Shellć®ę©Ÿčƒ½ć§ć™ć€‚

MySQL :: MySQL Shell 8.0 :: 8.4 Parallel Table Import Utility

ć“ć®ę©Ÿčƒ½ćÆć€å…„åŠ›ćØćŖć‚‹ćƒ†ć‚­ć‚¹ćƒˆćƒ•ć‚”ć‚¤ćƒ«ć‚’č§£ęžć€ćƒćƒ£ćƒ³ć‚Æć«åˆ†å‰²ć—ć¦MySQLć«ćƒ‡ćƒ¼ć‚æ悒äø¦åˆ—ć«ćƒ­ćƒ¼ćƒ‰ć™ć‚‹ć“ćØćŒć§ćć¾ć™ć€‚

内éƒØēš„恫ćÆLOAD DATA恌äø¦åˆ—ć§å®Ÿč”Œć•ć‚Œć‚‹ćŸć‚ć€é€šåøø恮LOAD DATAćŒć‚·ćƒ³ć‚°ćƒ«ć‚¹ćƒ¬ćƒƒćƒ‰ć§å‹•ä½œć™ć‚‹ć®ć«åÆ¾ć—ć¦
å¤§å¹…ć«é«˜é€ŸåŒ–ć‚’č¦‹č¾¼ć‚€ć“ćØ恌恧恍悋悈恆恧恙怂

äø¦åˆ—åŗ¦ćÆć€ć‚¹ćƒ¬ćƒƒćƒ‰ę•°ć®ęŒ‡å®šćŠć‚ˆć³ćƒćƒ£ćƒ³ć‚Æ悵悤ć‚ŗć‹ć‚‰å°Žå‡ŗć•ć‚Œć¾ć™ć€‚

min{max{1, threads}, chunks}}

LOAD DATA悈悊悂速恏ćŖ悋恋恩恆恋ćÆ态ä½æē”Øć™ć‚‹ćƒ‡ćƒ¼ć‚æćƒ•ć‚”ć‚¤ćƒ«ć®ć‚µć‚¤ć‚ŗć«ä¾å­˜ć™ć‚‹ćØć“ć‚ć‚‚ć‚ć‚‹ć®ć§ć€č؈ęø¬ć‚悋恮ćæ恧恙恭怂

ćć®ä»–ć€å…„åŠ›ć®å‰å‡¦ē†ćØć—ć¦å¤‰ę›ćŒć§ććŸć‚Šć€č¤‡ę•°ć®ćƒ•ć‚”ć‚¤ćƒ«ć‚’ä½æć£ćŸć‚¤ćƒ³ćƒćƒ¼ćƒˆćŒć§ććŸć‚Šć€åœ§ēø®ć•ć‚ŒćŸćƒ•ć‚”ć‚¤ćƒ«ć‚’
ćƒ­ćƒ¼ćƒ‰ć—ćŸć‚Šć‚‚ć§ćć‚‹ć‚ˆć†ć§ć™ć€‚

rewriteBatchedStatements

MySQL恫ćÆ态INSERTꖇ恮VALUESå„ć«å€¤ć‚’č¤‡ę•°ę›ø恍态1åŗ¦ć®INSERTę–‡ć§č¤‡ę•°ć®č”Œć‚’ē™»éŒ²ć™ć‚‹ę©Ÿčƒ½ćŒć‚ć‚Šć¾ć™ć€‚
ā€»ćƒćƒ«ć‚ÆINSERT

ęœ€é©åŒ– / InnoDB ćƒ†ćƒ¼ćƒ–ćƒ«ć®äø€ę‹¬ćƒ‡ćƒ¼ć‚æćƒ­ćƒ¼ćƒ‰

Connector/J恧ćÆ态rewriteBatchedStatementsćØć„ć†ćƒ—ćƒ­ćƒ‘ćƒ†ć‚£ć‚’true恫恙悋ćØJDBCć®ćƒćƒƒćƒę›“ę–°ć‚’ćƒćƒ«ć‚ÆINSERT恫
ę›øćę›ćˆć¦å®Ÿč”Œć—ć¦ćć‚Œć¾ć™ć€‚

MySQL :: MySQL Connector/J 8.0 Developer Guide :: 6.3.13 Performance Extensions

ēµęžœ

å…ˆć«ēµęžœć ć‘ę›øć„ć¦ćŠććØ态恵恤恆恫INSERTę–‡ć‚’å®Ÿč”Œć™ć‚‹ć‚ˆć‚Šć‚‚ć€LOAD DATA悒ä½æć£ćŸć‚ŠParallel Table Import Utility悒
ä½æć†ę–¹ćŒåœ§å€’ēš„ć«é€Ÿć„ć§ć™ć€‚

LOAD DATAćØParallel Table Import Utility恫恤恄恦ćÆć€ä»Šå›žć®ćƒ‡ćƒ¼ć‚æ量恧ćÆć»ć¼å·®ćŒć‚ć‚Šć¾ć›ć‚“ć§ć—ćŸć€‚
Parallel Table Import UtilityćŒć€č‹„å¹²é€Ÿć„ćć‚‰ć„ć§ć™ć€‚ę‰±ć£ćŸćƒ‡ćƒ¼ć‚æé‡ćŒå°ć•ć‹ć£ćŸć§ć™ć­ā€¦ć€‚

JDBC恧恮INSERTę–‡å®Ÿč”ŒćÆ态rewriteBatchedStatements悒trueć«ć—ćŸćƒćƒ«ć‚ÆINSERTć®ę–¹ćŒć€ä»–ć®ę–¹ę³•ć®å€ä»„äøŠé€Ÿć„ēµęžœć«
ćŖć‚Šć¾ć—ćŸć€‚

ć‚µćƒžćƒŖćÆ态恓悓ćŖꄟ恘恧恙怂

50äø‡č”Œć®CSV恫åÆ¾ć—ć¦ć€‚

  • Parallel Table Import Utility ā€¦ 6ē§’ē؋åŗ¦
  • LOAD DATA ā€¦ 7怜9ē§’ē؋åŗ¦
  • 惐惫ć‚ÆINSERT ā€¦ 30ē§’ē؋åŗ¦
  • 単ē“”ćŖINSERTć€ćŠć‚ˆć³ćƒćƒƒćƒę›“ę–° ā€¦ 80ē§’ē؋åŗ¦

惐惫ć‚ÆINSERTćŠć‚ˆć³ćƒćƒƒćƒę›“ę–°ć®ćƒćƒƒćƒć‚µć‚¤ć‚ŗćÆ态10ć«ć—ć¦ć„ć¾ć™ć€‚

恧ćÆć€å®Ÿéš›ć«ä½œęˆć—ćŸć‚½ćƒ¼ć‚¹ć‚³ćƒ¼ćƒ‰ćŖ恩悒čØ˜č¼‰ć—ć¦ć„ćć¾ć™ć€‚

ē’°å¢ƒ

ä»Šå›žć®ē’°å¢ƒćÆ态恓恔悉恧恙怂

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.23    |
+-----------+
1 row in set (0.01 sec)

MySQLć‚µćƒ¼ćƒćƒ¼ćÆ态172.17.0.2ć§å‹•ä½œć—ć¦ć„ć‚‹ć‚‚ć®ćØć—ć¾ć™ć€‚ćƒćƒ„ćƒ¼ćƒ‹ćƒ³ć‚°ćÆē‰¹ć«ć—ć¦ćŠć‚‰ćšć€ć»ć¼ćƒ‡ćƒ•ć‚©ćƒ«ćƒˆå€¤ć§ć™ć€‚

MySQL Shell怂

$ mysqlsh --version
/path/to/mysqlsh   Ver 8.0.23 for Linux on x86_64 - for MySQL 8.0.23 (MySQL Community Server (GPL))

ć¾ćŸć€Javać«é–¢ć™ć‚‹ē’°å¢ƒęƒ…å ±ćÆ恓恔悉怂

$ java --version
openjdk 11.0.10 2021-01-19
OpenJDK Runtime Environment (build 11.0.10+9-Ubuntu-0ubuntu1.20.04)
OpenJDK 64-Bit Server VM (build 11.0.10+9-Ubuntu-0ubuntu1.20.04, mixed mode, sharing)


$ mvn --version
Apache Maven 3.8.1 (05c21c65bdfed0f71a2f2ada8b84da59348c4c5d)
Maven home: $HOME/.sdkman/candidates/maven/current
Java version: 11.0.10, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
Default locale: ja_JP, platform encoding: UTF-8
OS name: "linux", version: "5.4.0-72-generic", arch: "amd64", family: "unix"

ćƒ†ćƒ¼ćƒ–ćƒ«å®šē¾©ćØćƒ‡ćƒ¼ć‚æä½œęˆ

DDLćÆ态恓悓ćŖꄟ恘恧ē”Øꄏ怂

create table account (
  id int,
  name varchar(50),
  registered datetime,
  about varchar(255),
  primary key(id)
);

ćƒ‡ćƒ¼ć‚æćÆStack Overflowć®ćƒ¦ćƒ¼ć‚¶ćƒ¼ć‚’ä½æć„ć¾ć—ćŸć€‚

$ curl -OL https://archive.org/download/stackexchange/stackoverflow.com-Users.7z
$ 7z x stackoverflow.com-Users.7z

$ ll -h Users.xml
-rw-rw-r-- 1 xxxxx xxxxx 4.4G  3꜈  1 13:31 Users.xml

$ wc -l Users.xml
14080582 Users.xml

äø­čŗ«ćŒXMLćŖ恓ćØćØć€ć”ć‚‡ć£ćØå¤§ćć„ć®ć§ćƒ‡ćƒ¼ć‚æęŠ½å‡ŗćØ件ꕰ悒ēµžć‚Šč¾¼ć‚€ć®ć«ćƒ—ćƒ­ć‚°ćƒ©ćƒ ć‚’ä½œęˆć—ć¦CSVć«å¤‰ę›ć—ć¾ć—ćŸć€‚

$ groovy users_to_csv.groovy Users.xml 500000

$ ll -h users_500000.csv
-rw-rw-r-- 1 xxxxx xxxxx 32M  4꜈ 17 23:00 users_500000.csv

恓恓恧ä½æć£ćŸćƒ—ćƒ­ć‚°ćƒ©ćƒ ćÆć€ęœ€å¾Œć«č¼‰ć›ć¾ć™ć€‚users_500000.csvćØć„ć†ć®ćŒć€ćƒ‡ćƒ¼ć‚æćƒ­ćƒ¼ćƒ‰ć«ä½æ恆CSVćƒ•ć‚”ć‚¤ćƒ«ć§ć™ć€‚

ćƒ­ćƒ¼ćƒ‰ć—ćŸćƒ‡ćƒ¼ć‚æćÆ态ꬔ恮ć‚Æć‚ØćƒŖć®å®Ÿč”Œå‰ć«ćÆTRUNCATE恗恦恄悋悂恮ćØć—ć¾ć™ć€‚

mysql> truncate table account;
Query OK, 0 rows affected (0.21 sec)

恧ćÆć€é€²ć‚ć¦ć„ćć¾ć—ć‚‡ć†ć€‚

LOAD DATA

ć¾ćšćÆLOAD DATAć‹ć‚‰ć‚„ć£ć¦ćæć¾ć™ć€‚ę­£ē¢ŗ恫ćÆ态LOAD DATA LOCAL恧恙恌怂

SQL ć‚¹ćƒ†ćƒ¼ćƒˆćƒ”ćƒ³ćƒˆ / LOAD DATA ć‚¹ćƒ†ćƒ¼ćƒˆćƒ”ćƒ³ćƒˆ

mysqlć‚³ćƒžćƒ³ćƒ‰ć‚’čµ·å‹•ć€‚

$ mysql -umysql_user -p -h172.17.0.2 practice
Enter password: 
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 22
Server version: 8.0.23 MySQL Community Server - GPL

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 

å®Ÿč”Œć—ć‚ˆć†ćØ恙悋ćØ态ć‚Øćƒ©ćƒ¼ć«ćŖć‚Šć¾ć™ć€‚

mysql> load data local infile 'users_500000.csv' into table account fields terminated by ',' optionally enclosed by '"' escaped by '\\';
ERROR 3948 (42000): Loading local data is disabled; this must be enabled on both the client and server sides

ć‚Øćƒ©ćƒ¼ćƒ”ćƒƒć‚»ćƒ¼ć‚ø恮通悊态ć‚Æćƒ©ć‚¤ć‚¢ćƒ³ćƒˆć®ćƒ­ćƒ¼ć‚«ćƒ«ćƒ‡ćƒ¼ć‚æ悒ć‚Æćƒ©ć‚¤ć‚¢ćƒ³ćƒˆćØć‚µćƒ¼ćƒćƒ¼ćć‚Œćžć‚Œć§ęœ‰åŠ¹ć«ć™ć‚‹åæ…č¦ćŒć‚ć‚Šć¾ć™ć€ćØ怂

ć“ć”ć‚‰ć®ćƒ‰ć‚­ćƒ„ćƒ”ćƒ³ćƒˆć«čØ˜č¼‰ć•ć‚Œć¦ć„ć‚‹å†…å®¹ć§ć™ć€‚

ć‚»ć‚­ćƒ„ćƒŖćƒ†ć‚£ćƒ¼ / LOAD DATA LOCAL ć®ć‚»ć‚­ćƒ„ćƒŖćƒ†ć‚£ćƒ¼äøŠć®č€ƒę…®äŗ‹é …

ć‚µćƒ¼ćƒćƒ¼å“ćÆ态my.cnf恮mysqldć‚»ć‚Æć‚·ćƒ§ćƒ³ć§local-infile悒čØ­å®šć™ć‚‹ć‹ć€set global local_infile恧čØ­å®šć—ć¾ć™ć€‚

今回ćÆset global local_infileć§ęœ‰åŠ¹ć«ć™ć‚‹ć“ćØć«ć—ć¾ć—ćŸć€‚

mysql> set global local_infile = on;
Query OK, 0 rows affected (0.00 sec)

再åŗ¦å®Ÿč”Œā€¦ć™ć‚‹ćØ态ć‚Øćƒ©ćƒ¼ćƒ”ćƒƒć‚»ćƒ¼ć‚øćŒå¤‰ć‚ć‚Šć¾ć™ć€‚

mysql> load data local infile 'users_500000.csv' into table account fields terminated by ',' optionally enclosed by '"' escaped by '\\';
ERROR 2068 (HY000): LOAD DATA LOCAL INFILE file request rejected due to restrictions on access.

ć‚Æćƒ©ć‚¤ć‚¢ćƒ³ćƒˆå“ć‚‚ć€ć“ć®ę©Ÿčƒ½ć‚’ęœ‰åŠ¹ć«ć™ć‚‹åæ…č¦ćŒć‚ć‚‹ć‹ć‚‰ć§ć™ć€‚

my.cnf恮clientć‚»ć‚Æć‚·ćƒ§ćƒ³ć§local-infile悒čØ­å®šć™ć‚‹ć‹ć€mysqlć‚³ćƒžćƒ³ćƒ‰ć®čµ·å‹•ę™‚ć«--local-infile恧čØ­å®šć—ć¾ć™ć€‚

今回ćÆ态mysqlć‚³ćƒžćƒ³ćƒ‰ć®ć‚Ŗćƒ—ć‚·ćƒ§ćƒ³ć§ęŒ‡å®šć—ć¾ć—ćŸć€‚

$ mysql -umysql_user -p -h172.17.0.2 --local-infile=on practice

ć“ć‚Œć§ćƒ­ćƒ¼ćƒ‰ć—ē›“恙ćØ态今åŗ¦ćÆęˆåŠŸć—ć¾ć—ćŸć€‚

mysql> load data local infile 'users_500000.csv' into table account fields terminated by ',' optionally enclosed by '"' escaped by '\\';
Query OK, 500000 rows affected (7.73 sec)
Records: 500000  Deleted: 0  Skipped: 0  Warnings: 0

Parallel Table Import Utility

ꬔćÆ态Parallel Table Import Utility恧恙怂

MySQL :: MySQL Shell 8.0 :: 8.4 Parallel Table Import Utility

恓恔悉ćÆ态MySQL Shellć®ę©Ÿčƒ½ć«ćŖć‚Šć¾ć™ć€‚

mysqlshć‚³ćƒžćƒ³ćƒ‰ć‹ć‚‰ē›“ęŽ„å®Ÿč”Œć™ć‚‹ę–¹ę³•ćØ态JavaScript悄Pythonć§å®Ÿč”Œć™ć‚‹ę–¹ę³•ćŒć‚ć‚Šć¾ć™ć€‚

ć©ć”ć‚‰ć®ę–¹ę³•ć‚’ä½æć†ć«ć—ć‚ć€å†…éƒØēš„恫ćÆēµå±€LOAD DATA悒ä½æć†ć®ć§ć€ć“ć”ć‚‰ć‚’ęœ‰åŠ¹ć«ć—ć¦ćŠćåæ…č¦ćŒć‚ć‚Šć¾ć™ć€‚

mysql> set global local_infile = on;
Query OK, 0 rows affected (0.00 sec)

ć¾ćšćÆmysqlshć‚³ćƒžćƒ³ćƒ‰ć§å®Ÿč”Œć—ć¦ćæć¾ć—ć‚‡ć†ć€‚

$ mysqlsh mysql_user@172.17.0.2:3306 -- util import-table users_500000.csv --schema=practice --table=account --fieldsTerminatedBy=',' --fieldsEnclosedBy='"' --fieldsOptionallyEnclosed=true --fieldsEscapedBy='\' --bytesPerChunk=10M
Please provide the password for 'mysql_user@172.17.0.2:3306': ********
Importing from file '/path/to/users_500000.csv' to table `practice`.`account` in MySQL Server at 172.17.0.2:3306 using 4 threads
[Worker003] users_500000.csv: Records: 43164  Deleted: 0  Skipped: 0  Warnings: 0
[Worker002] users_500000.csv: Records: 131723  Deleted: 0  Skipped: 0  Warnings: 0
[Worker001] users_500000.csv: Records: 163859  Deleted: 0  Skipped: 0  Warnings: 0
[Worker000] users_500000.csv: Records: 161254  Deleted: 0  Skipped: 0  Warnings: 0
100% (32.62 MB / 32.62 MB), 5.31 MB/s
File '/path/to/users_500000.csv' (32.62 MB) was imported in 6.1474 sec at 5.31 MB/s
Total rows affected in practice.account: Records: 500000  Deleted: 0  Skipped: 0  Warnings: 0

fields怜ć‚Ŗćƒ—ć‚·ćƒ§ćƒ³ć®ę„å‘³ćÆ态LOAD DATAćØåŒć˜ć§ć™ć€‚schemać§ćƒ‡ćƒ¼ć‚æē™»éŒ²å…ˆć®ćƒ‡ćƒ¼ć‚æćƒ™ćƒ¼ć‚¹åć€tableć§ćƒ†ćƒ¼ćƒ–ćƒ«åć‚’
ęŒ‡å®šć—ć¾ć™ć€‚

ćƒ‰ć‚­ćƒ„ćƒ”ćƒ³ćƒˆć‚’č¦‹ć‚‹ćØć€č¤‡ę•°ć®ćƒ•ć‚”ć‚¤ćƒ«ć‚’äø€ę°—ć«ęŒ‡å®šć™ć‚‹ć“ćØ悂恧恍悋悈恆恧恙怂

ä½•å›žć‹č©¦ć—ć¦ćæć¾ć—ćŸćŒć€LOAD DATAć‚ˆć‚Šć‚‚ę°—ęŒć”é€Ÿć„ć§ć™ćŒć€ćć“ć¾ć§å·®ćÆć‚ć‚Šć¾ć›ć‚“ć€‚åÆ¾č±”ć®ćƒ•ć‚”ć‚¤ćƒ«ć‚µć‚¤ć‚ŗ恌
å°ć•ć„ć‚ˆć†ć§ć™ć­ć€‚

ćŖćŠć€ä»Šå›žćÆ4ć¤ć®ć‚¹ćƒ¬ćƒƒćƒ‰ćŒä½æć‚ć‚Œć¦ć„ć¾ć—ćŸć€‚

using 4 threads

ć‚¹ćƒ¬ćƒƒćƒ‰ę•°ć‚’ę±ŗå®šć™ć‚‹å¤‰ę•°ćÆ2ć¤ć‚ć‚Šć€ć²ćØ恤ćÆä»Šå›žęŒ‡å®šć—ćŸbytesPerChunkć€ć‚‚ć†ć²ćØ恤ćÆthreads恧恙怂

bytesPerChunkć®ćƒ‡ćƒ•ć‚©ćƒ«ćƒˆå€¤ćÆ50M恧态threadsć®ćƒ‡ćƒ•ć‚©ćƒ«ćƒˆå€¤ćÆ8恧恙怂

ćƒ‰ć‚­ćƒ„ćƒ”ćƒ³ćƒˆć®č؈ē®—å¼ć«ć‚ˆć‚‹ćØ态1ćØthreadsć®å¤§ćć„ę–¹ćØć€ćƒćƒ£ćƒ³ć‚Æ悵悤ć‚ŗć®ć†ć”ć€å°ć•ć„ę–¹ćŒéøęŠžć•ć‚Œć¾ć™ć€‚

min{max{1, threads}, chunks}}

今回ćÆćƒćƒ£ćƒ³ć‚Æ悵悤ć‚ŗ悒10Mć«ć—ć¦ć€ćƒ•ć‚”ć‚¤ćƒ«ć‚µć‚¤ć‚ŗ恌32Mć ć£ćŸć®ć§4恫ćŖ悊态threadsć®ćƒ‡ćƒ•ć‚©ćƒ«ćƒˆå€¤8ć‚ˆć‚Šć‚‚å°ć•ć„ć®ć§
4ć‚¹ćƒ¬ćƒƒćƒ‰ć«ćŖć£ćŸćØ恄恆恓ćØ恧恙恭怂

$ ll -h users_500000.csv
-rw-rw-r-- 1 xxxxx xxxxx 32M  4꜈ 17 23:00 users_500000.csv

ćØ恓悍恧态mysql_user:password@172.17.0.2:3306ćæ恟恄ćŖčؘčæ°ć«ć™ć‚‹ćØć€ćƒ‘ć‚¹ćƒÆćƒ¼ćƒ‰ć‚‚ęŒ‡å®šć—ć¦å®Ÿč”Œć§ćć‚‹ć‚ˆć†ć§ć™ć€‚

JavaScriptć§ć‚‚å®Ÿč”Œć—ć¦ćæć¾ć—ć‚‡ć†ć€‚ć¾ćšćÆmysqlshć‚³ćƒžćƒ³ćƒ‰ć§ćƒ­ć‚°ć‚¤ćƒ³ć€‚

$ mysqlsh mysql_user@172.17.0.2:3306
Please provide the password for 'mysql_user@172.17.0.2:3306': ********
MySQL Shell 8.0.23

Copyright (c) 2016, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
Creating a session to 'mysql_user@172.17.0.2:3306'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 59
Server version: 8.0.23 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
 MySQL  172.17.0.2:3306 ssl  JS > 

å®Ÿč”Œć€‚

 MySQL  172.17.0.2:3306 ssl  JS > util.importTable('users_500000.csv', {schema: 'practice', table: 'account', fieldsTerminatedBy: ',', fieldsEnclosedBy: '"', fieldsOptionallyEnclosed: true, fieldsEscapedBy: '\\', bytesPerChunk: '10M'})
Importing from file '/path/to/users_500000.csv' to table `practice`.`account` in MySQL Server at 172.17.0.2:3306 using 4 threads
[Worker002] users_500000.csv: Records: 43164  Deleted: 0  Skipped: 0  Warnings: 0
[Worker003] users_500000.csv: Records: 131723  Deleted: 0  Skipped: 0  Warnings: 0
[Worker001] users_500000.csv: Records: 161254  Deleted: 0  Skipped: 0  Warnings: 0
[Worker000] users_500000.csv: Records: 163859  Deleted: 0  Skipped: 0  Warnings: 0
100% (32.62 MB / 32.62 MB), 5.89 MB/s
File ''/path/to/users_500000.csv' (32.62 MB) was imported in 5.5399 sec at 5.89 MB/s
Total rows affected in practice.account: Records: 500000  Deleted: 0  Skipped: 0  Warnings: 0

ć‚Ŗćƒ—ć‚·ćƒ§ćƒ³ćŖć©ć®ę„å‘³ćÆć€å…ˆć»ć©ćØå¤‰ć‚ć‚‰ćŖć„ć®ć§å‰²ę„›ć€‚

JDBC

恓恓恋悉ćÆ态Javać§ćƒ—ćƒ­ć‚°ćƒ©ćƒ ć‚’ę›øć„ć¦ć„ćć¾ć™ć€‚

Maven依存関äæ‚ćØć€ćƒ—ćƒ©ć‚°ć‚¤ćƒ³ć®čØ­å®šć€‚

    <dependencies>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.23</version>
        </dependency>

        <dependency>
            <groupId>com.opencsv</groupId>
            <artifactId>opencsv</artifactId>
            <version>5.4</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.4</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <transformers>
                                <transformer
                                        implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>org.littlewings.mysql.Importer</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

å®Ÿč”ŒåÆčƒ½JARćƒ•ć‚”ć‚¤ćƒ«ć‚’ä½œęˆć—ć¾ć™ć€‚

ćƒ—ćƒ­ć‚°ćƒ©ćƒ ć®é››å½¢ćÆ态恓悓ćŖꄟ恘恧恙怂

src/main/java/org/littlewings/mysql/Importer.java

package org.littlewings.mysql;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Properties;

import com.opencsv.CSVReader;
import com.opencsv.exceptions.CsvValidationException;

public class Importer {
    String mode;
    String filePath;

    public static void main(String... args) throws SQLException, IOException, CsvValidationException {
        Importer importer = new Importer(args[0], args[1]);
        importer.execute();
    }

    public Importer(String mode, String filePath) {
        this.mode = mode;
        this.filePath = filePath;
    }

    public void execute() throws SQLException, IOException, CsvValidationException {
        switch (mode) {
            case "simple":
                simpleInsert();
                break;
            case "batch":
                batchInsert();
                break;
            case "batch-rewrite":
                batchRewriteInsert();
                break;
            default:
                System.out.printf("unknown mode [%s]%n", mode);
                break;
        }
    }

    怜ēœē•„怜

    private void log(String format, Object... args) {
        System.out.printf("[%s] %s%n", LocalDateTime.now(), String.format(format, args));
    }
}

å®Ÿč”Œć™ć‚‹ēØ®é”žć«ć‚ˆć£ć¦ć€simple态batch态batch-rewrite恮3恤悒ē”Øę„ć—ć¾ć—ćŸć€‚

ćƒ‘ćƒƒć‚±ćƒ¼ć‚øćƒ³ć‚°ć—ćŸć‚‰

$ mvn package

恓悓ćŖę„Ÿć˜ć§å®Ÿč”Œć—ć¾ć™ć€‚

$ java -jar target/mysql-import-0.0.1-SNAPSHOT.jar [ēخ锞] users_500000.csv

恧ćÆ态恝悌恞悌ēØ®é”žć”ćØ恫ę›øć„ć¦ć„ćć¾ć™ć€‚

単ē“”恫INSERTę–‡ć‚’å®Ÿč”Œć™ć‚‹

単ē“”恫INSERTę–‡ć‚’å®Ÿč”Œć™ć‚‹ćƒ‘ć‚æćƒ¼ćƒ³ć®ć‚³ćƒ¼ćƒ‰ćÆ态恓恔悉怂

    private void simpleInsert() throws SQLException, IOException, CsvValidationException {
        Properties properties = new Properties();
        properties.setProperty("user", "mysql_user");
        properties.setProperty("password", "password");

        int commitThreshold = 10000;
        int count = 0;

        long startTime = System.currentTimeMillis();
        long commitStartTime = System.currentTimeMillis();

        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

        try (CSVReader csvReader = new CSVReader(Files.newBufferedReader(Paths.get(filePath), StandardCharsets.UTF_8));
             Connection conn = DriverManager.getConnection("jdbc:mysql://172.17.0.2:3306/practice", properties);
             PreparedStatement statement = conn.prepareStatement("insert into account(id, name, registered, about) values(?, ?, ?, ?)")) {
            conn.setAutoCommit(false);

            String[] columns;

            while ((columns = csvReader.readNext()) != null) {
                count++;

                statement.setInt(1, Integer.parseInt(columns[0]));
                statement.setString(2, columns[1]);
                statement.setTimestamp(3, Timestamp.valueOf(LocalDateTime.parse(columns[2], formatter)));
                statement.setString(4, columns[3]);

                statement.executeUpdate();

                if (count % commitThreshold == 0) {
                    long committedTime = System.currentTimeMillis() - commitStartTime;
                    log("commit, %d records, lap time = %f sec...", count, committedTime * 1.0 / 1000);
                    conn.commit();

                    commitStartTime = System.currentTimeMillis();
                }
            }

            conn.commit();

            long elapsedTime = System.currentTimeMillis() - startTime;
            log("%d records inserted, elapsed time = %f sec", count, elapsedTime * 1.0 / 1000);
        }
    }

CSVćƒ•ć‚”ć‚¤ćƒ«ć‚’1č”Œćšć¤čŖ­ć‚“恧INSERT恗态10,000件恔ćØć«ć‚³ćƒŸćƒƒćƒˆć™ć‚‹ć‚ˆć†ć«ć—ć¾ć—ćŸć€‚

å®Ÿč”Œć®ę§˜å­ćÆ态恓恔悉怂

$ java -jar target/mysql-import-0.0.1-SNAPSHOT.jar simple users_500000.csv
[2021-04-18T02:08:52.365205] commit, 10000 records, lap time = 3.998000 sec...
[2021-04-18T02:08:54.063781] commit, 20000 records, lap time = 1.643000 sec...
[2021-04-18T02:08:55.629334] commit, 30000 records, lap time = 1.521000 sec...
[2021-04-18T02:08:57.112967] commit, 40000 records, lap time = 1.431000 sec...
[2021-04-18T02:08:58.513753] commit, 50000 records, lap time = 1.357000 sec...
[2021-04-18T02:09:00.039572] commit, 60000 records, lap time = 1.455000 sec...
[2021-04-18T02:09:01.461116] commit, 70000 records, lap time = 1.380000 sec...
[2021-04-18T02:09:02.883278] commit, 80000 records, lap time = 1.379000 sec...
[2021-04-18T02:09:04.375070] commit, 90000 records, lap time = 1.431000 sec...
[2021-04-18T02:09:05.878608] commit, 100000 records, lap time = 1.464000 sec...

怜ēœē•„怜

[2021-04-18T02:09:50.253484] commit, 400000 records, lap time = 1.357000 sec...
[2021-04-18T02:09:51.730666] commit, 410000 records, lap time = 1.402000 sec...
[2021-04-18T02:09:53.159438] commit, 420000 records, lap time = 1.367000 sec...
[2021-04-18T02:09:54.605684] commit, 430000 records, lap time = 1.394000 sec...
[2021-04-18T02:09:56.128087] commit, 440000 records, lap time = 1.459000 sec...
[2021-04-18T02:09:57.562830] commit, 450000 records, lap time = 1.381000 sec...
[2021-04-18T02:09:59.001128] commit, 460000 records, lap time = 1.368000 sec...
[2021-04-18T02:10:00.460900] commit, 470000 records, lap time = 1.386000 sec...
[2021-04-18T02:10:01.823628] commit, 480000 records, lap time = 1.308000 sec...
[2021-04-18T02:10:03.394428] commit, 490000 records, lap time = 1.480000 sec...
[2021-04-18T02:10:04.834426] commit, 500000 records, lap time = 1.366000 sec...
[2021-04-18T02:10:04.919687] 500000 records inserted, elapsed time = 76.556000 sec
ćƒćƒƒćƒę›“ę–°

ꬔćÆć€ćƒćƒƒćƒę›“ę–°ć€‚

    private void batchInsert() throws SQLException, IOException, CsvValidationException {
        Properties properties = new Properties();
        properties.setProperty("user", "mysql_user");
        properties.setProperty("password", "password");

        int commitThreshold = 10000;
        int batchThreshold = 10;
        int count = 0;

        long startTime = System.currentTimeMillis();
        long commitStartTime = System.currentTimeMillis();

        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

        try (CSVReader csvReader = new CSVReader(Files.newBufferedReader(Paths.get(filePath), StandardCharsets.UTF_8));
             Connection conn = DriverManager.getConnection("jdbc:mysql://172.17.0.2:3306/practice", properties);
             PreparedStatement statement = conn.prepareStatement("insert into account(id, name, registered, about) values(?, ?, ?, ?)")) {
            conn.setAutoCommit(false);

            String[] columns;

            while ((columns = csvReader.readNext()) != null) {
                count++;

                statement.setInt(1, Integer.parseInt(columns[0]));
                statement.setString(2, columns[1]);
                statement.setTimestamp(3, Timestamp.valueOf(LocalDateTime.parse(columns[2], formatter)));
                statement.setString(4, columns[3]);

                statement.addBatch();

                if (count % batchThreshold == 0) {
                    statement.executeBatch();
                }

                if (count % commitThreshold == 0) {
                    long committedTime = System.currentTimeMillis() - commitStartTime;
                    log("commit, %d records, lap time = %f sec...", count, committedTime * 1.0 / 1000);
                    conn.commit();

                    commitStartTime = System.currentTimeMillis();
                }
            }

            conn.commit();

            long elapsedTime = System.currentTimeMillis() - startTime;
            log("%d records inserted, elapsed time = %f sec", count, elapsedTime * 1.0 / 1000);
        }
    }

å…ˆć»ć©ćØćÆé•ć£ć¦ć€10件恔ćØć«ćƒćƒƒćƒę›“ę–°ć™ć‚‹ć‚ˆć†ć«ć—ć¾ć—ćŸć€‚

                statement.addBatch();

                if (count % batchThreshold == 0) {
                    statement.executeBatch();
                }

å®Ÿč”Œć®ę§˜å­ć€‚

$ java -jar target/mysql-import-0.0.1-SNAPSHOT.jar batch users_500000.csv
[2021-04-18T02:11:18.837457] commit, 10000 records, lap time = 4.386000 sec...
[2021-04-18T02:11:20.909568] commit, 20000 records, lap time = 2.016000 sec...
[2021-04-18T02:11:22.369608] commit, 30000 records, lap time = 1.411000 sec...
[2021-04-18T02:11:23.981233] commit, 40000 records, lap time = 1.564000 sec...
[2021-04-18T02:11:25.655595] commit, 50000 records, lap time = 1.605000 sec...
[2021-04-18T02:11:27.154502] commit, 60000 records, lap time = 1.452000 sec...
[2021-04-18T02:11:28.629899] commit, 70000 records, lap time = 1.435000 sec...
[2021-04-18T02:11:30.353573] commit, 80000 records, lap time = 1.684000 sec...
[2021-04-18T02:11:31.992758] commit, 90000 records, lap time = 1.599000 sec...
[2021-04-18T02:11:33.567422] commit, 100000 records, lap time = 1.529000 sec...

怜ēœē•„怜

[2021-04-18T02:12:17.328760] commit, 400000 records, lap time = 1.351000 sec...
[2021-04-18T02:12:18.712159] commit, 410000 records, lap time = 1.329000 sec...
[2021-04-18T02:12:20.172459] commit, 420000 records, lap time = 1.415000 sec...
[2021-04-18T02:12:21.583530] commit, 430000 records, lap time = 1.346000 sec...
[2021-04-18T02:12:23.183495] commit, 440000 records, lap time = 1.525000 sec...
[2021-04-18T02:12:24.645953] commit, 450000 records, lap time = 1.409000 sec...
[2021-04-18T02:12:26.059431] commit, 460000 records, lap time = 1.355000 sec...
[2021-04-18T02:12:27.586332] commit, 470000 records, lap time = 1.427000 sec...
[2021-04-18T02:12:29.061952] commit, 480000 records, lap time = 1.403000 sec...
[2021-04-18T02:12:30.537175] commit, 490000 records, lap time = 1.425000 sec...
[2021-04-18T02:12:31.900668] commit, 500000 records, lap time = 1.290000 sec...
[2021-04-18T02:12:31.957996] 500000 records inserted, elapsed time = 77.511000 sec

単ē“”ćŖINSERTę–‡ć‚’å®Ÿč”Œć™ć‚‹ę–¹ćØć€ć»ć¼å·®ćŒć‚ć‚Šć¾ć›ć‚“ā€¦ć€‚

rewriteBatchedStatements悒ä½æć£ćŸćƒćƒ«ć‚ÆINSERT

ęœ€å¾ŒćÆ态rewriteBatchedStatements悒ä½æć£ćŸćƒćƒ«ć‚ÆINSERT恧恙怂

    private void batchRewriteInsert() throws SQLException, IOException, CsvValidationException {
        Properties properties = new Properties();
        properties.setProperty("user", "mysql_user");
        properties.setProperty("password", "password");
        properties.setProperty("rewriteBatchedStatements", "true");
        // properties.setProperty("profileSQL", "true");

        int commitThreshold = 10000;
        int batchThreshold = 10;
        int count = 0;

        long startTime = System.currentTimeMillis();
        long commitStartTime = System.currentTimeMillis();

        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

        try (CSVReader csvReader = new CSVReader(Files.newBufferedReader(Paths.get(filePath), StandardCharsets.UTF_8));
             Connection conn = DriverManager.getConnection("jdbc:mysql://172.17.0.2:3306/practice", properties);
             PreparedStatement statement = conn.prepareStatement("insert into account(id, name, registered, about) values(?, ?, ?, ?)")) {
            conn.setAutoCommit(false);

            String[] columns;

            while ((columns = csvReader.readNext()) != null) {
                count++;

                statement.setInt(1, Integer.parseInt(columns[0]));
                statement.setString(2, columns[1]);
                statement.setTimestamp(3, Timestamp.valueOf(LocalDateTime.parse(columns[2], formatter)));
                statement.setString(4, columns[3]);

                statement.addBatch();

                if (count % batchThreshold == 0) {
                    statement.executeBatch();
                }

                if (count % commitThreshold == 0) {
                    long committedTime = System.currentTimeMillis() - commitStartTime;
                    log("commit, %d records, lap time = %f sec...", count, committedTime * 1.0 / 1000);
                    conn.commit();

                    commitStartTime = System.currentTimeMillis();
                }
            }

            conn.commit();

            long elapsedTime = System.currentTimeMillis() - startTime;
            log("%d records inserted, elapsed time = %f sec", count, elapsedTime * 1.0 / 1000);
        }
    }

ćƒćƒƒćƒę›“ę–°ć®ćƒ—ćƒ­ć‚°ćƒ©ćƒ ćØ恮違恄ćÆ态ꎄē¶šć™ć‚‹ę™‚ć®ćƒ—ćƒ­ćƒ‘ćƒ†ć‚£ć§rewriteBatchedStatements悒true恫恙悋恠恑恧恙怂

MySQL :: MySQL Connector/J 8.0 Developer Guide :: 6.3.13 Performance Extensions

        properties.setProperty("rewriteBatchedStatements", "true");

å®Ÿč”Œć®ę§˜å­ć€‚

$ java -jar target/mysql-import-0.0.1-SNAPSHOT.jar batch-rewrite users_500000.csv
[2021-04-18T02:13:25.024357] commit, 10000 records, lap time = 3.168000 sec...
[2021-04-18T02:13:25.943424] commit, 20000 records, lap time = 0.883000 sec...
[2021-04-18T02:13:26.882186] commit, 30000 records, lap time = 0.910000 sec...
[2021-04-18T02:13:27.540862] commit, 40000 records, lap time = 0.627000 sec...
[2021-04-18T02:13:28.173199] commit, 50000 records, lap time = 0.602000 sec...
[2021-04-18T02:13:28.740784] commit, 60000 records, lap time = 0.531000 sec...
[2021-04-18T02:13:29.340097] commit, 70000 records, lap time = 0.568000 sec...
[2021-04-18T02:13:29.860993] commit, 80000 records, lap time = 0.489000 sec...
[2021-04-18T02:13:30.424720] commit, 90000 records, lap time = 0.515000 sec...
[2021-04-18T02:13:30.932128] commit, 100000 records, lap time = 0.480000 sec...

怜ēœē•„怜

[2021-04-18T02:13:47.384536] commit, 400000 records, lap time = 0.449000 sec...
[2021-04-18T02:13:47.899068] commit, 410000 records, lap time = 0.479000 sec...
[2021-04-18T02:13:48.434318] commit, 420000 records, lap time = 0.488000 sec...
[2021-04-18T02:13:48.939430] commit, 430000 records, lap time = 0.476000 sec...
[2021-04-18T02:13:49.577309] commit, 440000 records, lap time = 0.591000 sec...
[2021-04-18T02:13:50.101016] commit, 450000 records, lap time = 0.483000 sec...
[2021-04-18T02:13:50.606555] commit, 460000 records, lap time = 0.454000 sec...
[2021-04-18T02:13:51.219517] commit, 470000 records, lap time = 0.579000 sec...
[2021-04-18T02:13:51.783128] commit, 480000 records, lap time = 0.473000 sec...
[2021-04-18T02:13:52.529406] commit, 490000 records, lap time = 0.671000 sec...
[2021-04-18T02:13:53.129715] commit, 500000 records, lap time = 0.537000 sec...
[2021-04-18T02:13:53.178695] 500000 records inserted, elapsed time = 31.325000 sec

ć“ć‚Œć ć‘ć§ć€å€ä»„äøŠć«é€ŸććŖć‚Šć¾ć—ćŸć€‚

č¼‰ć›ć¦ć„ćŸć‚³ćƒ¼ćƒ‰ć§ćÆć‚³ćƒ”ćƒ³ćƒˆć‚¢ć‚¦ćƒˆć—ć¦ć„ć¾ć™ćŒć€profileSQL悒true恫恙悋ćØSQLꖇ恌č”Øē¤ŗć•ć‚Œć‚‹ć‚ˆć†ć«ćŖ悋恮恧

MySQL :: MySQL Connector/J 8.0 Developer Guide :: 6.3.14 Debugging/Profiling

        properties.setProperty("rewriteBatchedStatements", "true");
        properties.setProperty("profileSQL", "true");

ćƒćƒƒćƒę›“ę–°ćŒćƒćƒ«ć‚ÆINSERTć«å¤‰ę›ć•ć‚Œć¦ć„ć‚‹ę§˜å­ćŒē¢ŗčŖć§ćć¾ć™ć€‚
ā€»å€¤ć‚‚č”Øē¤ŗć•ć‚Œć¦ć„ć‚‹ć®ć§ć™ćŒć€č¼‰ć›ć¦ć„ć¾ć›ć‚“

Sun Apr 18 02:17:00 JST 2021 INFO: [QUERY] insert into account(id, name, registered, about) values(...),(...),(...),(...),(...),(...),(...),(...),(...),(...) [Created on: Sun Apr 18 02:17:00 JST 2021, duration: 1, connection-id: 73, statement-id: 0, resultset-id: 0,   at org.littlewings.mysql.Importer.batchRewriteInsert(Importer.java:181)]
Sun Apr 18 02:17:00 JST 2021 INFO: [FETCH]  [Created on: Sun Apr 18 02:17:00 JST 2021, duration: 0, connection-id: 73, statement-id: 0, resultset-id: 0,    at org.littlewings.mysql.Importer.batchRewriteInsert(Importer.java:181)]

ć¾ćØ悁

MySQL恫åÆ¾ć—ć¦ć€LOAD DATA态Parallel Table Import Utility态JDBC恧恮通åøø恮INSERTć‚„ćƒćƒ«ć‚ÆINSERTćŖ恩悒
試恗恦ćæć¾ć—ćŸć€‚

LOAD DATA悄Parallel Table Import UtilityćŒåœ§å€’ēš„ć«é€Ÿć„ć§ć™ć­ć€‚

å¤§é‡ć®ćƒ†ć‚­ć‚¹ćƒˆćƒ‡ćƒ¼ć‚æć‚’ćƒ­ćƒ¼ćƒ‰ć™ć‚‹ćŖć‚‰ć€ć‚„ć£ć±ć‚Šć“ć”ć‚‰ćŖ悓恧恗悇恆恭恇ā€¦ć€‚

ćƒ‡ćƒ¼ć‚æé‡ćŒå¤šć„ć»ć©å·®ćŒå‡ŗ恝恆ćŖ恮恧态ē‰¹ć«Parallel Table Import UtilityćØåˆć‚ć›ć¦ē¢ŗčŖć—恦éøć¼ć†ć‹ćŖćØę€ć„ć¾ć™ć€‚

ć‚Ŗćƒžć‚±

Stack Overflowć®ćƒ¦ćƒ¼ć‚¶ćƒ¼ć®ćƒ‡ćƒ¼ć‚æ悒态XML恋悉CSVć«å¤‰ę›ć—ć¤ć¤ć€ęŒ‡å®šć®ćƒ¬ć‚³ćƒ¼ćƒ‰ę•°ć«ēµžć‚‹ćƒ—ćƒ­ć‚°ćƒ©ćƒ ć‚’ęœ€å¾Œć«
č¼‰ć›ć¦ćŠćć¾ć™ć€‚

IdćŒę­£ć®å€¤ć—ć‹ę‹¾ć£ć¦ć„ćŖć‹ć£ćŸć‚Šć€AboutMe悒255ę–‡å­—ć§åˆ‡ć‚ŠęØć¦ćŸć‚Šć€ę—„ę™‚ć®ćƒ•ć‚©ćƒ¼ćƒžćƒƒćƒˆć‚’å¤‰ćˆćŸć‚ŠćØ恄恏恤恋
å¤‰ę›ćŖć©ć‚’ć—ć¦ć„ć¾ć™ć€‚

users_to_csv.groovy

import com.opencsv.CSVParser
import com.opencsv.CSVParserBuilder
import com.opencsv.CSVWriterBuilder

import java.nio.charset.StandardCharsets
import java.nio.file.Files
import java.nio.file.Paths
import java.time.format.DateTimeFormatter
import javax.xml.stream.XMLInputFactory
import javax.xml.stream.XMLStreamConstants

@Grab('com.opencsv:opencsv:5.4')
import com.opencsv.CSVWriterBuilder
import com.opencsv.CSVParserBuilder

@Grab('org.jsoup:jsoup:1.13.1')
import org.jsoup.Jsoup

def xmlPath = args[0]
def csvSize = args[1] as int

def reader = Files.newBufferedReader(Paths.get(xmlPath), StandardCharsets.UTF_8)
reader.readLine() // skip BOM.

def outputFile = "users_${csvSize}.csv"
def csvWriter = new CSVWriterBuilder(Files.newBufferedWriter(Paths.get(outputFile), StandardCharsets.UTF_8)).withParser(new CSVParserBuilder().withEscapeChar('\\' as char).build()).build()

def datetimeFormatter = DateTimeFormatter.ofPattern('uuuu-MM-dd HH:mm:ss')

def count = 0

def xmlInputFactory = XMLInputFactory.newInstance()
def streamReader = xmlInputFactory.createXMLStreamReader(reader)

while (streamReader.hasNext()) {
    def eventType = streamReader.next()

    if (eventType == XMLStreamConstants.START_ELEMENT) {
        if (streamReader.getLocalName() == 'row') {
            def accountId = streamReader.getAttributeValue(null, "AccountId")
            def name = streamReader.getAttributeValue(null, 'DisplayName')
            def registeredDate = streamReader.getAttributeValue(null, 'CreationDate')
            def about = streamReader.getAttributeValue(null, "AboutMe")

            if (accountId == null || accountId as int < 0) {
                continue
            }

            registeredDate = datetimeFormatter.format(DateTimeFormatter.ISO_LOCAL_DATE_TIME.parse(registeredDate))

            if (about == null) {
                about = ''
            } else {
                about = Jsoup.parse(about).text()
            }

            if (about.length() > 255) {
                about = about.substring(0, 252) + "..."
            }

            csvWriter.writeNext(List.of(accountId, name, registeredDate, about).toArray(new String[0]))

            count++

            if (count % 10000 == 0) {
                println("${count} records, generated...")
            }

            if (count == csvSize) {
                break
            }
        }
    }
}

streamReader.close()
csvWriter.close()