これは、なにをしたくて書いたもの?
少し前に、TiDBのQuickStartを試してみました。
Ubuntu Linux 22.04 LTSに、MySQL互換の分散データベースTiDBをインストールして(ローカルでクラスターを立ち上げて)みる - CLOVER🍀
QuickStartはもうひとつあって、HTAPに関するものがあります。
Quick start with HTAP | PingCAP Docs
今回はこちらを試してみようと思います。
HTAP
QuickStartに進む前に、HTAP自体について見てみましょう。TiDBのドキュメントにはHTAP自体を
HTAPとは「Hybrid transactional/analytical processing」の略で、OLTPとOLAPの両方の性質を併せ持つデータベースのことです。
Hybrid transactional/analytical processing - Wikipedia
OLTP(online transaction processing)はユーザーのリクエストに応答し、リアルタイムで処理することが期待されるものです。
よく使用するRDBMSはこちらの性質ですね。
Online transaction processing - Wikipedia
OLAP(Online analytical processing)は、大量のデータを分析、集計することが期待されるものです。
Online analytical processing - Wikipedia
それぞれ性質が異なるので、通常データベースはどちらかに特化することが多いのですが、その両方の性質を併せ持つのがHTAPという
捉え方でよさそうです。
TiDBのHTAPのQuickStart
今回参照するTiDBのHTAPのQuickStartはこちらです。
Quick start with HTAP | PingCAP Docs
この内容に沿って進めるのですが、基本的なコンセプトは目を通しておきましょう。
Quick Start Guide for TiDB HTAP / Basic concepts
前提知識として、TiDB OLTP向けの行ベースストレージエンジンであるTiKV、TiDB OLAP向けのストレージエンジンであるTiFlashに関する
基本的な知識は持っている必要がある、とされています。
それはまた今度見ましょう…。
続いている内容をまとめます。
- HTAPのストレージエンジン
- HTAPでは行ベースのストレージエンジンと列指向(カラムナ)のストレージエンジンが共存する
- どちらのストレージエンジンもデータを自動的にレプリケーションし、強い一貫性を維持する
- 行ベースのストレージエンジンはOLTP向けに、列指向のストレージエンジンはOLAP向けにパフォーマンスを最適化する
- HTAPのデータの一貫性
- HTAPのデータの分離
- TiKVとTiFlashはリソース分離の問題を解決するために、異なるマシンにデプロイできる
- MPPコンピューティングエンジン
…なんとなく意味はわかりますが、そろそろTiDBの構成要素というか、アーキテクチャーに関するドキュメントを眺めた方がよさそうですね。
それでは、ここから先はこのQuickStartの内容を進めていきます。
環境
今回の環境は、こちら。
$ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 22.04.4 LTS Release: 22.04 Codename: jammy $ uname -srvmpio Linux 5.15.0-105-generic #115-Ubuntu SMP Mon Apr 15 09:52:04 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
MySQL Shellも一部使います。
$ mysqlsh --version mysqlsh Ver 8.4.0 for Linux on x86_64 - for MySQL 8.4.0 (MySQL Community Server (GPL))
TiDBのHTAPのQuickStartを試す
では、こちらのドキュメントに沿って進めていきます。
Quick start with HTAP | PingCAP Docs
TiUpのインストール。
$ curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh $ . .bashrc $ tiup --version 1.15.0 v1.15.0-nightly-9 Go Version: go1.21.9 Git Ref: master GitHash: 42d1df362797b114297068688e374b22020e6c5a
TiDBはLTSである7.5.1で起動します。
$ tiup playground v7.5.1 --host 0.0.0.0
テストデータ生成ツールのインストールとデータの作成。
$ tiup install bench $ tiup bench tpch --sf=1 prepare
けっこう時間がかかります…。
以下が実行時のログで、「Finished」が表示されたら完了です。
creating nation creating region creating part creating supplier creating partsupp creating customer creating orders creating lineitem generating nation table generate nation table done generating region table generate region table done generating customers table generate customers table done generating suppliers table generate suppliers table done generating part/partsupplier tables generate part/partsupplier tables done generating orders/lineitem tables generate orders/lineitem tables done Finished
データサイズを確認してみましょう。
接続。
$ tiup client
データは、testデータベースに入っているようです。
my:root@172.17.0.2:4000=> show databases; Database -------------------- INFORMATION_SCHEMA METRICS_SCHEMA PERFORMANCE_SCHEMA mysql test (5 rows) my:root@172.17.0.2:4000=> use test; USE my:root@172.17.0.2:4000=> show tables; Tables_in_test ---------------- customer lineitem nation orders part partsupp region supplier (8 rows)
データサイズを確認するSQL。
SELECT CONCAT(table_schema,'.',table_name) AS 'Table Name', table_rows AS 'Number of Rows', FORMAT_BYTES(data_length) AS 'Data Size', FORMAT_BYTES(index_length) AS 'Index Size', FORMAT_BYTES(data_length+index_length) AS'Total' FROM information_schema.TABLES WHERE table_schema='test';
結果。
Table Name | Number of Rows | Data Size | Index Size | Total ---------------+----------------+------------+------------+------------ test.nation | 25 | 2.44 KiB | 0 bytes | 2.44 KiB test.region | 5 | 416 bytes | 0 bytes | 416 bytes test.part | 215360 | 25.76 MiB | 0 bytes | 25.76 MiB test.supplier | 10000 | 1.41 MiB | 0 bytes | 1.41 MiB test.partsupp | 836864 | 124.93 MiB | 12.77 MiB | 137.70 MiB test.customer | 150000 | 24.06 MiB | 0 bytes | 24.06 MiB test.orders | 1504096 | 166.63 MiB | 0 bytes | 166.63 MiB test.lineitem | 6071871 | 774.65 MiB | 92.65 MiB | 867.30 MiB (8 rows)
データはランダムに作成されるようなので、実行ごとに結果や行数は異なるようです。今回は全体で約900万レコードのデータが
作られたことになります。
ちなみに、tiup bench tpch prepare
はワークロード向けのテストデータを準備するコマンドです。
$ tiup bench tpch --help Starting component bench: $HOME/.tiup/components/bench/v1.12.0/tiup-bench tpch --help Usage: go-tpc tpch [command] Available Commands: cleanup Cleanup data for the workload prepare Prepare data for the workload run Run workload Flags: --check Check output data, only when the scale factor equals 1 -h, --help help for tpch --queries string All queries (default "q1,q2,q3,q4,q5,q6,q7,q8,q9,q10,q11,q12,q13,q14,q15,q16,q17,q18,q19,q20,q21,q22") --sf int scale factor (default 1) --use-explain execute explain analyze Global Flags: 〜省略〜
tiup bench
で複数のベンチマーク向けのデータを生成できるようなのですが、tpch
というのはDWHの性能検証向けのTPC-Hのことですね。
$ tiup bench --help Starting component bench: $HOME/.tiup/components/bench/v1.12.0/tiup-bench --help Usage: tiup bench {ch/rawsql/tpcc/tpch/ycsb} [flags]
話を戻して。
列指向のストレージエンジンを使う前に、まずは行ベースのストレージエンジンを使った状態でTiDBのパフォーマンスを確認します。
以下のクエリーを実行。
SELECT l_orderkey, SUM( l_extendedprice * (1 - l_discount) ) AS revenue, o_orderdate, o_shippriority FROM customer, orders, lineitem WHERE c_mktsegment = 'BUILDING' AND c_custkey = o_custkey AND l_orderkey = o_orderkey AND o_orderdate < DATE '1996-01-01' AND l_shipdate > DATE '1996-02-01' GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate limit 10;
結果。
l_orderkey | revenue | o_orderdate | o_shippriority ------------+-------------+----------------------+---------------- 5828933 | 386117.1688 | 1995-12-03T00:00:00Z | 0 1648647 | 373096.8642 | 1995-12-06T00:00:00Z | 0 1364641 | 352640.6056 | 1995-12-19T00:00:00Z | 0 3949606 | 347750.4435 | 1995-12-23T00:00:00Z | 0 4792161 | 347715.0509 | 1995-12-30T00:00:00Z | 0 4340739 | 347490.5251 | 1995-12-06T00:00:00Z | 0 1609574 | 342497.8886 | 1995-12-31T00:00:00Z | 0 3076934 | 338202.3259 | 1995-12-24T00:00:00Z | 0 3232933 | 337349.2536 | 1995-12-26T00:00:00Z | 0 2345058 | 335142.6104 | 1995-12-31T00:00:00Z | 0 (10 rows)
やや時間がかかりました。そういえば、MySQLと違って実行時間が出力されていませんね。
MySQL Shellでやり直しましょう。
$ mysqlsh root:@localhost:4000/test --sql
同じクエリーの実行結果。
+------------+-------------+-------------+----------------+ | l_orderkey | revenue | o_orderdate | o_shippriority | +------------+-------------+-------------+----------------+ | 5828933 | 386117.1688 | 1995-12-03 | 0 | | 1648647 | 373096.8642 | 1995-12-06 | 0 | | 1364641 | 352640.6056 | 1995-12-19 | 0 | | 3949606 | 347750.4435 | 1995-12-23 | 0 | | 4792161 | 347715.0509 | 1995-12-30 | 0 | | 4340739 | 347490.5251 | 1995-12-06 | 0 | | 1609574 | 342497.8886 | 1995-12-31 | 0 | | 3076934 | 338202.3259 | 1995-12-24 | 0 | | 3232933 | 337349.2536 | 1995-12-26 | 0 | | 2345058 | 335142.6104 | 1995-12-31 | 0 | +------------+-------------+-------------+----------------+ 10 rows in set (1.9028 sec)
ちなみにこの結果は2回目のもので、キャッシュが効いてちょっと速くなった気がします…。
explain analyzeも見ておきましょう。
explain analyze SELECT l_orderkey, SUM( l_extendedprice * (1 - l_discount) ) AS revenue, o_orderdate, o_shippriority FROM customer, orders, lineitem WHERE c_mktsegment = 'BUILDING' AND c_custkey = o_custkey AND l_orderkey = o_orderkey AND o_orderdate < DATE '1996-01-01' AND l_shipdate > DATE '1996-02-01' GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate limit 10;
結果。
+------------------------------------------+------------+---------+-----------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+ | id | estRows | actRows | task | access object | execution info | operator info | memory | disk | +------------------------------------------+------------+---------+-----------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+ | Projection_14 | 10.00 | 10 | root | | time:1.99s, loops:2, RU:5124.543791, Concurrency:OFF | test.lineitem.l_orderkey, Column#34, test.orders.o_orderdate, test.orders.o_shippriority | 40.9 KB | N/A | | └─TopN_17 | 10.00 | 10 | root | | time:1.99s, loops:2 | Column#34:desc, test.orders.o_orderdate, offset:0, count:10 | 76.8 KB | N/A | | └─HashAgg_22 | 674032.16 | 7664 | root | | time:1.98s, loops:11, partial_worker:{wall_time:1.981795359s, concurrency:5, task_num:17, tot_wait:9.885371099s, tot_exec:19.073106ms, tot_time:9.906871113s, max:1.981758925s, p95:1.981758925s}, final_worker:{wall_time:1.986458455s, concurrency:5, task_num:25, tot_wait:9.906349192s, tot_exec:23.550433ms, tot_time:9.929910195s, max:1.98641688s, p95:1.98641688s} | group by:Column#39, Column#40, Column#41, funcs:sum(Column#35)->Column#34, funcs:firstrow(Column#36)->test.orders.o_orderdate, funcs:firstrow(Column#37)->test.orders.o_shippriority, funcs:firstrow(Column#38)->test.lineitem.l_orderkey | 4.78 MB | N/A | | └─Projection_77 | 1580588.37 | 16655 | root | | time:1.98s, loops:18, Concurrency:5 | mul(test.lineitem.l_extendedprice, minus(1, test.lineitem.l_discount))->Column#35, test.orders.o_orderdate->Column#36, test.orders.o_shippriority->Column#37, test.lineitem.l_orderkey->Column#38, test.lineitem.l_orderkey->Column#39, test.orders.o_orderdate->Column#40, test.orders.o_shippriority->Column#41 | 810.3 KB | N/A | | └─IndexJoin_30 | 1580588.37 | 16655 | root | | time:1.98s, loops:18, inner:{total:8.41s, concurrency:5, task:15, construct:400.7ms, fetch:8s, build:5.16ms}, probe:36.2ms | inner join, inner:TableReader_26, outer key:test.orders.o_orderkey, inner key:test.lineitem.l_orderkey, equal cond:eq(test.orders.o_orderkey, test.lineitem.l_orderkey) | 19.5 MB | N/A | | ├─HashJoin_65(Build) | 391944.14 | 184087 | root | | time:224.5ms, loops:185, build_hash_table:{total:10.5ms, fetch:1.62ms, build:8.91ms}, probe:{concurrency:5, total:2.88s, max:578.8ms, probe:2.57s, fetch:305.8ms} | inner join, equal:[eq(test.customer.c_custkey, test.orders.o_custkey)] | 2.94 MB | 0 Bytes | | │ ├─TableReader_71(Build) | 30317.61 | 30142 | root | | time:3.21ms, loops:31, cop_task: {num: 8, max: 938.9µs, min: 338.1µs, avg: 535µs, p95: 938.9µs, tot_proc: 11.3µs, tot_wait: 659.9µs, rpc_num: 8, rpc_time: 4.11ms, copr_cache_hit_ratio: 1.00, build_task_duration: 11.4µs, max_distsql_concurrency: 1} | data:Selection_70 | 497.6 KB | N/A | | │ │ └─Selection_70 | 30317.61 | 30142 | cop[tikv] | | tikv_task:{proc max:60ms, min:0s, avg: 20.5ms, p80:40ms, p95:60ms, iters:179, tasks:8}, scan_detail: {get_snapshot_time: 362.4µs, rocksdb: {block: {}}} | eq(test.customer.c_mktsegment, "BUILDING") | N/A | N/A | | │ │ └─TableFullScan_69 | 150000.00 | 150000 | cop[tikv] | table:customer | tikv_task:{proc max:52ms, min:0s, avg: 16ms, p80:28ms, p95:52ms, iters:179, tasks:8} | keep order:false | N/A | N/A | | │ └─TableReader_68(Probe) | 1135130.91 | 909968 | root | | time:40.5ms, loops:887, cop_task: {num: 42, max: 17.3ms, min: 336.1µs, avg: 2.48ms, p95: 11.7ms, tot_proc: 49.7µs, tot_wait: 14.1ms, rpc_num: 42, rpc_time: 102.8ms, copr_cache_hit_ratio: 1.00, build_task_duration: 10.8µs, max_distsql_concurrency: 3} | data:Selection_67 | 6.16 MB | N/A | | │ └─Selection_67 | 1135130.91 | 909968 | cop[tikv] | | tikv_task:{proc max:320ms, min:0s, avg: 96.6ms, p80:220ms, p95:312ms, iters:1630, tasks:42}, scan_detail: {get_snapshot_time: 1.42ms, rocksdb: {block: {}}} | lt(test.orders.o_orderdate, 1996-01-01) | N/A | N/A | | │ └─TableFullScan_66 | 1504096.00 | 1500000 | cop[tikv] | table:orders | tikv_task:{proc max:320ms, min:0s, avg: 92.7ms, p80:208ms, p95:300ms, iters:1630, tasks:42} | keep order:false | N/A | N/A | | └─TableReader_26(Probe) | 175459.49 | 16655 | root | | time:7.59s, loops:42, cop_task: {num: 113, max: 503.5ms, min: 17.6ms, avg: 192ms, p95: 444ms, max_proc_keys: 24712, p95_proc_keys: 15988, tot_proc: 8.6s, tot_wait: 176.5ms, rpc_num: 113, rpc_time: 21.7s, copr_cache_hit_ratio: 0.04, build_task_duration: 9.73ms, max_distsql_concurrency: 6} | data:Selection_25 | N/A | N/A | | └─Selection_25 | 175459.49 | 16655 | cop[tikv] | | tikv_task:{proc max:492ms, min:4ms, avg: 175.8ms, p80:284ms, p95:420ms, iters:1198, tasks:113}, scan_detail: {total_process_keys: 735390, total_process_keys_size: 142969563, total_keys: 919342, get_snapshot_time: 86.2ms, rocksdb: {key_skipped_count: 735390, block: {cache_hit_count: 716158}}} | gt(test.lineitem.l_shipdate, 1996-02-01) | N/A | N/A | | └─TableRangeScan_24 | 391944.14 | 736133 | cop[tikv] | table:lineitem | tikv_task:{proc max:488ms, min:4ms, avg: 174.8ms, p80:284ms, p95:420ms, iters:1198, tasks:113} | range: decided by [eq(test.lineitem.l_orderkey, test.orders.o_orderkey)], keep order:false | N/A | N/A | +------------------------------------------+------------+---------+-----------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+ 15 rows in set (1.9900 sec)
ちなみにこのクエリーは、以下の内容です。
- 配送に関するプライオリティーキューで、指定された日までに出荷されていない最も収益の高い注文の優先度と潜在的な収益を提供する
- 潜在的な収益の計算式は
l_extendedprice * (1 - l_discount)
の和として定義する - 注文は潜在的な収益の降順でソートされ、未出荷の上位10件を取得する
次に、データを列指向のストレージエンジンにレプリケーションします。
以下のクエリーを実行することで、指定のテーブルをTiFlashにレプリケーションすることができます。
ALTER TABLE test.customer SET TIFLASH REPLICA 1; ALTER TABLE test.orders SET TIFLASH REPLICA 1; ALTER TABLE test.lineitem SET TIFLASH REPLICA 1;
実行。
MySQL localhost:4000 ssl test SQL > ALTER TABLE test.customer SET TIFLASH REPLICA 1; Query OK, 0 rows affected (0.5111 sec) MySQL localhost:4000 ssl test SQL > ALTER TABLE test.orders SET TIFLASH REPLICA 1; Query OK, 0 rows affected (0.5125 sec) MySQL localhost:4000 ssl test SQL > ALTER TABLE test.lineitem SET TIFLASH REPLICA 1; Query OK, 0 rows affected (0.9729 sec)
各テーブルのレプリケーションステータスを、以下のクエリーで確認します。
SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'customer'; SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'orders'; SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'lineitem';
それぞれ、結果。
MySQL localhost:4000 ssl test SQL > SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'customer'; +--------------+------------+----------+---------------+-----------------+-----------+----------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ID | REPLICA_COUNT | LOCATION_LABELS | AVAILABLE | PROGRESS | +--------------+------------+----------+---------------+-----------------+-----------+----------+ | test | customer | 112 | 1 | | 1 | 1 | +--------------+------------+----------+---------------+-----------------+-----------+----------+ 1 row in set (0.0564 sec) MySQL localhost:4000 ssl test SQL > SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'orders'; +--------------+------------+----------+---------------+-----------------+-----------+----------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ID | REPLICA_COUNT | LOCATION_LABELS | AVAILABLE | PROGRESS | +--------------+------------+----------+---------------+-----------------+-----------+----------+ | test | orders | 114 | 1 | | 1 | 1 | +--------------+------------+----------+---------------+-----------------+-----------+----------+ 1 row in set (0.0025 sec) MySQL localhost:4000 ssl test SQL > SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'lineitem'; +--------------+------------+----------+---------------+-----------------+-----------+----------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ID | REPLICA_COUNT | LOCATION_LABELS | AVAILABLE | PROGRESS | +--------------+------------+----------+---------------+-----------------+-----------+----------+ | test | lineitem | 116 | 1 | | 1 | 1 | +--------------+------------+----------+---------------+-----------------+-----------+----------+ 1 row in set (0.0027 sec)
結果の各カラムの意味は以下になります。
AVAILABLE
… TiFlashのレプリカが利用可能かどうか。1は利用可能、0は利用不可PROGRESS
… レプリケーションの進行状況を0.0〜1.0で表す。1は、TiFlashのレプリケーションが完了したことを表す
では、以下のクエリーをもう1度実行してみましょう。
SELECT l_orderkey, SUM( l_extendedprice * (1 - l_discount) ) AS revenue, o_orderdate, o_shippriority FROM customer, orders, lineitem WHERE c_mktsegment = 'BUILDING' AND c_custkey = o_custkey AND l_orderkey = o_orderkey AND o_orderdate < DATE '1996-01-01' AND l_shipdate > DATE '1996-02-01' GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate limit 10;
結果。
+------------+-------------+-------------+----------------+ | l_orderkey | revenue | o_orderdate | o_shippriority | +------------+-------------+-------------+----------------+ | 5828933 | 386117.1688 | 1995-12-03 | 0 | | 1648647 | 373096.8642 | 1995-12-06 | 0 | | 1364641 | 352640.6056 | 1995-12-19 | 0 | | 3949606 | 347750.4435 | 1995-12-23 | 0 | | 4792161 | 347715.0509 | 1995-12-30 | 0 | | 4340739 | 347490.5251 | 1995-12-06 | 0 | | 1609574 | 342497.8886 | 1995-12-31 | 0 | | 3076934 | 338202.3259 | 1995-12-24 | 0 | | 3232933 | 337349.2536 | 1995-12-26 | 0 | | 2345058 | 335142.6104 | 1995-12-31 | 0 | +------------+-------------+-------------+----------------+ 10 rows in set (1.7648 sec)
2回目以降はこうなりました。行ベースのストレージエンジンの時よりも速いですね。
+------------+-------------+-------------+----------------+ | l_orderkey | revenue | o_orderdate | o_shippriority | +------------+-------------+-------------+----------------+ | 5828933 | 386117.1688 | 1995-12-03 | 0 | | 1648647 | 373096.8642 | 1995-12-06 | 0 | | 1364641 | 352640.6056 | 1995-12-19 | 0 | | 3949606 | 347750.4435 | 1995-12-23 | 0 | | 4792161 | 347715.0509 | 1995-12-30 | 0 | | 4340739 | 347490.5251 | 1995-12-06 | 0 | | 1609574 | 342497.8886 | 1995-12-31 | 0 | | 3076934 | 338202.3259 | 1995-12-24 | 0 | | 3232933 | 337349.2536 | 1995-12-26 | 0 | | 2345058 | 335142.6104 | 1995-12-31 | 0 | +------------+-------------+-------------+----------------+ 10 rows in set (0.3646 sec)
explain analyzeしてみましょう。
explain analyze SELECT l_orderkey, SUM( l_extendedprice * (1 - l_discount) ) AS revenue, o_orderdate, o_shippriority FROM customer, orders, lineitem WHERE c_mktsegment = 'BUILDING' AND c_custkey = o_custkey AND l_orderkey = o_orderkey AND o_orderdate < DATE '1996-01-01' AND l_shipdate > DATE '1996-02-01' GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate limit 10;
結果。
+----------------------------------------------------------------+------------+---------+--------------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+------+ | id | estRows | actRows | task | access object | execution info | operator info | memory | disk | +----------------------------------------------------------------+------------+---------+--------------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+------+ | Projection_14 | 10.00 | 10 | root | | time:271.1ms, loops:2, RU:0.000000, Concurrency:OFF | test.lineitem.l_orderkey, Column#34, test.orders.o_orderdate, test.orders.o_shippriority | 26.1 KB | N/A | | └─TopN_18 | 10.00 | 10 | root | | time:271.1ms, loops:2 | Column#34:desc, test.orders.o_orderdate, offset:0, count:10 | 1.23 KB | N/A | | └─TableReader_139 | 10.00 | 40 | root | | time:271.1ms, loops:3, cop_task: {num: 5, max: 0s, min: 0s, avg: 0s, p95: 0s, copr_cache_hit_ratio: 0.00} | MppVersion: 2, data:ExchangeSender_138 | 1.32 KB | N/A | | └─ExchangeSender_138 | 10.00 | 40 | mpp[tiflash] | | tiflash_task:{time:264.1ms, loops:4, threads:4} | ExchangeType: PassThrough | N/A | N/A | | └─TopN_137 | 10.00 | 40 | mpp[tiflash] | | tiflash_task:{time:264.1ms, loops:4, threads:4} | Column#34:desc, test.orders.o_orderdate, offset:0, count:10 | N/A | N/A | | └─Projection_133 | 674032.16 | 7664 | mpp[tiflash] | | tiflash_task:{time:264.1ms, loops:4, threads:4} | Column#34, test.orders.o_orderdate, test.orders.o_shippriority, test.lineitem.l_orderkey | N/A | N/A | | └─HashAgg_134 | 674032.16 | 7664 | mpp[tiflash] | | tiflash_task:{time:264.1ms, loops:4, threads:4} | group by:test.lineitem.l_orderkey, test.orders.o_orderdate, test.orders.o_shippriority, funcs:sum(Column#43)->Column#34, funcs:firstrow(test.orders.o_orderdate)->test.orders.o_orderdate, funcs:firstrow(test.orders.o_shippriority)->test.orders.o_shippriority, funcs:firstrow(test.lineitem.l_orderkey)->test.lineitem.l_orderkey, stream_count: 4 | N/A | N/A | | └─ExchangeReceiver_136 | 674032.16 | 7664 | mpp[tiflash] | | tiflash_task:{time:264.1ms, loops:4, threads:4} | stream_count: 4 | N/A | N/A | | └─ExchangeSender_135 | 674032.16 | 7664 | mpp[tiflash] | | tiflash_task:{time:260.9ms, loops:1, threads:1} | ExchangeType: HashPartition, Compression: FAST, Hash Cols: [name: test.lineitem.l_orderkey, collate: binary], [name: test.orders.o_orderdate, collate: binary], [name: test.orders.o_shippriority, collate: binary], stream_count: 4 | N/A | N/A | | └─HashAgg_131 | 674032.16 | 7664 | mpp[tiflash] | | tiflash_task:{time:256.9ms, loops:1, threads:1} | group by:Column#48, Column#49, Column#50, funcs:sum(Column#47)->Column#43 | N/A | N/A | | └─Projection_140 | 1580588.37 | 16655 | mpp[tiflash] | | tiflash_task:{time:256.9ms, loops:92, threads:8} | mul(test.lineitem.l_extendedprice, minus(1, test.lineitem.l_discount))->Column#47, test.lineitem.l_orderkey->Column#48, test.orders.o_orderdate->Column#49, test.orders.o_shippriority->Column#50 | N/A | N/A | | └─HashJoin_120 | 1580588.37 | 16655 | mpp[tiflash] | | tiflash_task:{time:256.9ms, loops:92, threads:8} | inner join, equal:[eq(test.orders.o_orderkey, test.lineitem.l_orderkey)], stream_count: 4 | N/A | N/A | | ├─ExchangeReceiver_52(Build) | 391944.14 | 184087 | mpp[tiflash] | | tiflash_task:{time:172.9ms, loops:18, threads:4} | stream_count: 4 | N/A | N/A | | │ └─ExchangeSender_51 | 391944.14 | 184087 | mpp[tiflash] | | tiflash_task:{time:158.7ms, loops:23, threads:8} | ExchangeType: HashPartition, Compression: FAST, Hash Cols: [name: test.orders.o_orderkey, collate: binary], stream_count: 4 | N/A | N/A | | │ └─HashJoin_44 | 391944.14 | 184087 | mpp[tiflash] | | tiflash_task:{time:154.7ms, loops:23, threads:8} | inner join, equal:[eq(test.customer.c_custkey, test.orders.o_custkey)] | N/A | N/A | | │ ├─ExchangeReceiver_48(Build) | 30317.61 | 30142 | mpp[tiflash] | | tiflash_task:{time:70.7ms, loops:2, threads:8} | | N/A | N/A | | │ │ └─ExchangeSender_47 | 30317.61 | 30142 | mpp[tiflash] | | tiflash_task:{time:39.8ms, loops:3, threads:8} | ExchangeType: Broadcast, Compression: FAST | N/A | N/A | | │ │ └─TableFullScan_45 | 30317.61 | 30142 | mpp[tiflash] | table:customer | tiflash_task:{time:31.8ms, loops:3, threads:8}, tiflash_scan:{dtfile:{total_scanned_packs:38, total_skipped_packs:0, total_scanned_rows:300000, total_skipped_rows:0, total_rs_index_check_time: 0ms, total_read_time: 9ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 1, total_remote_region_num: 0, total_learner_read_time: 2ms} | pushed down filter:eq(test.customer.c_mktsegment, "BUILDING"), keep order:false | N/A | N/A | | │ └─Selection_50(Probe) | 1135130.91 | 909968 | mpp[tiflash] | | tiflash_task:{time:134.7ms, loops:23, threads:8} | lt(test.orders.o_orderdate, 1996-01-01) | N/A | N/A | | │ └─TableFullScan_49 | 1504096.00 | 1500000 | mpp[tiflash] | table:orders | tiflash_task:{time:126.7ms, loops:23, threads:8}, tiflash_scan:{dtfile:{total_scanned_packs:185, total_skipped_packs:0, total_scanned_rows:1500000, total_skipped_rows:0, total_rs_index_check_time: 0ms, total_read_time: 126ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 3, total_remote_region_num: 0, total_learner_read_time: 1ms} | pushed down filter:empty, keep order:false | N/A | N/A | | └─ExchangeReceiver_56(Probe) | 2718161.32 | 2432047 | mpp[tiflash] | | tiflash_task:{time:228.9ms, loops:92, threads:8} | | N/A | N/A | | └─ExchangeSender_55 | 2718161.32 | 2432047 | mpp[tiflash] | | tiflash_task:{time:256ms, loops:97, threads:8} | ExchangeType: HashPartition, Compression: FAST, Hash Cols: [name: test.lineitem.l_orderkey, collate: binary] | N/A | N/A | | └─TableFullScan_53 | 2718161.32 | 2432047 | mpp[tiflash] | table:lineitem | tiflash_task:{time:124ms, loops:97, threads:8}, tiflash_scan:{dtfile:{total_scanned_packs:1490, total_skipped_packs:0, total_scanned_rows:12002430, total_skipped_rows:0, total_rs_index_check_time: 0ms, total_read_time: 611ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 13, total_remote_region_num: 0, total_learner_read_time: 2ms} | pushed down filter:gt(test.lineitem.l_shipdate, 1996-02-01), keep order:false | N/A | N/A | +----------------------------------------------------------------+------------+---------+--------------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+------+ 23 rows in set (0.3486 sec)
「execution info」の欄を見ると、行ベースのストレージエンジンだった時はtikv_task
と書かれていたものがtiflash_task
に
なっています。
また、結果にExchangeSender
やExchangeReceiver
が含まれている場合、MPPモードが有効になっていることを示しているそうです。
「task」の多くは「mpp[tiflash]」になっていますし、「MppVersion: 2, data:ExchangeSender_138 」と書かれている結果もありますしね。
| └─TableReader_139 | 10.00 | 40 | root | | time:271.1ms, loops:3, cop_task: {num: 5, max: 0s, min: 0s, avg: 0s, p95: 0s, copr_cache_hit_ratio: 0.00} | MppVersion: 2, data:ExchangeSender_138 | 1.32 KB | N/A |
これでQuickStartの内容は確認できました。
ちなみに、強制的にTiKVを選択させたい場合はヒント句でread_from_storage
でストレージエンジンを指定するようです。
explain analyze SELECT /*+ read_from_storage(tikv[customer,orders,lineitem]) */ l_orderkey, SUM( l_extendedprice * (1 - l_discount) ) AS revenue, o_orderdate, o_shippriority FROM customer, orders, lineitem WHERE c_mktsegment = 'BUILDING' AND c_custkey = o_custkey AND l_orderkey = o_orderkey AND o_orderdate < DATE '1996-01-01' AND l_shipdate > DATE '1996-02-01' GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate limit 10;
これで、TiKVを選択すりょうになりました。
+------------------------------------------+------------+---------+-----------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+ | id | estRows | actRows | task | access object | execution info | operator info | memory | disk | +------------------------------------------+------------+---------+-----------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+ | Projection_14 | 10.00 | 10 | root | | time:3.03s, loops:2, RU:9837.350827, Concurrency:OFF | test.lineitem.l_orderkey, Column#34, test.orders.o_orderdate, test.orders.o_shippriority | 2.52 KB | N/A | | └─TopN_17 | 10.00 | 10 | root | | time:3.03s, loops:2 | Column#34:desc, test.orders.o_orderdate, offset:0, count:10 | 76.8 KB | N/A | | └─HashAgg_24 | 674032.16 | 7664 | root | | time:3.03s, loops:11, partial_worker:{wall_time:3.030973074s, concurrency:5, task_num:17, tot_wait:15.139139758s, tot_exec:12.139184ms, tot_time:15.152985423s, max:3.030928905s, p95:3.030928905s}, final_worker:{wall_time:3.034161772s, concurrency:5, task_num:25, tot_wait:15.152203914s, tot_exec:16.297955ms, tot_time:15.168511981s, max:3.034063683s, p95:3.034063683s} | group by:Column#39, Column#40, Column#41, funcs:sum(Column#35)->Column#34, funcs:firstrow(Column#36)->test.orders.o_orderdate, funcs:firstrow(Column#37)->test.orders.o_shippriority, funcs:firstrow(Column#38)->test.lineitem.l_orderkey | 4.78 MB | N/A | | └─Projection_88 | 1580588.37 | 16655 | root | | time:3.03s, loops:18, Concurrency:5 | mul(test.lineitem.l_extendedprice, minus(1, test.lineitem.l_discount))->Column#35, test.orders.o_orderdate->Column#36, test.orders.o_shippriority->Column#37, test.lineitem.l_orderkey->Column#38, test.lineitem.l_orderkey->Column#39, test.orders.o_orderdate->Column#40, test.orders.o_shippriority->Column#41 | 592.9 KB | N/A | | └─IndexJoin_37 | 1580588.37 | 16655 | root | | time:3.03s, loops:18, inner:{total:3.65s, concurrency:5, task:15, construct:183.1ms, fetch:3.46s, build:4.8ms}, probe:31ms | inner join, inner:TableReader_33, outer key:test.orders.o_orderkey, inner key:test.lineitem.l_orderkey, equal cond:eq(test.orders.o_orderkey, test.lineitem.l_orderkey) | 10.9 MB | N/A | | ├─HashJoin_74(Build) | 391944.14 | 184087 | root | | time:2.57s, loops:185, build_hash_table:{total:219.6ms, fetch:207.5ms, build:12.1ms}, probe:{concurrency:5, total:12.8s, max:2.57s, probe:455ms, fetch:12.4s} | inner join, equal:[eq(test.customer.c_custkey, test.orders.o_custkey)] | 2.93 MB | 0 Bytes | | │ ├─TableReader_80(Build) | 30317.61 | 30142 | root | | time:211.9ms, loops:31, cop_task: {num: 8, max: 66.1ms, min: 3.63ms, avg: 26.9ms, p95: 66.1ms, max_proc_keys: 63184, p95_proc_keys: 63184, tot_proc: 207.3ms, tot_wait: 641µs, rpc_num: 8, rpc_time: 214.9ms, copr_cache_hit_ratio: 0.00, build_task_duration: 9.78µs, max_distsql_concurrency: 1} | data:Selection_79 | 497.7 KB | N/A | | │ │ └─Selection_79 | 30317.61 | 30142 | cop[tikv] | | tikv_task:{proc max:64ms, min:0s, avg: 24.5ms, p80:64ms, p95:64ms, iters:179, tasks:8}, scan_detail: {total_process_keys: 150000, total_process_keys_size: 30533765, total_keys: 150008, get_snapshot_time: 320.3µs, rocksdb: {key_skipped_count: 150000, block: {cache_hit_count: 1030}}} | eq(test.customer.c_mktsegment, "BUILDING") | N/A | N/A | | │ │ └─TableFullScan_78 | 150000.00 | 150000 | cop[tikv] | table:customer | tikv_task:{proc max:64ms, min:0s, avg: 22ms, p80:52ms, p95:64ms, iters:179, tasks:8} | keep order:false | N/A | N/A | | │ └─TableReader_77(Probe) | 1135130.91 | 909968 | root | | time:2.25s, loops:887, cop_task: {num: 42, max: 383.2ms, min: 1.76ms, avg: 133.3ms, p95: 365.9ms, max_proc_keys: 82912, p95_proc_keys: 82912, tot_proc: 5.39s, tot_wait: 5.95ms, rpc_num: 42, rpc_time: 5.6s, copr_cache_hit_ratio: 0.00, build_task_duration: 9.6µs, max_distsql_concurrency: 3} | data:Selection_76 | 3.08 MB | N/A | | │ └─Selection_76 | 1135130.91 | 909968 | cop[tikv] | | tikv_task:{proc max:368ms, min:0s, avg: 128.6ms, p80:336ms, p95:356ms, iters:1630, tasks:42}, scan_detail: {total_process_keys: 1500000, total_process_keys_size: 227213890, total_keys: 1500042, get_snapshot_time: 1.2ms, rocksdb: {key_skipped_count: 1500000, block: {cache_hit_count: 7492}}} | lt(test.orders.o_orderdate, 1996-01-01) | N/A | N/A | | │ └─TableFullScan_75 | 1504096.00 | 1500000 | cop[tikv] | table:orders | tikv_task:{proc max:368ms, min:0s, avg: 127ms, p80:332ms, p95:352ms, iters:1630, tasks:42} | keep order:false | N/A | N/A | | └─TableReader_33(Probe) | 175459.49 | 16655 | root | | time:3.25s, loops:42, cop_task: {num: 103, max: 257ms, min: 2.32ms, avg: 72.6ms, p95: 168.7ms, max_proc_keys: 25568, p95_proc_keys: 17730, tot_proc: 5.35s, tot_wait: 33.2ms, rpc_num: 103, rpc_time: 7.48s, copr_cache_hit_ratio: 0.00, build_task_duration: 6.03ms, max_distsql_concurrency: 6} | data:Selection_32 | N/A | N/A | | └─Selection_32 | 175459.49 | 16655 | cop[tikv] | | tikv_task:{proc max:240ms, min:0s, avg: 64.7ms, p80:96ms, p95:160ms, iters:1151, tasks:103}, scan_detail: {total_process_keys: 736133, total_process_keys_size: 143113751, total_keys: 920269, get_snapshot_time: 8.34ms, rocksdb: {key_skipped_count: 736133, block: {cache_hit_count: 716899}}} | gt(test.lineitem.l_shipdate, 1996-02-01) | N/A | N/A | | └─TableRangeScan_31 | 391944.14 | 736133 | cop[tikv] | table:lineitem | tikv_task:{proc max:236ms, min:0s, avg: 64.3ms, p80:96ms, p95:160ms, iters:1151, tasks:103} | range: decided by [eq(test.lineitem.l_orderkey, test.orders.o_orderkey)], keep order:false | N/A | N/A | +------------------------------------------+------------+---------+-----------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+---------+ 15 rows in set (3.0684 sec)
Use TiDB to Read TiFlash Replicas | PingCAP Docs
通常は、ストレージエンジンの選択はオプティマイザーがコスト見積もりに基づいて行うようです。
ところで、TiFlashにレプリケーションしたことでデータ量が増えていると思うのですが、以下のクエリーを実行しても
SELECT CONCAT(table_schema,'.',table_name) AS 'Table Name', table_rows AS 'Number of Rows', FORMAT_BYTES(data_length) AS 'Data Size', FORMAT_BYTES(index_length) AS 'Index Size', FORMAT_BYTES(data_length+index_length) AS'Total' FROM information_schema.TABLES WHERE table_schema='test';
データサイズは変わったようすはありませんでした。
+---------------+----------------+------------+------------+------------+ | Table Name | Number of Rows | Data Size | Index Size | Total | +---------------+----------------+------------+------------+------------+ | test.nation | 25 | 2.44 KiB | 0 bytes | 2.44 KiB | | test.region | 5 | 416 bytes | 0 bytes | 416 bytes | | test.part | 215360 | 25.76 MiB | 0 bytes | 25.76 MiB | | test.supplier | 10000 | 1.41 MiB | 0 bytes | 1.41 MiB | | test.partsupp | 836864 | 124.93 MiB | 12.77 MiB | 137.70 MiB | | test.customer | 150000 | 24.06 MiB | 0 bytes | 24.06 MiB | | test.orders | 1504096 | 166.63 MiB | 0 bytes | 166.63 MiB | | test.lineitem | 6071871 | 774.65 MiB | 92.65 MiB | 867.30 MiB | +---------------+----------------+------------+------------+------------+ 8 rows in set (0.0139 sec)
TiFlash込みのデータ量を見たい場合には、どうしたらいいんでしょうね?
こちらのテーブルを見るとよさそうな気がします。
https://docs.pingcap.com/ja/tidb/v7.5/information-schema-tiflash-tables
こんなクエリーを作成。
select tidb_database, tidb_table, total_rows, format_bytes(total_size) total_size from information_schema.tiflash_tables where tidb_database = 'test';
結果。
+---------------+------------+------------+------------+ | tidb_database | tidb_table | total_rows | total_size | +---------------+------------+------------+------------+ | test | customer | 150000 | 32.27 MiB | | test | orders | 1500000 | 237.34 MiB | | test | lineitem | 6001215 | 1.19 GiB | +---------------+------------+------------+------------+ 3 rows in set (0.0152 sec)
TiKVよりもデータが大きくなりそうですね…。
おわりに
TiDBのHTAPに関するQuickStartを試してみました。
そもそもHTAPという考え方を知らなかったりしたので、勉強になりましたし、TiDBで試せたのも良かったですね。
次は構成要素などについて書かれたドキュメントを見ていったりしたいなと思います。