CLOVERšŸ€

That was when it all began.

Trino恧态MinIOćØMySQLć®ćƒ‡ćƒ¼ć‚æ悒joinć—ć¦ć‚¢ć‚Æć‚»ć‚¹ć—ć¦ćæ悋

恓悌ćÆ态ćŖć«ć‚’ć—ćŸćć¦ę›øć„ćŸć‚‚ć®ļ¼Ÿ

å‰ć«ć€Trino悒ä½æć£ć¦MySQLćØMinIOć«ę ¼ē“ć•ć‚ŒćŸćƒ‡ćƒ¼ć‚æć«ć‚¢ć‚Æć‚»ć‚¹ć—ć¦ćæć¾ć—ćŸć€‚

分散SQLクエリーエンジン、TrinoをUbuntu Linux 20.04 LTSにインストールしてMySQLに接続してみる - CLOVER🍀

Trinoから、Hive connectorでAmazon S3互換のオブジェクトストレージMinIOにアクセスしてみる - CLOVER🍀

今回ćÆ态MinIOćØMySQLć«ę ¼ē“ć•ć‚ŒćŸćƒ‡ćƒ¼ć‚æ悒joinć—ć¦ć‚¢ć‚Æć‚»ć‚¹ć—ć¦ćæ恟恄ćØę€ć„ć¾ć™ć€‚

ē’°å¢ƒ

TrinoćÆ态395悒ä½æē”Øć—ć¾ć™ć€‚č”Øē¤ŗ恗恦恄悋悂恮ćÆCLIć®ćƒćƒ¼ć‚øćƒ§ćƒ³ć§ć™ćŒć€ć‚µćƒ¼ćƒćƒ¼å“ć‚‚åŒć˜ćƒćƒ¼ć‚øćƒ§ćƒ³ć‚’ä½æē”Øć—ć¾ć™ć€‚

$ trino --version
Trino CLI 395

Trinoć‚µćƒ¼ćƒćƒ¼ćÆ72.17.0.2ć§å‹•ä½œć•ć›ć¾ć™ćŒć€čµ·å‹•ćÆå¾Œć§č”Œć„ć¾ć™ć€‚

Apache Hive惔ć‚æć‚¹ćƒˆć‚¢ć‚µćƒ¼ćƒ“ć‚¹ćÆ态3.0.0悒ä½æē”Øć—ć¾ć™ć€‚ć¾ćŸć€172.17.0.3ć§å‹•ä½œć•ć›ć¾ć™ć€‚

Apache Hive惔ć‚æć‚¹ćƒˆć‚¢ć‚µćƒ¼ćƒ“ć‚¹ć«åæ…要ćŖ态Apache Hadoopć®ćƒćƒ¼ć‚øćƒ§ćƒ³ć€‚

$ /opt/hadoop/bin/hadoop version
Hadoop 3.3.4
Source code repository https://github.com/apache/hadoop.git -r a585a73c3e02ac62350c136643a5e7f6095a3dbb
Compiled by stevel on 2022-07-29T12:32Z
Compiled with protoc 3.7.1
From source with checksum fb9dd8918a7b8a5b430d61af858f6ec
This command was run using /opt/hadoop/share/hadoop/common/hadoop-common-3.3.4.jar

ćŖ恊态Trino悂Apache Hive惔ć‚æć‚¹ćƒˆć‚¢ć‚µćƒ¼ćƒ“ć‚¹ć‚‚ä»„äø‹ć®Javać§å‹•ä½œć•ć›ć¾ć™ć€‚

$ java --version
openjdk 17.0.4 2022-07-19
OpenJDK Runtime Environment Temurin-17.0.4+8 (build 17.0.4+8)
OpenJDK 64-Bit Server VM Temurin-17.0.4+8 (build 17.0.4+8, mixed mode, sharing)

MinIOļ¼ˆAmazon S3ļ¼‰ć«ć‚¢ć‚Æć‚»ć‚¹ć™ć‚‹ćŸć‚ć®ć€ē’°å¢ƒå¤‰ę•°ć®čح定ćÆ恓恔悉怂

$ export HADOOP_HOME=/opt/hadoop

$ AWS_SDK_JAR=$(find ${HADOOP_HOME}/share/hadoop/tools/lib -name 'aws-java-sdk-bundle-*.jar')
$ HADOOP_AWS_JAR=$(find ${HADOOP_HOME}/share/hadoop/tools/lib -name 'hadoop-aws-*.jar')

$ export HADOOP_CLASSPATH=${AWS_SDK_JAR}:${HADOOP_AWS_JAR}

Apache Hive惔ć‚æć‚¹ćƒˆć‚¢ć‚µćƒ¼ćƒ“ć‚¹ć®čµ·å‹•ćÆć€å¾Œć§č”Œć„ć¾ć™ć€‚

MinIOć®ćƒćƒ¼ć‚øćƒ§ćƒ³ćÆ态恓恔悉怂

$ minio --version
minio version RELEASE.2022-09-07T22-25-02Z (commit-id=bb855499e1519f31c03c9b91c0f9f10cb6439253)
Runtime: go1.18.6 linux/amd64
License: GNU AGPLv3 <https://www.gnu.org/licenses/agpl-3.0.html>
Copyright: 2015-2022 MinIO, Inc.

MinIOćÆ仄äø‹ć®ć‚³ćƒžćƒ³ćƒ‰ć§čµ·å‹•ć—态172.17.0.4ć§å‹•ä½œć—ć¦ć„ć‚‹ć‚‚ć®ćØć—ć¾ć™ć€‚

$ MINIO_ROOT_USER=minioadmin MINIO_ROOT_PASSWORD=minioadmin minio server /var/lib/minio/data --console-address :9001

MinIOę“ä½œē”Ø恮AWS CLIć®ćƒćƒ¼ć‚øćƒ§ćƒ³ć€‚

$ aws --version
aws-cli/2.7.31 Python/3.9.11 Linux/5.4.0-125-generic exe/x86_64.ubuntu.20 prompt/off

ć‚Æćƒ¬ćƒ‡ćƒ³ć‚·ćƒ£ćƒ«ć®čØ­å®šć€‚

$ export AWS_ACCESS_KEY_ID=minioadmin
$ export AWS_SECRET_ACCESS_KEY=minioadmin
$ export AWS_DEFAULT_REGION=ap-northeast-1

MySQL恫恤恄恦ćÆ恓恔悉怂172.17.0.5ć§å‹•ä½œć—ć¦ć„ć‚‹ć‚‚ć®ćØ恗态ꎄē¶šęƒ…å ±ćÆkazuhiraļ¼password恧态practicećØć„ć†ćƒ‡ćƒ¼ć‚æćƒ™ćƒ¼ć‚¹ć‚’
ä½œęˆęøˆćæćØć—ć¾ć™ć€‚

$ mysql --version
mysql  Ver 8.0.30 for Linux on x86_64 (MySQL Community Server - GPL)

ćƒ‡ćƒ¼ć‚æ恮ęŗ–å‚™

ć¾ćšćÆćƒ‡ćƒ¼ć‚æ恮ęŗ–å‚™ć‚’ć—ć¾ć—ć‚‡ć†ć€‚

MinIO恫ćÆCSVćƒ•ć‚”ć‚¤ćƒ«ć‚’ē½®ćć“ćØć«ć—ć¾ć™ć€‚ćŠé”ŒćÆ悵悶ć‚Ø恕悓恧态1č”Œē›®ć‚’ćƒ˜ćƒƒćƒ€ćƒ¼ć«ć—ćŸCSVćƒ•ć‚”ć‚¤ćƒ«ć‚’3恤ē”Øę„ć—ć¾ć™ć€‚

isono-family.csv

family_id,id,first_name,last_name,age
1,1,悵悶ć‚Ø,惕悰ē”°,24
1,2,ćƒžć‚¹ć‚Ŗ,惕悰ē”°,28
1,3,ę³¢å¹³,ē£Æ野,54
1,4,惕惍,ē£Æ野,50
1,5,ć‚«ćƒ„ć‚Ŗ,ē£Æ野,11
1,6,ćƒÆć‚«ćƒ”,ē£Æ野,9
1,7,ć‚æ惩ć‚Ŗ,惕悰ē”°,3

namino-family.csv

family_id,id,first_name,last_name,age
2,1,惎ćƒŖć‚¹ć‚±,ę³¢é‡Ž,26
2,2,ć‚æć‚¤ć‚³,ę³¢é‡Ž,22
2,3,悤ć‚Æ惩,ę³¢é‡Ž,1

isasaka-family.csv

family_id,id,first_name,last_name,age
3,1,難ē‰©,伊佐坂,60
3,2,ćŠč»½,伊佐坂,50
3,3,ē”šå…­,伊佐坂,20
3,4,ęµ®ę±Ÿ,伊佐,16

MinIO恮ć‚Øćƒ³ćƒ‰ćƒć‚¤ćƒ³ćƒˆćÆ恓恔悉怂

$ MINIO_ENDPOINT=http://172.17.0.4:9000

ćƒć‚±ćƒƒćƒˆć‚’ä½œęˆć—ć¦

$ aws s3 mb --endpoint-url $MINIO_ENDPOINT s3://trino-bucket

syncć§ć‚¢ćƒƒćƒ—ćƒ­ćƒ¼ćƒ‰ć€‚

$ aws s3 sync --endpoint-url $MINIO_ENDPOINT . s3://trino-bucket/files
upload: ./isono-family.csv to s3://trino-bucket/files/isono-family.csv
upload: ./isasaka-family.csv to s3://trino-bucket/files/isasaka-family.csv
upload: ./namino-family.csv to s3://trino-bucket/files/namino-family.csv

ē¢ŗčŖć€‚

$ aws s3 ls --endpoint-url $MINIO_ENDPOINT trino-bucket/files/
2022-09-13 22:46:43        131 isasaka-family.csv
2022-09-13 22:46:43        207 isono-family.csv
2022-09-13 22:46:43        112 namino-family.csv

MySQLå“ć«ć‚‚ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’ä½œęˆć—ć¾ć™ć€‚MinIOć«ć‚¢ćƒƒćƒ—ćƒ­ćƒ¼ćƒ‰ć—ćŸćƒ•ć‚”ć‚¤ćƒ«ćØjoinć™ć‚‹ęƒ³å®šć®ć‚‚ć®ć§ć™ć€‚

create table family(
  id integer,
  name varchar(20),
  primary key(id)
);

ćƒ‡ćƒ¼ć‚æ恮ē™»éŒ²ć€‚

insert into family(id, name) values(1, 'ē£Æ野家');
insert into family(id, name) values(2, 'ę³¢é‡Žå®¶');
insert into family(id, name) values(3, '伊佐坂');

ē¢ŗčŖć€‚

mysql> select * from family;
+----+-----------+
| id | name      |
+----+-----------+
|  1 | ē£Æ野家    |
|  2 | ę³¢é‡Žå®¶    |
|  3 | 伊佐坂    |
+----+-----------+
3 rows in set (0.00 sec)

ć“ć‚Œć§ć€ćƒ‡ćƒ¼ć‚æ恮ęŗ–å‚™ćÆ完äŗ†ć§ć™ć€‚

TrinoćØApache Hive惔ć‚æć‚¹ćƒˆć‚¢ć‚µćƒ¼ćƒ“ć‚¹ć®ęŗ–å‚™

ē¶šć„恦ćÆ态TrinoćŠć‚ˆć³Apache Hive惔ć‚æć‚¹ćƒˆć‚¢ć‚µćƒ¼ćƒ“ć‚¹ć®ęŗ–å‚™ć‚’č”Œć„ć¾ć™ć€‚

ć¾ćšćÆApache Hive惔ć‚æć‚¹ćƒˆć‚¢ć‚µćƒ¼ćƒ“ć‚¹ć‹ć‚‰č”Œć„ć¾ć—ć‚‡ć†ć€‚MinIOć«ć‚¢ć‚Æć‚»ć‚¹ć™ć‚‹ćŸć‚ć€ä»„äø‹ć®ć‚ˆć†ć«čØ­å®šć€‚

conf/metastore-site.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<!-- These are default values meant to allow easy smoke testing of the metastore.  You will
likely need to add a number of new values. -->
<configuration>
  <property>
    <name>metastore.thrift.uris</name>
    <value>thrift://0.0.0.0:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
  <property>
    <name>metastore.task.threads.always</name>
    <value>org.apache.hadoop.hive.metastore.events.EventCleanerTask</value>
  </property>
  <property>
    <name>metastore.expression.proxy</name>
    <value>org.apache.hadoop.hive.metastore.DefaultPartitionExpressionProxy</value>
  </property>

  <property>
    <name>fs.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>minioadmin</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>minioadmin</value>
  </property>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>http://172.17.0.4:9000</value>
  </property>
  <property>
    <name>fs.s3a.path.style.access</name>
    <value>true</value>
  </property>
</configuration>

čµ·å‹•ć€‚

$ bin/schematool -initSchema -dbType derby
$ bin/start-metastore

ꬔ恫态Trino恮čØ­å®šć‚’č”Œć„ć¾ć™ć€‚

Trinoč‡Ŗä½“ć®čØ­å®šć€‚

etc/node.properties

node.environment=container
node.id=340fae6b-55fe-486e-b122-d0fbe61d0ebb
node.data-dir=/var/lib/trino-server/data

etc/jvm.config

-server
-Xmx2G
-XX:InitialRAMPercentage=80
-XX:MaxRAMPercentage=80
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow
-XX:ReservedCodeCacheSize=512M
-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000
-XX:+UnlockDiagnosticVMOptions
-XX:+UseAESCTRIntrinsics

etc/config.properties

coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
discovery.uri=http://172.17.0.2:8080

ē¶šć„恦态Connector恮čØ­å®šć§ć™ć€‚MinIOćŠć‚ˆć³MySQLć«ć‚¢ć‚Æć‚»ć‚¹ć™ć‚‹ćŸć‚ć®čØ­å®šć‚’č”Œć„ć¾ć™ć€‚

MinIO怂悫ć‚æćƒ­ć‚°åćÆminioćØć—ć¦ć„ć¾ć™ć€‚

etc/catalog/minio.properties

connector.name=hive
hive.metastore.uri=thrift://172.17.0.3:9083
hive.storage-format=ORC
hive.non-managed-table-writes-enabled=true
hive.non-managed-table-creates-enabled=true

hive.s3.aws-access-key=minioadmin
hive.s3.aws-secret-key=minioadmin
hive.s3.endpoint=http://172.17.0.4:9000
hive.s3.path-style-access=true
#hive.s3select-pushdown.enabled=true

MySQL怂悫ć‚æćƒ­ć‚°åćÆmysqlćØć—ć¦ć„ć¾ć™ć€‚

etc/catalog/mysql.properties

connector.name=mysql
connection-url=jdbc:mysql://172.17.0.5:3306
connection-user=kazuhira
connection-password=password

恓悌恧态Trinoć‚’čµ·å‹•ć€‚

$ bin/launcher run

Trino恋悉MinIOćØMySQLć«ć‚¢ć‚Æć‚»ć‚¹ć—ć¦ćæ悋

ęŗ–å‚™ćŒå®Œäŗ†ć—ćŸć®ć§ć€Trino恋悉MinIOćŠć‚ˆć³MySQLć«ć‚¢ć‚Æć‚»ć‚¹ć—ć¦ćæć¾ć—ć‚‡ć†ć€‚

Trino恫ꎄē¶šć€‚

$ trino --server 172.17.0.2:8080
trino>

ć¾ćšć€MinIOć«å‘ć‘ć¦ć‚¹ć‚­ćƒ¼ćƒžć‚’ä½œęˆć—ć¾ć™ć€‚

trino> create schema minio.bucket with(location = 's3a://trino-bucket/');
CREATE SCHEMA

MinIOć®ć‚¢ćƒƒćƒ—ćƒ­ćƒ¼ćƒ‰ć—ćŸCSVćƒ•ć‚”ć‚¤ćƒ«ć‚’å‚ē…§ć™ć‚‹ć‚ˆć†ć«ć€ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’ä½œęˆć€‚

create table minio.bucket.people (
  family_id varchar,
  id varchar,
  first_name varchar,
  last_name varchar,
  age varchar
) with (
  format = 'csv',
  csv_separator = ',',
  csv_quote = '"',
  csv_escape = '"',
  skip_header_line_count = 1,
  external_location = 's3a://trino-bucket/files'
);

ē¢ŗčŖć€‚

trino> select * from minio.bucket.people;
 family_id | id | first_name | last_name | age
-----------+----+------------+-----------+-----
 2         | 1  | 惎ćƒŖć‚¹ć‚±   | ę³¢é‡Ž      | 26
 2         | 2  | ć‚æć‚¤ć‚³     | ę³¢é‡Ž      | 22
 2         | 3  | 悤ć‚Æ惩     | ę³¢é‡Ž      | 1
 1         | 1  | 悵悶ć‚Ø     | 惕悰ē”°    | 24
 1         | 2  | ćƒžć‚¹ć‚Ŗ     | 惕悰ē”°    | 28
 1         | 3  | ę³¢å¹³       | ē£Æ野      | 54
 1         | 4  | 惕惍       | ē£Æ野      | 50
 1         | 5  | ć‚«ćƒ„ć‚Ŗ     | ē£Æ野      | 11
 1         | 6  | ćƒÆć‚«ćƒ”     | ē£Æ野      | 9
 1         | 7  | ć‚æ惩ć‚Ŗ     | 惕悰ē”°    | 3
 3         | 1  | 難ē‰©       | 伊佐坂    | 60
 3         | 2  | ćŠč»½       | 伊佐坂    | 50
 3         | 3  | ē”šå…­       | 伊佐坂    | 20
 3         | 4  | ęµ®ę±Ÿ       | 伊佐      | 16
(14 rows)

Query 20220913_134818_00002_68bq6, FINISHED, 1 node
Splits: 3 total, 3 done (100.00%)
1.79 [14 rows, 450B] [7 rows/s, 251B/s]

OK恧恙恭怂

MySQL偓ćÆć€ć‚ć‚‰ć‹ć˜ć‚ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’ä½œęˆć—ć¦ć„ć‚‹ćŸć‚ć€Trino恋悉ćÆ恙恐恫čŖč­˜ć§ćć¾ć™ć€‚

trino> show tables from mysql.practice;
 Table
--------
 family
(1 row)

Query 20220913_134822_00003_68bq6, FINISHED, 1 node
Splits: 7 total, 7 done (100.00%)
1.08 [1 rows, 24B] [0 rows/s, 22B/s]

ćƒ†ćƒ¼ćƒ–ćƒ«å®šē¾©ć‚„

trino> desc mysql.practice.family;
 Column |    Type     | Extra | Comment
--------+-------------+-------+---------
 id     | integer     |       |
 name   | varchar(20) |       |
(2 rows)

Query 20220913_134852_00004_68bq6, FINISHED, 1 node
Splits: 7 total, 7 done (100.00%)
0.61 [2 rows, 120B] [3 rows/s, 197B/s]

ćƒ‡ćƒ¼ć‚æ恮äø­čŗ«ć‚‚ē¢ŗčŖć§ćć¾ć™ć­ć€‚

trino> select * from mysql.practice.family;
 id |  name
----+--------
  1 | ē£Æ野家
  2 | ę³¢é‡Žå®¶
  3 | 伊佐坂
(3 rows)

Query 20220913_134903_00005_68bq6, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.43 [3 rows, 0B] [6 rows/s, 0B/s]

恧ćÆ态恓恮2ć¤ć®ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’join恗恦ćæć¾ć—ć‚‡ć†ć€‚

SELECT — Trino 395 Documentation

Trino恧ćÆ态仄äø‹ć®join恮ēØ®é”žć‚’ć‚µćƒćƒ¼ćƒˆć—ć¦ć„ć‚‹ć‚ˆć†ć§ć™ć€‚

[ INNER ] JOIN
LEFT [ OUTER ] JOIN
RIGHT [ OUTER ] JOIN
FULL [ OUTER ] JOIN
CROSS JOIN

SQLć‚’ä½œęˆć—ć¦

select p.id, f.name as family_name, p.first_name, p.last_name, p.age
from minio.bucket.people as p
inner join mysql.practice.family as f on cast(p.family_id as integer) = f.id
order by family_name, cast(p.age as integer) desc;

ē¢ŗčŖć€‚

trino> select p.id, f.name as family_name, p.first_name, p.last_name, p.age
    -> from minio.bucket.people as p
    -> inner join mysql.practice.family as f on cast(p.family_id as integer) = f.id
    -> order by family_name, cast(p.age as integer) desc;
 id | family_name | first_name | last_name | age
----+-------------+------------+-----------+-----
 1  | 伊佐坂      | 難ē‰©       | 伊佐坂    | 60
 2  | 伊佐坂      | ćŠč»½       | 伊佐坂    | 50
 3  | 伊佐坂      | ē”šå…­       | 伊佐坂    | 20
 4  | 伊佐坂      | ęµ®ę±Ÿ       | 伊佐      | 16
 1  | ę³¢é‡Žå®¶      | 惎ćƒŖć‚¹ć‚±   | ę³¢é‡Ž      | 26
 2  | ę³¢é‡Žå®¶      | ć‚æć‚¤ć‚³     | ę³¢é‡Ž      | 22
 3  | ę³¢é‡Žå®¶      | 悤ć‚Æ惩     | ę³¢é‡Ž      | 1
 3  | ē£Æ野家      | ę³¢å¹³       | ē£Æ野      | 54
 4  | ē£Æ野家      | 惕惍       | ē£Æ野      | 50
 2  | ē£Æ野家      | ćƒžć‚¹ć‚Ŗ     | 惕悰ē”°    | 28
 1  | ē£Æ野家      | 悵悶ć‚Ø     | 惕悰ē”°    | 24
 5  | ē£Æ野家      | ć‚«ćƒ„ć‚Ŗ     | ē£Æ野      | 11
 6  | ē£Æ野家      | ćƒÆć‚«ćƒ”     | ē£Æ野      | 9
 7  | ē£Æ野家      | ć‚æ惩ć‚Ŗ     | 惕悰ē”°    | 3
(14 rows)

Query 20220913_134916_00006_68bq6, FINISHED, 1 node
Splits: 15 total, 15 done (100.00%)
1.05 [20 rows, 606B] [19 rows/s, 578B/s]

ć™ć”ćć‚ć£ć•ć‚Šå‹•ćć¾ć—ćŸć€‚

今回CSVćƒ•ć‚”ć‚¤ćƒ«ć‹ć‚‰ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’ä½œć£ć¦ć„ć‚‹ć®ć§ć€ć™ć¹ć¦ć®ć‚«ćƒ©ćƒ ćŒę–‡å­—åˆ—ć«ćŖć£ć¦ć„ć¾ć™ć€‚ć“ć®ć¾ć¾ć ćØMySQLć«ć‚ć‚‹ćƒ†ćƒ¼ćƒ–ćƒ«ć®åˆ—ćØ
joinć™ć‚‹éš›ć«å›°ć‚‹ć®ć§ć‚­ćƒ£ć‚¹ćƒˆć—ć¦ć„ćŸć®ć§ć™ćŒć€‚

ć‚‚ć—ć‚‚ć‚­ćƒ£ć‚¹ćƒˆć‚’ć‚„ć‚ćŸå “åˆćÆ态仄äø‹ć®ć‚ˆć†ć«åž‹ćŒåˆć‚ćšć«ć‚Øćƒ©ćƒ¼ć«ćŖć‚Šć¾ć™ć€‚

trino> select p.id, f.name as family_name, p.first_name, p.last_name, p.age
    -> from minio.bucket.people as p
    -> inner join mysql.practice.family as f on p.family_id as integer = f.id
    -> order by family_name, cast(p.age as integer) desc;
Query 20220912_162900_00017_kiyr9 failed: line 3:54: mismatched input 'as'. Expecting: '%', '*', '+', ',', '-', '.', '/', 'AND', 'AT', 'CROSS', 'EXCEPT', 'FETCH', 'FULL', 'GROUP', 'HAVING', 'INNER', 'INTERSECT', 'JOIN', 'LEFT', 'LIMIT', 'NATURAL', 'OFFSET', 'OR', 'ORDER', 'RIGHT', 'UNION', 'WHERE', 'WINDOW', '[', '||', <EOF>, <predicate>
select p.id, f.name as family_name, p.first_name, p.last_name, p.age
from minio.bucket.people as p
inner join mysql.practice.family as f on p.family_id as integer = f.id
order by family_name, cast(p.age as integer) desc

ć“ć®ć‚ćŸć‚ŠćÆć€ć‚†ć‚‹ć£ćØćÆ恄恋ćŖ恄悈恆恧恙恭怂

恂ćØć€ć›ć£ć‹ććŖ恮恧explainć‚‚č”Œć£ć¦ćæć¾ć—ć‚‡ć†ć€‚

å…ˆć«ēµ±čØˆęƒ…å ±ć‚’č¦‹ć¦ćŠćć¾ć™ć€‚

SHOW STATS — Trino 395 Documentation

trino> show stats for minio.bucket.people;
 column_name | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value
-------------+-----------+-----------------------+----------------+-----------+-----------+------------
 family_id   |      NULL |                  NULL |           NULL |      NULL | NULL      | NULL
 id          |      NULL |                  NULL |           NULL |      NULL | NULL      | NULL
 first_name  |      NULL |                  NULL |           NULL |      NULL | NULL      | NULL
 last_name   |      NULL |                  NULL |           NULL |      NULL | NULL      | NULL
 age         |      NULL |                  NULL |           NULL |      NULL | NULL      | NULL
 NULL        |      NULL |                  NULL |           NULL |      NULL | NULL      | NULL
(6 rows)

Query 20220913_135248_00007_68bq6, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.27 [0 rows, 0B] [0 rows/s, 0B/s]

trino> show stats for mysql.practice.family;
 column_name | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value
-------------+-----------+-----------------------+----------------+-----------+-----------+------------
 id          |      NULL |                   2.0 |            0.0 |      NULL | NULL      | NULL
 name        |      NULL |                  NULL |           NULL |      NULL | NULL      | NULL
 NULL        |      NULL |                  NULL |           NULL |       3.0 | NULL      | NULL
(3 rows)

Query 20220913_135305_00008_68bq6, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.26 [0 rows, 0B] [0 rows/s, 0B/s]

MinIOå“ć®ēµ±čØˆęƒ…å ±ćÆå…ØéƒØnull恧恙恭怂

analyze恗恦ćæć¾ć™ć€‚

ANALYZE — Trino 395 Documentation

trino> analyze minio.bucket.people;
ANALYZE: 14 rows

Query 20220913_135411_00009_68bq6, FINISHED, 1 node
Splits: 13 total, 13 done (100.00%)
1.15 [14 rows, 450B] [12 rows/s, 392B/s]

trino> analyze mysql.practice.family;
Query 20220913_135421_00010_68bq6 failed: This connector does not support analyze

MySQL connectorćÆanalyzeć‚’ć‚µćƒćƒ¼ćƒˆć—ć¦ć„ćŖ恄ćØčØ€ć‚ć‚Œć¾ć—ćŸć€‚

再åŗ¦ēµ±čØˆęƒ…å ±ć‚’ē¢ŗčŖć€‚

trino> show stats for minio.bucket.people;
 column_name | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value
-------------+-----------+-----------------------+----------------+-----------+-----------+------------
 family_id   |      14.0 |                   3.0 |            0.0 |      NULL | NULL      | NULL
 id          |      14.0 |                   7.0 |            0.0 |      NULL | NULL      | NULL
 first_name  |     111.0 |                  14.0 |            0.0 |      NULL | NULL      | NULL
 last_name   |     102.0 |                   5.0 |            0.0 |      NULL | NULL      | NULL
 age         |      25.0 |                  13.0 |            0.0 |      NULL | NULL      | NULL
 NULL        |      NULL |                  NULL |           NULL |      14.0 | NULL      | NULL
(6 rows)

Query 20220913_135513_00011_68bq6, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.24 [0 rows, 0B] [0 rows/s, 0B/s]

trino> show stats for mysql.practice.family;
 column_name | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value
-------------+-----------+-----------------------+----------------+-----------+-----------+------------
 id          |      NULL |                   2.0 |            0.0 |      NULL | NULL      | NULL
 name        |      NULL |                  NULL |           NULL |      NULL | NULL      | NULL
 NULL        |      NULL |                  NULL |           NULL |       3.0 | NULL      | NULL
(3 rows)

Query 20220913_135516_00012_68bq6, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.25 [0 rows, 0B] [0 rows/s, 0B/s]

MinIOå“ć«ć‚‚ć€ēµ±čØˆęƒ…å ±ćŒå…„ć‚Šć¾ć—ćŸć€‚

恧ćÆ态explain恗恦ćæć¾ć™ć€‚

Cost in EXPLAIN — Trino 395 Documentation

ēµęžœć€‚

trino> explain select p.id, f.name as family_name, p.first_name, p.last_name, p.age
    -> from minio.bucket.people as p
    -> inner join mysql.practice.family as f on cast(p.family_id as integer) = f.id
    -> order by family_name, cast(p.age as integer) desc;
                                                               Query Plan
-----------------------------------------------------------------------------------------------------------------------------------------
 Fragment 0 [SINGLE]
     Output layout: [id, name, first_name, last_name, age]
     Output partitioning: SINGLE []
     Output[columnNames = [id, family_name, first_name, last_name, age]]
     ā”‚   Layout: [id:varchar, name:varchar(20), first_name:varchar, last_name:varchar, age:varchar]
     ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
     ā”‚   family_name := name
     ā””ā”€ Project[]
        ā”‚   Layout: [id:varchar, first_name:varchar, last_name:varchar, age:varchar, name:varchar(20)]
        ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
        ā””ā”€ RemoteMerge[sourceFragmentIds = [1]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:

 Fragment 1 [ROUND_ROBIN]
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: SINGLE []
     LocalMerge[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
     ā””ā”€ PartialSort[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
        ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
        ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
        ā””ā”€ RemoteSource[sourceFragmentIds = [2]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:

 Fragment 2 [SOURCE]
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: ROUND_ROBIN []
     Project[]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   Estimates: {rows: 14 (1.34kB), cpu: 5.80k, memory: 207B, network: 207B}
     ā”‚   expr_1 := CAST("age" AS integer)
     ā””ā”€ InnerJoin[criteria = ("expr" = "id_0"), hash = [$hashvalue, $hashvalue_2], distribution = REPLICATED]
        ā”‚   Layout: [last_name:varchar, id:varchar, first_name:varchar, age:varchar, name:varchar(20)]
        ā”‚   Estimates: {rows: 14 (1.27kB), cpu: 4.46k, memory: 207B, network: 207B}
        ā”‚   Distribution: REPLICATED
        ā”œā”€ Project[]
        ā”‚  ā”‚   Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar, $hashvalue:bigint]
        ā”‚  ā”‚   Estimates: {rows: 14 (728B), cpu: 1.90k, memory: 0B, network: 0B}
        ā”‚  ā”‚   $hashvalue := combine_hash(bigint '0', COALESCE("$operator$hash_code"("expr"), 0))
        ā”‚  ā””ā”€ ScanProject[table = minio:bucket:people]
        ā”‚         Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar]
        ā”‚         Estimates: {rows: 14 (602B), cpu: 616, memory: 0B, network: 0B}/{rows: 14 (602B), cpu: 1.19k, memory: 0B, network: 0B}
        ā”‚         expr := CAST("family_id" AS integer)
        ā”‚         family_id := family_id:string:REGULAR
        ā”‚         last_name := last_name:string:REGULAR
        ā”‚         id := id:string:REGULAR
        ā”‚         first_name := first_name:string:REGULAR
        ā”‚         age := age:string:REGULAR
        ā””ā”€ LocalExchange[partitioning = SINGLE]
           ā”‚   Layout: [id_0:integer, name:varchar(20), $hashvalue_2:bigint]
           ā”‚   Estimates: {rows: 3 (207B), cpu: 387, memory: 0B, network: 207B}
           ā””ā”€ RemoteSource[sourceFragmentIds = [3]]
                  Layout: [id_0:integer, name:varchar(20), $hashvalue_3:bigint]
                  Estimates:

 Fragment 3 [SOURCE]
     Output layout: [id_0, name, $hashvalue_4]
     Output partitioning: BROADCAST []
     ScanProject[table = mysql:practice.family practice.family]
         Layout: [id_0:integer, name:varchar(20), $hashvalue_4:bigint]
         Estimates: {rows: 3 (207B), cpu: 180, memory: 0B, network: 0B}/{rows: 3 (207B), cpu: 387, memory: 0B, network: 0B}
         $hashvalue_4 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("id_0"), 0))
         name := name:varchar(20):VARCHAR
         id_0 := id:integer:INT


(1 row)

Query 20220913_135715_00013_68bq6, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.78 [0 rows, 0B] [0 rows/s, 0B/s]

explainćÆ态仄äø‹ć®ć“ćØ恌åÆčƒ½ćŖ悈恆恧恙怂

ćƒ‡ćƒ•ć‚©ćƒ«ćƒˆćÆåˆ†ę•£å®Ÿč”Œč؈ē”»ļ¼ˆDISTRIBUTEDļ¼‰ć‚’ćƒ†ć‚­ć‚¹ćƒˆå½¢å¼ć§č”Øē¤ŗć—ć¾ć™ć€‚

å®Ÿč”Œä¾‹ć‚’å¤‰ćˆć¦ćæć¾ć—ć‚‡ć†ć€‚č«–ē†å®Ÿč”Œč؈ē”»ļ¼ˆLOGICALļ¼‰ć‚’ćƒ†ć‚­ć‚¹ćƒˆå½¢å¼ć§č”Øē¤ŗ恙悋ćØ态恓悓ćŖę„Ÿć˜ć«ćŖć‚Šć¾ć—ćŸć€‚

trino> explain(type logical, format text) select p.id, f.name as family_name, p.first_name, p.last_name, p.age
    -> from minio.bucket.people as p
    -> inner join mysql.practice.family as f on cast(p.family_id as integer) = f.id
    -> order by family_name, cast(p.age as integer) desc;
                                                                      Query Plan
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Output[columnNames = [id, family_name, first_name, last_name, age]]
 ā”‚   Layout: [id:varchar, name:varchar(20), first_name:varchar, last_name:varchar, age:varchar]
 ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
 ā”‚   family_name := name
 ā””ā”€ Project[]
    ā”‚   Layout: [id:varchar, first_name:varchar, last_name:varchar, age:varchar, name:varchar(20)]
    ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
    ā””ā”€ RemoteMerge[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
       ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
       ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
       ā””ā”€ LocalMerge[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
          ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
          ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
          ā””ā”€ PartialSort[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
             ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
             ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
             ā””ā”€ RemoteExchange[type = REPARTITION]
                ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
                ā”‚   Estimates: {rows: 14 (1.34kB), cpu: 7.14k, memory: 207B, network: 1.54kB}
                ā””ā”€ Project[]
                   ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
                   ā”‚   Estimates: {rows: 14 (1.34kB), cpu: 5.80k, memory: 207B, network: 207B}
                   ā”‚   expr_1 := CAST("age" AS integer)
                   ā””ā”€ InnerJoin[criteria = ("expr" = "id_0"), hash = [$hashvalue, $hashvalue_2], distribution = REPLICATED]
                      ā”‚   Layout: [last_name:varchar, id:varchar, first_name:varchar, age:varchar, name:varchar(20)]
                      ā”‚   Estimates: {rows: 14 (1.27kB), cpu: 4.46k, memory: 207B, network: 207B}
                      ā”‚   Distribution: REPLICATED
                      ā”œā”€ Project[]
                      ā”‚  ā”‚   Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar, $hashvalue:bigint]
                      ā”‚  ā”‚   Estimates: {rows: 14 (728B), cpu: 1.90k, memory: 0B, network: 0B}
                      ā”‚  ā”‚   $hashvalue := combine_hash(bigint '0', COALESCE("$operator$hash_code"("expr"), 0))
                      ā”‚  ā””ā”€ ScanProject[table = minio:bucket:people]
                      ā”‚         Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar]
                      ā”‚         Estimates: {rows: 14 (602B), cpu: 616, memory: 0B, network: 0B}/{rows: 14 (602B), cpu: 1.19k, memory: 0B, network: 0B}
                      ā”‚         expr := CAST("family_id" AS integer)
                      ā”‚         family_id := family_id:string:REGULAR
                      ā”‚         last_name := last_name:string:REGULAR
                      ā”‚         id := id:string:REGULAR
                      ā”‚         first_name := first_name:string:REGULAR
                      ā”‚         age := age:string:REGULAR
                      ā””ā”€ LocalExchange[partitioning = SINGLE]
                         ā”‚   Layout: [id_0:integer, name:varchar(20), $hashvalue_2:bigint]
                         ā”‚   Estimates: {rows: 3 (207B), cpu: 387, memory: 0B, network: 207B}
                         ā””ā”€ RemoteExchange[type = REPLICATE]
                            ā”‚   Layout: [id_0:integer, name:varchar(20), $hashvalue_3:bigint]
                            ā”‚   Estimates: {rows: 3 (207B), cpu: 387, memory: 0B, network: 207B}
                            ā””ā”€ ScanProject[table = mysql:practice.family practice.family]
                                   Layout: [id_0:integer, name:varchar(20), $hashvalue_4:bigint]
                                   Estimates: {rows: 3 (207B), cpu: 180, memory: 0B, network: 0B}/{rows: 3 (207B), cpu: 387, memory: 0B, network: 0B}
                                   $hashvalue_4 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("id_0"), 0))
                                   name := name:varchar(20):VARCHAR
                                   id_0 := id:integer:INT

(1 row)

Query 20220913_140407_00015_68bq6, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.34 [0 rows, 0B] [0 rows/s, 0B/s]

explain恮čŖ¬ę˜Žć«ćÆć€åˆ†ę•£å®Ÿč”Œč؈ē”»ć®čŖ­ćæę–¹ćŒę›øć‹ć‚Œć¦ć„ć‚‹ć®ć§ć“ć”ć‚‰ć‚’č¦‹ć¦ćæć¾ć—ć‚‡ć†ć€‚

åˆ†ę•£ćƒ—ćƒ©ćƒ³ć®å„ćƒ—ćƒ©ćƒ³ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆćÆć€ć²ćØć¤ć¾ćŸćÆč¤‡ę•°ćƒŽćƒ¼ćƒ‰ć§ć®TrinoćƒŽćƒ¼ćƒ‰ć§å®Ÿč”Œć•ć‚Œć‚‹ć“ćØ悒č”Øć—ć¦ć„ć¾ć™ć€‚

Each plan fragment of the distributed plan is executed by a single or multiple Trino nodes.

ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆćŒåˆ†ć‹ć‚Œć¦ć„ć‚‹ćØ恄恆恓ćØćÆ态TrinoćƒŽćƒ¼ćƒ‰é–“ć§ćƒ‡ćƒ¼ć‚æ恮äŗ¤ę›ćŒč”Œć‚ć‚Œć‚‹ć“ćØ悒č”Øć—ć¦ć„ć¾ć™ć€‚

Fragments separation represent the data exchange between Trino nodes.

ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆć®ēخ锞ćÆć€ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆćŒTrinoćƒŽćƒ¼ćƒ‰ć«ć‚ˆć£ć¦ć©ć®ć‚ˆć†ć«å®Ÿč”Œć•ć‚Œć‚‹ć‹ć€ćć—ć¦ć©ć®ć‚ˆć†ć«ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆé–“ć§ćƒ‡ćƒ¼ć‚æćŒåˆ†ę•£ć•ć‚Œć‚‹ć‹ć‚’
ē¤ŗć—ć¦ć„ć¾ć™ć€‚

Fragment type specifies how the fragment is executed by Trino nodes and how the data is distributed between fragments:

ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆć®ēخ锞ćÆ态ꬔ恮4ć¤ćŒć‚ć‚Šć¾ć™ć€‚

  • SINGLE ā€¦ ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆćÆ単äø€ć®ćƒŽćƒ¼ćƒ‰ć§å®Ÿč”Œć•ć‚Œć‚‹
  • HASH ā€¦ ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆćÆć€ćƒćƒƒć‚·ćƒ„é–¢ę•°ć‚’ä½æē”Øć—ć¦åˆ†ę•£ć—ćŸå…„åŠ›ćƒ‡ćƒ¼ć‚æ悒ä½æē”Ø恗态å›ŗå®šę•°ć®ćƒŽćƒ¼ćƒ‰ć§å®Ÿč”Œć•ć‚Œć‚‹
  • ROUND_ROBIN ā€¦ ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆćÆć€å…„åŠ›ćƒ‡ćƒ¼ć‚æć‚’ć™ć¹ć¦ć®ćƒŽćƒ¼ćƒ‰ć«ćƒ–ćƒ­ćƒ¼ćƒ‰ć‚­ćƒ£ć‚¹ćƒˆć—ć€å›ŗå®šę•°ć®ćƒŽćƒ¼ćƒ‰ć§å®Ÿč”Œć•ć‚Œć‚‹
  • SOURCE ā€¦ ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆćÆć€åˆ†å‰²å…„åŠ›ć«ć‚¢ć‚Æć‚»ć‚¹ć§ćć‚‹ćƒŽćƒ¼ćƒ‰ć§å®Ÿč”Œć•ć‚Œć‚‹

ć“ć“ć§ć€ęœ€åˆć«å®Ÿč”Œć—ćŸåˆ†ę•£å®Ÿč”Œč؈ē”»ć‚’見čæ”恗恦ćæć¾ć™ć€‚

 Fragment 0 [SINGLE]
     Output layout: [id, name, first_name, last_name, age]
     Output partitioning: SINGLE []
     Output[columnNames = [id, family_name, first_name, last_name, age]]
     ā”‚   Layout: [id:varchar, name:varchar(20), first_name:varchar, last_name:varchar, age:varchar]
     ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
     ā”‚   family_name := name
     ā””ā”€ Project[]
        ā”‚   Layout: [id:varchar, first_name:varchar, last_name:varchar, age:varchar, name:varchar(20)]
        ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
        ā””ā”€ RemoteMerge[sourceFragmentIds = [1]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:

 Fragment 1 [ROUND_ROBIN]
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: SINGLE []
     LocalMerge[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
     ā””ā”€ PartialSort[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
        ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
        ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
        ā””ā”€ RemoteSource[sourceFragmentIds = [2]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:

 Fragment 2 [SOURCE]
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: ROUND_ROBIN []
     Project[]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   Estimates: {rows: 14 (1.34kB), cpu: 5.80k, memory: 207B, network: 207B}
     ā”‚   expr_1 := CAST("age" AS integer)
     ā””ā”€ InnerJoin[criteria = ("expr" = "id_0"), hash = [$hashvalue, $hashvalue_2], distribution = REPLICATED]
        ā”‚   Layout: [last_name:varchar, id:varchar, first_name:varchar, age:varchar, name:varchar(20)]
        ā”‚   Estimates: {rows: 14 (1.27kB), cpu: 4.46k, memory: 207B, network: 207B}
        ā”‚   Distribution: REPLICATED
        ā”œā”€ Project[]
        ā”‚  ā”‚   Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar, $hashvalue:bigint]
        ā”‚  ā”‚   Estimates: {rows: 14 (728B), cpu: 1.90k, memory: 0B, network: 0B}
        ā”‚  ā”‚   $hashvalue := combine_hash(bigint '0', COALESCE("$operator$hash_code"("expr"), 0))
        ā”‚  ā””ā”€ ScanProject[table = minio:bucket:people]
        ā”‚         Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar]
        ā”‚         Estimates: {rows: 14 (602B), cpu: 616, memory: 0B, network: 0B}/{rows: 14 (602B), cpu: 1.19k, memory: 0B, network: 0B}
        ā”‚         expr := CAST("family_id" AS integer)
        ā”‚         family_id := family_id:string:REGULAR
        ā”‚         last_name := last_name:string:REGULAR
        ā”‚         id := id:string:REGULAR
        ā”‚         first_name := first_name:string:REGULAR
        ā”‚         age := age:string:REGULAR
        ā””ā”€ LocalExchange[partitioning = SINGLE]
           ā”‚   Layout: [id_0:integer, name:varchar(20), $hashvalue_2:bigint]
           ā”‚   Estimates: {rows: 3 (207B), cpu: 387, memory: 0B, network: 207B}
           ā””ā”€ RemoteSource[sourceFragmentIds = [3]]
                  Layout: [id_0:integer, name:varchar(20), $hashvalue_3:bigint]
                  Estimates:

 Fragment 3 [SOURCE]
     Output layout: [id_0, name, $hashvalue_4]
     Output partitioning: BROADCAST []
     ScanProject[table = mysql:practice.family practice.family]
         Layout: [id_0:integer, name:varchar(20), $hashvalue_4:bigint]
         Estimates: {rows: 3 (207B), cpu: 180, memory: 0B, network: 0B}/{rows: 3 (207B), cpu: 387, memory: 0B, network: 0B}
         $hashvalue_4 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("id_0"), 0))
         name := name:varchar(20):VARCHAR
         id_0 := id:integer:INT

ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆćÆ4ć¤ć‚ć‚Šć¾ć™ć€‚

 Fragment 0 [SINGLE]

 Fragment 1 [ROUND_ROBIN]

 Fragment 2 [SOURCE]

 Fragment 3 [SOURCE]

SOURCE悒見悋ćØć€å…„åŠ›ćØćŖć‚‹ćƒ†ćƒ¼ćƒ–ćƒ«ć®ęƒ…å ±ćŒē¾ć‚Œć¾ć™ć€‚

恓恔悉ćÆMySQLå“ć€‚ScanProjectćØ恄恆項ē›®ć‚’見悋ćØć€ćƒ†ćƒ¼ćƒ–ćƒ«åćŒå‡ŗć¦ć„ć¾ć™ć­ć€‚

 Fragment 3 [SOURCE]
     Output layout: [id_0, name, $hashvalue_4]
     Output partitioning: BROADCAST []
     ScanProject[table = mysql:practice.family practice.family]
         Layout: [id_0:integer, name:varchar(20), $hashvalue_4:bigint]
         Estimates: {rows: 3 (207B), cpu: 180, memory: 0B, network: 0B}/{rows: 3 (207B), cpu: 387, memory: 0B, network: 0B}
         $hashvalue_4 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("id_0"), 0))
         name := name:varchar(20):VARCHAR
         id_0 := id:integer:INT

MinIOć®ę–¹ćÆć”ć‚‡ć£ćØå„„ć§ć™ćŒć€ć“ć”ć‚‰ć‚‚ScanProjectćŒć‚ć‚Šć¾ć™ć€‚

 Fragment 2 [SOURCE]
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: ROUND_ROBIN []
     Project[]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   怜ēœē•„怜
     ā”‚   
     ā””ā”€ InnerJoin[criteria = ("expr" = "id_0"), hash = [$hashvalue, $hashvalue_2], distribution = REPLICATED]
        ā”‚   Layout: [last_name:varchar, id:varchar, first_name:varchar, age:varchar, name:varchar(20)]
        ā”‚   怜ēœē•„怜
        ā”‚   
        ā”œā”€ Project[]
        ā”‚  ā”‚   Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar, $hashvalue:bigint]
        ā”‚  ā”‚   怜ēœē•„怜
        ā”‚  ā”‚   
        ā”‚  ā””ā”€ ScanProject[table = minio:bucket:people]
        ā”‚         Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar]
        ā”‚         怜ēœē•„怜
        ā”‚         
        ā””ā”€ LocalExchange[partitioning = SINGLE]
           ā”‚   Layout: [id_0:integer, name:varchar(20), $hashvalue_2:bigint]
           ā”‚   Estimates: {rows: 3 (207B), cpu: 387, memory: 0B, network: 207B}
           ā””ā”€ RemoteSource[sourceFragmentIds = [3]]
                  Layout: [id_0:integer, name:varchar(20), $hashvalue_3:bigint]
                  Estimates:

ć‚ˆćč¦‹ć‚‹ćØ态RemoteSourcećØInnerJoin恗恦恄悋恓ćØ悂ę›øć‹ć‚Œć¦ć„ć¾ć™ć­ć€‚

ćć—ć¦ć€ć“ć®ćƒ‡ćƒ¼ć‚æ悒ROUND_ROGINļ¼ˆćƒ–ćƒ­ćƒ¼ćƒ‰ć‚­ćƒ£ć‚¹ćƒˆļ¼‰ć—ć¦ć‚½ćƒ¼ćƒˆć€‚

 Fragment 1 [ROUND_ROBIN]
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: SINGLE []
     LocalMerge[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
     ā””ā”€ PartialSort[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
        ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
        ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
        ā””ā”€ RemoteSource[sourceFragmentIds = [2]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:

恓恓恧ćÆ态RemoteSource恌Fragment 2恧恂悋恓ćØ悂ę›øć‹ć‚Œć¦ć„ć¾ć™ć€‚

ęœ€å¾Œć«å˜äø€ćƒŽćƒ¼ćƒ‰ć«ćƒžćƒ¼ć‚ø态ćØ恄恆ꄟ恘恧恙恭怂

 Fragment 0 [SINGLE]
     Output layout: [id, name, first_name, last_name, age]
     Output partitioning: SINGLE []
     Output[columnNames = [id, family_name, first_name, last_name, age]]
     ā”‚   Layout: [id:varchar, name:varchar(20), first_name:varchar, last_name:varchar, age:varchar]
     ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
     ā”‚   family_name := name
     ā””ā”€ Project[]
        ā”‚   Layout: [id:varchar, first_name:varchar, last_name:varchar, age:varchar, name:varchar(20)]
        ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
        ā””ā”€ RemoteMerge[sourceFragmentIds = [1]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:

ćƒ•ćƒ©ć‚°ćƒ”ćƒ³ćƒˆćÆ态RemoteSourceć‚’č¦‹ć¦ć„ććØ悏恋悋悈恆ćŖę°—ćŒć—ć¾ć™ć€‚

ć“ć†č¦‹ćŸå¾Œć«č«–ē†å®Ÿč”Œč؈ē”»ļ¼ˆLOGICALļ¼‰ć‚’見čæ”恙ćØ态čŖ­ć‚ćć†ćŖę„Ÿć˜ćŒć—ć¦ćć¾ć™ć­ć€‚

explain analyze悒ä½æ恆ćØ态CPUę™‚é–“ć‚„ć‚³ć‚¹ćƒˆć‚‚ē¢ŗčŖć§ćć‚‹ć‚ˆć†ć§ć™ć€‚恓恔悉ćÆåˆ†ę•£å®Ÿč”Œč؈ē”»å›ŗå®šć®ć‚ˆć†ć§ć™ć€‚

EXPLAIN ANALYZE — Trino 395 Documentation

trino> explain analyze select p.id, f.name as family_name, p.first_name, p.last_name, p.age
    -> from minio.bucket.people as p
    -> inner join mysql.practice.family as f on cast(p.family_id as integer) = f.id
    -> order by family_name, cast(p.age as integer) desc;
                                                                                  Query Plan
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Fragment 1 [SINGLE]
     CPU: 3.81ms, Scheduled: 3.86ms, Blocked 542.91ms (Input: 98.34ms, Output: 0.00ns), Input: 14 rows (798B); per task: avg.: 14.00 std.dev.: 0.00, Output: 14 rows (728B)
     Output layout: [id, first_name, last_name, age, name]
     Output partitioning: SINGLE []
     Project[]
     ā”‚   Layout: [id:varchar, first_name:varchar, last_name:varchar, age:varchar, name:varchar(20)]
     ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
     ā”‚   CPU: 1.00ms (1.64%), Scheduled: 1.00ms (1.14%), Blocked: 0.00ns (0.00%), Output: 14 rows (728B)
     ā”‚   Input avg.: 3.50 rows, Input std.dev.: 173.21%
     ā””ā”€ LocalExchange[partitioning = ROUND_ROBIN]
        ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
        ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
        ā”‚   CPU: 0.00ns (0.00%), Scheduled: 0.00ns (0.00%), Blocked: 334.00ms (34.15%), Output: 14 rows (798B)
        ā”‚   Input avg.: 14.00 rows, Input std.dev.: 0.00%
        ā””ā”€ RemoteMerge[sourceFragmentIds = [2]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:
               CPU: 1.00ms (1.64%), Scheduled: 1.00ms (1.14%), Blocked: 98.00ms (10.02%), Output: 14 rows (798B)
               Input avg.: 14.00 rows, Input std.dev.: 0.00%

 Fragment 2 [ROUND_ROBIN]
     CPU: 7.88ms, Scheduled: 9.80ms, Blocked 448.91ms (Input: 326.76ms, Output: 0.00ns), Input: 14 rows (798B); per task: avg.: 14.00 std.dev.: 0.00, Output: 14 rows (798B)
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: SINGLE []
     LocalMerge[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
     ā”‚   CPU: 1.00ms (1.64%), Scheduled: 1.00ms (1.14%), Blocked: 100.00ms (10.22%), Output: 14 rows (798B)
     ā”‚   Input avg.: 3.50 rows, Input std.dev.: 100.00%
     ā””ā”€ PartialSort[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
        ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
        ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
        ā”‚   CPU: 3.00ms (4.92%), Scheduled: 4.00ms (4.55%), Blocked: 0.00ns (0.00%), Output: 14 rows (798B)
        ā”‚   Input avg.: 3.50 rows, Input std.dev.: 100.00%
        ā””ā”€ RemoteSource[sourceFragmentIds = [3]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:
               CPU: 1.00ms (1.64%), Scheduled: 1.00ms (1.14%), Blocked: 327.00ms (33.44%), Output: 14 rows (798B)
               Input avg.: 3.50 rows, Input std.dev.: 100.00%

 Fragment 3 [SOURCE]
     CPU: 40.07ms, Scheduled: 58.75ms, Blocked 119.42ms (Input: 55.91ms, Output: 0.00ns), Input: 17 rows (534B); per task: avg.: 17.00 std.dev.: 0.00, Output: 14 rows (798B)
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: ROUND_ROBIN []
     Project[]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   Estimates: {rows: 14 (1.34kB), cpu: 5.80k, memory: 207B, network: 207B}
     ā”‚   CPU: 3.00ms (4.92%), Scheduled: 4.00ms (4.55%), Blocked: 0.00ns (0.00%), Output: 14 rows (798B)
     ā”‚   Input avg.: 4.67 rows, Input std.dev.: 36.42%
     ā”‚   expr_1 := CAST("age" AS integer)
     ā””ā”€ InnerJoin[criteria = ("expr" = "id_0"), hash = [$hashvalue, $hashvalue_2], distribution = REPLICATED]
        ā”‚   Layout: [last_name:varchar, id:varchar, first_name:varchar, age:varchar, name:varchar(20)]
        ā”‚   Estimates: {rows: 14 (1.27kB), cpu: 4.46k, memory: 207B, network: 207B}
        ā”‚   CPU: 1.00ms (1.64%), Scheduled: 1.00ms (1.14%), Blocked: 41.00ms (4.19%), Output: 14 rows (728B)
        ā”‚   Left (probe) Input avg.: 4.67 rows, Input std.dev.: 36.42%
        ā”‚   Right (build) Input avg.: 3.00 rows, Input std.dev.: 0.00%
        ā”‚   Collisions avg.: 0.00 (0.00% est.), Collisions std.dev.: ?%
        ā”‚   Distribution: REPLICATED
        ā”œā”€ Project[]
        ā”‚  ā”‚   Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar, $hashvalue:bigint]
        ā”‚  ā”‚   Estimates: {rows: 14 (728B), cpu: 1.90k, memory: 0B, network: 0B}
        ā”‚  ā”‚   CPU: 2.00ms (3.28%), Scheduled: 2.00ms (2.27%), Blocked: 0.00ns (0.00%), Output: 14 rows (728B)
        ā”‚  ā”‚   Input avg.: 4.67 rows, Input std.dev.: 36.42%
        ā”‚  ā”‚   $hashvalue := combine_hash(bigint '0', COALESCE("$operator$hash_code"("expr"), 0))
        ā”‚  ā””ā”€ ScanProject[table = minio:bucket:people]
        ā”‚         Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar]
        ā”‚         Estimates: {rows: 14 (602B), cpu: 616, memory: 0B, network: 0B}/{rows: 14 (602B), cpu: 1.19k, memory: 0B, network: 0B}
        ā”‚         CPU: 30.00ms (49.18%), Scheduled: 47.00ms (53.41%), Blocked: 0.00ns (0.00%), Output: 14 rows (602B)
        ā”‚         Input avg.: 4.67 rows, Input std.dev.: 36.42%
        ā”‚         expr := CAST("family_id" AS integer)
        ā”‚         family_id := family_id:string:REGULAR
        ā”‚         last_name := last_name:string:REGULAR
        ā”‚         id := id:string:REGULAR
        ā”‚         first_name := first_name:string:REGULAR
        ā”‚         age := age:string:REGULAR
        ā”‚         Input: 14 rows (450B), Filtered: 0.00%
        ā””ā”€ LocalExchange[partitioning = SINGLE]
           ā”‚   Layout: [id_0:integer, name:varchar(20), $hashvalue_2:bigint]
           ā”‚   Estimates: {rows: 3 (207B), cpu: 387, memory: 0B, network: 207B}
           ā”‚   CPU: 0.00ns (0.00%), Scheduled: 1.00ms (1.14%), Blocked: 22.00ms (2.25%), Output: 3 rows (84B)
           ā”‚   Input avg.: 0.75 rows, Input std.dev.: 173.21%
           ā””ā”€ RemoteSource[sourceFragmentIds = [4]]
                  Layout: [id_0:integer, name:varchar(20), $hashvalue_3:bigint]
                  Estimates:
                  CPU: 0.00ns (0.00%), Scheduled: 0.00ns (0.00%), Blocked: 56.00ms (5.73%), Output: 3 rows (84B)
                  Input avg.: 0.75 rows, Input std.dev.: 173.21%

 Fragment 4 [SOURCE]
     CPU: 18.60ms, Scheduled: 26.09ms, Blocked 0.00ns (Input: 0.00ns, Output: 0.00ns), Input: 3 rows (0B); per task: avg.: 3.00 std.dev.: 0.00, Output: 3 rows (84B)
     Output layout: [id_0, name, $hashvalue_4]
     Output partitioning: BROADCAST []
     ScanProject[table = mysql:practice.family practice.family]
         Layout: [id_0:integer, name:varchar(20), $hashvalue_4:bigint]
         Estimates: {rows: 3 (207B), cpu: 180, memory: 0B, network: 0B}/{rows: 3 (207B), cpu: 387, memory: 0B, network: 0B}
         CPU: 18.00ms (29.51%), Scheduled: 25.00ms (28.41%), Blocked: 0.00ns (0.00%), Output: 3 rows (84B)
         Input avg.: 3.00 rows, Input std.dev.: 0.00%
         $hashvalue_4 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("id_0"), 0))
         name := name:varchar(20):VARCHAR
         id_0 := id:integer:INT
         Input: 3 rows (0B), Filtered: 0.00%


(1 row)

Query 20220913_142608_00016_68bq6, FINISHED, 1 node
Splits: 24 total, 24 done (100.00%)
0.59 [20 rows, 606B] [34 rows/s, 1.01KB/s]

explain analyze verbose悒ä½æ恆ćØ态ä½æē”Ø恗恦恄悋ć‚Ŗćƒšćƒ¬ćƒ¼ć‚æćƒ¼ć«ć‚ˆć£ć¦ćÆčæ½åŠ ęƒ…å ±ć‚’å‡ŗåŠ›ć—ć¦ćć‚Œć‚‹ć‚ˆć†ć§ć™ć€‚

trino> explain analyze verbose select p.id, f.name as family_name, p.first_name, p.last_name, p.age
    -> from minio.bucket.people as p
    -> inner join mysql.practice.family as f on cast(p.family_id as integer) = f.id
    -> order by family_name, cast(p.age as integer) desc;
                                                                                  Query Plan
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Fragment 1 [SINGLE]
     CPU: 3.29ms, Scheduled: 4.07ms, Blocked 336.69ms (Input: 66.94ms, Output: 0.00ns), Input: 14 rows (798B); per task: avg.: 14.00 std.dev.: 0.00, Output: 14 rows (728B)
     Output layout: [id, first_name, last_name, age, name]
     Output partitioning: SINGLE []
     Project[]
     ā”‚   Layout: [id:varchar, first_name:varchar, last_name:varchar, age:varchar, name:varchar(20)]
     ā”‚   Estimates: {rows: 14 (1.27kB), cpu: ?, memory: ?, network: ?}
     ā”‚   CPU: 1.00ms (1.89%), Scheduled: 1.00ms (1.30%), Blocked: 0.00ns (0.00%), Output: 14 rows (728B)
     ā”‚   metrics:
     ā”‚     'Input distribution' = {count=4.00, p01=0.00, p05=0.00, p10=0.00, p25=0.00, p50=0.00, p75=14.00, p90=14.00, p95=14.00, p99=14.00, min=0.00, max=14.00}
     ā”‚   Input avg.: 3.50 rows, Input std.dev.: 173.21%
     ā””ā”€ LocalExchange[partitioning = ROUND_ROBIN]
        ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
        ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
        ā”‚   CPU: 0.00ns (0.00%), Scheduled: 0.00ns (0.00%), Blocked: 270.00ms (32.93%), Output: 14 rows (798B)
        ā”‚   metrics:
        ā”‚     'Input distribution' = {count=1.00, p01=14.00, p05=14.00, p10=14.00, p25=14.00, p50=14.00, p75=14.00, p90=14.00, p95=14.00, p99=14.00, min=14.00, max=14.00}
        ā”‚   Input avg.: 14.00 rows, Input std.dev.: 0.00%
        ā””ā”€ RemoteMerge[sourceFragmentIds = [2]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:
               CPU: 1.00ms (1.89%), Scheduled: 1.00ms (1.30%), Blocked: 67.00ms (8.17%), Output: 14 rows (798B)
               metrics:
                 'Input distribution' = {count=1.00, p01=14.00, p05=14.00, p10=14.00, p25=14.00, p50=14.00, p75=14.00, p90=14.00, p95=14.00, p99=14.00, min=14.00, max=14.00}
               Input avg.: 14.00 rows, Input std.dev.: 0.00%

 Fragment 2 [ROUND_ROBIN]
     CPU: 5.67ms, Scheduled: 7.79ms, Blocked 270.43ms (Input: 213.22ms, Output: 0.00ns), Input: 14 rows (798B); per task: avg.: 14.00 std.dev.: 0.00, Output: 14 rows (798B)
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: SINGLE []
     LocalMerge[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
     ā”‚   CPU: 1.00ms (1.89%), Scheduled: 1.00ms (1.30%), Blocked: 57.00ms (6.95%), Output: 14 rows (798B)
     ā”‚   metrics:
     ā”‚     'Input distribution' = {count=4.00, p01=0.00, p05=0.00, p10=0.00, p25=0.00, p50=3.00, p75=11.00, p90=11.00, p95=11.00, p99=11.00, min=0.00, max=11.00}
     ā”‚   Input avg.: 3.50 rows, Input std.dev.: 128.57%
     ā””ā”€ PartialSort[orderBy = [name ASC NULLS LAST, expr_1 DESC NULLS LAST]]
        ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
        ā”‚   Estimates: {rows: 14 (1.34kB), cpu: ?, memory: ?, network: ?}
        ā”‚   CPU: 2.00ms (3.77%), Scheduled: 3.00ms (3.90%), Blocked: 0.00ns (0.00%), Output: 14 rows (798B)
        ā”‚   metrics:
        ā”‚     'Input distribution' = {count=4.00, p01=0.00, p05=0.00, p10=0.00, p25=0.00, p50=3.00, p75=11.00, p90=11.00, p95=11.00, p99=11.00, min=0.00, max=11.00}
        ā”‚   Input avg.: 3.50 rows, Input std.dev.: 128.57%
        ā””ā”€ RemoteSource[sourceFragmentIds = [3]]
               Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
               Estimates:
               CPU: 1.00ms (1.89%), Scheduled: 1.00ms (1.30%), Blocked: 213.00ms (25.98%), Output: 14 rows (798B)
               metrics:
                 'Input distribution' = {count=4.00, p01=0.00, p05=0.00, p10=0.00, p25=0.00, p50=3.00, p75=11.00, p90=11.00, p95=11.00, p99=11.00, min=0.00, max=11.00}
               Input avg.: 3.50 rows, Input std.dev.: 128.57%

 Fragment 3 [SOURCE]
     CPU: 37.00ms, Scheduled: 57.84ms, Blocked 255.04ms (Input: 152.00ms, Output: 0.00ns), Input: 17 rows (534B); per task: avg.: 17.00 std.dev.: 0.00, Output: 14 rows (798B)
     Output layout: [name, last_name, id, first_name, age, expr_1]
     Output partitioning: ROUND_ROBIN []
     Project[]
     ā”‚   Layout: [name:varchar(20), last_name:varchar, id:varchar, first_name:varchar, age:varchar, expr_1:integer]
     ā”‚   Estimates: {rows: 14 (1.34kB), cpu: 5.80k, memory: 207B, network: 207B}
     ā”‚   CPU: 3.00ms (5.66%), Scheduled: 3.00ms (3.90%), Blocked: 0.00ns (0.00%), Output: 14 rows (798B)
     ā”‚   metrics:
     ā”‚     'Input distribution' = {count=3.00, p01=3.00, p05=3.00, p10=3.00, p25=3.00, p50=4.00, p75=7.00, p90=7.00, p95=7.00, p99=7.00, min=3.00, max=7.00}
     ā”‚   Input avg.: 4.67 rows, Input std.dev.: 36.42%
     ā”‚   expr_1 := CAST("age" AS integer)
     ā””ā”€ InnerJoin[criteria = ("expr" = "id_0"), hash = [$hashvalue, $hashvalue_2], distribution = REPLICATED]
        ā”‚   Layout: [last_name:varchar, id:varchar, first_name:varchar, age:varchar, name:varchar(20)]
        ā”‚   Reorder joins cost : {rows: 14 (1.27kB), cpu: 3.58k, memory: 180B, network: 180B}
        ā”‚   Estimates: {rows: 14 (1.27kB), cpu: 4.46k, memory: 207B, network: 207B}
        ā”‚   CPU: 1.00ms (1.89%), Scheduled: 1.00ms (1.30%), Blocked: 23.00ms (2.80%), Output: 14 rows (728B)
        ā”‚   Left (probe) metrics:
        ā”‚     'Input distribution' = {count=3.00, p01=3.00, p05=3.00, p10=3.00, p25=3.00, p50=4.00, p75=7.00, p90=7.00, p95=7.00, p99=7.00, min=3.00, max=7.00}
        ā”‚   Right (build) metrics:
        ā”‚     'Input distribution' = {count=1.00, p01=3.00, p05=3.00, p10=3.00, p25=3.00, p50=3.00, p75=3.00, p90=3.00, p95=3.00, p99=3.00, min=3.00, max=3.00}
        ā”‚   Left (probe) Input avg.: 4.67 rows, Input std.dev.: 36.42%
        ā”‚   Right (build) Input avg.: 3.00 rows, Input std.dev.: 0.00%
        ā”‚   Collisions avg.: 0.00 (0.00% est.), Collisions std.dev.: ?%
        ā”‚   Distribution: REPLICATED
        ā”œā”€ Project[]
        ā”‚  ā”‚   Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar, $hashvalue:bigint]
        ā”‚  ā”‚   Estimates: {rows: 14 (728B), cpu: 1.90k, memory: 0B, network: 0B}
        ā”‚  ā”‚   CPU: 2.00ms (3.77%), Scheduled: 2.00ms (2.60%), Blocked: 0.00ns (0.00%), Output: 14 rows (728B)
        ā”‚  ā”‚   metrics:
        ā”‚  ā”‚     'Input distribution' = {count=3.00, p01=3.00, p05=3.00, p10=3.00, p25=3.00, p50=4.00, p75=7.00, p90=7.00, p95=7.00, p99=7.00, min=3.00, max=7.00}
        ā”‚  ā”‚   Input avg.: 4.67 rows, Input std.dev.: 36.42%
        ā”‚  ā”‚   $hashvalue := combine_hash(bigint '0', COALESCE("$operator$hash_code"("expr"), 0))
        ā”‚  ā””ā”€ ScanProject[table = minio:bucket:people]
        ā”‚         Layout: [last_name:varchar, expr:integer, id:varchar, first_name:varchar, age:varchar]
        ā”‚         Estimates: {rows: 14 (602B), cpu: 616, memory: 0B, network: 0B}/{rows: 14 (602B), cpu: 1.19k, memory: 0B, network: 0B}
        ā”‚         CPU: 27.00ms (50.94%), Scheduled: 46.00ms (59.74%), Blocked: 0.00ns (0.00%), Output: 14 rows (602B)
        ā”‚         metrics:
        ā”‚           'Input distribution' = {count=3.00, p01=3.00, p05=3.00, p10=3.00, p25=3.00, p50=4.00, p75=7.00, p90=7.00, p95=7.00, p99=7.00, min=3.00, max=7.00}
        ā”‚         Input avg.: 4.67 rows, Input std.dev.: 36.42%
        ā”‚         expr := CAST("family_id" AS integer)
        ā”‚         family_id := family_id:string:REGULAR
        ā”‚         last_name := last_name:string:REGULAR
        ā”‚         id := id:string:REGULAR
        ā”‚         first_name := first_name:string:REGULAR
        ā”‚         age := age:string:REGULAR
        ā”‚         Input: 14 rows (450B), Filtered: 0.00%
        ā””ā”€ LocalExchange[partitioning = SINGLE]
           ā”‚   Layout: [id_0:integer, name:varchar(20), $hashvalue_2:bigint]
           ā”‚   Estimates: {rows: 3 (207B), cpu: 387, memory: 0B, network: 207B}
           ā”‚   CPU: 0.00ns (0.00%), Scheduled: 0.00ns (0.00%), Blocked: 38.00ms (4.63%), Output: 3 rows (84B)
           ā”‚   metrics:
           ā”‚     'Input distribution' = {count=4.00, p01=0.00, p05=0.00, p10=0.00, p25=0.00, p50=0.00, p75=3.00, p90=3.00, p95=3.00, p99=3.00, min=0.00, max=3.00}
           ā”‚   Input avg.: 0.75 rows, Input std.dev.: 173.21%
           ā””ā”€ RemoteSource[sourceFragmentIds = [4]]
                  Layout: [id_0:integer, name:varchar(20), $hashvalue_3:bigint]
                  Estimates:
                  CPU: 0.00ns (0.00%), Scheduled: 0.00ns (0.00%), Blocked: 152.00ms (18.54%), Output: 3 rows (84B)
                  metrics:
                    'Input distribution' = {count=4.00, p01=0.00, p05=0.00, p10=0.00, p25=0.00, p50=0.00, p75=3.00, p90=3.00, p95=3.00, p99=3.00, min=0.00, max=3.00}
                  Input avg.: 0.75 rows, Input std.dev.: 173.21%

 Fragment 4 [SOURCE]
     CPU: 14.91ms, Scheduled: 19.48ms, Blocked 0.00ns (Input: 0.00ns, Output: 0.00ns), Input: 3 rows (0B); per task: avg.: 3.00 std.dev.: 0.00, Output: 3 rows (84B)
     Output layout: [id_0, name, $hashvalue_4]
     Output partitioning: BROADCAST []
     ScanProject[table = mysql:practice.family practice.family]
         Layout: [id_0:integer, name:varchar(20), $hashvalue_4:bigint]
         Estimates: {rows: 3 (207B), cpu: 180, memory: 0B, network: 0B}/{rows: 3 (207B), cpu: 387, memory: 0B, network: 0B}
         CPU: 14.00ms (26.42%), Scheduled: 18.00ms (23.38%), Blocked: 0.00ns (0.00%), Output: 3 rows (84B)
         connector metrics:
           'Physical input read time' = {duration=1.00ms}
         metrics:
           'Input distribution' = {count=1.00, p01=3.00, p05=3.00, p10=3.00, p25=3.00, p50=3.00, p75=3.00, p90=3.00, p95=3.00, p99=3.00, min=3.00, max=3.00}
         Input avg.: 3.00 rows, Input std.dev.: 0.00%
         $hashvalue_4 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("id_0"), 0))
         name := name:varchar(20):VARCHAR
         id_0 := id:integer:INT
         Input: 3 rows (0B), Filtered: 0.00%


(1 row)

Query 20220913_143008_00023_68bq6, FINISHED, 1 node
Splits: 24 total, 24 done (100.00%)
0.51 [20 rows, 606B] [39 rows/s, 1.16KB/s]

join恮話ćÆć€ć‚³ć‚¹ćƒˆćƒ™ćƒ¼ć‚¹ć®ęœ€é©åŒ–ć«é–¢ć™ć‚‹ćƒšćƒ¼ć‚ø恫悂ę›øć‹ć‚Œć¦ć„ć¾ć™ć€‚

Cost based optimizations — Trino 395 Documentation

ć‚Æć‚ØćƒŖćƒ¼ć§joinćŒå®Ÿč”Œć•ć‚Œć‚‹é †åŗćÆć€ćƒ‘ćƒ•ć‚©ćƒ¼ćƒžćƒ³ć‚¹ć«å¤§ććŖå½±éŸæ悒äøŽćˆć‚‹åÆčƒ½ę€§ćŒć‚ć‚Šć¾ć™ć€‚å‡¦ē†ć•ć‚Œć‚‹ćƒ‡ćƒ¼ć‚æé‡ć€č»¢é€ć•ć‚Œć‚‹ćƒ‡ćƒ¼ć‚æ量恌
å¤§ććŖč¦å› ć«ćŖć‚Šć¾ć™ć€‚

ć“ć®ćŸć‚ć‚³ć‚¹ćƒˆćƒ™ćƒ¼ć‚¹ć®joinꈦē•„ć®å “åˆć€Trino恧ćÆćƒ†ćƒ¼ćƒ–ćƒ«ć®ēµ±čØˆęƒ…å ±ć‹ć‚‰join恮順åŗć®ć‚³ć‚¹ćƒˆć‚’見ē©ć‚‚ć‚Šć€ęœ€ć‚‚ć‚³ć‚¹ćƒˆćŒä½Žć„join恮順åŗć‚’
éøęŠžć—ć¾ć™ć€‚

Cost based optimizations / Join enumeration

ć“ć®ęŒ™å‹•ćÆ态join_reordering_strategyćƒ—ćƒ­ćƒ‘ćƒ†ć‚£ć§3ć¤ć®å€¤ć‹ć‚‰ęŒ‡å®šć™ć‚‹ć“ćØ恫ćŖć‚Šć¾ć™ć€‚

ć¾ćŸć€Trino恧ćÆjoinć«ćƒćƒƒć‚·ćƒ„ćƒ™ćƒ¼ć‚¹ć®joinć‚¢ćƒ«ć‚“ćƒŖć‚ŗ惠悒ä½æē”Øć—ć¾ć™ćŒć€ć“ć‚ŒćÆjoinę¼”ē®—å­ć”ćØć«ć€å…„åŠ›ć”ćØć«ćƒćƒƒć‚·ćƒ„ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’ä½œęˆć™ć‚‹
åæ…č¦ćŒć‚ć‚‹ć“ćØć‚’ę„å‘³ć—ć¾ć™ć€‚ć“ć‚Œć‚’ćƒ“ćƒ«ćƒ‰å“ćØå‘¼ć¶ć‚ˆć†ć§ć™ć€‚

ć‚‚ć†ć²ćØć¤ć®å…„åŠ›ć§ćÆć€å„č”Œć”ćØć«ćƒćƒƒć‚·ćƒ„ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’ć‚Æć‚ØćƒŖćƒ¼ć—ć¾ć™ć€‚ć“ć”ć‚‰ć‚’ćƒ—ćƒ­ćƒ¼ćƒ–å“ćØå‘¼ć³ć¾ć™ć€‚

join恮際恫ćÆć€ćƒ‡ćƒ¼ć‚æ恮äø€éƒØć‹ć‚‰ćƒćƒƒć‚·ćƒ„ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’ę§‹ēÆ‰ć™ć‚‹PartitionedćØ态å…Øćƒ‡ćƒ¼ć‚æć‹ć‚‰ćƒćƒƒć‚·ćƒ„ćƒ†ćƒ¼ćƒ–ćƒ«ć‚’ä½œęˆć—ć¦å„ćƒŽćƒ¼ćƒ‰ć«č¤‡č£½ć™ć‚‹
Broadcast恮2恤恮ꈦē•„ćŒć‚ć‚Šć¾ć™ć€‚

Cost based optimizations / Join distribution selection

ć“ć®ęŒ™å‹•ćÆ态join_distribution_typećƒ—ćƒ­ćƒ‘ćƒ†ć‚£ć§3ć¤ć®å€¤ć‹ć‚‰ęŒ‡å®šć™ć‚‹ć“ćØ恫ćŖć‚Šć¾ć™ć€‚

ć„ćšć‚Œć®ęˆ¦ē•„ć«ć‚‚ćƒˆćƒ¬ćƒ¼ćƒ‰ć‚Ŗćƒ•ćŒć‚ć‚Šć€ćƒ‡ćƒ•ć‚©ćƒ«ćƒˆć§ćÆć‚³ć‚¹ćƒˆćƒ™ćƒ¼ć‚¹ć®éøꊞ恧č‡Ŗ動ēš„恫Partitioned恋Broadcastć‹ć®ć„ćšć‚Œć‹ć®join悒éøęŠžć—ć¾ć™ć€‚

ć¤ć¾ć‚Šć€ć„ć‹ć«ć—ć¦ćƒ‡ćƒ¼ć‚æ悒ēµžć‚Šč¾¼ć‚“ć§ć‚¢ć‚Æć‚»ć‚¹ć™ć‚‹ć‹ćŒé‡č¦ć«ćŖ悊恝恆ćŖꄟ恘恧恙恭怂

恂ćØćÆćƒ—ćƒƒć‚·ćƒ„ćƒ€ć‚¦ćƒ³ćØ恄恆ꦂåæµć‚‚ć‚ć‚‹ć‚ˆć†ć§ć™ćŒć€ć“ć”悉ćÆć¾ćŸć„ćšć‚Œć€‚

Pushdown — Trino 395 Documentation

ć¾ćØ悁

Trino悒ä½æć£ć¦ć€MinIOćØMySQLć®ćƒ‡ćƒ¼ć‚æ悒joinć—ć¦ć‚¢ć‚Æć‚»ć‚¹ć—ć¦ćæć¾ć—ćŸć€‚

恓悌č‡Ŗ体ćÆ非åøø恫ē°”å˜ć«ć„ćć¾ć—ćŸć€‚

åˆć‚ć›ć¦ć€explain悄analyze态join恮話悒見悋恓ćØ恫ćŖć£ćŸć®ć§ć„ć‚ć„ć‚ćØå‹‰å¼·ć«ćŖć‚Šć¾ć—ćŸć­ć€‚å€‹äŗŗēš„恫ćÆć€ć‘ć£ć“ć†čˆˆå‘³ć‚’ęŒć¦ć¾ć—ćŸć€‚