CLOVER🍀

That was when it all began.

Skimming the Prometheus storage documentation and trying out the retention setting

What is this post for?

It is a study note on Prometheus storage.

Taking a Snapshot of Prometheus data (TSDB) and restoring it - CLOVER🍀

As a follow-up to that post, this time I will read the storage documentation and look at the related options.

The target Prometheus version is 2.9.2 (the documentation is for 2.9).

A quick search suggests that older information differs from the current documentation in quite a few places, so it is best to check the documentation for the exact version you are using.

Everything here applies to the current version (2.9.2) only.

Prometheus storage

The Prometheus storage documentation is here.

Storage | Prometheus

Here is roughly what it says.

  • Prometheus includes a time series database kept on local disk
  • Optionally, it can also integrate with remote storage systems

I will look at integration with remote storage systems another time.

From here on, I will look at local storage.

About local storage

Prometheus's local time series database stores data on disk in its own custom format.

The on-disk layout consists of the following elements.

  • Ingested samples are grouped into two-hour blocks
  • Each two-hour block contains the following
    • A chunks directory holding all the data for that time window (one or more chunk files)
    • A metadata file
    • An index file (indexes metric names and labels to the time series in the chunk files)
    • A tombstone file
      • Created when time series data is deleted via the API
      • The data is not removed from the chunk files immediately

Also, Prometheus does not persist collected data right away; the data is first kept in memory.

That in-memory data is protected against crashes by a WAL (Write Ahead Log), which can be replayed when Prometheus restarts after a crash.

The WAL has the following characteristics.

  • It is stored in the "wal" directory in 128MB segments
  • The files in the "wal" directory contain raw data that has not been compacted yet
    • They are therefore considerably larger than regular block files
  • Prometheus keeps at least 3 WAL files
    • High-traffic servers may have more than 3 WAL files, because at least two hours of raw data must be kept

Here is the directory structure shown in the documentation. It contains the elements that have come up so far.

./data/01BKGV7JBM69T2G1BGBGM6KB12
./data/01BKGV7JBM69T2G1BGBGM6KB12/meta.json
./data/01BKGTZQ1SYQJTR4PB43C8PD98
./data/01BKGTZQ1SYQJTR4PB43C8PD98/meta.json
./data/01BKGTZQ1SYQJTR4PB43C8PD98/index
./data/01BKGTZQ1SYQJTR4PB43C8PD98/chunks
./data/01BKGTZQ1SYQJTR4PB43C8PD98/chunks/000001
./data/01BKGTZQ1SYQJTR4PB43C8PD98/tombstones
./data/01BKGTZQ1HHWHV8FBJXW1Y3W0K
./data/01BKGTZQ1HHWHV8FBJXW1Y3W0K/meta.json
./data/01BKGV7JC0RY8A6MACW02A2PJD
./data/01BKGV7JC0RY8A6MACW02A2PJD/meta.json
./data/01BKGV7JC0RY8A6MACW02A2PJD/index
./data/01BKGV7JC0RY8A6MACW02A2PJD/chunks
./data/01BKGV7JC0RY8A6MACW02A2PJD/chunks/000001
./data/01BKGV7JC0RY8A6MACW02A2PJD/tombstones
./data/wal/00000000
./data/wal/00000001
./data/wal/00000002
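
As a rough illustration of this layout, here is a small Python sketch that groups a file listing like the one above into block directories and WAL segments. The function name summarize_layout and the default data directory name "data" are my own assumptions for the example, not something taken from the Prometheus documentation.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def summarize_layout(paths, data_dir="data"):
    """Group paths under the data directory into blocks and WAL segments.

    Returns (blocks, wal_segments), where blocks maps a block directory
    name (a ULID) to the set of entries directly inside it.
    """
    blocks = defaultdict(set)
    wal_segments = []
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if data_dir not in parts:
            continue
        # Keep only the path components below the data directory.
        rel = parts[parts.index(data_dir) + 1:]
        if len(rel) < 2:
            # Top-level files such as "lock" belong to neither group.
            continue
        if rel[0] == "wal":
            wal_segments.append(rel[1])
        else:
            # rel[1] is meta.json, index, chunks or tombstones.
            blocks[rel[0]].add(rel[1])
    return dict(blocks), sorted(wal_segments)


listing = [
    "./data/01BKGTZQ1SYQJTR4PB43C8PD98/meta.json",
    "./data/01BKGTZQ1SYQJTR4PB43C8PD98/index",
    "./data/01BKGTZQ1SYQJTR4PB43C8PD98/chunks/000001",
    "./data/01BKGTZQ1SYQJTR4PB43C8PD98/tombstones",
    "./data/wal/00000000",
    "./data/wal/00000001",
]
blocks, wal_segments = summarize_layout(listing)
print(blocks)
print(wal_segments)
```

Each block directory should end up with meta.json, index, chunks and tombstones, while the WAL segments are listed separately.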

For more detail, see the TSDB documentation.

tsdb/README.md at v0.7.1 · prometheus/tsdb · GitHub

Incidentally, right after startup the data directory looks like this.

$ find data -type f
data/wal/00000000
data/lock

A file named lock did not appear in the documentation's listing.

$ ls -l data/lock 
-rw-r--r-- 1 xxxxx xxxxx 0 May  2 10:51 data/lock

It is a 0-byte file.

As the name suggests, this file appears to be used for mutual exclusion. If Prometheus is already running and you try to start another instance, for example on a different listen port, it cannot acquire the lock and fails to start.

$ ./prometheus --web.listen-address="0.0.0.0:9091"

...

level=error ts=2019-05-02T10:53:06.753Z caller=main.go:717 err="opening storage failed: lock DB directory: resource temporarily unavailable"

https://github.com/prometheus/prometheus/blob/v2.9.2/vendor/github.com/prometheus/tsdb/db.go#L280-L283
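
The message "resource temporarily unavailable" is what a non-blocking file lock reports when another holder already has the lock. Below is a minimal Python sketch of that mechanism, assuming an flock-style advisory lock on a Unix-like OS. try_lock is a hypothetical helper written for this illustration; it is not how Prometheus itself is implemented.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking lock on the file at path.

    Returns the open file object on success (keep it open to hold the lock),
    or None when another holder already has the lock.
    """
    f = open(path, "a")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: "resource temporarily unavailable".
        f.close()
        return None
    return f


lock_path = os.path.join(tempfile.mkdtemp(), "lock")
first = try_lock(lock_path)   # plays the role of the running instance
second = try_lock(lock_path)  # plays the role of the second instance
print("first got the lock:", first is not None)
print("second got the lock:", second is not None)
first.close()
```

The second attempt fails while the first holder keeps its file open, which mirrors the startup failure above.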

Let's move on.

The initial two-hour blocks are eventually compacted into longer blocks in the background.

Limitations of local storage

The limitation of local storage is that it is neither clustered nor replicated.

It is therefore not resilient to disk or node failures. It is neither scalable nor durable, and should be treated as an ephemeral sliding window of recent data.

That said, the documentation does mention that if your durability requirements are not strict, you may still be able to keep up to several years of data in local storage.

For these reasons, it seems, Prometheus provides a mechanism for integrating with remote storage systems to address the scalability and durability limits of local storage.

Remote storage integrations

Options for local storage

The following part of the storage documentation describes the options that can be specified for storage.

Operational aspects

Listing the options that contain "storage" from Prometheus's own help output gives the following.

$ ./prometheus -h 2>&1 | grep storage
      --storage.tsdb.path="data/"  
                                 Base path for metrics storage.
      --storage.tsdb.retention=STORAGE.TSDB.RETENTION  
                                 storage. This flag has been deprecated, use
                                 "storage.tsdb.retention.time" instead
      --storage.tsdb.retention.time=STORAGE.TSDB.RETENTION.TIME  
                                 How long to retain samples in storage. When
                                 "storage.tsdb.retention". If neither this flag
                                 nor "storage.tsdb.retention" nor
                                 "storage.tsdb.retention.size" is set, the
      --storage.tsdb.retention.size=STORAGE.TSDB.RETENTION.SIZE  
      --storage.tsdb.no-lockfile  
      --storage.tsdb.allow-overlapping-blocks  
      --storage.remote.flush-deadline=<duration>  
      --storage.remote.read-sample-limit=5e7  
      --storage.remote.read-concurrent-limit=10  

The "--storage.remote.*" options concern integration with remote storage systems, so they are out of scope here.

Let's look at the descriptions of the other options.

      --storage.tsdb.path="data/"  
                                 Base path for metrics storage.
      --storage.tsdb.retention=STORAGE.TSDB.RETENTION  
                                 [DEPRECATED] How long to retain samples in storage. This flag has been deprecated, use "storage.tsdb.retention.time" instead
      --storage.tsdb.retention.time=STORAGE.TSDB.RETENTION.TIME  
                                 How long to retain samples in storage. When this flag is set it overrides "storage.tsdb.retention". If neither this flag nor
                                 "storage.tsdb.retention" nor "storage.tsdb.retention.size" is set, the retention time defaults to 15d.
      --storage.tsdb.retention.size=STORAGE.TSDB.RETENTION.SIZE  
                                 [EXPERIMENTAL] Maximum number of bytes that can be stored for blocks. Units supported: KB, MB, GB, TB, PB. This flag is experimental and can be
                                 changed in future releases.
      --storage.tsdb.no-lockfile  
                                 Do not create lockfile in data directory.
      --storage.tsdb.allow-overlapping-blocks  
                                 [EXPERIMENTAL] Allow overlapping blocks which in-turn enables vertical compaction and vertical query merge

I will ignore the option that is already deprecated.

  • --storage.tsdb.path … The base path where Prometheus stores its data. The default is "data/"
  • --storage.tsdb.retention.time … Specifies when old data is removed. The default is "15d"
  • --storage.tsdb.retention.size … (Experimental) Specifies the maximum number of bytes that storage blocks may use
  • --storage.tsdb.no-lockfile … Do not create a lock file in the data directory
  • --storage.tsdb.allow-overlapping-blocks … (Experimental) Allow overlapping blocks, which enables vertical compaction and vertical query merge

The default value of "--storage.tsdb.retention.time" is 15 days, and the format accepts the units y, w, d, h, m, s, and ms.

prometheus/db.go at v2.9.2 · prometheus/prometheus · GitHub

https://github.com/prometheus/prometheus/blob/v2.9.2/vendor/github.com/prometheus/common/model/time.go#L192-L208
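
To make the format concrete, here is a small Python sketch that converts strings such as "15d" or "3m" into milliseconds. It is only an illustration under my own assumptions: a value is a single integer followed by one unit, a year counts as 365 days, and a week as 7 days. parse_duration_ms is a hypothetical name, not a Prometheus API.

```python
import re

# Unit lengths in milliseconds (assumption: 1y = 365d, 1w = 7d).
_UNIT_MS = {
    "ms": 1,
    "s": 1000,
    "m": 60 * 1000,
    "h": 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
    "w": 7 * 24 * 60 * 60 * 1000,
    "y": 365 * 24 * 60 * 60 * 1000,
}

# "ms" is listed first so that it is not mistaken for "m".
_DURATION_RE = re.compile(r"^([0-9]+)(ms|y|w|d|h|m|s)$")


def parse_duration_ms(text):
    """Convert a duration string like "15d" into milliseconds."""
    match = _DURATION_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid duration string: {text!r}")
    return int(match.group(1)) * _UNIT_MS[match.group(2)]


print(parse_duration_ms("15d"))  # the default retention time
print(parse_duration_ms("3m"))   # the retention time used later in this post
```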

There are two retention policies for stored data, time and size, and apparently whichever one triggers first is the one that applies.
(Size is still experimental, so I skipped it.)

"--storage.tsdb.allow-overlapping-blocks" also still looks experimental, so I will skip it as well.

Implement vertical query merging and compaction · Issue #90 · prometheus/tsdb · GitHub

Incidentally, the fact that data is grouped into two-hour blocks seems to come from around here.

https://github.com/prometheus/prometheus/blob/v2.9.2/vendor/github.com/prometheus/tsdb/db.go#L51

By the way, looking at "Status" → "Command-Line Flags" in the Prometheus Web Console, there seem to be a few more options that could be specified.

[Screenshot: the Command-Line Flags page in the Prometheus Web Console]

I will not dig into those this time.

Disk capacity needed for local storage

Prometheus reportedly uses about 1 to 2 bytes per sample.

So the disk capacity Prometheus needs on the server can be estimated with the following formula.

needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample
needed disk space = retention time (seconds) × samples ingested per second × bytes per sample
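
As a worked example of this formula, here is a short Python sketch. The workload numbers (1,000 time series scraped every 15 seconds, 2 bytes per sample, the default 15-day retention) are hypothetical values I picked for illustration, not measurements.

```python
def samples_per_second(num_series, scrape_interval_seconds):
    """Samples ingested per second for a given number of series and scrape interval."""
    return num_series / scrape_interval_seconds


def needed_disk_space(retention_time_seconds, ingested_samples_per_second, bytes_per_sample):
    """Estimated disk usage in bytes, using the formula from the documentation."""
    return retention_time_seconds * ingested_samples_per_second * bytes_per_sample


retention_15d = 15 * 24 * 60 * 60  # 15 days in seconds
estimate = needed_disk_space(
    retention_15d,
    samples_per_second(1000, 15),  # hypothetical workload
    2,                             # upper end of the 1 to 2 bytes per sample range
)
print(f"{estimate / 1e6:.1f} MB")
```

With these assumed numbers the estimate comes out at roughly 172.8 MB. Treat it as an order-of-magnitude figure, since the real bytes per sample depends on how well the data compresses.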

To lower the number of samples ingested per second, you can either reduce the number of time series you collect (fewer targets, or fewer series per target) or lengthen the scrape interval.

According to the documentation, reducing the number of series is likely to be more effective, because samples within a series are compressed.

If local storage becomes corrupted, the remedies are as follows.

  • Shut down Prometheus and remove the entire data directory (apparently the best option)
    • Naturally, you lose all data
  • Remove individual block directories
    • You lose the two hours or so of data contained in each removed block

Either way, the documentation stresses that Prometheus's local storage is not intended for long-term data storage.

Trying out a data retention period

Now let's shorten the retention period and check how it behaves.

I prepared a Prometheus configuration file like the following, with a short scrape interval, and started Prometheus.
prometheus.yml

global:
  scrape_interval:     1s # Scrape targets every 1 second (shortened for this test). The default is every 1 minute.
  evaluation_interval: 1s # Evaluate rules every 1 second. The default is every 1 minute.

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']

The scrape target is left at the default (Prometheus itself).

As a startup option, I set "--storage.tsdb.retention.time" to 3 minutes so that I could see data disappear quickly.
I also specified "--storage.tsdb.min-block-duration"; I will explain why later.

$ ./prometheus --storage.tsdb.retention.time 3m --storage.tsdb.min-block-duration 1m
level=info ts=2019-05-02T14:09:07.183Z caller=main.go:321 msg="Starting Prometheus" version="(version=2.9.2, branch=HEAD, revision=d3245f15022551c6fc8281766ea62db4d71e2747)"

In the Web Console, I confirmed that the startup options were recognized.

[Screenshot: the Command-Line Flags page showing the startup options]

For now, I displayed CPU time (process_cpu_seconds_total) in the Web Console, using a 10-minute range.

[Screenshot: graph of process_cpu_seconds_total over a 10-minute range]

After a minute or so, log lines like these are output

level=info ts=2019-05-02T14:10:44.219Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806153112 maxt=1556806200000 ulid=01D9WE45RQ73CF18Y6J2BPMN1Q duration=100.343317ms
level=info ts=2019-05-02T14:10:44.237Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.785374ms

and a block is created.

$ find data -type f
data/wal/00000000
data/lock
data/01D9WE45RQ73CF18Y6J2BPMN1Q/meta.json
data/01D9WE45RQ73CF18Y6J2BPMN1Q/index
data/01D9WE45RQ73CF18Y6J2BPMN1Q/tombstones
data/01D9WE45RQ73CF18Y6J2BPMN1Q/chunks/000001

Until this log has been output four times, block directories keep being added.

level=info ts=2019-05-02T14:10:44.219Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806153112 maxt=1556806200000 ulid=01D9WE45RQ73CF18Y6J2BPMN1Q duration=100.343317ms
level=info ts=2019-05-02T14:10:44.237Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.785374ms
level=info ts=2019-05-02T14:11:30.196Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806200000 maxt=1556806260000 ulid=01D9WE5JP7Y6FFGZMKKS1KK2AD duration=77.507689ms
level=info ts=2019-05-02T14:11:30.217Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.549613ms
level=info ts=2019-05-02T14:12:30.229Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806260000 maxt=1556806320000 ulid=01D9WE7D97CMB7R8RS12G9E9YW duration=109.711593ms
level=info ts=2019-05-02T14:12:30.251Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.421509ms
level=info ts=2019-05-02T14:13:30.197Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806320000 maxt=1556806380000 ulid=01D9WE97W9WEKZCQ0957KVH2GX duration=75.948389ms
level=info ts=2019-05-02T14:13:30.215Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.492116ms

The result looks like this.

$ find data -type f
data/wal/00000000
data/01D9WE7D97CMB7R8RS12G9E9YW/meta.json
data/01D9WE7D97CMB7R8RS12G9E9YW/index
data/01D9WE7D97CMB7R8RS12G9E9YW/tombstones
data/01D9WE7D97CMB7R8RS12G9E9YW/chunks/000001
data/lock
data/01D9WE45RQ73CF18Y6J2BPMN1Q/meta.json
data/01D9WE45RQ73CF18Y6J2BPMN1Q/index
data/01D9WE45RQ73CF18Y6J2BPMN1Q/tombstones
data/01D9WE45RQ73CF18Y6J2BPMN1Q/chunks/000001
data/01D9WE5JP7Y6FFGZMKKS1KK2AD/meta.json
data/01D9WE5JP7Y6FFGZMKKS1KK2AD/index
data/01D9WE5JP7Y6FFGZMKKS1KK2AD/tombstones
data/01D9WE5JP7Y6FFGZMKKS1KK2AD/chunks/000001
data/01D9WE97W9WEKZCQ0957KVH2GX/meta.json
data/01D9WE97W9WEKZCQ0957KVH2GX/index
data/01D9WE97W9WEKZCQ0957KVH2GX/tombstones
data/01D9WE97W9WEKZCQ0957KVH2GX/chunks/000001

[Screenshot: Web Console graph at this point]

Then, when the fifth compaction occurs,

level=info ts=2019-05-02T14:10:44.219Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806153112 maxt=1556806200000 ulid=01D9WE45RQ73CF18Y6J2BPMN1Q duration=100.343317ms
level=info ts=2019-05-02T14:10:44.237Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.785374ms
level=info ts=2019-05-02T14:11:30.196Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806200000 maxt=1556806260000 ulid=01D9WE5JP7Y6FFGZMKKS1KK2AD duration=77.507689ms
level=info ts=2019-05-02T14:11:30.217Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.549613ms
level=info ts=2019-05-02T14:12:30.229Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806260000 maxt=1556806320000 ulid=01D9WE7D97CMB7R8RS12G9E9YW duration=109.711593ms
level=info ts=2019-05-02T14:12:30.251Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.421509ms
level=info ts=2019-05-02T14:13:30.197Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806320000 maxt=1556806380000 ulid=01D9WE97W9WEKZCQ0957KVH2GX duration=75.948389ms
level=info ts=2019-05-02T14:13:30.215Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.492116ms
level=info ts=2019-05-02T14:14:30.248Z caller=compact.go:499 component=tsdb msg="write block" mint=1556806380000 maxt=1556806440000 ulid=01D9WEB2F4SWD5FDP5EBV7BD0E duration=132.373486ms
level=info ts=2019-05-02T14:14:30.270Z caller=head.go:540 component=tsdb msg="head GC completed" duration=1.201336ms

the number of block directories stops growing. The directory named "01D9WE45RQ73CF18Y6J2BPMN1Q" is gone.

$ find data -type f
data/01D9WEB2F4SWD5FDP5EBV7BD0E/meta.json
data/01D9WEB2F4SWD5FDP5EBV7BD0E/index
data/01D9WEB2F4SWD5FDP5EBV7BD0E/tombstones
data/01D9WEB2F4SWD5FDP5EBV7BD0E/chunks/000001
data/wal/00000000
data/01D9WE7D97CMB7R8RS12G9E9YW/meta.json
data/01D9WE7D97CMB7R8RS12G9E9YW/index
data/01D9WE7D97CMB7R8RS12G9E9YW/tombstones
data/01D9WE7D97CMB7R8RS12G9E9YW/chunks/000001
data/lock
data/01D9WE5JP7Y6FFGZMKKS1KK2AD/meta.json
data/01D9WE5JP7Y6FFGZMKKS1KK2AD/index
data/01D9WE5JP7Y6FFGZMKKS1KK2AD/tombstones
data/01D9WE5JP7Y6FFGZMKKS1KK2AD/chunks/000001
data/01D9WE97W9WEKZCQ0957KVH2GX/meta.json
data/01D9WE97W9WEKZCQ0957KVH2GX/index
data/01D9WE97W9WEKZCQ0957KVH2GX/tombstones
data/01D9WE97W9WEKZCQ0957KVH2GX/chunks/000001

As a result, the data for the earliest time range disappears. "01D9WE45RQ73CF18Y6J2BPMN1Q" was the first directory that was created.

[Screenshot: Web Console graph with the earliest data no longer shown]

So I was able to confirm that data older than the retention period specified with "--storage.tsdb.retention.time" is deleted.

By the way, this time I specified not only "--storage.tsdb.retention.time" but also "--storage.tsdb.min-block-duration" as startup options.

$ ./prometheus --storage.tsdb.retention.time 3m --storage.tsdb.min-block-duration 1m

Specifying "--storage.tsdb.min-block-duration" makes each block cover a 1-minute range.

At first I specified only "--storage.tsdb.retention.time", as shown below, and no data was deleted at all. I wondered whether data that is still in the WAL is simply not subject to retention, which led me to make this change.

$ ./prometheus --storage.tsdb.retention.time 3m

In other words, setting an extremely short retention because I did not want to wait long completely backfired.

In fact, deletion appears to happen per block.

https://github.com/prometheus/prometheus/blob/v2.9.2/vendor/github.com/prometheus/tsdb/db.go#L670
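
To check that block-level deletion explains what I saw, here is a simplified Python model. It assumes a block becomes deletable once its maxTime is more than the retention period older than the newest block's maxTime. That rule is my own simplification for this sketch, and blocks_beyond_retention is a hypothetical helper, not the actual TSDB code. The maxt values are taken from the "write block" log lines above.

```python
def blocks_beyond_retention(block_max_times_ms, retention_ms):
    """Return the maxTime values of blocks that fall outside the retention window.

    Simplified model: compare every older block with the newest block.
    """
    if not block_max_times_ms or retention_ms == 0:
        return []
    newest_first = sorted(block_max_times_ms, reverse=True)
    newest = newest_first[0]
    return [t for t in newest_first[1:] if newest - t > retention_ms]


retention_ms = 3 * 60 * 1000  # --storage.tsdb.retention.time 3m

# maxt of the first four blocks, from the log output.
four_blocks = [1556806200000, 1556806260000, 1556806320000, 1556806380000]
print(blocks_beyond_retention(four_blocks, retention_ms))

# After the fifth block is written.
five_blocks = four_blocks + [1556806440000]
print(blocks_beyond_retention(five_blocks, retention_ms))
```

With four blocks, the oldest one is exactly 3 minutes behind the newest, so nothing is deleted. Once the fifth block is written, the first block (maxt=1556806200000) falls outside the window. That matches the directory "01D9WE45RQ73CF18Y6J2BPMN1Q" disappearing.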

Looking at meta.json in a block directory, minTime and maxTime show which time range the block's data covers.
data/01D9WEEQN5CH9F53GAFCN3G14E/meta.json

{
    "ulid": "01D9WEEQN5CH9F53GAFCN3G14E",
    "minTime": 1556806500000,
    "maxTime": 1556806560000,
    "stats": {
        "numSamples": 27120,
        "numSeries": 452,
        "numChunks": 452,
        "numBytes": 77220
    },
    "compaction": {
        "level": 1,
        "sources": [
            "01D9WEEQN5CH9F53GAFCN3G14E"
        ]
    },
    "version": 1
}
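
The minTime and maxTime values are Unix timestamps in milliseconds, so they are easier to read after conversion. Here is a minimal Python sketch that does this for the file above. block_time_range is a hypothetical helper name, and only the two fields it needs are included in the sample text.

```python
import json
from datetime import datetime, timezone


def block_time_range(meta_json_text):
    """Return (start, end, span) for a block, based on its meta.json content."""
    meta = json.loads(meta_json_text)

    def to_utc(ms):
        # meta.json stores timestamps as milliseconds since the Unix epoch.
        return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

    start = to_utc(meta["minTime"])
    end = to_utc(meta["maxTime"])
    return start, end, end - start


sample_meta = '{"ulid": "01D9WEEQN5CH9F53GAFCN3G14E", "minTime": 1556806500000, "maxTime": 1556806560000}'
start, end, span = block_time_range(sample_meta)
print(start.isoformat(), end.isoformat(), span.total_seconds())
```

For this block the range is 14:15:00 to 14:16:00 UTC on 2019-05-02, a span of 60 seconds, which lines up with the 1-minute value given to "--storage.tsdb.min-block-duration".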

The takeaway is this: you will probably rarely set "--storage.tsdb.min-block-duration", but "--storage.tsdb.retention.time" only makes sense as a multiple of the time range held by a block directory (2 hours by default), because data is removed one block at a time.

Worth remembering.

This was a good thing to verify.