Fluentdでログを読み込み、ログレベルがWARN以上ならメールを送るようにしてみる

これは、なにをしたくて書いたもの？

エラーログを監視したい、ということを考えた時に、Fluentdを使った場合はどうなるんだっけ？と思ったところ、
オフィシャルドキュメントに例があったので、こちらをちょっと自分で試してみることにしました。

Email Alerting like Splunk - Fluentd

ドキュメントの例

こちらのドキュメントは、Splunkの代わりに、特定の条件に合致する内容がログに含まれていたらメール送信を行うことを、
Fluentdで行ってみるという話のようです。

「fluent-plugin-grepcounter」と「fluent-plugin-mail」使用しつつApacheのアクセスログを読み込み、秒間に3回以上500系の
ステータスコードを検出したらメール送信を行います。

今回のお題

ドキュメントの例を見ていると、秒間に3回以上の閾値を超えるとメールを送信しつつ、Fluentdのstdoutプラグインを使って
デバッグログ出力する内容でした。

これを、今回はこのようにして試してみたいと思います。

JSONフォーマットでログ出力するアプリケーションを用意し、Fluentdで読み込む
ログレベルに関わらず、stdoutプラグインを使ってログを出力する
ログレベルがwarnまたはerrorの場合は、メール送信する

ドキュメントとの違いは、ログレベルがなんであってもログを残すルートを用意することですね。では、このお題でいきましょう。

サンプルアプリケーションの作成

最初に、Fluentdで読み込むためのサンプルアプリケーションを作成します。

Node.jsで作成しましょう。環境は、こちら。

$ node -v
v10.15.3


$ npm -v
6.4.1

Expressを使った、簡単なアプリケーションを作成します。ログ出力には、Winstonを使用してJSONフォーマットのログを出力します。

$ npm i express winston

インストールされたバージョン。

  "dependencies": {
    "express": "^4.17.1",
    "winston": "^3.2.1"
  }

アプリケーションは、シンプルにログレベルに応じたエンドポイントを割り当て、エンドポイントに合わせたログレベルでログ出力
するようにしておきます。
server.js

const express = require('express');
const app = express();

const winston = require('winston');
const format = winston.format;

const logger = winston.createLogger({
    format: format.combine(
        format.timestamp(),
        format.json()
    ),
    transports: [
        new winston.transports.Console(),
        new winston.transports.File({ filename: '/var/log/app/app.log' })
    ]
});

app.get('/info', (req, res) => {
    logger.info('infoログ');
    res.send('info logging');
});

app.get('/warn', (req, res) => {
    logger.warn('warnログ');
    res.send('warn logging');
});

app.get('/error', (req, res) => {
    logger.error('errorログ');
    res.send('error logging');
});

app.listen(3000, () => logger.info('Server startup'));

ログファイルの出力先は、「/var/log/app/app.log」としましょう。

    transports: [
        new winston.transports.Console(),
        new winston.transports.File({ filename: '/var/log/app/app.log' })
    ]

このスクリプトを、「npm start」で起動できるようにpackage.jsonに設定して

  "scripts": {
    "start": "node server.js",
    "test": "echo \"Error: no test specified\" && exit 1"
  },

動作確認。

$ npm start

こんな感じでログが出力されます。
※エンドポイントへのアクセスは、省略（curlで確認しています）

{"message":"Server startup","level":"info","timestamp":"2019-05-26T10:22:25.295Z"}
{"message":"infoログ","level":"info","timestamp":"2019-05-26T10:22:40.474Z"}
{"message":"warnログ","level":"warn","timestamp":"2019-05-26T10:22:43.108Z"}
{"message":"errorログ","level":"error","timestamp":"2019-05-26T10:22:44.823Z"}

これで、ログを出力するアプリケーションの準備はおしまいです。

メールサーバーを用意する

Fluentdの設定に入る前に、メールを受信するメールサーバーを用意しましょう。

今回は、MailCatcherを使用しました。バージョンは、0.7.0です。

起動。

$ mailcatcher --no-quit --foreground --ip=0.0.0.0
Starting MailCatcher
==> smtp://0.0.0.0:1025
/var/lib/gems/2.5.0/gems/thin-1.5.1/lib/thin/server.rb:104: warning: constant ::Fixnum is deprecated
==> http://0.0.0.0:1080/

このMailCatcherが起動しているホストのIPアドレスは、172.17.0.3とします。

Fluentd側の設定

では、Fluentdの設定に移ります。

fluent-plugin-mailをインストールしておきます。

$ sudo td-agent-gem install fluent-plugin-mail

GitHub - u-ichi/fluent-plugin-mail: Fluentd output plugin to send mail

fluent-plugin-mailを含めた、Fluentdおよび各プラグインのバージョンは、こちらです。

2019-05-26 10:41:21 +0000 [info]: starting fluentd-1.4.2 pid=1304 ruby="2.4.6"
2019-05-26 10:41:21 +0000 [info]: spawn command to main:  cmdline=["/opt/td-agent/embedded/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/sbin/td-agent", "--log", "/var/log/td-agent/td-agent.log", "--daemon", "/var/run/td-agent/td-agent.pid", "--under-supervisor"]
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '3.5.1'
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-kafka' version '0.9.4'
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-mail' version '0.3.0'
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-prometheus' version '1.4.0'
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.0.1'
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.2.0'
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-s3' version '1.1.10'
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-td' version '1.0.0'
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-td-monitoring' version '0.2.4'
2019-05-26 10:41:22 +0000 [info]: gem 'fluent-plugin-webhdfs' version '1.2.3'
2019-05-26 10:41:22 +0000 [info]: gem 'fluentd' version '1.4.2'

用意した設定ファイル。
/etc/td-agent/td-agent.conf

<source>
  @type tail
  @id input_tail
  <parse>
    @type json
    time_key timestamp
    time_format %iso8601
    keep_time_key true
  </parse>
  path /var/log/app/app.log

  @label @raw.app.log
  tag app.log
</source>

<label @raw.app.log>
  <match **>
    @type copy
    <store>
      @type stdout
      @id @all.logging
    </store>
    <store>
      @type relabel
      @label @handle.error
    </store>
  </match>
</label>

<label @handle.error>
  <filter **>
    @type grep

    <regexp>
      key level
      pattern /^(warn|error)$/
    </regexp>
  </filter>

  <match **>
    @type mail
    host 172.17.0.3
    port 1025
    from logging-agent@example.com
    to alert@example.com

    subject "level [%s] , cause"
    subject_out_keys level

    out_keys timestamp, message
  </match>
</label>

説明を順次。

まず、tailプラグインで指定のログを読み込みます。JSONでパースするようにして、ログの日付をtime_keyにして、かつtime_keyも
そのまま残す設定としました。

<source>
  @type tail
  @id input_tail
  <parse>
    @type json
    time_key timestamp
    time_format %iso8601
    keep_time_key true
  </parse>

  path /var/log/app/app.log

  @label @raw.app.log
  tag app.log
</source>

ルーティングは、labelで行うようにしてみています。

で、このログをcopyプラグインで、2つのstoreにコピーします。ひとつはstdoutプラグインでFluentdのログに出力し、もうひとつは
relabelプラグインを使って、さらにルーティングします。

<label @raw.app.log>
  <match **>
    @type copy
    <store>
      @type stdout
      @id @all.logging
    </store>
    <store>
      @type relabel
      @label @handle.error
    </store>
  </match>
</label>

relabel - Fluentd

labelを振り直した先では、ログレベル（level）がwarnまたはerrorかどうかをgrep プラグインでマッチさせてみて、条件に一致すれば
メールを送信します。

<label @handle.error>
  <filter **>
    @type grep

    <regexp>
      key level
      pattern /^(warn|error)$/
    </regexp>
  </filter>

  <match **>
    @type mail
    host 172.17.0.3
    port 1025
    from logging-agent@example.com
    to alert@example.com

    subject "level [%s] , cause"
    subject_out_keys level

    out_keys timestamp, message
  </match>
</label>

設定を反映。

$ sudo service td-agent restart

これで、準備は完了です。

確認してみる

それでは、まずはアプリケーションを起動してみます。

$ npm start

> ... start /dir
> node server.js

{"message":"Server startup","level":"info","timestamp":"2019-05-26T10:55:04.723Z"}

起動時にinfoレベルでログ出力されるので、それがFluentd側のログにも現れます。

2019-05-26 10:55:04.723000000 +0000 app.log: {"message":"Server startup","level":"info","timestamp":"2019-05-26T10:55:04.723Z"}

infoレベルなので、メールは送信されません。

ここで、それぞれのエンドポイントに対してアクセスしてみましょう。

$ curl localhost:3000/info
info logging
$ curl localhost:3000/warn
warn logging
$ curl localhost:3000/error
error logging

Fluentd側には、以下のようにログ出力されます。

2019-05-26 10:56:49.531000000 +0000 app.log: {"message":"infoログ","level":"info","timestamp":"2019-05-26T10:56:49.531Z"}
2019-05-26 10:56:52.005000000 +0000 app.log: {"message":"warnログ","level":"warn","timestamp":"2019-05-26T10:56:52.005Z"}
2019-05-26 10:56:56.462000000 +0000 app.log: {"message":"errorログ","level":"error","timestamp":"2019-05-26T10:56:56.462Z"}

MailCatcher側を見ると、2通メールを受信したようです。

==> SMTP: Received message from '<logging-agent@example.com>' (348 bytes)
==> SMTP: Received message from '<logging-agent@example.com>' (350 bytes)

http://[MailCatcherが動作しているホスト]:1080 へアクセスしてみます。

確認。

warnログの方。

f:id:Kazuhira:20190526200006p:plain

errorログの方。

f:id:Kazuhira:20190526200036p:plain

メールが送られてきたのはwarn、errorレベルのログであり、infoのものは入っていませんね。うまくいったようです。

まとめ

Fluentdで、ログをtailしつつ、内容によってはメール送信を行うサンプルを書いてみました。

こうやってログ監視の仕組みをFluentdにさせるのがいいかどうかはちょっと微妙な気もするのですが、まずは入り口として。

今後、Fluentdに限らず他のやり方もちょっと見ていこうかなと思います。

CLOVER🍀

That was when it all began.