CLOVER🍀

That was when it all began.

Implementing TF-IDF (Java)

What is this post for?

I wanted to take a proper look at TF-IDF.

While I was at it, I figured I would implement it myself once.

TF-IDF

TF-IDF is one of the methods for evaluating how important a word is within a document.

tf-idf - Wikipedia

tf-idfについてざっくりまとめ_理論編 | Developers.IO

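It is the product of TF (Term Frequency) and IDF (Inverse Document Frequency), which are defined as follows.

  • TF … the number of occurrences of word t in document d, divided by the total number of occurrences of all words in document d
  • IDF … the logarithm (log) of the total number of documents divided by the number of documents containing word t (the code in this post uses the natural logarithm, Math.log)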
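
Written out as formulas, the two definitions above look roughly like this. The symbols are notation chosen here for illustration: n_{t,d} is the number of occurrences of word t in document d, N is the total number of documents, and df(t) is the number of documents containing t. The base of the logarithm only scales every score by the same constant.

```latex
\mathrm{tf}(t,d) = \frac{n_{t,d}}{\sum_{s \in d} n_{s,d}}, \qquad
\mathrm{idf}(t) = \log\frac{N}{\mathrm{df}(t)}, \qquad
\mathrm{tfidf}(t,d) = \mathrm{tf}(t,d) \cdot \mathrm{idf}(t)
```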

[Figure: formulas for TF, IDF, and TF-IDF]

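A word that appears often within a given document gets a higher importance (TF), and the fewer documents a word appears in, the higher its importance (IDF).
Conversely, a word that appears across many documents gets a lower importance (IDF).

This makes it possible to evaluate the importance of the words in a document.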
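
To make this concrete, here is a minimal sketch on a made-up three-document corpus. The documents, the class name `TfIdfToy`, and the choice of the natural logarithm are all assumptions for illustration, not part of the implementation in this post.

```java
import java.util.List;

final class TfIdfToy {
    // Hypothetical toy corpus: each document is already split into words.
    static final List<List<String>> DOCS = List.of(
            List.of("the", "cache", "stores", "the", "data"),
            List.of("the", "search", "engine", "indexes", "text"),
            List.of("the", "broker", "streams", "the", "events"));

    // TF: occurrences of the word in the document / total words in the document.
    static double tf(List<String> doc, String word) {
        long count = doc.stream().filter(word::equals).count();
        return (double) count / doc.size();
    }

    // IDF: log(total documents / documents containing the word), natural log here.
    // Assumes the word occurs in at least one document.
    static double idf(String word) {
        long df = DOCS.stream().filter(d -> d.contains(word)).count();
        return Math.log((double) DOCS.size() / df);
    }

    static double tfIdf(int docIndex, String word) {
        return tf(DOCS.get(docIndex), word) * idf(word);
    }

    public static void main(String[] args) {
        System.out.printf("the   = %f%n", tfIdf(0, "the"));
        System.out.printf("cache = %f%n", tfIdf(0, "cache"));
    }
}
```

In the first document, "the" is the most frequent word, but since it occurs in all three documents its IDF is log(3/3) = 0 and its score is 0. "cache" occurs only once but in no other document, so it scores 0.2 × log 3.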

Apache Lucene used it as well.
Since Apache Lucene 6.0, the default has been BM25.

https://lucene.apache.org/core/8_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html

[Figure: scoring formula from the TFIDFSimilarity Javadoc]

Look closely, though, and it differs a little from the formula on Wikipedia.

Apparently there are various definitions of the weighting.
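
As a rough illustration of how such variants differ, the sketch below puts the simple definitions next to one commonly seen alternative shape (square-root TF and a smoothed IDF). The class and method names are hypothetical, and the alternative is only an example of a different weighting, not a claim about any particular library.

```java
final class WeightingVariants {
    // Simple form: relative frequency within the document.
    static double tfRelative(int termCount, int totalWords) {
        return (double) termCount / totalWords;
    }

    // Variant: dampen raw counts with a square root.
    static double tfSqrt(int termCount) {
        return Math.sqrt(termCount);
    }

    // Simple form: log(N / df); zero when every document contains the term.
    static double idfPlain(int numDocs, int docFreq) {
        return Math.log((double) numDocs / docFreq);
    }

    // Variant: smoothed, so it is defined for df = 0 and stays >= 1 while df <= N.
    static double idfSmoothed(int numDocs, int docFreq) {
        return 1.0 + Math.log((numDocs + 1.0) / (docFreq + 1.0));
    }

    public static void main(String[] args) {
        int numDocs = 8;
        for (int df : new int[] {1, 4, 8}) {
            System.out.printf("df=%d plain=%f smoothed=%f%n",
                    df, idfPlain(numDocs, df), idfSmoothed(numDocs, df));
        }
    }
}
```

With the plain form, a word found in every document contributes nothing. With the smoothed form it still contributes a little, so which definition is in use changes the ranking.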

Term frequency–Inverse document frequency

[Figure: table of TF-IDF weighting variants]

This time, let's get the basics down with the simple variant described on the Japanese Wikipedia.

Environment

The environment this time is as follows.

$ java --version
openjdk 11.0.4 2019-07-16
OpenJDK Runtime Environment (build 11.0.4+11-post-Ubuntu-1ubuntu218.04.3)
OpenJDK 64-Bit Server VM (build 11.0.4+11-post-Ubuntu-1ubuntu218.04.3, mixed mode, sharing)


$ mvn --version
Apache Maven 3.6.2 (40f52333136460af0dc0d7232c0dc0bcf0d9e117; 2019-08-28T00:06:16+09:00)
Maven home: $HOME/.sdkman/candidates/maven/current
Java version: 11.0.4, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
Default locale: ja_JP, platform encoding: UTF-8
OS name: "linux", version: "4.15.0-70-generic", arch: "amd64", family: "unix"

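The task

As mentioned above, I'll implement TF-IDF in its simple form.

Documents need to be split into words, and I figured Japanese would complicate things, so I picked a few English texts, where whitespace is mostly all that needs to be considered.

They come from around the following pages.

For word splitting, having looked through the documents, I'll keep it simple this time:

  • Split on whitespace and "/"
  • If something like a punctuation mark (".", ",", "!", "?", ":") is at the end of a word, remove it
  • Convert to lowercase
  • Exclude words shorter than 3 characters, to drop words such as "is" (a rough rule)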
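
The rules above can be sketched as a small standalone method. `SimpleTokenizer` is a hypothetical name, and this sketch applies the length filter after normalization, which is an assumption about the intent of the last rule. The sample sentence is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SimpleTokenizer {
    // Punctuation characters that are removed when they end a word.
    private static final String TRAILING = ".,!?:";

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        // Rule 1: split on runs of whitespace or on "/".
        for (String raw : text.split("\\s+|/")) {
            String word = raw;
            // Rule 2: drop one trailing punctuation character, if present.
            if (!word.isEmpty() && TRAILING.indexOf(word.charAt(word.length() - 1)) >= 0) {
                word = word.substring(0, word.length() - 1);
            }
            // Rule 3: lowercase.
            word = word.toLowerCase(Locale.ROOT);
            // Rule 4: skip words shorter than three characters.
            if (word.length() >= 3) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Redis is an in-memory data store, used as a database/cache."));
        // prints [redis, in-memory, data, store, used, database, cache]
    }
}
```

Hyphenated words such as "in-memory" stay in one piece, and "database/cache" becomes two words because of the "/" split.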

And here is the source code I wrote.

A class that holds the word occurrence counts for a document.
src/main/java/org/littlewings/tfidf/DocTermFreq.java

package org.littlewings.tfidf;

import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class DocTermFreq {
    String docName;

    Map<String, Integer> words;

    public static DocTermFreq create(String docName, String text) {
        Map<String, Integer> words = new TreeMap<>();

        for (String word : text.split("\\s+|/")) {
            String normalized;

            List<String> punctuations = List.of(".", ",", "!", "?", ":");  // stripped when found at the end of a word

            if (punctuations.stream().anyMatch(word::endsWith)) {
                normalized = word.substring(0, word.length() - 1).toLowerCase();  // strip the last character and lowercase
            } else {
                normalized = word.toLowerCase();  // lowercase
            }

            // Skip words of 2 characters or fewer.
            // Note: the length is checked on the raw token, so "it." (3 characters) is still counted as "it".
            if (word.length() > 2) {
                words.compute(normalized, (k, v) -> v == null ? 1 : v + 1);
            }
        }

        DocTermFreq docTermFreq = new DocTermFreq();
        docTermFreq.setDocName(docName);
        docTermFreq.setWords(words);

        return docTermFreq;
    }

    public String getDocName() {
        return docName;
    }

    public void setDocName(String docName) {
        this.docName = docName;
    }

    public Map<String, Integer> getWords() {
        return words;
    }

    public void setWords(Map<String, Integer> words) {
        this.words = words;
    }

    public boolean containsWord(String word) {
        return words.containsKey(word);
    }

    public int getWordCount(String word) {
        if (containsWord(word)) {
            return words.get(word);
        } else {
            return 0;
        }
    }

    public int getAllWordCount() {
        return words.values().stream().collect(Collectors.summingInt(Integer::intValue));
    }

    @Override
    public String toString() {
        return words
                .entrySet()
                .stream()
                .map(e -> "  " + e.getKey() + ": " + e.getValue())
                .collect(Collectors.joining(System.lineSeparator()));
    }
}

A class that reads files, builds DocTermFreq instances, and calculates TF-IDF.
src/main/java/org/littlewings/tfidf/TfIdfCalculator.java

package org.littlewings.tfidf;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TfIdfCalculator {
    Map<String, DocTermFreq> docTermFreqMap;

    public TfIdfCalculator() {
        docTermFreqMap = new LinkedHashMap<>();
    }

    public void addDoc(String filePath) {
        Path path = Paths.get(filePath);

        try {
            String text = Files.readString(path, StandardCharsets.UTF_8);
            docTermFreqMap.put(filePath, DocTermFreq.create(filePath, text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public double calc(String filePath, String word) {
        DocTermFreq docTermFreq = docTermFreqMap.get(filePath);

        double tf = (double) docTermFreq.getWordCount(word) / (double) docTermFreq.getAllWordCount();
        double idf =
                Math.log(
                        (double) docTermFreqMap.size() /
                                (double) docTermFreqMap.values().stream().filter(dtfq -> dtfq.containsWord(word)).count()
                );

        return tf * idf;
    }

    public List<DocTermFreq> getDocTermFreqs() {
        return new ArrayList<>(docTermFreqMap.values());
    }

    @Override
    public String toString() {
        return docTermFreqMap
                .entrySet()
                .stream()
                .map(e -> e.getKey() + ":" + System.lineSeparator() + e.getValue())
                .collect(Collectors.joining());
    }
}

This is where TF-IDF is calculated.

    public double calc(String filePath, String word) {
        DocTermFreq docTermFreq = docTermFreqMap.get(filePath);

        double tf = (double) docTermFreq.getWordCount(word) / (double) docTermFreq.getAllWordCount();
        double idf =
                Math.log(
                        (double) docTermFreqMap.size() /
                                (double) docTermFreqMap.values().stream().filter(dtfq -> dtfq.containsWord(word)).count()
                );

        return tf * idf;
    }

This time I chose to compute the word frequencies and the number of documents containing a word on every call.
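
Recomputing on every call is fine for a handful of short documents. If it ever became a bottleneck, the document frequencies could be counted once up front. The following is a minimal sketch of that idea, independent of the classes in this post; `DocFreqIndex` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DocFreqIndex {
    private final Map<String, Integer> docFreq = new HashMap<>();
    private final int numDocs;

    // Each map is one document: word -> occurrences in that document.
    DocFreqIndex(List<Map<String, Integer>> docs) {
        this.numDocs = docs.size();
        for (Map<String, Integer> doc : docs) {
            for (String word : doc.keySet()) {
                // Count each document once per distinct word.
                docFreq.merge(word, 1, Integer::sum);
            }
        }
    }

    // Number of documents containing the word (0 if unknown).
    int docFreq(String word) {
        return docFreq.getOrDefault(word, 0);
    }

    // IDF via a map lookup instead of scanning every document.
    double idf(String word) {
        int df = docFreq(word);
        if (df == 0) {
            throw new IllegalArgumentException("word not in any document: " + word);
        }
        return Math.log((double) numDocs / df);
    }

    public static void main(String[] args) {
        DocFreqIndex index = new DocFreqIndex(List.of(
                Map.of("cache", 2, "data", 1),
                Map.of("data", 3),
                Map.of("search", 1)));
        System.out.println(index.docFreq("data"));
        System.out.println(index.idf("cache"));
    }
}
```

The trade-off is that the index has to be rebuilt or updated whenever a document is added, which the compute-on-demand approach avoids.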

A main class that specifies the target documents and obtains the TF-IDF values.
src/main/java/org/littlewings/tfidf/App.java

package org.littlewings.tfidf;

import java.util.List;
import java.util.stream.Collectors;

public class App {
    public static void main(String... args) {
        List<String> docs = List.of(
                "docs/ignite.txt",
                "docs/infinispan.txt",
                "docs/kafka.txt",
                "docs/lucene.txt",
                "docs/mysql.txt",
                "docs/postgresql.txt",
                "docs/redis.txt",
                "docs/wildfly.txt"
        );

        TfIdfCalculator calculator = new TfIdfCalculator();
        docs.forEach(calculator::addDoc);

        // later
    }
}

Let's check the behavior while changing this "later" part.

Verification

First, something simple.

Specify a document and calculate TF-IDF for words it contains.

        System.out.printf("lucene / tf-idf = %f%n", calculator.calc("docs/lucene.txt", "lucene"));
        System.out.printf("from / tf-idf = %f%n", calculator.calc("docs/kafka.txt", "from"));
        System.out.printf("redis / tf-idf = %f%n", calculator.calc("docs/redis.txt", "redis"));

Result.

lucene / tf-idf = 0.126027
from / tf-idf = 0.004123
redis / tf-idf = 0.097336

Distinctive words such as "lucene" and "redis" get high values, while a word like "from" that shows up almost anywhere gets a low value.

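Next, let's calculate TF-IDF for every word contained in each document.

For each document, I'll display the document name and the number of words it contains, each word, its TF-IDF, the number of times that word appears in the current document, and how many times it appears in each of the documents.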

        calculator.getDocTermFreqs().forEach(docTermFreq -> {
            String docName = docTermFreq.getDocName();

            System.out.printf("%s / %d:%n", docName, docTermFreq.getAllWordCount());

            docTermFreq
                    .getWords()
                    .keySet()
                    .forEach(word -> {
                                System.out.printf(
                                        "  %s / tf-idf = %f, this-doc-count = %d%n",
                                        word,
                                        calculator.calc(docName, word),
                                        docTermFreq.getWordCount(word));
                                System.out.printf(
                                        "    per-doc-count = [%s]%n",
                                        calculator.getDocTermFreqs().stream().map(tfq -> Integer.toString(tfq.getWordCount(word))).collect(Collectors.joining(", "))
                                );
                            }
                    );

            System.out.println();
        });

Here is the result.

docs/ignite.txt / 131:
  accelerate / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  achieve / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 1, 0]
  acid / tf-idf = 0.031747, this-doc-count = 2
    per-doc-count = [2, 0, 0, 0, 0, 0, 0, 0]
  across / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 2]
  airlines / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  american / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  analytical / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  and / tf-idf = 0.000000, this-doc-count = 10
    per-doc-count = [10, 17, 3, 4, 3, 20, 10, 36]
  availability / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 1, 0]
  avoid / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  benefits / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  cache / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 12, 0, 0, 0, 0, 2, 0]
  caching / tf-idf = 0.014974, this-doc-count = 2
    per-doc-count = [2, 2, 0, 1, 0, 0, 0, 0]
  capabilities / tf-idf = 0.003588, this-doc-count = 1
    per-doc-count = [1, 0, 2, 1, 1, 0, 0, 4]
  cluster / tf-idf = 0.005291, this-doc-count = 1
    per-doc-count = [1, 1, 2, 0, 0, 0, 1, 0]
  collocated / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  compliance / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  computations / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  consistency / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 0]
  data / tf-idf = 0.010980, this-doc-count = 5
    per-doc-count = [5, 14, 3, 0, 0, 11, 2, 3]
  database / tf-idf = 0.010763, this-doc-count = 3
    per-doc-count = [3, 4, 0, 0, 2, 5, 1, 0]
  databases / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  delivering / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 1, 0, 0, 0]
  deploy / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  disk / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 1, 1]
  distributed / tf-idf = 0.059898, this-doc-count = 8
    per-doc-count = [8, 8, 1, 0, 0, 0, 0, 0]
  enforce / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  existing / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 0]
  fastest / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  features / tf-idf = 0.005291, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 8, 1, 1]
  fitness / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  for / tf-idf = 0.002039, this-doc-count = 2
    per-doc-count = [2, 10, 1, 0, 1, 6, 2, 14]
  full / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 2]
  grid / tf-idf = 0.021165, this-doc-count = 2
    per-doc-count = [2, 3, 0, 0, 0, 0, 0, 0]
  high / tf-idf = 0.002196, this-doc-count = 1
    per-doc-count = [1, 1, 0, 1, 1, 0, 1, 1]
  homeaway / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  horizontal / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  hour / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  ignite / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  ignite™ / tf-idf = 0.047621, this-doc-count = 3
    per-doc-count = [3, 0, 0, 0, 0, 0, 0, 0]
  in-memory / tf-idf = 0.014974, this-doc-count = 2
    per-doc-count = [2, 1, 0, 0, 0, 0, 3, 0]
  ing / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  jactravel / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  japan / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  joins / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  keep / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  key-value / tf-idf = 0.021165, this-doc-count = 2
    per-doc-count = [2, 0, 0, 0, 0, 1, 0, 0]
  learning / tf-idf = 0.031747, this-doc-count = 2
    per-doc-count = [2, 0, 0, 0, 0, 0, 0, 0]
  machine / tf-idf = 0.031747, this-doc-count = 2
    per-doc-count = [2, 0, 0, 0, 0, 0, 0, 0]
  main / tf-idf = 0.031747, this-doc-count = 2
    per-doc-count = [2, 0, 0, 0, 0, 0, 0, 0]
  many / tf-idf = 0.003588, this-doc-count = 1
    per-doc-count = [1, 4, 0, 0, 1, 9, 0, 1]
  memory / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 5]
  memory-centric / tf-idf = 0.047621, this-doc-count = 3
    per-doc-count = [3, 0, 0, 0, 0, 0, 0, 0]
  models / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  more / tf-idf = 0.002196, this-doc-count = 1
    per-doc-count = [1, 5, 1, 0, 1, 4, 0, 4]
  nodes / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  noise / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  nosql / tf-idf = 0.021165, this-doc-count = 2
    per-doc-count = [2, 4, 0, 0, 0, 0, 0, 0]
  petabyte / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  platform / tf-idf = 0.005291, this-doc-count = 1
    per-doc-count = [1, 1, 2, 0, 0, 1, 0, 0]
  process / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 0, 1, 0, 0, 0, 0, 1]
  processing / tf-idf = 0.021165, this-doc-count = 2
    per-doc-count = [2, 1, 0, 0, 0, 0, 0, 0]
  read / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  relational / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 2, 0, 0]
  sberbank / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  scalability / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 2]
  scale / tf-idf = 0.014974, this-doc-count = 2
    per-doc-count = [2, 0, 0, 0, 0, 1, 0, 2]
  sending / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 1]
  sets / tf-idf = 0.007487, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 2, 0]
  speeds / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  sql / tf-idf = 0.042330, this-doc-count = 4
    per-doc-count = [4, 0, 0, 0, 0, 6, 0, 0]
  storage / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  store / tf-idf = 0.003588, this-doc-count = 1
    per-doc-count = [1, 6, 1, 0, 0, 2, 1, 0]
  streaming / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 4, 0, 0, 0, 0, 0]
  strong / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  support / tf-idf = 0.003588, this-doc-count = 1
    per-doc-count = [1, 2, 0, 0, 0, 1, 2, 2]
  train / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  transact / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  transactional / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 0]
  transactions / tf-idf = 0.005291, this-doc-count = 1
    per-doc-count = [1, 5, 0, 0, 0, 2, 2, 0]
  used / tf-idf = 0.003588, this-doc-count = 1
    per-doc-count = [1, 2, 1, 0, 1, 0, 1, 0]
  wellington / tf-idf = 0.015874, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 0]
  with / tf-idf = 0.004077, this-doc-count = 4
    per-doc-count = [4, 1, 0, 1, 1, 10, 8, 8]
  workloads / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  write / tf-idf = 0.005291, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 1, 0, 1]
  yahoo / tf-idf = 0.010582, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 1, 0, 0, 0]
  your / tf-idf = 0.003588, this-doc-count = 1
    per-doc-count = [1, 8, 0, 0, 0, 5, 1, 16]

docs/infinispan.txt / 527:
  (jcache / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  100% / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  107 / tf-idf = 0.023675, this-doc-count = 6
    per-doc-count = [0, 6, 0, 0, 0, 0, 0, 0]
  about / tf-idf = 0.006576, this-doc-count = 5
    per-doc-count = [0, 5, 0, 0, 1, 1, 0, 2]
  achieve / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 1, 0]
  across / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 2]
  actively / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  adding / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  addition / tf-idf = 0.005583, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 1, 0, 1]
  all / tf-idf = 0.005583, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 2, 0, 6]
  allows / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 5]
  already / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  also / tf-idf = 0.014889, this-doc-count = 8
    per-doc-count = [0, 8, 0, 0, 0, 0, 1, 6]
  among / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  and / tf-idf = 0.000000, this-doc-count = 17
    per-doc-count = [10, 17, 3, 4, 3, 20, 10, 36]
  another / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  any / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 2]
  api / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 2]
  apis / tf-idf = 0.002631, this-doc-count = 2
    per-doc-count = [0, 2, 0, 1, 0, 1, 0, 1]
  application / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 12]
  applications / tf-idf = 0.000892, this-doc-count = 1
    per-doc-count = [0, 1, 3, 0, 2, 1, 0, 9]
  architecture / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 1]
  architectures / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  are / tf-idf = 0.002631, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 3, 1, 12]
  availability / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 1, 0]
  available / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  backed / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  been / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 3, 0, 1]
  being / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 2, 0, 0]
  better / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  both / tf-idf = 0.003722, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 1, 0, 1]
  bottleneck / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  brokered / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  business / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  but / tf-idf = 0.007445, this-doc-count = 4
    per-doc-count = [0, 4, 0, 0, 0, 0, 1, 3]
  cache / tf-idf = 0.022334, this-doc-count = 12
    per-doc-count = [1, 12, 0, 0, 0, 0, 2, 0]
  cache.get(key) / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  caching / tf-idf = 0.003722, this-doc-count = 2
    per-doc-count = [2, 2, 0, 1, 0, 0, 0, 0]
  call / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  can / tf-idf = 0.013378, this-doc-count = 15
    per-doc-count = [0, 15, 1, 0, 0, 6, 4, 6]
  case / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 2]
  cases / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  client / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  cluster / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [1, 1, 2, 0, 0, 0, 1, 0]
  clusterability / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  clustered / tf-idf = 0.015783, this-doc-count = 4
    per-doc-count = [0, 4, 0, 0, 0, 0, 0, 0]
  code / tf-idf = 0.005583, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 1, 0, 3]
  coherency / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  common / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  concurrent / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 1]
  consistency / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 0]
  consumed / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  core / tf-idf = 0.002631, this-doc-count = 2
    per-doc-count = [0, 2, 0, 3, 0, 2, 0, 1]
  data / tf-idf = 0.007642, this-doc-count = 14
    per-doc-count = [5, 14, 3, 0, 0, 11, 2, 3]
  database / tf-idf = 0.003567, this-doc-count = 4
    per-doc-count = [3, 4, 0, 0, 2, 5, 1, 0]
  dataset / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 2, 0]
  delegating / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  demos / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  designed / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  development / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 3]
  disabled / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  disk-based / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  distribute / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  distributed / tf-idf = 0.014889, this-doc-count = 8
    per-doc-count = [8, 8, 1, 0, 0, 0, 0, 0]
  documentation / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  easily / tf-idf = 0.007892, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 0, 0, 2]
  efficiently / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  embedding / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  engines / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  enough / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  even / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 2]
  example / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 1]
  execution / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 1]
  exist / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  existing / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 0]
  expert / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  explore / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 0]
  expose / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  exposes / tf-idf = 0.011837, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 0, 0, 0]
  extends / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  extremely / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  feel / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  feet / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  finalized / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  find / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  firstly / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  for / tf-idf = 0.002534, this-doc-count = 10
    per-doc-count = [2, 10, 1, 0, 1, 6, 2, 14]
  form / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  framework / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  frameworks / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 3]
  from / tf-idf = 0.001784, this-doc-count = 2
    per-doc-count = [0, 2, 1, 0, 0, 1, 1, 5]
  front / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  fully / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  get / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 0]
  github / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  greater / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  greatly / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  grid / tf-idf = 0.007892, this-doc-count = 3
    per-doc-count = [2, 3, 0, 0, 0, 0, 0, 0]
  ground-up / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  group / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  guides / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  has / tf-idf = 0.000546, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 1, 7, 1, 5]
  have / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 8]
  heated / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  heavy / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  help / tf-idf = 0.007892, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 2, 0, 0]
  here / tf-idf = 0.011837, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 0, 0, 0]
  hibernate / tf-idf = 0.011837, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 0, 0, 0]
  high / tf-idf = 0.000546, this-doc-count = 1
    per-doc-count = [1, 1, 0, 1, 1, 0, 1, 1]
  high-performance / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  highly / tf-idf = 0.003722, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 3, 0, 1]
  hook / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  hot / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  how / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 1, 0, 0]
  however / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  hurt / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  implement / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  implementations / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  important / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  improve / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  in-memory / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [2, 1, 0, 0, 0, 0, 3, 0]
  including / tf-idf = 0.000892, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 1, 2, 0, 1]
  indexing / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 0, 2, 0, 0]
  infinispan / tf-idf = 0.078916, this-doc-count = 20
    per-doc-count = [0, 20, 0, 0, 0, 0, 0, 0]
  infinispan's / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  interface / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 0, 2, 0, 0]
  into / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 4]
  introduction / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  isn't / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  it / tf-idf = 0.011837, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 0, 0, 0]
  its / tf-idf = 0.002631, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 1, 1, 1, 0]
  java / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 6]
  java's / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  java) / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  java.util.map / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  jsr / tf-idf = 0.023675, this-doc-count = 6
    per-doc-count = [0, 6, 0, 0, 0, 0, 0, 0]
  jta / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  key / tf-idf = 0.010522, this-doc-count = 4
    per-doc-count = [0, 4, 2, 0, 0, 0, 0, 0]
  learn / tf-idf = 0.015783, this-doc-count = 4
    per-doc-count = [0, 4, 0, 0, 0, 0, 0, 0]
  letting / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  libraries / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 1]
  lifting / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  local / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  lookup / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  lookups / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  lots / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  make / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  making / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  management / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 5]
  manager / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  many / tf-idf = 0.003567, this-doc-count = 4
    per-doc-count = [1, 4, 0, 0, 1, 9, 0, 1]
  map / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  memory / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 5]
  modern / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 3]
  modes / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  module / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  more / tf-idf = 0.002729, this-doc-count = 5
    per-doc-count = [1, 5, 1, 0, 1, 4, 0, 4]
  most / tf-idf = 0.003946, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 1, 1, 3, 0]
  move / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  multi-core / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  multi-processor / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  necessary / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  need / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 3]
  network / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  non-jvm / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  nosql / tf-idf = 0.010522, this-doc-count = 4
    per-doc-count = [2, 4, 0, 0, 0, 0, 0, 0]
  object / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  often / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 1]
  once / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  one / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 0]
  open / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 1, 4, 1, 0]
  optional / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  optionally / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  other / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 1, 1]
  our / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 0, 0, 0, 0]
  over / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 4]
  own / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  part / tf-idf = 0.003722, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 1, 0, 1]
  participate / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  participates / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  peer-to-peer / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  people / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  perform / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  performance / tf-idf = 0.001638, this-doc-count = 3
    per-doc-count = [0, 3, 0, 1, 1, 1, 1, 4]
  permanent / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  persist / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  platform / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [1, 1, 2, 0, 0, 1, 0, 0]
  platforms / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  plug-in / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  pluggable / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  powerful / tf-idf = 0.003722, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 2, 0, 1]
  preview / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  primary / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  processing / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [2, 1, 0, 0, 0, 0, 0, 0]
  protocol / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  purpose / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  putting / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  quick / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  reasons / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  reduce / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  relational / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 2, 0, 0]
  resources / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 3]
  retrieval / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  retrieving / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  rod / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  sample / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  scalable / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  searches / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  searching / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  second-level / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  section / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  servers / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 1]
  simple / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  since / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  some / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  source / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 1, 4, 1, 0]
  speed / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 3]
  standard / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 5, 0, 0]
  standards / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 7]
  start / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  state / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  store / tf-idf = 0.005351, this-doc-count = 6
    per-doc-count = [1, 6, 1, 0, 0, 2, 1, 0]
  stores / tf-idf = 0.010522, this-doc-count = 4
    per-doc-count = [0, 4, 1, 0, 0, 0, 0, 0]
  stream / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  streams / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [0, 1, 5, 0, 0, 1, 1, 0]
  structure / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 1, 0]
  such / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 3, 1, 0]
  support / tf-idf = 0.001784, this-doc-count = 2
    per-doc-count = [1, 2, 0, 0, 0, 1, 2, 2]
  supports / tf-idf = 0.003722, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 2, 4]
  system / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 2, 0, 1]
  team / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  temporary / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  that / tf-idf = 0.007892, this-doc-count = 6
    per-doc-count = [0, 6, 4, 0, 0, 8, 0, 13]
  the / tf-idf = 0.003294, this-doc-count = 13
    per-doc-count = [0, 13, 3, 2, 2, 21, 6, 50]
  these / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 1, 7]
  this / tf-idf = 0.014889, this-doc-count = 8
    per-doc-count = [0, 8, 0, 0, 0, 2, 0, 5]
  time / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 2, 0, 2]
  today / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  topic / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  transaction / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  transactional / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 0]
  transactions / tf-idf = 0.006576, this-doc-count = 5
    per-doc-count = [1, 5, 0, 0, 0, 2, 2, 0]
  true / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  tutorials / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  two / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 1, 1]
  use / tf-idf = 0.011837, this-doc-count = 9
    per-doc-count = [0, 9, 0, 0, 0, 1, 2, 6]
  used / tf-idf = 0.001784, this-doc-count = 2
    per-doc-count = [1, 2, 1, 0, 1, 0, 1, 0]
  valid / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  value / tf-idf = 0.005583, this-doc-count = 3
    per-doc-count = [0, 3, 1, 0, 0, 0, 1, 0]
  various / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 2]
  very / tf-idf = 0.001861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 1]
  visit / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  way / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 0]
  website / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  well / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 0, 0, 0, 0]
  what / tf-idf = 0.003946, this-doc-count = 3
    per-doc-count = [0, 3, 1, 0, 0, 1, 0, 1]
  when / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  where / tf-idf = 0.002631, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 1, 1, 1]
  which / tf-idf = 0.007892, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 0, 0, 5]
  why / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  will / tf-idf = 0.005261, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 2]
  with / tf-idf = 0.000253, this-doc-count = 1
    per-doc-count = [4, 1, 0, 1, 1, 10, 8, 8]
  would / tf-idf = 0.003946, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 0]
  write / tf-idf = 0.001315, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 1, 0, 1]
  written / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  yes / tf-idf = 0.007892, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 0]
  yet / tf-idf = 0.002631, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  you / tf-idf = 0.007892, this-doc-count = 6
    per-doc-count = [0, 6, 0, 0, 0, 5, 4, 19]
  your / tf-idf = 0.007135, this-doc-count = 8
    per-doc-count = [1, 8, 0, 0, 0, 5, 1, 16]

docs/kafka.txt / 114:
  and / tf-idf = 0.000000, this-doc-count = 3
    per-doc-count = [10, 17, 3, 4, 3, 20, 10, 36]
  apache / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 0, 1, 2, 0, 0, 0, 0]
  applications / tf-idf = 0.012369, this-doc-count = 3
    per-doc-count = [0, 1, 3, 0, 2, 1, 0, 9]
  between / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 1]
  bottom / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  broad / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  building / tf-idf = 0.017208, this-doc-count = 2
    per-doc-count = [0, 0, 2, 0, 0, 1, 0, 1]
  called / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 1]
  can / tf-idf = 0.004123, this-doc-count = 1
    per-doc-count = [0, 15, 1, 0, 0, 6, 4, 6]
  capabilities / tf-idf = 0.008246, this-doc-count = 2
    per-doc-count = [1, 0, 2, 1, 1, 0, 0, 4]
  categories / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  classes / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 5]
  cluster / tf-idf = 0.012160, this-doc-count = 2
    per-doc-count = [1, 1, 2, 0, 0, 0, 1, 0]
  concepts / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  consists / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  data / tf-idf = 0.007571, this-doc-count = 3
    per-doc-count = [5, 14, 3, 0, 0, 11, 2, 3]
  datacenters / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  distributed / tf-idf = 0.008604, this-doc-count = 1
    per-doc-count = [8, 8, 1, 0, 0, 0, 0, 0]
  dive / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  does / tf-idf = 0.017208, this-doc-count = 2
    per-doc-count = [0, 0, 2, 0, 0, 1, 0, 1]
  durable / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  each / tf-idf = 0.008604, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 1]
  enterprise / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 2]
  exactly / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  explore / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 0]
  fault-tolerant / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 1, 0, 0]
  few / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  first / tf-idf = 0.008604, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 1]
  for / tf-idf = 0.001171, this-doc-count = 1
    per-doc-count = [2, 10, 1, 0, 1, 6, 2, 14]
  from / tf-idf = 0.004123, this-doc-count = 1
    per-doc-count = [0, 2, 1, 0, 0, 1, 1, 5]
  generally / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  get / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 0]
  has / tf-idf = 0.002524, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 1, 7, 1, 5]
  how / tf-idf = 0.008604, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 1, 0, 0]
  kafka / tf-idf = 0.091204, this-doc-count = 5
    per-doc-count = [0, 0, 5, 0, 0, 0, 0, 0]
  kafka's / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  key / tf-idf = 0.024321, this-doc-count = 2
    per-doc-count = [0, 4, 2, 0, 0, 0, 0, 0]
  let's / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  mean / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  message / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 0]
  messaging / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  more / tf-idf = 0.002524, this-doc-count = 1
    per-doc-count = [1, 5, 1, 0, 1, 4, 0, 4]
  multiple / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  occur / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  one / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 0]
  pipelines / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  platform / tf-idf = 0.012160, this-doc-count = 2
    per-doc-count = [1, 1, 2, 0, 0, 1, 0, 0]
  process / tf-idf = 0.008604, this-doc-count = 1
    per-doc-count = [1, 0, 1, 0, 0, 0, 0, 1]
  publish / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  queue / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  react / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  real-time / tf-idf = 0.036481, this-doc-count = 2
    per-doc-count = [0, 0, 2, 0, 0, 0, 0, 0]
  record / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  records / tf-idf = 0.072963, this-doc-count = 4
    per-doc-count = [0, 0, 4, 0, 0, 0, 0, 0]
  reliably / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  run / tf-idf = 0.008604, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 3]
  servers / tf-idf = 0.008604, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 1]
  similar / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  span / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  store / tf-idf = 0.004123, this-doc-count = 1
    per-doc-count = [1, 6, 1, 0, 0, 2, 1, 0]
  stores / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 4, 1, 0, 0, 0, 0, 0]
  streaming / tf-idf = 0.048642, this-doc-count = 4
    per-doc-count = [1, 0, 4, 0, 0, 0, 0, 0]
  streams / tf-idf = 0.030401, this-doc-count = 5
    per-doc-count = [0, 1, 5, 0, 0, 1, 1, 0]
  subscribe / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  system / tf-idf = 0.006080, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 2, 0, 1]
  systems / tf-idf = 0.006080, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 2, 3, 1]
  that / tf-idf = 0.024321, this-doc-count = 4
    per-doc-count = [0, 6, 4, 0, 0, 8, 0, 13]
  the / tf-idf = 0.003514, this-doc-count = 3
    per-doc-count = [0, 13, 3, 2, 2, 21, 6, 50]
  these / tf-idf = 0.006080, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 1, 7]
  they / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  things / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  three / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  timestamp / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  topics / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  transform / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  two / tf-idf = 0.006080, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 1, 1]
  understand / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  up / tf-idf = 0.018241, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 0]
  used / tf-idf = 0.004123, this-doc-count = 1
    per-doc-count = [1, 2, 1, 0, 1, 0, 1, 0]
  value / tf-idf = 0.008604, this-doc-count = 1
    per-doc-count = [0, 3, 1, 0, 0, 0, 1, 0]
  way / tf-idf = 0.012160, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 0]
  what / tf-idf = 0.006080, this-doc-count = 1
    per-doc-count = [0, 3, 1, 0, 0, 1, 0, 1]

docs/lucene.txt / 66:
  admin / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  advanced / tf-idf = 0.021004, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 0, 0]
  analysis / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  and / tf-idf = 0.000000, this-doc-count = 4
    per-doc-count = [10, 17, 3, 4, 3, 20, 10, 36]
  apache / tf-idf = 0.042009, this-doc-count = 2
    per-doc-count = [0, 0, 1, 2, 0, 0, 0, 0]
  apis / tf-idf = 0.010502, this-doc-count = 1
    per-doc-count = [0, 2, 0, 1, 0, 1, 0, 1]
  around / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  built / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  caching / tf-idf = 0.014861, this-doc-count = 1
    per-doc-count = [2, 2, 0, 1, 0, 0, 0, 0]
  capabilities / tf-idf = 0.007121, this-doc-count = 1
    per-doc-count = [1, 0, 2, 1, 1, 0, 0, 4]
  core / tf-idf = 0.031507, this-doc-count = 3
    per-doc-count = [0, 2, 0, 3, 0, 2, 0, 1]
  develops / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  faceted / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  flagship / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  high / tf-idf = 0.004359, this-doc-count = 1
    per-doc-count = [1, 1, 0, 1, 1, 0, 1, 1]
  highlighting / tf-idf = 0.063013, this-doc-count = 2
    per-doc-count = [0, 0, 0, 2, 0, 0, 0, 0]
  hit / tf-idf = 0.063013, this-doc-count = 2
    per-doc-count = [0, 0, 0, 2, 0, 0, 0, 0]
  http / tf-idf = 0.021004, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 1]
  including / tf-idf = 0.007121, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 1, 2, 0, 1]
  indexing / tf-idf = 0.014861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 0, 2, 0, 0]
  interface / tf-idf = 0.014861, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 0, 2, 0, 0]
  java-based / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  json / tf-idf = 0.014861, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 2, 0, 1]
  lucene / tf-idf = 0.126027, this-doc-count = 4
    per-doc-count = [0, 0, 0, 4, 0, 0, 0, 0]
  open-source / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  our / tf-idf = 0.021004, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 0, 0, 0, 0]
  performance / tf-idf = 0.004359, this-doc-count = 1
    per-doc-count = [0, 3, 0, 1, 1, 1, 1, 4]
  project / tf-idf = 0.042009, this-doc-count = 2
    per-doc-count = [0, 0, 0, 2, 0, 2, 0, 0]
  provides / tf-idf = 0.014861, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 1, 1]
  pylucene / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  python / tf-idf = 0.042009, this-doc-count = 2
    per-doc-count = [0, 0, 0, 2, 0, 1, 0, 0]
  replication / tf-idf = 0.010502, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 2, 1]
  ruby / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  search / tf-idf = 0.084018, this-doc-count = 4
    per-doc-count = [0, 0, 0, 4, 0, 2, 0, 0]
  server / tf-idf = 0.021004, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 8]
  software / tf-idf = 0.021004, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 0, 0]
  solrtm / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  spellchecking / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  sub-project / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  technology / tf-idf = 0.021004, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 1]
  the / tf-idf = 0.004046, this-doc-count = 2
    per-doc-count = [0, 13, 3, 2, 2, 21, 6, 50]
  tokenization / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  using / tf-idf = 0.010502, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 1, 4]
  web / tf-idf = 0.014861, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 2, 0, 0, 11]
  welcome / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  well / tf-idf = 0.021004, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 0, 0, 0, 0]
  with / tf-idf = 0.002023, this-doc-count = 1
    per-doc-count = [4, 1, 0, 1, 1, 10, 8, 8]
  wrapper / tf-idf = 0.031507, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 0]
  xml / tf-idf = 0.021004, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 0, 0]

docs/mysql.txt / 56:
  about / tf-idf = 0.012378, this-doc-count = 1
    per-doc-count = [0, 5, 0, 0, 1, 1, 0, 2]
  and / tf-idf = 0.000000, this-doc-count = 3
    per-doc-count = [10, 17, 3, 4, 3, 20, 10, 36]
  applications / tf-idf = 0.016786, this-doc-count = 2
    per-doc-count = [0, 1, 3, 0, 2, 1, 0, 9]
  become / tf-idf = 0.017515, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 1, 0, 2]
  capabilities / tf-idf = 0.008393, this-doc-count = 1
    per-doc-count = [1, 0, 2, 1, 1, 0, 0, 4]
  choice / tf-idf = 0.024755, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 1, 0, 0]
  cloud / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  database / tf-idf = 0.016786, this-doc-count = 2
    per-doc-count = [3, 4, 0, 0, 2, 5, 1, 0]
  delivering / tf-idf = 0.024755, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 1, 0, 0, 0]
  drives / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  ease-of-use / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  embedded / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  facebook / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  for / tf-idf = 0.002384, this-doc-count = 1
    per-doc-count = [2, 10, 1, 0, 1, 6, 2, 14]
  generation / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  has / tf-idf = 0.005137, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 1, 7, 1, 5]
  high / tf-idf = 0.005137, this-doc-count = 1
    per-doc-count = [1, 1, 0, 1, 1, 0, 1, 1]
  including / tf-idf = 0.008393, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 1, 2, 0, 1]
  innovation / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  its / tf-idf = 0.012378, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 1, 1, 1, 0]
  leading / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  many / tf-idf = 0.008393, this-doc-count = 1
    per-doc-count = [1, 4, 0, 0, 1, 9, 0, 1]
  mobile / tf-idf = 0.024755, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 1]
  more / tf-idf = 0.005137, this-doc-count = 1
    per-doc-count = [1, 5, 1, 0, 1, 4, 0, 4]
  most / tf-idf = 0.012378, this-doc-count = 1
    per-doc-count = [0, 3, 0, 0, 1, 1, 3, 0]
  mysql / tf-idf = 0.148532, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 4, 0, 0, 0]
  new / tf-idf = 0.024755, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 1]
  next / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  open / tf-idf = 0.012378, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 1, 4, 1, 0]
  oracle / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  performance / tf-idf = 0.005137, this-doc-count = 1
    per-doc-count = [0, 3, 0, 1, 1, 1, 1, 4]
  popular / tf-idf = 0.024755, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 1, 0, 0]
  power / tf-idf = 0.024755, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 1]
  profile / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  properties / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  proven / tf-idf = 0.024755, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 2, 0, 0]
  reliability / tf-idf = 0.024755, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 2, 0, 0]
  source / tf-idf = 0.012378, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 1, 4, 1, 0]
  the / tf-idf = 0.004769, this-doc-count = 2
    per-doc-count = [0, 13, 3, 2, 2, 21, 6, 50]
  twitter / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  used / tf-idf = 0.008393, this-doc-count = 1
    per-doc-count = [1, 2, 1, 0, 1, 0, 1, 0]
  web / tf-idf = 0.035030, this-doc-count = 2
    per-doc-count = [0, 0, 0, 1, 2, 0, 0, 11]
  web-based / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  with / tf-idf = 0.002384, this-doc-count = 1
    per-doc-count = [4, 1, 0, 1, 1, 10, 8, 8]
  world's / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]
  yahoo / tf-idf = 0.024755, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 1, 0, 0, 0]
  youtube / tf-idf = 0.037133, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 0]

docs/postgresql.txt / 570:
  (and / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  (hstore) / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  (jit) / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  (mvcc) / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  (pitr) / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  (via / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  (wal) / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  160 / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  179 / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  1986 / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  2001 / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  2019 / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  about / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 5, 0, 0, 1, 1, 0, 2]
  accent-insensitive / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  access-control / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  accommodate / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  acid-compliant / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  active / tf-idf = 0.010944, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 3, 0, 0]
  add-ons / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  added / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  addition / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 3, 0, 0, 0, 1, 0, 1]
  additional / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  additionally / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  administrators / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  advanced / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 0, 0]
  advisory / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  aimed / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  all / tf-idf = 0.003442, this-doc-count = 2
    per-doc-count = [0, 3, 0, 0, 0, 2, 0, 6]
  and / tf-idf = 0.000000, this-doc-count = 20
    per-doc-count = [10, 17, 3, 4, 3, 20, 10, 36]
  apis / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 2, 0, 1, 0, 1, 0, 1]
  applications / tf-idf = 0.000825, this-doc-count = 1
    per-doc-count = [0, 1, 3, 0, 2, 1, 0, 9]
  architectural / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  architecture / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 1]
  are / tf-idf = 0.003648, this-doc-count = 3
    per-doc-count = [0, 2, 0, 0, 0, 3, 1, 12]
  array / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  asynchronous / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  authentication / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  b-tree / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  back / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  become / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 1, 0, 2]
  been / tf-idf = 0.005162, this-doc-count = 3
    per-doc-count = [0, 1, 0, 0, 0, 3, 0, 1]
  behind / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  being / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 2, 0, 0]
  below / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  berkeley / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  big / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  bloom / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  boolean / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  both / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 1, 0, 1]
  brin / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  build / tf-idf = 0.012160, this-doc-count = 5
    per-doc-count = [0, 0, 0, 0, 0, 5, 0, 1]
  building / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 0, 2, 0, 0, 1, 0, 1]
  california / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  can / tf-idf = 0.004947, this-doc-count = 6
    per-doc-count = [0, 15, 1, 0, 0, 6, 4, 6]
  case-insensitive / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  certificate / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  certificates / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  challenges / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  character / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  choice / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 1, 0, 0]
  circle / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  clusters / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  code / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 3, 0, 0, 0, 1, 0, 3]
  collations / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  column / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  combined / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 2]
  comes / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  community / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  compilation / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  complicated / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  composite / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  concurrency / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  concurrent / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 1]
  conform / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  conformance / tf-idf = 0.014593, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 0, 4, 0, 0]
  conforms / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  connect / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  consistently / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  constraints / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  contradict / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  control / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  core / tf-idf = 0.002432, this-doc-count = 2
    per-doc-count = [0, 2, 0, 3, 0, 2, 0, 1]
  could / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  covering / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  custom / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 2]
  customizable / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  customizations / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  data / tf-idf = 0.005552, this-doc-count = 11
    per-doc-count = [5, 14, 3, 0, 0, 11, 2, 3]
  database / tf-idf = 0.004123, this-doc-count = 5
    per-doc-count = [3, 4, 0, 0, 2, 5, 1, 0]
  databases / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  dataset / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 2, 0]
  date / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  decisions / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  dedication / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  define / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  defined / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 1]
  deliver / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  developers / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  development / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 3]
  different / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  differing / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  disaster / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  discover / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  document / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  documentation / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  does / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 0, 2, 0, 0, 1, 0, 1]
  e.g / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  earned / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  easier / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  environments / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  even / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 2]
  every / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 1]
  example / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 1]
  exclusion / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  expected / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  explicit / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  expressions / tf-idf = 0.010944, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 3, 0, 0]
  extender / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  extends / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  extensibility / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  extensible / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  extensions / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  fault-tolerant / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 1, 0, 0]
  feature / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  features / tf-idf = 0.009728, this-doc-count = 8
    per-doc-count = [1, 0, 0, 0, 0, 8, 1, 1]
  filters / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  for / tf-idf = 0.001406, this-doc-count = 6
    per-doc-count = [2, 10, 1, 0, 1, 6, 2, 14]
  foreign / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  found / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  free / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  from / tf-idf = 0.000825, this-doc-count = 1
    per-doc-count = [0, 2, 1, 0, 0, 1, 1, 5]
  full / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 2]
  full-text / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  function / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  functionality / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  functions / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  further / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  geometry / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  geospatial / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  getting / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  gin / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  gist / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  gssapi / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  has / tf-idf = 0.003533, this-doc-count = 7
    per-doc-count = [0, 1, 1, 0, 1, 7, 1, 5]
  have / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 8]
  help / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 3, 0, 0, 0, 2, 0, 0]
  highly / tf-idf = 0.005162, this-doc-count = 3
    per-doc-count = [0, 2, 0, 0, 0, 3, 0, 1]
  how / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 1, 0, 0]
  icu / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  including / tf-idf = 0.001649, this-doc-count = 2
    per-doc-count = [0, 1, 0, 1, 1, 2, 0, 1]
  index-only / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  indexes / tf-idf = 0.007296, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 3, 1, 0]
  indexing / tf-idf = 0.003442, this-doc-count = 2
    per-doc-count = [0, 1, 0, 1, 0, 2, 0, 0]
  inexhaustive / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  innovative / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  integer / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  integrity / tf-idf = 0.010944, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 3, 0, 0]
  interface / tf-idf = 0.003442, this-doc-count = 2
    per-doc-count = [0, 1, 0, 1, 0, 2, 0, 0]
  international / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  internationalisation / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  isolation / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  its / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 1, 1, 1, 0]
  json / tf-idf = 0.003442, this-doc-count = 2
    per-doc-count = [0, 0, 0, 1, 0, 2, 0, 1]
  jsonb / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  just-in-time / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  key-value / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [2, 0, 0, 0, 0, 1, 0, 0]
  keys / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 2, 0]
  knn / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  language / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  languages / tf-idf = 0.003442, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 1, 1]
  ldap / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  lead / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  least / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  let / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  levels / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  line / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  list / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  locks / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  logging / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  logical / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  major / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  manage / tf-idf = 0.014593, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 0, 4, 0, 0]
  mandatory / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  many / tf-idf = 0.007421, this-doc-count = 9
    per-doc-count = [1, 4, 0, 0, 1, 9, 0, 1]
  matter / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  meets / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  method / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  more / tf-idf = 0.002019, this-doc-count = 4
    per-doc-count = [1, 5, 1, 0, 1, 4, 0, 4]
  more) / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  most / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 3, 0, 0, 1, 1, 3, 0]
  moves / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  multi-factor / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  multi-version / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  multicolumn / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  nested / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  never / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  not / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 3]
  null / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  number / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 2]
  numeric / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  object-relational / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  october / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  open / tf-idf = 0.004864, this-doc-count = 4
    per-doc-count = [0, 1, 0, 0, 1, 4, 1, 0]
  operating / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  optimizer / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  organisations / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  origins / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  other / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 1, 1]
  out / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 1, 0]
  over / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 4]
  own / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  parallelization / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  part / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 1, 0, 1]
  partial / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  partitioning / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  path / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  people / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  performance / tf-idf = 0.000505, this-doc-count = 1
    per-doc-count = [0, 3, 0, 1, 1, 1, 1, 4]
  performant / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  perl / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  petabytes / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  pgsql / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  pick / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  planner / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  platform / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [1, 1, 2, 0, 0, 1, 0, 0]
  point / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  point-in-time-recovery / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  polygon / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  poor / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  popular / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 1, 0, 0]
  postgis / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  postgres / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  postgresql / tf-idf = 0.072963, this-doc-count = 20
    per-doc-count = [0, 0, 0, 0, 0, 20, 0, 0]
  powerful / tf-idf = 0.003442, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 2, 0, 1]
  primary / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  primitives / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  procedural / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  procedures / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  production / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  programming / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  project / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 2, 0, 2, 0, 0]
  protect / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  proven / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 1, 2, 0, 0]
  provide / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  python / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 2, 0, 1, 0, 0]
  quantity / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  queries / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 2, 0]
  query / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  range / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  read / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  recompiling / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  recovery / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  relational / tf-idf = 0.003442, this-doc-count = 2
    per-doc-count = [1, 1, 0, 0, 0, 2, 0, 0]
  release / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  reliability / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 1, 2, 0, 0]
  replication / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 2, 1]
  reputation / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  required / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  robust / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  robustly / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  row-level / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  runs / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  safely / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  savepoints) / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  scalable / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  scale / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [2, 0, 0, 0, 0, 1, 0, 2]
  scans / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  scram-sha-256 / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  search / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 4, 0, 2, 0, 0]
  security / tf-idf = 0.007296, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 0]
  serializable / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  set / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 2, 0]
  sets / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 2, 0]
  sheer / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  since / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  slightly / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  small / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  software / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 0, 0]
  solutions / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 2]
  solve / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  sometimes / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  sophisticated / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  source / tf-idf = 0.004864, this-doc-count = 4
    per-doc-count = [0, 1, 0, 0, 1, 4, 1, 0]
  sp-gist / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  specialized / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  sql / tf-idf = 0.014593, this-doc-count = 6
    per-doc-count = [4, 0, 0, 0, 0, 6, 0, 0]
  sql:2016 / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  sspi / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  standard / tf-idf = 0.012160, this-doc-count = 5
    per-doc-count = [0, 1, 0, 0, 0, 5, 0, 0]
  standbys / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  started / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  statistics / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  storage / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  store / tf-idf = 0.001649, this-doc-count = 2
    per-doc-count = [1, 6, 1, 0, 0, 2, 1, 0]
  stored / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  streams / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 1, 5, 0, 0, 1, 1, 0]
  string / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  strong / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  structured / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  such / tf-idf = 0.005162, this-doc-count = 3
    per-doc-count = [0, 1, 0, 0, 0, 3, 1, 0]
  support / tf-idf = 0.000825, this-doc-count = 1
    per-doc-count = [1, 2, 0, 0, 0, 1, 2, 2]
  supported / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  surprise / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  synchronous / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  syntax / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  system / tf-idf = 0.002432, this-doc-count = 2
    per-doc-count = [0, 1, 1, 0, 0, 2, 0, 1]
  systems / tf-idf = 0.002432, this-doc-count = 2
    per-doc-count = [0, 0, 1, 0, 0, 2, 3, 1]
  table / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  tables / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  tablespaces / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  terabytes / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  text / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  than / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 3]
  that / tf-idf = 0.009728, this-doc-count = 8
    per-doc-count = [0, 6, 4, 0, 0, 8, 0, 13]
  the / tf-idf = 0.004920, this-doc-count = 21
    per-doc-count = [0, 13, 3, 2, 2, 21, 6, 50]
  there / tf-idf = 0.004864, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 2, 0]
  this / tf-idf = 0.003442, this-doc-count = 2
    per-doc-count = [0, 8, 0, 0, 0, 2, 0, 5]
  though / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  through / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  time / tf-idf = 0.003442, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 2, 0, 2]
  towards / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  traditional / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 2]
  transaction / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  transactions / tf-idf = 0.002432, this-doc-count = 2
    per-doc-count = [1, 5, 0, 0, 0, 2, 2, 0]
  tries / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  types / tf-idf = 0.007296, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 3, 1, 0]
  unique / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  university / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  use / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 9, 0, 0, 0, 1, 2, 6]
  users / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  uses / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  using / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 1, 4]
  uuid / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  various / tf-idf = 0.001721, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 2]
  version / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  want / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  what / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 3, 1, 0, 0, 1, 0, 1]
  where / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 1, 1, 1]
  why / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 0]
  with / tf-idf = 0.002343, this-doc-count = 10
    per-doc-count = [4, 1, 0, 1, 1, 10, 8, 8]
  without / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  workloads / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 0]
  wrappers / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  write / tf-idf = 0.001216, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 1, 0, 1]
  write-ahead / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  writing / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  xml / tf-idf = 0.002432, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 0, 0]
  years / tf-idf = 0.003648, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 0]
  you / tf-idf = 0.006080, this-doc-count = 5
    per-doc-count = [0, 6, 0, 0, 0, 5, 4, 19]
  your / tf-idf = 0.004123, this-doc-count = 5
    per-doc-count = [1, 8, 0, 0, 0, 5, 1, 16]

docs/redis.txt / 235:
  (bsd / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  *bsd / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  achieve / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 1, 0]
  also / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 8, 0, 0, 0, 0, 1, 6]
  and / tf-idf = 0.000000, this-doc-count = 10
    per-doc-count = [10, 17, 3, 4, 3, 20, 10, 36]
  ansi / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  appending / tf-idf = 0.017697, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 2, 0]
  are / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 3, 1, 12]
  asynchronous / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  atomic / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  auto-reconnection / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  automatic / tf-idf = 0.017697, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 2, 0]
  availability / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 0, 1, 0]
  best / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 2]
  bitmaps / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  broker / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  builds / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  built-in / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  but / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 4, 0, 0, 0, 0, 1, 3]
  cache / tf-idf = 0.008347, this-doc-count = 2
    per-doc-count = [1, 12, 0, 0, 0, 0, 2, 0]
  can / tf-idf = 0.008000, this-doc-count = 4
    per-doc-count = [0, 15, 1, 0, 0, 6, 4, 6]
  case / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 2]
  cluster / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [1, 1, 2, 0, 0, 0, 1, 0]
  command / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  computing / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  data / tf-idf = 0.002448, this-doc-count = 2
    per-doc-count = [5, 14, 3, 0, 0, 11, 2, 3]
  database / tf-idf = 0.002000, this-doc-count = 1
    per-doc-count = [3, 4, 0, 0, 2, 5, 1, 0]
  dataset / tf-idf = 0.008347, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 1, 2, 0]
  dependencies / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  depending / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  deploying / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  developed / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 2]
  difference; / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  different / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  disabled / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  disk / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 1, 1]
  dumping / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  each / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 1]
  effort / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  either / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  element / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  every / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 1]
  eviction / tf-idf = 0.017697, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 2, 0]
  external / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  failover / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  fast / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 3]
  feature-rich / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  features / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 8, 1, 1]
  first / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 1]
  for / tf-idf = 0.001136, this-doc-count = 2
    per-doc-count = [2, 10, 1, 0, 1, 6, 2, 14]
  from / tf-idf = 0.002000, this-doc-count = 1
    per-doc-count = [0, 2, 1, 0, 0, 1, 1, 5]
  geospatial / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  getting / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  has / tf-idf = 0.001224, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 1, 7, 1, 5]
  hash; / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  hashes / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  high / tf-idf = 0.001224, this-doc-count = 1
    per-doc-count = [1, 1, 0, 1, 1, 0, 1, 1]
  highest / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  hyperloglogs / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  in-memory / tf-idf = 0.012521, this-doc-count = 3
    per-doc-count = [2, 1, 0, 0, 0, 0, 3, 0]
  include / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  incrementing / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  indexes / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 3, 1, 0]
  intersection / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  introduction / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  its / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 1, 1, 1, 0]
  just / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  keys / tf-idf = 0.011798, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 2, 0]
  languages / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 2, 1, 1]
  levels / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  licensed) / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  like / tf-idf = 0.026546, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 3, 0]
  limited / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  linux / tf-idf = 0.026546, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 3, 0]
  list; / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  lists / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  log / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  lru / tf-idf = 0.017697, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 2, 0]
  lua / tf-idf = 0.017697, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 2, 0]
  master-slave / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  may / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  member / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  message / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 0]
  most / tf-idf = 0.008849, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 1, 1, 3, 0]
  need / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 3]
  net / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  networked / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  non-blocking / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  official / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  on-disk / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  once / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  open / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 1, 4, 1, 0]
  operating / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  operations / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  optionally / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  order / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  other / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 1, 1]
  out / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 2, 1, 0]
  outstanding / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  partial / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  partitioning / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  performance / tf-idf = 0.001224, this-doc-count = 1
    per-doc-count = [0, 3, 0, 1, 1, 1, 1, 4]
  persist / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  persistence / tf-idf = 0.017697, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 2, 0]
  posix / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  programming / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  provides / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 1, 1]
  pub / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  pushing / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  queries / tf-idf = 0.011798, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 1, 2, 0]
  radius / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  range / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  ranking / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  recommend / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  redis / tf-idf = 0.097336, this-doc-count = 11
    per-doc-count = [0, 0, 0, 0, 0, 0, 11, 0]
  replication / tf-idf = 0.005899, this-doc-count = 2
    per-doc-count = [0, 0, 0, 1, 0, 1, 2, 1]
  resynchronization / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  run / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 3]
  scripting / tf-idf = 0.017697, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 2, 0]
  sentinel / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  set / tf-idf = 0.011798, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 1, 2, 0]
  sets / tf-idf = 0.008347, this-doc-count = 2
    per-doc-count = [1, 0, 0, 0, 0, 1, 2, 0]
  smartos / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  solaris-derived / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  sorted / tf-idf = 0.017697, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 2, 0]
  source / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 1, 4, 1, 0]
  split / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  store / tf-idf = 0.002000, this-doc-count = 1
    per-doc-count = [1, 6, 1, 0, 0, 2, 1, 0]
  streams / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 1, 5, 0, 0, 1, 1, 0]
  string; / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  strings / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  structure / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 0, 1, 0]
  structures / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  sub / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  such / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 3, 1, 0]
  support / tf-idf = 0.004000, this-doc-count = 2
    per-doc-count = [1, 2, 0, 0, 0, 1, 2, 2]
  supports / tf-idf = 0.008347, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 2, 4]
  synchronization / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  systems / tf-idf = 0.008849, this-doc-count = 3
    per-doc-count = [0, 0, 1, 0, 0, 2, 3, 1]
  tested / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  the / tf-idf = 0.003409, this-doc-count = 6
    per-doc-count = [0, 13, 3, 2, 2, 21, 6, 50]
  there / tf-idf = 0.011798, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 2, 0]
  these / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 1, 7]
  time-to-live / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  transactions / tf-idf = 0.005899, this-doc-count = 2
    per-doc-count = [1, 5, 0, 0, 0, 2, 2, 0]
  trivial-to-setup / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  two / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 1, 1]
  types / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 3, 1, 0]
  union / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  use / tf-idf = 0.005899, this-doc-count = 2
    per-doc-count = [0, 9, 0, 0, 0, 1, 2, 6]
  used / tf-idf = 0.002000, this-doc-count = 1
    per-doc-count = [1, 2, 1, 0, 1, 0, 1, 0]
  using / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 1, 4]
  value / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 3, 1, 0, 0, 0, 1, 0]
  very / tf-idf = 0.004174, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 1]
  via / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  where / tf-idf = 0.002950, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 1, 1, 1]
  while / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  windows / tf-idf = 0.008849, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 0]
  with / tf-idf = 0.004546, this-doc-count = 8
    per-doc-count = [4, 1, 0, 1, 1, 10, 8, 8]
  without / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 0]
  work / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  works / tf-idf = 0.017697, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 2, 0]
  written / tf-idf = 0.005899, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 0]
  you / tf-idf = 0.011798, this-doc-count = 4
    per-doc-count = [0, 6, 0, 0, 0, 5, 4, 19]
  your / tf-idf = 0.002000, this-doc-count = 1
    per-doc-count = [1, 8, 0, 0, 0, 5, 1, 16]

docs/wildfly.txt / 1003:
  "undertow" / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  (and / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  (i.e / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  10-fold / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  100% / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  ability / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  about / tf-idf = 0.001382, this-doc-count = 2
    per-doc-count = [0, 5, 0, 0, 1, 1, 0, 2]
  access / tf-idf = 0.008293, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 4]
  across / tf-idf = 0.001956, this-doc-count = 2
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 2]
  added / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  addition / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 3, 0, 0, 0, 1, 0, 1]
  adds / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  administration / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  affected / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  aggressive / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  air / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  algorithm / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  all / tf-idf = 0.005867, this-doc-count = 6
    per-doc-count = [0, 3, 0, 0, 0, 2, 0, 6]
  allocation / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  allow / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  allowing / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  allows / tf-idf = 0.006911, this-doc-count = 5
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 5]
  along / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  also / tf-idf = 0.005867, this-doc-count = 6
    per-doc-count = [0, 8, 0, 0, 0, 0, 1, 6]
  amazing / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  and / tf-idf = 0.000000, this-doc-count = 36
    per-doc-count = [10, 17, 3, 4, 3, 20, 10, 36]
  another / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  any / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 2]
  api / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 2]
  apis / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [0, 2, 0, 1, 0, 1, 0, 1]
  application / tf-idf = 0.016586, this-doc-count = 12
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 12]
  applications / tf-idf = 0.004217, this-doc-count = 9
    per-doc-count = [0, 1, 3, 0, 2, 1, 0, 9]
  approach / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  architecture / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 1]
  are / tf-idf = 0.008293, this-doc-count = 12
    per-doc-count = [0, 2, 0, 0, 0, 3, 1, 12]
  aren't / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  arquillian / tf-idf = 0.010366, this-doc-count = 5
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 5]
  arranged / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  asynchrous / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  automation / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  available / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  backend / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  baked / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  base / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  based / tf-idf = 0.008293, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 4]
  become / tf-idf = 0.001956, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 1, 1, 0, 2]
  becoming / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  been / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 3, 0, 1]
  behavior / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  benchmarks / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  best / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 2]
  besting / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  between / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 1]
  block / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  blocking / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  blocks / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  boiler / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  boot / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  both / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 1, 0, 1]
  bottleneck / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  breaking / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  breathe / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  build / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 5, 0, 1]
  building / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 0, 2, 0, 0, 1, 0, 1]
  burden / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  business / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  but / tf-idf = 0.002934, this-doc-count = 3
    per-doc-count = [0, 4, 0, 0, 0, 0, 1, 3]
  cached / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  called / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 1]
  can / tf-idf = 0.002812, this-doc-count = 6
    per-doc-count = [0, 15, 1, 0, 0, 6, 4, 6]
  capabilities / tf-idf = 0.001874, this-doc-count = 4
    per-doc-count = [1, 0, 2, 1, 1, 0, 0, 4]
  case / tf-idf = 0.001956, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 2]
  causing / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  central / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  centralized / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  change / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  changes / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  chase / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  churn / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  class / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  classes / tf-idf = 0.006911, this-doc-count = 5
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 5]
  classloaders / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  classloading / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  classloading) / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  cli / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  client / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  clustering / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  clutter / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  code / tf-idf = 0.002934, this-doc-count = 3
    per-doc-count = [0, 3, 0, 0, 0, 1, 0, 3]
  collections / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  collector / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  combination / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  combined / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 2]
  come / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  comes / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  common / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  communicating / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  communication / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  competition / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  component / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  compose / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  comprehend / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  concurrent / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 1]
  concurrently / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  configuration / tf-idf = 0.018659, this-doc-count = 9
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 9]
  connections / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  connectivity / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  console / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  consume / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  control / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  controlled / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  controller / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  core / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [0, 2, 0, 3, 0, 2, 0, 1]
  could / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  critical / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  custom / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 2]
  customizable / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  customized / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  data / tf-idf = 0.000860, this-doc-count = 3
    per-doc-count = [5, 14, 3, 0, 0, 11, 2, 3]
  decide / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  defaults / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  defined / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 1]
  delegation / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  delete / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  demands / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  dependency / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  deployed / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  deployment / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  deployments / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  described / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  designed / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  developed / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 2]
  developer / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  development / tf-idf = 0.002934, this-doc-count = 3
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 3]
  devices / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  diet / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  directly / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  discovery / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  disk / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 1, 1]
  does / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 0, 2, 0, 0, 1, 0, 1]
  domain / tf-idf = 0.018659, this-doc-count = 9
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 9]
  dont / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  driven / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  duplicate / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  dynamic / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  each / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 1]
  easily / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 3, 0, 0, 0, 0, 0, 2]
  easy / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  easy-to-use / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  edits / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  ee8 / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  efficient / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  eliminate / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  enable / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  enables / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  encounters / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  enterprise / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 0, 1, 0, 0, 0, 0, 2]
  environment / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  even / tf-idf = 0.001956, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 2]
  ever / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  every / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 1, 1]
  evolved / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  example / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 1]
  exceptionally / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  execute / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  execution / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 1]
  experience / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  explicitly / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  exposed / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  failed / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  failure / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  fast / tf-idf = 0.004146, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 3]
  fault / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  features / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 8, 1, 1]
  fidelity / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  file / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  finally / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  find / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  finding / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  first / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 1]
  fit / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  flexibility / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  flexible / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  focus / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  footprint / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  for / tf-idf = 0.001864, this-doc-count = 14
    per-doc-count = [2, 10, 1, 0, 1, 6, 2, 14]
  forms / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  framework / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  frameworks / tf-idf = 0.004146, this-doc-count = 3
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 3]
  fresh / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  from / tf-idf = 0.002343, this-doc-count = 5
    per-doc-count = [0, 2, 1, 0, 0, 1, 1, 5]
  full / tf-idf = 0.001956, this-doc-count = 2
    per-doc-count = [1, 0, 0, 0, 0, 1, 0, 2]
  full-duplex / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  garbage / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  gateway / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  gives / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  goodbye / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  goose / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  greater / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  ground / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  handling / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  happen / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  has / tf-idf = 0.001434, this-doc-count = 5
    per-doc-count = [0, 1, 1, 0, 1, 7, 1, 5]
  have / tf-idf = 0.007823, this-doc-count = 8
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 8]
  headroom / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  heap / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  heavily / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  hell / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  helps / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  hiding / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  hierarchical / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  high / tf-idf = 0.000287, this-doc-count = 1
    per-doc-count = [1, 1, 0, 1, 1, 0, 1, 1]
  higher / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  highly / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 3, 0, 1]
  host / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  hosts / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  however / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  http / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 1]
  ice / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  impact / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  implementation / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  implements / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  importance / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  improves / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  in / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  include / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  including / tf-idf = 0.000469, this-doc-count = 1
    per-doc-count = [0, 1, 0, 1, 1, 2, 0, 1]
  indexed / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  individual / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  infrastructure / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  inside / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  installed / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  instantly / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  integral / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  integration / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  intelligent / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  internal / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  into / tf-idf = 0.005529, this-doc-count = 4
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 4]
  isolated / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  isolation / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  it's / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  jar / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  jars / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  java / tf-idf = 0.008293, this-doc-count = 6
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 6]
  javascript / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  jax-rs / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  jboss / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  jetty / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  jmx / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  json / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 2, 0, 1]
  json-p / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  just / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  jvm / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  kept / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  languages / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 2, 1, 1]
  large / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  lastly / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  latest / tf-idf = 0.008293, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 4]
  leaves / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  level / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  libraries / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 1]
  lightweight / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  limited / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  linking / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  loaded / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  loading / tf-idf = 0.008293, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 4]
  longer / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  magic / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  managed / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  management / tf-idf = 0.006911, this-doc-count = 5
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 5]
  manner / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  many / tf-idf = 0.000469, this-doc-count = 1
    per-doc-count = [1, 4, 0, 0, 1, 9, 0, 1]
  maximize / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  means / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  meet / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  memory / tf-idf = 0.004889, this-doc-count = 5
    per-doc-count = [1, 1, 0, 0, 0, 0, 0, 5]
  metadata / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  metrics / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  migrate / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  migrating / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  million / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  mind / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  minimal / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  minimize / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  mobile / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 1]
  mock / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  mode / tf-idf = 0.010366, this-doc-count = 5
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 5]
  model / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  modern / tf-idf = 0.004146, this-doc-count = 3
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 3]
  modes / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  modular / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  modularity / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  module / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  modules / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  money / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  more / tf-idf = 0.001147, this-doc-count = 4
    per-doc-count = [1, 5, 1, 0, 1, 4, 0, 4]
  multi-core / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  multi-jvm / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  native / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  nearly / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  need / tf-idf = 0.002934, this-doc-count = 3
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 3]
  needed / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  needs / tf-idf = 0.010366, this-doc-count = 5
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 5]
  new / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 1]
  non-blocking / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  non-critical / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  not / tf-idf = 0.004146, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 2, 0, 3]
  number / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 2]
  o(1) / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  object / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  offers / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  often / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 1]
  only / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  optimizations / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  optimize / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  optimized / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  option / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  options / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  organized / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  oriented / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  other / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 1, 1, 1]
  over / tf-idf = 0.003912, this-doc-count = 4
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 4]
  overall / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  overhead / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  packaged / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  parent / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  parses / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  part / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 1, 0, 1]
  participating / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  particular / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  party / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  path / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  pauses / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  peers / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  performance / tf-idf = 0.001147, this-doc-count = 4
    per-doc-count = [0, 3, 0, 1, 1, 1, 1, 4]
  perhaps / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  plate / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  pluggable / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  point / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  pollute / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  power / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 1, 0, 0, 1]
  powerful / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 2, 0, 1]
  prevents / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  previous / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  previously / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  problematic / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  process / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [1, 0, 1, 0, 0, 0, 0, 1]
  processes / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  processors / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  productity / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  products / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  proprietary / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  protocols / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  provide / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  provides / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 1, 1]
  providing / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  proxying / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  purely / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  query / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  quick / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  quirky / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  rather / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  real / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  reduce / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  reduces / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  reduction / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  remain / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  remove / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  removed / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  removing / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  replication / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 1, 2, 1]
  required / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  requires / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  resolution / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  resources / tf-idf = 0.004146, this-doc-count = 3
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 3]
  responsiveness / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  rest / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  result / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  rich / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  right / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  rules / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  run / tf-idf = 0.002934, this-doc-count = 3
    per-doc-count = [0, 0, 1, 0, 0, 0, 1, 3]
  running / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  runtime / tf-idf = 0.010366, this-doc-count = 5
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 5]
  same / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  sane / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  say / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  scalability / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 2]
  scale / tf-idf = 0.001956, this-doc-count = 2
    per-doc-count = [2, 0, 0, 0, 0, 1, 0, 2]
  secret / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  sending / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [1, 0, 0, 0, 0, 0, 0, 1]
  sensible / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  separate / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  server / tf-idf = 0.011057, this-doc-count = 8
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 8]
  servers / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 0, 1]
  services / tf-idf = 0.008293, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 4]
  servlet / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  session / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  setting / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  settings / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  shown / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  side / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  simple / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 2]
  simply / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  single / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  slimable / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  small / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  smarter / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  socket / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  solutions / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 2]
  some / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  specified / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  speed / tf-idf = 0.004146, this-doc-count = 3
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 3]
  standalone / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  standards / tf-idf = 0.009675, this-doc-count = 7
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 7]
  start / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  started / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  starts / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  startup / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  stateless / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  still / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  stock / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  subsystem / tf-idf = 0.008293, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 4]
  subsystems / tf-idf = 0.006220, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 3]
  support / tf-idf = 0.000937, this-doc-count = 2
    per-doc-count = [1, 2, 0, 0, 0, 1, 2, 2]
  supports / tf-idf = 0.003912, this-doc-count = 4
    per-doc-count = [0, 2, 0, 0, 0, 0, 2, 4]
  synchronizes / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  system / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 2, 0, 1]
  systems / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [0, 0, 1, 0, 0, 2, 3, 1]
  takes / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  tap / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  team / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  technical / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  technology / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 1, 0, 0, 0, 1]
  test / tf-idf = 0.008293, this-doc-count = 4
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 4]
  testability / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  testable / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  tested / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  tests / tf-idf = 0.012439, this-doc-count = 6
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 6]
  than / tf-idf = 0.004146, this-doc-count = 3
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 3]
  thanks / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  that / tf-idf = 0.008984, this-doc-count = 13
    per-doc-count = [0, 6, 4, 0, 0, 8, 0, 13]
  the / tf-idf = 0.006657, this-doc-count = 50
    per-doc-count = [0, 13, 3, 2, 2, 21, 6, 50]
  their / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  these / tf-idf = 0.004838, this-doc-count = 7
    per-doc-count = [0, 1, 1, 0, 0, 0, 1, 7]
  third / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  this / tf-idf = 0.004889, this-doc-count = 5
    per-doc-count = [0, 8, 0, 0, 0, 2, 0, 5]
  those / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  throughput / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  time / tf-idf = 0.001956, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 2, 0, 2]
  together / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  tolerance / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  tomcat / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  tools / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  traditional / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 2]
  true / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  two / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [0, 1, 1, 0, 0, 0, 1, 1]
  ultimate / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  undertow / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  unified / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  unit / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  unless / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  unlike / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  unnecessary / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  unparalleled / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  until / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  use / tf-idf = 0.004146, this-doc-count = 6
    per-doc-count = [0, 9, 0, 0, 0, 1, 2, 6]
  useful / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  user-focused / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  uses / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  using / tf-idf = 0.002764, this-doc-count = 4
    per-doc-count = [0, 0, 0, 1, 0, 1, 1, 4]
  utmost / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  various / tf-idf = 0.001956, this-doc-count = 2
    per-doc-count = [0, 1, 0, 0, 0, 1, 0, 2]
  vendor / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  versions / tf-idf = 0.004146, this-doc-count = 2
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 2]
  versions) / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  very / tf-idf = 0.000978, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 1, 1]
  visibility / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  waits / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  want / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 1, 0, 1]
  was / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  web / tf-idf = 0.010757, this-doc-count = 11
    per-doc-count = [0, 0, 0, 1, 2, 0, 0, 11]
  well-organized / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  were / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  what / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [0, 3, 1, 0, 0, 1, 0, 1]
  when / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  where / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [0, 2, 0, 0, 0, 1, 1, 1]
  which / tf-idf = 0.006911, this-doc-count = 5
    per-doc-count = [0, 3, 0, 0, 0, 0, 0, 5]
  while / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  wild / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  wildfly / tf-idf = 0.039391, this-doc-count = 19
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 19]
  wildfly's / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  will / tf-idf = 0.002764, this-doc-count = 2
    per-doc-count = [0, 2, 0, 0, 0, 0, 0, 2]
  wiring / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  with / tf-idf = 0.001065, this-doc-count = 8
    per-doc-count = [4, 1, 0, 1, 1, 10, 8, 8]
  within / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  work / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 1, 1]
  worry / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]
  write / tf-idf = 0.000691, this-doc-count = 1
    per-doc-count = [1, 1, 0, 0, 0, 1, 0, 1]
  yet / tf-idf = 0.001382, this-doc-count = 1
    per-doc-count = [0, 1, 0, 0, 0, 0, 0, 1]
  you / tf-idf = 0.013130, this-doc-count = 19
    per-doc-count = [0, 6, 0, 0, 0, 5, 4, 19]
  your / tf-idf = 0.007498, this-doc-count = 16
    per-doc-count = [1, 8, 0, 0, 0, 5, 1, 16]
  zero / tf-idf = 0.002073, this-doc-count = 1
    per-doc-count = [0, 0, 0, 0, 0, 0, 0, 1]

A word that appears in only one document, and appears many times within it, gets a high importance score:

  infinispan / tf-idf = 0.078916, this-doc-count = 20
    per-doc-count = [0, 20, 0, 0, 0, 0, 0, 0]

A word that appears across many of the documents gets a low importance score:

  applications / tf-idf = 0.000892, this-doc-count = 1
    per-doc-count = [0, 1, 3, 0, 2, 1, 0, 9]
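To see how the document frequency pulls the score down, here is a minimal sketch that takes the listed score of a word that occurs once in the last document and in no other (0.002073, for example "zero") and rescales it for every possible document frequency. It assumes eight documents, the length of each per-doc-count array. The scaling factor is log(8/df) / log(8), which does not depend on the base of the logarithm. The class and method names are hypothetical and not part of the implementation in this post.

```java
// Hypothetical helper, not the article's code: rescales a listed score to another document frequency.
class IdfScaling {
    // Ratio of the IDF at docFreq to the IDF at docFreq = 1; the logarithm base cancels out.
    static double idfRatio(int totalDocs, int docFreq) {
        return Math.log((double) totalDocs / docFreq) / Math.log(totalDocs);
    }

    // Score the same word would get, with the same count in the same document,
    // if it occurred in docFreq documents instead of only one.
    static double rescale(double uniqueWordScore, int totalDocs, int docFreq) {
        return uniqueWordScore * idfRatio(totalDocs, docFreq);
    }

    public static void main(String[] args) {
        int totalDocs = 8;                 // assumption: length of each per-doc-count array
        double uniqueWordScore = 0.002073; // listed score for this-doc-count = 1, one document only
        for (int docFreq = 1; docFreq <= totalDocs; docFreq++) {
            System.out.printf("df = %d -> %.6f%n",
                    docFreq, rescale(uniqueWordScore, totalDocs, docFreq));
        }
    }
}
```

For df = 2 and df = 4 this gives 0.001382 and 0.000691, which match the listed scores of "framework" (two documents) and "other" (four documents).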

A word like "and" appears so often, in every one of the eight documents, that its score is 0.

  and / tf-idf = 0.000000, this-doc-count = 3
    per-doc-count = [10, 17, 3, 4, 3, 20, 10, 36]

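To be precise, the IDF divides the total number of documents by the number of documents containing the word and then takes the logarithm of the result. A word that appears in every document therefore has an IDF of log 1 = 0 (in any base), and its TF-IDF is 0 no matter how often it occurs.
Some weighting schemes apparently adjust for this.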
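As a rough illustration, the sketch below compares the plain IDF with one possible adjustment that adds 1 to it, so that a word found in every document keeps a small positive weight. The "+ 1" variant is an assumption chosen for illustration. It is not what the implementation in this post does, and the class and method names are hypothetical.

```java
// Sketch only: the "+ 1" variant is one possible adjustment, not the article's implementation.
class IdfVariants {
    // Plain IDF: log(N / df). It is 0 when df == N, whatever the logarithm base.
    static double plainIdf(int totalDocs, int docFreq) {
        return Math.log10((double) totalDocs / docFreq);
    }

    // Adjusted IDF: adding 1 keeps a positive weight for words that occur in every document.
    static double plusOneIdf(int totalDocs, int docFreq) {
        return plainIdf(totalDocs, docFreq) + 1.0;
    }

    public static void main(String[] args) {
        int totalDocs = 8;
        // Document frequencies of "infinispan", "applications" and "and" in the listings.
        int[] docFreqs = {1, 5, 8};
        for (int docFreq : docFreqs) {
            System.out.printf("df = %d: plain = %.6f, plus-one = %.6f%n",
                    docFreq, plainIdf(totalDocs, docFreq), plusOneIdf(totalDocs, docFreq));
        }
    }
}
```

The adjusted value still decreases as the document frequency grows, so rare words keep ranking above common ones. The only change is that ubiquitous words no longer collapse to exactly 0.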

For now, the results look fine.

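Summary

I had come across the name TF-IDF fairly often, but implementing it myself this time gave me a much better feel for what it means.

It is worth checking things like this for yourself.

Bonus

Finally, here are the documents used this time.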

docs/ignite.txt

Ignite™ is a memory-centric distributed database, caching, and processing platform for
transactional, analytical, and streaming workloads delivering in-memory speeds at petabyte scale

MAIN BENEFITS
Ignite is Used by
ING, Sberbank, HomeAway,
Wellington, American Airlines, Yahoo! Japan, 24 Hour Fitness, JacTravel, and many more

Keep Your Database
Accelerate existing Relational and NoSQL databases with Ignite™ in-memory data grid and caching capabilities

SQL at NoSQL Scale
Achieve horizontal scalability, strong consistency, and high availability with Ignite™ distributed SQL


MAIN FEATURES
Memory-Centric Storage
Store and process distributed data
in memory and on disk

Distributed SQL
Distributed memory-centric SQL database with support for joins

Distributed Key-Value
Read, write, transact with fastest key-value data grid and cache



ACID Transactions
Enforce full ACID compliance across distributed data sets

Collocated Processing
Avoid data noise by sending computations to cluster nodes

Machine Learning
Train and deploy distributed machine learning models

docs/infinispan.txt

Introduction to Infinispan
Distributed in-memory key/value data grid and cache

What is it?
Infinispan is an extremely scalable, highly available key/value data store and data grid platform. It is 100% open source, and written in Java. The purpose of Infinispan is to expose a data structure that is distributed, highly concurrent and designed ground-up to make the most of modern multi-processor and multi-core architectures. It is often used as a distributed cache, but also as a NoSQL key/value store or object database.
Why would I use it?
Most people use Infinispan for one of two reasons. Firstly, as a cache. Putting Infinispan in front of your database, disk-based NoSQL store or any part of your system that is a bottleneck can greatly help improve performance. Often, however, a simple cache isn't enough - for example if your application is clustered and cache coherency is important to data consistency. A distributed cache can greatly help here.

Infinispan can also be used as a high-performance NoSQL data store. In addition to being in memory, Infinispan can also persist data to a more permanent store. We call this a cache store. Cache stores are pluggable, you can easily write your own, and many already exist for you to use. Learn more about cache stores - and existing implementations you can use today - on the cache stores section of this website.

Yet another common use case is adding clusterability and high availability to frameworks. Since Infinispan exposes a distributed data structure, frameworks and libraries that also need to be clustered can easily achieve this by embedding Infinispan and delegating all state management to Infinispan. This way, any framework can easily be clustered by letting Infinispan do all the heavy lifting.

Where can I learn more?
Visit the Documentation section of this website. Lots of resources - including tutorials, quick start guides, sample code and demos - will help get you on your feet in no time.


How do I use it?
At its core Infinispan exposes a Cache interface which extends java.util.Map It is also optionally is backed by a peer-to-peer network architecture to distribute data efficiently across a cluster of servers.

In addition to its core Java API, Infinispan can also be consumed by non-JVM platforms by making use of the Hot Rod protocol, for which client libraries for various platforms exist.

What about transactions?
A heated topic among many NoSQL engines. Yes, Infinispan is fully transactional. Infinispan supports both JTA as well as XA standards, and can participate in distributed transactions brokered by a valid JTA transaction manager.

Most distributed data stores find that transactions hurt performance. This is true in some cases, but we feel that in many other cases, transactions are necessary for many business applications. As such, we support transactions but this is optional and can be disabled for greater performance.

Can I use it with Hibernate?
Hibernate exposes a hook for second-level caching when retrieving data from a relational database. Infinispan has a Hibernate second-level cache plug-in to speed up your data lookups from a database. Learn more about it here.

Can I perform searches?
Yes. Infinispan's primary form of data retrieval is a key lookup ( Cache.get(key) ), but we also support powerful indexing and searching over your dataset. Learn more about this here.

What about Map/Reduce?
Even better ! Infinispan allows you to use Java's very powerful Stream API which allows code execution, local to the data, in all of our modes, both local and clustered. In addition to streams, Infinispan also supports distributed code execution where you can move your processing into the grid.

Standards
JSR 107
JSR 107 (JCACHE: Temporary Caching for Java), is a standard that the Infinispan development team actively participates in and is a part of the expert group. Infinispan will implement the JSR 107 APIs once these have been finalized. For a preview of the JSR 107 APIs, visit the JSR 107 on GitHub or explore Infinispan's JSR 107 module.

docs/kafka.txt

Apache Kafka is a distributed streaming platform. What exactly does that mean?
A streaming platform has three key capabilities:

Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
Store streams of records in a fault-tolerant durable way.
Process streams of records as they occur.
Kafka is generally used for two broad classes of applications:

Building real-time streaming data pipelines that reliably get data between systems or applications
Building real-time streaming applications that transform or react to the streams of data
To understand how Kafka does these things, let's dive in and explore Kafka's capabilities from the bottom up.

First a few concepts:

Kafka is run as a cluster on one or more servers that can span multiple datacenters.
The Kafka cluster stores streams of records in categories called topics.
Each record consists of a key, a value, and a timestamp.

docs/lucene.txt

Welcome to Apache Lucene
The Apache Lucene project develops open-source search software, including:

Lucene Core, our flagship sub-project, provides Java-based indexing and search technology, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities.
SolrTM is a high performance search server built using Lucene Core, with XML/HTTP and JSON/Python/Ruby APIs, hit highlighting, faceted search, caching, replication, and a web admin interface.
PyLucene is a Python wrapper around the Core project.

docs/mysql.txt

About MySQL
MySQL is the world's most popular open source database. With its proven performance, reliability and ease-of-use, MySQL has become the leading database choice for web-based applications, used by high profile web properties including Facebook, Twitter, YouTube, Yahoo! and many more.

Oracle drives MySQL innovation, delivering new capabilities to power next generation web, cloud, mobile and embedded applications.

docs/postgresql.txt

About PostgreSQL
What is PostgreSQL?
PostgreSQL is a powerful, open source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads. The origins of PostgreSQL date back to 1986 as part of the POSTGRES project at the University of California at Berkeley and has more than 30 years of active development on the core platform.

PostgreSQL has earned a strong reputation for its proven architecture, reliability, data integrity, robust feature set, extensibility, and the dedication of the open source community behind the software to consistently deliver performant and innovative solutions. PostgreSQL runs on all major operating systems, has been ACID-compliant since 2001, and has powerful add-ons such as the popular PostGIS geospatial database extender. It is no surprise that PostgreSQL has become the open source relational database of choice for many people and organisations.

Getting started with using PostgreSQL has never been easier - pick a project you want to build, and let PostgreSQL safely and robustly store your data.

Why use PostgreSQL?
PostgreSQL comes with many features aimed to help developers build applications, administrators to protect data integrity and build fault-tolerant environments, and help you manage your data no matter how big or small the dataset. In addition to being free and open source, PostgreSQL is highly extensible. For example, you can define your own data types, build out custom functions, even write code from different programming languages without recompiling your database!

PostgreSQL tries to conform with the SQL standard where such conformance does not contradict traditional features or could lead to poor architectural decisions. Many of the features required by the SQL standard are supported, though sometimes with slightly differing syntax or function. Further moves towards conformance can be expected over time. As of the version 12 release in October 2019, PostgreSQL conforms to at least 160 of the 179 mandatory features for SQL:2016 Core conformance. As of this writing, no relational database meets full conformance with this standard.

Below is an inexhaustive list of various features found in PostgreSQL, with more being added in every major release:

Data Types
Primitives: Integer, Numeric, String, Boolean
Structured: Date/Time, Array, Range, UUID
Document: JSON/JSONB, XML, Key-value (Hstore)
Geometry: Point, Line, Circle, Polygon
Customizations: Composite, Custom Types
Data Integrity
UNIQUE, NOT NULL
Primary Keys
Foreign Keys
Exclusion Constraints
Explicit Locks, Advisory Locks
Concurrency, Performance
Indexing: B-tree, Multicolumn, Expressions, Partial
Advanced Indexing: GiST, SP-Gist, KNN Gist, GIN, BRIN, Covering indexes, Bloom filters
Sophisticated query planner / optimizer, index-only scans, multicolumn statistics
Transactions, Nested Transactions (via savepoints)
Multi-Version concurrency Control (MVCC)
Parallelization of read queries and building B-tree indexes
Table partitioning
All transaction isolation levels defined in the SQL standard, including Serializable
Just-in-time (JIT) compilation of expressions
Reliability, Disaster Recovery
Write-ahead Logging (WAL)
Replication: Asynchronous, Synchronous, Logical
Point-in-time-recovery (PITR), active standbys
Tablespaces
Security
Authentication: GSSAPI, SSPI, LDAP, SCRAM-SHA-256, Certificate, and more
Robust access-control system
Column and row-level security
Multi-factor authentication with certificates and an additional method
Extensibility
Stored functions and procedures
Procedural Languages: PL/PGSQL, Perl, Python (and many more)
SQL/JSON path expressions
Foreign data wrappers: connect to other databases or streams with a standard SQL interface
Customizable storage interface for tables
Many extensions that provide additional functionality, including PostGIS
Internationalisation, Text Search
Support for international character sets, e.g. through ICU collations
Case-insensitive and accent-insensitive collations
Full-text search
There are many more features that you can discover in the PostgreSQL documentation. Additionally, PostgreSQL is highly extensible: many features, such as indexes, have defined APIs so that you can build out with PostgreSQL to solve your challenges.

PostgreSQL has been proven to be highly scalable both in the sheer quantity of data it can manage and in the number of concurrent users it can accommodate. There are active PostgreSQL clusters in production environments that manage many terabytes of data, and specialized systems that manage petabytes.

docs/redis.txt

Introduction to Redis
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.

You can run atomic operations on these types, like appending to a string; incrementing the value in a hash; pushing an element to a list; computing set intersection, union and difference; or getting the member with highest ranking in a sorted set.

In order to achieve its outstanding performance, Redis works with an in-memory dataset. Depending on your use case, you can persist it either by dumping the dataset to disk every once in a while, or by appending each command to a log. Persistence can be optionally disabled, if you just need a feature-rich, networked, in-memory cache.

Redis also supports trivial-to-setup master-slave asynchronous replication, with very fast non-blocking first synchronization, auto-reconnection with partial resynchronization on net split.

Other features include:

Transactions
Pub/Sub
Lua scripting
Keys with a limited time-to-live
LRU eviction of keys
Automatic failover
You can use Redis from most programming languages out there.

Redis is written in ANSI C and works in most POSIX systems like Linux, *BSD, OS X without external dependencies. Linux and OS X are the two operating systems where Redis is developed and tested the most, and we recommend using Linux for deploying. Redis may work in Solaris-derived systems like SmartOS, but the support is best effort. There is no official support for Windows builds.

docs/wildfly.txt

What is WildFly?
WildFly is a flexible, lightweight, managed application runtime that helps you build amazing applications.

Unparalleled Speed
Fast Startup
Experience ground breaking startup speed!

In the highly optimized boot process of WildFly 8, services are started concurrently to eliminate unnecessary waits and to tap into the power of multi-core processors. Non-critical services are kept on ice until first use.

As a result, WildFly offers a 10-fold reduction in startup time over previous versions and even gives Jetty and Tomcat a run for their money.

Ultimate Web Performance & Scalability
Connectivity, responsiveness, and the ability to scale are of the utmost importance for modern web applications. To meet these demands we have developed a new flexible high performance web server, called Undertow , and it's an integral part of WildFly 8. Undertow has the ability to scale to over a million connections, and third party benchmarks have shown it besting the competition when it comes to throughput.

Exceptionally Lightweight
Memory Diet
WildFly takes an aggressive approach to memory management. The base runtime services were developed to minimize heap allocation. These services use common cached indexed metadata over duplicate full parses, which reduces heap and object churn. The use of modular class loading prevents duplicate classes and loading more than the system configuration requires. This not only reduces the base memory overhead but also helps to minimize garbage collector pauses. Finally the administration console is 100% stateless and purely client driven. It starts instantly and requires zero memory on the server.

These optimizations combined enable WildFly to run with stock JVM settings and also on small devices. It also leaves more headroom for application data and supports higher scalability.

Slimable / Customizable Runtime
WildFly's architecture is based on pluggable subsystems that can be added or removed as needed. This allows you to remove capabilities you dont need, and also reduce the overall disk footprint and memory overhead required by the server.

This is all controlled by configuration which is arranged into subsystem blocks. To remove a subsystem you simply need to delete that simple block of configuration. For example, if you decide you only want Servlet support, you can delete every subsystem but the "undertow" subsystem.

Powerful Administration
Unified configuration & Management
Rather than sending you on a wild goose chase to change a setting in the application server, configuration in WildFly is centralized, simple and user-focused. The configuration file is organized by subsystems that you can easily comprehend and no internal server wiring is exposed. Subsystems use intelligent defaults, but can still be customized to best fit your needs. If you are running in domain mode, the configuration for all servers participating in the domain is specified in a well-organized manner within the same file.

Configuration changes aren't limited to file edits. All management capabilities are exposed in a unified manner across many forms of access. These include a CLI, a web based administration console, a native Java API, an HTTP/JSON based REST API, and a JMX gateway. These options allow for custom automation using the tools and languages that best fit your needs.

Domain & Standalone Management
WildFly offers two modes: a traditional, single JVM, standalone mode, and a multi-JVM option, domain mode, which synchronizes configuration across any number of processes and hosts.

All of the management capabilities previously described are available in both modes. However domain mode adds a central control point, the domain controller, for all of your systems.

Unlike solutions from other products, domain mode was designed to consume minimal resources, and in the case of failure to not impact running applications. Also, if you have a large domain, you can directly query each individual host for runtime metrics, which prevents the domain controller from becoming a bottleneck.

Supports Latest Standards and Technology
Java EE 8
WildFly implements the latest in enterprise Java standards. Java EE8 improves developer productity by providing rich enterprise capabilities in easy to consume frameworks that eliminate boiler plate and reduce technical burden. This allows your team to focus on the core business needs of your application.

In addition the frameworks that compose Java EE are heavily tested in combination. Using these standards means you no longer have to worry about finding the magic combination of various frameworks (and versions) that happen to work together.

Finally, building your application on standards allows you the flexibility to migrate between various vendor solutions. If you have applications using these standards on another, perhaps proprietary server, migrating to WildFly could breathe some fresh air into those applications.

Modern Web
WildFly supports the latest standards for web development. Web Socket support allows your applications the ability to use optimized custom protocols and full-duplex communication with your backend infrastructure. This is particular useful in communicating with mobile devices.

As web applications have evolved to become more client oriented with rich dynamic JavaScript, data access over the web has become critical. WildFly supports the latest standards for REST based data access, including JAX-RS 2, and JSON-P.

The connectivity and responsiveness needs of modern web applications are greater than ever. While WildFly will optimize traditional blocking I/O applications, it also provides asynchrous and non-blocking APIs that allow you to maximize the performance of critical resources in your application.

Lastly, fault tolerance, clustering, session replication, and efficient web proxying are all baked into WildFly as base level features.

Modular Java
No more jar hell!
Hierarchical classloaders are problematic, often causing failed deployments and quirky behavior. The time has come to say goodbye to the parent delegation model and find the path to modularity (i.e. sane classloading).

WildFly does classloading right. It uses JBoss Modules to provide true application isolation, hiding server implementation classes from the application and only linking with JARs your application needs. Modules, packaged as collections of classes, are peers that remain isolated unless explicitly defined as a dependency of another module. Visibility rules have sensible defaults, yet can be customized.

Fast Linking & Concurrent Loading
The dependency resolution algorithm in JBoss Modules is O(1), which means that classloading performance is not affected by the number of versions of libraries you have installed.

Classes are loaded concurrently, allowing for quick class discovery and loading, even on large deployments.

Easily Testable
Arquillian
From the very start, WildFly has been designed with testability in mind. The secret to that fidelity is Arquillian , a component model for integration tests that execute inside the real runtime environment.

By removing the clutter from your tests and handling deployment and test execution, Arquillian enables you to write tests for just about any use case your application encounters: real tests.

Thanks to the speed of WildFly, Arquillian tests run nearly as fast as unit tests.

Smarter Development
The quick boot of WildFly combined with the easy-to-use Arquillian framework allows for test driven development using the real environment your code will be running in. You no longer need to pollute your application object model with mock classes and test code. Your test code is separate and simply deployed along side your application where it has full access to server resources.

Marshalling was refactored in Infinispan 10.0 (Embedded Mode)

What is this post for?

Infinispan 10.0.0.Final has been released.
* 10.0.1.Final followed shortly afterwards, though...

Blog: Infinispan 10.0.0.Final - Infinispan

Blog: Infinispan 10.0.1.Final - Infinispan

Infinispan seems to have changed a lot in 10.0, so I plan to look at the parts I am curious about a little at a time.

This time: marshalling.

Marshalling

In Infinispan 10.0, marshalling appears to have been heavily refactored.

Blog: Infinispan 10.0.0.Final - Infinispan

Marshalling
The internal marshalling capabilities of Infinispan have undergone a significant refactoring in 10.0. The marshalling of internal Infinispan objects and user objects are now truly isolated. This means that it’s now possible to configure Marshaller implementations in embedded mode or on the server, without having to handle the marshalling of Infinispan internal classes. Consequently, it’s possible to easily change the marshaller implementation user for user types, in a similar manner to how users of the HotRod client are accustomed.

As a consequence of the above changes, the default marshaller used for marshalling user types is no longer based upon JBoss Marshalling. Instead we now utilise the ProtoStream library to store user types in the language agnostic Protocol Buffers format. The ProtoStream library provides several advantages over jboss-marshalling, most notably it does not make use of reflection and so is more suitable for use in AOT environments such as Quarkus.

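Roughly, it comes down to the following.

  • Marshalling of Infinispan's internal classes is now separated from marshalling of user objects
  • The same Marshaller can now be used in Embedded Mode and in Client/Server Mode
    • Until now, Embedded Mode used Externalizer and AdvancedExternalizer
  • The default marshalling library changed (JBoss Marshalling → ProtoStream (Protocol Buffers))

The move to Protocol Buffers seems intended to avoid reflection, which makes it easier to work with in environments that use native images, such as Quarkus.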
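To make "marshalling without reflection" a little more concrete, here is a minimal sketch that uses only `java.io`. The `ExplicitMarshallingSketch` and `Item` names and the byte layout are hypothetical, made up for illustration; this is not ProtoStream's API or its wire format. The point is only that each field is written and read by explicit code, so nothing has to be discovered reflectively at runtime.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only: not ProtoStream's API or wire format.
public class ExplicitMarshallingSketch {
    // A user type that is deliberately not java.io.Serializable.
    public static final class Item {
        public final String id;
        public final int price;

        public Item(String id, int price) {
            this.id = id;
            this.price = price;
        }
    }

    // Every field is written explicitly, so no reflection is involved.
    public static byte[] write(Item item) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(item.id);
            out.writeInt(item.price);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Fields are read back in the same order in which they were written.
    public static Item read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String id = in.readUTF();
            int price = in.readInt();
            return new Item(id, price);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Item copy = read(write(new Item("978-1783988181", 6172)));
        System.out.println(copy.id + " " + copy.price); // prints: 978-1783988181 6172
    }
}
```

The trade-off of this style is that the read and write code has to be kept in sync with the class by hand or by a code generator, which is presumably the kind of work a library such as ProtoStream takes over.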

In this post I check this in Embedded Mode, by putting a user-defined class into a Cache and getting it back.

Environment

The environment this time is as follows.

$ java -version
openjdk version "11.0.4" 2019-07-16
OpenJDK Runtime Environment (build 11.0.4+11-post-Ubuntu-1ubuntu218.04.3)
OpenJDK 64-Bit Server VM (build 11.0.4+11-post-Ubuntu-1ubuntu218.04.3, mixed mode, sharing)


$ mvn -version
Apache Maven 3.6.2 (40f52333136460af0dc0d7232c0dc0bcf0d9e117; 2019-08-28T00:06:16+09:00)
Maven home: $HOME/.sdkman/candidates/maven/current
Java version: 11.0.4, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
Default locale: ja_JP, platform encoding: UTF-8
OS name: "linux", version: "4.15.0-66-generic", arch: "amd64", family: "unix"

The Infinispan version used is 10.0.1.Final.

Setup

Maven dependencies.

        <dependency>
            <groupId>org.infinispan</groupId>
            <artifactId>infinispan-core</artifactId>
            <version>10.0.1.Final</version>
        </dependency>

Actually, this alone turns out to be insufficient for what I am doing here.

I also want to verify things with test code, so I add JUnit 5 and AssertJ as well.

        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-api</artifactId>
            <version>5.5.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-engine</artifactId>
            <version>5.5.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.assertj</groupId>
            <artifactId>assertj-core</artifactId>
            <version>3.14.0</version>
            <scope>test</scope>
        </dependency>

   <!-- omitted -->

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.22.2</version>
            </plugin>
        </plugins>
    </build>

Test code skeleton

Now, let's create a skeleton for the test code as follows.
src/test/java/org/littlewings/infinispan/marshalling/MarshallingTest.java

package org.littlewings.infinispan.marshalling;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.manager.EmbeddedCacheManager;
import org.junit.jupiter.api.Test;

import static org.assertj.core.api.Assertions.assertThat;

public class MarshallingTest {
    public <K, V> void withCache(String cacheName, int numInstances, Consumer<Cache<K, V>> func) {
        List<EmbeddedCacheManager> cacheManagers =
                IntStream
                        .rangeClosed(1, numInstances)
                        .mapToObj(i -> {
                            try {
                                return new DefaultCacheManager("infinispan.xml");
                            } catch (IOException e) {
                                throw new UncheckedIOException(e);
                            }
                        })
                        .collect(Collectors.toList());

        cacheManagers.forEach(m -> m.getCache(cacheName));

        try {
            Cache<K, V> cache = cacheManagers.get(0).getCache(cacheName);
            func.accept(cache);
            cache.stop();
        } finally {
            cacheManagers.forEach(EmbeddedCacheManager::stop);
        }
    }

    // Tests go here!
}

This lets me create any number of CacheManagers and form a cluster.

For now, the Infinispan configuration file looks like this.
src/test/resources/infinispan.xml

<?xml version="1.0" encoding="UTF-8"?>
<infinispan
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:infinispan:config:10.0 http://www.infinispan.org/schemas/infinispan-config-10.0.xsd"
        xmlns="urn:infinispan:config:10.0">

    <cache-container shutdown-hook="REGISTER">
        <transport stack="udp"/>

        <local-cache name="localCache"/>

        <distributed-cache name="distributedCache"/>
    </cache-container>

</infinispan>

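It enables clustering with the default UDP stack and defines one Local Cache and one Distributed Cache.

Checking with String

To start, rather than something as demanding as a user-defined class, let's try a basic type.

With a cluster of 1 node, check against the Local Cache and the Distributed Cache.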

    @Test
    public void simplyClassLocalCache() {
        this.<String, String>withCache("localCache", 1, cache -> {
            cache.put("key1", "value1");
            cache.put("key2", "value2");
            cache.put("key3", "value3");

            assertThat(cache.get("key1")).isEqualTo("value1");
            assertThat(cache.get("key2")).isEqualTo("value2");
            assertThat(cache.get("key3")).isEqualTo("value3");
        });
    }

    @Test
    public void simplyClassDistributedCache() {
        this.<String, String>withCache("distributedCache", 1, cache -> {
            cache.put("key1", "value1");
            cache.put("key2", "value2");
            cache.put("key3", "value3");

            assertThat(cache.get("key1")).isEqualTo("value1");
            assertThat(cache.get("key2")).isEqualTo("value2");
            assertThat(cache.get("key3")).isEqualTo("value3");
        });
    }

This works without problems.

Increase the cluster to 3 nodes.

    @Test
    public void simplyClassClusteredLocalCache() {
        this.<String, String>withCache("localCache", 3, cache -> {
            cache.put("key1", "value1");
            cache.put("key2", "value2");
            cache.put("key3", "value3");

            assertThat(cache.get("key1")).isEqualTo("value1");
            assertThat(cache.get("key2")).isEqualTo("value2");
            assertThat(cache.get("key3")).isEqualTo("value3");
        });
    }

    @Test
    public void simplyClassClusteredDistributedCache() {
        this.<String, String>withCache("distributedCache", 3, cache -> {
            cache.put("key1", "value1");
            cache.put("key2", "value2");
            cache.put("key3", "value3");

            assertThat(cache.get("key1")).isEqualTo("value1");
            assertThat(cache.get("key2")).isEqualTo("value2");
            assertThat(cache.get("key3")).isEqualTo("value3");
        });
    }

This also works without problems.

Using a user-defined class

Now let's use a user-defined class.

The subject is books.
src/test/java/org/littlewings/infinispan/marshalling/Book.java

package org.littlewings.infinispan.marshalling;

public class Book {
    String isbn;

    String title;

    int price;

    //@ProtoFactory
    public static Book create(String isbn, String title, int price) {
        Book book = new Book();

        book.setIsbn(isbn);
        book.setTitle(title);
        book.setPrice(price);

        return book;
    }

    // getters/setters omitted
}

Not making it Serializable is somewhat deliberate. I run the same variations against this class as I did with String.

One node in the cluster, with the Local Cache and the Distributed Cache.

    @Test
    public void userDefinedClassLocalCache() {
        this.<String, Book>withCache("localCache", 1, cache -> {
            List<Book> books =
                    Arrays
                            .asList(
                                    Book.create("978-1782169970", "Infinispan Data Grid Platform Definitive Guide", 5337),
                                    Book.create("978-1785285332", "Getting Started With Hazelcast - Second Edition", 3848),
                                    Book.create("978-1783988181", "Mastering Redis", 6172)
                            );

            books.forEach(b -> cache.put(b.getIsbn(), b));

            assertThat(cache.get("978-1782169970").getTitle()).isEqualTo("Infinispan Data Grid Platform Definitive Guide");
            assertThat(cache.get("978-1785285332").getTitle()).isEqualTo("Getting Started With Hazelcast - Second Edition");
            assertThat(cache.get("978-1783988181").getTitle()).isEqualTo("Mastering Redis");
        });
    }

    @Test
    public void userDefinedClassDistributedCache() {
        this.<String, Book>withCache("distributedCache", 1, cache -> {
            List<Book> books =
                    Arrays
                            .asList(
                                    Book.create("978-1782169970", "Infinispan Data Grid Platform Definitive Guide", 5337),
                                    Book.create("978-1785285332", "Getting Started With Hazelcast - Second Edition", 3848),
                                    Book.create("978-1783988181", "Mastering Redis", 6172)
                            );

            books.forEach(b -> cache.put(b.getIsbn(), b));

            assertThat(cache.get("978-1782169970").getTitle()).isEqualTo("Infinispan Data Grid Platform Definitive Guide");
            assertThat(cache.get("978-1785285332").getTitle()).isEqualTo("Getting Started With Hazelcast - Second Edition");
            assertThat(cache.get("978-1783988181").getTitle()).isEqualTo("Mastering Redis");
        });
    }

These work.

Three nodes in the cluster, with the Local Cache.

    @Test
    public void userDefinedClassClusteredLocalCache() {
        this.<String, Book>withCache("localCache", 3, cache -> {
            List<Book> books =
                    Arrays
                            .asList(
                                    Book.create("978-1782169970", "Infinispan Data Grid Platform Definitive Guide", 5337),
                                    Book.create("978-1785285332", "Getting Started With Hazelcast - Second Edition", 3848),
                                    Book.create("978-1783988181", "Mastering Redis", 6172)
                            );

            books.forEach(b -> cache.put(b.getIsbn(), b));

            assertThat(cache.get("978-1782169970").getTitle()).isEqualTo("Infinispan Data Grid Platform Definitive Guide");
            assertThat(cache.get("978-1785285332").getTitle()).isEqualTo("Getting Started With Hazelcast - Second Edition");
            assertThat(cache.get("978-1783988181").getTitle()).isEqualTo("Mastering Redis");
        });
    }

It does not mean much for a Local Cache, but this one works too.
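A plausible reading of why these cases pass even though Book is not Serializable is that a value only has to be turned into bytes when it leaves the JVM. The toy model below sketches that idea with plain Java collections. Everything in it (the `MarshallingModelSketch` class, its methods, the built-in String support) is a hypothetical simplification and an assumption on my part, not Infinispan's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only: the class and method names are hypothetical, not Infinispan's.
public class MarshallingModelSketch {
    // Marshallers are looked up by the concrete Java type of the value.
    private final Map<Class<?>, Function<Object, byte[]>> marshallers = new HashMap<>();
    // Entries kept on this node are held as plain object references.
    private final Map<String, Object> localEntries = new HashMap<>();

    public MarshallingModelSketch() {
        // Assumption: common types such as String are supported out of the box.
        marshallers.put(String.class, v -> ((String) v).getBytes(StandardCharsets.UTF_8));
    }

    // A purely local put never needs to turn the value into bytes.
    public void putLocal(String key, Object value) {
        localEntries.put(key, value);
    }

    public Object getLocal(String key) {
        return localEntries.get(key);
    }

    // Sending a value to another node requires bytes, hence a marshaller.
    public byte[] marshalForRemote(Object value) {
        Function<Object, byte[]> marshaller = marshallers.get(value.getClass());
        if (marshaller == null) {
            throw new IllegalArgumentException(
                    "No marshaller registered for Java type " + value.getClass().getName());
        }
        return marshaller.apply(value);
    }

    // Stand-in for a user-defined class that has no registered marshaller.
    public static final class Unregistered {
    }

    public static void main(String[] args) {
        MarshallingModelSketch model = new MarshallingModelSketch();
        model.putLocal("key1", new Unregistered());   // fine: no bytes needed
        model.marshalForRemote("value1");             // fine: String is known
        try {
            model.marshalForRemote(new Unregistered());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, a single node or a Local Cache never reaches `marshalForRemote`, so an unregistered type goes unnoticed until an entry has to be shipped to another node.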

Three nodes in the cluster, with the Distributed Cache.

    @Test
    public void userDefinedClassClusteredDistributedCache() {
        this.<String, Book>withCache("distributedCache", 3, cache -> {
            List<Book> books =
                    Arrays
                            .asList(
                                    Book.create("978-1782169970", "Infinispan Data Grid Platform Definitive Guide", 5337),
                                    Book.create("978-1785285332", "Getting Started With Hazelcast - Second Edition", 3848),
                                    Book.create("978-1783988181", "Mastering Redis", 6172)
                            );

            books.forEach(b -> cache.put(b.getIsbn(), b));

            assertThat(cache.get("978-1782169970").getTitle()).isEqualTo("Infinispan Data Grid Platform Definitive Guide");
            assertThat(cache.get("978-1785285332").getTitle()).isEqualTo("Getting Started With Hazelcast - Second Edition");
            assertThat(cache.get("978-1783988181").getTitle()).isEqualTo("Mastering Redis");
        });
    }

This one fails.

WARN: ISPN000559: Cannot marshall 'class org.littlewings.infinispan.marshalling.Book'
java.lang.IllegalArgumentException: No marshaller registered for Java type org.littlewings.infinispan.marshalling.Book
    at org.infinispan.protostream.impl.SerializationContextImpl.getMarshallerDelegate(SerializationContextImpl.java:279)
    at org.infinispan.protostream.WrappedMessage.writeMessage(WrappedMessage.java:240)
    at org.infinispan.protostream.ProtobufUtil.toWrappedStream(ProtobufUtil.java:196)
    at org.infinispan.marshall.persistence.impl.PersistenceMarshallerImpl.objectToBuffer(PersistenceMarshallerImpl.java:157)
    at org.infinispan.marshall.persistence.impl.PersistenceMarshallerImpl.objectToByteBuffer(PersistenceMarshallerImpl.java:137)
    at org.infinispan.marshall.persistence.impl.PersistenceMarshallerImpl.objectToByteBuffer(PersistenceMarshallerImpl.java:145)
    at org.infinispan.marshall.core.GlobalMarshaller.writeRawUnknown(GlobalMarshaller.java:638)
    at org.infinispan.marshall.core.GlobalMarshaller.writeUnknown(GlobalMarshaller.java:627)
    at org.infinispan.marshall.core.GlobalMarshaller.writeUnknown(GlobalMarshaller.java:618)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNonNullableObject(GlobalMarshaller.java:384)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNullableObject(GlobalMarshaller.java:352)
    at org.infinispan.marshall.core.BytesObjectOutput.writeObject(BytesObjectOutput.java:26)
    at org.infinispan.commands.write.PutKeyValueCommand.writeTo(PutKeyValueCommand.java:81)
    at org.infinispan.marshall.exts.ReplicableCommandExternalizer.writeCommandParameters(ReplicableCommandExternalizer.java:70)
    at org.infinispan.marshall.exts.ReplicableCommandExternalizer.writeObject(ReplicableCommandExternalizer.java:66)
    at org.infinispan.marshall.exts.ReplicableCommandExternalizer.writeObject(ReplicableCommandExternalizer.java:54)
    at org.infinispan.marshall.core.GlobalMarshaller.writeInternal(GlobalMarshaller.java:656)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNonNullableObject(GlobalMarshaller.java:371)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNullableObject(GlobalMarshaller.java:352)
    at org.infinispan.marshall.core.BytesObjectOutput.writeObject(BytesObjectOutput.java:26)
    at org.infinispan.commands.remote.SingleRpcCommand.writeTo(SingleRpcCommand.java:52)
    at org.infinispan.marshall.exts.ReplicableCommandExternalizer.writeCommandParameters(ReplicableCommandExternalizer.java:70)
    at org.infinispan.marshall.exts.CacheRpcCommandExternalizer.marshallParameters(CacheRpcCommandExternalizer.java:120)
    at org.infinispan.marshall.exts.CacheRpcCommandExternalizer.writeObject(CacheRpcCommandExternalizer.java:116)
    at org.infinispan.marshall.exts.CacheRpcCommandExternalizer.writeObject(CacheRpcCommandExternalizer.java:66)
    at org.infinispan.marshall.core.GlobalMarshaller.writeInternal(GlobalMarshaller.java:656)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNonNullableObject(GlobalMarshaller.java:371)
    at org.infinispan.marshall.core.GlobalMarshaller.writeNullableObject(GlobalMarshaller.java:352)
    at org.infinispan.marshall.core.GlobalMarshaller.writeObjectOutput(GlobalMarshaller.java:181)
    at org.infinispan.marshall.core.GlobalMarshaller.writeObjectOutput(GlobalMarshaller.java:174)
    at org.infinispan.marshall.core.GlobalMarshaller.objectToBuffer(GlobalMarshaller.java:302)

        ... (omitted) ...

The message says that no Marshaller is registered for the class I created.

java.lang.IllegalArgumentException: No marshaller registered for Java type org.littlewings.infinispan.marshalling.Book

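Incidentally, with this configuration, Marshalling simply does not run unless the Distributed Cache has multiple Nodes.
If you force Marshalling to always take place, for example as shown below, the tests that handle a user-defined class
fail regardless of whether they use the Local Cache or the Distributed Cache.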

        <local-cache name="localCache">
            <memory>
                <binary size="-1"/>
            </memory>
        </local-cache>

        <distributed-cache name="distributedCache">
            <memory>
                <binary size="-1"/>
            </memory>
        </distributed-cache>

So JBoss Marshalling really is no longer used.

So, what to do?

Reading the Marshalling documentation

At this point, let's go back and read the Marshalling documentation properly.

Marshalling

It says that ProtoStream is the default.

ProtoStream (Default)

Some classes, such as primitives, can apparently be marshalled without any extra setup.

Usage

This is why the earlier String example did not fail even with the Distributed Cache.

For primitives, the relevant code is around here.

https://github.com/infinispan/infinispan/blob/10.0.1.Final/core/src/main/java/org/infinispan/marshall/core/Primitives.java#L43-L65
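
To make the difference between the String test and the Book test concrete, here is a toy model of a marshaller lookup. It is not how Infinispan or ProtoStream is implemented. The class `MarshallerRegistrySketch` and its methods are hypothetical names made up for this illustration, and the only thing borrowed from the log above is the wording of the exception message.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only: a lookup table from Java type to a function that produces bytes.
// All names here are hypothetical and do not exist in Infinispan or ProtoStream.
final class MarshallerRegistrySketch {
    private final Map<Class<?>, Function<Object, byte[]>> marshallers = new HashMap<>();

    MarshallerRegistrySketch() {
        // A few types are assumed to be supported out of the box, like String in the earlier test.
        register(String.class, s -> s.getBytes(StandardCharsets.UTF_8));
        register(Integer.class, i -> Integer.toString(i).getBytes(StandardCharsets.UTF_8));
    }

    // Adds a marshaller for one exact Java type.
    <T> void register(Class<T> type, Function<T, byte[]> marshaller) {
        marshallers.put(type, value -> marshaller.apply(type.cast(value)));
    }

    // Looks up the marshaller by the runtime class and fails if none is registered.
    byte[] toBytes(Object value) {
        Function<Object, byte[]> marshaller = marshallers.get(value.getClass());
        if (marshaller == null) {
            throw new IllegalArgumentException(
                    "No marshaller registered for Java type " + value.getClass().getName());
        }
        return marshaller.apply(value);
    }

    public static void main(String[] args) {
        MarshallerRegistrySketch registry = new MarshallerRegistrySketch();
        System.out.println(registry.toBytes("Mastering Redis").length);
        try {
            // StringBuilder stands in for a user-defined class with no marshaller.
            registry.toBytes(new StringBuilder("unregistered"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, the fix is to register a marshaller for the missing type before using it, which is the role the SerializationContextInitializer plays for Book in the steps that follow.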

Following the documentation, you annotate the user-defined class so that the Protocol Buffers IDL and the Marshaller
are generated automatically, and then register them as a SerializationContextInitializer.

Generating SerializationContextInitializers

First, add the following to pom.xml.

        <dependency>
            <groupId>org.infinispan.protostream</groupId>
            <artifactId>protostream-processor</artifactId>
            <version>4.3.0.Final</version>
            <optional>true</optional>
        </dependency>

It is only used at build time to generate code from the annotations, so optional true should be fine.

Next, add @ProtoField annotations to the Book class to define its fields as Protocol Buffers fields.
src/test/java/org/littlewings/infinispan/marshalling/Book.java

package org.littlewings.infinispan.marshalling;

import org.infinispan.protostream.annotations.ProtoFactory;
import org.infinispan.protostream.annotations.ProtoField;

public class Book {
    @ProtoField(number = 1, required = true)
    String isbn;

    @ProtoField(number = 2, required = true)
    String title;

    @ProtoField(number = 3, required = true, defaultValue = "0")
    int price;

    @ProtoFactory
    public static Book create(String isbn, String title, int price) {
        Book book = new Book();

        book.setIsbn(isbn);
        book.setTitle(title);
        book.setPrice(price);

        return book;
    }

    // getters/setters omitted
}

As for @ProtoFactory, put it on a factory method or a constructor as needed.

Note that the @ProtoField annotation cannot be placed on static fields or private fields.

https://github.com/infinispan/protostream/blob/4.3.0.Final/core/src/main/java/org/infinispan/protostream/annotations/impl/ProtoMessageTypeMetadata.java#L306-L317
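
As a rough illustration of that rule, the following sketch uses plain JDK reflection to list the fields of a class that are static or private. It only mirrors the two conditions mentioned above and is not the ProtoStream code behind the link. `ProtoFieldRuleSketch` and `Sample` are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only: checks the "no static, no private" rule with JDK reflection.
final class ProtoFieldRuleSketch {
    // Hypothetical class used only to exercise the check.
    static class Sample {
        String isbn;            // package-private instance field: acceptable
        public int price;       // public instance field: acceptable
        private String secret;  // private: would be rejected
        static String shared;   // static: would be rejected
    }

    // True when the field is neither static nor private.
    static boolean isAnnotatable(Field field) {
        int modifiers = field.getModifiers();
        return !Modifier.isStatic(modifiers) && !Modifier.isPrivate(modifiers);
    }

    // Returns the names of declared fields that violate the rule, sorted alphabetically.
    static List<String> rejectedFields(Class<?> type) {
        List<String> rejected = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated fields
            }
            if (!isAnnotatable(field)) {
                rejected.add(field.getName());
            }
        }
        Collections.sort(rejected);
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println(rejectedFields(Sample.class));
    }
}
```

This is why the Book fields above are declared without the private modifier.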

Then, create an interface that extends the SerializationContextInitializer interface, and add the @AutoProtoSchemaBuilder
annotation to it to configure the automatic generation of the proto file.
src/test/java/org/littlewings/infinispan/marshalling/LibraryInitializer.java

package org.littlewings.infinispan.marshalling;

import org.infinispan.protostream.SerializationContextInitializer;
import org.infinispan.protostream.annotations.AutoProtoSchemaBuilder;

@AutoProtoSchemaBuilder(
        includeClasses = {
                Book.class
        },
        schemaFileName = "library.proto",
        schemaFilePath = "proto/",
        schemaPackageName = "book")
public interface LibraryInitializer extends SerializationContextInitializer {
}

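On the @AutoProtoSchemaBuilder annotation, you specify the classes to be marshalled and the details of the
Protocol Buffers IDL file to generate.

With this, the proto file, the Marshaller, and an implementation class of the SerializationContextInitializer
sub-interface are generated at build time.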

$ find target/test-classes target/generated-test-sources -type f
target/test-classes/proto/library.proto
target/test-classes/infinispan.xml
target/test-classes/org/littlewings/infinispan/marshalling/LibraryInitializerImpl.class
target/test-classes/org/littlewings/infinispan/marshalling/MarshallingTest.class
target/test-classes/org/littlewings/infinispan/marshalling/LibraryInitializer.class
target/test-classes/org/littlewings/infinispan/marshalling/Book.class
target/test-classes/org/littlewings/infinispan/marshalling/Book$___Marshaller_f17d03721c1841044d5c821f6b06c39a80a6392a51f12f33aaaabb0dabbd74cc.class
target/generated-test-sources/test-annotations/org/littlewings/infinispan/marshalling/LibraryInitializerImpl.java
target/generated-test-sources/test-annotations/org/littlewings/infinispan/marshalling/Book$___Marshaller_f17d03721c1841044d5c821f6b06c39a80a6392a51f12f33aaaabb0dabbd74cc.java

The generated proto file looks like this.
target/test-classes/proto/library.proto

// File name: library.proto
// Generated from : org.littlewings.infinispan.marshalling.LibraryInitializer

syntax = "proto2";

package book;



message Book {
   
   required string isbn = 1;
   
   required string title = 2;
   
   required int32 price = 3 [default = 0];
}

The path and the contents of the generated file reflect what was specified in the annotations.

@AutoProtoSchemaBuilder(
        includeClasses = {
                Book.class
        },
        schemaFileName = "library.proto",
        schemaFilePath = "proto/",
        schemaPackageName = "book")

...

    @ProtoField(number = 1, required = true)
    String isbn;

    @ProtoField(number = 2, required = true)
    String title;

    @ProtoField(number = 3, required = true, defaultValue = "0")
    int price;

The automatically generated implementation of the SerializationContextInitializer interface is this one. It appears that the suffix Impl is appended.
Since I named the interface LibraryInitializer, the result is LibraryInitializerImpl.

target/test-classes/org/littlewings/infinispan/marshalling/LibraryInitializerImpl.class
target/generated-test-sources/test-annotations/org/littlewings/infinispan/marshalling/LibraryInitializerImpl.java

This part is rather hard to figure out: the generated class is registered as a context-initializer inside the serialization element.

    <cache-container shutdown-hook="REGISTER">
        <transport stack="udp"/>

        <serialization>
            <context-initializer class="org.littlewings.infinispan.marshalling.LibraryInitializerImpl"/>
        </serialization>

        <local-cache name="localCache"/>

        <distributed-cache name="distributedCache"/>
    </cache-container>

In this case, the class name is "LibraryInitializerImpl".

With this, the Marshaller for the Book class is registered, and the earlier code now passes the test.
It also works when Marshalling is forced.

User-defined classes can now be marshalled.

I suppose this is how it will be used from now on.

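Miscellaneous

A few extras, although these are only pointers to the documentation.

When not auto-generating the IDL and Marshaller

This time I created a user-defined class and added the ProtoStream annotations to it. However, when the class to be
marshalled is one whose source code you cannot modify yourself, you write the proto file and the Marshaller by hand as described below.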

Manually Implementing SerializationContextInitializers

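In short, you do by hand what was generated automatically in the earlier example.

It looks like you also create a class that implements SerializationContextInitializer and register it as a context-initializer in the same way.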

Java Serialization Marshaller

This is for using the Java serialization mechanism as the Marshaller.

Java Serialization Marshaller
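
The linked section covers the Infinispan side of the configuration. The sketch below only shows the underlying JDK mechanism, a round trip through ObjectOutputStream and ObjectInputStream, on the assumption that the value class implements java.io.Serializable. `JavaSerializationSketch` and `SerializableBook` are hypothetical names, and `SerializableBook` is not the Book class used earlier in this article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Illustration only: plain JDK serialization, with no Infinispan API involved.
final class JavaSerializationSketch {
    // Hypothetical value class; implementing Serializable is what makes the round trip possible.
    static final class SerializableBook implements Serializable {
        private static final long serialVersionUID = 1L;

        final String isbn;
        final String title;
        final int price;

        SerializableBook(String isbn, String title, int price) {
            this.isbn = isbn;
            this.title = title;
            this.price = price;
        }
    }

    // Serializes the value into a byte array.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores an object from bytes produced by toBytes.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerializableBook original = new SerializableBook("978-1783988181", "Mastering Redis", 6172);
        SerializableBook copy = (SerializableBook) fromBytes(toBytes(original));
        System.out.println(copy.isbn + " / " + copy.title + " / " + copy.price);
    }
}
```

Unlike the ProtoStream approach, nothing is generated at build time here; the cost is that the byte format is specific to Java.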

JBoss Marshalling

This is the former default Marshalling implementation.

JBoss Marshalling

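It was split out of infinispan-core into infinispan-jboss-marshalling, so you need to add the dependency separately if you want to use it.

However, JBoss Marshalling is deprecated and apparently scheduled for removal, so I doubt I will be using it again.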

JBoss Marshalling is deprecated and planned for removal in a future version of Infinispan.

AdvancedExternalizer also looks like it is going away.

Infinispan ignores implementations of the AdvancedExternalizer interface when persisting data unless you configure JBoss marshalling. However, this interface is also deprecated and planned for removal.

Custom Marshaller

This is for using your own Marshaller.

Custom Implementation

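I would expect to use this when introducing a custom Marshalling mechanism that is neither Protocol Buffers nor Java serialization.

Summary

I tried out Marshalling, which was refactored in Infinispan 10.0, starting with Embedded Mode.

Going forward, it looks like Protocol Buffers will be the standard way to do this.

That said, it uses Protocol Buffers version 2, so I hope it moves to version 3 at some point. It is on the Road Map, at least.

A few things tripped me up a little, such as defining the SerializationContextInitializer, reflecting it in the configuration, and adding the protostream-processor dependency,
but I'm glad I got it working in the end.

The source code created for this post is available here.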

https://github.com/kazuhira-r/infinispan-getting-started/tree/master/embedded-without-jboss-marshalling