CLOVER🍀

That was when it all began.

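Vectorizing text with LocalAI + bert.cpp

What is this post for?

A while ago, I wrote this entry.

Vectorizing text with an OpenAI API-compatible server set up with llama-cpp-python - CLOVER🍀

At the end of that entry, I vectorized some text and searched a set of documents by cosine similarity, but the results
were underwhelming.

I wondered whether using LocalAI and bert.cpp would change the results, so I decided to try it.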

LocalAI+bert.cpp

Among the LocalAI backends, the models that support text embeddings are listed on the following page.

Model compatibility :: LocalAI documentation

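Three backends are covered: llama.cpp, bert.cpp, and sentence-transformers. Of these, llama.cpp carries the note "doesn’t seem to be accurate",
which makes it look somewhat doubtful. In fact, the results were also underwhelming when I ran it through llama-cpp-python, so my motivation
this time was to try LocalAI, which lets me use a different backend.
Note: while writing this entry I realized that the model I had been using may have been the real problem...

The text embeddings page is here, and it describes how to use each model.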

🧠 Embeddings :: LocalAI documentation

For bert.cpp itself, the GitHub repository is the place to look.

GitHub - skeskinen/bert.cpp: ggml implementation of BERT

It appears to handle Japanese poorly, so it seems better to test with English text, as I did last time.

Tokenizer doesn't correctly handle asian writing (CJK, maybe others)

bert.cpp / Limitations & TODO
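Given that limitation, it can be handy to notice CJK input before it reaches the embedding endpoint. The following is only a rough sketch: `contains_cjk` and `CJK_RANGES` are names I made up for illustration, they are not part of bert.cpp or LocalAI, and the set of Unicode blocks checked is my own assumption about what counts as "CJK" here.

```python
# Rough heuristic, not part of bert.cpp or LocalAI. The blocks listed here are
# an assumption about which scripts matter; extend them as needed.
CJK_RANGES = (
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
)


def contains_cjk(text: str) -> bool:
    """Return True if any character falls inside one of CJK_RANGES."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in CJK_RANGES)


print(contains_cjk("Hello World."))
print(contains_cjk("テキストをベクトル化"))
```

A caller could use a check like this to print a warning or to fall back to English text, rather than silently sending input the tokenizer may not handle well.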

bert.cpp requires models in GGML format, and the bert.cpp README.md describes how to build one yourself.

However, the LocalAI page indicates that models already converted to GGML are available, so I'll try those.

For instance you can download the ggml quantized version of all-MiniLM-L6-v2 from https://huggingface.co/skeskinen/ggml:

Embeddings / Bert embeddings

Let's try it.

Environment

The environment this time is as follows.

LocalAI.

$ ./local-ai-avx2-Linux-x86_64 --version
LocalAI version v2.4.0 (bcf02449b37f4f9221c0685428d1abf4b9794fb0)

Python.

$ python3 --version
Python 3.10.12


$ pip3 --version
pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)

Preparing LocalAI

First, prepare the LocalAI side.

Download a model from here.

skeskinen/ggml · Hugging Face

I'll use all-MiniLM-L12-v2/ggml-model-q4_0.

$ mkdir models
$ curl -L https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L12-v2/ggml-model-q4_0.bin -o models/all-MiniLM-L12-v2-ggml-model-q4_0.bin

Create a configuration file like this.

localai-config.yaml

- name: text-embedding-ada-002
  backend: bert-embeddings
  embeddings: true
  parameters:
    model: all-MiniLM-L12-v2-ggml-model-q4_0.bin

Start LocalAI.

$ ./local-ai-avx2-Linux-x86_64 --config-file localai-config.yaml --models-path models --threads 4

The text embeddings endpoint is documented here.

ENDPOINTS / Embeddings / Create embeddings

Check that it works.

$ curl -XPOST -H 'Content-Type: application/json' localhost:8080/v1/embeddings -d '{"model": "text-embedding-ada-002", "input": "Hello World."}'
{"error":{"code":500,"message":"could not load model: rpc error: code = Unknown desc = failed loading model","type":""}}

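Surprisingly, it fails...

Looking through the issues, it seems an older revision of the model should work (although the results are apparently mediocre).

When using the bert embedding model, the process crashes when passing a certain string · Issue #780 · mudler/LocalAI · GitHub

I tried this revision.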

$ curl -L https://huggingface.co/skeskinen/ggml/resolve/b89d24f5ad2de2ce8306839fb3a8e23f1f59df97/all-MiniLM-L12-v2/ggml-model-q4_0.bin -o models/all-MiniLM-L12-v2-ggml-model-q4_0.bin

https://huggingface.co/skeskinen/ggml/tree/b89d24f5ad2de2ce8306839fb3a8e23f1f59df97/all-MiniLM-L12-v2
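The only difference between the two downloads is the revision segment of the URL: `main` the first time, a commit hash the second. As a small sketch of that pattern, the helper below rebuilds both URLs. `hf_resolve_url` is a hypothetical name of my own, and the URL layout is simply inferred from the two `curl` commands in this entry rather than taken from any official client.

```python
def hf_resolve_url(repo: str, revision: str, path: str) -> str:
    """Build a download URL in the same shape as the curl commands above.

    Hypothetical helper: the layout repo/resolve/revision/path is inferred
    from the URLs used in this entry.
    """
    return f"https://huggingface.co/{repo}/resolve/{revision}/{path}"


MODEL_PATH = "all-MiniLM-L12-v2/ggml-model-q4_0.bin"

# Latest revision (the file that failed to load in this entry).
print(hf_resolve_url("skeskinen/ggml", "main", MODEL_PATH))

# Pinned older revision (the file that loaded successfully).
print(hf_resolve_url(
    "skeskinen/ggml",
    "b89d24f5ad2de2ce8306839fb3a8e23f1f59df97",
    MODEL_PATH,
))
```

Pinning the commit hash like this keeps the download reproducible even if the repository's `main` branch changes again.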

This time a result came back.

$ curl -XPOST -H 'Content-Type: application/json' localhost:8080/v1/embeddings -d '{"model": "text-embedding-ada-002", "input": "Hello World."}'
{"created":1704464155,"object":"list","id":"01dd7b98-fabf-4478-a7fa-4e36f3b72aa5","model":"text-embedding-ada-002","data":[{"embedding":[-0.055707518,0.0003755873,0.023456097,-0.060628925,0.0034334238,-0.17254478,0.065997295,0.02052893,-0.050387364,0.06870441,-0.0015679031,-0.045133803,0.01880054,-0.013254267,-0.0492112,0.014403614,0.069525525,-0.08158547,-0.06772742,-0.04356296,-0.06250432,0.02006686,-0.05179327,0.03680826,-0.071426705,-0.02178828,0.053203057,0.028997315,0.062450938,-0.089345604,0.01893532,0.024499472,0.085619874,-0.0042029317,-0.03957942,-0.0032008265,0.04206376,-0.050160468,-0.032053646,0.09263854,0.017336003,-0.010563963,-0.05229845,0.029144093,-0.024735775,-0.04784084,0.0064742537,-0.028608225,0.049451675,-0.051364496,0.0039498485,-0.022974428,0.0057864757,-0.016882226,0.08982491,-0.046759088,-0.00833787,0.03922543,0.017892556,-0.039300848,-0.048574377,-0.0047213393,0.019886704,-0.002025618,0.019287663,0.005274521,0.015193669,0.060543288,-0.010904695,-0.0927869,-0.04437331,0.014444485,0.05743999,-0.036440756,0.019805191,-0.02373424,0.06179562,0.019722633,-0.00022595428,0.01527771,0.020137366,-0.09899312,0.0682251,0.044644702,0.0578913,0.020235965,0.036527634,0.054886315,0.05865164,-0.023150543,-0.09462258,0.01791217,0.06246526,0.07609492,-0.020596715,0.03779657,0.04354742,0.0053492044,-0.07052353,0.25149843,-0.07539313,0.032874193,-0.032444313,0.09289441,-0.032539003,0.03612972,-0.013801879,0.10097128,0.007944174,-0.022529678,-0.050531007,0.0047978456,-0.07830395,-0.03797801,0.03325385,-0.06609763,0.012193191,0.04613942,-0.0073556835,0.025613524,0.024211822,-0.026121955,0.031730756,-0.067548014,0.004217653,-0.075113565,-0.047162574,0.041670054,0.04310346,0.011722457,0.07301923,0.14492528,0.041620594,-0.028017374,0.0039015382,0.030719684,0.07129587,0.02830669,0.011865219,-0.07679543,0.004590461,0.07248499,-0.013388571,0.023635203,0.017292839,-0.004983722,0.037021335,0.06278834,0.009885339,0.05366459,-0.054637857,0.019670302,0.005020621,-0.007923
028,0.05819283,-0.052988403,0.023943618,0.027114403,0.10748222,0.0067536184,-0.08655766,0.0384689,-0.00057974784,-0.031289157,-0.012211517,-0.031373736,-0.008547462,-0.046705853,-0.07867813,0.0137158735,0.022137318,-0.06495786,-0.0011469374,-0.030246504,0.043171067,0.028785208,-0.006690446,0.031896755,0.03371551,0.002338521,-0.08207289,0.02345097,-0.019970678,0.03356426,0.02599093,0.020071164,-0.02467199,0.048704617,0.02697544,0.035380755,0.033974864,0.039165895,0.036689,-0.0230244,0.038122516,0.016099062,0.012298665,0.010828561,0.059357595,-0.010514352,-0.059430428,0.04801912,0.04367155,0.07294297,0.16069485,-0.094530605,0.00079188385,0.044296846,-0.027113847,0.032635316,-0.011283311,0.025684986,-0.008302678,-0.010010705,-0.043378916,-0.044526745,-0.037425432,0.015989363,-0.048972577,0.03180086,0.038103793,-0.044846624,-0.039140075,1.3498234e-32,0.021783736,0.018304715,-0.079643056,-0.041162778,-0.03317576,0.004117927,-0.07211811,0.026265882,-0.04252388,0.04133357,0.035663515,0.022423603,0.0556743,-0.0062787817,-0.036579546,-0.03736653,0.12708342,-0.0062570996,-0.06398942,0.051768944,-0.018123515,-0.028535813,-0.08343188,-0.030252649,0.0056203557,0.005071198,-0.029715903,0.05599359,-0.10371912,-0.011015235,-0.03535344,0.08803534,-0.023437308,0.0706533,0.03232016,0.01996845,-0.096253395,-0.15422936,0.031047646,-0.0061241984,-0.030927895,-0.02260987,-0.080585994,0.07415225,-0.08781958,-0.0037250454,-0.07512957,0.04838397,0.021392282,-0.11022927,0.005542539,-0.020443616,0.035500955,-0.06884671,-0.12672123,0.044053588,-0.07179252,-0.036618296,0.0040195836,-0.02466008,-0.00059482484,0.0570458,0.07321232,0.041371927,0.061526757,0.033605244,0.10768644,0.034523208,0.042581566,-0.05650699,-0.06462194,-0.018100515,-0.003522056,-0.026864702,-0.027745746,0.043704674,-0.03227746,-0.019893866,0.023254743,0.021995453,0.014928417,0.06081148,0.018583808,0.024115572,0.03187515,0.0042050313,0.060978487,0.0061015002,-0.053108525,-0.024335565,-0.007854954,0.0049676932,0.016790563,0.027
18481,-0.09055001,-6.108438e-33,-0.07189688,-0.02383935,0.07357722,-0.058474395,0.067925915,0.12860355,0.053099345,0.027835658,-0.083729856,0.010967085,-0.034337398,0.073035724,-0.03346505,-0.057342123,-0.0032557307,-0.024084376,-0.06894051,-0.0051558116,0.053306952,-0.017153356,0.061940886,0.016649097,-0.013647344,0.021281878,0.07158137,-0.042669397,-0.097900376,-0.052383307,-0.000873163,-0.050987057,0.044383634,0.022267267,-0.056469373,-0.01485328,-0.026436295,-0.026998756,0.02917846,-0.0031674376,0.052175343,-0.073051065,0.002612983,0.011249844,0.069536604,-0.10898132,0.032033283,0.08346,-0.05261967,0.019422693,0.06275057,-0.019484224,-0.006440391,0.06863358,0.058349065,-0.0074646664,-0.012929345,0.016804898,-0.012698644,-0.04337536,-0.09278484,0.058133755,0.015499066,-0.028114956,-0.007501239,0.060511876],"index":0,"object":"embedding"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
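Both response shapes seen so far (the `error` object from the failed attempt and the `data` list from the successful one) can be handled in one place. This is a minimal sketch, assuming only the JSON fields visible in the two responses in this entry. `extract_embedding` and `EmbeddingError` are hypothetical names I introduced, and the success body in the example is shortened to the first three values of the real response.

```python
import json


class EmbeddingError(RuntimeError):
    """Raised when the response body carries an "error" object."""


def extract_embedding(body: str) -> list[float]:
    """Return the first embedding from a response body, or raise on error."""
    payload = json.loads(body)
    if "error" in payload:
        err = payload["error"]
        raise EmbeddingError(f'{err.get("code")}: {err.get("message")}')
    return payload["data"][0]["embedding"]


# Error body copied from the failed request.
error_body = (
    '{"error":{"code":500,"message":"could not load model: rpc error: '
    'code = Unknown desc = failed loading model","type":""}}'
)

# Success body shortened to the first three values for illustration.
ok_body = (
    '{"object":"list","model":"text-embedding-ada-002","data":'
    '[{"embedding":[-0.055707518,0.0003755873,0.023456097],'
    '"index":0,"object":"embedding"}]}'
)

print(extract_embedding(ok_body))

try:
    extract_embedding(error_body)
except EmbeddingError as e:
    print(f"request failed: {e}")
```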

For now, I'll go with this one.

Accessing LocalAI from the OpenAI Python API library

Next, access LocalAI using the OpenAI Python API library.

Install the libraries.

$ pip3 install openai numpy

Versions.

$ pip3 list
Package           Version
----------------- ----------
annotated-types   0.6.0
anyio             4.2.0
certifi           2023.11.17
distro            1.9.0
exceptiongroup    1.2.0
h11               0.14.0
httpcore          1.0.2
httpx             0.26.0
idna              3.6
numpy             1.26.3
openai            1.6.1
pip               22.0.2
pydantic          2.5.3
pydantic_core     2.14.6
setuptools        59.6.0
sniffio           1.3.0
tqdm              4.66.1
typing_extensions 4.9.0

The source code is almost the same as in the previous entry. Only the endpoint port differs.

search.py

import sys
from openai import OpenAI
import numpy as np

# Cosine similarity of two vectors, following:
# https://github.com/openai/openai-python/blob/v0.28.1/openai/embeddings_utils.py#L65-L66
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

seeds = [
    {"name": "Apple", "feature": "With red flesh and thin skin, it has a balanced taste of mild acidity and sweetness."},
    {"name": "Banana", "feature": "A yellow fruit with a smooth texture and mild sweetness, known for its high nutritional value."},
    {"name": "Grapes", "feature": "Purple in color, these small fruits cluster together with juicy and mildly tangy flavor."},
    {"name": "Melon", "feature": "A green fruit with a refreshing texture and aroma, rich in sweetness and water content."},
    {"name": "Orange", "feature": "Wrapped in an orange peel, it offers a harmonious blend of refreshing acidity and sweetness, rich in vitamin C."},
    {"name": "Strawberry", "feature": "Recognized by its red hue, it carries a distinctive fragrance and a sweet-tart taste, with tiny seeds adding texture."},
    {"name": "Pineapple", "feature": "Featuring yellow flesh, it has a sweet-tangy flavor and a unique texture, accompanied by a rich aroma."},
    {"name": "Mango", "feature": "An orange fruit with a rich aroma and intense sweetness, offering a smooth and luscious flesh."},
    {"name": "Kiwi", "feature": "Green flesh with a balanced combination of acidity and sweetness, enhanced by small black seeds for texture."},
    {"name": "Peach", "feature": "Displaying peach-colored flesh, it is juicy and soft with a sweet aroma, complemented by the peach's beautiful appearance."}
]

openai = OpenAI(base_url="http://localhost:8080/v1", api_key="dummy-api-key")

docs = []

for seed in seeds:
    feature = seed["feature"]
    response = openai.embeddings.create(input=feature, model="text-embedding-ada-002")
    docs.append({"name": seed["name"], "feature": feature, "embedding": response.data[0].embedding})


query = sys.argv[1]
response = openai.embeddings.create(input=query, model="text-embedding-ada-002")
query_embedding = response.data[0].embedding

docs_with_similarity = [{
    "name": d["name"],
    "feature": d["feature"],
    "embedding": d["embedding"],
    "similarity":  cosine_similarity(d["embedding"], query_embedding)
} for d in docs]

sorted_docs = sorted(docs_with_similarity, key=lambda d: d["similarity"], reverse=True)

print(f"query = {query}")
print()

print("ranking:")
for doc in sorted_docs:
    print(f"  name: {doc['name']}")
    print(f"    feature: {doc['feature']}")
    print(f"    similarity: {doc['similarity']}")

First, it vectorizes the features of these documents:

seeds = [
    {"name": "Apple", "feature": "With red flesh and thin skin, it has a balanced taste of mild acidity and sweetness."},
    {"name": "Banana", "feature": "A yellow fruit with a smooth texture and mild sweetness, known for its high nutritional value."},
    {"name": "Grapes", "feature": "Purple in color, these small fruits cluster together with juicy and mildly tangy flavor."},
    {"name": "Melon", "feature": "A green fruit with a refreshing texture and aroma, rich in sweetness and water content."},
    {"name": "Orange", "feature": "Wrapped in an orange peel, it offers a harmonious blend of refreshing acidity and sweetness, rich in vitamin C."},
    {"name": "Strawberry", "feature": "Recognized by its red hue, it carries a distinctive fragrance and a sweet-tart taste, with tiny seeds adding texture."},
    {"name": "Pineapple", "feature": "Featuring yellow flesh, it has a sweet-tangy flavor and a unique texture, accompanied by a rich aroma."},
    {"name": "Mango", "feature": "An orange fruit with a rich aroma and intense sweetness, offering a smooth and luscious flesh."},
    {"name": "Kiwi", "feature": "Green flesh with a balanced combination of acidity and sweetness, enhanced by small black seeds for texture."},
    {"name": "Peach", "feature": "Displaying peach-colored flesh, it is juicy and soft with a sweet aroma, complemented by the peach's beautiful appearance."}
]

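Then it vectorizes the query given as a command-line argument, computes the cosine similarity against each document, and sorts by that value in descending order.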

query = sys.argv[1]
response = openai.embeddings.create(input=query, model="text-embedding-ada-002")
query_embedding = response.data[0].embedding

docs_with_similarity = [{
    "name": d["name"],
    "feature": d["feature"],
    "embedding": d["embedding"],
    "similarity":  cosine_similarity(d["embedding"], query_embedding)
} for d in docs]

sorted_docs = sorted(docs_with_similarity, key=lambda d: d["similarity"], reverse=True)
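To get a feel for what the score means before looking at real embeddings, here is a toy example on two-dimensional vectors. It is only a sketch: the vectors and labels are made up for illustration, and I wrote the formula with the standard library instead of NumPy so it runs without any dependencies. The formula is the same dot product divided by the product of the norms.

```python
import math


def cosine_similarity(a, b):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors, not real embeddings.
docs = {
    "same direction": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
    "opposite": [-1.0, 0.0],
}
query = [1.0, 0.0]

# Rank the documents by similarity to the query, highest first.
ranking = sorted(
    docs,
    key=lambda name: cosine_similarity(docs[name], query),
    reverse=True,
)

for name in ranking:
    print(name, cosine_similarity(docs[name], query))
```

The length of a vector does not affect the score, only its direction does, which is why `[2.0, 0.0]` scores 1.0 against `[1.0, 0.0]`.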

Let's try it.

$ python3 search.py green
query = green

ranking:
  name: Kiwi
    feature: Green flesh with a balanced combination of acidity and sweetness, enhanced by small black seeds for texture.
    similarity: 0.47216116572918426
  name: Melon
    feature: A green fruit with a refreshing texture and aroma, rich in sweetness and water content.
    similarity: 0.45520558611906997
  name: Grapes
    feature: Purple in color, these small fruits cluster together with juicy and mildly tangy flavor.
    similarity: 0.3602212050150307
  name: Strawberry
    feature: Recognized by its red hue, it carries a distinctive fragrance and a sweet-tart taste, with tiny seeds adding texture.
    similarity: 0.28667974915671773
  name: Banana
    feature: A yellow fruit with a smooth texture and mild sweetness, known for its high nutritional value.
    similarity: 0.261944463934626
  name: Peach
    feature: Displaying peach-colored flesh, it is juicy and soft with a sweet aroma, complemented by the peach's beautiful appearance.
    similarity: 0.2574946950704811
  name: Mango
    feature: An orange fruit with a rich aroma and intense sweetness, offering a smooth and luscious flesh.
    similarity: 0.24962740966418687
  name: Orange
    feature: Wrapped in an orange peel, it offers a harmonious blend of refreshing acidity and sweetness, rich in vitamin C.
    similarity: 0.2171008485999807
  name: Apple
    feature: With red flesh and thin skin, it has a balanced taste of mild acidity and sweetness.
    similarity: 0.21225964619067184
  name: Pineapple
    feature: Featuring yellow flesh, it has a sweet-tangy flavor and a unique texture, accompanied by a rich aroma.
    similarity: 0.2096289129400678

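It felt faster than with llama-cpp-python. That said, I suspect that is simply due to the model size...

Also, the results do seem somewhat better than with llama-cpp-python, but it is hard to say for sure...

I think I need to find a model that can produce embeddings for Japanese.

For now, I'll settle for having gotten LocalAI and bert.cpp working together...

Conclusion

I used LocalAI and bert.cpp to vectorize text and run a search.

Compared with last time, I only swapped the OpenAI API-compatible server, and beyond learning how to plug in bert.cpp I don't feel
I made much progress.

On the other hand, I've come to feel that my understanding of text vectorization won't deepen unless I find a model that supports
Japanese, so I'll look into that next.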