What prompted this post?
I felt like trying a local LLM again for the first time in a while. I had previously used llama-cpp-python and LocalAI, but Ollama caught my attention and seems popular, so I decided to give it a try.
Ollama
The Ollama website is here.
Looking at the website, though, there is no real documentation, and it's hard to tell what exactly Ollama is…
There is a page for searching models that can be used with Ollama.
It seems better to look at the GitHub repository.
Even there, though, only basic usage is described…
Apparently it runs as a server (well, naturally) and is operated through a CLI.
It works on macOS, Windows, and Linux.
And it uses llama.cpp under the hood.
So it handles GGUF models.
Ollama / Customize a model / Import from GGUF
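As a taste of the GGUF import linked above: you write a Modelfile pointing at a local GGUF file and register it with ollama create. A minimal sketch, where the file name my-model.gguf and the model name are hypothetical placeholders:

```text
# Modelfile (the GGUF path is a placeholder)
FROM ./my-model.gguf
```

Per the linked doc, registering and running it would then look like `ollama create my-model -f Modelfile` followed by `ollama run my-model`.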
The main models are listed here.
There is also a REST API. The API docs are here.
https://github.com/ollama/ollama/blob/v0.5.7/docs/api.md
Let's give it a try.
Environment
Here is today's environment: Ubuntu Linux 24.04 LTS.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04.1 LTS
Release:        24.04
Codename:       noble

$ uname -srvmpio
Linux 6.8.0-51-generic #52-Ubuntu SMP PREEMPT_DYNAMIC Thu Dec  5 13:09:44 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Installing Ollama
To install Ollama on Linux, you can simply run a shell script fetched with curl.
That also registers it as a systemd unit among other things, though, and I didn't want to go that far this time, so I decided to install just the binary.
※ The pattern of installing via the install script is covered at the end.
$ curl -LO https://github.com/ollama/ollama/releases/download/v0.5.7/ollama-linux-amd64.tgz
It's about 1.6 GB…

$ ll -h ollama-linux-amd64.tgz
-rw-rw-r-- 1 xxxxx xxxxx 1.6G Jan 25 18:30 ollama-linux-amd64.tgz
Extract it.
$ tar xf ollama-linux-amd64.tgz
The contents are bin and lib directories, looking like this.
$ tree bin lib
bin
└── ollama
lib
└── ollama
    ├── libcublas.so.11 -> libcublas.so.11.5.1.109
    ├── libcublas.so.11.5.1.109
    ├── libcublas.so.12 -> ./libcublas.so.12.4.5.8
    ├── libcublas.so.12.4.5.8
    ├── libcublasLt.so.11 -> libcublasLt.so.11.5.1.109
    ├── libcublasLt.so.11.5.1.109
    ├── libcublasLt.so.12 -> ./libcublasLt.so.12.4.5.8
    ├── libcublasLt.so.12.4.5.8
    ├── libcudart.so.11.0 -> libcudart.so.11.3.109
    ├── libcudart.so.11.3.109
    ├── libcudart.so.12 -> libcudart.so.12.4.127
    ├── libcudart.so.12.4.127
    └── runners
        ├── cpu_avx
        │   └── ollama_llama_server
        ├── cpu_avx2
        │   └── ollama_llama_server
        ├── cuda_v11_avx
        │   ├── libggml_cuda_v11.so
        │   └── ollama_llama_server
        ├── cuda_v12_avx
        │   ├── libggml_cuda_v12.so
        │   └── ollama_llama_server
        └── rocm_avx
            ├── libggml_rocm.so
            └── ollama_llama_server

9 directories, 21 files
It looks like we can just use the ollama binary in the bin directory.
Check the version.
$ bin/ollama --version
Warning: could not connect to a running Ollama instance
Warning: client version is 0.5.7
It warns that it could not connect to a running Ollama instance.
Check the help.
$ bin/ollama --help
Large language model runner

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.
Trying it out
Now let's run it.
Start Ollama.
$ bin/ollama serve
This terminal stays occupied, with Ollama running in the foreground.
In another terminal, run the run command with llama3.2 specified as the model.
$ bin/ollama run llama3.2
The terminal running Ollama shows that it received an access, and the model download begins.
Incidentally, llama3.2 is a model with 3B parameters and a size of about 2 GB.
Once the download finished, it was waiting for input.
success
>>> Send a message (/? for help)
Let's have it introduce itself.
>>> Could you introduce yourself?
I'm an artificial intelligence model known as Llama. Llama stands for "Large Language Model Meta AI."
We got a reply.
How about in Japanese?
>>> あなたの自己紹介をお願いします
私は、Llama と呼ばれる人工知能モデルです。Llama は、「Large Language Model Meta AI」の意味を持ちます。
Looks OK.
Let's look at the help, too.
>>> /?
Available Commands:
  /set            Set session variables
  /show           Show model information
  /load <model>   Load a session or model
  /save <model>   Save your current session
  /clear          Clear session context
  /bye            Exit
  /?, /help       Help for a command
  /? shortcuts    Help for keyboard shortcuts

Use """ to begin a multi-line message.
For example, let's look at the model information.
>>> /show info
  Model
    architecture        llama
    parameters          3.2B
    context length      131072
    embedding length    3072
    quantization        Q4_K_M

  Parameters
    stop    "<|start_header_id|>"
    stop    "<|end_header_id|>"
    stop    "<|eot_id|>"

  License
    LLAMA 3.2 COMMUNITY LICENSE AGREEMENT
    Llama 3.2 Version Release Date: September 25, 2024
Exit the prompt.
>>> /bye
Let's list the models.
$ bin/ollama list
NAME               ID              SIZE      MODIFIED
llama3.2:latest    a80c4f17acd5    2.0 GB    3 minutes ago
The model we downloaded earlier is shown.
Models are given a tag in the form :tag; if you don't specify one, it seems to be treated as :latest.
Show the running models.
$ bin/ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL
llama3.2:latest    a80c4f17acd5    3.5 GB    100% CPU     2 minutes from now
Let's try the REST API.
https://github.com/ollama/ollama/blob/v0.5.7/docs/api.md
Ollama listens on port 11434.
Let's try the chat API.
$ curl localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {
      "role": "user",
      "content": "Could you introduce yourself?"
    }
  ],
  "stream": false
}'
API / Generate a chat completion
You specify the model name with model; include the tag if you want a specific one. The response fields whose names end in duration are all in nanoseconds. The stream parameter defaults to true, which makes the response stream back; this time I set stream to false to disable streaming.
The result.
{"model":"llama3.2","created_at":"2025-01-25T09:57:15.343259499Z","message":{"role":"assistant","content":"I'm an artificial intelligence model known as Llama. Llama stands for \"Large Language Model Meta AI.\""},"done_reason":"stop","done":true,"total_duration":1651444573,"load_duration":26955139,"prompt_eval_count":30,"prompt_eval_duration":67000000,"eval_count":23,"eval_duration":1556000000}
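Since the duration fields are nanoseconds, the generation speed can be derived from eval_count and eval_duration. A small sketch using the numbers from the response above (the sed extraction is a quick hack, not a robust JSON parser):

```shell
# eval_count / eval_duration from the response above: 23 tokens in ~1.56 s.
response='{"eval_count":23,"eval_duration":1556000000}'

eval_count=$(printf '%s' "$response" | sed -n 's/.*"eval_count":\([0-9]*\).*/\1/p')
eval_duration=$(printf '%s' "$response" | sed -n 's/.*"eval_duration":\([0-9]*\).*/\1/p')

# tokens per second = eval_count / (eval_duration converted to seconds)
awk -v c="$eval_count" -v d="$eval_duration" \
  'BEGIN { printf "%.2f tokens/s\n", c / (d / 1e9) }'    # 14.78 tokens/s
```

So this box generates at roughly 15 tokens/s on CPU for this model.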
Let's ask in Japanese.
$ curl localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {
      "role": "user",
      "content": "あなたの自己紹介をしてください"
    }
  ],
  "stream": false
}'
{"model":"llama3.2","created_at":"2025-01-25T10:03:25.954437539Z","message":{"role":"assistant","content":"私は、AIです。名前はLlamaで、Metaという会社が開発しました。私は人々に情報を提供し、質問に答えるために設計されています。\n\n私は、自然言語処理と機械学習を使用して、人々の質問を受け付けます。私の知識は、Internetのある部分から取得していますが、私はそれらの情報をより包括的で正確なものにします。\n\n私は、さまざまなトピックに対する質問を受け付けることができます。それぞれに適した回答を提供しようとします。たとえば、歴史、科学、芸術など、人々が知りたいことのあらゆる事項について、質問してください。\n\n私は、英語、スペイン語、フランス語、ドイツ語、イタリア語で回答できます。それぞれに適した答えを提供しようとします。私の知識は、世界中の人々が通じる言語を使用しているためです。\n\n私には、限界があります。たとえば、私は現実世界の出来事や人々の個人的な経験についての詳細な情報を提供することができません。ただし、私は、多くの一般的な情報と知識を提供することができます。"},"done_reason":"stop","done":true,"total_duration":20073929911,"load_duration":23053351,"prompt_eval_count":33,"prompt_eval_duration":73000000,"eval_count":278,"eval_duration":19977000000}
For some reason the answer got longer…
Let's change the specified model.
$ curl localhost:11434/api/chat -d '{
  "model": "gemma2:2b",
  "messages": [
    {
      "role": "user",
      "content": "Could you introduce yourself?"
    }
  ],
  "stream": false
}'
This time it resulted in an error.
{"error":"model \"gemma2:2b\" not found, try pulling it first"}
It seems the model has to be downloaded first.
This time, instead of ollama run, let's try ollama pull.
$ bin/ollama pull gemma2:2b
Once the download finishes, run it again.
$ curl localhost:11434/api/chat -d '{
  "model": "gemma2:2b",
  "messages": [
    {
      "role": "user",
      "content": "Could you introduce yourself?"
    }
  ],
  "stream": false
}'
{"model":"gemma2:2b","created_at":"2025-01-25T10:15:38.091305328Z","message":{"role":"assistant","content":"Hello! I'm Gemma, an AI assistant created by the Gemma team. I can generate text and have conversations with you about a wide range of topics. As an open-weight model, I'm available for anyone to use and explore. \n\nKeep in mind:\n\n* **Text-only:** I communicate solely through written language.\n* **No real-time info or Google Search:** My knowledge is based on the data I was trained on, so I don't have access to current events or browse the internet.\n* **Made for fun \u0026 learning:** My goal is to be helpful and provide engaging conversation! \n\nWhat would you like to talk about today? \n"},"done_reason":"stop","done":true,"total_duration":11927308010,"load_duration":29656589,"prompt_eval_count":14,"prompt_eval_duration":76000000,"eval_count":147,"eval_duration":11820000000}
This time it replied. So there is no need to explicitly run ollama run first.
With this, I also confirmed that multiple models can be used side by side.
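Installed models can also be listed over the REST API: per the API docs linked above, GET /api/tags returns a JSON document with a models array, which is the REST-side counterpart of ollama list. A sketch of pulling out just the model names; since we can't assume a live server here, the block works on a trimmed-down, illustrative sample of the response shape:

```shell
# "ollama list" over the REST API. The real call would be:
#   curl -s localhost:11434/api/tags
# Here we use a trimmed-down, illustrative sample of the response shape.
sample='{"models":[{"name":"llama3.2:latest"},{"name":"gemma2:2b"}]}'

# Quick extraction of just the model names (a hack; use jq for real work):
printf '%s' "$sample" | grep -o '"name":"[^"]*"'
# "name":"llama3.2:latest"
# "name":"gemma2:2b"
```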
With a manual install, the data seems to be placed in the $HOME/.ollama directory.
$ tree ~/.ollama
$HOME/.ollama
├── history
├── id_ed25519
├── id_ed25519.pub
└── models
    ├── blobs
    │   ├── sha256-097a36493f718248845233af1d3fefe7a303f864fae13bc31a3a9704229378ca
    │   ├── sha256-2490e7468436707d5156d7959cf3c6341cc46ee323084cfa3fcf30fe76e397dc
    │   ├── sha256-34bb5ab01051a11372a91f95f3fbbc51173eed8e7f13ec395b9ae9b8bd0e242b
    │   ├── sha256-56bb8bd477a519ffa694fc449c2413c6f0e1d3b1c88fa7e3c9d88d3ae49d4dcb
    │   ├── sha256-7462734796d67c40ecec2ca98eddf970e171dbb6b370e43fd633ee75b69abe1b
    │   ├── sha256-966de95ca8a62200913e3f8bfbf84c8494536f1b94b49166851e76644e966396
    │   ├── sha256-a70ff7e570d97baaf4e62ac6e6ad9975e04caa6d900d3742d37698494479e0cd
    │   ├── sha256-dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff
    │   ├── sha256-e0a42594d802e5d31cdc786deb4823edb8adff66094d49de8fffe976d753e348
    │   ├── sha256-e18ad7af7efbfaecd8525e356861b84c240ece3a3effeb79d2aa7c0f258f71bd
    │   └── sha256-fcc5a6bec9daf9b561a68827b67ab6088e1dba9d1fa2a50d7bbcc8384e0a265d
    └── manifests
        └── registry.ollama.ai
            └── library
                ├── gemma2
                │   └── 2b
                └── llama3.2
                    └── latest

8 directories, 16 files
When installed via curl | bash, it seems to be the /usr/share/ollama directory instead.
From here, the next step would be switching to other models and so on.
I've got a rough feel for it now.
Bonus: installing with the install script
Finally, let me note the case of installing with the install script.
Install.
$ curl -fsSL https://ollama.com/install.sh | sh
Ollama gets registered as a systemd unit and is already running once the installation completes.
$ sudo systemctl status ollama
● ollama.service - Ollama Service
     Loaded: loaded (/etc/systemd/system/ollama.service; enabled; preset: enabled)
     Active: active (running) since Sat 2025-01-25 19:41:29 JST; 17s ago
   Main PID: 1355 (ollama)
      Tasks: 7 (limit: 9489)
     Memory: 9.1M (peak: 9.2M)
        CPU: 27ms
     CGroup: /system.slice/ollama.service
             └─1355 /usr/local/bin/ollama serve
The Ollama binary is placed at /usr/local/bin/ollama.
The systemd unit definition.
$ systemctl cat ollama
# /etc/systemd/system/ollama.service
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"

[Install]
WantedBy=default.target
So it is started with ollama serve.
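Since it runs under systemd, configuration is typically done by setting environment variables on the unit. For example, the Ollama docs describe OLLAMA_HOST (bind address) and OLLAMA_MODELS (model storage path); a sketch of an override via `sudo systemctl edit ollama`, where the values below are just examples:

```ini
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_MODELS=/var/lib/ollama/models"
```

After saving the override, apply it with `sudo systemctl restart ollama`.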
Let's download a model.
$ ollama pull llama3.2
Confirm.
$ curl localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {
      "role": "user",
      "content": "Could you introduce yourself?"
    }
  ],
  "stream": false
}'
{"model":"llama3.2","created_at":"2025-01-25T10:49:20.004748848Z","message":{"role":"assistant","content":"I'm an artificial intelligence model known as Llama. Llama stands for \"Large Language Model Meta AI.\""},"done_reason":"stop","done":true,"total_duration":10810343589,"load_duration":3632227311,"prompt_eval_count":30,"prompt_eval_duration":3548000000,"eval_count":23,"eval_duration":3466000000}
OK.
Downloaded models and other data are placed in the /usr/share/ollama directory.
$ tree -a /usr/share/ollama
/usr/share/ollama
├── .bash_logout
├── .bashrc
├── .ollama
│   ├── id_ed25519
│   ├── id_ed25519.pub
│   └── models
│       ├── blobs
│       │   ├── sha256-34bb5ab01051a11372a91f95f3fbbc51173eed8e7f13ec395b9ae9b8bd0e242b
│       │   ├── sha256-56bb8bd477a519ffa694fc449c2413c6f0e1d3b1c88fa7e3c9d88d3ae49d4dcb
│       │   ├── sha256-966de95ca8a62200913e3f8bfbf84c8494536f1b94b49166851e76644e966396
│       │   ├── sha256-a70ff7e570d97baaf4e62ac6e6ad9975e04caa6d900d3742d37698494479e0cd
│       │   ├── sha256-dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff
│       │   └── sha256-fcc5a6bec9daf9b561a68827b67ab6088e1dba9d1fa2a50d7bbcc8384e0a265d
│       └── manifests
│           └── registry.ollama.ai
│               └── library
│                   └── llama3.2
│                       └── latest
└── .profile

8 directories, 12 files
Uninstallation instructions are here.
Wrapping up
I tried Ollama as a local LLM runner.
It was quite easy to use, and it looks easy to work with multiple models. The backend being llama.cpp also made it feel somewhat(?) familiar.
It seems to run at a reasonable speed even in my current environment, so the next time I want to experiment, I think I'll take on local LLMs again.