
Introduction to Ollama

1. Ollama Overview

"Get up and running with large language models locally."

The Ollama repository was created on June 26, 2023, so as of August 2024 the project had been under development for just over a year. In the not-too-distant future, large models are likely to appear in more and more edge products.

What is Ollama? Exactly what the official repository says: get up and running with large language models locally.

Ollama is an open-source serving tool for large language models, designed to let users run large models locally with minimal setup. After a simple installation, a single command is enough to download and start an open-source LLM. It provides a clean, easy-to-use command-line interface and a local server, built for developing LLM applications; users can easily download, run, and manage a wide range of open-source models. Unlike traditional LLM deployments that demand complex configuration and powerful hardware, Ollama lets users experience the power of LLMs on a consumer-grade PC.

Ollama automatically detects the local compute resources: if a GPU is available it is used first, which also makes inference faster; if there is no GPU, inference falls back to the CPU.

Ollama also greatly simplifies deploying and managing large language models in Docker containers, so users can get these models up and running locally very quickly.
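Behind the "single command" experience described above, Ollama runs a local REST server (on port 11434 by default). A minimal sketch of calling it from Python using only the standard library; it assumes Ollama is installed and a model such as llama3.2 has already been pulled, and the helper names are our own:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model: str, prompt: str) -> dict:
    """Build the JSON body expected by /api/generate (stream=False => one JSON reply)."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    """POST a prompt to the local Ollama server and return the generated text."""
    data = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Example (requires a running Ollama server):
#   print(generate("llama3.2", "Why is the sky blue?"))
```

Because everything is served from localhost, no data leaves the machine, which is the basis of the privacy point below.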

2. Ollama Features

  • Open source and free: Ollama and the models it supports are fully open source and free; users can access and use them at any time without paying anything.
  • Easy to use: Ollama needs no complex configuration or installation process; a few simple commands get it up and running, saving considerable time and effort.
  • Multi-platform: Ollama offers several installation options, supporting macOS, Linux, and Windows, and also provides a Docker image, covering the needs of different users.
  • Rich model library: Ollama supports many popular open-source LLMs, including DeepSeek-R1, Llama 3.3, Gemma 2, and Qwen2, and models can be downloaded and switched with a single command.
  • Complete packaging: Ollama bundles model weights, configuration, and data into a single package defined by a Modelfile, making model management simple and efficient.
  • Tool calling: with models such as Llama 3.1, Ollama supports tool calling, which lets a model respond to a prompt by invoking the tools it knows about and thereby carry out more complex tasks.
  • Low resource usage: Ollama optimizes setup and configuration details, including GPU usage, improving runtime efficiency so that models run smoothly even in resource-constrained environments.
  • Privacy: all data processing happens on the local machine, which protects user privacy.
  • Active community: Ollama has a large, active community where users can easily get help, share experience, and contribute to model development and improvement.
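To make the tool-calling point concrete, here is a sketch of the two pieces the application supplies: a tool schema (in the OpenAI-style function format that tool-capable models accept) and the local dispatch step that runs whatever tool the model requests. The weather tool is a made-up example, not part of Ollama:

```python
def get_current_weather(city: str) -> str:
    """A toy local tool the model may ask us to call (placeholder implementation)."""
    return f"Sunny in {city}"


# Schema advertised to the model alongside the chat request.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def dispatch(tool_call: dict) -> str:
    """Execute the tool the model requested and return its result as a string."""
    registry = {"get_current_weather": get_current_weather}
    name = tool_call["function"]["name"]
    args = tool_call["function"]["arguments"]
    return registry[name](**args)
```

In a real loop, the tool result is appended to the conversation and sent back to the model so it can produce the final answer.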

3. Supported Models

The full library of models Ollama supports is listed at https://ollama.com/library

Here are some popular models:

| Model | Tag | Parameters | Size | Download |
| --- | --- | --- | --- | --- |
| DeepSeek-R1 | | 7B | 4.7GB | ollama run deepseek-r1 |
| DeepSeek-R1 | | 671B | 404GB | ollama run deepseek-r1:671b |
| Llama 3.3 | | 70B | 43GB | ollama run llama3.3 |
| Llama 3.2 | | 3B | 2.0GB | ollama run llama3.2 |
| Llama 3.2 | | 1B | 1.3GB | ollama run llama3.2:1b |
| Llama 3.2 Vision | Vision | 11B | 7.9GB | ollama run llama3.2-vision |
| Llama 3.2 Vision | Vision | 90B | 55GB | ollama run llama3.2-vision:90b |
| Llama 3.1 | | 8B | 4.7GB | ollama run llama3.1 |
| Llama 3.1 | | 405B | 231GB | ollama run llama3.1:405b |
| Gemma 2 | | 2B | 1.6GB | ollama run gemma2:2b |
| Gemma 2 | | 9B | 5.5GB | ollama run gemma2 |
| Gemma 2 | | 27B | 16GB | ollama run gemma2:27b |
| mistral | | 7b | 4.1GB | ollama run mistral:7b |
| qwen | | 110b | 63GB | ollama run qwen:110b |
| Phi 4 | | 14B | 9.1GB | ollama run phi4 |
| codellama | Code | 70b | 39GB | ollama run codellama:70b |
| qwen2 | | 72b | 41GB | ollama run qwen2:72b |
| llava | Vision | 7b | 4.7GB | ollama run llava:7b |
| nomic-embed-text | Embedding | v1.5 | 274MB | ollama pull nomic-embed-text:v1.5 |
All supported models (data as of 2024-08-02):
| Model | Tag | Parameters | Size | Download |
| --- | --- | --- | --- | --- |
| llama3.1 | | 405b | 231GB | ollama run llama3.1:405b |
| llama3.1 | | 70b | 40GB | ollama run llama3.1:70b |
| llama3.1 | | 8b | 4.7GB | ollama run llama3.1:8b |
| gemma2 | | 27b | 16GB | ollama run gemma2:27b |
| gemma2 | | 9b | 5.4GB | ollama run gemma2:9b |
| gemma2 | | 2b | 1.6GB | ollama run gemma2:2b |
| mistral-nemo | | 12b | 7.1GB | ollama run mistral-nemo:12b |
| mistral-large | | 123b | 69GB | ollama run mistral-large:123b |
| qwen2 | | 72b | 41GB | ollama run qwen2:72b |
| qwen2 | | 7b | 4.4GB | ollama run qwen2:7b |
| qwen2 | | 1.5b | 935MB | ollama run qwen2:1.5b |
| qwen2 | | 0.5b | 352MB | ollama run qwen2:0.5b |
| deepseek-coder-v2 | Code | 236b | 133GB | ollama run deepseek-coder-v2:236b |
| deepseek-coder-v2 | Code | 16b | 8.9GB | ollama run deepseek-coder-v2:16b |
| phi3 | | 14b | 7.9GB | ollama run phi3:14b |
| phi3 | | 3.8b | 2.2GB | ollama run phi3:3.8b |
| mistral | | 7b | 4.1GB | ollama run mistral:7b |
| mixtral | | 8x22b | 80GB | ollama run mixtral:8x22b |
| mixtral | | 8x7b | 26GB | ollama run mixtral:8x7b |
| codegemma | Code | 7b | 5.0GB | ollama run codegemma:7b |
| codegemma | Code | 2b | 1.6GB | ollama run codegemma:2b |
| command-r | | 35b | 20GB | ollama run command-r:35b |
| command-r-plus | | 104b | 59GB | ollama run command-r-plus:104b |
| llava | Vision | 34b | 20GB | ollama run llava:34b |
| llava | Vision | 13b | 8.0GB | ollama run llava:13b |
| llava | Vision | 7b | 4.7GB | ollama run llava:7b |
| llama3 | | 70b | 40GB | ollama run llama3:70b |
| llama3 | | 8b | 4.7GB | ollama run llama3:8b |
| gemma | | 7b | 5.0GB | ollama run gemma:7b |
| gemma | | 2b | 1.7GB | ollama run gemma:2b |
| qwen | | 110b | 63GB | ollama run qwen:110b |
| qwen | | 72b | 41GB | ollama run qwen:72b |
| qwen | | 32b | 18GB | ollama run qwen:32b |
| qwen | | 14b | 8.2GB | ollama run qwen:14b |
| qwen | | 7b | 4.5GB | ollama run qwen:7b |
| qwen | | 4b | 2.3GB | ollama run qwen:4b |
| qwen | | 1.8b | 1.1GB | ollama run qwen:1.8b |
| qwen | | 0.5b | 395MB | ollama run qwen:0.5b |
| llama2 | | 70b | 39GB | ollama run llama2:70b |
| llama2 | | 13b | 7.4GB | ollama run llama2:13b |
| llama2 | | 7b | 3.8GB | ollama run llama2:7b |
| codellama | Code | 70b | 39GB | ollama run codellama:70b |
| codellama | Code | 34b | 19GB | ollama run codellama:34b |
| codellama | Code | 13b | 7.4GB | ollama run codellama:13b |
| codellama | Code | 7b | 3.8GB | ollama run codellama:7b |
| dolphin-mixtral | | 8x7b | 26GB | ollama run dolphin-mixtral:8x7b |
| dolphin-mixtral | | 8x22b | 80GB | ollama run dolphin-mixtral:8x22b |
| nomic-embed-text | Embedding | v1.5 | 274MB | ollama pull nomic-embed-text:v1.5 |
| llama2-uncensored | | 70b | 39GB | ollama run llama2-uncensored:70b |
| llama2-uncensored | | 7b | 3.8GB | ollama run llama2-uncensored:7b |
| phi | | 2.7b | 1.6GB | ollama run phi:2.7b |
| deepseek-coder | Code | 33b | 19GB | ollama run deepseek-coder:33b |
| deepseek-coder | Code | 6.7b | 3.8GB | ollama run deepseek-coder:6.7b |
| deepseek-coder | Code | 1.3b | 776MB | ollama run deepseek-coder:1.3b |
| dolphin-mistral | | 7b | 4.1GB | ollama run dolphin-mistral:7b |
| orca-mini | | 70b | 39GB | ollama run orca-mini:70b |
| orca-mini | | 13b | 7.4GB | ollama run orca-mini:13b |
| orca-mini | | 7b | 3.8GB | ollama run orca-mini:7b |
| orca-mini | | 3b | 2.0GB | ollama run orca-mini:3b |
| mxbai-embed-large | Embedding | 335m | 670MB | ollama pull mxbai-embed-large:335m |
| dolphin-llama3 | | 70b | 40GB | ollama run dolphin-llama3:70b |
| dolphin-llama3 | | 8b | 4.7GB | ollama run dolphin-llama3:8b |
| zephyr | | 141b | 80GB | ollama run zephyr:141b |
| zephyr | | 7b | 4.1GB | ollama run zephyr:7b |
| starcoder2 | Code | 15b | 9.1GB | ollama run starcoder2:15b |
| starcoder2 | Code | 7b | 4.0GB | ollama run starcoder2:7b |
| starcoder2 | Code | 3b | 1.7GB | ollama run starcoder2:3b |
| mistral-openorca | | 7b | 4.1GB | ollama run mistral-openorca:7b |
| yi | | 34b | 19GB | ollama run yi:34b |
| yi | | 9b | 5.0GB | ollama run yi:9b |
| yi | | 6b | 3.5GB | ollama run yi:6b |
| llama2-chinese | | 13b | 7.4GB | ollama run llama2-chinese:13b |
| llama2-chinese | | 7b | 3.8GB | ollama run llama2-chinese:7b |
| llava-llama3 | Vision | 8b | 5.5GB | ollama run llava-llama3:8b |
| vicuna | | 33b | 18GB | ollama run vicuna:33b |
| vicuna | | 13b | 7.4GB | ollama run vicuna:13b |
| vicuna | | 7b | 3.8GB | ollama run vicuna:7b |
| nous-hermes2 | | 34b | 19GB | ollama run nous-hermes2:34b |
| nous-hermes2 | | 10.7b | 6.1GB | ollama run nous-hermes2:10.7b |
| tinyllama | | 1.1b | 638MB | ollama run tinyllama:1.1b |
| wizard-vicuna-uncensored | | 30b | 18GB | ollama run wizard-vicuna-uncensored:30b |
| wizard-vicuna-uncensored | | 13b | 7.4GB | ollama run wizard-vicuna-uncensored:13b |
| wizard-vicuna-uncensored | | 7b | 3.8GB | ollama run wizard-vicuna-uncensored:7b |
| codestral | Code | 22b | 13GB | ollama run codestral:22b |
| starcoder | Code | 15b | 9.0GB | ollama run starcoder:15b |
| starcoder | Code | 7b | 4.3GB | ollama run starcoder:7b |
| starcoder | Code | 3b | 1.8GB | ollama run starcoder:3b |
| starcoder | Code | 1b | 726MB | ollama run starcoder:1b |
| wizardlm2 | | 8x22b | 80GB | ollama run wizardlm2:8x22b |
| wizardlm2 | | 7b | 4.1GB | ollama run wizardlm2:7b |
| openchat | | 7b | 4.1GB | ollama run openchat:7b |
| aya | | 35b | 20GB | ollama run aya:35b |
| aya | | 8b | 4.8GB | ollama run aya:8b |
| tinydolphin | | 1.1b | 637MB | ollama run tinydolphin:1.1b |
| stable-code | Code | 3b | 1.6GB | ollama run stable-code:3b |
| openhermes | | v2.5 | 4.1GB | ollama run openhermes:v2.5 |
| wizardcoder | Code | 33b | 19GB | ollama run wizardcoder:33b |
| wizardcoder | Code | python | 3.8GB | ollama run wizardcoder:python |
| codeqwen | Code | 7b | 4.2GB | ollama run codeqwen:7b |
| wizard-math | | 70b | 39GB | ollama run wizard-math:70b |
| wizard-math | | 13b | 7.4GB | ollama run wizard-math:13b |
| wizard-math | | 7b | 4.1GB | ollama run wizard-math:7b |
| granite-code | Code | 34b | 19GB | ollama run granite-code:34b |
| granite-code | Code | 20b | 12GB | ollama run granite-code:20b |
| granite-code | Code | 8b | 4.6GB | ollama run granite-code:8b |
| granite-code | Code | 3b | 2.0GB | ollama run granite-code:3b |
| stablelm2 | | 12b | 7.0GB | ollama run stablelm2:12b |
| stablelm2 | | 1.6b | 983MB | ollama run stablelm2:1.6b |
| neural-chat | | 7b | 4.1GB | ollama run neural-chat:7b |
| all-minilm | Embedding | 33m | 67MB | ollama pull all-minilm:33m |
| all-minilm | Embedding | 22m | 46MB | ollama pull all-minilm:22m |
| phind-codellama | Code | 34b | 19GB | ollama run phind-codellama:34b |
| dolphincoder | Code | 15b | 9.1GB | ollama run dolphincoder:15b |
| dolphincoder | Code | 7b | 4.2GB | ollama run dolphincoder:7b |
| nous-hermes | | 13b | 7.4GB | ollama run nous-hermes:13b |
| nous-hermes | | 7b | 3.8GB | ollama run nous-hermes:7b |
| sqlcoder | Code | 15b | 9.0GB | ollama run sqlcoder:15b |
| sqlcoder | Code | 7b | 4.1GB | ollama run sqlcoder:7b |
| llama3-gradient | | 70b | 40GB | ollama run llama3-gradient:70b |
| llama3-gradient | | 8b | 4.7GB | ollama run llama3-gradient:8b |
| starling-lm | | 7b | 4.1GB | ollama run starling-lm:7b |
| xwinlm | | 13b | 7.4GB | ollama run xwinlm:13b |
| xwinlm | | 7b | 3.8GB | ollama run xwinlm:7b |
| yarn-llama2 | | 13b | 7.4GB | ollama run yarn-llama2:13b |
| yarn-llama2 | | 7b | 3.8GB | ollama run yarn-llama2:7b |
| deepseek-llm | | 67b | 38GB | ollama run deepseek-llm:67b |
| deepseek-llm | | 7b | 4.0GB | ollama run deepseek-llm:7b |
| llama3-chatqa | | 70b | 40GB | ollama run llama3-chatqa:70b |
| llama3-chatqa | | 8b | 4.7GB | ollama run llama3-chatqa:8b |
| orca2 | | 13b | 7.4GB | ollama run orca2:13b |
| orca2 | | 7b | 3.8GB | ollama run orca2:7b |
| solar | | 10.7b | 6.1GB | ollama run solar:10.7b |
| samantha-mistral | | 7b | 4.1GB | ollama run samantha-mistral:7b |
| dolphin-phi | | 2.7b | 1.6GB | ollama run dolphin-phi:2.7b |
| stable-beluga | | 70b | 39GB | ollama run stable-beluga:70b |
| stable-beluga | | 13b | 7.4GB | ollama run stable-beluga:13b |
| stable-beluga | | 7b | 3.8GB | ollama run stable-beluga:7b |
| moondream | Vision | 1.8b | 1.7GB | ollama run moondream:1.8b |
| snowflake-arctic-embed | Embedding | 335m | 669MB | ollama pull snowflake-arctic-embed:335m |
| snowflake-arctic-embed | Embedding | 137m | 274MB | ollama pull snowflake-arctic-embed:137m |
| snowflake-arctic-embed | Embedding | 110m | 219MB | ollama pull snowflake-arctic-embed:110m |
| snowflake-arctic-embed | Embedding | 33m | 67MB | ollama pull snowflake-arctic-embed:33m |
| snowflake-arctic-embed | Embedding | 22m | 46MB | ollama pull snowflake-arctic-embed:22m |
| bakllava | Vision | 7b | 4.7GB | ollama run bakllava:7b |
| wizardlm-uncensored | | 13b | 7.4GB | ollama run wizardlm-uncensored:13b |
| deepseek-v2 | | 236b | 133GB | ollama run deepseek-v2:236b |
| deepseek-v2 | | 16b | 8.9GB | ollama run deepseek-v2:16b |
| medllama2 | | 7b | 3.8GB | ollama run medllama2:7b |
| yarn-mistral | | 7b | 4.1GB | ollama run yarn-mistral:7b |
| llama-pro | | instruct | 4.7GB | ollama run llama-pro:instruct |
| nous-hermes2-mixtral | | 8x7b | 26GB | ollama run nous-hermes2-mixtral:8x7b |
| meditron | | 70b | 39GB | ollama run meditron:70b |
| meditron | | 7b | 3.8GB | ollama run meditron:7b |
| nexusraven | | 13b | 7.4GB | ollama run nexusraven:13b |
| codeup | Code | 13b | 7.4GB | ollama run codeup:13b |
| llava-phi3 | Vision | 3.8b | 2.9GB | ollama run llava-phi3:3.8b |
| everythinglm | | 13b | 7.4GB | ollama run everythinglm:13b |
| glm4 | | 9b | 5.5GB | ollama run glm4:9b |
| codegeex4 | Code | 9b | 5.5GB | ollama run codegeex4:9b |
| magicoder | Code | 7b | 3.8GB | ollama run magicoder:7b |
| stablelm-zephyr | | 3b | 1.6GB | ollama run stablelm-zephyr:3b |
| codebooga | Code | 34b | 19GB | ollama run codebooga:34b |
| mistrallite | | 7b | 4.1GB | ollama run mistrallite:7b |
| wizard-vicuna | | 13b | 7.4GB | ollama run wizard-vicuna:13b |
| duckdb-nsql | Code | 7b | 3.8GB | ollama run duckdb-nsql:7b |
| megadolphin | | 120b | 68GB | ollama run megadolphin:120b |
| goliath | | 120b-q4_0 | 66GB | ollama run goliath:120b-q4_0 |
| notux | | 8x7b | 26GB | ollama run notux:8x7b |
| falcon2 | | 11b | 6.4GB | ollama run falcon2:11b |
| open-orca-platypus2 | | 13b | 7.4GB | ollama run open-orca-platypus2:13b |
| notus | | 7b | 4.1GB | ollama run notus:7b |
| dbrx | | 132b | 74GB | ollama run dbrx:132b |
| internlm2 | | 7b | 4.5GB | ollama run internlm2:7b |
| alfred | | 40b | 24GB | ollama run alfred:40b |
| llama3-groq-tool-use | | 70b | 40GB | ollama run llama3-groq-tool-use:70b |
| llama3-groq-tool-use | | 8b | 4.7GB | ollama run llama3-groq-tool-use:8b |
| mathstral | | 7b | 4.1GB | ollama run mathstral:7b |
| firefunction-v2 | | 70b | 40GB | ollama run firefunction-v2:70b |
| nuextract | | 3.8b | 2.2GB | ollama run nuextract:3.8b |

For the most up-to-date list of supported models, see: https://ollama.com/search

Note: running a 7B model requires at least 8GB of RAM, a 13B model at least 16GB, and a 33B model at least 32GB.
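The memory note above can be captured as a small lookup. Only the three stated data points (7B→8GB, 13B→16GB, 33B→32GB) come from the text; treating them as thresholds for in-between sizes is our own assumption:

```python
def min_ram_gb(params_billion: float) -> int:
    """Minimum RAM in GB suggested for a model of the given parameter count.

    Encodes the rule of thumb stated above: 7B -> 8GB, 13B -> 16GB, 33B -> 32GB.
    Sizes between the stated points are rounded up to the next threshold
    (our assumption, not from the source).
    """
    if params_billion <= 7:
        return 8
    if params_billion <= 13:
        return 16
    return 32
```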

4. Common Ollama Commands

Typing ollama in a terminal prints the following:

```bash
Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.
```

In summary:

| Command | Description |
| --- | --- |
| ollama serve | Start Ollama |
| ollama create | Create a model from a Modelfile |
| ollama show | Show information for a model |
| ollama run | Run a model |
| ollama stop | Stop a running model |
| ollama pull | Pull a model from a registry |
| ollama push | Push a model to a registry |
| ollama list | List models |
| ollama ps | List running models |
| ollama cp | Copy a model |
| ollama rm | Remove a model |
| ollama help | Show help for any command |

| Flag | Description |
| --- | --- |
| -h, --help | Show help for ollama |
| -v, --version | Show version information |
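When scripting these commands, the table printed by ollama list can be parsed from Python via subprocess. This is a sketch assuming the usual NAME / ID / SIZE / MODIFIED column layout; the helper names are our own:

```python
import subprocess


def parse_ollama_list(output: str) -> list:
    """Parse the table printed by `ollama list` (columns assumed NAME, ID, SIZE, MODIFIED)."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header line
        parts = line.split()
        rows.append({"name": parts[0], "id": parts[1], "size": " ".join(parts[2:4])})
    return rows


def list_models() -> list:
    """Run `ollama list` and return the installed models (requires Ollama on PATH)."""
    out = subprocess.run(["ollama", "list"], capture_output=True, text=True).stdout
    return parse_ollama_list(out)
```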

In an interactive session, a multi-line prompt can be opened with """ and closed with a second """.

To exit an interactive model session, type /bye.

Note: the Ollama background process keeps running after you exit a session. On Windows, to terminate all Ollama-related processes, run the following in PowerShell:

```powershell
Get-Process | Where-Object {$_.ProcessName -like '*ollama*'} | Stop-Process
```
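On Linux or macOS, a sketch of the equivalent cleanup (assuming the standard install, which on Linux registers a systemd service named ollama):

```shell
# Stop the background service if present (Linux installs register one):
#   sudo systemctl stop ollama
# Then kill any remaining ollama processes by name; '|| true' keeps the
# command successful even when no such process exists.
pkill ollama || true
```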