Cheap LLM API Models for Chatbots

Chatbot workloads can become output-token heavy, so this guide highlights models with low standardized workload costs, clear output pricing, and practical alternatives.

Models listed: 50

Cost example tokens: 1M input + 500K output

Normalized prices: USD per 1M tokens

Quick shortlist

Start with Ring-2.6-1T (free).

This guide is sorted by sample workload cost, so the first rows make the strongest budget shortlist to carry into model-quality testing.

Lead model: 🔥 Ring-2.6-1T (free)

Provider: inclusionAI

Sample cost: $0

Context: 262.14K tokens

The ranking is a discovery aid, not a final recommendation. Always compare the model against your workload and verify provider pricing before production use.

How to read this ranking

Models are sorted by estimated cost for 1 million input tokens and 500 thousand output tokens. Use this page when your first constraint is API spend.

Estimate your workload cost

This estimate applies normalized public API pricing per 1M tokens to your monthly input and output token counts. It is a planning aid, not a billing quote. Verify provider pricing, limits, and terms before production use.

Model Ranking

Model	Provider	Prompt (USD/1M)	Output (USD/1M)	Example Cost	Your Cost	Context (tokens)	Rank	Release
🔥Ring-2.6-1T (free)	inclusionAI	$0	$0	$0	$0	262.14K	#10	2026-05-08
🔥Nemotron 3 Super (free)	NVIDIA	$0	$0	$0	$0	262.14K	#12	2026-03-11
🔥Owl Alpha	OpenRouter	$0	$0	$0	$0	1.05M	#17	2026-04-28
CoBuddy (free)	Baidu Qianfan	$0	$0	$0	$0	131.07K	Unranked	2026-05-06
Nemotron 3 Nano Omni (free)	NVIDIA	$0	$0	$0	$0	256K	Unranked	2026-04-28
Laguna XS.2 (free)	Poolside	$0	$0	$0	$0	131.07K	Unranked	2026-04-28
Laguna M.1 (free)	Poolside	$0	$0	$0	$0	131.07K	Unranked	2026-04-28
Qianfan-OCR-Fast (free)	Baidu	$0	$0	$0	$0	65.54K	Unranked	2026-04-20
Gemma 4 26B A4B (free)	Google	$0	$0	$0	$0	262.14K	Unranked	2026-04-03
Gemma 4 31B (free)	Google	$0	$0	$0	$0	262.14K	Unranked	2026-04-02
Trinity Large Thinking (free)	Arcee AI	$0	$0	$0	$0	262.14K	Unranked	2026-04-01
Lyria 3 Pro Preview	Google	$0	$0	$0	$0	1.05M	Unranked	2026-03-30
Lyria 3 Clip Preview	Google	$0	$0	$0	$0	1.05M	Unranked	2026-03-30
MiniMax M2.5 (free)	MiniMax	$0	$0	$0	$0	196.61K	Unranked	2026-02-12
Free Models Router	OpenRouter	$0	$0	$0	$0	200K	Unranked	2026-02-01
LFM2.5-1.2B-Thinking (free)	LiquidAI	$0	$0	$0	$0	32.77K	Unranked	2026-01-20
LFM2.5-1.2B-Instruct (free)	LiquidAI	$0	$0	$0	$0	32.77K	Unranked	2026-01-20
Nemotron 3 Nano 30B A3B (free)	NVIDIA	$0	$0	$0	$0	256K	Unranked	2025-12-14
Nemotron Nano 12B 2 VL (free)	NVIDIA	$0	$0	$0	$0	128K	Unranked	2025-10-28
Qwen3 Next 80B A3B Instruct (free)	Qwen	$0	$0	$0	$0	262.14K	Unranked	2025-09-11
Nemotron Nano 9B V2 (free)	NVIDIA	$0	$0	$0	$0	128K	Unranked	2025-09-05
gpt-oss-120b (free)	OpenAI	$0	$0	$0	$0	131.07K	Unranked	2025-08-05
gpt-oss-20b (free)	OpenAI	$0	$0	$0	$0	131.07K	Unranked	2025-08-05
GLM 4.5 Air (free)	Z.ai	$0	$0	$0	$0	131.07K	Unranked	2025-07-25
Qwen3 Coder 480B A35B (free)	Qwen	$0	$0	$0	$0	262K	Unranked	2025-07-23
Uncensored (free)	Venice	$0	$0	$0	$0	32.77K	Unranked	2025-07-09
Llama 3.3 70B Instruct (free)	Meta	$0	$0	$0	$0	65.54K	Unranked	2024-12-06
Llama 3.2 3B Instruct (free)	Meta	$0	$0	$0	$0	131.07K	Unranked	2024-09-25
Hermes 3 405B Instruct (free)	Nous	$0	$0	$0	$0	131.07K	Unranked	2024-08-16
Mistral Nemo	Mistral	$0.02	$0.03	$0.04	$0.04	131.07K	Unranked	2024-07-19
Llama 3.1 8B Instruct	Meta	$0.02	$0.05	$0.04	$0.04	16.38K	Unranked	2024-07-23
Llama 3 8B Instruct	Meta	$0.04	$0.04	$0.06	$0.06	8.19K	Unranked	2024-04-18
Llama 3 8B Lunaris	Sao10K	$0.04	$0.05	$0.07	$0.07	8.19K	Unranked	2024-08-13
Granite 4.0 Micro	IBM	$0.017	$0.11	$0.07	$0.07	131K	Unranked	2025-10-20
Gemma 3 4B	Google	$0.04	$0.08	$0.08	$0.08	131.07K	Unranked	2025-03-13
LFM2-24B-A2B	LiquidAI	$0.03	$0.12	$0.09	$0.09	32.77K	Unranked	2026-02-25
Mistral Small 3	Mistral	$0.05	$0.08	$0.09	$0.09	32.77K	Unranked	2025-01-30
Qwen2.5 7B Instruct	Qwen	$0.04	$0.1	$0.09	$0.09	32.77K	Unranked	2024-10-16
MythoMax 13B	gryphe	$0.06	$0.06	$0.09	$0.09	4.1K	Unranked	2023-07-02
Granite 4.1 8B	IBM	$0.05	$0.1	$0.1	$0.1	131.07K	Unranked	2026-04-30
gpt-oss-20b	OpenAI	$0.03	$0.14	$0.1	$0.1	131.07K	Unranked	2025-08-05
Gemma 3 12B	Google	$0.04	$0.13	$0.11	$0.11	131.07K	Unranked	2025-03-13
Nova Micro 1.0	Amazon	$0.035	$0.14	$0.11	$0.11	128K	Unranked	2024-12-05
Command R7B (12-2024)	Cohere	$0.0375	$0.15	$0.11	$0.11	128K	Unranked	2024-12-14
Qwen3.5-9B	Qwen	$0.04	$0.15	$0.11	$0.11	262.14K	Unranked	2026-03-10
Trinity Mini	Arcee AI	$0.045	$0.15	$0.12	$0.12	131.07K	Unranked	2025-12-01
Nemotron Nano 9B V2	NVIDIA	$0.04	$0.16	$0.12	$0.12	131.07K	Unranked	2025-09-05
Gemma 3n 4B	Google	$0.06	$0.12	$0.12	$0.12	32.77K	Unranked	2025-05-20
Qwen3 235B A22B Instruct 2507	Qwen	$0.071	$0.1	$0.12	$0.12	262.14K	Unranked	2025-07-21
Llama 3.2 1B Instruct	Meta	$0.027	$0.2	$0.13	$0.13	60K	Unranked	2024-09-25

Pricing FAQ

How is the sample workload cost calculated?

The sample workload uses 1 million input tokens plus 500 thousand output tokens, then applies each model's normalized USD price per 1 million tokens.
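
As a rough illustration of that calculation, the sketch below applies it to one row of the ranking (Granite 4.0 Micro, listed at $0.017 prompt and $0.11 output per 1M tokens). The helper name workload_cost and its parameters are hypothetical, chosen only for this example; they are not part of any provider API.

```python
TOKENS_PER_UNIT = 1_000_000  # normalized prices are quoted per 1M tokens


def workload_cost(prompt_price, output_price,
                  input_tokens=1_000_000, output_tokens=500_000):
    """Estimate USD cost for a workload (hypothetical helper).

    Prices are USD per 1M tokens; the defaults match the guide's
    sample workload of 1M input tokens plus 500K output tokens.
    """
    input_cost = input_tokens / TOKENS_PER_UNIT * prompt_price
    output_cost = output_tokens / TOKENS_PER_UNIT * output_price
    return input_cost + output_cost


# Granite 4.0 Micro row from the ranking table
granite_cost = workload_cost(0.017, 0.11)
print(f"${granite_cost:.3f}")  # prints $0.072
```

The unrounded result is about $0.072 (1 x $0.017 + 0.5 x $0.11), which the table shows rounded to $0.07.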

Why do input and output token prices matter separately?

Many applications are output-token heavy, while retrieval and classification workloads may be input-token heavy. Comparing both prices helps avoid picking a model that is cheap for the wrong workload shape.
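
To make the workload-shape point concrete, the sketch below compares two rows from the ranking under an input-heavy and an output-heavy workload. The prices are the normalized values from the table; the token volumes are made-up illustrations, and the helper names cost and cheapest are hypothetical.

```python
# Normalized prices from the ranking table, USD per 1M tokens: (prompt, output)
PRICES = {
    "Granite 4.0 Micro": (0.017, 0.11),
    "Llama 3 8B Instruct": (0.04, 0.04),
}


def cost(model, input_tokens, output_tokens):
    """Estimated USD cost of a workload for one model."""
    prompt_price, output_price = PRICES[model]
    return (input_tokens * prompt_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens, output_tokens):
    """Name of the lowest-cost model for the given token volumes."""
    return min(PRICES, key=lambda m: cost(m, input_tokens, output_tokens))


# Illustrative volumes only (assumptions, not measured traffic)
input_heavy = cheapest(10_000_000, 500_000)    # e.g. retrieval or classification
output_heavy = cheapest(1_000_000, 5_000_000)  # e.g. long chatbot replies

print(input_heavy)   # prints Granite 4.0 Micro
print(output_heavy)  # prints Llama 3 8B Instruct
```

With these volumes, the model with the lower prompt price wins the input-heavy case (about $0.23 versus $0.42), while the model with the lower output price wins the output-heavy case (about $0.24 versus $0.57).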

Should I verify prices before production use?

Yes. AI Model Matrix normalizes public pricing metadata for comparison, but provider availability, limits, and prices can change. Always verify the final contract or provider dashboard before production use.