Grok API Pricing 2026: Tokens, Free Tier, and Cost Calculation Examples

AI Free API Team

•Jul 2, 2026•11 min read•API Guides

You cannot estimate Grok API cost from a single per-token price: you need the official xAI price row, caching, output tokens, tools, Batch, Priority, storage, retries, and account limits.


As of July 2, 2026, the xAI documentation lists grok-4.3 as the base row for most current Grok API estimates: $1.25 per 1M input tokens, $0.20 per 1M cached input tokens, and $2.50 per 1M output tokens. The public xAI documentation does not promise a permanent, universal free API tier. The safe budgeting assumption is that credits, eligibility, region, billing mode, limits, and available models must be checked in your own xAI console, not in an old snippet or someone else's tutorial.

The real cost of the Grok API starts with the price row but does not end there. A single worksheet has to include fresh input tokens, cached input, output and reasoning tokens, Web Search, X Search, Code Execution, file search, RAG, Batch, Priority, storage, downloads, retry rate, rate-limit tier, and the team's spending limits. So the practical question is not "how much does the Grok API cost" but "how much does a successful task cost after all tools, retries, and limits".

Keep the contracts separate. xAI API pricing is not the same as a Grok app or X subscription, and a third-party free route is not an official xAI price row. Start with a small prepaid test, set a low or zero postpaid limit, and record the model ID, tokens, tools, errors, and accepted outputs. Scale only once the console logs match the worksheet.

Quick table before you spend

Question	Current safe answer	Budget action
What is the main Grok API price row?	The xAI docs list grok-4.3 at $1.25 input, $0.20 cached input, and $2.50 output per 1M tokens.	Use it as the base row and recheck the pricing page before publishing a figure.
Is there a free tier?	The public docs do not offer a permanent official free tier; the quickstart asks you to load credits before using the API.	Check credits, expiration, and billing mode in your own console.
What changes the invoice most?	Token mix, caching, response length, tools, Batch, Priority, storage, retries, and tier.	Build the worksheet around your workload instead of copying the sticker price.
How do I test more safely?	Small prepaid balance, low postpaid limit, full logging, pinned model ID.	Stop if the console row, model list, or token log diverges from the worksheet.

This table separates the official price row from workload cost. xAI publishes the official row. Your workload cost comes from prompts, retrieval, tools, output length, retries, and billing controls.

Current official xAI price rows

The official reference point is the xAI pricing documentation, not a third-party review or a provider's calculator. As of July 2, 2026, the relevant Chat API rows were:

Model row	Context in docs	Input / 1M	Cached input / 1M	Output / 1M	How to use
grok-4.3	1M	$1.25	$0.20	$2.50	Base row for most current text and image-input tasks.
grok-build-0.1	256k	$1.00	$0.20	$2.00	A lower-priced row, if it is available and the quality fits.
grok-4.20-multi-agent-0309	1M	$1.25	$0.20	$2.50	Specialized row; check availability and intended use.
grok-4.20-0309-reasoning	1M	$1.25	$0.20	$2.50	Budget reasoning workloads from measured output and behavior.
grok-4.20-0309-non-reasoning	1M	$1.25	$0.20	$2.50	Test quality and response length before choosing it for price.

The Grok 4.3 page also lists grok-4.3, grok-4.3-latest, and grok-latest, text and image input, text output, and a 1M context window. It carries a caveat as well: requests above 200K context may be billed at a different rate. For long documents and large retrieval payloads, that is not a minor detail but a separate test row.

Board of official Grok API price variables: grok-4.3, grok-build-0.1, tools, Batch, Priority, and storage.

Do not treat these rows as permanent. xAI can change prices, the model list, aliases, rate limits, regions, or availability. In a production budget, record the date, exact model ID, docs URL, team/account, sample size, and a link to the usage export or a console screenshot.

Tools, Batch, Priority, and storage

Most bad Grok API budgets count only tokens. The xAI pricing page also lists separate cost surfaces that, in agentic flows, can become the largest line on the bill.

Cost surface	Rule as checked on July 2, 2026	When it matters
Web Search	$5 за 1000 calls	Agents with current web evidence.
X Search	$5 за 1000 calls	Social, market or real-time X workflows.
Code Execution	$5 за 1000 calls	Coding, data and sandbox execution.
File Attachments search	$10 за 1000 calls	Large file and uploaded document workflows.
Collections Search / RAG	$2.50 за 1000 calls	Retrieval-heavy knowledge base tasks.
Batch API	20%-50% off standard text/language token rates, usually within 24 hours	Bulk classification, extraction and summarization without realtime latency.
Priority Processing	2x standard token rates after prompt caching discounts	Latency-sensitive paths where priority is explicitly enabled.
File storage	$0.025/GiB/day	Uploaded files retained across jobs.
Collection storage	$0.10/GiB/day	Stored retrieval collections.
Downloads	$0.20/GiB downloaded	Export or download-heavy workflows.

That is why a support bot, a RAG assistant, a coding assistant, and a research agent cannot share one formula with identical coefficients. Support can benefit from cached input and short output. RAG can lose out to file search fees. A research agent can have cheap tokens and expensive search calls. Batch jobs can get cheaper if 24-hour latency is acceptable.
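To make the Batch, Priority, and storage rules concrete, here is a minimal sketch. The function names and the sample workload are hypothetical; the rates are the ones in the table above as checked on July 2, 2026, and should be rechecked before use. It assumes Priority or Batch applies to the whole token subtotal.

```python
from typing import Tuple

# Rates copied from the table above (checked July 2, 2026); recheck before use.
FILE_STORAGE_PER_GIB_DAY = 0.025
COLLECTION_STORAGE_PER_GIB_DAY = 0.10
PRIORITY_MULTIPLIER = 2.0             # on token rates, after caching discounts
BATCH_DISCOUNT_RANGE = (0.20, 0.50)   # 20%-50% off standard token rates


def batch_token_cost_range(standard_token_cost: float) -> Tuple[float, float]:
    """Return (best case, worst case) token cost under the listed Batch discounts."""
    smallest, largest = BATCH_DISCOUNT_RANGE
    return (standard_token_cost * (1 - largest),
            standard_token_cost * (1 - smallest))


def priority_token_cost(standard_token_cost: float) -> float:
    """Token cost when Priority Processing is enabled for the whole workload."""
    return standard_token_cost * PRIORITY_MULTIPLIER


def storage_cost(file_gib: float, collection_gib: float, days: int) -> float:
    """Storage fees for files and collections retained for `days` days."""
    daily = (file_gib * FILE_STORAGE_PER_GIB_DAY
             + collection_gib * COLLECTION_STORAGE_PER_GIB_DAY)
    return daily * days


if __name__ == "__main__":
    # Hypothetical workload: $100 of standard token cost, 10 GiB of files
    # and 5 GiB of collections kept for 30 days.
    print(batch_token_cost_range(100.0))
    print(priority_token_cost(100.0))
    print(storage_cost(10, 5, 30))
```

With these placeholder inputs, the same $100 of tokens lands between roughly $50 and $80 under Batch and at $200 under Priority, while a month of retained storage adds about $22.50 regardless of token volume.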

What free tier means

The public xAI documentation, as checked on July 2, 2026, does not guarantee a permanent official free API tier. The quickstart says to sign up and load the account with credits to start using the API. That is not the same as "free for everyone".

Route	What it can mean	Safe wording
Official xAI API	Usage billed or credited inside your xAI team/account.	Check credits, eligibility, billing mode, model list, and rate limits in your console.
Console credits or promotions	Account-specific balance, trial or promotion.	Record it as the state of a specific account, not as a universal free tier.
Third-party free route	Provider sponsors, proxies, throttles or limits usage under its own contract.	This is a provider route, not an official xAI price row.
Grok app, X subscription, SuperGrok	Consumer access product.	Separate from API billing and the developer budget.

Russian-language material often mixes up "where to get an API key", free keys, Groq/Grok confusion, old model rows, and third-party calculators. It is more useful to say it plainly: a free experiment can be convenient, but a production budget starts with the official price row, your own console, and the provider contract, if one is used at all.

Grok API cost formula

Use an explicit formula, then replace the assumptions with logs:

estimated cost =
  fresh_input_count / 1,000,000 * input_price
+ cached_input_count / 1,000,000 * cached_input_price
+ output_count / 1,000,000 * output_price
+ tool_calls / 1,000 * tool_call_price
+ storage_gib_days * storage_price
+ downloads_gib * download_price
+ retry_cost
+ priority_surcharge - batch_discount

The formula guards against three mistakes. Caching is cheaper only on a real cache hit. Tools are not free just because they are hidden inside an agent. The base row does not equal the invoice if retries, schema repair, priority, storage retention, or downloads change the cost per accepted result.
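The formula can be sketched as a small Python function. Everything here is illustrative: `Rates`, `estimate_cost`, and the parameter names are hypothetical. Two modeling choices are assumptions rather than xAI rules: Priority or Batch is expressed as a multiplier on the token subtotal (`token_mode_factor`), and a retry is assumed to repeat the full tokens and tool calls of a task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD prices; defaults are the grok-4.3 row as checked on July 2, 2026."""
    input_per_m: float = 1.25
    cached_input_per_m: float = 0.20
    output_per_m: float = 2.50


def estimate_cost(fresh_input: int, cached_input: int, output: int,
                  rates: Rates = Rates(),
                  tool_calls: int = 0, tool_price_per_1k: float = 0.0,
                  storage_gib_days: float = 0.0, storage_price: float = 0.0,
                  downloads_gib: float = 0.0, download_price: float = 0.0,
                  retry_rate: float = 0.0,
                  token_mode_factor: float = 1.0) -> float:
    """Estimate spend in USD for a set of tasks.

    token_mode_factor: 1.0 standard, 2.0 Priority, 0.5-0.8 for Batch discounts.
    retry_rate: assumed fraction of tasks that are re-run in full.
    """
    token_cost = (fresh_input / 1_000_000 * rates.input_per_m
                  + cached_input / 1_000_000 * rates.cached_input_per_m
                  + output / 1_000_000 * rates.output_per_m) * token_mode_factor
    tool_cost = tool_calls / 1_000 * tool_price_per_1k
    # Assumption: a retry repeats both the tokens and the tool calls of a task.
    usage_cost = (token_cost + tool_cost) * (1 + retry_rate)
    return (usage_cost
            + storage_gib_days * storage_price
            + downloads_gib * download_price)
```

For example, `estimate_cost(1_000_000, 1_000_000, 1_000_000)` sums the three grok-4.3 token prices, and passing `token_mode_factor=2.0` doubles only the token part. Replace each argument with logged values once a sample run exists.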

The worksheet should include: model ID, input tokens per task, cached input tokens, output tokens, tool calls, retry rate, accepted output count, batch eligibility, priority requirement, storage retained, console limit, and fallback behavior. If you count first responses instead of accepted outputs, the budget almost always looks better than reality.
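The gap between first responses and accepted outputs is easy to show with a small record type. This is a sketch only: `WorksheetRow` and its field names are hypothetical, it keeps just the fields needed for the comparison, and the sample numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorksheetRow:
    model_id: str
    first_responses: int      # tasks attempted, one per first answer
    accepted_outputs: int     # tasks whose result was actually accepted
    total_spend_usd: float    # taken from the usage export or console

    def cost_per_first_response(self) -> float:
        return self.total_spend_usd / self.first_responses

    def cost_per_accepted_output(self) -> float:
        if self.accepted_outputs == 0:
            raise ValueError("no accepted outputs in this sample")
        return self.total_spend_usd / self.accepted_outputs


if __name__ == "__main__":
    # Invented sample: 1,000 tasks, 800 accepted, $40 total spend.
    row = WorksheetRow("grok-4.3", 1_000, 800, 40.0)
    print(row.cost_per_first_response(), row.cost_per_accepted_output())
```

In the invented sample, the per-task figure is about $0.04 when counted per first response and about $0.05 per accepted output, which is the number a budget should carry.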

Workload examples

These examples illustrate cost drivers; they do not promise a universal monthly price. Before scaling, replace the values with your own logs.

Grok API cost examples for support chat, documents, coding, and research workloads.

Support chat

A support bot is sensitive to output length and cache hits. A stable system prompt, tone rules, policy block, and tool instructions can cache well. The expensive part is often long answers, handoff summaries, and retries after rejected answers.

Assumption	Example value	Cost implication
Requests	100000 replies/month	High volume makes tiny per-task differences visible.
Fresh input	800 tokens/reply	Base input is usually manageable.
Cached input	1200 tokens/reply	Cache hit rate can materially reduce cost.
Output	350 tokens/reply	Output price matters more than many teams expect.
Tools	0 to 1 retrieval/search call/reply	Tool calls can overtake token savings if used on every reply.

Control rule: cache stable instructions, cap answer length, log accepted vs retried replies, and sample quality before routing every ticket.
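As a worked illustration, the table values can be priced at the grok-4.3 row. The sketch below is not a forecast. It assumes that cacheable tokens which miss the cache are billed at the fresh input rate, and that the optional retrieval call is Collections Search at $2.50 per 1,000 calls; both are assumptions, and the function names are hypothetical.

```python
REPLIES = 100_000
FRESH, CACHEABLE, OUTPUT = 800, 1_200, 350               # tokens per reply
INPUT_RATE, CACHED_RATE, OUTPUT_RATE = 1.25, 0.20, 2.50  # USD per 1M tokens
COLLECTIONS_SEARCH_PER_1K = 2.50  # assumption: retrieval uses Collections Search


def monthly_token_cost(cache_hit_rate: float) -> float:
    """Monthly token cost; cache misses are billed at the fresh input rate."""
    cached = CACHEABLE * cache_hit_rate
    fresh = FRESH + CACHEABLE * (1 - cache_hit_rate)
    per_reply = (fresh * INPUT_RATE
                 + cached * CACHED_RATE
                 + OUTPUT * OUTPUT_RATE) / 1_000_000
    return per_reply * REPLIES


def monthly_tool_cost(calls_per_reply: float) -> float:
    """Monthly tool fees for a given average number of calls per reply."""
    return REPLIES * calls_per_reply / 1_000 * COLLECTIONS_SEARCH_PER_1K


if __name__ == "__main__":
    for hit_rate in (1.0, 0.5, 0.0):
        print(hit_rate, round(monthly_token_cost(hit_rate), 2))
    print("one tool call per reply:", monthly_tool_cost(1))
```

Under these assumptions the token cost is about $211.50 per month with full cache hits and $337.50 with none, while one retrieval call on every reply adds $250, more than the tokens themselves in the fully cached case.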

Documents and RAG

Document workflows are input-heavy. One answer can include retrieved passages, file search, user query, policy text and a long response. The token row may look cheap until file search or collection search becomes frequent.

Assumption	Example value	Cost implication
Requests	20000 answers/month	Medium volume with large context can still be expensive.
Fresh input	6000 tokens/answer	Retrieval window is the main lever.
Cached input	1000 tokens/answer	Stable instructions help; retrieved chunks are usually fresh.
Output	700 tokens/answer	Citations and summaries increase output.
Tools	File Attachments search or Collections Search	Tool rows must be counted separately.

Control rule: retrieve fewer and better chunks, keep citations compact, set maximum context budget and compare answer quality before widening retrieval.
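The effect of the retrieval window and of the tool row can be sketched with the table values and the grok-4.3 prices. The function name is hypothetical, and one search call per answer is an assumption, since the table does not state a call count.

```python
ANSWERS = 20_000
CACHED, OUTPUT = 1_000, 700                              # tokens per answer
INPUT_RATE, CACHED_RATE, OUTPUT_RATE = 1.25, 0.20, 2.50  # USD per 1M tokens
TOOL_PRICE_PER_1K = {
    "file_attachments_search": 10.00,
    "collections_search": 2.50,
}


def monthly_cost(fresh_tokens_per_answer: int, tool: str,
                 calls_per_answer: float = 1.0) -> dict:
    """Split monthly cost into token and tool rows for one configuration."""
    tokens = ANSWERS * (fresh_tokens_per_answer * INPUT_RATE
                        + CACHED * CACHED_RATE
                        + OUTPUT * OUTPUT_RATE) / 1_000_000
    tools = ANSWERS * calls_per_answer / 1_000 * TOOL_PRICE_PER_1K[tool]
    return {"tokens": tokens, "tools": tools, "total": tokens + tools}


if __name__ == "__main__":
    print(monthly_cost(6_000, "file_attachments_search"))
    print(monthly_cost(3_000, "collections_search"))
```

With 6,000 fresh tokens per answer the token row is about $189 per month, and File Attachments search at one call per answer adds about $200. Halving the retrieval window to 3,000 tokens brings the token row to about $114, which is why the window and the tool choice are tested as separate levers.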

Coding assistant

Coding tasks can be cheap for suggestions and expensive for agentic loops. Cost drivers include files read, diffs generated, tests, Code Execution, explanation length and number of attempts before the patch is accepted.

Assumption	Example value	Cost implication
Tasks	5000 coding turns/month	Turns can hide multi-attempt work.
Fresh input	2500 tokens/turn	Files, diffs, tests and repo rules accumulate.
Cached input	500 tokens/turn	Reused repo instructions may help.
Output	900 tokens/turn	Patch explanations and structured output can be long.
Tools	Code Execution when enabled	Tool fees and failed test loops need their own line.

Control rule: measure successful task cost, not first-answer cost. Record failed tests, retries, review time and rollback cases.
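A minimal sketch of the difference between first-answer cost and successful-task cost, using the table values and the grok-4.3 prices. It assumes every attempt costs the same as a first turn, which understates loops where context grows. The attempt count is a made-up input, not a measurement, and the function names are hypothetical.

```python
TURNS = 5_000
FRESH, CACHED, OUTPUT = 2_500, 500, 900                  # tokens per turn
INPUT_RATE, CACHED_RATE, OUTPUT_RATE = 1.25, 0.20, 2.50  # USD per 1M tokens
CODE_EXECUTION_PER_1K = 5.00                             # USD per 1,000 calls


def cost_per_turn(code_exec_calls: int = 0) -> float:
    """Token cost of one turn plus any Code Execution calls it makes."""
    tokens = (FRESH * INPUT_RATE
              + CACHED * CACHED_RATE
              + OUTPUT * OUTPUT_RATE) / 1_000_000
    return tokens + code_exec_calls / 1_000 * CODE_EXECUTION_PER_1K


def cost_per_accepted_patch(attempts_per_accepted: float,
                            code_exec_calls_per_attempt: int = 0) -> float:
    """Assumes each attempt costs the same as a first turn."""
    return attempts_per_accepted * cost_per_turn(code_exec_calls_per_attempt)


if __name__ == "__main__":
    print("first answer:", cost_per_turn())
    print("accepted patch, 2.5 attempts with one execution each:",
          cost_per_accepted_patch(2.5, 1))
```

With these inputs a first answer costs about $0.0055 in tokens, while an accepted patch that takes 2.5 attempts with one Code Execution call each costs about $0.026, nearly five times more. Replace the attempt count with the logged ratio of attempts to accepted patches.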

Research agent

A research agent may look cheap in tokens and expensive in tools. Web Search, X Search, file search and long evidence summaries can dominate. It is also the workload where stale facts are most damaging.

Assumption	Example value	Cost implication
Reports	1000 reports/month	Lower volume can still be expensive per task.
Fresh input	4000 tokens/report	Query plan, evidence and instructions are substantial.
Cached input	800 tokens/report	Reusable report scaffolds may cache.
Output	1500 tokens/report	Evidence packets and summaries are output-heavy.
Tools	Multiple Web Search or X Search calls	Tool calls can dominate total cost.

Control rule: cap tool calls, require source quality, batch non-urgent work and stop if the agent cannot show which facts came from current official sources.
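Capping tool calls is simplest when the cap is enforced where the agent requests a search. The sketch below is a hypothetical per-report counter, not part of any xAI SDK; the class name, the cap of 8, and the $5 per 1,000 calls default (the Web Search and X Search row above) are illustrative.

```python
FRESH, CACHED, OUTPUT = 4_000, 800, 1_500                # tokens per report
INPUT_RATE, CACHED_RATE, OUTPUT_RATE = 1.25, 0.20, 2.50  # USD per 1M tokens
TOKEN_COST_PER_REPORT = (FRESH * INPUT_RATE
                         + CACHED * CACHED_RATE
                         + OUTPUT * OUTPUT_RATE) / 1_000_000


class ToolBudget:
    """Counts search calls for one report and refuses calls past the cap."""

    def __init__(self, max_calls: int, price_per_1k: float = 5.00):
        self.max_calls = max_calls
        self.price_per_1k = price_per_1k
        self.calls = 0

    def charge(self) -> bool:
        """Record one call if under the cap; return False once the cap is hit."""
        if self.calls >= self.max_calls:
            return False
        self.calls += 1
        return True

    @property
    def cost(self) -> float:
        return self.calls / 1_000 * self.price_per_1k


if __name__ == "__main__":
    budget = ToolBudget(max_calls=8)
    allowed = [budget.charge() for _ in range(10)]  # last two are refused
    print(allowed.count(True), budget.cost, TOKEN_COST_PER_REPORT)
```

With the table values, tokens cost about $0.009 per report, while eight search calls cost $0.04, so the tool row is already several times the token row at a modest cap.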

Rate limits and billing controls

xAI rate-limit docs say each API team has per-model RPS and TPM limits, and tiers depend on cumulative API spend since January 1, 2026. All consumed token types count toward TPM: prompt, completion, reasoning, cached prompt, image and audio tokens. Model-page numbers are useful, but your team console is the operational truth.
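Because every token type counts toward TPM, a throughput estimate has to sum all of them. The sketch below uses hypothetical dictionary keys (they are not field names from an xAI response), assumes the categories do not overlap, and takes the TPM limit as a placeholder to be read from your console.

```python
# Token types that count toward TPM, per the rate-limit description above.
TPM_TOKEN_TYPES = ("prompt", "completion", "reasoning",
                   "cached_prompt", "image", "audio")


def tpm_tokens(usage: dict) -> int:
    """Sum every token type that counts toward the TPM limit."""
    return sum(usage.get(kind, 0) for kind in TPM_TOKEN_TYPES)


def max_requests_per_minute(usage_per_request: dict, tpm_limit: int) -> int:
    """How many such requests fit in one minute before TPM is exhausted."""
    per_request = tpm_tokens(usage_per_request)
    if per_request == 0:
        raise ValueError("usage_per_request contains no counted tokens")
    return tpm_limit // per_request


if __name__ == "__main__":
    usage = {"prompt": 800, "cached_prompt": 1_200,
             "completion": 350, "reasoning": 150}
    # Placeholder limit; the real value is shown in the team console.
    print(tpm_tokens(usage), max_requests_per_minute(usage, 1_000_000))
```

Note that cached prompt tokens are cheap on the invoice but still consume TPM in this calculation, so a well-cached workload can hit the rate limit before it hits the budget.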

Billing management covers invoices, payment methods, prepaid credit balance, top-ups, historical usage, postpaid invoice preview and spending limits. A safe first production test looks like this:

1. Start with prepaid credits.
2. Set postpaid limit low or zero if you want prepaid-only behavior.
3. Log tokens, cached tokens, model ID, tool calls, retries, errors, latency and accepted result.
4. Compare actual spend against the worksheet after a small sample.
5. Increase limits only after spend, quality, latency and failure rate remain stable.

Grok API cost control checklist: docs verification, console credits, limits, logs, cache, Batch, and stale model rows.

The stop rule should live in code or operations, not in a monthly finance review. If request volume, tool calls, retry rate, output tokens or Priority use exceed the worksheet by your threshold, pause the route.
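A stop rule in code can be as small as a comparison between worksheet values and observed counters. This is a sketch under stated assumptions: the metric names, the 20% threshold, and the idea of calling it from a scheduled job are all illustrative choices.

```python
def breached_metrics(worksheet: dict, observed: dict,
                     threshold: float = 0.20) -> list:
    """Return the metrics whose observed value exceeds plan by more than threshold."""
    return [name for name, planned in worksheet.items()
            if observed.get(name, 0) > planned * (1 + threshold)]


def should_pause(worksheet: dict, observed: dict,
                 threshold: float = 0.20) -> bool:
    """True when at least one metric is over its allowed margin."""
    return bool(breached_metrics(worksheet, observed, threshold))


if __name__ == "__main__":
    worksheet = {"requests": 1_000, "tool_calls": 200,
                 "retries": 50, "output_tokens": 350_000}
    observed = {"requests": 1_100, "tool_calls": 300,
                "retries": 40, "output_tokens": 350_000}
    print(breached_metrics(worksheet, observed), should_pause(worksheet, observed))
```

Here only `tool_calls` is over the margin (300 against a planned 200), which is enough to pause the route. Metrics missing from the observed counters are treated as zero, so a broken logger will not trigger a pause; a real check may want to treat missing data as a failure instead.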

Old model rows after the May 15 retirement

The May 15 retirement notice is the freshness warning for Grok API pricing. xAI says several retired slugs redirect to grok-4.3 after May 15, 2026, and deprecated slug requests after that date are billed at grok-4.3 pricing. Old snippets centered on Grok 4.1 Fast, Grok 3 or old free credits are not safe budget inputs.

If you see	Treat as	Safer move
Grok 4.1 Fast as the current cheap default	Stale until official docs or console prove it.	Recheck pricing page and console model list.
Universal monthly free credits	Account-specific or provider-specific until xAI docs say otherwise.	Verify your credit balance and expiration.
Provider calculator with free usage	Separate provider contract.	Keep it out of official xAI pricing.
grok-latest alias	Convenient but movable.	Pin exact model ID for cost tests.

This is not bureaucracy. A retired slug may still answer while your price assumption is wrong. Budget from current official rows and your own console behavior.
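One way to catch a moved alias or a redirected slug during a cost test is to compare the model ID you requested with the one your logs record. The helper names below are hypothetical, and treating any ID ending in `-latest` as a movable alias is an assumption based on the aliases named in this article.

```python
def is_movable_alias(model_id: str) -> bool:
    """Assumption: IDs ending in '-latest' are aliases that can be repointed."""
    return model_id.endswith("-latest")


def check_model_row(requested: str, logged: str) -> list:
    """Return a list of problems; an empty list means the row is safe to budget."""
    problems = []
    if is_movable_alias(requested):
        problems.append(f"{requested} is an alias; pin an exact model ID")
    if logged != requested:
        problems.append(f"requested {requested} but logs show {logged}")
    return problems


if __name__ == "__main__":
    print(check_model_row("grok-4.3", "grok-4.3"))
    print(check_model_row("grok-latest", "grok-4.3"))
```

A mismatch between the requested and logged ID is the signal that the price row in the worksheet may no longer be the one being billed.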

Safe test plan

Before scaling Grok API usage, run a representative test.

Step	What to do	Pass signal
1. Pin model	Start with grok-4.3 or the exact row you intend to test.	Logs show expected model ID and team/account.
2. Set spend stop	Use prepaid credits and low postpaid limit.	A runaway test cannot create a large invoice.
3. Run real sample	Use real prompts, retrieval, tools and output format.	The sample resembles production work.
4. Count successful-task cost	Count accepted outputs, retries, tool calls and review time.	Cost per accepted result is clear.
5. Compare alternatives	Try lower row, Batch, cache, fewer tools or shorter output where quality allows.	The cheaper route still passes quality.
6. Scale gradually	Raise limits only after logs match worksheet.	Spend, quality, latency and failure rate remain stable.

For model-ID, alias, migration and rollout decisions, use the Grok 4.3 API guide. Keep the pricing worksheet focused on cost, billing and workload behavior.

Frequently asked questions

How much does the Grok API cost?

As of July 2, 2026, the xAI docs list grok-4.3 at $1.25 input, $0.20 cached input and $2.50 output per 1M tokens. Real cost also depends on output length, cache hits, tool calls, Batch or Priority mode, storage, retries and account limits.

Is the Grok API free?

The public xAI docs do not guarantee a permanent official free API tier. The quickstart asks users to load credits before using the API. Some accounts or providers may have credits or free routes, but those are separate from the official xAI price row.

Which model should I budget for first?

For general Grok API work, start with grok-4.3 unless your console and workload point to another available model. If you test grok-build-0.1 or a specialized grok-4.20 row, record quality, availability and output behavior.

Why is cached input cheaper?

Cached input discounts repeated prompt content when cache behavior applies. It helps stable prompts, policy blocks and templates, but it is not automatic savings. Measure cache hits before reducing budget.

Do tools change the price?

Yes. Web Search, X Search, Code Execution, File Attachments search and Collections Search/RAG have separate prices. Include them in the formula when the workflow uses them.

When should I use the Batch API?

Use Batch only when latency is flexible. xAI lists 20%-50% discounts for eligible text/language model batch work, usually within 24 hours, while image and video through Batch may still be billed at standard rates.

What do budgets most often miss?

Tool calls, retries, long outputs, Priority multiplier, file or collection storage, retired model redirects and postpaid limits. These usually matter more than another copied price snippet.