Translation Credits
Per-model rates for AI token billing
SwiftIn bills your AI quota with a per-model input/output multiplier formula. Each backend model has two coefficients: one for input tokens (what you send) and one for output tokens (what the model returns). Each token count is multiplied by its coefficient, and the sum of the two results is what gets deducted from your monthly quota.
The formula
Every successful AI translation deducts an amount calculated like this:
Debit = prompt_tokens × inputMultiplier + completion_tokens × outputMultiplier
Per-model multipliers
Multipliers are calibrated in proportion to each provider's real price per 1M tokens. Output is billed at a higher rate than input on every model because that is how providers price it (output is typically 2-8× more expensive than input at the API level).
| Model | Input × | Output × | 5M quota ≈ |
|---|---|---|---|
| DeepSeek V4 Flash | 0.2 | 0.4 | ~18.7M |
| GPT-5 nano | 0.1 | 0.5 | ~21.4M |
| Llama 3.3 70B | 0.15 | 0.4 | ~21.4M |
| Claude Haiku 3 | 0.3 | 1.5 | ~7.1M |
| Gemini 3.1 Flash-Lite | 0.3 | 1.8 | ~6.3M |
| Grok 4.3 | 1.5 | 3.0 | ~2.5M |
The "5M quota ≈" column estimates how many real provider tokens (input and output combined) a Pro user's 5M-credit quota covers on each model, assuming a typical 2:1 input-to-output ratio. Actual reach depends on your input/output mix.
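The estimates in that column can be reproduced from the multipliers alone. The following is a minimal sketch, assuming a quota of 5,000,000 credits and the 2:1 input-to-output mix described above. The function name `estimated_reach` and the `MULTIPLIERS` dictionary are illustrative and not part of any SwiftIn API.

```python
# Multipliers copied from the table above as (input, output) pairs.
MULTIPLIERS = {
    "DeepSeek V4 Flash": (0.2, 0.4),
    "GPT-5 nano": (0.1, 0.5),
    "Llama 3.3 70B": (0.15, 0.4),
    "Claude Haiku 3": (0.3, 1.5),
    "Gemini 3.1 Flash-Lite": (0.3, 1.8),
    "Grok 4.3": (1.5, 3.0),
}


def estimated_reach(quota, input_mult, output_mult,
                    input_parts=2.0, output_parts=1.0):
    """Total provider tokens (input + output) that `quota` credits cover.

    `input_parts:output_parts` is the assumed token mix (2:1 by default).
    """
    # Credits consumed by one "unit" of the mix, e.g. 2 input + 1 output tokens.
    credits_per_unit = input_parts * input_mult + output_parts * output_mult
    tokens_per_unit = input_parts + output_parts
    return quota * tokens_per_unit / credits_per_unit


if __name__ == "__main__":
    for model, (inp, out) in MULTIPLIERS.items():
        reach = estimated_reach(5_000_000, inp, out)
        print(f"{model}: ~{reach / 1e6:.2f}M tokens")
```

Passing a different mix, for example `input_parts=1.0, output_parts=1.0`, shows how quickly reach shrinks on models with a high output multiplier.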
Worked example
Translate one page where the model reports prompt_tokens=5000 and completion_tokens=2000. On GPT-5 nano (0.1 / 0.5), the debit is 5000 × 0.1 + 2000 × 0.5 = 500 + 1000 = 1500 credits. The same page translated on Grok 4.3 (1.5 / 3.0) debits 5000 × 1.5 + 2000 × 3.0 = 7500 + 6000 = 13,500 credits. That is nine times more for this input/output mix: Grok's multipliers are 15× higher on input and 6× higher on output, reflecting its higher price at the provider level.
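The same arithmetic can be written as a short function. This is a sketch of the formula as stated on this page, not SwiftIn's actual billing code. The function name `debit` is hypothetical, and whether the real system rounds fractional credits is not specified here, so the sketch rounds only for display.

```python
def debit(prompt_tokens, completion_tokens, input_mult, output_mult):
    """Credits deducted for one successful translation, per the formula above."""
    return prompt_tokens * input_mult + completion_tokens * output_mult


# The page from the worked example: 5000 prompt tokens, 2000 completion tokens.
nano = debit(5000, 2000, input_mult=0.1, output_mult=0.5)
grok = debit(5000, 2000, input_mult=1.5, output_mult=3.0)

print(f"GPT-5 nano: {round(nano)} credits")
print(f"Grok 4.3:   {round(grok)} credits")
print(f"Ratio:      {round(grok / nano)}x")
```

Because the two multipliers scale independently, the ratio between models changes with the mix: a request that is mostly input leans toward the 15× input gap, while one that is mostly output leans toward the 6× output gap.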
Why models burn at different rates
AI providers price input tokens (what you send) and output tokens (what they generate) at different rates, and those rates vary by model. A premium model like Grok 4.3 costs several times more per token than DeepSeek V4 Flash (its multipliers in the table above are 7.5× higher on both input and output). If we billed all models at a flat rate, premium-model users would be subsidised by cheap-model users. The per-model multiplier reflects this directly, so you spend your quota in proportion to the real cost of the choice you made.
When nothing is debited
Failed provider calls, same-language detection, local cache hits, free engines (Google / Bing), and Smart Context cache hits all deduct zero from your quota. The full list lives on /docs/plans-limits — "When you're charged — and when you're not".
See plans and pricing