
AI engines & cascade

How translation engines work

Swiftin uses different translation engines depending on your plan. AI models give the best quality; Google and Bing handle volume and act as a free fallback.

Available engines

Gemini 3.1 Flash-Lite

Free with quota / Pro / Team

Google's lightweight AI model. Context-aware, supports translation styles.

Llama 3.3 70B

Free with quota / Pro / Team

Meta's open-source model, served on Groq's specialized hardware. The fastest in our mix, with sub-second latency.

GPT-5 nano

Free with quota / Pro / Team

OpenAI's smallest GPT-5 model. Reliable for general translation, cost-effective.

Claude Haiku 4.5

Free with quota / Pro / Team

Anthropic Claude — balanced cost and quality, idiomatic phrasing.

DeepSeek v4 Flash

Free with quota / Pro / Team

Strong on technical and Asian-language content. Called directly from China; automatic prompt caching discounts repeated prefixes.

Grok 4.3

Free with quota / Pro / Team

xAI's fast model. Casual tone, good for social media context.

Google Translate

Free / Guest

Standard machine translation. Reliable, fast, no API key needed. Works for guests too.

Bing Translator

Free / Guest

Microsoft's engine. Used as a fallback when Google rate-limits. Lower concurrency.

More models coming

We're testing and adding new AI engines. Stay tuned.

How fallback works

When one engine fails or hits a limit, Swiftin falls back automatically — you don't hit an error wall.

Guest path

google → google-anon → bing. Each step is retried on failure.

Paid AI path

Selected AI model → other AI models → Google → Bing. Failures cascade through cheaper options.
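The two paths above can be read as an ordered list of engines that are tried one by one. The following is a minimal sketch of that idea, not Swiftin's actual code: the AI engine ids, `cascadeOrder`, `translateWithFallback`, and the `retriesPerStep` parameter are all hypothetical names chosen for illustration. Only `google`, `google-anon`, and `bing` are ids taken from this page.

```typescript
type EngineId = string;

// Hypothetical ids for the six AI engines listed on this page.
const AI_ENGINES: EngineId[] = ["gemini", "llama", "gpt", "claude", "deepseek", "grok"];
// Guest path as documented above.
const GUEST_PATH: EngineId[] = ["google", "google-anon", "bing"];

// Build the order in which engines are tried for one request.
// Pass null when no AI engine is selected (guest path).
function cascadeOrder(selectedAi: EngineId | null): EngineId[] {
  if (selectedAi === null) return [...GUEST_PATH];
  const otherAi = AI_ENGINES.filter((id) => id !== selectedAi);
  return [selectedAi, ...otherAi, "google", "bing"];
}

type Translate = (engine: EngineId, text: string) => Promise<string>;

// Try each engine in order; the first successful call wins.
// retriesPerStep is an assumed knob, not a documented setting.
async function translateWithFallback(
  order: EngineId[],
  text: string,
  translate: Translate,
  retriesPerStep = 1,
): Promise<{ engine: EngineId; result: string }> {
  for (const engine of order) {
    for (let attempt = 0; attempt <= retriesPerStep; attempt++) {
      try {
        return { engine, result: await translate(engine, text) };
      } catch {
        // Ignore the failure and move on to the next attempt or engine.
      }
    }
  }
  throw new Error("all engines failed");
}
```

Keeping the order as plain data makes the cascade easy to inspect: `cascadeOrder("deepseek")` starts with the selected model, continues with the other AI models, and ends with Google and Bing.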

When AI quota runs out

Once a Free user has used up the 50K monthly AI tokens, AI engines silently downgrade to Google + Bing. Pro and Team get larger quotas: 5M for Pro, 10M per seat for Team.
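As a rough illustration of the downgrade, the check below compares tokens used in the current cycle against the plan quota and drops the AI engine from the list once the quota is spent. The quota figures come from this page; the function name `enginesForRequest`, the engine ids, and the per-seat accounting for Team are assumptions made for the sketch.

```typescript
type Plan = "free" | "pro" | "team";

// Monthly AI token quota per plan as stated on this page (Team: per seat).
const MONTHLY_AI_TOKENS: Record<Plan, number> = {
  free: 50_000,
  pro: 5_000_000,
  team: 10_000_000,
};

const FREE_ENGINES = ["google", "bing"];

// Return the engines a request may use, in order.
// tokensUsed is the usage of one user (one seat) in the current cycle.
function enginesForRequest(plan: Plan, tokensUsed: number, aiEngine: string): string[] {
  const quotaLeft = MONTHLY_AI_TOKENS[plan] - tokensUsed;
  // With quota left, the AI engine goes first and the free engines stay as fallback.
  // With no quota left, only the free engines remain (the silent downgrade).
  return quotaLeft > 0 ? [aiEngine, ...FREE_ENGINES] : [...FREE_ENGINES];
}
```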

Switching engines

Pick an engine per feature in the extension settings: Page translate, Selection, and Input each have their own engine setting. Your choice persists across sessions.

Cache & token deduction

Translations are cached locally for 30 days. Revisits don't spend tokens.

Local translation cache

Each translation is saved in your browser's IndexedDB for 30 days. When you revisit a page, the translation loads from cache instantly — no engine call, no token deduction.

Free engines = no limits

Google + Bing run without tokens and without a Swiftin quota. Translate any page as many times as you want at no charge. If Google rate-limits a request, Swiftin falls back to Bing.

Paid AI = first request bills, then cache

AI engines such as Gemini, DeepSeek, and Grok deduct tokens only on the first translation. Visit the same page again within 30 days and the translation loads from your local cache, so no tokens are deducted.

Switching engine = fresh request

If you translated a page with Gemini and switch to DeepSeek, DeepSeek makes a fresh request (the cache is keyed per engine). Switching to a paid engine deducts tokens; switching to Google/Bing costs nothing.
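The cache rules in this section can be sketched as a lookup keyed by (URL, engine, target language, style) with a 30-day lifetime, where tokens are only spent on a miss. This is an illustrative model under stated assumptions: an in-memory `Map` stands in for IndexedDB, and `TranslationCache`, `translateCached`, and `EngineReply` are hypothetical names, not Swiftin's API.

```typescript
interface CacheKey {
  url: string;
  engine: string;
  targetLang: string;
  style: string;
}

interface EngineReply {
  text: string;
  tokens: number; // tokens billed for this call; 0 for Google and Bing
}

const CACHE_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

class TranslationCache {
  // In-memory stand-in for the browser's IndexedDB store.
  private entries = new Map<string, { text: string; savedAt: number }>();

  private static id(key: CacheKey): string {
    return JSON.stringify([key.url, key.engine, key.targetLang, key.style]);
  }

  get(key: CacheKey, now: number): string | undefined {
    const id = TranslationCache.id(key);
    const entry = this.entries.get(id);
    if (entry === undefined) return undefined;
    if (now - entry.savedAt >= CACHE_TTL_MS) {
      this.entries.delete(id); // entry is older than 30 days
      return undefined;
    }
    return entry.text;
  }

  set(key: CacheKey, text: string, now: number): void {
    this.entries.set(TranslationCache.id(key), { text, savedAt: now });
  }
}

// Serve from cache when possible; call the engine (and spend tokens) only on a miss.
function translateCached(
  cache: TranslationCache,
  key: CacheKey,
  now: number,
  callEngine: (key: CacheKey) => EngineReply,
): { text: string; tokensSpent: number } {
  const hit = cache.get(key, now);
  if (hit !== undefined) return { text: hit, tokensSpent: 0 };
  const reply = callEngine(key);
  cache.set(key, reply.text, now);
  return { text: reply.text, tokensSpent: reply.tokens };
}
```

Because the engine is part of the key, switching from one engine to another is always a miss, which matches the "fresh request" behavior described above.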

AI Smart Context

AI Smart Context tells the AI engine what the page is about before it translates. Result: tone, jargon, and references match the actual content — not just the words.

Pro/Team only. Free users with the toggle on get title-only context (no AI summary).

Why context matters

Without context, AI translates each paragraph in isolation. Technical articles get casual translations; casual chats get formal ones. With context, the model knows the page is a finance article, a meme thread, or a docs page — and adjusts.

What's used as context

Page title (always, all plans), meta description (when present and substantial), and an AI-generated summary (~300–500 tokens, Pro/Team only).
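One way to picture how these pieces combine per plan is the sketch below. It is a simplified model, not Swiftin's implementation: `buildContext`, `PageInfo`, and the `MIN_META_CHARS` threshold for a "substantial" meta description are hypothetical, and the sketch assumes that no context is sent while the toggle is off.

```typescript
type Tier = "free" | "pro" | "team";

interface PageInfo {
  title: string;
  metaDescription?: string;
}

// Assumed cutoff for a "substantial" meta description; the real value is not documented.
const MIN_META_CHARS = 50;

// Collect the context lines passed to the AI engine ahead of the text to translate.
function buildContext(
  tier: Tier,
  smartContextOn: boolean,
  page: PageInfo,
  summarize: () => string, // produces the AI summary (Pro/Team only)
): string[] {
  if (!smartContextOn) return [];
  const parts = [`Title: ${page.title}`];
  if (tier === "free") return parts; // Free: title-only context
  const meta = page.metaDescription?.trim() ?? "";
  if (meta.length >= MIN_META_CHARS) parts.push(`Description: ${meta}`);
  parts.push(`Summary: ${summarize()}`);
  return parts;
}
```

For a Free account with the toggle on, the result contains only the title line; for Pro/Team it contains the title, the meta description when it is long enough, and the summary.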

Summary cache (24h)

The summary is generated once per URL + first 500 chars of content, then cached for 24 hours. Other Pro/Team users hitting the same page in that window get the cached summary for free.
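A minimal sketch of this keying scheme follows, assuming the key is simply the URL combined with the first 500 characters of the page content. A production cache would likely hash the key and live on the backend; `SummaryCache`, `summaryKey`, and `getOrCreate` are hypothetical names used only for illustration.

```typescript
const SUMMARY_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours
const PREFIX_CHARS = 500;

// Key a summary by URL plus the first 500 characters of content.
function summaryKey(url: string, content: string): string {
  return JSON.stringify([url, content.slice(0, PREFIX_CHARS)]);
}

class SummaryCache {
  private entries = new Map<string, { summary: string; createdAt: number }>();

  // Return the cached summary if it is younger than 24h; otherwise generate and store one.
  getOrCreate(
    url: string,
    content: string,
    now: number,
    generate: () => string,
  ): { summary: string; generated: boolean } {
    const key = summaryKey(url, content);
    const entry = this.entries.get(key);
    if (entry !== undefined && now - entry.createdAt < SUMMARY_TTL_MS) {
      return { summary: entry.summary, generated: false };
    }
    const summary = generate();
    this.entries.set(key, { summary, createdAt: now });
    return { summary, generated: true };
  }
}
```

Under this scheme, a change further down the page does not invalidate the summary, while a change within the first 500 characters produces a new key and therefore a new summary.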

Token cost

Each summary costs ~300–500 tokens of your AI quota — only on the page's first translation. Subsequent translations (within the 24h window) reuse the cached summary at no cost.

Free vs Pro/Team

Free: title-only context (no AI summary, no token cost). Pro/Team: full Title + Meta + AI summary for context-aware translations.

Off by default

Smart Context stays off until you turn it on. Toggle it in the extension popup → Translate card → Smart Context, or in Options → Page Translate tab.

Common issues

Engine-level problems often look like translation problems. Each entry below maps a symptom to its actual cause.

QUOTA_GONE

AI suddenly stops, falls back to Google / Bing

Free plan: 50K AI tokens / month. Pro: 5M. Team: 10M / seat. When the quota is exhausted within a cycle, the cascade silently downgrades to the free engines. Buy a token add-on, upgrade your plan, or wait for the monthly reset.

CHAIN_EXHAUSTED

"AI providers temporarily unavailable" pill

Every backend AI provider failed for non-quota reasons (transient outage, safety filter, malformed responses across the board). The cascade falls to Google / Bing for the request. Retry in a few minutes.

WRONG_TONE

Translation reads oddly formal / casual

Different engines have different default tones. Grok is casual, Gemini balanced, DeepSeek precise. Switch the engine for the surface (page, selection, input) in Options. For per-translation tone control, use AI engines + Slang/Business styles.

CACHE_STALE

Same page shows old translation

The local cache lives for 30 days and is keyed per (URL, engine, target lang, style). Switch engines to force a fresh request, or clear the local cache from Options.

SMART_CONTEXT_NOT_APPLIED

Smart Context toggle on but translation feels generic

The Smart Context summary is Pro/Team only. It is generated on the page's first translation and reused from the 24h cache after that. Free accounts with the toggle on get title-only context.

AI_NOT_VISIBLE

AI engines greyed out for guest browsers

AI engines (Plus mode) need an account so the backend can track quota. Sign up for a free account (50K tokens / month), or use BYOK to plug in your own provider key; BYOK does not draw on backend quota, so it works without signing in.
