AI engines & cascade
How translation engines work
Swiftin uses different translation engines depending on your plan. AI models give the best quality; Google and Bing handle volume and act as a free fallback.
Available engines
Gemini 3.1 Flash-Lite
Free with quota / Pro / Team
Google's lightweight AI model. Context-aware, supports translation styles.
Llama 3.3 70B
Free with quota / Pro / Team
Meta's open-source model on Groq's specialized hardware. Fastest in our mix, with sub-second latency.
GPT-5 nano
Free with quota / Pro / Team
OpenAI's smallest GPT-5 model. Reliable for general translation, cost-effective.
Claude Haiku 4.5
Free with quota / Pro / Team
Anthropic's Claude: balanced cost and quality, idiomatic phrasing.
DeepSeek v4 Flash
Free with quota / Pro / Team
Strong on technical and Asian-language content. Called directly from China; automatic prompt caching discounts repeated prefixes.
Grok 4.3
Free with quota / Pro / Team
xAI's fast model. Casual tone, good for social media context.
Google Translate
Free / Guest
Standard machine translation. Reliable, fast, no API key needed. Works for guests too.
Bing Translator
Free / Guest
Microsoft's engine. Used as a fallback when Google rate-limits. Lower concurrency.
More models coming
We're testing and adding new AI engines. Stay tuned.
How fallback works
When one engine fails or hits a limit, Swiftin falls back automatically — you don't hit an error wall.
Guest path
google → google-anon → bing. Each step is retried on failure.
Paid AI path
Selected AI model → other AI models → Google → Bing. Failures cascade through cheaper options.
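The two paths above can be sketched in a few lines of TypeScript. This is an illustration only, not Swiftin's actual code: the AI engine ids, the order in which the "other AI models" are tried, and the function names buildChain and runCascade are all assumptions, and the per-step retry is left out because the retry count is not documented here.

```typescript
// Hypothetical engine ids. Only "google", "google-anon" and "bing" appear
// in the text above; the AI ids and their order are placeholders.
type EngineId = string;
type Translate = (text: string) => Promise<string>;

const AI_ENGINES: EngineId[] = ["gemini", "llama", "gpt", "claude", "deepseek", "grok"];

// Order in which engines are tried. `null` means a guest with no AI engine.
function buildChain(selected: EngineId | null): EngineId[] {
  if (selected === null) {
    return ["google", "google-anon", "bing"]; // guest path
  }
  const otherAi = AI_ENGINES.filter((id) => id !== selected);
  return [selected, ...otherAi, "google", "bing"]; // paid AI path
}

// Try each engine in order and return the first successful translation.
async function runCascade(
  chain: EngineId[],
  engines: Record<EngineId, Translate>,
  text: string,
): Promise<{ engine: EngineId; result: string }> {
  const errors: string[] = [];
  for (const id of chain) {
    const translate = engines[id];
    if (!translate) continue; // engine not wired up in this sketch
    try {
      return { engine: id, result: await translate(text) };
    } catch (err) {
      errors.push(`${id}: ${String(err)}`); // remember why, then move on
    }
  }
  throw new Error(`all engines failed: ${errors.join("; ")}`);
}
```

The point of the sketch is that a failure only surfaces to you when every engine in the chain has failed; anything earlier is absorbed by the next step.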
When AI quota runs out
Once a Free user has used their 50K AI tokens for the month, AI engines silently downgrade to Google + Bing. Pro and Team get larger quotas: 5M for Pro, 10M per seat for Team.
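As a rough sketch of the downgrade rule, assuming the quota is a simple monthly counter per user (or per seat on Team): the names Plan, aiAvailable and enginesFor are made up for illustration, and the chain is simplified to the selected AI engine plus the free engines.

```typescript
type Plan = "free" | "pro" | "team";

// Monthly AI token allowances as stated in this section.
const TOKENS_PER_MONTH: Record<Plan, number> = {
  free: 50000,
  pro: 5000000,
  team: 10000000, // per seat
};

// True while the user still has AI tokens left in the current cycle.
function aiAvailable(plan: Plan, tokensUsed: number): boolean {
  return tokensUsed < TOKENS_PER_MONTH[plan];
}

// Engines to try: the AI engine is dropped once the quota is spent.
function enginesFor(plan: Plan, tokensUsed: number, selectedAi: string): string[] {
  const freeEngines = ["google", "bing"];
  return aiAvailable(plan, tokensUsed) ? [selectedAi, ...freeEngines] : freeEngines;
}
```

With these numbers, a Free user at 50,000 used tokens gets only Google and Bing, while a Pro user at the same usage still starts with the selected AI engine.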
Switching engines
Pick a per-feature engine in extension settings: Page translate, Selection, and Input each have their own engine setting. Your choice persists across sessions.
Cache & token deduction
Translations are cached locally for 30 days. Revisits don't spend tokens.
Local translation cache
Each translation is saved in your browser's IndexedDB for 30 days. When you revisit a page, the translation loads from cache instantly — no engine call, no token deduction.
Free engines = no limits
Google + Bing run without tokens and without any Swiftin-side limits. Translate any page, as many times as you want, and you are never charged. If Google rate-limits a request, the cascade falls back to Bing.
Paid AI = first request bills, then cache
AI engines (Gemini, DeepSeek, Grok, and the others) deduct tokens only on the first translation. Visit the same page later and the translation loads from your local cache; your tokens stay untouched.
Switching engine = fresh request
If you translated a page with Gemini and switch to DeepSeek, DeepSeek makes a fresh request (the cache is keyed per engine). Switching to a paid engine deducts tokens; switching to Google/Bing costs nothing.
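To make the caching rules concrete, here is a minimal in-memory sketch. It assumes the key is built from URL, engine, target language and style (as listed under Common issues) and uses a Map in place of IndexedDB so it runs anywhere; the class and field names are hypothetical, not Swiftin's real schema.

```typescript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

interface CacheKeyParts {
  url: string;
  engine: string;
  targetLang: string;
  style: string;
}

interface CacheEntry {
  translation: string;
  savedAt: number; // epoch milliseconds
}

// One entry per (URL, engine, target language, style) combination.
function cacheKey(p: CacheKeyParts): string {
  return JSON.stringify([p.url, p.engine, p.targetLang, p.style]);
}

class TranslationCache {
  private entries = new Map<string, CacheEntry>();

  // Returns the cached translation, or undefined on a miss or after 30 days.
  get(parts: CacheKeyParts, now: number): string | undefined {
    const key = cacheKey(parts);
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now - entry.savedAt >= THIRTY_DAYS_MS) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.translation;
  }

  set(parts: CacheKeyParts, translation: string, now: number): void {
    this.entries.set(cacheKey(parts), { translation, savedAt: now });
  }
}
```

Because the engine is part of the key, a page cached under Gemini is a miss under DeepSeek, which is why switching engines triggers a fresh (and, for AI engines, billed) request.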
AI Smart Context
AI Smart Context tells the AI engine what the page is about before it translates. Result: tone, jargon, and references match the actual content — not just the words.
Why context matters
Without context, AI translates each paragraph in isolation. Technical articles get casual translations; casual chats get formal ones. With context, the model knows the page is a finance article, a meme thread, or a docs page — and adjusts.
What's used as context
Page title (always, all plans), meta description (when present and substantial), and an AI-generated summary (~300–500 tokens, Pro/Team only).
Summary cache (24h)
The summary is generated once per URL + first 500 chars of content, then cached for 24 hours. Other Pro/Team users hitting the same page in that window get the cached summary for free.
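A minimal sketch of that lookup, assuming the cache key is simply the URL plus the first 500 characters of content (a real backend may well hash these instead). The generate callback stands in for the AI summary call, and all names here are hypothetical.

```typescript
const SUMMARY_TTL_MS = 24 * 60 * 60 * 1000;
const PREFIX_CHARS = 500;

interface SummaryEntry {
  summary: string;
  createdAt: number; // epoch milliseconds
}

// Same URL and same first 500 characters map to the same summary.
function summaryKey(url: string, content: string): string {
  return JSON.stringify([url, content.slice(0, PREFIX_CHARS)]);
}

// Returns a cached summary when one is younger than 24 hours,
// otherwise generates a new one and stores it.
function getOrCreateSummary(
  store: Map<string, SummaryEntry>,
  url: string,
  content: string,
  now: number,
  generate: () => string, // placeholder for the AI summary request
): { summary: string; generated: boolean } {
  const key = summaryKey(url, content);
  const cached = store.get(key);
  if (cached && now - cached.createdAt < SUMMARY_TTL_MS) {
    return { summary: cached.summary, generated: false };
  }
  const summary = generate();
  store.set(key, { summary, createdAt: now });
  return { summary, generated: true };
}
```

Only calls that come back with generated set to true would cost summary tokens; everything else within the 24-hour window reuses the stored text.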
Token cost
Each summary costs ~300–500 tokens of your AI quota — only on the page's first translation. Subsequent translations (within the 24h window) reuse the cached summary at no cost.
Free vs Pro/Team
Free: title-only context (no AI summary, no token cost). Pro/Team: full Title + Meta + AI summary for context-aware translations.
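The plan split can be pictured as a small function that assembles the context lines sent ahead of the text. This is a sketch under assumptions: the 40-character threshold for a "substantial" meta description, the line labels, and the names PlanTier, PageInfo and buildContext are all invented for illustration.

```typescript
type PlanTier = "free" | "pro" | "team";

interface PageInfo {
  title: string;
  metaDescription?: string;
}

// Hypothetical cutoff for what counts as a "substantial" description.
const MIN_META_CHARS = 40;

// Free gets the title only; Pro/Team add the meta description and AI summary.
function buildContext(tier: PlanTier, page: PageInfo, aiSummary?: string): string[] {
  const parts = [`Title: ${page.title}`];
  if (tier === "free") return parts;

  const meta = page.metaDescription?.trim();
  if (meta && meta.length >= MIN_META_CHARS) {
    parts.push(`Description: ${meta}`);
  }
  if (aiSummary) {
    parts.push(`Summary: ${aiSummary}`);
  }
  return parts;
}
```

On Free, the result is always a single title line, which is why turning the toggle on there costs no tokens.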
Off by default (opt-in)
Smart Context stays off until you turn it on. Toggle it in the extension popup → Translate card → Smart Context, or in Options → Page Translate tab.
Common issues
Engine-level problems often look like translation problems. Each row maps a symptom to the actual cause.
QUOTA_GONE: AI suddenly stops, falls back to Google / Bing
Free plan: 50K AI tokens / month. Pro: 5M. Team: 10M / seat. When the quota is exhausted within a cycle, the cascade silently downgrades to the free engines. Buy a token add-on, upgrade your plan, or wait for the monthly reset.
CHAIN_EXHAUSTED: "AI providers temporarily unavailable" pill
Every backend AI provider failed for non-quota reasons (transient outage, safety filter, malformed responses across the board). The cascade falls to Google / Bing for the request. Retry in a few minutes.
WRONG_TONE: Translation reads oddly formal / casual
Different engines have different default tones. Grok is casual, Gemini balanced, DeepSeek precise. Switch the engine for the surface (page, selection, input) in Options. For per-translation tone control, use AI engines + Slang/Business styles.
CACHE_STALE: Same page shows old translation
The local cache lives 30 days, keyed per (URL, engine, target lang, style). Switch engines to force a fresh request, or clear the local cache from Options.
SMART_CONTEXT_NOT_APPLIED: Smart Context toggle on but translation feels generic
The Smart Context summary is Pro/Team only. It is generated on the page's first translation and then reused from the 24h cache. On Free, turning the toggle on gives title-only context.
AI_NOT_VISIBLE: AI engines greyed out for guest browsers
AI engines (Plus mode) need an account so the backend can track quota. Sign up for a free account (50K tokens / month), or use BYOK (bring your own key) to plug in your own provider key, which works without signing in and without backend quota.