When looking at the cost of translation APIs, I was floored by the prices. Azure is $10 per million characters, Google is $20, and DeepL is $25.
To come up with a rough estimate for a real-time translation use case, I assumed 150 WPM speaking speed, with each word being translated 3 times (since the text gets retranslated multiple times as the context lengthens). This resulted in the following costs:
- Azure: $1.62/hr
- Google: $3.24/hr
- DeepL: $4.05/hr
Assuming the same numbers, `gemini-2.0-flash-lite` would cost **less than $0.01/hr**. Cost varies based on prompt length, but I'm actually getting just under $0.005/hr.
That's over 800x cheaper than DeepL, or 0.1% of the cost.
Presumably the quality of the translations would be somewhat worse, but how much worse? And how long will that disadvantage last? I can stomach a certain amount of worse for 99% cheaper, and it seems easy to foresee that LLMs will surpass the quality of the legacy translation models in the near future.
Right now the accuracy depends a lot on the prompting. I need to run a lot more evals, but so far in my tests I'm seeing that the translations I'm getting are as good (most of the time identical) or *better* than Google's the vast majority of the time. I'm confident I can get to 90% of Google's accuracy with better prompting.
I can live with 90% accuracy with a 99.9% cost reduction.
For many, 90% doesn't cut it for their translation needs and they are willing to pay a premium for the best. But the high costs of legacy translation APIs will become increasingly indefensible as LLM-based solutions improve, and we'll see translation incorporated in ways that were previously cost-prohibitive.