We recently hosted a webinar on “GenAI in Localization: 2024 in Review and What to Expect in 2025” with two industry experts: Balázs Kis, Chief Evangelist at memoQ, and Konstantin Savenkov, CEO and Co-Founder at Intento.
During the webinar, our audience had many great questions about GenAI applications in localization. While we couldn’t address all of them live, our speakers answered the remaining questions after the event—from evaluation methods to data protection and regulatory compliance. We share their insights in this blog post.
To hear all the questions and learn how leading companies are using GenAI in localization, watch the recording.
Do you experiment with LLMs and AI tools to evaluate translations for specific purposes, like UI compatibility or SEO effectiveness? For example, checking if translated content breaks website layouts or how well multilingual search performs compared to original language queries?
We have experimented with such checks. For example, we used website screenshots to assess whether the localized version of a website breaks the UI, and we evaluated the quality of real-time multilingual search by comparing the results for English queries with the results for translated queries. When the context is retrieved correctly and the prompt length is managed, automatic QA becomes more reliable.
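As a rough illustration of the search comparison described above, the sketch below measures how many of the top results for an English query also appear in the top results for its translated counterpart. The function name `overlap_at_k`, the document IDs, and the choice of overlap as the metric are assumptions made for this example, not a description of the actual evaluation setup.

```python
from typing import List


def overlap_at_k(source_results: List[str], translated_results: List[str], k: int = 10) -> float:
    """Share of the top-k results for the source-language query that also
    appear in the top-k results for the translated query."""
    top_source = source_results[:k]
    top_translated = set(translated_results[:k])
    if not top_source:
        return 0.0
    hits = sum(1 for doc_id in top_source if doc_id in top_translated)
    return hits / len(top_source)


# Hypothetical ranked document IDs returned by a search engine
english_results = ["d1", "d2", "d3", "d4"]
german_results = ["d2", "d1", "d7", "d4"]

score = overlap_at_k(english_results, german_results, k=4)
print(score)  # 0.75: three of the four English results are also found
```

A low score for a given query pair would flag it for closer inspection. It does not say on its own whether the translation or the search index is at fault.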
What limitations can Personal Data Protection Regulations have on developing GenAI applications in translation?
In general, GenAI models (LLMs and similar tools) do not differ much in terms of GDPR and similar regulations compared to traditional NLP models. Moreover, LLM providers articulate their data protection and privacy terms much more clearly due to elevated customer concerns. While some companies introduced specific restrictions in 2023, these are being gradually lifted.
What is the copyright situation for the LLMs you are working with?
Major LLM providers guarantee that if the user owns rights to the inputs, they will also own rights to the outputs. Moreover, they provide indemnification against IP lawsuits resulting from the use of generated content.
Currently, it seems unlikely that national regulators in healthcare would allow the acceptance of AI-based translations in life sciences without any human post-editing. What’s your take on that?
We should distinguish post-editing from subject matter or in-country review. It is definitely favorable when AI quality is high enough that translations no longer need editing, and flawless translation drafts would accelerate processes in many regulated industries. However, regulations still require review. The actual quality of current human review processes in these industries and the potential for improvement through AI tools is a separate topic, which I hope we’ll be able to publicly address in the near future.
Do you have AI engineers who can offer prompt management?
Yes, we do. We called them computational linguists in the past and now tend to call them language engineers.
If your prompt is too long, how do you bypass the length limitation in AI?
The problem is not the length limitation itself, as context windows may span hundreds of thousands or even millions of tokens. The issue is that the longer the prompt is, the less accurately an LLM follows it. Long prompts also complicate change management, because changing the prompt to address one issue may cause a regression elsewhere. We address these issues with a modular agentic architecture: instead of one prompt, we use a set of agents, each responsible for specific translation requirements.
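To make the modular idea concrete, here is a minimal sketch in which each agent carries one short, single-purpose instruction and the draft passes through the agents in turn. Everything in it is hypothetical: the `Agent` class, the `run_pipeline` function, and the `fake_llm` stub, which stands in for a real model call. It is not the architecture described above, only one simple way such a setup could be organized.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Agent:
    name: str
    instruction: str  # short prompt covering a single requirement


def run_pipeline(draft: str, agents: List[Agent],
                 call_llm: Callable[[str, str], str]) -> str:
    """Pass the draft through each agent in order; every agent sees
    only its own short instruction plus the current text."""
    text = draft
    for agent in agents:
        text = call_llm(agent.instruction, text)
    return text


def fake_llm(instruction: str, text: str) -> str:
    """Deterministic stand-in for a model call (assumption for the demo)."""
    if "glossary" in instruction:
        return text.replace("log in", "sign in")
    if "spacing" in instruction:
        return text.replace("  ", " ")
    return text


agents = [
    Agent("terminology", "Apply the glossary: use 'sign in', not 'log in'."),
    Agent("formatting", "Fix spacing issues such as double spaces."),
]

result = run_pipeline("Please log in  to continue.", agents, fake_llm)
print(result)  # Please sign in to continue.
```

Because each requirement has its own instruction, changing the terminology rule does not touch the formatting prompt, which narrows the surface for regressions.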
Are LLMs better suited for Machine Translation Evaluation than purpose-made models like COMET or BERTScore? Or are you using those models in the chain of Agentic Evaluation?
Our studies found that LLMs perform better than BERTScore and COMET (at least in COMET’s 2022 version).
To explore how GenAI can enhance your Machine Translation solution, reach out to hello@inten.to or book a demo.