Earlier this month, Intento hosted its first webinar, bringing together speakers from Nike, Stripe, GAP, Procore, NetApp, Wish, AstraZeneca, and Esri to discuss the biggest topics in the world of machine translation from 2022 and plans for 2023. They shared a mix of real-life customer stories, reflections on MT ROI, machine voice, the expanding role of localization teams, and more.
The webinar’s keynote address was presented by Intento Co-founder and CEO Konstantin Savenkov, focusing on the biggest highlights of 2022 and what’s coming in the new year. We’re at the end of a very complex year — with the war in Ukraine and economic turbulence all over the world. Nevertheless, it was a fascinating year for machine translation and artificial intelligence. We’re not yet at the point where it can solve all of our problems, but it’s helping a lot, providing solutions in many different use cases.
Keep reading for a brief summary of Konstantin’s presentation, including the rise of massively multilingual models, source quality improvement, centralization versus sharing, AI beyond MT, and more.
If you enjoy this preview of the keynote address, check out the event recording here to catch all of Konstantin’s insights.
• • •
Ideas that worked in 2022
Massively multilingual models
The idea is pretty simple — to build a model where different languages live together, like how humans can build on the language skills they’ve learned in one language as they try to pick up another.
This year we’ve seen a few MT providers begin to use massively multilingual models in production, resulting in unprecedented growth in the number of supported language pairs for machine translation. This includes an increase of 25,000 language pairs over the year, with 680 in September 2022 and another 100 in the last couple of months.
This might not seem like a major difference from the economic standpoint, but for those who speak these languages, MT is opening up the world.
This year, we had the idea to run a post-editing analysis, looking into what post-editors do with different types of leverage (translation memories, repetitions, MT, etc.). This kind of ‘check-up’ reveals symptoms of issues in different parts of the localization stack (over-editing, under-editing, misconfigured connectors, etc.), so you know where your system can be improved.
The localization stack is often very complex, and hidden bottlenecks and inefficiencies can result in 20–30% more costs. We’ve seen cases this year where the localization checkup has uncovered up to 60% wasted effort.
We have also seen tangible proof that good MT requires less post-editing effort than TM fuzzy matches. This ultimately leads to conversations about replacing fuzzy matches with MT and, at the same time, suggests that MTPE rates should be no higher than TMPE rates.
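One way to make this kind of comparison concrete is to measure how much post-editors change each suggestion, grouped by where the suggestion came from. The sketch below is only an illustration under our own assumptions: it uses a word-level edit distance as a stand-in for effort, and the sample segments and leverage labels (`mt`, `tm_fuzzy`) are made up. It is not the analysis used in the check-up described above.

```python
from collections import defaultdict


def word_edit_distance(a, b):
    """Levenshtein distance computed over whitespace-separated words."""
    x, y = a.split(), b.split()
    prev = list(range(len(y) + 1))
    for i, xi in enumerate(x, start=1):
        curr = [i]
        for j, yj in enumerate(y, start=1):
            cost = 0 if xi == yj else 1
            curr.append(min(prev[j] + 1,          # delete a word
                            curr[j - 1] + 1,      # insert a word
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def effort_by_leverage(segments):
    """Average normalized edit distance per leverage type.

    segments: iterable of (leverage_type, suggestion, post_edited_text).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for leverage, suggestion, final in segments:
        length = max(len(suggestion.split()), len(final.split()), 1)
        totals[leverage][0] += word_edit_distance(suggestion, final) / length
        totals[leverage][1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


# Made-up segments, for illustration only.
SAMPLE = [
    ("mt", "Press the red button to start", "Press the red button to start"),
    ("mt", "Save the file before closing", "Save your file before closing"),
    ("tm_fuzzy", "Press the green button to stop", "Press the red button to start"),
]

if __name__ == "__main__":
    for leverage, effort in effort_by_leverage(SAMPLE).items():
        print(f"{leverage}: {effort:.2f}")
```

On real data, a consistently lower score for MT segments than for fuzzy matches would be the kind of signal that supports the rate discussion above. Edit distance is only a rough proxy for effort, so time spent per segment is worth tracking alongside it.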
MT/AI for corporate training videos
The workflows for video translation are also very complex, and we’ve found the best integration points into these workflows, which can save around 65% of the effort.
Translating a batch of videos that used to take around a year can now be done in as little as a month.
Source Quality improvement
There are many ways a given text can be translated into another language. Conversely, very different source sentences can correspond to the same translation. If there are many different source sentences, applying MT will produce different outputs, each at a different distance from the desired translation.
We had a hypothesis — what if we found a way to transform the source text so that its machine translation lands much closer to the desired outcome?
We found that by transforming the source text, we could reduce post-editing by about 15% downstream. The way that workflows are built today, all editing happens after MT. It seems that in many cases, editing before MT yields significant value.
• • •
Ideas that may work in 2023
Centralization versus sharing
Translations are not the only priority for localization departments — localization managers also have to focus on source quality, post-editing, checking the quality of the translations, project management, and much more.
Human workflows require centralization. However, when we automate these workflows by replacing them with AI, we suddenly require less centralization. It becomes possible to share these AI assets across the company by directly integrating the tools for MT and automatic post-editing with the software systems used in other departments (customer service, HR, corporate training, etc.), such as Salesforce and ServiceNow.
This will move costs closer to where ROI is generated while cutting costs for localization departments, increasing their footprint, and creating more translation ownership across the company.
This trend is absolutely enabled by developments in technology, so hopefully, we’ll see it more and more next year.
AI beyond MT
When MT is used by localization departments, they handle pretty much everything that is required (source and translation quality management, post-editing, etc.). However, when MT is spread across the company, these other processes become bottlenecks. This is because there is no one to run these processes if, for example, the MT is deployed in the customer service system.
When automation is spread across the enterprise, you’ll need more support. This support will come from source quality improvement, automatic post-editing, and translation quality estimation.
We already support parts of these on the Intento platform — for example, that is what enables real-time translation for customer service — and we’ll see it more and more in the coming year.
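To show how these supporting steps fit together when no localization team is in the loop, here is a minimal sketch of such a chain: source quality improvement, MT, automatic post-editing, and quality estimation, with low-scoring results flagged for a human. Every function, the glossary, and the threshold are hypothetical toy stand-ins for illustration. This is not the Intento platform API.

```python
from dataclasses import dataclass


@dataclass
class WorkflowResult:
    source: str
    translation: str
    quality: float
    needs_human_review: bool


def run_workflow(text, improve_source, translate, post_edit,
                 estimate_quality, threshold=0.8):
    """Chain the four steps and flag low-scoring output for review."""
    cleaned = improve_source(text)
    draft = translate(cleaned)
    edited = post_edit(cleaned, draft)
    score = estimate_quality(cleaned, edited)
    return WorkflowResult(cleaned, edited, score, score < threshold)


# Toy stand-ins so the sketch runs; real systems would call models here.
GLOSSARY = {"hello": "hola", "world": "mundo"}


def toy_improve_source(text):
    # Collapse stray whitespace as a trivial "source clean-up".
    return " ".join(text.split())


def toy_translate(text):
    # Word-by-word glossary lookup; unknown words pass through unchanged.
    return " ".join(GLOSSARY.get(word.lower(), word) for word in text.split())


def toy_post_edit(source, draft):
    # Capitalize the first letter as a trivial "automatic post-edit".
    return draft[:1].upper() + draft[1:]


def toy_estimate_quality(source, translation):
    # Share of source words the toy glossary could translate.
    words = source.split()
    if not words:
        return 0.0
    return sum(word.lower() in GLOSSARY for word in words) / len(words)


if __name__ == "__main__":
    for text in ("  hello   world ", "hello there"):
        print(run_workflow(text, toy_improve_source, toy_translate,
                           toy_post_edit, toy_estimate_quality))
```

Passing the steps in as functions keeps the chain independent of any particular provider, so each step can be swapped out without touching the rest.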
Generative pre-trained transformers
The new version of GPT (3.5, released a few weeks ago) shows a major boost in performance. Its conversational version, ChatGPT, has become very popular on social media.
The new models are becoming capable of doing a lot of tasks besides translation. If you think of all the other tasks that are becoming involved in the localization workflows, they can also be handled using these models.
For example, they can take care of automatic source quality improvement by disambiguating texts and making them clearer and more translatable. They are also helpful in automatic post-editing and changing the style of the text.
Generative pre-trained transformers create a very convenient path to production, starting from zero-shot prompting (when you simply describe what needs to be done with your text), then few-shot prompting (adding a handful of examples) as you start to get a sense of what needs to be improved, and then, with even more data over time, you can start fine-tuning.
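The first two steps on that path differ only in what goes into the prompt. The sketch below builds both kinds of prompt as plain strings. The instruction wording, the `Text:` and `Rewritten:` labels, and the example pair are our own illustrative assumptions, and sending the prompt to a model is deliberately left out.

```python
def zero_shot_prompt(instruction, text):
    """Describe the task in words only, with no examples."""
    return f"{instruction}\n\nText: {text}\nRewritten:"


def few_shot_prompt(instruction, examples, text):
    """Same task, plus a few (original, rewritten) example pairs."""
    shots = "\n\n".join(
        f"Text: {original}\nRewritten: {rewritten}"
        for original, rewritten in examples
    )
    return f"{instruction}\n\n{shots}\n\nText: {text}\nRewritten:"


# Hypothetical instruction and example pair, for illustration only.
INSTRUCTION = "Rewrite the text so it is unambiguous and easy to translate."
EXAMPLES = [
    ("Hit the button ASAP.", "Press the button as soon as possible."),
]

if __name__ == "__main__":
    print(zero_shot_prompt(INSTRUCTION, "Ping me when it's done."))
    print("---")
    print(few_shot_prompt(INSTRUCTION, EXAMPLES, "Ping me when it's done."))
```

The example pairs collected for few-shot prompts can later double as the starting data set for fine-tuning.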
These models will transform the industry by enabling everyone to use AI, for a broad range of tasks beyond simple translation, building complex automatic workflows with source quality improvement, translation, automatic post-editing, and quality assessment.
Because Intento has connectors to all translation management systems, in the new year you will be able to leverage these workflows in your TMS, further reducing the effort needed to create high-quality content. Please reach out to us if you want to test it on your content!
• • •
Click here to view Konstantin’s entire presentation. Through the recording, you’ll also have the opportunity to watch both webinar panels on demonstrating MT value across the enterprise, and the future of MT, along with an exploration into image generators.