May 2024: Integrations with LLMs from Anthropic, Google, and OpenAI, and more

June 11, 2024

In the past month, we’ve made several improvements and additions to our Enterprise Language Hub:

• Added automated glossary management through our API
• Integrated the latest language models from Anthropic and Google, plus OpenAI’s GPT-4o, for machine translation across all connectors
• Expanded project support in Lokalise TMS
• Introduced language auto-detection
• Set per-user translation limits in our Windows and Mac applications

We also investigated promising new machine translation evaluation metrics, some reference-based and some reference-free: MaTESe, MetricX, GEMBA-MQM, XCOMET-XL, and CometKiwi-XL.

Product updates

Automated glossary management via API

We’ve enhanced our API to allow direct management and updating of translation glossaries from your systems. This improvement, particularly effective for bulk updates and the rapid addition of new terms, ensures your glossaries will work with all major MT providers, including those without native glossary support. You can also use the same glossary across different providers. Contact us, and we’ll show you how to automatically update your terminology bases.

The latest Anthropic and Google LLMs available through Intento

Our Enterprise Language Hub now offers access to the latest LLMs from Anthropic and Google. Integrate these powerful models into your translation workflows to make them quicker, more accurate, and context-sensitive.

If you want to do more and build intelligent solutions for content generation, summarization, data analysis, or tailoring content to your style, you can create custom AI skills using these LLMs that are also available in our GenAI Portal. Check it out now!

Available Anthropic models:

Claude-3-opus-20240229: Best for complex tasks, in-depth analysis, and handling long documents up to 150,000 words in English.
Claude-3-sonnet-20240229: Ideal for enterprise workloads and business applications requiring high-quality output.
Claude-3-haiku-20240307: Perfect for fast-paced, real-time applications and customer support tasks needing quick responses.
Claude 2.1: Optimal for processing and analyzing large documents up to 75,000 words in English.

Available Google models:

Google Gemini-1.5-pro: This model excels in high-level content generation and detailed analysis and supports extensive language pairs. It delivers robust performance optimized for professional and enterprise applications. It also features a context window of up to 1 million tokens (approximately 750,000 words in English), making it one of the largest context windows available on the market.
Google Gemini-1.5-flash: Designed for rapid response and real-time usage, this model is perfect for dynamic applications such as live customer support, quick content updates, and other time-sensitive tasks. It ensures high-speed processing without compromising on accuracy. Like the Pro model, it also boasts a context window of up to 1 million tokens (approximately 750,000 words in English), providing extensive contextual awareness and continuity.
Palm2-chat-bison-002 and Palm2-chat-bison-32k-002: Capable of processing 6,000 and 32,000 words, respectively, these models ensure safe and non-toxic output in over 100 languages. They were developed specifically for multi-turn chats, ensuring user interactions remain seamless and efficient.

We integrated GPT-4o on its launch day!

OpenAI has launched its latest model, GPT-4o (“o” stands for “omni”), which is twice as fast and 50% less expensive than GPT-4 Turbo. It features greatly enhanced tokenization, needing up to 4x fewer tokens for non-English languages. This cuts costs and boosts speed even more. GPT-4o is immediately available for translation across all Intento connectors and integrations.
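
To see how the two savings compound, here is a rough back-of-the-envelope sketch. It assumes that cost scales linearly with token count and that English text needs the same number of tokens on both models. The function name is hypothetical, and the 4x token reduction is the stated upper bound, not a typical value.

```python
def relative_cost(price_ratio: float, token_ratio: float) -> float:
    """Cost of a request relative to the baseline model (GPT-4 Turbo here).

    price_ratio: per-token price relative to baseline (0.5 = 50% cheaper).
    token_ratio: tokens needed relative to baseline (0.25 = 4x fewer tokens).
    Assumption: total cost is simply price per token times number of tokens.
    """
    return price_ratio * token_ratio


# English text: same token count assumed, half the per-token price.
english = relative_cost(price_ratio=0.5, token_ratio=1.0)

# Best case for a non-English language: half the price and 4x fewer tokens.
best_case = relative_cost(price_ratio=0.5, token_ratio=0.25)

print(f"English: {english:.3f} of baseline cost")
print(f"Non-English, best case: {best_case:.3f} of baseline cost")
```

Under these assumptions, the best case for a non-English language is one eighth of the baseline cost. Actual savings depend on the language and the content.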

Support of new types of projects in Lokalise TMS

We’ve enhanced our integration with Lokalise to support their new content import function. This function lets you import content directly from platforms like HubSpot and Figma into their translation workflow.

Simply connect Lokalise with your external system and select Intento as your translation engine to avoid manual content transfers and ensure high-quality translations. All new content imported into your Lokalise project will be translated using Intento’s advanced MT and AI technology, including glossaries and automatic post-editing.

Here is the complete list of systems compatible with Lokalise integration: Contentful, Contentful Native, Webflow, Iterable, Marketo, Freshdesk, Contentstack, Ditto, HubSpot, Intercom Articles, Salesforce CRM, Salesforce Knowledge, Storyblok, WordPress, Zendesk Dynamic Content, Zendesk Guide. Contact us to learn more.

Language detection in Enterprise Language Hub

Although many MT providers have an embedded language detection feature, we have our own language detection in Language Hub. We use it to perform Source Quality Improvement before translation when the source language is unknown. This is necessary to translate multilingual user-generated content, apply glossaries and tone of voice, and leverage Translation Storage.

We’ve just updated the detection algorithm so that our clients can customize it for their particular content and language set, leading to improved accuracy. Book a demo at https://inten.to/demo/, and we’ll show you how it works!
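
One way to picture why a known language set helps: if a detector returns a score per language, discarding languages the client never uses removes a whole class of confusions. The sketch below is purely illustrative and is not Intento’s algorithm. The function name, the scores, and the language set are all hypothetical.

```python
def restrict_to_language_set(scores: dict[str, float], allowed: set[str]) -> dict[str, float]:
    """Keep only the allowed languages and renormalize their scores to sum to 1.

    Returns an empty dict when none of the allowed languages received a score.
    """
    kept = {lang: score for lang, score in scores.items() if lang in allowed}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {lang: score / total for lang, score in kept.items()}


# Hypothetical raw detector scores for a short, ambiguous sentence.
raw_scores = {"es": 0.40, "pt": 0.35, "it": 0.25}

# Suppose this client only ever handles Portuguese and Italian content.
restricted = restrict_to_language_set(raw_scores, allowed={"pt", "it"})
best = max(restricted, key=restricted.get)
print(best, restricted)
```

In this made-up case, the unrestricted top guess would be Spanish, while the restricted detector picks Portuguese.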

Per-user translation limits for Intento’s Windows and Mac apps

We originally developed Windows and Mac apps for engineers in multinational companies who work with various online and desktop tools. Just copy the text to the clipboard, hit the translation shortcut, and voilà!

But sometimes, you need to translate huge chunks of text, which can lead to unexpected costs. Now, if you try to translate more than 10,000 characters at once, you’ll see a warning to prevent accidental large translations. This helps you keep expenses under control while tracking usage in the Intento Console. Try it out and enjoy a smoother translation experience (Intento Translator for Windows, Intento Translator for Mac)!
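
The check itself is simple to picture. Here is a minimal sketch, assuming the limit is measured in characters as Python counts them (Unicode code points). The constant and function names are hypothetical and do not come from the apps.

```python
WARNING_THRESHOLD = 10_000  # characters, matching the limit described above


def needs_warning(text: str, threshold: int = WARNING_THRESHOLD) -> bool:
    """Return True when the text is longer than the threshold.

    Assumption: length is counted with len(), i.e. in Unicode code points.
    """
    return len(text) > threshold


clipboard_text = "word " * 2_500  # 12,500 characters of sample text
if needs_warning(clipboard_text):
    print(f"About to translate {len(clipboard_text):,} characters. Continue?")
```

Text of exactly 10,000 characters passes without a warning in this sketch, since the post describes the limit as "more than 10,000 characters".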

New metrics for MT evaluation: MaTESe, MetricX, GEMBA-MQM, XCOMET-XL, and CometKiwi-XL

We’ve just published a new blog post exploring the frontiers of machine translation evaluation, featuring insights from the WMT23 Metrics Shared Task at the 2023 Conference on Machine Translation (WMT). The shared task evaluated automatic metrics used to assess the quality of machine translation systems.

In this post, we dive deep into five innovative metrics – MaTESe, MetricX-23, GEMBA-MQM, XCOMET-XL, and CometKiwi-XL – and assess their potential for enhancing our commercial evaluations and annual Machine Translation Report.

Key takeaways:

✅ Most metrics had limitations preventing production use, such as language pair restrictions, interpretability issues, slow speeds, or high costs.

✅ CometKiwi-XL stood out for its quick, effective, reference-free evaluations under a non-commercial license.

✅ Staying informed about the latest MT evaluation developments is crucial for data-driven localization strategies.

Read the full post to discover how these cutting-edge metrics could shape the future of translation quality assessment and what it means for your localization workflows.

Our pick of localization events

GALA in Valencia

The primary reason MT and GenAI translations often require human edits is the legacy localization tech stack, which works with segments and strings. Konstantin Savenkov, CEO of Intento, gave a brief session at GALA in Valencia to question the norm and help us free language from segments!

WorldSpeak in Toronto

Konstantin Savenkov also participated in a panel, “A Professional Perspective on the Use of AI,” at WorldSpeak in Toronto, organized by The Canadian Language Industry Association.

Together with Maria Sierra from McGill University, Nazanin Azari from Nations Translations, and Oleksandr Pysaryuk from GitLab, he discussed the multifaceted uses of AI, blending academic insights with real-world applications.

Upcoming event

How to boost your global content quality with XTM and Intento

Join our webinar with XTM International on June 12 to learn how integrating XTM Cloud and Intento Enterprise Language Hub optimizes localization workflows, improves translation quality, reduces manual tasks, and minimizes post-editing time.

John Weisgerber from XTM and Vann Maxson from Intento, both solutions gurus, will discuss strategies to enhance source quality for faster localization and integrate non-language teams into localization workflows without excessive resource deployment. Register now!

The latest in artificial intelligence

DeepLearning.AI’s article discusses various ways to give prompts to LLMs.
Google’s new LLMs, Gemini 1.5 Pro and the smaller, faster Gemini 1.5 Flash, are generally available. Both Gemini 1.5 models support a very long context of 1M tokens, and Gemini 1.5 Pro additionally has an experimental mode with a 2M context size. The models are available at Intento.
You can customize Gemini with grounding; see Grounding with Google Search.
Many-shot in-context learning allows using thousands of examples in a single prompt for models with a large context size.
An interesting development: Google proposes Context Caching to save money for repeated requests with very large context.
At Google I/O, the company also announced a new open vision-language model, PaliGemma, and the upcoming Gemma 2 update for the open model Gemma. Gemma 2 with 27B parameters is comparable to the much larger Llama 3 70B model.
Google open-sourced LLM Comparator, an interactive visualization tool for analyzing side-by-side LLM evaluation results.
Cohere For AI announced Aya 23, a new multilingual, generative large language research model (LLM) covering 23 different languages.
An interesting article about AI prompt engineering.
If you’re not sure what temperature means in LLMs, here’s a clear visual explanation.
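
For readers who prefer code to pictures, temperature can be sketched as a divisor applied to the model’s raw scores (logits) before they are turned into probabilities. This is the standard softmax-with-temperature formulation, not something taken from the linked article. The logits below are made up for illustration.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores into probabilities, scaled by temperature.

    Lower temperature sharpens the distribution (more deterministic output);
    higher temperature flattens it (more varied output).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, the probability of the top candidate falls and the alternatives become more likely to be sampled.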