In the real world, Kyiv has won out over Kiev on Wikipedia and in the major style guides, but most Machine Translation engines aren’t there yet.
Russia invaded Ukraine on February 24, 2022, and the war has been at the forefront of all our minds ever since. You may not know that a linguistic battle has also been running since Ukraine’s independence from the USSR: how to write the name of Ukraine’s capital city in English.
In 1804, the name was romanized as “Kiev,” after the Russian “Киев,” in John Cary’s “New Russia: Journey from Riga to the Crimea by way of Kiev.” A spelling based on the Ukrainian name first appeared as “Kyiw” in the “Geographical Dictionary of the Kingdom of Poland” in 1883. In 1995, Ukraine introduced the version “Kyiv,” after the Ukrainian “Київ,” and it finally made it into the Oxford Dictionary in 2018.
With the Russo-Ukrainian war dominating headlines, the Kyiv vs. Kiev dispute has been gaining more attention. Last week it was covered by major international publications, including The Washington Post, The Independent, and Multilingual Magazine.
At Intento, we work with multiple Machine Translation platforms, and many of our customers rely on those tools for real-time automated translation. So we decided to check what our customers see when they use the different MT systems.
We chose the sentence “Russia attacked Ukraine, Kyiv under siege” so as not to trigger any historical context. To check both directions (Ukrainian > English and Russian > English), we translated it into Ukrainian (“Росія напала на Україну, Київ в облозі.”) and Russian (“Россия напала на Украину, Киев в осаде.”). Then we translated both versions back into English with the most popular MT engines that support these language pairs.
We found that only Google uses “Kyiv” in both directions. Amazon, ModernMT, and PROMT produce “Kyiv” when translating from Ukrainian but fall back to the Russian-derived “Kiev” when translating from Russian. The other MT systems use “Kiev” in all cases.
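For readers who want to repeat this kind of check on their own content, here is a minimal sketch of the comparison step in Python. It assumes you have already collected the English output of each engine for each source language. The function name `capital_spelling`, the engine label `engine_a`, and the sample strings are hypothetical placeholders for illustration, not actual engine output from our test.

```python
import re

def capital_spelling(translation: str) -> str:
    """Report which spelling of the capital appears in an English MT output."""
    has_kyiv = re.search(r"\bKyiv\b", translation, re.IGNORECASE) is not None
    has_kiev = re.search(r"\bKiev\b", translation, re.IGNORECASE) is not None
    if has_kyiv and has_kiev:
        return "mixed"
    if has_kyiv:
        return "Kyiv"
    if has_kiev:
        return "Kiev"
    return "neither"

# Placeholder outputs keyed by (engine, source language); illustrative only.
samples = {
    ("engine_a", "uk"): "Russia attacked Ukraine, Kyiv under siege.",
    ("engine_a", "ru"): "Russia attacked Ukraine, Kiev under siege.",
}

# Build a small report: which spelling each engine used per direction.
report = {key: capital_spelling(text) for key, text in samples.items()}
for (engine, source_lang), spelling in report.items():
    print(f"{engine} ({source_lang} > en): {spelling}")
```

The word-boundary match keeps derived words such as “Kievan” from being counted as the city name.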
We have written this post to support the #KyivNotKiev campaign and to nudge MT providers to adjust their models. In the meantime, we suggest that our customers use MT glossaries, either natively supported by the MT platforms or implemented universally for all MT via Intento, to ensure the correct spelling.
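To illustrate what a glossary entry achieves, the sketch below applies a one-term substitution to the English output after translation. This is only an approximation for illustration: real MT glossaries are typically applied by the engine during translation, and the names `GLOSSARY` and `apply_glossary` are hypothetical rather than part of any platform’s API.

```python
import re

# Hypothetical single-entry glossary: unwanted spelling -> preferred spelling.
GLOSSARY = {"Kiev": "Kyiv"}

def apply_glossary(text: str, glossary: dict = GLOSSARY) -> str:
    """Replace whole-word glossary terms in already translated English text."""
    for unwanted, preferred in glossary.items():
        # \b limits the match to whole words, so "Kievan" is left untouched.
        text = re.sub(rf"\b{re.escape(unwanted)}\b", preferred, text)
    return text

print(apply_glossary("Russia attacked Ukraine, Kiev under siege."))
```

A blunt replacement like this also rewrites fixed expressions that contain the old spelling, which is one reason to prefer the glossary features built into the MT platforms where they are available.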