Scientists and corporate organizations have been working on the development of artificial intelligence since the 1950s. Considerable progress has also been made in the field of machine translation. This was made possible by more powerful hardware and new technical developments such as machine learning and neural networks, which emulate the human brain in a simplified way. But this development has also been facilitated by the enormous amount of high-quality bilingual content now available in electronic format, which serves as training material for translation engines based on artificial intelligence algorithms.
Founded in Cologne in 2009 under the name Linguee, DeepL caused a sensation in 2017 with its free DeepL Translator service, whose underlying neural networks produce texts of hitherto unmatched quality. This has been confirmed by both automated evaluation metrics and blind tests carried out with professional translators. Since 2018, paid DeepL Pro subscriptions have also been available, along with an API that lets developers integrate DeepL into other systems. DeepL can now also be integrated into common CAT tools such as Across, memoQ and SDL.
What is behind DeepL?
DeepL, which is designed to handle any kind of text, has been under continuous development since 2016 using machine learning and innovative language processing technologies. According to the company, a billion translations – consisting of bilingual sentences collected by Linguee’s web crawler on the Internet – were used to train DeepL. Even before DeepL was launched, it was possible to research individual expressions in context in Linguee’s online application. All hits offered by the web crawler are based on the bilingual texts of companies, authorities and the EU available on the Internet.
A technical text from the field of corporate finance was chosen to put DeepL through its paces (see illustration below). DeepL’s output impresses with its fluid text: the sample text is completely free of clumsy grammatical mistakes, and the subject of the text – a company’s liquidity ratios and liabilities – is immediately apparent.
The results from Google Translate are less convincing: one incorrectly translated technical term immediately catches the eye of technically savvy readers. Google has transmogrified the financial term “total liabilities” (German: Gesamtverbindlichkeiten) into “joint and several liability” (German: Gesamthaftung), which is used in legalese. However, the actual reference is to the company’s entire financial liabilities, as DeepL correctly recognized.
On closer examination, however, it becomes clear that even DeepL has failed to translate all technical terms correctly. The terms “current ratio” and “quick ratio” – translated by DeepL into German as “aktuelle” and “schnelle Kennzahlen” – are ratios for different degrees of liquidity. However, the terms are usually used untranslated in German banking and accountancy as “Current Ratio” (or cash liquidity) and “Quick Ratio” (or acid test). So, without post-editing by a specialist translator and without the correct specialist terminology, machine-translated texts are not very convincing for a demanding specialist audience. Many similar examples can be found on the Internet. Sometimes the errors are so subtle that they can only be detected by experts.
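A first-pass terminology check of this kind can be partly automated before the text reaches the post-editor. The following is only a minimal sketch in Python: the function name `check_glossary`, the glossary contents and both example sentences are assumptions made up for illustration (the German sentence is not actual DeepL output), and simple substring matching will miss inflected forms.

```python
def check_glossary(source, translation, glossary):
    """Return (source term, approved target term) pairs where the source
    term occurs in the source text but the approved target term is
    missing from the translation. Matching is a case-insensitive
    substring test, so inflected forms are not recognized."""
    source_lower = source.lower()
    translation_lower = translation.lower()
    return [
        (src, tgt)
        for src, tgt in glossary.items()
        if src.lower() in source_lower and tgt.lower() not in translation_lower
    ]


# Hypothetical client glossary (English term -> approved German term).
glossary = {
    "current ratio": "Current Ratio",
    "quick ratio": "Quick Ratio",
    "total liabilities": "Gesamtverbindlichkeiten",
}

# Invented example sentences, for illustration only.
source = "The current ratio and the quick ratio are compared with total liabilities."
translation = (
    "Die aktuellen und schnellen Kennzahlen werden mit den "
    "Gesamtverbindlichkeiten verglichen."
)

missing = check_glossary(source, translation, glossary)
for src, tgt in missing:
    print(f"'{src}' should be rendered as '{tgt}'")
```

A check like this only flags candidates. Deciding whether a flagged term is really wrong in context remains a job for the specialist translator.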
Do machine translation engines, such as DeepL, make technical translators superfluous?
That depends entirely upon such factors as the type of text, the target group and purpose of communication, all of which must be taken into account when deciding for or against machine translation. Demanding target groups, such as investors or doctors, are unlikely to be convinced by poorly translated brochures or leaflets. However, a machine translation may be sufficient for other types of text, for example, those not intended for publication or source texts that have been optimized for machine translation through the use of rule-based writing.
Another important factor is the specific language combination in question, as the output quality can vary depending on the source and target languages: the structure and grammar of the respective languages play a role, as do the available volume and quality of the bilingual texts with which the translation algorithms are trained. The results are also influenced by the subject area concerned, as there are certain specialized topics for which the available bilingual training material is extremely scant.
Where man is superior to machine
Machine translation delivers results that are good enough for certain purposes, yet when it comes to advertising texts involving puns, technical texts replete with technical jargon or texts that are flawed or ambiguous, the human mind still trumps artificial intelligence.
Familiarity with technical jargon
The vocabulary used in neural-network-based machine translation is usually limited to the 50,000 to 80,000 most common words, so certain technical terms may not be available at all; these are referred to as “OOV (out of vocabulary) words”. Here, too, professional translators, with their knowledge of the relevant specialist jargon, are superior.
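How a capped vocabulary produces OOV words can be illustrated with a toy example. This is a simplified sketch in Python, not a description of how DeepL or any specific engine builds its vocabulary: the function names, the tiny corpus and the cap of three words are all assumptions chosen for illustration.

```python
from collections import Counter


def build_vocabulary(corpus, max_size):
    """Keep only the max_size most frequent words of the training corpus."""
    counts = Counter(
        word for sentence in corpus for word in sentence.lower().split()
    )
    return {word for word, _ in counts.most_common(max_size)}


def find_oov_words(sentence, vocabulary):
    """Return the words of a sentence that are not in the vocabulary."""
    return [word for word in sentence.lower().split() if word not in vocabulary]


# Toy training corpus, invented for illustration only.
corpus = [
    "the company reports the total liabilities",
    "the company reports the liquidity",
    "the ratio of the company",
]

# A real engine would cap the vocabulary at tens of thousands of words;
# three is used here so that the effect is visible on a tiny corpus.
vocabulary = build_vocabulary(corpus, max_size=3)
oov = find_oov_words("the company reports the quick ratio", vocabulary)
print(oov)  # ['quick', 'ratio']
```

Rare specialist terms are the first to fall below the cut-off, which is why technical texts are disproportionately affected.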
Understanding the logic of a text
Source texts often contain logical errors or are based loosely around a central theme that is not easy to understand. Even such a simple omission as leaving out the word “not” can render a text self-contradictory. To be able to accurately reproduce a source text in the target language, translators focus closely on the meaning of texts and are apt to notice such inconsistencies and refer them back to the client for clarification when necessary.
Processing flawed texts
Source texts and datasets can contain errors. English-language texts, in particular, are produced all over the world, often by non-native speakers, and can sometimes be difficult to interpret. But even a simple typo can be enough to baffle a translation engine. Translators, on the other hand, with their finely honed cognitive skills, can interpret such flawed text passages and take a view on when to refer to the customer for guidance.
Creativity in translation
Certain types of texts, such as advertising and marketing content, need to be adapted to foreign markets, which requires creativity on the part of the translator based on his or her in-depth knowledge of the target culture, groups and language.
Dealing with ambiguity
Most languages include words that can have different meanings depending on the context. Whilst DeepL recognizes that the word “liabilities” as used in financial texts should be translated as “Verbindlichkeiten” in German, other translation engines translate it as “Haftung”, which is a German term used specifically to refer to liability in the legal sense. Human translators, on the other hand, are always aware of the context and select the appropriate terminology.
The same applies to the use of synonyms, which, whilst often used to enhance literature, are frowned upon in certain types of texts. An experienced technical translator will recognize the fact that “device”, “system” and “instrument” in the text sample below all have the same referent and may well advise the client to standardize the terminology. The issue is simply ignored in the machine translation.
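Spotting inconsistent terminology of this kind can be supported by a simple script once a translator has identified which terms share a referent. The sketch below is in Python and rests on several assumptions: the function name `find_term_variants`, the synonym group and the sample sentences are invented for illustration, and the script cannot decide by itself whether two words really mean the same thing.

```python
import re


def find_term_variants(text, synonym_group):
    """Count how often each term of a synonym group occurs in the text.
    Only terms that actually occur are returned."""
    words = re.findall(r"[a-z]+", text.lower())
    return {term: words.count(term) for term in synonym_group if term in words}


# Invented sample text in which three words refer to the same object.
sample = (
    "Switch on the device. The system starts automatically. "
    "Clean the instrument after use."
)

# Synonym group supplied by the translator (an assumption of this sketch).
variants = find_term_variants(sample, ["device", "system", "instrument", "unit"])
needs_standardizing = len(variants) > 1

print(variants)
print(needs_standardizing)
```

If more than one variant is found, the translator can recommend a single preferred term to the client.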
Find more information about DeepL and similar translation software here.