Enterprise content teams requiring high-quality translation services have traditionally insisted that agencies translate from scratch - no post-editing from machine translation.
GlobalDoc has translated for teams at IBM, Tenneco, Xerox, Toshiba and other high-profile enterprise clients for decades, and it shared their concerns about machine translation.
At the same time, the massive growth in content and languages generates constant motivation to use advances in technology, and to automate more.
As a technology-centric language service provider, GlobalDoc had followed innovations in machine translation over the decades, but could not confidently recommend to clients a machine-driven translation solution that could deliver both quality and immediate savings.
GlobalDoc needs to deliver perfect quality and be fair and transparent to both clients and translators as it introduces new technologies.
Off-the-shelf machine translation APIs that power consumer-focused translation apps do not work well on high-visibility and sensitive enterprise content like corporate communications, marketing materials, multimedia, web UI strings and technical manuals. Professional human translators have to spend significant effort post-editing the machine output to maintain the final quality within traditional translation workflows.
GlobalDoc clients need to know in advance what the cost of translation will be, to optimize content and language coverage according to their budget. But machine translation quality can vary wildly. In order to share the efficiencies of machine translation with clients and translators predictably and transparently, the GlobalDoc operations team needs to accurately forecast how much machine translation will assist the translator’s work on each document.
The total volume of orders varies across clients, business units within clients and project element types. But for integrating machine translation, proper setup - starting with know-how about machine translation providers, formatting, security, customization and quality control - is required for each client, project and language pair.
As the COVID-19 pandemic hit societies around the world in Q1 and Q2 2020, enterprises felt their businesses and their own operations severely disrupted. They faced pressure to quickly disseminate critical emergency information in many languages and to launch new businesses.
Meanwhile, many employees were suddenly working from home without access to corporate systems, and many translators were also struggling in their professional and personal lives.
“We were already aggressively looking for ways to offer more automation and cost efficiencies to our clients, and the pandemic accelerated the absolute necessity to succeed in doing this seemingly overnight.”
— Michael Cooper, Founder and CEO, GlobalDoc
GlobalDoc was fortunate to be fully remote-capable and quickly shifted more resources to the development of working solutions that fulfilled these unprecedented requirements.
Machine translation quality estimation is an open research topic, including inside technology companies like Amazon, Google, Facebook and Microsoft that have formidable machine learning research teams.
Quality estimation:
Automatic methods for estimating the quality of neural machine translation output at run-time, without relying on reference translations 1
Deep-learning approaches, based on massive multilingual language models, are gaining ground for instantly predicting both segment-level quality metrics, like post-editing effort, and aggregate metrics - document- or project-level evaluation. 2
GlobalDoc CEO Michael Cooper assessed the landscape and ranked ModelFront, with its translation risk prediction API and console, as the leading provider of production-strength solutions for quality estimation and evaluation.
GlobalDoc and ModelFront partnered to integrate ModelFront technology into GlobalDoc’s translation management system, LangXpert, and share know-how on use cases and translation technology.
With GlobalDoc’s guidance, ModelFront tunes the accuracy for GlobalDoc use cases, develops support for the required project and document formats and integrates customizable machine translation from the most suitable providers.
The partners present the solution as an option to GlobalDoc clients, who are able to preview real results behind the scenes - actual machine translation and risk prediction output - on recent projects before they choose the post-editing option for future projects.
Let’s look at how it worked on a real project from Q3 2020 where the client, a Fortune 500 company in the automotive sector, selected the option for full human post-editing from machine translation.
10 instruction manuals as Adobe InDesign® documents (IDML format)
2595 segments (14181 words)
English (United States) to German (Germany)
1095 exact translation memory (TM) matches and 1500 new segments
GlobalDoc’s workflow parses the documents into segments - meaningful units of text like titles or sentences - and invokes ModelFront for custom machine translation and translation risk prediction on the 1500 new segments for which there was no exact translation memory match from previous projects. The machine translation has been customized based on the translation memory.
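The split between exact translation memory matches and new segments can be sketched as follows. This is a minimal illustration, not GlobalDoc's or ModelFront's actual implementation: the function name `split_by_tm` and the representation of the translation memory as a plain dictionary from source text to target text are assumptions made for the example.

```python
def split_by_tm(segments, translation_memory):
    """Separate segments with an exact TM match from new segments.

    segments: list of source-language strings (hypothetical input format).
    translation_memory: dict mapping source text to its stored translation
    (an assumed, simplified stand-in for a real TM).
    """
    matched = []  # (source, stored translation) pairs, reused as-is
    new = []      # sources that would go to machine translation and risk prediction
    for source in segments:
        if source in translation_memory:
            matched.append((source, translation_memory[source]))
        else:
            new.append(source)
    return matched, new
```

In the project above, a split of this kind would yield 1095 matched segments and 1500 new ones, and only the new ones would be sent on for machine translation and risk prediction.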
For this project, many of the segments have very low predicted risk - the estimated probability that the segment will need to be post-edited, even by one character.
The system factors in not just the quality of the machine translation, but also the inherent difficulty and quality of the source content. The aggregate score is length-weighted to account for actual post-editing effort.
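A length-weighted aggregate can be sketched as a weighted average of the segment-level risks. The source does not specify the unit of length or the exact formula, so this example assumes character count as the weight; the function name `aggregate_risk` and the input format are hypothetical.

```python
def aggregate_risk(segments):
    """Length-weighted average of segment-level risk.

    segments: list of (text, risk) pairs, where risk is a probability
    between 0 and 1. Weighting by character count is an assumption;
    word count would work the same way.
    """
    total_length = sum(len(text) for text, _ in segments)
    if total_length == 0:
        return 0.0  # no content, so no post-editing effort to estimate
    weighted = sum(len(text) * risk for text, risk in segments)
    return weighted / total_length
```

With this weighting, a long risky sentence moves the aggregate more than a short risky title, which matches the intuition that longer segments take more effort to post-edit.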
The dashboard also has a preview where the segments are sorted by predicted risk and labeled by error type so that the project manager can drill down into the actual text in a targeted way.
With all this information, the GlobalDoc project manager knows that the professional human translator will be able to review and approve many segments - a significant speedup, which is passed on to the client as savings.
GlobalDoc provides a quote to the client and to the translators that reflects the actual difficulty tier before the project begins. After the project is delivered with the final post-edited translations, the post-editing data is used to evaluate the risk prediction system and continuously improve its accuracy for future projects.
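One way to evaluate risk predictions against post-editing data is the Brier score, the mean squared difference between the predicted probability and the observed outcome. The source does not say which metric is used, so this is only an illustrative choice; the function name `brier_score` and the record format are assumptions. Following the definition of risk given earlier, a segment counts as edited if the final translation differs from the machine translation by even one character.

```python
def brier_score(records):
    """Mean squared error between predicted risk and the actual outcome.

    records: list of (predicted_risk, machine_translation, final_translation)
    tuples (hypothetical format). Lower is better; 0.0 is a perfect score.
    """
    if not records:
        raise ValueError("at least one record is required")
    total = 0.0
    for risk, machine_translation, final_translation in records:
        # Outcome is 1.0 if the translator changed anything, else 0.0.
        edited = 1.0 if machine_translation != final_translation else 0.0
        total += (risk - edited) ** 2
    return total / len(records)
```

A score computed per client, language pair or content type could show where the predictions are least reliable.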
This approach was successfully applied at GlobalDoc over Q3 and Q4 2020 to projects along the full spectrum of language pairs and content types, from film subtitles to a literary novel.
“The feedback from both our clients and our translators is consistently positive.”
— David Jett, Vice President of Operations, GlobalDoc
The success of integrating machine translation and risk prediction depends on multiple criteria.
GlobalDoc clients require excellent quality. Because they generally also provide their original documents in excellent quality and in a style consistent with their previous projects, customized machine translation performs better. It’s also important to segment documents in a way that’s optimal for machine translation.
GlobalDoc has maintained translation memories for clients over decades. Customization of machine translation mainly depends on the size and quality of the client’s translation memories.
Not all content is a good fit for post-editing from machine translation. It depends on the document type, the language pair and the data available for customization. Different machine translation options have different strengths and weaknesses. The number, features and language and locale support of machine translation service providers are constantly expanding. 3
From the project- to the word-level, cutting-edge technology can provide valuable inputs to human experts who make the final decisions and maintain the final quality of the traditional translation workflow.
You can read the full case study, Integrating Machine Translation and Risk Prediction to Achieve Cost Savings (PDF), from GlobalDoc.
You can contact GlobalDoc for more information or to order full human post-editing for your premium content.
GlobalDoc and ModelFront are actively exploring more ways to automate and to apply risk prediction to more use cases.
1. Quality Estimation Task, Fifth Conference on Machine Translation, EMNLP 2020
2. Seven Machine Translation Trends in 2020, Maxim Khalilov, TAUS
3. An overview of the features and limitations of the major machine translation APIs, ModelFront