Use case: Data-driven optimization of LLM/MT translation

Measure, analyze, and optimize the performance of your MT and LLM systems – for better translations, more efficient processes, and maximum cost-effectiveness.

By measuring the average Levenshtein distance and post-editing time alongside Translation Quality Estimation (TQE), you always know which content can be released directly after machine translation and which still needs to be revised by your linguists. At the same time, you can use the measured data to continuously optimize your AI and LLM language resources through RAG, terminology integration and customized prompting. The interplay between AI optimization, Translation Quality Estimation and real post-editing data creates a solid foundation for continuously evaluating and further improving the quality, efficiency, and cost-effectiveness of your MT and LLM workflows.

The challenge

Many companies and language service providers nowadays use different MT and LLM systems to accelerate translation processes. The real challenge, however, is not to generate translations, but to objectively evaluate and further optimize their quality and cost-effectiveness.

Which engine delivers the best results? Which content causes the most post-editing effort? For which clients and texts can terminology databases, machine translation engines or client-specific language models achieve the greatest impact?

To answer these questions, companies need reliable data on the actual quality of automatic translations and the effort involved in post-editing them. And this is exactly where translate5 comes in.

translate5 supports the optimization of machine translation

translate5 tackles the challenge with analyses implemented by default

The system automatically collects relevant key figures at segment level throughout the entire translation process, such as post-editing time, Levenshtein distance and Translation Quality Estimation (TQE).

Levenshtein distance shows the extent to which a pre-translated segment has been modified by the linguists. Post-editing time measures the actual effort required for this work. Translation Quality Estimation assesses the expected quality of individual segments or entire documents even before post-editing. It shows project managers and linguists which segments need more attention, which need less, and which can be confirmed directly.
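To make the metric concrete, here is a minimal sketch of a character-level Levenshtein distance in Python. It is an illustration only: the function name and the example sentences are assumptions, and translate5 may calculate or normalize the value differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical pre-translated segment and its post-edited version
mt_output = "Press the botton to start."
post_edited = "Press the button to start."
print(levenshtein(mt_output, post_edited))  # 1
```

A distance of 0 means the linguist accepted the pre-translation unchanged. The higher the value, the more characters had to be touched.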

The combination of these key figures provides a solid basis for the analysis and optimization of MT and LLM workflows and the language resources used in them. What is particularly valuable is the ability to use the results to refine RAG configurations, prompting strategies, terminology integration, and customer-specific AI models. The effects of such optimizations can then be measured objectively and compared with each other on the basis of TQE, Levenshtein distance and post-editing time.

A possible translation and analysis workflow

You want to pre-translate technical documentation with the help of an LLM, reduce the manual effort to a minimum and prepare the content in translate5 with the appropriate language resources.

After pre-translation, translate5 automatically scores each segment using Translation Quality Estimation. Your project management can then filter for all segments with a quality rating of at least 95 % and approve them directly. If required, these segments can also be locked so that they cannot be changed manually later in the process. Only segments with lower quality values are assigned to linguists for post-editing.
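The filtering step can be pictured as a simple threshold split. The following sketch uses a hypothetical Segment class and assumes TQE scores on a scale from 0 to 100. It does not use the translate5 API, where this step is carried out in the application itself.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical structure for illustration, not a translate5 data model
    segment_id: int
    target: str
    tqe_score: float  # assumed scale: 0 to 100
    locked: bool = False


def triage(segments, threshold=95.0):
    """Split segments into directly approved ones and ones for post-editing."""
    approved, to_post_edit = [], []
    for segment in segments:
        if segment.tqe_score >= threshold:
            segment.locked = True  # protect approved segments from manual changes
            approved.append(segment)
        else:
            to_post_edit.append(segment)
    return approved, to_post_edit


segments = [
    Segment(1, "Schalten Sie das Gerät ein.", 97.0),
    Segment(2, "Drücken Sie die Taste.", 95.0),
    Segment(3, "Ziehen Sie den Stecker aus.", 80.0),
]
approved, to_post_edit = triage(segments)
print([s.segment_id for s in approved])      # [1, 2]
print([s.segment_id for s in to_post_edit])  # [3]
```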

Once editing is complete, translate5 also analyses the average Levenshtein distance and post-editing time across all relevant segments. On the one hand, this shows which segments actually caused a high level of effort and whether the quality assessment was reliable. On the other hand, the figures indicate how well suited the language resources used for pre-translation are to the text type in question and how they could be optimized through skilful prompting.
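One simple way to check whether the quality assessment was reliable is to compare the averages for segments with higher and lower TQE scores. The sketch below assumes the key figures have been exported as plain records. The field names, the sample values and the band threshold of 90 are all assumptions made for this example.

```python
from statistics import mean

# Hypothetical export of post-edited segments (all field names are assumptions)
records = [
    {"tqe": 93, "distance": 0, "seconds": 4},
    {"tqe": 91, "distance": 3, "seconds": 12},
    {"tqe": 72, "distance": 18, "seconds": 41},
    {"tqe": 65, "distance": 25, "seconds": 55},
]


def summarize(rows):
    """Average Levenshtein distance and post-editing time for a list of rows."""
    return {
        "avg_distance": mean(r["distance"] for r in rows),
        "avg_seconds": mean(r["seconds"] for r in rows),
    }


def summarize_by_band(rows, threshold=90):
    """Summaries for segments at or above the TQE threshold and below it."""
    high = [r for r in rows if r["tqe"] >= threshold]
    low = [r for r in rows if r["tqe"] < threshold]
    return {
        "high": summarize(high) if high else None,
        "low": summarize(low) if low else None,
    }


overall = summarize(records)
bands = summarize_by_band(records)
print(overall)
print(bands)
```

If the segments in the higher band show clearly lower averages than those in the lower band, as in this sample data, the estimation agreed with the real effort. If the two bands look similar, the threshold or the estimation itself deserves a closer look.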

Furthermore, those segments with a high Levenshtein distance can be checked again for terminology that might have been missing in the terminology database in order to update it accordingly.

Draw versatile insights from the analysis results

The insights from the analysis

The data obtained enables a detailed assessment of the language resources used.

For example, it is possible to determine that certain LLMs deliver particularly good results for marketing texts, while technical content benefits above all from consistent terminology, RAG-based knowledge enrichment and optimized prompts. The analyses not only show you which language resources deliver the best results, but also which AI optimizations actually lead to a measurable increase in quality.

It is also possible to identify segments which, despite a high quality rating, require an above-average amount of post-editing time. Common causes are:

missing, inconsistent or niche terminology
subject-specific formulations
product names or protected terms
technical content with code blocks or commands
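Such segments can be found by combining both key figures. The following sketch flags records with a high quality rating whose post-editing time lies above the average. The field names, the sample values and the threshold of 90 are assumptions made for this example.

```python
from statistics import mean


def find_slow_high_quality(rows, tqe_threshold=90.0):
    """Return the ids of segments that were rated highly by TQE but took
    longer to post-edit than the average segment. Expects a list of dicts."""
    average_seconds = mean(r["seconds"] for r in rows)
    return [
        r["id"]
        for r in rows
        if r["tqe"] >= tqe_threshold and r["seconds"] > average_seconds
    ]


# Hypothetical sample data
rows = [
    {"id": 1, "tqe": 96, "seconds": 5},
    {"id": 2, "tqe": 94, "seconds": 60},
    {"id": 3, "tqe": 70, "seconds": 20},
    {"id": 4, "tqe": 92, "seconds": 15},
]
print(find_slow_high_quality(rows))  # [2]
```

The flagged segments are good candidates for a manual check against the causes listed above.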

Especially for technical documentation, it is possible to analyse the impact of protected terms, Content protection rules or terminology databases on translation quality and post-editing effort.

Find out how you can optimize your models

The various optimization options

On the basis of these findings, you can continuously improve your translation processes as well as the underlying AI and language resources. Changes to prompts, RAG data sources, terminology databases or client-specific models can be tested in a targeted way and evaluated on the basis of objective quality and efficiency metrics.
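A before-and-after comparison of this kind can be sketched as follows. The example assumes that the same content was processed once with a baseline configuration and once with a changed one, and that the key figures of both runs are available as records. All names and values are hypothetical.

```python
from statistics import mean


def compare_runs(baseline, candidate, metrics=("distance", "seconds")):
    """Relative change of each averaged metric from baseline to candidate.
    Negative values mean that the candidate configuration lowered the metric."""
    changes = {}
    for metric in metrics:
        base = mean(r[metric] for r in baseline)
        cand = mean(r[metric] for r in candidate)
        # A relative change is undefined when the baseline average is zero
        changes[metric] = (cand - base) / base if base else None
    return changes


# Hypothetical key figures before and after a prompt change
baseline = [{"distance": 20, "seconds": 40}, {"distance": 10, "seconds": 20}]
candidate = [{"distance": 12, "seconds": 30}, {"distance": 6, "seconds": 15}]

changes = compare_runs(baseline, candidate)
print(changes)
```

In this sample data, the changed configuration lowers the average Levenshtein distance by 40 % and the average post-editing time by 25 %. For a meaningful comparison, both runs should cover comparable content and a sufficiently large number of segments.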

Possible measures are:

selection of the best-suited MT or LLM resource for specific content types
enrichment of existing terminology databases
introduction of Content protection rules for product names or code, and, if necessary, automatic conversion of character and/or number strings into predefined target language patterns
optimization of quality-based decisions based on Translation Quality Estimation
targeted post-editing of only the genuinely critical segments
training of client-specific MT or LLM models as well as continuous optimization of RAG configurations and prompting strategies based on real quality and usage data

Recognize the changes after you have optimized your models

The conclusion and your benefits

By combining Translation Quality Estimation, Levenshtein distance and post-editing time, you obtain an objective view of the actual performance of your translation systems for the first time.

Instead of making decisions based on individual samples, translate5 enables you to analyse, compare and continuously optimize MT and LLM workflows on a data-driven basis. This reduces the amount of post-editing required, improves translation quality and enables linguistic resources to be deployed in a targeted manner where they create the greatest added value.

Try it yourself

Contact us to explore translate5 together. We will support you from the very first steps and show you how translate5 can be seamlessly integrated into your workflows.