Thoughts on Machine Translation and Post-Editing Analysis

This is a guest post by Jason Hall that he has written in response to my blog article on MemSource and Studio 2011: a side-by-side comparison. Jason is concerned about the post-editing analysis feature in MemSource and its implications for translators:

Post analysis is a fairly new concept which I imagine will become one of the primary selling points for agency licenses for the new generation of CAT tools such as MemSource because it allows agencies to pay translators based on the work actually performed rather than on a snapshot of what the files looked like when they were assigned. For example, if translator A and B are assigned files on Monday, and then translator A translates some common or similar text on Tuesday, and then translator B leverages those TUs through a cloud-based TM on Wednesday, the corresponding repetition discounts would be applied to those segments for translator B. That’s all fine and well, and while we translators may not enjoy seeing our “estimated fees” diminished, it makes good sense, and I think most translators would find the concept to be acceptable.

However, when MT is thrown into the mix, the issue of post analysis suddenly turns into a controversial one. Consider this scenario: a translator uses the MT feature within a CAT tool, making the appropriate edits, and then the post analysis calculates the similarity of the confirmed TU against what was provided by MT and presents the results in the post analysis report. In the words of the MemSource help file, “[the post-editing analysis] analyzes the MT post-editing effort for each segment and compares the machine translation output with the final post-edited translation. Therefore, if the machine translation output was accepted without further editing (the linguist did not need to change it at all), it would come up as a 100% match in the analysis.”
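To make the scenario concrete, here is a minimal sketch of how a per-segment similarity score of this kind could be computed. It is an illustration only: the MemSource help file quoted above does not say which algorithm the tool uses, so the character-level comparison from Python's standard `difflib` module is an assumption, and the function name `post_edit_similarity` is hypothetical.

```python
from difflib import SequenceMatcher


def post_edit_similarity(mt_output: str, final_text: str) -> int:
    """Return a 0-100 score comparing raw MT output with the confirmed segment.

    Assumption: a character-level SequenceMatcher ratio stands in for
    whatever measure a real CAT tool applies.
    """
    ratio = SequenceMatcher(None, mt_output, final_text).ratio()
    # Truncate rather than round, so that only an untouched segment scores 100.
    return int(ratio * 100)


# Hypothetical segment: the linguist fixes the date format in the MT output.
mt = "The contract was signed the 3 of May."
final = "The contract was signed on 3 May."

untouched = post_edit_similarity(mt, mt)    # MT accepted without any edits
edited = post_edit_similarity(mt, final)    # MT lightly post-edited
```

Note that the score only measures how much text changed. It says nothing about the time spent reading and verifying a segment that ends up unchanged, which is the crux of the concern raised below.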

My concern is that including MT in the post analysis seems to suggest that post-editing of MT should be treated the same way as TM leverage, which, of course, is absolutely absurd.

Let’s look at this logically: when I leverage a human TU, Trados, MemoQ, etc. is nice enough to highlight the differences for me, so I might change a date, a name, a verb conjugation, etc. and only quickly glance at the rest of the segment because I know that I, or a trusted colleague, has already checked it. This, in a nutshell, is the justification for repetition discounts: when TUs have already been checked by a human being, less effort is required to translate a similar segment, and so the new segment is discounted. Again, we translators may not want to charge less, but most would agree that this concept is fair and in everyone’s interest, and so it has become a generally accepted practice in the industry.

On the other hand, let’s say I am using the MT feature within a CAT tool. If I make just a few changes to an MT segment, the post analysis may deem the confirmed segment to be 90% similar to the original MT. However, despite the apparent similarity, the effort required to edit MT is entirely different from that required to recycle my own or a colleague’s work in the form of a match extracted from a specially assigned, subject-matter and client-specific TM which has presumably already been checked by one human being and possibly even proofread by a second. In contrast, when leveraging MT, there is obviously no presumption that the results will be relevant, consistent, accurate, etc., and so a translator must focus his or her efforts on the entire segment, not just the changed or new bits. If you are still not convinced that TM and MT leverage are entirely different, just consider what happens when “the machine” gets lucky and a segment requires no change at all. In this case, the post analysis would deem the segment a 100% match, despite the fact that confirming the accuracy and suitability of the segment may still require significant effort by the translator. After all, we are not going to start taking Google’s word for it, are we?

As things stand today, translators should be wary of enabling the MT feature within a CAT tool for jobs to be paid on the basis of post analysis, but what is the long-term solution? Obviously, it would not be in anyone’s interest to turn our backs on a tool that has the potential to drastically increase output and efficiency, but at the same time, it is simply unreasonable to bill MT leverage according to established repetition rates based on human-confirmed TUs. The most obvious solution would be for translators and agencies to establish a completely separate rate structure for calculating MT in post analyses, one which recognizes the effort required by translators to use MT properly while still providing agencies the discounts they need to remain competitive. The table below offers a traditional discount scheme for human TM leverage on the left as well as a proposed discount scheme for leveraging MT in the post analysis on the right.

Traditional repetition discounts            Proposed MT post analysis discounts
% Similarity    % of Rate to be Billed      % Similarity    % of Rate to be Billed
100%            0-30%                       100%            80%
95-99%          50%                         95-99%          85%
75-94%          75%                         75-94%          90%
0-74%           100%                        0-74%           100%
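As a rough illustration of how the two schemes in the table above would play out on an invoice, the sketch below maps a similarity score to the share of the full rate that is billed. All names (`TM_SCHEME`, `MT_SCHEME`, `billed_share`, `segment_fee`) are hypothetical, the per-word rate is an arbitrary example figure, and the traditional 100% band, which the table gives as a range of 0-30%, is fixed at 30% purely for the sake of the example.

```python
# Each scheme lists (minimum similarity in percent, share of full rate billed),
# ordered from the highest band to the lowest.
# Assumption: the traditional 100% band (0-30% in the table) is set to 30%.
TM_SCHEME = [(100, 0.30), (95, 0.50), (75, 0.75), (0, 1.00)]
MT_SCHEME = [(100, 0.80), (95, 0.85), (75, 0.90), (0, 1.00)]


def billed_share(similarity: int, scheme: list) -> float:
    """Return the share of the full rate billed for a given similarity score."""
    if not 0 <= similarity <= 100:
        raise ValueError("similarity must be between 0 and 100")
    for floor, share in scheme:
        if similarity >= floor:
            return share
    raise ValueError("scheme must include a band starting at 0")


def segment_fee(words: int, rate_per_word: float, similarity: int, scheme: list) -> float:
    """Fee for one segment: word count x full rate x billed share."""
    return words * rate_per_word * billed_share(similarity, scheme)


# A 20-word segment confirmed without changes, at an example rate of 0.10 per word.
fee_as_tm_match = segment_fee(20, 0.10, 100, TM_SCHEME)
fee_as_mt_match = segment_fee(20, 0.10, 100, MT_SCHEME)
```

Under these assumptions, the same unedited segment is billed at 30% of the full rate when it comes from a TM and at 80% when it comes from MT, which reflects the extra verification effort argued for above.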

Obviously, the issue of translation rates, whether or not to discount them, and if so, the percentage of such a discount is something that must be carefully considered and negotiated between translators and agencies in each case. However, the mere potential to establish a discount scheme for leveraging MT such as the one above demonstrates two points:

  • First, leveraging MT is entirely different from leveraging TM, and therefore, any discounts applied for MT leverage must be significantly less than those applied for traditional TM matches.
  • Second, it is in fact possible to use MT within a CAT tool in a way that is transparent, in accordance with professional ethics, and in the interest of all of the stakeholders in a translation project.

We translators may not enjoy seeing our rates discounted, but as long as such discounts accurately reflect the actual effort made, we are likely to accept them. Besides, given that cloud-based TM is rapidly being adopted as standard practice for most translation agencies, the concept of post analysis is likely to be adopted as a natural consequence of working in “the cloud”.

Jason Hall is a Spanish-English Linguist living in Cuenca, Ecuador. He holds an MLS in Global Affairs & Translation Studies from the University of Denver, University College and operates the translation service provider Cuenca Translations.


12 Responses to Thoughts on Machine Translation and Post-Editing Analysis

  1. I think I might be more inclined to make my initial distinction between trusted sources and untrusted sources, MT falling into the latter category (and accepting the possibility that either type of source could provide segments that are entirely acceptable with no changes). But that could be a reflection of some of the TMs I’ve seen 🙂

    As my last 2 blog entries show, I’m still a bit uncertain about how much use we should make of MT, but that too could be a reflection of my own circumstances. I certainly agree with the underlying principle that the main thing is to make sure we are properly rewarded for the work done.

    • Jason Hall says:

      A very good point about “trusted” and “untrusted sources”. After all, standard rep. discounts are based on the assumption that the TM is of good quality, relevant to the job, etc., which of course is not always the case. I’m sure we have all had proofreading jobs where the quality of the translation turns out to be less than ideal and additional hours are required, but has that ever happened to you with a TM? What I mean is, have you ever accepted rep. discounts based on the assumption that the TM is of high quality, only to find that it is poor quality, irrelevant, wrong target dialect, etc.? In that case, those 95% matches will take you much longer to process than the rep. discount would suggest.

      I have a question for the proofreaders out there. Do repetition discounts paid to the translator influence your proofreading rates? For example, Agency A pays translators for 100% repetitions, and so translators presumably deliver a thoroughly reviewed, consistent source, whereas Agency B pays 0 for 100% reps, and so its translators do not check those segments and deliver work that is more likely to contain false matches and consistency issues. Isn’t not paying for 100% matches just another way of transferring work from the translator to the proofreader?

      • Hi Jason – yes, indeed, it is previous experience with dodgy TMs that led me to propose my alternative nomenclature. I’ve certainly worked with some TMs that were as hard to work with (in terms of needing to edit (part-)matches) as MT output.
        Whereas I never agree to proof/correct anything without seeing both source and target text first, so I’m usually able to agree to a reasonably accurate price without having to estimate on the basis of expectation.

  2. David Canek says:

    Jason, thanks for an informative article. By the way, I completely agree with you that it would be absurd to apply translation memory net rates to MT post-editing. That’s why we keep the TM and MT leverage as separate categories in the MemSource post-editing analysis.

    One of the goals for developing the post-editing analysis has been to provide some objective measure of the post-editing effort instead of an agency saying to a freelancer: It’s been machine pre-translated, we will cut your translation rate by 30%….

    I do hope that the post-editing analysis will bring more transparency into how much editing was needed to turn MT into an actual translation. I think in some years from now the post-editing analysis will become as common as the translation memory analysis as we know it today.

  3. jasonjameshall says:

    I have to admit that when I first came across the MemSource feature of including MT in post analysis, I assumed that there would be a tendency, or at least a temptation, for agencies to apply a discount structure similar to traditional rep. discounts, which is of course controversial and is what opened up this whole can of worms in the first place.
    I feel that the industry is still adjusting to (or reeling from) the impact that machine translation has had on the way we perform and bill translation work in the last 5 years or so. While some agencies have established policies for MT, many do not address the issue directly but do accept its use indirectly by exerting pressure for translators to lower rates, so there is a need to discuss this issue openly in order to work towards some sort of consensus or standard practice in the industry.
    I agree that calculating rates based on post analysis will likely become that standard practice, at least for large agencies and large jobs that require several translators to work on a given project simultaneously.

  4. Pingback: Weekly favorites (Dec 17-23) | Adventures in Freelance Translation

  5. Pingback: What Is the Difference between Post-editing and Proofreading? | Multilizer Translation Blog

  6. There are good arguments on both sides about how to compensate translators for post-editing based on automated PE analysis. The fact is, though, that this subject is contentious and may lead to arguments after the work is done and the money is about to change hands. It is best to agree on a fixed price before the work is started, and to send a PO to the translator with a specific sum. This usually spares any fights about money and everyone can be happy.

  7. christopherjyrkilord says:

    This is a very interesting discussion. I am unusual in that having tried the various options, ten years ago I realised that post-editing pretranslated text is much faster than line-by-line translation, and I have mostly done that. It means I am much better at post-editing than most people. I started using a Word Macro called Wordfisher written by a Hungarian programmer and translator which allowed you to build your own general-purpose TM which would be applied within Word with no need for another CAT tool. But rather than the whole-sentence strings typical of TM-based systems, I found that the most efficient thing was to translate the common words and short phrases that make up the majority of most sentences, and then use search and replace to deal with the special terminology and other material that was left in the source language. It took a year or so to develop the wordlists for the languages I do, and I found that some languages and some contexts worked better than others, so that for instance Spanish and Italian legalese was very difficult to do, due to paragraph-length sentences with grammatical structures that can’t just be transposed into English. I never used Google, due to the inaccurate results that we all know about, but the thing is that Google Translator Toolkit has changed the whole situation radically, and I don’t think agencies even understand this, as most project managers are not translators themselves. SDL is offering bad and obsolete MT when the corpus-based system of the Toolkit is much better. So I can do 10,000 words a day of good clean translation: but the problem is that the whole industry now demands the generation of mostly badly corrupted TMs, so I have with a heavy heart now switched to using Trados Studio. This immediately cuts my productivity in half, with no benefits for the quality of the translation.
    It seems to me that a much better use of translator time is to improve a global corpus of translation with fine editing, as this will help keep up with the ever-growing demand for translations. But the business has other objectives of course. Anyone got any thoughts?

  8. Hi,
    I think there is a huge difference between Google Translate and a trained MT system like Moses or Microsoft Hub. When you train a system for a specific client and domain the results you get from the MT are very relevant and I do not see why an MT matrix cannot be the same as a TM matrix. After all there is a considerable investment from the agency to develop a trained MT system. Anyway, it is a free market out there…

    • Great responses!

      Michael, you make a very good point about trained or custom built MT systems being much more competent than Google Translate. In fact, it makes the debate a little tougher to argue because the increased amount of human input implied in such a system does narrow the gap between TM and MT, but in my opinion, a significant gap remains, with the difference being the amount of human input. While a proper TM has had the segments written, or at the very least reviewed, by a human being, in the case of a trained MT system, despite the fact that there has been more human input in creating the rules or defining the context, the segments are still being created by the “machine” and are susceptible to all of the errors that this implies. Given that such segments have not yet been reviewed by human eyes, in my opinion, they are not worthy of the same rep. discounts that TM segments, or “recycled human translation”, are.

      “it is a free market out there…”

      You are absolutely right!

      One only has to observe the downward pressure that MT has placed on translation rates over the last 5-10 years! Translation outsourcers are certainly free to try to negotiate discounts based on MT, and we can’t deny that attitudes towards this issue are likely to change drastically in the coming years. Just recall what attitudes towards MT were ten years ago!!

  9. Pingback: Thoughts on Machine Translation and Post-Editing Analysis | jjhall.net
