Category Archives: tools


5 tips to ensure the success of machine translation

Machine translation has been around for more than 50 years now. Like any other technology, it has evolved, and it is now widely used in many different scenarios.

However, machine translation is more than entering text in an online service like Google Translate, clicking a button and waiting for the translation to appear. That is just the raw translation. Machine translation can be more useful, more efficient and much more accurate if some basic requirements are met, both before and after translation.

Translate neutral texts

Texts with lots of cultural references are difficult for humans to translate. Imagine how hard it is for a machine that has no context to take into account! Machine translation therefore works better with texts that make no reference to culture, traditions, religion, politics or TV, and that contain no plays on words. In our experience, it works better with technical texts.

The more specific, the better

Although you may get decent results with general MT systems like Google Translate, which mix translations from all types of domains (legal, medical, food, tourism, IT…), MT works best with a system that has been built from texts belonging to the same topic (domain). This way, the terminology used will be that of the topic.

Number of existing translations for that domain

Although some rules are applied to MT systems, data is still the most important source of content. The more translated texts that already exist for that topic, the better the results you will get. Very often, fully human translations are used to create the engine, so that it is populated with enough quality content to help the system translate automatically.

Post-Editing

Machine translation will not work well if you do not fix the errors: without corrections, you cannot retrain the system and improve the quality of the translation. Depending on your needs, you can put more or less effort into improving those translations, but you need to do at least the minimum so that MT fulfils its purpose, which is delivering better and better translations in less time.
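To get a feel for how much effort post-editing takes, you can compare the raw MT output with the post-edited version. The following is only a minimal sketch in Python, not the metric used by any particular MT system: the function name `post_edit_effort` is our own, and the score is a simple word-level similarity from the standard `difflib` module.

```python
from difflib import SequenceMatcher


def post_edit_effort(mt_output: str, post_edited: str) -> float:
    """Return a rough 0..1 score of how much the post-editor changed the MT output.

    0.0 means the MT output was accepted unchanged; 1.0 means no words
    could be matched between the two versions. Comparison is word-based.
    """
    mt_words = mt_output.split()
    pe_words = post_edited.split()
    # ratio() is 2 * matched words / total words in both versions.
    similarity = SequenceMatcher(None, mt_words, pe_words).ratio()
    return 1.0 - similarity


# Hypothetical segment pair, for illustration only.
print(post_edit_effort("The file is open", "The file is opened"))
```

If the score for a domain goes down over successive retraining rounds, the engine is probably improving. Treat it as an indicator only, since a single changed word can matter more than a whole rewritten sentence.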

Preparing the source text

As said before, the MT system lacks most of the context you have as a translator, and it translates what it sees. If the source contains a wrong word, the system will translate it anyway, and the translation will not make sense. The error is just as wrong in the source language, but a human reader will easily notice it and infer the correct word. So always run the spell checker, whether you are using machine translation or not.

The same happens with punctuation. A comma in the wrong place can change the meaning of a sentence, and therefore the translation will also be different.

Use short sentences. This is one reason why we think MT works better with technical texts than with marketing texts: long sentences full of subordinate clauses may be difficult for the MT system to translate.

Keep your terminology consistent. Again, this is something you will find more often in technical texts. If you keep changing the terminology, the system will not know which term to use in each instance, and you will have to do more post-editing work to fix those errors.
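Some of these checks can be automated before the text is sent to MT. Here is a minimal sketch in Python. The 25-word threshold, the function names and the sample term list are all assumptions made for illustration, and the sentence splitter is deliberately naive.

```python
import re
from collections import defaultdict

MAX_WORDS = 25  # hypothetical threshold; tune it for your language pair and engine


def long_sentences(text: str, max_words: int = MAX_WORDS) -> list[str]:
    """Return the sentences that contain more than max_words words."""
    # Naive split after ., ! or ? followed by whitespace; abbreviations will confuse it.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(s.split()) > max_words]


def inconsistent_terms(text: str, variants: dict[str, list[str]]) -> dict[str, list[str]]:
    """For each preferred term, list the unwanted variants that appear in the text."""
    lowered = text.lower()
    found = defaultdict(list)
    for preferred, alternatives in variants.items():
        for alt in alternatives:
            # Whole-word, case-insensitive match of the unwanted variant.
            if re.search(r"\b" + re.escape(alt.lower()) + r"\b", lowered):
                found[preferred].append(alt)
    return dict(found)


# Hypothetical source text and term list, for illustration only.
source = "Click the log-in button, then sign in."
print(inconsistent_terms(source, {"log in": ["log-in", "sign in"]}))
print(long_sentences(source))
```

A check like this does not replace the spell checker or a human review. It only points the content creator to the sentences and terms that are most likely to cause extra post-editing work.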

At Jensen Localization we have MT systems in place that can be tailored to your needs. If you take all these tips into account when ordering your MT project, it will be easier and faster to build an MT system for you, so that you can start managing your translations more efficiently.

You can also contact us to let us know about your translation needs.

Summary of 2014

For new visitors and for those who do not want to check our archives month by month, we are now offering you a selection of articles posted in 2014, so that you can get a quick update.

During 2014, our blog addressed issues that are important to our clients, such as how they can save money on translation.

In the articles Misconceptions about Localization Costs for Companies I and II, we explain in detail how you can save on translation costs by applying simple procedures.

You may be interested in sharing these posts with your content creators before starting your translation adventure.

We have also talked about procedures that may not be so well known to our clients, but that are quite important before a translation job can be considered finished. In October, we talked about the testing phase, which usually takes place after a website has been translated.

In a global world and a global crisis, all companies want to increase their sales. To do so, they step up their marketing efforts. And as they want to sell abroad, marketing becomes international marketing. And guess what? Translation is an extremely important tool in international marketing! For this reason, in March we published an interesting article about the relationship between Market Entry Strategies and Localization.

For translators, we published in November an interesting article about how to prepare your files for translation with SDL Trados Studio and make sure that you only translate what you need. Your clients will also appreciate it, since this also prevents charging for text that is not actually to be translated.

Finally, we also published an article about our new service, the Controlled TM+MT Environment, available for companies that do their translations in-house but do not have a system in place for keeping consistency and reusing translations.

We hope to continue writing about interesting topics in the translation, localization and interpreting industry. Is there any topic of interest to you that you would like us to research and write about? Feel free to propose it in the comments section or contact us!

The Neverending Story of Word Counts

Do you remember Michael Ende’s book, The Neverending Story? Or, like me, do you better remember the theme song by Limahl?

Whenever I have to prepare a quotation for a client and I ask them for the source files, I enter the fantasy world of word counts, which is full of fantastic creatures that can make your word count as big as Falkor, the luckdragon, if the files are not prepared correctly.

Our last adventure in the fantasy world of word counts took place quite recently.

We got a quotation request for the translation of a website, and we asked the client for the source files. The client exported the website into individual XML files, and we analyzed them to get a word count. We used Trados Studio for that, and we got a word count that was a very nice starting point, but which we knew was not realistic: more than 60,000 words.

The file was not prepared correctly for translation, so we had to prepare it ourselves. We needed to create a configuration file that Trados would use to know what is translatable and what is not. If you are a translator, follow these steps to learn how to do it. If you are a client, just skip to the end of the article, and you will be happy to know what the word count will be after all these steps.

When you create the Project in Trados Studio, you will reach a point where you have to select the files to translate. Before doing that, go to the File Types option:

(Screenshot: File Types)

When you click on File Types, you will see a list of all file types supported by Trados Studio. However, as I mentioned, we want to create our own file type, based on the files we are going to translate.

Just click on New and select the desired type. In our example, we are going to select XML.

(Screenshot: Select Type)

Follow the instructions of the wizard and choose whether you want to create the XML file type based on default settings or based on settings from an existing settings file. In our example, we are going to select the second option, and we will browse to select one of the translation files:

(Screenshot: Create File Type)

The Parser Rules dialog box will now appear. This is where we need to select what is translatable and what is not. Just go through the list of rules and double-click on each of them to change its status from Translatable to Not Translatable where needed.

(Screenshot: Parser Rules)

Once you are done, your file type will appear in the list of file types.

(Screenshot: Project File Type Settings)

And when you add the files to translate to the project, they will all appear under the file type you created.

(Screenshot: New Project)

By doing this, when you analyze the files, your word count will be much more realistic, as it will only take into account those strings that need translation.
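The effect of marking elements as Not Translatable can be illustrated outside Trados Studio. The following Python sketch is not how Trados Studio works internally. It only counts the words in an XML file twice: once for all elements, and once for a hypothetical set of translatable elements. The element names and the sample file are invented for the example, and text that follows a child element (mixed content) is ignored for simplicity.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set: element names whose text should be translated.
TRANSLATABLE_TAGS = {"title", "body"}


def count_words(xml_text: str, translatable: set[str] = TRANSLATABLE_TAGS) -> tuple[int, int]:
    """Return (words in all elements, words in translatable elements only)."""
    root = ET.fromstring(xml_text)
    total = 0
    billable = 0
    for element in root.iter():
        # Only the element's own leading text is counted, split on whitespace.
        words = len((element.text or "").split())
        total += words
        if element.tag in translatable:
            billable += words
    return total, billable


# Invented sample file, for illustration only.
sample = """<page>
  <id>home page 001</id>
  <title>Welcome to our shop</title>
  <style>font bold 12</style>
  <body>Browse the catalogue and order online</body>
</page>"""

print(count_words(sample))  # (16, 10)
```

Even in this tiny sample, identifiers and style information inflate the word count. In a full website export, the difference can be much larger.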

In our example, we moved from a word count of more than 60,000 words to around 19,000 words. It makes a big difference to our income, but if we had not done it, the client would either not have accepted our quotation or might have paid far too much.

Also, translating segments full of XML code that must not be touched can be really annoying, so the translators would end up spending more time than expected.

However, even if we have sorted out this problem and managed to provide clients with an accurate word count that matches what they need to translate, we still need to face another issue: how can we convince clients that they should send the full source files for quotation? Have you managed to get them? Tell us your strategies in the comments!

Related articles:

  • Why do repetitions have to be included in a text to translate?
  • Don’t touch my source files!
  • When the file to translate is sent “as is”
  • The Localization Project. Part 3: Creating the Source

Survey on CAT tools: Results

We recently conducted a survey about CAT tools through SurveyMonkey. The survey ran from July 24th 2013 until October 16th, and participants had to answer 7 different questions about CAT tools.

The main purpose of this survey was to learn how translators feel about CAT tools. We wanted to learn whether they are really helpful or just imposed by the industry.

As usually happens in this type of survey, the replies depend very much on the actual experience of the translator. CAT tools are mainly used in the localization industry, so it is not strange to find literary or marketing translators rejecting them, but we have to say that this is not always the case.

We hope that this article helps translators who are considering CAT tools to make a decision on their purchase, but we also think that tool developers should have a look at the results and consider them before launching a new release of their tools.

The first question we asked was: What do you value the most of a CAT tool (excluding costs)? Here is the result.

(Chart: value of a CAT tool)

The majority (95.45%) agreed that the ability to use a translation memory is one of the most valuable things about CAT tools, when costs are excluded. Other valuable things mentioned by respondents were the compatibility with file formats and the QA tools included in them.

In the second question, What do you miss in a CAT tool that no developer has provided yet?, we found a wider variety of answers. However, we noticed that some of them mentioned features that already exist in some CAT tools, such as automated translation. As some of the respondents claimed, tools may be so extensive that they contain many options translators do not use, and we wonder whether this prevents translators from finding the ones they would really use. The features mentioned by respondents were:

  • A failsafe way to preview XML files in a useful way, without having to rely on a style sheet.
  • Most current tools are too extensive: they simply offer too many options or take a long time to start up.
  • Good spelling features.
  • Better QA tools.
  • Ability to pick words from any text or website and easily add them to a MultiTerm-type of termbase.
  • The functionality to search through multiple files at the same time.
  • Automatic translation.
  • Automatic creation of term lists.
  • Smooth terminology management.
  • Easier concordance.
  • Intuitive UI.
  • Non-bloated interface.
  • Standardized file formats or better compatibility with file formats (not having to convert back and forth).

As mentioned in our introduction, CAT tools were created for use mainly in the localization industry. However, we were surprised to see that some translators even used them to translate books. To the question Do you think CAT tools can be used for any type of text?, 63.64% of participants answered Yes and 36.36% answered No.

In the following questions we wanted to know which tools were most appropriate for each type of text. We asked translators to mention their favorite tools for translating software strings, documentation and websites/online help files.

In all three questions, SDL tools were the preferred ones. However, it was interesting to see the increasing preference for other tools such as Catalyst and WorldServer for software localization and MemoQ and Wordfast for document translation. See below the results for each question:

Which is your best tool for translating software strings?

(Chart: best tool for translating software strings)

Which is your best tool for translating documentation?

(Chart: best tool for translating documentation)

Which is your best tool for translating websites/Online help?

(Chart: best tool for translating websites/online help)

From what we have seen so far, translators find CAT tools useful, but are prices in line with the features they offer? Most participants felt that the tools are too expensive, especially if you need more than one, but some felt that the price was fine for MemoQ and Wordfast. They also mentioned that the tools offer too many features they will never need, and that this makes them too expensive.

After analyzing the results, we think that translators like CAT tools, use them and will continue using them. However, we also feel that tool developers are not fully taking into account the actual needs of translators when launching new releases of their tools. Are they maybe more focused on the needs of engineers or big companies with big localization departments?

We know that some tools have corporate versions and freelance versions, which are simpler and cheaper. But do these freelance versions cover translators’ needs? What about interoperability? We hear a lot about interoperability, but does it really exist?

We hope that, whether you are a tool user or a tool developer, this article gave you a better understanding of how the different CAT tools compare and of what should be improved in CAT tools in general.

Even though the survey is now closed, you are welcome to give your opinion on CAT tools in the comments section. We would be happy to hear from you!

Survey on CAT tools

In the localization industry, we all know about translation tools (usually known as CAT tools): features, prices, compatibility issues… but we have not read that many articles about the actual opinions of their users, translators all around the world.

We have created a very short survey to learn what translators think of CAT tools. We would appreciate getting as many replies as possible, and we will share the results with you in an article that we will publish on 24th October.

The survey will remain open until 16th October. Click on the link to access the survey on translation tools.

We thank you in advance for helping us with this research. With this post, The Jensen Localization Blog goes on holiday. We hope you have a nice summer, whether you are working, having fun or just relaxing!

We will come back in September with more articles and news about languages, translation and localization. Jensen Localization, however, will remain open throughout the summer, so do not hesitate to contact us should you have any translation requirements.

Have a nice summer!
