Category CAT-Tool Tips

The curious case of the “Upwards Arrow With Tip Rightwards” in Trados Studio

A long struggle with a strange character that appeared in Trados Studio just came to a successful conclusion with the help of a wonderful unofficial Trados help group on Facebook. Here is the curious story of the “Upwards Arrow With Tip Rightwards” that had a number of power users stumped.

Once upon a time, I accepted a project to translate a patent from English into German. The source document was sent to me in the form of an innocent-looking Word file. However, after I imported that document into Trados Studio 2017, the trouble began. The document was riddled with strange-looking symbols everywhere; see the two screenshots below.

Strange arrow appearing in hundreds of places in Trados Studio.


Sample source segment in Trados Studio, text is redacted for confidentiality.

I looked for these strange symbols in the source document, to no avail: they were not shown there (and thus could not be searched for or replaced). After some back and forth in the aforementioned user group, I was able to determine that Studio treats these characters as whitespace characters, not as formatting or tags. According to this Wikipedia entry, the symbol itself is a so-called “upwards arrow with tip rightwards,” with the Unicode code point U+21B1. Searching for that code point in Word only resulted in errors (Mac version) or “not found” messages (Windows version). Various transformations and attempts to save the source document in other formats led nowhere. Saving the entire document as plain text and then reimporting it into Word was not an option because of the intricate equations and other formatting that needed to be preserved.

After some more back and forth, thanks to the wonderful colleagues in the user group, we were able to determine that the arrow is Studio’s way of displaying “left to right” bidirectionality marks. Such marks are completely superfluous in this document, which is entirely in English, and the overabundant appearance of these marks at every second word is definitely an error. In the Word for Windows version I was finally able to search for these invisible characters with “^h” and replace them with…nothing! (As an aside, the Mac version only output an error message, saying that “^h” is not a valid search.) Saving the document with the bidirectionality marks removed resulted in a clean document. And I translated happily ever after.
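For readers who prefer a script to search-and-replace, here is a minimal Python sketch of the same clean-up on plain text. It assumes that the stray marks are the Unicode characters U+200E (LEFT-TO-RIGHT MARK) and possibly U+200F (RIGHT-TO-LEFT MARK); I did not verify how Word stores them inside the file itself. The function names and the sample sentence are made up for illustration.

```python
# Assumption: the invisible marks are U+200E / U+200F characters in the text.
BIDI_MARKS = {
    "\u200e": "LEFT-TO-RIGHT MARK",
    "\u200f": "RIGHT-TO-LEFT MARK",
}


def count_bidi_marks(text):
    """Return how often each bidirectional mark occurs in the text."""
    return {name: text.count(ch) for ch, name in BIDI_MARKS.items()}


def strip_bidi_marks(text):
    """Return the text with every mark listed in BIDI_MARKS removed."""
    return text.translate({ord(ch): None for ch in BIDI_MARKS})


# Made-up sample: a mark after every word, as in the problem document.
sample = "A\u200e method\u200e for\u200e producing"
print(count_bidi_marks(sample))
print(strip_bidi_marks(sample))
```

Counting first is a useful sanity check: if the count is zero, the arrows have a different cause and stripping will not help.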

The end…

CAT-Tools “for Dummies”

I hate to disappoint the animal lovers among my readers, but the term CAT-tool has nothing to do with our furry friends. CAT is the acronym for “computer-assisted translation,” which, in turn, is only marginally related to machine translation. Computer-assisted translation simply refers to software or apps that help translators with the technical aspects of a translation so that the translator can concentrate on what matters: the translation. By technical aspects I mean tasks such as the following:

  • Looking up terms in a dictionary or in a library of previously translated text, a so-called “translation memory”
  • Ensuring terminological consistency
  • Copying numbers in the correct formats from the source into the target text
  • Various QA checks that involve numbers, formats, punctuation, etc.
  • Applying the same formatting as the source to the target text
  • And many more tasks that can be automated in a simple way

Every CAT-tool on the market that I know of accomplishes the automation of some or all of the aforementioned tasks by chopping up the source text into smaller pieces, so-called segments. A segment can be one sentence, but depending on the language combination, the nature of the text, and a variety of other factors, a segment can be a smaller or larger unit than one sentence. This segmentation is both a blessing and a curse: a blessing, because it enables the software to perform the aforementioned tasks efficiently; and a curse, because the segmentation can lead to a target text that reads more like a never-ending bulleted list of dictionary entries than a fluid text that’s enjoyable to read. Therefore, depending on the type of text to be translated, CAT-tools can be either indispensable or an overcomplication. CAT-tools are indispensable to ensure a consistent terminology for texts such as patents, manuals, textbooks, and any text with repetitive content such as annual reports and/or where a consistent terminology is absolutely required. On the other hand, using a CAT-tool to translate literary works may not be the best approach.
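To make the idea of segmentation concrete, here is a deliberately naive Python sketch. It is not how any particular CAT-tool segments text; the rule (split after a period, exclamation mark, or question mark that is followed by whitespace and an uppercase letter) and the function name are my own assumptions for illustration.

```python
import re

# Naive rule (an assumption, not any tool's actual rule): a segment ends
# after ., ! or ? when whitespace and an uppercase letter follow.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-ZÄÖÜ])")


def segment(text):
    """Split the text into segments using the naive boundary rule."""
    return [part.strip() for part in _BOUNDARY.split(text) if part.strip()]


print(segment("The device has a lid. The lid is hinged! Is it sealed? Yes."))
# An abbreviation fools the rule and produces a useless segment "Dr.":
print(segment("Dr. Smith arrived."))
```

The second call shows the curse in miniature: a rule that is too simple cuts text in the wrong places, which is why real tools let you adjust their segmentation rules and merge or split segments by hand.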

Which CAT-tool do you recommend?

This is a question that other translators have asked me quite often recently, although I am no CAT-tool expert. I just know how to use a few of them and learned most of the functionality by doing. There are so many tools on the market, in various price ranges, with various functionalities, and for various operating systems, that I cannot really answer this question. In addition, I think the answer depends highly on individual preferences, language combinations, and the type of text.

A good tool should at least have all of the following capabilities:

  • A workflow that fits your specific needs and preferences
  • All required functions for source and target language
  • Ability to handle the source file format
  • Terminology management capabilities
  • Ability to export and import translation memories
  • Ability to perform a spell check, either built into the tool or external (I prefer the latter because I run the Duden spell check for German externally, which is more complete than any other spell check I have found)
  • Ability to perform a QA check, either built into the tool or external

Some people have additional requirements such as the ability to work with voice recognition software etc. The above list is the minimum functionality I expect of a CAT-tool. Of course, another consideration is price, as well as frequency of updates and availability of technical support.

Most tool vendors offer fully functional trial versions. I would therefore recommend downloading the trial versions of several tools and spending half a day or so trying them out on a real text to see which tool works best for you.

Here, I want to again emphasize the issue of segmentation. Some tools segment better than others, and some tools have a better layout than others to help you avoid overly choppy writing. I noticed a marked difference in my style between tools, simply because of the different layouts. Some tools require one editing step more for me after the translation to make the text flow more naturally.

This is definitely something to keep in mind when choosing a tool, because after a while, you will be more or less stuck with that tool. Over the years, I have collected quite a few glossaries and created my own termbases with terms that I researched over many hours in total. These glossaries and termbases are in the format of one specific tool, and exporting and importing these into another tool would require too much effort to make it worthwhile. There is a common format for translation memories (TMX) that makes them easy to transfer from one tool to another. Unfortunately, the same is not (yet?) true for translation projects and termbases, which every tool saves in its own proprietary format. Sometimes termbases can be exported as well. However, more often than not this is only possible for the source and target term pairs, but not for the annotations and categorizations of the terms. That is, the expertise that went into the creation of the termbases, which is what makes them so valuable in the first place, cannot be transferred.
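To illustrate why TMX travels so well between tools: it is plain XML in which each translation unit (`tu`) holds one variant (`tuv`) per language, and each variant holds a segment (`seg`). The following Python sketch reads source and target pairs from such a file. The sample content, the header values, and the function name are made up for illustration, and the sketch ignores inline formatting tags, which real exports often contain.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(tmx_text, source_lang, target_lang):
    """Return (source, target) text pairs for the two language codes."""
    src, tgt = source_lang.lower(), target_lang.lower()
    pairs = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # Older TMX versions used a plain "lang" attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() keeps the text and drops inline markup.
                segs[lang] = "".join(seg.itertext())
        if src in segs and tgt in segs:
            pairs.append((segs[src], segs[tgt]))
    return pairs


# Made-up sample data, not from a real translation memory.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0"
          datatype="plaintext" segtype="sentence" adminlang="en"
          srclang="en" o-tmf="example"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>housing</seg></tuv>
      <tuv xml:lang="de"><seg>Gehäuse</seg></tuv>
    </tu>
  </body>
</tmx>"""

print(read_tmx_pairs(SAMPLE_TMX, "en", "de"))
```

Note that the sketch matches language codes exactly, so a memory tagged "en-US" would need "en-US" as the argument.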

Therefore: Choose wisely, grasshopper!

How to connect to a GLTM for the new Wordfast with an old Wordfast version

If you have an older version of Wordfast lying around but are sent a link to a new gltm (global TM), you have probably found out that the older Wordfast version rejects the link format as invalid. The gltm link format is as follows:
gltm://username:password@url:portnumber/

You can easily make this link compatible with an older Wordfast version, which expects the following format:
wf://username:password@url:portnumber/
In other words, just replace the gltm by wf and you’re all set.
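If you have to convert several links, the replacement can be scripted. The following Python sketch does nothing more than swap the prefix, as described above; the function name is made up, and the placeholder link is the one from this post.

```python
def gltm_to_wf(link):
    """Turn a gltm:// link into the wf:// form for older Wordfast versions."""
    prefix = "gltm://"
    if not link.startswith(prefix):
        raise ValueError("not a gltm link: " + repr(link))
    # Only the scheme changes; user, password, URL and port stay as they are.
    return "wf://" + link[len(prefix):]


print(gltm_to_wf("gltm://username:password@url:portnumber/"))
# prints: wf://username:password@url:portnumber/
```

Replacing only the leading prefix, rather than every occurrence of "gltm", avoids mangling a user name or password that happens to contain those letters.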

Clear all target segments in Trados Studio 2009

Here’s how to clear all target segments in Trados Studio 2009:

  1. Select the first segment of the block that you want to clear by clicking on the number of the segment on the left-hand side as shown below.
    Selection of segments in Trados
    (The text in the segments is masked for confidentiality reasons.)

  2. Go to the last segment of the block that you want to clear and keep the Shift key pressed while clicking the number of this last segment, as illustrated above. The entire block of segments should now be selected.
  3. Right-click and select “Clear Target Segment” from the context menu to clear all selected segments.
    Context menu for selection

  4. Alternatively, you can also select “Translation > Clear Target Segment” from the main menu.
    Main menu – Translation

  5. As a third alternative you can use the keyboard shortcut “Alt+Del” to clear the selected target segments.

Superfluous Tags in Word Documents

Sometimes Word documents contain a lot of extra (hidden) tags that can really hinder the use of a CAT-tool and stop your translation workflow in its tracks. In particular, PDF documents that are converted to Word can have tags between every single word or even letter! The reason for this overabundance of tags is that Word seems to apply certain formatting settings to every single word or letter instead of more globally to entire paragraphs. To get rid of these tags without destroying the formatting of your source document, follow the procedure below. You’ll most likely have to do some reformatting of the translation in any case, because I have yet to see a source and target language pair in which the texts come out at exactly the same length.

The procedure applies to Word 2010, 2007, and 2003. If you have a different version, or a different word processor, there should be an equivalent procedure with equivalent keyboard shortcuts.

  1. Open the source document in Word. It is advisable to save a copy of the original in the unlikely case something goes wrong.
  2. Use CTRL+A to mark the entire text.
  3. Use CTRL+D to open the character formatting dialog box and go to the “Character Spacing” tab (or the equivalent in your Word version) as shown below.
    Font dialog box in Word

  4. Set the Scale to 100%, the spacing and position to Normal, and disable Kerning.
  5. Save the file and try opening it in your favorite CAT-tool. The superfluous tags should have disappeared. If not, proceed to the next step.
  6. If you are in Word 2010 or Word 2007, save the file as a Word 97-2003 document. The step from .docx-format to .doc-format usually removes all additional tags. You may have to do some minor reformatting when you save the final translated document in the original 2010 format.
  7. If you are in Word 2003 or the file is already in Word 97-2003 format, try saving it in a newer format, and then save it again in Word 97-2003 format.
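If you want to check whether the procedure helped before opening the file in your CAT-tool, the following Python sketch counts the formatting runs in each paragraph of a .docx file. It rests on an assumption that I have not verified for any specific tool: that the superfluous tags correspond roughly to the boundaries between runs (the `w:r` elements inside the `word/document.xml` part of the file). The function names and the sample XML are made up for illustration, and the sketch only works for .docx files, not for the older .doc format.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main WordprocessingML elements (w:p, w:r, w:t).
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def runs_per_paragraph(document_xml):
    """Return the number of runs (w:r) found in each paragraph (w:p)."""
    root = ET.fromstring(document_xml)
    # iter() also counts runs nested deeper, e.g. inside hyperlinks.
    return [sum(1 for _ in p.iter(W + "r")) for p in root.iter(W + "p")]


def runs_in_docx(path):
    """Read word/document.xml from a .docx file and count its runs."""
    with zipfile.ZipFile(path) as docx:
        return runs_per_paragraph(docx.read("word/document.xml"))


# Made-up sample: the first paragraph is split into three runs.
SAMPLE = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/'
    'wordprocessingml/2006/main"><w:body>'
    "<w:p><w:r><w:t>Super</w:t></w:r><w:r><w:t>fluous</w:t></w:r>"
    "<w:r><w:t> tags</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Clean paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
print(runs_per_paragraph(SAMPLE))
```

Running `runs_in_docx` on the file before and after the procedure gives a rough comparison: paragraphs with dozens of runs for a single sentence are the likely troublemakers.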