My Neural Machine Translation Project – Prologue

My Neural Machine Translation Project – Prologue

Lately, when I introduce myself as a translator, or more specifically, as a patent translator, people invariably always ask me whether I’m worried that I’d be replaced by neural machine translation (NMT) in the next few years. Obviously, this being Silicon Valley with its ubiquitous self-driving cars, drones, and robot security guards, I can’t just reply no, point to the latest MT translation error meme that is making the rounds on social media and be done with it. Here, a deeper (pun intended) argument is needed. In addition, the European Patent Office (EPO) announced their new Unitary Patent, which is supposed to reduce translation costs for applicants significantly, whereby many now mandatory translations of patents are to be replaced by machine translations. This new Unitary Patent was supposed to go into effect on January 2018; however, it currently looks as if this timeframe will be delayed.

Nevertheless, my inbox is also beginning to fill up more and more with offers for post-editing of machine translation output (MTPE). I am not the most efficient editor when I am editing translations by human colleagues, even when the text is excellent, because I tend to get sidetracked by matters of style. Thus I am utterly unsuited for MTPE, because I simply lack the patience to deal with nonsensical machine errors. However, in light of all of the above, a plan began to form in my head: I want to set up my own machine translation engine.

While this is certainly very ambitious, it’s not impossible. I have years/decades of background in advanced mathematics (theoretical physics) and computer programming. Furthermore, there are now several open source NMT toolkits on the market, complete with various libraries and discussion forums. Obviously, I could just download one of the toolkits, train the net with various open source corpora and be done with it. But that would be too easy! And not very productive. I want to get to a point where the net is trained well enough so that I can actually use the output in my daily work. I also want to achieve an expert level where I understand how NMT actually works, perhaps to work as an NMT consultant instead of an MTPE slave when the NMT apocalypse descends on the translation world (which is not likely to happen anytime soon). In addition, I will document my experiences on this blog. Since this is a side-project, I can’t promise to blog regularly, because my progress will be highly dependent on my daily workload. I certainly won’t be “live-blogging” due to the inevitable R-rated Austrian expletives that will accompany the programming stage.

I began the journey over a year ago by taking an introductory class by Andrew Ng on Machine Learning on Coursera. Andrew Ng is not only the co-founder of Coursera and a Stanford professor, he is also an excellent teacher. The course introduced all the necessary concepts with just the right amount of math (for me as a physicist) and programming (in the symbolic language MATLAB). I highly recommend this course as an advanced introduction for anybody who is interested in the topic. However, note, that Andrew Ng’s excellent course does not cover machine translation. I followed this up with several courses on Robotics (on Coursera) and on Artificial Intelligence (on EdX) at the introductory Master’s level. I even built an autonomously navigating robot, nicknamed Boticelli. While I am far from an expert now, I certainly know more than the average amateur about artificial intelligence and neural nets. I will summarize what I’ve learned so far in a presentation at the 58th Annual Conference of the American Translators Association this fall.

The next steps will be to buy the necessary computer hardware and pick one open source NMT toolkit. Neural nets require dedicated hardware, that is, very high-end graphics processing units (GPUs), because the training phase of neural nets basically consists of huge numbers of matrix multiplications. Dedicated GPUs are capable of performing large numbers of computations in parallel, in contrast to CPUs, which are best used for serial computations. Thus, to set up an NMT engine, a “gaming” PC with a VR-ready high-end graphics card is necessary, because ironically, the computations for virtual reality computer games and the computations for neural nets in serious applications such as translation are quite similar.

But more on that in the next post. Stay tuned!

Digiprove sealCopyright secured by Digiprove © 2021 Carola F Berger

Carola F Berger


Carola F. Berger is a German-English patent translator with a PhD in physics and a master’s degree in engineering physics. She is ATA certified for translation from German into English and from English into German and owns and operates CFB Scientific Translations LLC. Carola serves as webmaster on the NCTA Board of Directors and is also the Administrator of ATA’s Science and Technology Division.

Leave a Reply

Your email address will not be published. Required fields are marked *