
Handouts for ATA59 – An Introduction to Neural Machine Translation

I will be giving another presentation at the upcoming ATA Annual Conference in New Orleans, ATA59, jointly in the SciTech and Language Technology tracks. The presentation will give an introduction to neural machine translation. My talk is preliminarily scheduled for the very last time slot on Saturday before the final keynote. I hope to see you there, despite the late hour!


“The end of the human translator,” “nearly indistinguishable from human translation” – these and similar headlines have been used to describe neural machine translation (NMT). Most language specialists have probably asked themselves: How much of that is hype? How far can this approach to machine translation really go? How does it work? The presentation will use one of the available open-source NMT toolkits as an illustrative example to explain the underlying concepts of NMT and sequence-to-sequence models. It follows in the spirit of last year’s general introduction to neural networks, which is summarized in the accompanying handouts.


I have just uploaded the handout for the presentation onto the ATA server. The material is a slightly updated version of my blog post on neural networks, which summarizes my presentation at ATA58. You can download the handout here.

Workshop: How to Create a Professional Website with WordPress

I recently presented this workshop on website creation with WordPress, which was organized by the Northern California Translators Association.


Themes, posts, pages, widgets, plug-ins, SQL databases, SSL, … oh my! In this workshop, you will learn not only what these terms mean but also how to build a professional, responsive website from scratch with WordPress, one of the most popular solutions for this task. Attendees can follow along on their computers and optionally create their own (free, with limitations) site on

Prerequisites: No experience with WordPress, website programming, or design is required, but attendees should have some basic familiarity with computers and their setup. Those who want to follow along and start building their site during the workshop (optional) should bring a device with internet connectivity, an installed browser, and a keyboard (tablets will probably work, too).

The slides for the first part of the workshop are available here.

My Neural Machine Translation Project – Overview over Open-Source Toolkits – updated Dec 2017

Updated: December 2017

Before deciding on a toolkit, I needed to get an overview of the various open-source neural machine translation (NMT) toolkits available at the time of writing (September 2017). Below, I summarize the features of each toolkit from my point of view. Note that this summary does not cover open-source machine translation (MT) toolkits based on a statistical approach, such as Moses. The summaries mainly reflect the impressions I got after lurking on the various support forums and discussion groups for a while.

The big kahuna – TensorFlow

Provided by: Google (TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.)
Language: Python (main API), with APIs also available for C, Java, and Go, although the latter seem to have somewhat less functionality
Architecture: Since TensorFlow is a whole framework, both recurrent and convolutional neural networks are available.
White paper: Large-Scale Machine Learning on Heterogeneous Distributed Systems, M. Abadi et al., Nov. 9, 2015
Support: Stack Overflow for technical questions; a Google group (what else?) for higher-level discussions about features etc., although some technical questions are also discussed in the Google group; and a blog announcing new features and tutorials
Summary: TensorFlow is a large-scale, general-purpose open-source machine learning toolkit, not necessarily tailored for machine translation, but it does include tutorials on vector word representations, recurrent neural networks, and sequence-to-sequence models, which are the basic building blocks for a neural machine translation system. TensorFlow also provides various other neural network architectures and a vast number of features one could play around with for language learning and translation. Definitely not a plug-and-play system for beginners.
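
To make the idea of vector word representations a bit more concrete, here is a minimal sketch in plain Python. It deliberately does not use TensorFlow. The three-dimensional vectors and the names TOY_VECTORS, cosine_similarity, and most_similar are invented for illustration; real systems learn vectors with hundreds of dimensions from large corpora.

```python
import math

# Toy 3-dimensional word vectors. The values are made up for illustration;
# they are NOT learned from data and do not come from any toolkit.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Return the other word whose vector points in the most similar direction."""
    return max(
        (w for w in vectors if w != word),
        key=lambda w: cosine_similarity(vectors[word], vectors[w]),
    )


print(most_similar("cat", TOY_VECTORS))  # dog
```

The point of the sketch is only that words become lists of numbers, and that "similar meaning" can then be measured as "similar direction". A recurrent or sequence-to-sequence model would take such vectors as its input.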

The more user-friendly one – OpenNMT

Provided by: Harvard University and Systran
Language: Lua, based on the Torch framework for machine learning; there exist two “light” versions using Python/PyTorch and C++
Update: As of December 2017, the main Lua version is accompanied by a full-fledged Python version, based on the PyTorch framework, and a version based on the TensorFlow framework.
Architecture: Recurrent neural network
White paper: OpenNMT: Open-Source Toolkit for Neural Machine Translation, G. Klein et al., Jan 10, 2017
Support: a very active discussion forum (where, among other people, Systran’s CTO is very involved)
Summary: More suited for machine learning beginners, although the choice of the programming language Lua, which is not that widely used, may be a bit of a hurdle. Update December 2017: Since there are now two other versions, based on Python and TensorFlow, this should no longer be an issue. End update. On the other hand, there are lots of tutorials and step-by-step instructions, and some of the questions asked in the forum are indeed quite elementary (and I’m far from an expert!). Thus, if one wants to play around with inputs (that is, well-chosen corpora!) and various metrics and cost functions for the output, this is the toolkit to choose. In machine translation systems, input and output are just as critical as the architecture itself, if not more so, because the old adage “garbage in – garbage out” is particularly true for neural networks and thus also for neural machine translation systems. Therefore, it may make more sense for linguists and translators to approach the machine translation problem from the angle of the input (corpora) and output (translation “quality” metrics), instead of getting lost in the architecture and the code.
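
As an illustration of what an output metric can look like, here is a minimal sketch of clipped n-gram precision, the quantity that the well-known BLEU score builds on. It is not the full BLEU score and it is not code from OpenNMT; the function names and the example sentences are assumptions made up for this sketch.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams (as tuples) of a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it appears in the
    reference ("clipping"), so repeating a correct word does not help.
    Tokenization is a plain whitespace split, which is a simplification.
    """
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


# Hypothetical example sentences.
reference = "the cat sits on the mat"
good = "the cat sits on a mat"
bad = "the the the the the the"

print(round(clipped_precision(good, reference, n=1), 2))  # 0.83
print(round(clipped_precision(bad, reference, n=1), 2))   # 0.33
print(round(clipped_precision(good, reference, n=2), 2))  # 0.6
```

Even this toy version shows why the choice of metric matters: the score depends entirely on the reference translation, so a perfectly acceptable translation that happens to use different words is penalized.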

The newer kid on the block – Nematus

Provided by: University of Edinburgh
Website: Not really a website, but the project, plus documentation and tutorials, is here on GitHub.
Language: Python, based on the Theano framework for machine learning
Architecture: Recurrent neural network
White paper: Nematus: a Toolkit for Neural Machine Translation, R. Sennrich et al., Mar 13, 2017
Support: a Google group
Summary: This is the third kid on the block, not as active as the other two above. Like OpenNMT, it is a toolkit only for language translation, as opposed to the general-purpose TensorFlow framework. It uses the better-known Python as opposed to Lua, which would be an advantage, at least for me, over OpenNMT. However, the user base does not seem quite as extensive or active as OpenNMT’s. Thus, at the time of writing, Nematus seems to be an option to keep in mind, but not necessarily the first choice.

The brand new kid on the block – Sockeye

Provided by: Amazon
Website: The main website is here. A tutorial on how to use Sockeye has also been published on Amazon’s AWS (Amazon Web Services) AI blog.
Language: Python, built on the Apache MXNet framework for machine learning
Architecture: Recurrent neural network
White paper: SOCKEYE: A Toolkit for Neural Machine Translation, F. Hieber et al., Dec 15, 2017
Support: Aside from the website with documentation and a FAQ, there is the general AWS Discussion Forum.
Summary: The newest open-source NMT toolkit is geared towards advanced users who are also familiar with the AWS and MXNet setup. On the other hand, as with Google’s TensorFlow, there are many available architectures and advanced options, and therefore many more options for experimentation.

Another big one – Fairseq

Provided by: Facebook
Website: The GitHub repository is here. A tutorial on how to use Fairseq has also been published on Facebook’s code blog.
Language: Lua, built on the Torch framework for machine learning
Architecture: Convolutional neural network
White paper: Convolutional Sequence to Sequence Learning, J. Gehring et al., May 8, 2017
Support: A Facebook group (what else?), and a Google group.
Summary: This is another open-source toolkit for advanced users: for one, because it is also based on Lua, a more esoteric language than Python; for another, because the intended user base seems to be advanced researchers, not curious end users. It is also fairly new.

My Neural Machine Translation Project – Summary

I recently embarked on the ambitious project of setting up my own neural machine translation engine. This post serves as the overview page for the related posts, which will be added over time.