Archives August 2017

My Neural Machine Translation Project – Summary

I recently embarked on the ambitious project of setting up my own neural machine translation engine. This post serves as an overview of the series; links to the individual installments will be added over time.

My Neural Machine Translation Project – Step 0 – Hardware Selection

This is the second installment of a series of blog posts in which I describe my attempt to set up my own neural machine translation engine. In the previous episode, I introduced myself and the project.

For neural network applications, the choice of hardware is just as crucial as the software and algorithms, because training a neural network consists essentially of a huge number of matrix multiplications that are best performed in parallel. This is why the proliferation of dedicated graphics processing units (GPUs), spurred by the demands of modern displays and the popularity of 3D and virtual reality games, made the current advances in artificial neural networks possible. The idea of artificial neural networks is not new; the concept has been around since the 1940s, but nobody could accomplish much of practical value with neural networks until sufficiently powerful GPUs came along. GPUs are built specifically for large-scale parallel computing and matrix multiplications, whereas even the fastest CPUs are optimized for fast serial execution rather than massive parallelism.

But I digress. This post deals with setting up the right hardware for a neural machine translation network. The authors of various open source NMT toolkits recommend a powerful GPU with at least 6 GB of dedicated on-board memory. So it was really a no-brainer when my local nerd store, Fry’s, advertised an Asus gaming PC with an Nvidia GeForce GTX 1070 graphics card with 8 GB of on-board memory. The 1070 sits a couple of rungs below Nvidia’s current flagship, the GTX 1080 Ti, but at roughly half the price it is definitely the most bang for the buck at this time. AMD also makes good GPUs in its Radeon line, but most deep learning toolkits currently rely on Nvidia’s CUDA platform, and besides, the PC package I bought was on sale as an open-box display item, so the price couldn’t be beat. Of course, the Asus BIOS came with its own headaches, so please keep reading if you are interested in the gory technical details of the lengthy setup process that ensued.

The PC came with Windows 10 preinstalled, which is essentially useless for serious computations; all open source neural net toolkits are developed primarily for Linux. Although I have no intention of using the “gaming PC” for gaming and thus have no immediate use for Windows, I decided to keep Windows 10 on the machine, not least because it came preinstalled without an installation medium. So I repartitioned the hard drive, which is surprisingly simple in Windows 10, and installed Ubuntu in a dual boot configuration; instructions on how to do that can be found in abundance on the web. The process was quite straightforward with a USB stick, and all seemed well, until I noticed that the Nvidia card wasn’t using the proprietary Nvidia driver for Ubuntu, but the open source nouveau driver instead. This defeats the purpose of having this high-end graphics card, because nouveau does not support CUDA and thus cannot use the card’s compute features. Here is where the headaches began.
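A quick way to see which driver the card is actually using is lspci, which lists each PCI device along with the kernel driver bound to it. Below is a sketch; the sample output is illustrative (not copied from my machine) and is piped through the same grep filter so the snippet is self-contained:

```shell
# On the real machine, run:
#   lspci -nnk | grep -i -A3 'vga'
# Here, illustrative sample output is piped through the same filter:
printf '%s\n' \
  '01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81]' \
  '	Kernel driver in use: nouveau' \
  '	Kernel modules: nvidiafb, nouveau' \
  | grep -i -A3 'vga'
```

If the “Kernel driver in use” line says nouveau rather than nvidia, the proprietary driver is not active.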

I began by installing the latest Nvidia driver for Ubuntu with the following commands (drop the sudo if you are logged in as root):

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-384

384 is the latest driver version at the time of writing. However, upon rebooting to activate the driver, things started to go awry: Ubuntu would not let me log in and kept returning me to the login screen, no matter what I tried.

The culprit with my particular setup seemed to be the so-called Secure Boot settings in the Asus UEFI BIOS, which strike me as more of a nuisance than a protection. Secure Boot is supposed to prevent the system from loading boot loaders and kernel modules that are not signed with a key trusted by the system manufacturer, even if the code comes from the manufacturer of one of the components. In other words, in my case Asus doesn’t seem to trust Nvidia, whose proprietary kernel module is not signed with a key the firmware accepts. After rebooting, I entered the UEFI BIOS by pressing F2 and accessed the Secure Boot settings. I was unable to disable Secure Boot outright (“Enabled” was simply greyed out), so I chose the option “Other OS” instead of “Windows,” as you can see in the screenshots below. This fixed one problem.

Asus UEFI Bios boot settings

Asus UEFI Bios boot settings detail
However, upon another reboot, I got a screen filling up with endless error messages about a PCIe bus error (pcieport), while the syslog and kern.log files grew until they filled my entire terabyte hard disk and froze the system. Here, a single additional kernel parameter in the grub configuration solved the problem:

  • I went to the command line with Ctrl + Alt + F1.
  • I emptied the syslog and kern.log files that had eaten up my entire harddisk with the following commands:

    sudo truncate -s0 /var/log/syslog
    sudo truncate -s0 /var/log/kern.log

  • I backed up the grub configuration and then edited it as follows:

    sudo cp /etc/default/grub /etc/default/grub.bak
    sudo -H gedit /etc/default/grub

    In gedit, I replaced the line

    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

    with

    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nomsi"

    (Note the straight quotation marks; curly quotes will break the configuration.)

    MSI is short for Message Signaled Interrupts, a mechanism that is supposed to be more robust against system freezes than traditional line-based interrupts. However, the MSI support of specific combinations of hardware is known to be buggy and tends to freeze the system instead of preventing such freezes; my setup of an Asus motherboard with an Intel chipset and an Nvidia GeForce GPU on Ubuntu 16.04 seems to be such a combination. The pci=nomsi parameter tells the kernel not to use MSI on the PCIe bus.

  • I saved the edited grub configuration file and exited gedit. Then I updated grub and restarted the system:

    sudo update-grub
    sudo reboot
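For the record, the same grub edit can be scripted with sed instead of a graphical editor. The sketch below operates on a scratch copy containing the stock Ubuntu default line; on the real system the file is /etc/default/grub, the commands need sudo, and the pattern assumes the line has not been customized:

```shell
# Work on a scratch copy for demonstration; the real file is /etc/default/grub.
grub_file=/tmp/grub.demo
printf 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$grub_file"

# Back up the file, then append pci=nomsi to the kernel command line.
cp "$grub_file" "$grub_file.bak"
sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"$/GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nomsi"/' "$grub_file"

cat "$grub_file"
```

After editing the real file, running update-grub and rebooting applies the change, just as in the steps above.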

I have since run a few GPU-intensive computations (not neural net related), and everything seems fine: no overflowing system log files, no strange PCIe-related errors, and no login or boot issues. Windows 10 also works fine.

Next up: The choice of toolkit — OpenNMT or Google’s TensorFlow? Decisions, decisions…

My Neural Machine Translation Project – Prologue

Lately, when I introduce myself as a translator, or more specifically, as a patent translator, people invariably ask me whether I’m worried that I’ll be replaced by neural machine translation (NMT) in the next few years. Obviously, this being Silicon Valley with its ubiquitous self-driving cars, drones, and robot security guards, I can’t just reply no, point to the latest MT translation error meme making the rounds on social media, and be done with it. Here, a deeper (pun intended) argument is needed. In addition, the European Patent Office (EPO) has announced its new Unitary Patent, which is supposed to reduce translation costs for applicants significantly by replacing many currently mandatory patent translations with machine translations. The Unitary Patent was supposed to go into effect in January 2018; however, it currently looks as if this timeframe will slip.

Meanwhile, my inbox is filling up with more and more offers for post-editing of machine translation output (MTPE). I am not the most efficient editor even when editing translations by human colleagues, because I tend to get sidetracked by matters of style even when the text is excellent. That makes me utterly unsuited for MTPE: I simply lack the patience to deal with nonsensical machine errors. However, in light of all of the above, a plan began to form in my head: I want to set up my own machine translation engine.

While this is certainly very ambitious, it’s not impossible. I have years/decades of background in advanced mathematics (theoretical physics) and computer programming. Furthermore, there are now several open source NMT toolkits on the market, complete with various libraries and discussion forums. Obviously, I could just download one of the toolkits, train the net with various open source corpora and be done with it. But that would be too easy! And not very productive. I want to get to a point where the net is trained well enough so that I can actually use the output in my daily work. I also want to achieve an expert level where I understand how NMT actually works, perhaps to work as an NMT consultant instead of an MTPE slave when the NMT apocalypse descends on the translation world (which is not likely to happen anytime soon). In addition, I will document my experiences on this blog. Since this is a side-project, I can’t promise to blog regularly, because my progress will be highly dependent on my daily workload. I certainly won’t be “live-blogging” due to the inevitable R-rated Austrian expletives that will accompany the programming stage.

I began the journey over a year ago by taking Andrew Ng’s introductory Machine Learning class on Coursera. Andrew Ng is not only the co-founder of Coursera and a Stanford professor, he is also an excellent teacher. The course introduced all the necessary concepts with just the right amount of math (for me as a physicist) and programming (in the numerical computing language MATLAB). I highly recommend this course as an advanced introduction for anybody who is interested in the topic. However, note that Andrew Ng’s excellent course does not cover machine translation. I followed this up with several courses on Robotics (on Coursera) and on Artificial Intelligence (on EdX) at the introductory Master’s level. I even built an autonomously navigating robot, nicknamed Boticelli. While I am far from an expert now, I certainly know more than the average amateur about artificial intelligence and neural nets. I will summarize what I’ve learned so far in a presentation at the 58th Annual Conference of the American Translators Association this fall.

The next steps will be to buy the necessary computer hardware and pick one open source NMT toolkit. Neural nets require dedicated hardware, that is, very high-end graphics processing units (GPUs), because the training phase of neural nets basically consists of huge numbers of matrix multiplications. Dedicated GPUs are capable of performing large numbers of computations in parallel, in contrast to CPUs, which are best used for serial computations. Thus, to set up an NMT engine, a “gaming” PC with a VR-ready high-end graphics card is necessary, because ironically, the computations for virtual reality computer games and the computations for neural nets in serious applications such as translation are quite similar.

But more on that in the next post. Stay tuned!