Baseline vs Building – It’s Neural Machine Translation Journey, additional Q&A


Dominick Kelly’s session on neural machine translation was one of the most popular at LEO’s 5th International Virtual Conference. Here he answers some of the questions that we didn’t have time to cover live.

Q: If “the picture of reality can be strange”, could mechanical translation be fiction and face-to-face more real world? 

A: In some cases, neural machine translation engines insert words to make the output more readable, even if those words do not exist in the source text. In other words, the MT makes up what it thinks is right for the translation based on its understanding of the world as it knows it. The more details, data and metadata neural machine translation engines are exposed to, the better they become. Because the translations produced are much better than they were before, humans need to be smarter to find and fix the problems that remain.

Humans will always be more real world for me, but as AI systems learn and mature, the gap between machine and mind will be harder to distinguish. Hence, business in the future will require smarter human linguists to provide support.   

Q: Dominick, just curious: which languages was that giant machine translating at the time?  

A: Here is the link to the NTEU site, which has all the information regarding the world’s largest neural machine translation engine-building project.

Q: Can machines learn how to recognize accent?  

A: Accent has more to do with synthetic voice systems than with neural MT. But the answer is yes: these technologies can learn accents. However, they require hundreds of hours of voice recordings to do so. These are relatively new technologies and are still some way off being as widely used as machine translation.

If you are asking whether machine translation engines can learn a specific style, the answer is yes, but only if you have trained an engine with content that is translated in that style. When customized, NMT engines should pick up on the data used and try to reproduce its style.

Q: What sources are you using for translation if not Google?  

A: As I work at KantanAI, I use the KantanAI engines for machine translation and customize them for clients and use cases. But there are hundreds of MT systems available online today.

Q: Which MT tools / custom engines are free?  

A: There are many “free” systems on the market. However, you will often find that a free system means you trade your data instead of parting with your cash. When you translate using such a system, the content you upload can be reused and repurposed. This is how the Google Translate system works, for example.

These free technologies are not GDPR compliant, so I would not suggest using them in a professional role unless an end client has given formal sign-off to do so. Paid systems such as Google AutoML, KantanAI and Systran are the professional tools that are GDPR compliant.

Q: Great illustration, Dominick! What do you suggest to us if ever we venture to study translation for beginners? What platform is best to start with? 

A: Well, I am a little biased as I worked for XTM for about 8 years, so I am a bit of a fan of this tool as I helped design some of its features. Generally speaking, you should think of CAT tools just like the real-life tools we keep in our shed: some work better for one job and others work better for another. So, I would say:

  • All-in-one TMS systems with CAT tools: XTM, Memsource; others: Wordbee, Smartling;
  • App Localization Tools: Crowdin, Phrase, Cloudwords; 
  • Website Translation Tools: Easyling, Weglot, Transifex; 
  • Traditional CAT Tools: SDL Trados, MemoQ, STAR’s Transit NXT; 
  • Free: Swordfish, Matecat, OmegaT 

Q: Can we say that this type of Custom MT engine can understand and engage in the unique and specific culture of the industry that we are working for?  

A: A custom neural MT system, when supplied with the correct volume and quality of data, can understand specific domains or specific content.

Q: What do you think about Microsoft WORD translator and Apple iPhone translator?  

A: In some cases, these solutions can work really well; in others, they will lack the context or domain knowledge they need to deliver usable translations. These solutions will continue to improve as they are used more and more. Once again, be careful with the data you are translating to ensure GDPR compliance.

We interviewed Dominick prior to the conference. Watch the video below to learn more. 
