Google Code-in 2010 The Apertium project

Categorise translation errors in Afrikaans to Dutch MT

completed by: AureiAnimus

mentors: Francis Tyers

Take two test sentence corpora:

  eval.2011-01-07.wikipedia.3.apertium.txt

  eval.2011-01-07.wikipedia.4.apertium.txt

from https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-af-nl/dev/eval

Then count the errors, assigning each one to one of the following categories: Unknown word, Morphology, Disambiguation, Multiword, Syntactic transfer, Polysemy, Compounding, Separable verb.

The total number of categorised errors must tally with the number of errors reported by the apertium-eval-translator script for Word Error Rate (that is, the edit distance).
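
To illustrate what "tallying with the edit distance" means, here is a minimal Python sketch. It is not the apertium-eval-translator script itself; it assumes a plain word-level Levenshtein distance (insertions, deletions and substitutions each cost 1) summed sentence by sentence, and the real script may tokenise or align the text differently. The function names `word_edit_distance` and `tallies_match` and the sample tokens are hypothetical, chosen only for illustration.

```python
from collections import Counter

# The eight categories named in the task description.
CATEGORIES = [
    "Unknown word", "Morphology", "Disambiguation", "Multiword",
    "Syntactic transfer", "Polysemy", "Compounding", "Separable verb",
]


def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two lists of tokens."""
    # prev[j] holds the distance between the hypothesis prefix processed
    # so far and the first j reference tokens.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def tallies_match(counts, hyp_sentences, ref_sentences):
    """Check that the categorised error total equals the summed edit distance.

    Assumption: distances are computed per sentence pair and added up.
    """
    total_edits = sum(
        word_edit_distance(h.split(), r.split())
        for h, r in zip(hyp_sentences, ref_sentences)
    )
    return sum(counts.values()) == total_edits


# Placeholder tokens, not real corpus data: one substitution (b -> x)
# and one deletion (d) give an edit distance of 2.
counts = Counter({"Unknown word": 1, "Morphology": 1})
print(word_edit_distance("a b c d".split(), "a x c".split()))
print(tallies_match(counts, ["a b c d"], ["a x c"]))
```

Keeping the per-category counts in a `Counter` makes it easy to see at a glance whether any error was left uncategorised or counted twice: the sum of the counts either equals the edit distance or it does not.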