GSoC/GCI Archive
Google Code-in 2013 Apertium

Write a sentence aligner for the UDHR

completed by: Sushain Cherivirala

mentors: Jonathan Washington

Write a script that aligns two translations of the Universal Declaration of Human Rights (UDHR) (final destination: trunk/apertium-tools/udhr_aligner.py). It should take as input two UDHR translations in the XML format available from http://www.unicode.org/udhr/index_by_name.html and output the aligned texts as a TMX file with one article per entry.
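
The task can be sketched roughly as follows. This is only an illustrative outline, not the submitted udhr_aligner.py: the function names (local_name, read_articles, build_tmx) are hypothetical, and it assumes each input file has a root element carrying xml:lang, with one <article number="..."> element per article whose text sits in <para> elements. Articles are paired by their number, and an article missing from either translation is skipped.

```python
import xml.etree.ElementTree as ET

# ElementTree's expanded name for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def local_name(tag):
    """Drop any '{namespace}' prefix so matching works with or without one."""
    return tag.rsplit("}", 1)[-1]


def read_articles(root):
    """Return (language code, {article number: text}) for one UDHR document.

    Assumed layout: <article number="N"> elements containing <para> text.
    """
    lang = root.get(XML_LANG, "und")
    articles = {}
    for elem in root.iter():
        if local_name(elem.tag) != "article":
            continue
        paras = [
            " ".join("".join(p.itertext()).split())  # normalise whitespace
            for p in elem.iter()
            if local_name(p.tag) == "para"
        ]
        articles[elem.get("number")] = " ".join(paras)
    return lang, articles


def build_tmx(src, tgt):
    """Build a TMX tree with one <tu> per article present in both inputs."""
    src_lang, src_articles = src
    tgt_lang, tgt_articles = tgt
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "udhr_aligner_sketch",  # placeholder tool name
        "creationtoolversion": "0.1",
        "segtype": "paragraph",
        "o-tmf": "udhr-xml",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for number, src_text in src_articles.items():
        if number not in tgt_articles:
            continue  # no counterpart in the other translation
        tu = ET.SubElement(body, "tu", {"tuid": "article-{}".format(number)})
        for lang, text in ((src_lang, src_text),
                           (tgt_lang, tgt_articles[number])):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx
```

With two downloaded files (the file names here are placeholders), the pieces could be combined as:

```python
src = read_articles(ET.parse("udhr_eng.xml").getroot())
tgt = read_articles(ET.parse("udhr_spa.xml").getroot())
ET.ElementTree(build_tmx(src, tgt)).write(
    "udhr_eng-spa.tmx", encoding="utf-8", xml_declaration=True)
```

Pairing by article number keeps the alignment trivial at the article level; a finer sentence-level alignment inside each article would need additional logic that this sketch does not attempt.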