Google Code-in 2010: The Apertium project

Add missing nouns to Dutch analyser

completed by: AureiAnimus

mentors: Francis Tyers

Add the following missing nouns to the Dutch analyser:

http://pastebin.com/3uWn6Lqi

first column = bidix entry
second column = existing entries in nl.dix

Important: if you think that a word is not a noun, don't add it as a noun. Instead, delete it from the bidix and delete the corresponding entry from the Afrikaans dix.

If you don't want to do that, mark it in the bidix with something like NOTNOUN: change the translation on the Dutch side to "NOTNOUN".
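
Once some entries have been marked, it can help to list them so they can be reviewed later. A minimal sketch, assuming the marker text is exactly NOTNOUN; the helper name list_notnoun is hypothetical, and the path of the bidix file is whatever it is in your checkout:

```shell
# list_notnoun FILE
# Hypothetical helper: prints every line of FILE that contains the
# NOTNOUN marker, prefixed with its line number.
list_notnoun() {
    grep -n 'NOTNOUN' "$1"
}
```

Call it with the bidix file as its only argument.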

Remember, you can make some easy changes by looking for suffixes:

$ cat /tmp/o | grep 'isme' | wc -l

53

$ cat /tmp/o | grep 'ium' | wc -l

24

e.g.

$ for i in `cat /tmp/o | grep 'isme' | cut -f2 -d'^' | cut -f1 -d'/'`; do echo '    <e lm="'$i'"><i>'$i'</i><par n="lipide__n_nt"/></e>'; done
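
The same loop can be wrapped in a small function so that the suffix and the paradigm are passed as arguments. This is only a sketch: the name gen_entries is hypothetical, and it assumes that each line of the input file has the form ^word/... as the cut calls above imply. Like the loop above, it matches the suffix anywhere in the line, so check the output by hand. The right paradigm for any suffix other than 'isme' has to be looked up in nl.dix.

```shell
# gen_entries SUFFIX PARADIGM [FILE]
# Hypothetical helper: prints one <e> entry for every line of FILE
# (default /tmp/o) that contains SUFFIX, using PARADIGM as the paradigm name.
# Assumes each line looks like ^word/..., as in the loop above.
gen_entries() {
    suffix=$1
    paradigm=$2
    file=${3:-/tmp/o}
    # Take the text between '^' and the first '/' as the lemma.
    grep -- "$suffix" "$file" | cut -f2 -d'^' | cut -f1 -d'/' |
    while IFS= read -r lemma; do
        printf '    <e lm="%s"><i>%s</i><par n="%s"/></e>\n' \
            "$lemma" "$lemma" "$paradigm"
    done
}
```

For example, gen_entries 'isme' 'lipide__n_nt' prints the same entries as the loop above.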