GSoC/GCI Archive
Google Code-in 2010 The Apertium project

Add compound word support to Dutch--Afrikaans

completed by: AureiAnimus

mentors: Francis Tyers

Compound word support was recently added to Apertium by a GCI student (great work!). It works out of the box, but the dictionaries need to support it. The goal of this task is to add compound word support to the Dutch--Afrikaans translator. This involves editing the noun paradigms to add special symbols that mark which parts of a word can and cannot form compounds. The first paradigm in each dictionary has been done for you:

    <pardef n="vering__n_f">
      <e>       <p><l></l><r><s n="n"/><s n="f"/><s n="sg"/></r></p><par n="cmp-R"/></e>
      <e>       <p><l>en</l><r><s n="n"/><s n="f"/><s n="pl"/></r></p><par n="cmp-R"/></e>
    </pardef>

    <pardef n="led__n_m">
      <e>       <p><l></l><r><s n="n"/><s n="m"/><s n="sg"/></r></p><par n="cmp-R"/></e>
      <e>       <p><l>s</l><r><s n="n"/><s n="m"/><s n="pl"/></r></p><par n="cmp-R"/></e>
      <e>       <p><l></l><r><s n="n"/><s n="m"/><s n="sg"/></r></p><par n="cmp"/></e>
    </pardef>
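
To keep track of which noun paradigm entries still need a marker, a small helper script may be useful. The following is only a sketch, not part of Apertium: the function name `unmarked_noun_entries`, the `SAMPLE` fragment (a copy of the first paradigm above with one marker deliberately removed) and the assumption that noun paradigms are named like `stem__n_gender` are all illustrative. It lists entries that carry neither a `cmp` nor a `cmp-R` paradigm reference.

```python
import xml.etree.ElementTree as ET

# Marker paradigm names as they appear in the examples above.
COMPOUND_PARS = {"cmp", "cmp-R"}


def unmarked_noun_entries(dix_xml):
    """Return (pardef name, left-side text) pairs for noun entries with no compound marker."""
    root = ET.fromstring(dix_xml)
    missing = []
    for pardef in root.iter("pardef"):
        name = pardef.get("n", "")
        # Assumption: noun paradigms are named "stem__n_gender" (or end in "__n").
        if "__n_" not in name and not name.endswith("__n"):
            continue
        for entry in pardef.findall("e"):
            pars = {par.get("n") for par in entry.findall("par")}
            if not pars & COMPOUND_PARS:
                # findtext gives "" for an empty <l></l>, None if <l> is absent.
                left = entry.findtext("p/l") or ""
                missing.append((name, left))
    return missing


# Hypothetical fragment: the plural entry has had its marker removed on purpose.
SAMPLE = """<pardefs>
  <pardef n="vering__n_f">
    <e><p><l></l><r><s n="n"/><s n="f"/><s n="sg"/></r></p><par n="cmp-R"/></e>
    <e><p><l>en</l><r><s n="n"/><s n="f"/><s n="pl"/></r></p></e>
  </pardef>
</pardefs>"""

if __name__ == "__main__":
    for pardef_name, left in unmarked_noun_entries(SAMPLE):
        print(f"{pardef_name}: entry with suffix '{left}' has no compound marker")
```

The script only reports candidates. Whether a given form can actually take part in a compound, and on which side, is still a linguistic decision to make by hand.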

You should also add the epenthesis, i.e. the linking sounds (sometimes called parasite sounds) such as -e- and -s- that appear between the parts of a compound.

For more information see: http://wiki.apertium.org/wiki/Compounds

Before you start this task you should talk to a mentor.