Google Code-in 2013 Apertium

Create a program to generate a flex lexer from an XML description

completed by: Dalimil Hájek

mentors: Mikel L. Forcada, Francis Tyers, Kirill Krylov

Given an XML file with definitions like:

<section-def-cats>
  <def-cat n="noun">
    <cat-item tags="n.*"/>
  </def-cat>
  <def-cat n="adjec">
    <cat-item tags="adj.*"/>
    <cat-item tags="vblex.pp.*"/>
  </def-cat>
</section-def-cats>
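
As a starting point, the definitions can be read into (category, tags) pairs before any lexer rules are written. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree; the names EXAMPLE_XML and read_categories are hypothetical and not part of the task, and the sketch assumes the input contains only the def-cat and cat-item elements shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, copied from the task description.
EXAMPLE_XML = """
<section-def-cats>
  <def-cat n="noun">
    <cat-item tags="n.*"/>
  </def-cat>
  <def-cat n="adjec">
    <cat-item tags="adj.*"/>
    <cat-item tags="vblex.pp.*"/>
  </def-cat>
</section-def-cats>
"""


def read_categories(xml_text):
    """Return a list of (category name, tags) pairs in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for def_cat in root.iter("def-cat"):
        name = def_cat.get("n")
        for item in def_cat.iter("cat-item"):
            tags = item.get("tags")
            # Skip entries without a name or a tags attribute
            # (an assumption; the task does not say how to treat them).
            if name and tags:
                pairs.append((name, tags))
    return pairs


if __name__ == "__main__":
    for name, tags in read_categories(EXAMPLE_XML):
        print(name, tags)
```

Each cat-item becomes its own pair, so a category with several cat-item children (such as adjec) yields several pairs with the same name.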

 

Create a lexer that looks something like:

^[a-zA-Z]\+<n>\(<[a-zA-Z0-9_]\+>\)/[a-zA-Z]\+\(<[a-zA-Z0-9_]\+>\)$ { return noun; }
^[a-zA-Z]\+<adj>\(<[a-zA-Z0-9_]\+>\)/[a-zA-Z]\+\(<[a-zA-Z0-9_]\+>\)$ { return adjec; }
^[a-zA-Z]\+<vblex><pp>\(<[a-zA-Z0-9_]\+>\)/[a-zA-Z]\+\(<[a-zA-Z0-9_]\+>\)$ { return adjec; }
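
The rules above use grep-style escapes (\+, \( and \)). In flex itself, + and ( are operators when unescaped, \+ matches a literal plus sign, and an unquoted / introduces trailing context. The sketch below, in Python, therefore emits the same rule shape with flex quoting instead. It rests on several assumptions that the task does not confirm: ^, / and $ are treated as literal delimiter characters rather than anchors, a trailing * in the tags attribute is taken to mean zero or more further tags, and the names WORD, ANY_TAGS, tags_to_pattern and make_rule are hypothetical. It covers only the rules section, not the flex definitions or the token declarations.

```python
# Pattern for a lemma or surface form, as in the rules quoted in the task.
WORD = "[a-zA-Z]+"

# Assumption: "*" stands for zero or more further tags.
ANY_TAGS = '("<"[a-zA-Z0-9_]+">")*'


def tags_to_pattern(tags):
    """Turn a dotted tag string such as 'vblex.pp.*' into a flex fragment."""
    pieces = []
    for part in tags.split("."):
        if part == "*":
            pieces.append(ANY_TAGS)
        else:
            # Quote the tag so that < and > are matched literally.
            pieces.append('"<%s>"' % part)
    return "".join(pieces)


def make_rule(name, tags):
    """Build one flex rule that returns the category name."""
    source_side = WORD + tags_to_pattern(tags)
    target_side = WORD + ANY_TAGS
    return '"^"%s"/"%s"$" { return %s; }' % (source_side, target_side, name)


if __name__ == "__main__":
    pairs = [("noun", "n.*"), ("adjec", "adj.*"), ("adjec", "vblex.pp.*")]
    for name, tags in pairs:
        print(make_rule(name, tags))
```

Tag names are inserted into the quoted strings unchanged, so a tag containing a double quote or backslash would need escaping first; the sample input contains none.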