Google Summer of Code 2011 LanguageTool

Lucene Based Fast Rule Evaluation for LanguageTool with Chinese Language Support

by Tao Lin for LanguageTool

I will develop a fast rule evaluation tool for LanguageTool. It will use Lucene to index large corpora such as Wikipedia together with their part-of-speech (POS) tags, so that rules can be evaluated quickly by querying the index. This will greatly speed up the checking of new rules against real text and, in turn, the creation of new rules. I will also contribute Chinese language support to LanguageTool. Lessons learned from creating Chinese pattern rules in this project will benefit the future development of support for other eastern languages.
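
To make the indexing idea above concrete, here is a minimal sketch in plain Python. It does not use Lucene or any LanguageTool API. It only illustrates the general technique of a positional inverted index over both words and POS tags, which lets a token-sequence pattern be checked against the sentences that contain its first term instead of scanning the whole corpus. The function names, the "w:" and "p:" term prefixes, the Penn Treebank style tags, and the sample rule (a modal verb followed by a third-person singular verb) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(tagged_sentences):
    """Map each term ("w:<word>" or "p:<tag>") to {sentence_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for sid, sentence in enumerate(tagged_sentences):
        for pos, (word, tag) in enumerate(sentence):
            # Index the lowercased surface form and the POS tag at the same position.
            index["w:" + word.lower()][sid].append(pos)
            index["p:" + tag][sid].append(pos)
    return index


def find_matches(index, pattern):
    """Return sorted (sentence_id, start) pairs where the pattern terms are adjacent."""
    if not pattern:
        return []
    matches = []
    # Only sentences containing the first term are candidates.
    for sid, starts in index.get(pattern[0], {}).items():
        for start in starts:
            # Every later term must occur at the next consecutive position.
            if all(start + k in index.get(term, {}).get(sid, ())
                   for k, term in enumerate(pattern[1:], start=1)):
                matches.append((sid, start))
    return sorted(matches)


# Hypothetical, hand-tagged toy corpus (a real one would come from a POS tagger).
corpus = [
    [("I", "PRP"), ("can", "MD"), ("swims", "VBZ")],
    [("She", "PRP"), ("swims", "VBZ"), ("daily", "RB")],
    [("They", "PRP"), ("must", "MD"), ("goes", "VBZ"), ("home", "NN")],
]
index = build_index(corpus)

# Hypothetical rule: a modal verb directly followed by a third-person singular verb.
print(find_matches(index, ["p:MD", "p:VBZ"]))
```

With this toy corpus, the rule matches the first and third sentences and skips the second. A real implementation on top of Lucene would store the index on disk and express the pattern as a positional query, but the principle of looking up candidates by term rather than scanning every sentence is the same.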