Google Summer of Code 2011 Apache Software Foundation

LUCENE-2959: Implementing State of the Art Ranking for Lucene

by David Nemeskey for Apache Software Foundation

Lucene ranks documents with the Vector Space Model (VSM), which compares unfavorably to state-of-the-art algorithms such as BM25. Moreover, its architecture is tailored specifically to VSM, which makes adding new ranking functions a non-trivial task. This project aims to bring state-of-the-art ranking methods to Lucene and to implement a query architecture with pluggable ranking functions.
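
To make the idea of pluggable ranking functions concrete, here is a minimal Java sketch. It is an illustration only and not the project's design or Lucene's API: the names `RankingFunction`, `TermStats`, `Bm25Ranking`, `TfIdfRanking`, and `RankingSketch` are hypothetical. The BM25 formula is the commonly cited form with a non-negative IDF variant, and the parameter values k1 = 1.2 and b = 0.75 are conventional defaults assumed here. The TF-IDF class is a simplified stand-in for VSM-style term weighting.

```java
import java.util.List;

/** Hypothetical plug-in point: scores one query term against one document. */
interface RankingFunction {
    double score(TermStats t, double docLength, double avgDocLength);
}

/** Hypothetical per-term statistics holder; not a Lucene class. */
final class TermStats {
    final double termFreq; // occurrences of the term in the document
    final long docFreq;    // number of documents containing the term
    final long numDocs;    // total number of documents in the collection

    TermStats(double termFreq, long docFreq, long numDocs) {
        this.termFreq = termFreq;
        this.docFreq = docFreq;
        this.numDocs = numDocs;
    }
}

/** BM25 term score with term-frequency saturation (k1) and length normalization (b). */
final class Bm25Ranking implements RankingFunction {
    private final double k1;
    private final double b;

    Bm25Ranking(double k1, double b) {
        this.k1 = k1;
        this.b = b;
    }

    @Override
    public double score(TermStats t, double docLength, double avgDocLength) {
        // Non-negative IDF variant: ln(1 + (N - n + 0.5) / (n + 0.5)).
        double idf = Math.log(1.0 + (t.numDocs - t.docFreq + 0.5) / (t.docFreq + 0.5));
        // Denominator grows with document length relative to the average.
        double norm = t.termFreq + k1 * (1.0 - b + b * docLength / avgDocLength);
        return idf * t.termFreq * (k1 + 1.0) / norm;
    }
}

/** Simplified TF-IDF weight, standing in for a VSM-style function (an assumption). */
final class TfIdfRanking implements RankingFunction {
    @Override
    public double score(TermStats t, double docLength, double avgDocLength) {
        double idf = 1.0 + Math.log((double) t.numDocs / (t.docFreq + 1.0));
        return Math.sqrt(t.termFreq) * idf;
    }
}

final class RankingSketch {
    /** Sums per-term scores; the ranking function is swapped without touching this code. */
    static double scoreDocument(RankingFunction f, List<TermStats> terms,
                                double docLength, double avgDocLength) {
        double total = 0.0;
        for (TermStats t : terms) {
            total += f.score(t, docLength, avgDocLength);
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up statistics for a two-term query against one document.
        List<TermStats> terms = List.of(
                new TermStats(3, 50, 10_000),
                new TermStats(1, 2_000, 10_000));
        System.out.println("BM25:   " + scoreDocument(new Bm25Ranking(1.2, 0.75), terms, 120, 100));
        System.out.println("TF-IDF: " + scoreDocument(new TfIdfRanking(), terms, 120, 100));
    }
}
```

The point of the sketch is the shape, not the numbers: `scoreDocument` depends only on the `RankingFunction` interface, so adding a new ranking method means adding one class rather than changing the scoring loop.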