GSoC/GCI Archive
Google Summer of Code 2015

Rspamd spam filtering system

License: New and Simplified BSD licenses

Web Page: https://rspamd.com/ideas.html

Mailing List: https://groups.google.com/forum/?hl=en#!forum/rspamd

Rspamd is a sophisticated spam filtering system designed for high performance. It is used by many email systems that are sensitive to scanning speed and to the resources consumed by spam analysis. Rspamd is distributed under the terms of a permissive BSD license (namely, the 2-clause BSD license) and is written in the C programming language using an event-driven model. Plugins and advanced rules for Rspamd are written in the Lua programming language (optionally with the LuaJIT compiler).

Rspamd is developed on GitHub: https://github.com/vstakhov/rspamd

Projects

  • Implement Meta-Statistics Algorithm: My work in this project is to implement meta-statistics algorithms and supervised machine learning algorithms suitable for spam filtering in the rspamd library. After this project is completed, rspamd will be able to use machine learning to classify spam and ham effectively instead of depending on meta rules.
  • Support of HTTPCrypt in the Web interface: Rspamd supports opportunistic encryption of all messages. However, its current web interface does not support encryption at all. In this project I will adapt tweetnacl-js to match the cryptobox used in rspamd and use it to encrypt the HTTP session of the web interface, improving its security. This will also make the interface resilient against some MitM attacks, such as replay attacks.
  • Symbols dependency graph: My ultimate goal is to implement a symbols dependency graph in Rspamd. Currently Rspamd uses composites to combine rules and create more complex rules, but that is not enough. With a symbols dependency graph, Rspamd will be able to organize complex rules in which the results of top-level checks depend on the results of other checks, which in general cannot be implemented with composites alone.
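
To illustrate the idea behind the symbols dependency graph, here is a minimal Python sketch. It is not Rspamd code (Rspamd itself is written in C with Lua rules); the symbol names, the message fields, and the evaluate helper are all hypothetical. It only shows the general technique: checks are ordered topologically so that a top-level check runs after the checks it depends on and can read their results.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical symbols: each one maps to the set of symbols it depends on.
dependencies = {
    "FROM_DOMAIN_MISMATCH": set(),
    "HAS_SUSPICIOUS_URL": set(),
    "PHISHING_SUSPECT": {"FROM_DOMAIN_MISMATCH", "HAS_SUSPICIOUS_URL"},
}


def from_domain_mismatch(msg, results):
    # Fires when the reply address points at a different domain.
    return msg["from_domain"] != msg["reply_to_domain"]


def has_suspicious_url(msg, results):
    # Fires when any URL does not mention the sender's domain.
    return any(msg["from_domain"] not in url for url in msg["urls"])


def phishing_suspect(msg, results):
    # Top-level check: it reads the results of the checks it depends on
    # and only then does its own work on the message.
    if not (results["FROM_DOMAIN_MISMATCH"] and results["HAS_SUSPICIOUS_URL"]):
        return False
    return "password" in msg["subject"].lower()


# Hypothetical registry that binds each symbol to its check function.
checks = {
    "FROM_DOMAIN_MISMATCH": from_domain_mismatch,
    "HAS_SUSPICIOUS_URL": has_suspicious_url,
    "PHISHING_SUSPECT": phishing_suspect,
}


def evaluate(message):
    """Run every check after all of its dependencies have run."""
    results = {}
    for symbol in TopologicalSorter(dependencies).static_order():
        results[symbol] = checks[symbol](message, results)
    return results


if __name__ == "__main__":
    message = {
        "from_domain": "bank.example",
        "reply_to_domain": "evil.example",
        "urls": ["http://evil.example/login"],
        "subject": "Reset your password",
    }
    print(evaluate(message))
```

Unlike a composite, which only combines the final results of independent rules, the dependent check here decides what to do based on earlier results before inspecting the message itself. TopologicalSorter also raises CycleError when symbols depend on each other in a loop, which is one reason an explicit graph is useful.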