Google Summer of Code 2012 XBMC Foundation

Clean scraping API

by topfs2 for XBMC Foundation

One thing that makes XBMC amazing is its ability to scrape websites to add metadata to a file with little help from the user. Until now, XBMC has used regular expressions for this task, and rather successfully, but the approach has exposed several limitations: among others, the expressions quickly become complex and are not resilient to changes in the website. The current scraping API has few tests, and little is known about which files it actually handles (outside the hierarchy the close-knit community uses) or which files it should handle. Addressing this, along with designing a new and future-proof API, is what I propose as my project.
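
To illustrate the fragility of regular expressions mentioned above, here is a minimal Python sketch. It is not XBMC's scraper code or the proposed API: the page snippets, the pattern, and the names `title_by_regex`, `TitleParser` and `title_by_parser` are all hypothetical and chosen only for illustration. It compares a pattern tied to exact markup with a parser-based lookup when a site adds attributes to a tag.

```python
import re
from html.parser import HTMLParser

# Hypothetical page snippets: the same title before and after a site redesign.
OLD_PAGE = '<h1 class="title">Big Buck Bunny</h1>'
NEW_PAGE = '<h1 id="main" class="title large">Big Buck Bunny</h1>'

# A pattern tied to the exact markup of the old page (illustrative only).
TITLE_RE = re.compile(r'<h1 class="title">(.*?)</h1>')


def title_by_regex(page):
    """Return the title matched by the fixed pattern, or None if it fails."""
    match = TITLE_RE.search(page)
    return match.group(1) if match else None


class TitleParser(HTMLParser):
    """Records the text inside the first <h1> whose class list has 'title'."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h1" and "title" in classes and self.title is None:
            self.in_title = True

    def handle_data(self, data):
        if self.in_title:
            self.title = data.strip()

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_title = False


def title_by_parser(page):
    """Return the title found by walking the parsed tags, or None."""
    parser = TitleParser()
    parser.feed(page)
    parser.close()
    return parser.title


if __name__ == "__main__":
    for name, page in (("old", OLD_PAGE), ("new", NEW_PAGE)):
        print(name, title_by_regex(page), title_by_parser(page))
```

In this sketch the regular expression stops matching as soon as the tag gains an extra attribute, while the parser-based lookup still finds the title. A real scraper would of course face far messier pages than these two snippets.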