GSoC/GCI Archive
Google Code-in 2014 Wikimedia Foundation

pywikibot: Wikidata isbn support

completed by: m4tx

mentors: Fabian, John Vandenberg

Pywikibot is a Python-based framework for writing bots for MediaWiki. See https://www.mediawiki.org/wiki/Manual:Pywikibot for more information. Patches can be submitted via Gerrit (you need a MediaWiki.org account). More documentation on Gerrit can be found at https://www.mediawiki.org/wiki/Manual:Pywikibot/Gerrit. After you have successfully claimed this task in Google Melange, please use the task in Phabricator for communication instead of Google Melange. This allows more Pywikibot (PWB) developers to be reached. General development questions can be asked on the Pywikibot mailing list at https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l and in the #pywikibot IRC channel (see https://www.mediawiki.org/wiki/MediaWiki_on_IRC).

Wikidata is a multilingual, collaboratively edited wiki knowledge base that stores structured data in JSON records. The software is an extension to MediaWiki called Wikibase. See https://www.wikidata.org/wiki/Wikidata:Introduction and https://www.mediawiki.org/wiki/Wikibase for more information. Its primary use is as a central store of facts that can be used by all wiki projects. Each claim (fact) in Wikidata is essentially of the form property=value, and a property may have many values. For example, India 'shares a border' (property P47) with several countries.
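As a rough illustration of the property=value model, the sketch below represents an item's claims as a plain Python dictionary mapping a property ID to a list of values. This is a deliberately simplified picture, not the actual Wikibase JSON layout, and the names india_claims and values_for are hypothetical.

```python
# Simplified, hypothetical model of an item's claims: property ID -> list of values.
# Real Wikibase records are richer (data types, qualifiers, references, ranks).
india_claims = {
    "P47": ["Pakistan", "China", "Nepal"],  # 'shares a border' with several countries
}


def values_for(claims, property_id):
    """Return all values stored for a property, or an empty list if none exist."""
    return list(claims.get(property_id, []))


# One property can carry many values.
border_countries = values_for(india_claims, "P47")
```

A bot working on Wikidata therefore has to loop over every value of a property rather than assume there is exactly one.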

Pywikibot includes a script/bot called isbn.py that reports and fixes invalid ISBNs, and converts ISBN10 to ISBN13. It works only on Wikipedia and other sites that include ISBNs in the raw wikitext of pages.
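The kind of checks described above can be sketched with the standard ISBN checksum rules. The following is a minimal standalone illustration, not the code in isbn.py; the function names are hypothetical, and it assumes every converted ISBN10 receives the 978 prefix.

```python
ASCII_DIGITS = set("0123456789")


def clean(isbn):
    """Strip hyphens and spaces and upper-case a possible 'x' check character."""
    return isbn.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(isbn):
    """ISBN10 check: weights 10..1, the weighted sum must be divisible by 11."""
    code = clean(isbn)
    if len(code) != 10:
        return False
    if not set(code[:9]) <= ASCII_DIGITS:
        return False
    if code[9] not in ASCII_DIGITS and code[9] != "X":
        return False
    total = 0
    for position, char in enumerate(code):
        value = 10 if char == "X" else int(char)
        total += (10 - position) * value
    return total % 11 == 0


def isbn13_weighted_sum(digits):
    """Sum the digits with alternating weights 1, 3, 1, 3, ..."""
    return sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))


def is_valid_isbn13(isbn):
    """ISBN13 check: the alternating 1/3 weighted sum must be divisible by 10."""
    code = clean(isbn)
    if len(code) != 13 or not set(code) <= ASCII_DIGITS:
        return False
    return isbn13_weighted_sum(code) % 10 == 0


def isbn10_to_isbn13(isbn):
    """Convert a valid ISBN10 by prefixing 978 and recomputing the check digit."""
    code = clean(isbn)
    if not is_valid_isbn10(code):
        raise ValueError("not a valid ISBN10: %r" % isbn)
    core = "978" + code[:9]  # the old ISBN10 check digit is dropped
    check = (10 - isbn13_weighted_sum(core) % 10) % 10
    return core + str(check)
```

For example, isbn10_to_isbn13("0-306-40615-2") returns "9780306406157". Note that this sketch returns the result without hyphens; preserving the original hyphenation would need extra handling.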

This task is to enhance isbn.py to provide the same functionality when used on Wikidata, using the ISBN10 and ISBN13 properties: http://wikidata.org/wiki/Special:search/property:isbn . To complete this task, add a test case to tests/isbn_tests.py that validates and cleans up a test item on https://test.wikidata.org/ .

The Phabricator task is https://phabricator.wikimedia.org/T85242 .

Students are required to read Wikimedia's general instructions at https://www.mediawiki.org/wiki/Google_Code-in_2014#Instructions_for_GCI_students first.


Always refer to https://www.mediawiki.org/wiki/Google_Code-in_2014#Instructions_for_GCI_students for general information and phabricator.wikimedia.org for information on specific tasks.