GSoC/GCI Archive
Google Summer of Code 2013


Web Page:

Mailing List:

The SFT (Software for Experiments) group is part of CERN (European Organization for Nuclear Research) and focuses on providing common software for its experiments. CERN is one of the world’s largest and most exciting centers for fundamental physics research. Experiments at CERN have probed the fundamental nature of matter and the forces which affect it. CERN is also the birthplace of the World Wide Web, invented by Tim Berners-Lee. The SFT group's efforts, like most of CERN's current activities, are directed towards the world’s highest-energy elementary particle accelerator, the Large Hadron Collider (LHC), and its experiments. There are four large experiments at the LHC (ALICE, ATLAS, CMS, LHCb) which seek to expand the frontiers of knowledge and complete our understanding of the constituents of matter and their interactions, of the conditions in the first instants after the Big Bang and of the differences between matter and anti-matter. During 2012, ATLAS and CMS announced the discovery of a new boson, which has recently been confirmed to have the properties of a Higgs boson, similar to the one required by the Standard Model of Particle Physics.

NOTE: The vast majority of our GSoC projects do not require any physics knowledge.

Operating the LHC and running each experiment requires a large amount of software. A large part of this software is common and open source. The open source software spans the range from system software to more specialized physics-oriented tools and toolkits.

The Root system is used to handle, store and analyze the data of all LHC experiments. The experiments store both their raw data and intermediate, processed results using Root, as it offers an open source format and is very compact. Because the data are defined as a set of objects, particular attributes of the selected objects can be accessed separately, without touching the remaining attributes. Root includes many tools for data analysis, including histogramming in an arbitrary number of dimensions, curve fitting, function evaluation, minimization, graphics and visualization. It also includes a built-in C++ interpreter, whose command language is used as a scripting, or macro, language.
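The attribute-wise access described above can be pictured with a small column-oriented container, in which each attribute of a set of objects is kept in its own array. This is only a minimal sketch in plain C++ and makes no claim about Root's actual API or file format; the names `Track`, `TrackColumns` and `sum_px` are hypothetical and chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record type; the field names are illustrative only.
struct Track {
    double px;
    double py;
    double pz;
    int charge;
};

// Column-wise container: each attribute lives in its own contiguous array,
// so reading one attribute never touches the storage of the others.
class TrackColumns {
public:
    void push_back(const Track& t) {
        px_.push_back(t.px);
        py_.push_back(t.py);
        pz_.push_back(t.pz);
        charge_.push_back(t.charge);
    }

    std::size_t size() const { return px_.size(); }

    // Direct read-only access to a single attribute column.
    const std::vector<double>& px() const { return px_; }

    // Reassemble a full object only when all attributes are needed.
    Track at(std::size_t i) const {
        assert(i < size());
        return Track{px_[i], py_[i], pz_[i], charge_[i]};
    }

private:
    std::vector<double> px_;
    std::vector<double> py_;
    std::vector<double> pz_;
    std::vector<int> charge_;
};

// Sums one attribute over all stored objects; only the px column is read.
inline double sum_px(const TrackColumns& tracks) {
    double total = 0.0;
    for (double v : tracks.px()) {
        total += v;
    }
    return total;
}
```

In this layout an analysis that needs a single attribute scans one array, while `at()` rebuilds a complete object on demand.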

The CERN Virtual Machine (CernVM) is an R&D project established in the Software group of CERN’s Physics Department (PH/SFT) to investigate how virtualization technologies can be used to improve and simplify the daily interaction of physicists with experiment software frameworks and the Grid infrastructure. CernVM maintains a Virtual Software Appliance designed to provide a complete and portable environment for developing and running LHC data analysis applications on any end user computer (laptop, desktop) as well as on the Grid and on Clouds.

The Geant4 toolkit is a key component of the common physics software. It simulates the interactions of radiation with material in any setup, including the detectors of the LHC or other High Energy Physics (HEP) experiments. It is also used in other fields, from medical diagnostics to satellite engineering and planetary science. One use is assessing the effects of radiation on the electronics of satellites. Another is designing improved medical detectors with specialised applications such as the Geant4 Application for Tomographic Emission (GATE). LHC experiments use Geant4 to compare the signatures of events from new physics (such as the Higgs boson and particles which are candidates for dark matter) to the signatures of events coming from known interactions which could mimic them. Geant4 is created by the Geant4 collaboration of over 100 physicists and engineers from around the world, bringing together teams at leading High Energy Physics laboratories such as CERN (Geneva, Switzerland), Fermilab (Batavia, IL), KEK (Tsukuba, Japan), SLAC (Stanford, CA) and Triumf (Vancouver, Canada), as well as many universities and institutions. The toolkit continues to be developed to improve its precision and scope of application, and to better utilise current and emerging computer architectures.

NOTE: To discuss via IM with mentors, please use the Jabber/XMPP address. (We cannot use IRC.)


  • Add the ability to accept LaTeX and Markdown/MathJax formulas in Indico abstract editor: My proposal consists of improving the Indico abstract editor so that it accepts LaTeX and Markdown/MathJax formulas and generates compiled HTML and PDF output. The PageDown editor is used to translate the plain-text input into LaTeX documents ("book of abstracts") and HTML, with a live preview of the edits on a canvas.
  • Automatic Differentiation Library Using Cling: Automatic differentiation has numerous scientific applications which can be accommodated by a specialized tool like CERN's ROOT. This proposal provides technical background and two suggested automatic differentiation implementation strategies: one relying on operator overloading (similar to ADOL-C) and the other involving source code transformation (mirroring Tapenade). The goal of the project is to extend Cling so that it can differentiate non-trivial functions and find partial derivatives for trivial cases. The final product will consist of C++ files, documentation, and test cases.
  • CERN app for Android proposal: This is a detailed proposal in which I explain my main ideas for the CERN Android application. I specify how I plan to make the application extensible, how I plan to organise it, and what I think will be the most convenient way to code it. I refer to the existing iOS app, extending and modifying some of its ideas. I present the development details and the thought process behind them, as well as a detailed timeline and a description of my experience and background.
  • HEPunion filesystem implementation as Linux module: At CERN's LHCb experiment (Large Hadron Collider beauty), a huge computing facility (currently ~1500 nodes) is required to reconstruct and filter events. This computing farm runs SLC (Scientific Linux CERN), a Red Hat Enterprise Linux (RHEL) derivative. All these computing nodes get their operating system (Linux) from dedicated servers. Each node acts as a diskless node: it uses DHCP and PXE/NetBoot images, loads the kernel and initial RAM disk over TFTP, and finally loads the rest of the system from NFS, which makes booting very fast. To add flexibility to the handling of diskless nodes, a union file system, of the kind often used in Linux live media, has to be added. The aim of this project is to adapt and extend a union file system as a proper Linux module to address the requirements of a HEP experiment's dedicated farm.
  • Implement Pre-run Verification In Cling: This project enhances Cling's execution engine with pre-run verification, including the detection of null pointer dereferences. The expected result is that executing a NULL pointer dereference does not cause a crash.
  • Marketplace for VM Contextualization Artifacts: This project creates a marketplace for contextualization artifacts for CernVM. The marketplace will display a list of contexts that are publicly available to all users. Instead of repeatedly creating new contexts, users can take readily available contexts, add them to their own list of contexts, and pair them with the desired CernVM. Contexts will have different specifications, and every context can be rated, commented on, and added to a user's list of contexts. Users can specify whether they want a context to be made publicly available; the section for creating a new context will gain new fields for this purpose. Search and Advanced Search options will be developed to ease navigation throughout the marketplace. After filtering by the searched keywords, a list of available contexts will be displayed, ordered by date of submission. A further filter will allow results to be filtered by rating. The marketplace will reduce the work of the users, so that they can concentrate on improving contexts and sharing them with others.
  • Performance Monitoring on User-defined ROI: This is a project to extend the Linux perf tool to collect and organize performance data for user-defined regions of the code. Currently, perf collects data for the whole workload. This is not useful if only a subset of a large code base is of interest. The project is to enhance the perf tool by enabling it to collect performance data for the relevant regions of the code and to organize the data accordingly.
  • TFormula class for CERN: In this application I present the main idea for the CERN GSoC project to re-implement the TFormula class. The existing code is rather old and hard to read and maintain. The idea is to convert mathematical expressions into formulas that Cling can parse and evaluate. Such a solution can be easily upgraded along with the development of Cling.
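
The operator-overloading strategy mentioned in the automatic differentiation project above can be illustrated with dual numbers, where every value carries its derivative and each overloaded operator applies the matching differentiation rule. The following is a minimal sketch of forward-mode differentiation in plain C++. It is not ADOL-C's API and not the implementation proposed for Cling; the namespace `ad_sketch` and the names `Dual`, `derivative` and `example` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

namespace ad_sketch {

// Dual number: a function value together with its derivative with respect
// to one chosen input variable (forward-mode automatic differentiation).
struct Dual {
    double val;  // f(x)
    double der;  // f'(x)
};

// Sum and difference rules.
inline Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
inline Dual operator-(Dual a, Dual b) { return {a.val - b.val, a.der - b.der}; }

// Product rule: (ab)' = a'b + ab'.
inline Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};
}

// Quotient rule: (a/b)' = (a'b - ab') / b^2.
inline Dual operator/(Dual a, Dual b) {
    assert(b.val != 0.0);
    return {a.val / b.val, (a.der * b.val - a.val * b.der) / (b.val * b.val)};
}

// Chain rule for an elementary function: (sin a)' = cos(a) * a'.
inline Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }

// Lifts a plain constant into a dual number (its derivative is zero).
inline Dual constant(double c) { return {c, 0.0}; }

// Evaluates f at x with the seed derivative dx/dx = 1 and returns f'(x).
template <typename F>
double derivative(F f, double x) {
    Dual seed{x, 1.0};
    return f(seed).der;
}

// Illustrative function: f(x) = x^3 + sin(x), so f'(x) = 3x^2 + cos(x).
inline Dual example(Dual x) { return x * x * x + sin(x); }

}  // namespace ad_sketch
```

Calling `ad_sketch::derivative(ad_sketch::example, 2.0)` evaluates the function once and yields the derivative as a by-product. A source-transformation approach would instead generate a separate derivative function from the code of `example`.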