GSoC/GCI Archive
Google Summer of Code 2014

CERN SFT

License: GNU Library or "Lesser" General Public License (LGPL)

Web Page: http://sftweb.cern.ch/gsoc14

Mailing List: mailto:sft-gsoc@cern.ch

The SFT (Software for Experiments) group is part of CERN (European Organization for Nuclear Research, http://www.cern.ch), and focuses on providing common software for its experiments. CERN is one of the world’s largest and most exciting centers for fundamental physics research. Experiments at CERN have probed the fundamental nature of matter and the forces which affect it. CERN is also the birthplace of the World Wide Web (http://info.cern.ch), invented by Tim Berners-Lee. The SFT group's efforts, like most of CERN's current activities, are directed towards the world’s highest-energy elementary particle accelerator, the Large Hadron Collider (LHC, http://public.web.cern.ch/public/en/lhc/lhc-en.html), and its experiments. There are four large experiments at the LHC (ALICE, ATLAS, CMS, LHCb), which seek to expand the frontiers of knowledge and complete our understanding of the constituents of matter and their interactions, of the conditions in the first instants after the Big Bang, and of the differences between matter and anti-matter. During 2012, ATLAS and CMS announced the discovery of a new boson, which has recently been confirmed to have the properties of a Higgs boson, similar to the one required by the Standard Model of Particle Physics.

NOTE: The vast majority of our GSoC projects do not require any physics knowledge.

Operating the LHC and running each experiment requires a large amount of software. A large part of this software is common and open source. It ranges from system software to more specialized physics-oriented tools and toolkits.

Students can contribute to two end-user applications (the SixTrack accelerator simulation and the Geant4/Geant-V detector simulation toolkit), to the Cling interpreter, which will serve as a core component of the ROOT system for storing and analyzing the data of the LHC experiments, and to IgProf, a key tool for profiling large-scale applications such as the experiments' simulation, reconstruction and analysis tool-chains.

SixTrack is a simulation tool for the trajectory of high-energy particles in accelerators. It has been used in the design and optimization of the LHC and is now being used to design the upgrade that will be installed in the next decade, the High-Luminosity LHC (HL-LHC). SixTrack has been adapted to take advantage of the large-scale volunteer computing resources provided by the LHC@Home project. It has been engineered to give exactly the same results after millions of operations on several very different computer platforms.

The ROOT system (http://root.cern.ch/) is used to handle, store and analyze the data of all LHC experiments. The experiments store both their raw data and intermediate, processed results using ROOT, as it offers an open source format and is very compact. Because the data are defined as a set of objects, particular attributes of the selected objects can be accessed separately, without touching the remaining attributes. ROOT includes many tools for data analysis: histogramming in an arbitrary number of dimensions, curve fitting, function evaluation, minimization, graphics and visualization. It also includes a built-in C++ interpreter, which serves both as the command language and as the scripting (macro) language.

Cling is a new C++11 standard-compliant interpreter built on top of the Clang (www.clang.llvm.org) and LLVM (www.llvm.org) compiler infrastructure. Cling is being developed at CERN as a standalone project. It is also being integrated into the ROOT data analysis framework (root.cern.ch), giving ROOT access to a C++11 standard-compliant interpreter. ROOT is an open system that can be dynamically extended by linking external libraries. This makes ROOT a premier platform on which to build data acquisition, simulation and data analysis systems.

The Geant4 toolkit (http://cern.ch/geant4) is a key component of the common physics software. It simulates the interactions of radiation with material in any setup, including the detectors of the LHC and of other High Energy Physics (HEP) experiments. It has many diverse uses in other fields, such as assessing the effects of radiation on the electronics of satellites and designing improved medical detectors with specialized applications such as GATE, the Geant4 Application for Tomographic Emission (http://www.opengatecollaboration.org). LHC experiments use Geant4 to compare the signatures of events from new physics (such as the Higgs boson and particles which are candidates for dark matter) to the signatures of events coming from known interactions which could mimic them. Geant4 is created, developed and maintained by the Geant4 collaboration (http://geant4.org) of over 100 physicists and engineers from around the world: Europe (CERN, IN2P3/France, INFN/Italy), the US (Fermilab, SLAC), Japan (KEK), Canada (TRIUMF) and Russia (Lebedev), as well as many universities. A key area of current research is the extension of the toolkit to utilize current and emerging computer architectures. One of these efforts is the Geant Vector Prototype project, which aims to demonstrate improved performance on the latest CPU and accelerator hardware.

IgProf (https://igprof.org) is a lightweight performance profiling and analysis tool. It can be run in one of three modes: as a performance profiler, as a memory profiler, or in instrumentation mode. When used as a performance profiler, it provides performance profiles of the application based on statistical sampling. In memory profiling mode, it can be used to obtain the total number of dynamic memory allocations, profiles of the "live" memory allocations in the heap at any given time, and information about memory leaks. Memory profiling is particularly important for C/C++ programs, where large amounts of dynamic memory allocation can affect performance and where very complex memory footprints need to be understood. In nearly all cases no code changes are needed to obtain profiles.

The CERN Virtual Machine (CernVM, http://cernvm.cern.ch) is a project to investigate how virtualization technologies can be used to improve and simplify the daily interaction of physicists with experiment software frameworks and the Grid infrastructure. CernVM maintains a Virtual Software Appliance designed to provide a complete and portable environment for developing and running LHC data analysis applications on any end-user computer (laptop, desktop) as well as on the Grid and on Clouds.
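The attribute-wise data access mentioned for ROOT above can be illustrated with a small sketch. This is not ROOT's file format or API: the `ColumnStore` class, its methods, and the field names `pt`, `eta` and `phi` are all hypothetical, chosen only to show why storing each attribute in its own column lets an analysis read one attribute without touching the others.

```python
from array import array


class ColumnStore:
    """Toy column-wise store: one contiguous array per attribute (hypothetical)."""

    def __init__(self, fields):
        self.columns = {name: array("d") for name in fields}
        # Count how often each column is read, to make the access pattern visible.
        self.reads = {name: 0 for name in fields}

    def append(self, **record):
        """Add one object, splitting its attributes across the columns."""
        if set(record) != set(self.columns):
            raise ValueError("record must provide exactly the declared fields")
        for name, value in record.items():
            self.columns[name].append(value)

    def column(self, name):
        """Return a single attribute for all stored objects."""
        self.reads[name] += 1
        return self.columns[name]


# Three made-up objects with three attributes each.
store = ColumnStore(["pt", "eta", "phi"])
store.append(pt=25.0, eta=0.5, phi=1.2)
store.append(pt=12.0, eta=-1.1, phi=0.3)
store.append(pt=40.0, eta=2.0, phi=-2.5)

# A selection on "pt" only reads the "pt" column; "eta" and "phi" stay untouched.
high_pt = [pt for pt in store.column("pt") if pt > 20.0]
```

In this sketch the selection touches a single array, so the cost of the query scales with one attribute rather than with the full size of every stored object.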

Projects

  • API for particle tracking and refactoring of existing module using the API. The current module is implemented in Fortran 77, which is no longer widely used and has limitations of its own, such as heavy use of global variables (whose significance is hard for a new developer to understand) and a lack of code reusability. The aim of this project is to develop and test a particle tracking API that supports parallel computing and uses standard, easy-to-understand naming conventions, and to refactor the existing module using this API.
  • Cling bundle for most popular platforms. Cling is an interactive C++ interpreter based on the Clang and LLVM compiler infrastructure. It serves as a core component of the ROOT system for storing and analyzing the data of the Large Hadron Collider (LHC) experiments. This project implements scripts that extend the capabilities of the continuous integration server used to produce nightly builds of Cling for a wide range of platforms. The scripts are also intended to be independent of the current CI services, which are powered by Electric Commander.
  • Cling Name Autodetection and Library Autoloading. The aim of this project is to improve the Cling user prompt so that it provides hints about the location of unknown names in the code. The hint will prompt the user to load the missing header (and library, when applicable) in which the name is found. If there is no ambiguity, the user may choose an option to automatically load the required files and evaluate the previous input again.
  • Complete ROOT-R interface. Develop an interface in ROOT to call R functions using the R C++ interface (Rcpp, see http://dirk.eddelbuettel.com/code/rcpp.html). As a proof of concept, implement the Minimizer interface of ROOT so that it uses an R package for minimization. This interface makes the very large set of mathematical and statistical tools provided by R available from ROOT.
  • Implement Automatic Differentiation library using Cling. Automatic differentiation is the process of obtaining the derivative of a given function by transforming the program that computes it. This proposal provides a plan for improving the Clang-based automatic differentiation plugin clad and implementing its missing features. The plan is divided into steps. Each step is an individual task with a short description of the problem. The goal of the project is to implement functionality that is able to differentiate non-trivial functions.
  • Improving the support for ARM in IgProf. IgProf is a performance and memory usage profiling tool, originally developed for measuring and analysing the performance and memory usage of applications related to the CMS project at CERN. Currently IgProf supports Linux applications on Intel 32-bit x86, 64-bit x86-64 and 32-bit ARM. The proposal is to extend IgProf to also support the 64-bit ARM architecture.
  • Optimization of Jet Clustering Algorithms for the LHC at CERN. Parallelization is a branch of computing where multiple processes are executed simultaneously. Our research focuses on parallelizing FastJet, a C++ package used at CERN and other particle accelerators. Event reconstruction is computationally intensive, and further advances will require improved efficiency. By altering FastJet to run on multiple cores, we will decrease computation time. Outcomes from this research can be used to optimize jet clustering algorithms.
  • R/ROOT Interfacing and Data Analysis Programming. There are two main goals of this proposal. First, capitalize on the effectiveness of R’s computational and analytical tools for use in the world of particle physics, where raw data is seemingly endless and significance is hard to find. Second, clean up functional but inefficient interpolation programs in ROOT to speed up analysis.
  • Reengineer Propagation of Charged Tracks in an Electromagnetic Field. In this project, we propose to redesign the stepper and equation classes in Geant4 to achieve better performance. By applying template techniques and vectorization, we are going to write fully generic steppers and equations and get rid of virtual functions. The functionality and performance of our implementation will also be evaluated and compared with the existing version.
  • SixDesk library for managing massive SixTrack simulations. This project attempts to reorganize the input/output data management by building a SixDesk-compatible Python library that uses SQLite user files and a centralized MySQL server. The aim is to reduce the number of small intermediate files created during the whole process and to create a central database for easy synchronization between the user (local machine or CERN cluster) and the server (BOINC server).
  • Streamline CernVM Contextualization Plug-ins. The CernVM contextualization process involves three entities: the µCernVM bootloader, Amiconfig and CloudInit. We can streamline them by modifying the micro-contextualization process, overriding the default CloudInit initialization and adding a custom part-handler for Amiconfig. This will reduce the product of the number of ways user data are fetched and the number of techniques used to interpret them, thereby reducing the boot-time delay of CernVM.
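
To make the idea behind the automatic differentiation project listed above more concrete, here is a minimal sketch of forward-mode automatic differentiation using dual numbers. Note the swap of technique: clad works by transforming C++ source code through Clang, whereas this sketch uses operator overloading in Python purely for illustration. The names `Dual`, `derivative` and `example` are hypothetical and are not part of clad or Cling.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to the input."""
    value: float
    deriv: float

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (f + g)' = f' + g'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (f * g)' = f' * g + f * g'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, whose derivative is zero."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    """Sine with the chain rule applied: (sin f)' = cos(f) * f'."""
    x = _lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(float(x), 1.0)).deriv


def example(x):
    # f(x) = x^3 + 2x + sin(x), so f'(x) = 3x^2 + 2 + cos(x)
    return x * x * x + 2 * x + sin(x)


slope_at_zero = derivative(example, 0.0)
```

Each arithmetic operation carries its derivative along with its value, so the derivative of the whole function falls out of evaluating it once. A source-transformation tool instead generates a new function that computes the derivative, which avoids the runtime overhead of the wrapper objects used here.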