Jun 20, 2007
CERN is host to ROOT workshop
The seventh international ROOT workshop was held from 26–28 March at CERN. About 100 people attended, comprising primarily particle physicists but also scientists from other fields and some representatives from commercial companies. The ROOT workshops are typically held every 18 months and previous events have been held at Fermilab, SLAC and CERN.
The aim of the workshop is for ROOT developers to present their latest work, and for users to report on how they are using the system and to tell the developers what new features they would like and what can be improved.
The programme consisted of 38 talks and seven posters. Most talks lasted 20–30 minutes, apart from the keynote talks from the ROOT developers.
The main topics were: the use of ROOT as a general framework, new features in the input–output system, 3D graphics, the graphical user interface, progress with the merging of Reflex (the C++ introspection library) with CINT (the C++ interpreter), and the status of the different language bindings such as Python and Ruby.
Several talks were given about the new maths libraries and statistical tools, which will become important now that the LHC is reaching completion and the first data will soon have to be analysed. There were also presentations on distributed data analysis with PROOF, and on the different Analysis Object Data (AOD) models of the LHC experiments.
The talks were generally of a high quality with detailed technical content. It is impossible to cover all of the talks in this article so I will mention some of the highlights. The full workshop agenda and all talks are available from the web at http://indico.cern.ch/conferenceDisplay.py?confId=13356.
Lightweight ROOT distribution
On the first day, René Brun gave a talk entitled "From ROOT to BOOT, from BOOT to ROOT". He outlined plans to make the ROOT distribution process more lightweight, based on a small and stable BOOT kernel that at run time imports, via the web, only the libraries that the user needs. To achieve this, ROOT must be modularized into more components, and these components must be made smaller by reducing the high overhead of the CINT dictionaries and by improving C++ compiler performance to allow on-demand, on-the-fly compilation of the needed components. This development is driven by the observation that during a typical ROOT session a user needs only a few per cent of the more than 1 million lines of ROOT code. That small subset of code can then be custom compiled, with compiler optimizations, to yield the best performance for the user's platform.
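The core idea, importing a component only at the moment it is first needed, can be illustrated outside ROOT. The sketch below is a toy Python illustration (not ROOT or BOOT code) using the standard-library `importlib.util.LazyLoader` recipe: the module object is created immediately, but the real import work is deferred until the first attribute access.

```python
# Toy illustration of on-demand loading (not ROOT/BOOT code):
# create a module object now, pay the import cost only on first use.
import importlib.util
import sys

def lazy_import(name):
    """Return a module whose actual loading is deferred to first access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # registers the deferred loader; no code runs yet
    return module

json = lazy_import("json")       # nothing has been executed so far
print(json.dumps({"ok": True}))  # first attribute access triggers the real import
```

In the BOOT scheme the analogous step would fetch the needed library over the web and compile it for the user's platform, rather than simply importing a local module.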
On the final day there were several talks about the statistical analysis tools and frameworks that are integrated with ROOT.
- First there was a talk on the Toolkit for Multivariate Data Analysis (TMVA), which implements sophisticated classification algorithms based on neural networks. The toolkit is a welcome development, as such machine-learning techniques are increasingly used to analyse high-energy physics data.
- Then there was an update on the RooFit package, which is an advanced data modelling language. RooFit represents modelling concepts such as observables, parameters and probability density functions as C++ objects, and provides operator objects for addition, multiplication, convolution and so on to build data models of arbitrary complexity.
- There was also a presentation on the RooStat package, which is built on top of RooFit. It provides a system for combining the results of multiple measurements, including systematic uncertainties, from different LHC experiments, with the systematic uncertainties evaluated using techniques ranging from Bayesian to fully frequentist.
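RooFit's central idea, modelling concepts as objects combined by operator objects, can be sketched in a few lines. The following is a toy Python illustration of that design, not the actual RooFit C++ API; all class and parameter names here are invented for the example.

```python
# Toy sketch of the RooFit design (the real package is C++ and the
# names below are illustrative): p.d.f.s are objects, and an operator
# object combines them into a composite model.
import math

class Gaussian:
    """Normalized Gaussian density, playing the role of a signal p.d.f."""
    def __init__(self, mean, sigma):
        self.mean, self.sigma = mean, sigma
    def __call__(self, x):
        z = (x - self.mean) / self.sigma
        return math.exp(-0.5 * z * z) / (self.sigma * math.sqrt(2 * math.pi))

class Exponential:
    """Normalized exponential density on x >= 0, a toy background p.d.f."""
    def __init__(self, tau):
        self.tau = tau
    def __call__(self, x):
        return math.exp(-x / self.tau) / self.tau

class AddPdf:
    """Addition operator: fraction-weighted sum of two densities."""
    def __init__(self, pdf1, pdf2, frac):
        self.pdf1, self.pdf2, self.frac = pdf1, pdf2, frac
    def __call__(self, x):
        return self.frac * self.pdf1(x) + (1 - self.frac) * self.pdf2(x)

signal = Gaussian(mean=5.0, sigma=1.0)
background = Exponential(tau=4.0)
model = AddPdf(signal, background, frac=0.3)  # composite signal+background model
print(model(5.0))
```

In RooFit proper, the same composition is done with C++ operator classes, and models of arbitrary complexity are built by nesting such objects; the toolkit then handles normalization, fitting and toy-data generation for the composite model.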
Distributed data analysis
Finally there were several talks about distributed data analysis and the AOD models of the LHC experiments. A presentation of the latest developments in the Parallel ROOT Facility (PROOF) showed that a new workload-distribution method (the packetizer) delivers clear performance gains, especially with remotely accessed data.
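The scheduling idea behind a packetizer is that workers pull small packets of events on demand, so faster workers automatically end up processing more data, which matters when access speeds to remote data vary. The sketch below is a simplified Python illustration of that pull-based idea, not the actual PROOF code; worker speeds are modelled crudely as packets pulled per round.

```python
# Toy sketch of pull-based packet scheduling (not the PROOF packetizer
# itself): events are handed out in small [start, end) packets, and a
# faster worker pulls more packets per round.
from collections import defaultdict

def packetize(total_events, packet_size, worker_speeds):
    """Assign event ranges to workers; speed = packets pulled per round."""
    assigned = defaultdict(list)
    next_event = 0
    while next_event < total_events:
        for worker, speed in worker_speeds.items():
            for _ in range(speed):  # a faster worker pulls more often
                if next_event >= total_events:
                    break
                end = min(next_event + packet_size, total_events)
                assigned[worker].append((next_event, end))
                next_event = end
    return assigned

work = packetize(total_events=1000, packet_size=64,
                 worker_speeds={"fast": 3, "slow": 1})
print({w: sum(e - s for s, e in ranges) for w, ranges in work.items()})
```

Because assignment is driven by the workers' pulls rather than fixed up front, no worker sits idle waiting for a slow peer to finish a large pre-assigned chunk, which is the behaviour the new packetizer aims for.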
Representatives of the ALICE experiment presented promising results from using PROOF to promptly analyse physics data on the CERN Analysis Facility (CAF). With PROOF the data are analysed in parallel on many CPUs, which yields higher throughput, at a high level of efficiency, than splitting the same job into many smaller jobs submitted via a traditional batch system. The ALICE AOD model is tuned for use in stand-alone ROOT and PROOF. CMS is developing an AOD that can be used either in the full CMS Software (CMSSW) framework or in stand-alone ROOT and PROOF. LHCb's AOD model is designed for exclusive use via its Gaudi framework, running on a cluster or the Grid via Ganga.
Taking ROOT to the next level
Overall this was an interesting ROOT workshop. It covered many topics, with an emphasis on analysing LHC data in advance of the LHC start-up. During the question-and-answer sessions the ROOT team asked the audience whether any areas needed further development and what functionality was missing. At previous workshops there were many suggestions for new functionality, but this time the audience did not identify any major missing items. It is good to know that ROOT now covers the data-handling and analysis needs of the LHC users. Of course this does not mean that ROOT is finished. It is now up to the ROOT team to innovate and take the system to the next level.
About the author
Fons Rademakers, PH/SFT