In the last of our series of editorials profiling the different groups within the IT department, CNL speaks to Bob Jones, the director of the Enabling Grids for E-sciencE (EGEE) project and leader of the EGE group in IT. We ask him about the challenges his group faces managing the worldwide consortium of 91 partner institutions that are participating in the EGEE project.

What is the role of the EGE group?

The EGE group at CERN hosts the management activity for the EGEE project. This includes myself; the technical director of EGEE, Erwin Laure; and the project's administration, which looks after all the relations we have with the consortium of 91 partners concerning administrative and financial matters. The EGE group also handles all relations with the European Commission on behalf of the consortium, including, for example, the regular reviews of the project.

Our group also manages dissemination and outreach for EGEE, and we are responsible for the EGEE partners' use of CERN-based IT tools, such as Project Progress Tracking (PPT), Indico, and Engineering Data Management Service (EDMS). Members of the group also work on related projects, such as the Diligent project for Grid-based digital libraries, and we run the Related Projects Liaison Office, which manages relations with other Grid projects and application projects around Europe. In total this represents 16 people. In addition, a host of fellows and associates are registered with our group but work in the different technical activities inside the EGEE project.

How many people are involved in the EGEE project?

What people don't always see at CERN is that EGEE is a multiscience, multipurpose infrastructure. While the most prolific users are the physics community for the LHC, more than 5000 people are registered to use the infrastructure, and we estimate that more than 12,000 scientists benefit from its existence. About 500 people work full time for the EGEE project today; to put that in perspective, it means we manage the equivalent of about 20% of CERN personnel.

Through the PPT project-planning tool we know that more than 1100 people have worked for the project in one way or another over the nearly four years of its existence. The European Commission and external reviewers have repeatedly praised the EGEE project for its ability to manage such a complex international effort. Indeed we are now used as a role model for the administration of other large EU projects.

What have been the highlights of the EGEE project over the last year?

People often think of EGEE in terms of either the infrastructure or the middleware. These are just two out of 11 activities in the project. There's also training, applications support, outreach, work on standards and international collaboration with other Grids, work with the business community, and so on. For example, two EGEE training events are held every week somewhere around the world. These have been mainly end-user and applications interface training, but as the middleware is deployed on other infrastructures we are now getting many requests for site-manager training, as well as requests from private companies.

Nevertheless, the scale and level of service offered by the infrastructure remain a key measure of the project's success. We ended the European Data Grid project four years ago with 1000 CPUs at 20 sites. We now have more than 40,000 CPUs at 250 sites across 48 countries, and about 12 PB of storage. We anticipate that the processing and storage capacity of the EGEE Grid will easily double before the LHC start-up.

The quality of service of the infrastructure has improved significantly over the last year. This is thanks to improvements in the operational procedures, and to a suite of monitoring tools and testing techniques developed in the project, some of them in collaboration with industry. The throughput of the EGEE Grid is now more than 100,000 jobs per day, and 68% of that is related to the LHC. But the remaining 32% represents a dramatic increase for other scientific disciplines, whose use of EGEE is growing faster than that of the LHC users.

Perhaps the most impressive result over the last year, though, is that we are managing this much larger and more complex infrastructure with the same manpower as in the first two-year phase of EGEE, which ended in 2006. So the infrastructure is increasing rapidly but not the manpower to maintain it.

What challenges face the EGEE project over the coming year?

Regarding infrastructure, one challenge for the future is that many of the other sciences have been piggybacking on resources made available for the LHC. As the LHC comes online it is likely that this capacity will be used up, so we are working with the other disciplines to ensure that they contribute more resources to the infrastructure.

There are some issues here. Do these disciplines have the personnel to connect more resources? Does the gLite middleware support all the platforms that are used in other disciplines? At the moment gLite is constrained by the Linux distributions it works on. For example, in the life sciences we have partners with resources that can be contributed only if gLite runs on Windows. This is why we have been working with new EGEE business associates to better support the Windows platform.

The gLite middleware has been restructured this year to remove redundancies and unnecessary connections and constraints. The goal is to factor gLite into more independent components that can be ported and deployed more simply.

We have recently had confirmation that the European Commission will fund a third phase of the EGEE project, starting next spring. The key mission of the project in this third phase will be to ensure the continued operation of the production infrastructure through to 2010, which is particularly crucial for the physics community during the LHC start-up. In parallel with this, the ambition is to achieve a sustainable Grid infrastructure, with the vision of a permanent European Grid Initiative (EGI) that would take over from EGEE at the end of its next phase.

Much of the work in EGEE-III will be aimed at restructuring the consortium of partners to ensure a long-term grouping based on national Grid initiatives. An EGI design study project was launched in September, with the support of 37 countries. The project, of which CERN is a founding member, will design the operational and governance model for a sustainable Grid infrastructure. This will be a challenging political process, and it has to converge rapidly to ensure the future of the Grid infrastructure on which the LHC depends.

Another challenge is to encourage industry to take up this technology. On the one hand, many companies are interested in developing products and services based on our Grid standards, and this could help support the Grid infrastructure in the longer term. On the other hand, big companies such as Microsoft, Google and Yahoo! are not interested in Grid standards since they see a competitive advantage in having a private infrastructure and selling services on it. EGEE wants to go for the open standards approach, to create a market in the same manner as has happened for the web. If Tim Berners-Lee had not insisted on open standards, what value would the web have had? We've been here before, so let's try to do the right thing again.