The Particle Physics Data Grid Collaboratory Pilot
(PPDG) developed and deployed production Grid systems vertically
integrated with experiment-specific applications, Grid technologies, and Grid
and facility computation and storage resources to form effective
end-to-end capabilities. In 2005 and 2006, PPDG groups deployed their systems and applications on the production Open Science Grid. PPDG was a collaboration of computer scientists
with a strong record in Grid technology, and physicists with leading
roles in the software and network infrastructures for major high-energy
and nuclear experiments. The goals and plans were guided by the immediate
and medium-term needs of the physics experiments and by the research
and development agenda of the computer science groups. A record of the status
and achievements is given in the project's Quarterly Reports.
PPDG has released the following News Items:
Sustained Production Data Movement over the Grid
has resulted in: a factor of 2-10 more data transfer throughput; operational
effort reduced by a factor of 2; a paradigm shift for distributed
data processing from manual to automated bulk file transfers; and earlier
physics results.
Job Scheduling over the Grid has
resulted in: a 2x increase in the efficiency of running jobs; a 20-50% gain in
resources through opportunistic use and distributed resources; and
more confidence in physics results because of increased simulations.
The Grid2003 Project deployed a 6-VO, 2600-CPU, 26-site
application grid that has provided benefit to 10 applications since November
2003.
Biology Grid technologies hardened through
PPDG have provided Biology applications with performance improvements of 5-10X.
SRM Accomplishments in 2005
Open Science Grid "Open for Science"
CMS Data Challenge 04 is Grid based.
Physics using SAMGrid & Grid tools (Top Quark Results at SciDAC 2005).
Physics - Utilizing the Grid
- A Multi-VO Application Grid
the Virtual Data Toolkit
uses DOESG certificates across the Atlantic
STAR at RHIC: up to 5 TB a week transferred between HPSS at BNL and LBNL
using SRM. Next-day access to fresh data for analysis
led to a 4-month turnaround for physics results. Average data transfer
effectiveness is 10 times greater than before, resulting from less
D0 at Fermilab: 50 terabytes moved through GridFTP into MSS. 100 million
events reprocessed remotely to meet an otherwise impossible publication
milestone. Using multiple streams in GridFTP increased throughput
by a factor of 5.
U.S. ATLAS for LHC: >40 TB have been recorded from the
distributed simulation sites in the U.S., 4x more data collected.
JLAB for Nuclear Physics: The time to simulate 30M events was
reduced from 3+ months to 1 week, a 10X reduction in time.
U.S. CMS for LHC: 50 million events simulated through Condor-G,
with an overall factor of 2 greater efficiency than a year or two ago,
over Grid2003 (28 sites, 2600 CPUs). 30% of jobs ran on “opportunistic”
resources. Execution of >75,000 jobs over the grid was supported
by a single FTE.
GADU at ANL: scaling enhancements, reliability
improvements, and feature development for GRAM, Condor-G, DAGMan,
GridFTP, and the Replica Location Service (RLS). More than 7.5M genome
sequences were processed by GADU on Grid2003 resources at a throughput
more than 5 times faster than the pre-Grid capabilities of this
The PPDG collaboration takes an active role in iVDGL
and together with GriPhyN forms
a collaboration of 3 US Grid projects for physics, symbolized by
the Trillium flower. The leadership of PPDG contributes to
international collaboration through work with the EGEE and LCG projects,
and the Particle and Nuclear Physics Research Group at the Global Grid Forum.
PPDG is collaborating with, among others, the DOE laboratories and
LHC experiments on the Open
Science Grid, a U.S.-based production Grid infrastructure. The
Particle Physics Data Grid collaboration was formed in 1999 because
its members were keenly aware of the need for Data Grid services
to enable the worldwide distributed computing model of current and
future high-energy and nuclear physics experiments.