
DPF-PHENO 2024 - University of Pittsburgh / Carnegie Mellon University

Future of Computing

Peter Elmer - Princeton University


15 May 2024
A story of technologies, collaborations and science-driven challenges
Introductions…. Who am I?
I am an experimental particle physicist focused on computational and data
science problems in my field, along with the software/computing systems to solve
them. I am a senior research scientist with Princeton, but based at CERN.

I have primarily collaborated on collider experiments, which will likely color
some of the perspectives in this talk. These include:

Aleph@CERN, BaBar@SLAC, CMS@CERN
1990s Technology Context
Dec 1991 - first web server (later first DB) deployed in the US, at SLAC

Sep 1992 - first public Linux version

Jan 1993 - Mosaic browser released

Aug 1995 - Windows 95

Yahoo, AltaVista, …. Google still to come.

Windows NT was the “strategic platform” at CERN in the late 1990s. Browser wars!

Carrying around a laptop really only became a “thing” over the course of the 1990s. Cell
phones were starting to arrive on the scene, for calling/messaging. Pagers were a thing…

International dialing at reduced rates to participate in meetings with phone cards

Modem Dial-up Internet (Geneva): Local phone calls were 4CHF/hr ($2.65/hr in 1998) during
the day, 2CHF/hr in the evening and 1CHF/hr in the night -> CERN Automatic Call Back (ACB) system
Future of Computing?

Figure: Gartner Hype Cycle

Technologies suffer from the hype cycle, then often are subsumed into
our everyday work, or just disappear.

In addition, we build “systems” that support “computing models”, so each
technology needs a context...
Ask ChatGPT - Tell me about the future of computing in high energy physics.

PE You
Tell me about the future of computing in high energy physics

ChatGPT
The future of computing in high-energy physics holds exciting prospects driven by technological advancements and the
evolving needs of scientific research. Here are some trends and areas of development that could shape the future of
computing in this field:

1. High-Performance Computing (HPC): …
2. Distributed Computing and Grid Computing: …
3. Machine Learning and Artificial Intelligence: …
4. Quantum Computing: …
5. Data Management and Storage: …
6. Open Science and Collaboration Platforms: …

Overall, the future of computing in high-energy physics is likely to be characterized by a combination of technological
innovation, collaborative research efforts, and a continued focus on addressing the computational challenges posed by
large-scale experiments and datasets.
CHEP 2023 Scientific Program
Track 1 - Data and Metadata Organization, Management and Access
Track 2 - Online Computing
Track 3 - Offline Computing
Track 4 - Distributed Computing
Track 5 - Sustainable and Collaborative Software Engineering
Track 6 - Physics Analysis Tools
Track 7 - Facilities and Virtualization
Track 8 - Collaboration, Reinterpretation, Outreach and Education
Track 9 - Artificial Intelligence and Machine Learning
Track 10 - Exascale Science
Track 11 - Heterogeneous Computing and Accelerators
Track 12 - Quantum Computing

https://www.jlab.org/conference/CHEP2023

CHEP 2019 - Scientific Program
Track 1 - Online and Real-time Computing
Track 2 - Offline Computing
Track 3 - Middleware and Distributed Computing
Track 4 - Data Organisation, Management and Access
Track 5 - Software Development
Track 6 - Physics Analysis
Track 7 - Facilities, Clouds and Containers
Track 8 - Collaboration, Education, Training and Outreach
Track 9 - Exascale Science

http://chep2019.org/
CHEP 2007 Scientific Program
Track 1 - Online Computing
Track 2 - Event Processing
Track 3 - Software components, tools and databases
Track 4 - Computer facilities, production grids and networking
Track 5 - Grid middleware and tools
Track 6 - Distributed data analysis and information management
Track 7 - Collaborative initiatives with other sciences
Track 8 - Collaborative tools

https://indico.cern.ch/event/3580/

CHEP 2000 - Scientific Program
Track 1 - Data Analysis: Algorithms and methods
Track 2 - Data Acquisition and Control System
Track 3 - Object Persistency and Data Handling
Track 4 - Network: Applications and Services
Track 5 - Commodity Hardware and Software and Integration in Farm and Large Systems
Track 6 - Data Analysis: Technology and Presentation

https://chep2000.pd.infn.it/

CHEP 1997 - “Future of HEP Computing” Panel
https://www.zeuthen.desy.de/CHEP97/chep97.htm
https://www.zeuthen.desy.de/CHEP97/slide/p518/p518.htm
CHEP 2023 Scientific Program
Track 1 - Data and Metadata Organization, Management and Access
Track 2 - Online Computing
Track 3 - Offline Computing
Track 4 - Distributed Computing
Track 5 - Sustainable and Collaborative Software Engineering
Track 6 - Physics Analysis Tools
Track 7 - Facilities and Virtualization
Track 8 - Collaboration, Reinterpretation, Outreach and Education
Track 9 - Artificial Intelligence and Machine Learning
Track 10 - Exascale Science
Track 11 - Heterogeneous Computing and Accelerators
Track 12 - Quantum Computing

https://www.jlab.org/conference/CHEP2023

CHEP 2000 - Scientific Program
Track 1 - Data Analysis: Algorithms and methods
Track 2 - Data Acquisition and Control System
Track 3 - Object Persistency and Data Handling
Track 4 - Network: Applications and Services
Track 5 - Commodity Hardware and Software and Integration in Farm and Large Systems
Track 6 - Data Analysis: Technology and Presentation

https://chep2000.pd.infn.it/
Some conclusions from exploring the CHEP history

General trend is towards higher level “system” concepts, while earlier technology tracks are now
“subsumed” into the system tracks. Some new “technology” tracks appear (e.g. AI/ML,
Quantum Computing). ChatGPT suggested more technology than systems.

“Conway's Law” effect? - the way an organization communicates and is structured will be
directly reflected in the systems and software it creates

One consistently present track is “data analysis” and/or “analysis tools”. ChatGPT missed
this one.

ChatGPT: “Overall, the future of computing in high-energy physics is likely to be
characterized by a combination of technological innovation, collaborative research
efforts, and a continued focus on addressing the computational challenges posed by
large-scale experiments and datasets.”
Particle Physics Experiments as “collaborative research efforts”

Figure: Project Size Over Time
Community Structures Reflect Our Community Evolution/Needs
The Worldwide LHC Computing Grid (WLCG) project is a
global collaboration of around 170 computing centres in
more than 40 countries, linking up national and international
grid infrastructures. The mission of the WLCG project is to
provide global computing resources ... [2000’s era]
The HEP Software Foundation facilitates cooperation and common
efforts in High Energy Physics software and computing internationally. [2010’s era]

WLCG/HSF 2024 (13-17 May) - https://indico.cern.ch/event/1369601/


OSG Support for non-LHC experiments
The Open Science Data Federation is built on the Pelican Platform (which uses xrootd),
part of the NSF-funded National Discovery Cloud for Climate (NDC-C)
Software Lifecycle in High Energy Physics

The Life Cycle of HEP Offline Software, CHEP 2007


HL-LHC “Software Upgrade” - Developing a Global R&D Roadmap

Community charge from the Worldwide LHC Computing Grid to the (then nascent)
HEP Software Foundation in July 2016:

● Anticipate a “software upgrade” in preparation for the HL-LHC
● Identify and prioritize the software research and development investments
1. to achieve improvements in software efficiency, scalability and performance
and to make use of the advances in CPU, storage and network
technologies
2. to enable new approaches to computing and software that could radically
extend the physics reach of the detectors
3. to ensure the long term sustainability of the software through the lifetime of
the HL-LHC

NSF funded the S2I2-HEP Conceptualization Project (s2i2-hep.org/) in July 2016


2016-2017 - Software Roadmap

Computing and Software for Big Science volume 3, Article 7 (2019)

“The result: a Programme of Work for the field as a whole, a multifaceted
approach to addressing growing computing needs on the basis of
existing or emerging hardware.”
Eckhard Elsen (CERN Director of Research and Computing), editorial
published with CWP/Roadmap

Workshops: January 2017, UCSD; June 2017, Annecy

Many workshops, involving a diverse group
• International participants
• Computing Management from the Experiments and Labs
• Individuals interested in the problems
• Members of other compute intensive scientific endeavors
• Members of Industry
• http://s2i2-hep.org/
• https://hepsoftwarefoundation.org/

Individual Papers on the arXiv:
Careers & Training; Conditions Data; DOMA; Data Analysis & Interpretation;
Data and Software Preservation; Detector Simulation; Event/Data Processing
Frameworks; Facilities and Distributed Computing; Machine Learning; Physics
Generators; Security; Software Development, Deployment, Validation; Software
Trigger and Event Reconstruction; Visualization

Community White Paper (arXiv 1712.06982) & the Strategic Plan (arXiv 1712.06592)
IRIS-HEP
Conceptual motivations behind the HEP Software Foundation

Computer hardware is a consumable. Software is the actual "cyberinfrastructure".

Software is also an intellectual product of our research, not just a tool.

We can seed and build collaborations around software in a similar fashion to our
experimental collaborations.
Software as a “cyberinfrastructure” - 1990s/early-2000s software
Geant4 (RD44 in 1994, V1.0 in 1998) began 30 years ago and just passed the 25th anniversary of
its first release, and ROOT (1994) is at or near its 30th anniversary. Both are ubiquitous in particle,
nuclear and astroparticle physics. (Geant4 is also used in medical and space physics.)

RooFit (D.Kirkby, W.Verkerke) is now nearly 25(?) years old and played a key role in the Higgs
discoveries and LHC (and other) physics.

EvtGen (A.Ryd, D.Lange), originally developed at CLEO and further developed in BaBar, lives on in many
experiments.

xrootd (A.Hanushevsky, many others) is now around 20 years old. Initially planned to fix issues
with the Objectivity AMS, it was repurposed into a next generation file server for BaBar with effort
from INFN (A.Dorigo, F.Furano) and (later) CERN.
xrootd has grown into a collaboration of SLAC/CERN/UCSD and others. It is not only widely used
by the LHC experiments, but it is also a key element of the OSG’s “Open Science Data Federation
(OSDF)” service broadly supporting science in the US (next slide) that needs high throughput
computing.
Snowmass CompF recommendation
APS DPF and Coordinating Panel for Software and Computing

A “Formation Task Force” has been appointed as an ad-hoc committee by the
DPF EC, with the goal of producing a document similar in purpose to the one
that defines CPAD’s charge, governance, internal organizational structure, and
some initial activities, including some awards programs and community
meetings.

See status report on the FTF by Ian Fisk yesterday


http://iris-hep.org

Conceived as a “software upgrade” project and guided initially by the “Community White
Paper” roadmap developed in 2016-2017: it involves 21 universities, spanning ATLAS, CMS
and LHCb.

IRIS-HEP is supported by the U.S. National Science Foundation through the Office of
Advanced CyberInfrastructure in the Directorate for Computer and Information
Science and Engineering and the Division of Physics in the Directorate for Mathematical and
Physical Sciences.

10-year project: Originally funded in 2018 as OAC-1836650 and renewed in 2023 through
2028 as PHY-2323298.
IRIS-HEP Strategic Plan for the Next Phase of Software Upgrades for HL-LHC Physics (arXiv:2302.01317)
Analysis Grand Challenge
The Analysis Grand Challenge (AGC) is about performing the last steps in an analysis
pipeline at scale to test workflows envisioned for the HL-LHC. This includes

● columnar data extraction from large datasets,
● processing of that data (event filtering, construction of observables, evaluation of
systematic uncertainties) into histograms,
● statistical model construction and statistical inference,
● relevant visualizations for these steps,

all done in a reproducible & preservable way that can scale to HL-LHC requirements.
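
The steps listed above can be sketched with a toy, dependency-free example. This is only an illustration under stated assumptions, not the AGC implementation: the column names (`jet_pt`, `jet_eta`), the selection cuts, the 5% scale variation and every function name are hypothetical, and plain Python lists stand in for the columnar arrays and Scikit-HEP tools an actual analysis would typically use.

```python
import bisect
import math
import random


def make_toy_events(n, seed):
    """Hypothetical columnar dataset: one list per column, one entry per event."""
    rng = random.Random(seed)
    return {
        "jet_pt": [20.0 + rng.expovariate(1.0 / 50.0) for _ in range(n)],  # GeV
        "jet_eta": [rng.uniform(-3.0, 3.0) for _ in range(n)],
    }


def fill_histogram(values, edges):
    """Count values into half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect.bisect_right(edges, v) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


def process(columns, edges, pt_scale=1.0):
    """Construct the observable, filter events, and fill a histogram."""
    pt = [p * pt_scale for p in columns["jet_pt"]]  # rescaled pT (toy systematic)
    mask = [p > 30.0 and abs(eta) < 2.4             # toy event selection
            for p, eta in zip(pt, columns["jet_eta"])]
    return fill_histogram([p for p, keep in zip(pt, mask) if keep], edges)


def fit_signal_strength(data, signal, background, n_steps=500, mu_max=5.0):
    """Grid-scan a binned Poisson log-likelihood for the signal strength mu."""
    def log_likelihood(mu):
        total = 0.0
        for n, s, b in zip(data, signal, background):
            expected = mu * s + b
            if expected <= 0.0:
                if n > 0:
                    return float("-inf")  # observed events where none expected
                continue
            total += n * math.log(expected) - expected
        return total

    grid = [mu_max * i / n_steps for i in range(n_steps + 1)]
    return max(grid, key=log_likelihood)


edges = [30.0, 60.0, 100.0, 150.0, 250.0]
events = make_toy_events(10_000, seed=1)

# One histogram per systematic variation of the pT scale.
histograms = {
    "nominal": process(events, edges),
    "scale_up": process(events, edges, pt_scale=1.05),
    "scale_down": process(events, edges, pt_scale=0.95),
}

# Toy statistical model: the nominal histogram plays the background,
# a made-up signal shape is injected with strength 1.
background = [float(c) for c in histograms["nominal"]]
signal = [0.0, 10.0, 30.0, 10.0]
data = [b + s for b, s in zip(background, signal)]

print(histograms)
print("fitted signal strength:", fit_signal_strength(data, signal, background))
```

Keeping each stage as a small function of columns in and histograms out is what lets the same workflow be rerun for every systematic variation, which is the part that has to scale.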
Begun as an integration exercise for IRIS-HEP, but has evolved as a community project
and benchmark reference bringing together different groups and experiments.

See also “Demonstrator Analysis 200 Gb/s” (B.Bockelman) at WLCG/HSF

Figure: example output of analysis notebook
Figure: Execution of AGC analysis benchmark

PyHEP Ecosystem (Scikit-HEP)
PyHEP.dev 2023 workshop: 40 participants

Figure: “Our scientific Python development vision/ecosystem” alongside the “Scientific Python / PyData vision/ecosystem”, with layers labeled Application Specific, Domain Specific, Technique Specific and Foundational.

Jake Vanderplas 2017 keynote-style Python ecosystem, but with layer labeling a la “Array programming with NumPy”, Nature, 585, 357-362 (2020)
The “Future of Computing” is of course people to do it: software training!
Training and Education Model
How to build a sustainable and scalable training framework
that grows skills through multiple stages of people’s careers?
Training Survey
In early 2019, we did a survey of training needs: 334 people responded!

Multiple software, computing, data science and AI/ML programs aiming at different career stages
● Undergraduate summer programs: US-CMS PURSUE program, US-ATLAS
SUPER program, IRIS-HEP Fellows program
● HSF/IRIS-HEP Training activities (materials and events) - 1600 students and 50
educators in the past few years
● A3D3 PostBac program
● Summer schools: IAIFI, CoDaS-HEP
● US-ATLAS and US-CMS postdoc R&D programs
● DOE CompHEP Traineeship projects: TAC-HEP, WATCHEP, C2-THE-P2
● International: Fellows/mentoring in NSF-funded HSF-India project, HSF Google
Summer of Code, CERN summer students, etc.

HSF Software Training Center
Central repository of training materials + instructor community - modeled on The Carpentries

Plus more…
And growing…

1500 students/50 instructors to date.


Computational and Data Science (CoDaS-HEP) Summer School
http://codas-hep.org
In-person summer school with lectures & hands-on exercises:
● Parallel Programming
● Data Science Tools and Techniques
● Machine Learning - Technology and Methods
● Practical skills: performance evaluation, use of git for version control

Photos: CoDaS-HEP 2017, CoDaS-HEP 2018, CoDaS-HEP 2019, CoDaS-HEP 2022
IRIS-HEP Fellows
Connects undergraduates in
physics and computer science
with HEP mentors active in
developing HEP research
software.

They work mostly remotely with their mentors on
cutting-edge software-centric
summer research projects.

DOE Traineeships in Computational HEP

New from 2023: 2-year traineeships for graduate students:
coursework/training activities in software/computing plus
dedicated software/computing R&D projects in collaboration
with DOE lab staff, with an aim to develop the next
generation of computational scientists and engineers.

4 funded projects: TAC-HEP, WATCHEP, C2-THE-P2 and
LGT4HEP (see individual websites for universities and labs
involved).

R&D topics span the experimental frontiers (energy, cosmic,
intensity) plus lattice gauge theory.

Beginnings of a cohort experience: Computational HEP
Traineeship Summer School (Princeton 2023, FNAL 2024)

Photo: Princeton school, 2023
Building global collaborations around common software

Although participation in experiment software is international,
much of the core software development comes primarily from the
US and Europe.

HSF-India project: 5-year project: software training, researcher
visits and seeding of software collaborations

http://research-software-collaborations.org/ (OISE-2201990)
And the “Future of Computing”?
The Worldwide LHC Computing Grid (WLCG) project is a
global collaboration of around 170 computing centres in
more than 40 countries, linking up national and international
grid infrastructures. The mission of the WLCG project is to
provide global computing resources ... [2000’s era]

The HEP Software Foundation facilitates cooperation and common
efforts in High Energy Physics software and computing internationally. [2010’s era]

What collaborative research efforts will the rest of the
2020’s and the 2030’s produce for the “future of
computing” in the HEP community?
