The Anton 3 ASIC: A Fire-Breathing Monster for Molecular Dynamics Simulations
Hot Chips 33
24 August 2021
The Anton 3 hardware team
Peter J. Adams†, Brannon Batson, Alistair Bell, Jhanvi Bhatt†, J. Adam Butts,
Timothy Correia, Bruce Edwards, Peter Feldmann, Christopher H. Fenton,
Anthony Forte, Joseph Gagliardo, Gennette Gill, Maria Gorlatova†, Brian Greskamp,
J.P. Grossman†, Jeremy Hunt, Bryan L. Jackson, Mollie M. Kirk†, Jeffrey S. Kuskin†,
Roy J. Mader, Richard McGowen†, Adam McLaughlin, Mark A. Moraes,
Mohamed Nasr†, Lawrence J. Nociolo, Lief O'Donnell, Andrew Parker, Jon L. Peticolas,
Terry Quan, T. Carl Schwink†, Keun Sup Shim, Naseer Siddique†, Jochen Spengler†,
Michael Theobald, Brian Towles†, William Vick†, Stanley C. Wang, Michael Wazlowski,
Madeleine J. Weingarten, John M. Williams†, David E. Shaw
† Work conducted while at D. E. Shaw Research; author’s affiliation has subsequently changed.
Molecular dynamics (MD) simulation
• Understand biomolecular systems through their motions
• Numerical integration of Newton's laws of motion (see the sketch after this list)
– Model atoms as point masses
– Compute forces on every atom based on current positions
– Update atom velocities and positions in discrete time steps of a few femtoseconds
• Force computation described by a model: the force field
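To make the time-stepping loop concrete, here is a minimal velocity Verlet integrator in plain Python. The harmonic toy force model, the parameter values, and the step count are illustrative assumptions for this sketch, not Anton's actual numerics.

```python
# Minimal velocity Verlet integrator: a sketch of the MD time-stepping
# loop described above. The force model and parameters are placeholders.

def compute_forces(positions):
    """Toy force model: each atom is pulled toward the origin (F = -k*x).
    A real force field sums bonded and nonbonded terms instead."""
    k = 1.0  # assumed spring constant
    return [[-k * c for c in r] for r in positions]

def velocity_verlet(positions, velocities, masses, dt, n_steps):
    """Advance atom positions and velocities in discrete time steps."""
    forces = compute_forces(positions)
    for _ in range(n_steps):
        # Half-kick: v(t + dt/2) = v(t) + (F(t)/m) * dt/2
        for i, m in enumerate(masses):
            velocities[i] = [v + f * dt / (2 * m)
                             for v, f in zip(velocities[i], forces[i])]
        # Drift: x(t + dt) = x(t) + v(t + dt/2) * dt
        for i in range(len(positions)):
            positions[i] = [x + v * dt
                            for x, v in zip(positions[i], velocities[i])]
        # Recompute forces at the new positions, then second half-kick.
        forces = compute_forces(positions)
        for i, m in enumerate(masses):
            velocities[i] = [v + f * dt / (2 * m)
                             for v, f in zip(velocities[i], forces[i])]
    return positions, velocities

# Two atoms, unit masses, a few small steps (arbitrary units).
pos, vel = velocity_verlet([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                           [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
                           [1.0, 1.0], dt=0.001, n_steps=10)
```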
Biomolecular force fields
F = F_bonded + F_pairwise(r < Rcut) + F_long-range
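The nonbonded part of a force field with this shape can be sketched as a cutoff-limited pairwise sum. The Lennard-Jones and Coulomb forms below are the standard textbook terms; the function name, parameter values, and the cutoff-only treatment of electrostatics are simplifying assumptions (production force fields handle the long-range part separately, e.g. with grid/FFT methods).

```python
import math

def nonbonded_force_pair(r_vec, eps, sigma, q1, q2, r_cut):
    """Force on atom 1 from atom 2 (r_vec = r1 - r2): Lennard-Jones plus
    Coulomb, zeroed beyond the cutoff r_cut. Units are illustrative."""
    r2 = sum(c * c for c in r_vec)
    r = math.sqrt(r2)
    if r >= r_cut:
        return [0.0, 0.0, 0.0]  # far pairs go to a separate long-range method
    # Lennard-Jones: V = 4*eps*((sigma/r)^12 - (sigma/r)^6)
    sr6 = (sigma / r) ** 6
    lj = 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r2
    # Coulomb: V = q1*q2/r (Gaussian units for simplicity)
    coul = q1 * q2 / (r2 * r)
    # F = -dV/dr along the unit vector from atom 2 to atom 1.
    scale = lj + coul
    return [scale * c for c in r_vec]
```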
Meet Anton 3
Anton 2: The defending champion
[Anton 2 flex tile diagram: dispatch unit, geometry core, 640 KB local-memory SRAM, 256 KB SRAM network interface, and 38 PPIMs (0–37)]
Efficient computation: The core tile
[Core tile diagram: two geometry cores (GC), each with 128 KiB SRAM, rows of PPIMs, a router (RTR), and a bond calculator (BOND); callouts ❶–❺]
• Evolutionary changes
① Support additional functional forms
② Increase memory capacity
③ Tune instruction set for MD application
③ Increase code density
• Revolutionary changes
④ Co-locate compute resources
⑤ Specialize bonded force computation
① Double effective density of pairwise interaction calculation
②④ Implement fine-grained synchronization within memory and network
Bond calculator
[Diagram: positions r1 … rN and term parameters (k, q0, …) feed a coordinate computation q, then dV/dq, producing forces F1 … FN]

Term             Stretch        Angle          Dihedral / Torsion
Atoms            2              3              4
Coordinate (q)   r              θ              ϕ
Potential (V)    k(r − r0)²     k(θ − θ0)²     ∑n≤6 kn cos(nϕ − ϕ0)
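The table maps directly onto a compute pattern: reduce atom positions to a scalar coordinate q, evaluate dV/dq, then distribute forces back to the atoms via the chain rule. Below is a minimal Python sketch for the stretch term; the other terms differ only in how q and its gradients are computed, and all parameter values here are illustrative.

```python
import math

def stretch_forces(r1, r2, k, r0):
    """Harmonic bond stretch: V(q) = k*(q - r0)^2 with q = |r1 - r2|.
    Returns forces on the two atoms via the chain rule
    F_i = -(dV/dq) * (dq/dr_i). Parameters k, r0 are illustrative."""
    d = [a - b for a, b in zip(r1, r2)]      # r1 - r2
    q = math.sqrt(sum(c * c for c in d))     # coordinate q = bond length
    dV_dq = 2.0 * k * (q - r0)               # scalar derivative from the table
    u = [c / q for c in d]                   # dq/dr1 = unit vector along bond
    f1 = [-dV_dq * c for c in u]             # force on atom 1
    f2 = [+dV_dq * c for c in u]             # equal and opposite on atom 2
    return f1, f2

# Example: a slightly stretched bond (q = 1.1) relaxing toward r0 = 1.0.
f1, f2 = stretch_forces([0.0, 0.0, 0.0], [1.1, 0.0, 0.0], k=100.0, r0=1.0)
```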
Near versus far
• Volume within radius r: N ∝ r³ (BIG)
• Electrostatic force at distance r: F ∝ 1/r² (small)
[Plots: Vol(r) growing as r³; F(r) falling off as 1/r²]
Many distant partners, each contributing only weakly: near and far interactions can be handled by different methods.
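A quick numerical illustration of this scaling (the uniform atom density below is an assumed value): the population within radius r explodes while the per-pair force shrinks, which is why a simulation benefits from treating near and far pairs differently.

```python
import math

# Illustration of the near/far scaling on this slide, assuming a uniform
# atom density (arbitrary value) and unit charges.
density = 0.1  # atoms per unit volume (assumed)

for r in [1.0, 2.0, 4.0, 8.0, 16.0]:
    n_within = density * (4.0 / 3.0) * math.pi * r ** 3  # N ∝ r^3 (BIG)
    force = 1.0 / r ** 2                                 # F ∝ 1/r^2 (small)
    print(f"r={r:5.1f}  atoms within r ≈ {n_within:9.1f}  force at r ≈ {force:.4f}")
```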
Efficient communication: The Edge Tile
[Edge tile diagram: SERDES, router (RTR), ICB, packet cache (PCACHE), and multicast (MCAST) blocks; callouts ❶–❺]
• Evolutionary changes
① Increase SERDES data rate
② Reduce hop latency
• Revolutionary changes
③ Separate edge network
Laying tiles
[Die floorplan showing edge tiles and core tiles]
Physical design
• Channel-less, abutted layout
• Few unique blocks
• Global, low-skew clock mesh
• Engineered global routing
• Column-level redundancy
• Robust power delivery
The evolution of Anton
[Photos of successive machine configurations: 8×8 nodes, 2×64 nodes, 512 nodes]
Network
[Photo: Z-dimension cabled (blue and yellow)]
Taming (cooling) the beast
Junction temperature TJ < 65 °C at 500 W
MD performance
[Chart, log scale: Anton 3 is >100× faster than the compared MD platforms; see the performance references below]
Acknowledgements
• System software group for machine bring-up
• Embedded software group for creating and tuning the application
• Ken Mackenzie for performance results and figures
• Systems group for support and infrastructure (and lots of photos!)
• Chemistry team for putting Anton to good use
• Kevin Yuh for MD simulation videos
Performance references
• GPU performance results
– M. Bergdorf et al., “Desmond/GPU Performance as of April 2021,” DESRES/TR--2021-01 [Online: April 2021]. https://deshawresearch.com/publications.html
– “A100, V100 and Multi-GPU Benchmarks” [Online: January 2020]. https://github.com/openmm/openmm/issues/2971
– “NVIDIA HPC Application Performance” [Online: July 2021]. https://developer.nvidia.com/hpc-application-performance
– “Gromacs/NAMD Multi-GPU Scaling,” unpublished internal benchmarking.
• Supercomputer performance results
– “RIKEN Team Use Supercomputer to Explore the Molecular Dynamics of the New Coronavirus,” HPCwire announcement [Online: March 2020]. https://www.hpcwire.com/off-the-wire/riken-team-use-supercomputer-to-explore-the-molecular-dynamics-of-the-new-coronavirus/
– S. Páll et al., “Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS,” in S. Markidis and E. Laure (eds.), Solving Software Challenges for Exascale (EASC 2014), Lecture Notes in Computer Science, vol. 8759, 2015.
– NAMD scaling on Summit [Online: May 2018]. http://www.ks.uiuc.edu/Research/namd/benchmarks/
– L. Casalino et al., “AI-Driven Multiscale Simulations Illuminate Mechanisms of SARS-CoV-2 Spike Dynamics,” International Journal of High Performance Computing Applications, 2021.
– J. R. Perilla and K. Schulten, “Physical Properties of the HIV-1 Capsid from All-Atom Molecular Dynamics Simulations,” Nature Communications, vol. 8 (15959), 2017.
• Anton performance results (original publications; improved performance used in comparisons)
– D. E. Shaw et al., “Millisecond-Scale Molecular Dynamics Simulations on Anton,” in SC’09: Proceedings of the Conference on High Performance Computing Networking, Storage, and Analysis, 2009, pp. 1–11.
– D. E. Shaw et al., “Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer,” in SC’14: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2014, pp. 41–53.
– D. E. Shaw et al., “Anton 3: Twenty Microseconds of Molecular Dynamics Simulation Before Lunch,” to appear in SC’21: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2021.