Co-design at Lawrence Livermore National Lab
The Department of Energy (DOE) has a long history of deploying leading-edge computing capability for science and national security. Going forward, DOE’s compelling science, energy assurance, and national security needs will require a thousand-fold increase in usable computing power, delivered as quickly and as energy-efficiently as possible. This will force fundamental changes in all computer components. Among those with extreme-scale computing needs, the collaborative and concurrent development of hardware, software, numerical methods, algorithms, and applications is widely considered to be a necessary step for achieving a usable exascale-class system.
"Co-design is about where the state of computing is going, rather than just focusing on creating one specific machine."
Rob Neely, Computation Associate Division Leader
The organizing principle for this type of coordinated development is co-design. Co-design draws on the combined expertise of vendors, hardware architects, system software developers, domain scientists, computer scientists, and applied mathematicians working together to make informed decisions about hardware and software components. To ensure that future architectures are well-suited for key DOE applications and that DOE scientific problems can take advantage of the emerging computer architectures, major DOE centers of computational science, including LLNL, are formally engaged in the co-design process.
Since the earliest days of supercomputing, LLNL has been known for fielding first-of-a-kind machines, most of which ranked among the fastest in the world at the time and were often the fastest. Those machines were developed through a process very similar to the co-design processes proposed for the exascale era. LLNL is actively pursuing a strategy both to leverage our co-design experience and to update it to meet the realities of today’s dynamic HPC environment, with its long lead times between concept and realization.
The LLNL co-design strategy is strongly tied to the overall strategy of the National Nuclear Security Administration (NNSA) and DOE, and we are committed to establishing deep working relationships with the vendor community to help inform our own large application efforts and provide input into their design process. We are actively adapting our existing large applications using incremental improvements such as fine-grained threading, use of accelerators, and scaling to millions of nodes using the Message Passing Interface (MPI). The new 20-petaflop Sequoia BlueGene/Q machine provides a living laboratory for these explorations. We are also looking to the next generation of programming models, researching new algorithms, and evaluating the need to rewrite our major multiphysics applications from scratch, to address software architecture complexities and better manage ever-increasing layers of hardware complexity.
Co-design efforts at LLNL include the following:
The Advanced Simulation and Computing (ASC) program develops and maintains engineering and physics integrated codes (EPICs) in support of stockpile stewardship. To meet the key needs of the EPICs, ASC has established the National Security Applications (NSApp) Co-Design Center. NSApp will focus on these established applications as the drivers, and participate in co-design and vendor interactions largely through proxy applications.
The Advanced Scientific Computing Research (ASCR) program develops and deploys computational and networking capabilities to analyze, model, simulate, and predict complex phenomena important to the DOE. ASCR is establishing several co-design centers that will be used as collaboration vehicles for national laboratory and university partners. Each center focuses on a specific application and uses development of that application as a way to explore issues of mathematics, algorithms, computer science, systems software, and hardware in the co-design process.
The FastForward initiative is intended to accelerate and influence the development of technologies that companies are pursuing for commercialization, to ensure that the resulting products include the features that DOE and NNSA laboratories require for research. FastForward funds innovative new or accelerated R&D of technologies targeted for productization in the 5–10 year timeframe.
Are you a vendor or researcher interested in exploring co-design with LLNL?
For additional information, please contact our team.
Rob Neely makes connections.
Livermore computer scientist and high-performance computing (HPC) advocate Rob Neely relishes the diverse nature of his responsibilities. "I like the breadth of work here," he says. "I also like to help build connections and see them blossom into collaborations. That's very satisfying."
Mini-Apps Accelerate Hardware and Software Development
A suite of small programs assists hardware and software developers in making better and more coordinated design decisions.