

MACSio: A Multi-purpose, Application-Centric, Scalable I/O proxy application


Overview

MACSio (pronounced "max-ee-oh") was developed to fill a long-standing void in co-design proxy applications that support I/O performance testing as well as evaluation of tradeoffs in data model interfaces and parallel I/O paradigms for multi-physics HPC applications. Two key design features of MACSio set it apart from existing I/O proxy applications and benchmarking tools. The first is the level of abstraction (LOA) at which MACSio is designed to operate.

Levels of abstraction (LOA) in the HPC I/O stack (left), typical abstraction objects (middle),
and example implementations (right); LOA increases going up the stack.

The second is the degree of flexibility MACSio is designed to provide in driving an HPC I/O workload through parameterized, user-defined data objects and a variety of parallel I/O paradigms and I/O interfaces. Combined, these features allow MACSio to closely mimic I/O workloads for a wide variety of real HPC applications and, in particular, multi-physics applications where data object distribution and composition vary dramatically both within and across parallel tasks. These data objects are then marshaled between primary and secondary storage according to a variety of application use cases (e.g., restart dumps or trickle dumps) using one or more I/O interfaces (plugins) and parallel I/O paradigms. This allows for direct comparisons of software interfaces, parallel I/O paradigms, and file system technologies with the same set of customizable data objects.
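
To make the notion of a parallel I/O paradigm concrete, the sketch below (illustrative only, not MACSio source code) shows a minimal file-per-rank dump in the spirit of the multiple-independent-files (MIF) style; the array contents, sizes, and file naming are assumptions made for the example.

    #include <mpi.h>
    #include <stdio.h>

    /* Minimal file-per-rank dump in the spirit of the MIF (multiple independent
     * files) paradigm: each rank writes its own data object to its own file with
     * no inter-rank coordination. Production MIF implementations typically group
     * several ranks per file; that detail is omitted here for brevity. */
    int main(int argc, char **argv)
    {
        int rank;
        double part[1024];                        /* stand-in for this rank's mesh part */
        char fname[64];
        FILE *fp;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        for (int i = 0; i < 1024; i++)
            part[i] = rank + i * 1e-3;            /* arbitrary example values */

        snprintf(fname, sizeof fname, "dump_rank%05d.bin", rank);
        fp = fopen(fname, "wb");
        if (fp) {
            fwrite(part, sizeof(double), 1024, fp);
            fclose(fp);
        }

        MPI_Finalize();
        return 0;
    }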

Block diagram of MACSio main and its I/O plugins. Uni-modal plugins manage data only in files,
while bi-modal plugins manage data both in files and in memory.

I/O Performance Characteristics

Here we show (in red) typical I/O performance characteristics as a function of request size for various layers of software in the HPC I/O stack. For any given layer, performance almost always improves with increasing request size, because fixed overheads are amortized over larger and larger transfers. Because higher layers in the stack carry more overhead, owing to the additional metadata needed to implement the corresponding abstractions, performance at a given request size is typically lower with each layer upward in the stack. This is demonstrated by the gaps between the red bandwidth-versus-request-size curves. In general, if an application is able to generate larger requests, these overheads can be amortized away to insignificance.
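
A simple latency-plus-bandwidth cost model makes the amortization effect easy to see. The sketch below is illustrative only; the 1 ms per-request overhead and 1 GB/s peak bandwidth are assumed numbers, not measured values for any particular layer.

    #include <stdio.h>

    /* Effective bandwidth under a simple cost model:
     *   time = per_request_overhead + bytes / peak_bandwidth
     * As the request size grows, the fixed overhead term shrinks relative to the
     * transfer term and effective bandwidth approaches the peak. */
    static double effective_bw(double bytes, double overhead_s, double peak_bw)
    {
        return bytes / (overhead_s + bytes / peak_bw);
    }

    int main(void)
    {
        const double overhead_s = 1.0e-3;         /* assumed 1 ms per-request overhead */
        const double peak_bw    = 1.0e9;          /* assumed 1 GB/s peak bandwidth     */

        for (double bytes = 4096; bytes <= 64.0 * 1024 * 1024; bytes *= 4)
            printf("%10.0f bytes -> %7.1f MB/s\n",
                   bytes, effective_bw(bytes, overhead_s, peak_bw) / 1.0e6);
        return 0;
    }

With these assumed numbers, a 4 KiB request achieves only a few MB/s, while a 64 MiB request approaches the 1 GB/s peak, which is the amortization behavior described above.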

Typical I/O performance as a function of request size (red) and a typical
I/O request-size histogram as a percentage of the total dump (yellow).

We also show (in yellow) an I/O request-size histogram for a typical restart dump as a percentage of the total bytes in the dump. A given bar indicates the percentage of total bytes in the dump that were transferred at that request size. We distinguish two categories of requests: those that originate from the application itself (solid yellow) and those that originate from one or more of the lower layers in the I/O stack on behalf of the application (hashed yellow), typically metadata associated with the abstractions. In this example, a majority of the smaller requests originated from the application itself. This suggests that the application could be adjusted to aggregate many of its smaller requests into a single larger request and see improved performance, as in the sketch below. With appropriate use of the timing and request-size information gathered from within MACSio and its I/O plugins, this kind of detailed application I/O emulation and performance analysis is possible.
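
As a sketch of what such an adjustment might look like inside an application (this is not MACSio code; the buffer size, type names, and record size are assumptions for illustration), many small writes can be staged in a memory buffer and issued to the file as a single large request:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define AGG_BUF_BYTES (16u * 1024 * 1024)     /* assumed 16 MiB staging buffer */

    /* Aggregating writer: small records are copied into a staging buffer and
     * written to the file in one large request when the buffer fills. */
    typedef struct {
        FILE  *fp;
        char  *buf;
        size_t used;
    } agg_writer;

    static void agg_flush(agg_writer *w)
    {
        if (w->used > 0) {
            fwrite(w->buf, 1, w->used, w->fp);
            w->used = 0;
        }
    }

    static void agg_write(agg_writer *w, const void *data, size_t nbytes)
    {
        if (w->used + nbytes > AGG_BUF_BYTES)
            agg_flush(w);                         /* drain the buffer first */
        if (nbytes >= AGG_BUF_BYTES) {            /* already large: pass straight through */
            fwrite(data, 1, nbytes, w->fp);
            return;
        }
        memcpy(w->buf + w->used, data, nbytes);
        w->used += nbytes;
    }

    int main(void)
    {
        agg_writer w = { fopen("dump.bin", "wb"), malloc(AGG_BUF_BYTES), 0 };
        double record[32] = {0};                  /* stand-in for a small application request */

        for (int i = 0; i < 100000; i++)
            agg_write(&w, record, sizeof record); /* 100k small writes become a few large ones */
        agg_flush(&w);

        fclose(w.fp);
        free(w.buf);
        return 0;
    }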

Software

Download MACSio on GitHub
LLNL-CODE-676051

Publications

User Manual

MACSio Doxygen docs

Design

MACSio design document

Presentations

MACSio Review

