Co-design at Lawrence Livermore National Lab
Pynamic is a benchmark designed to test a system's ability to handle the dynamic linking and loading (DLL) requirements of Python-based scientific applications. We developed this benchmark to represent a newly emerging class of DLL behaviors. Pynamic builds on pyMPI, an MPI extension to Python. Our augmentation includes a code generator that automatically generates dummy Python C-extension codes and a glue layer that facilitates linking and loading of the generated dynamic modules into the resulting pyMPI. Pynamic is configurable, enabling it to model the static properties of a specific code. It does not, however, model any significant computations of the target and hence is not subject to the same level of control as the target code. In fact, we encourage high performance computing (HPC) vendors and tool developers to add it to their test suites. This benchmark provides an effective test of the ability of the compiler, the linker, the loader, the OS kernel, and other runtime systems of an HPC system to handle an important aspect of modern scientific computing applications. In addition, the benchmark serves as a stress test for code development tools. Although Python has recently gained popularity in the HPC community, its heavy use of DLL operations has hindered certain HPC code development tools, notably parallel debuggers, from performing optimally.
Method of Solution
The heart of Pynamic is a Python script that generates C files and compiles them into shared object libraries. Each library contains a Python-callable entry function as well as a number of utility functions. The user can also enable cross-library function calls with a command line argument. The Pynamic configure script then links these libraries into the pynamic-pyMPI executable and creates a driver script to exercise the functions in the generated libraries. The user can specify the number of libraries to create, as well as the average number of utility functions per library, thus tailoring the benchmark to match an application of interest. Pynamic randomizes the number of functions per module and the function signatures, which ensures some heterogeneity among the libraries and functions.
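The generation step can be pictured with a minimal sketch. This is not Pynamic's actual generator: the function `generate_module_source`, the naming scheme, the spread of the function count, and the plain C entry function (the real entry function is Python-callable) are all hypothetical, chosen only to illustrate emitting a C file with one entry function plus a randomized number of utility functions with randomized signatures.

```python
import random


def generate_module_source(module_name, avg_num_functions, rng):
    """Return C source text for one hypothetical generated module.

    Illustrative only: names and layout are assumptions, not Pynamic's output.
    """
    # Vary the function count around the requested average (at least one).
    low = max(1, avg_num_functions // 2)
    high = max(low, avg_num_functions + avg_num_functions // 2)
    num_functions = rng.randint(low, high)

    lines = []
    names = []
    for index in range(num_functions):
        # Randomize the signature: between zero and four int parameters.
        num_params = rng.randint(0, 4)
        params = ", ".join(f"int a{p}" for p in range(num_params)) or "void"
        name = f"{module_name}_fn{index}"
        names.append((name, num_params))
        lines.append(f"int {name}({params}) {{ return {index}; }}")

    # The entry function calls every utility function once.
    lines.append(f"int {module_name}_entry(void) {{")
    lines.append("    int total = 0;")
    for name, num_params in names:
        args = ", ".join("0" for _ in range(num_params))
        lines.append(f"    total += {name}({args});")
    lines.append("    return total;")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A fixed seed makes the generated source reproducible.
    print(generate_module_source("libmodule0", 4, random.Random(42)))
```

Passing an explicit random number generator keeps the output reproducible for a given seed, which matters when the same set of libraries has to be regenerated on several systems.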
Download Pynamic source code on GitHub at https://github.com/scalability-llnl/pynamic.
config_pynamic.py <num_files> <avg_num_functions> [options] [-c <configure_options>]
<num_files> = total number of shared objects to produce
<avg_num_functions> = average number of functions per shared object
-c <configure_options>
pass the whitespace-separated list of <configure_options> to configure
when building pyMPI. All args after -c are sent to configure and not
interpreted by Pynamic
maximum Pynamic call stack depth, default = 10
enables external functions to call across modules
add <python_include_dir> when compiling modules
add <length> characters to the function names
add a print statement to every generated function
seed to the random number generator
add timing metrics to the Pynamic driver
-u <num_utility_mods> <avg_num_u_functions>
create <num_utility_mods> math library-like utility modules
with an average of <avg_num_u_functions> functions
NOTE: Number of python modules = <num_files> - <num_utility_mods>
use the C compiler located in <command> to build Pynamic modules
use the python located at <command> to build Pynamic modules.
Will also be passed to the pyMPI configure script
Upon success, config_pynamic.py will build three executables: pyMPI, a standalone build of pyMPI; pynamic-pyMPI, a pyMPI executable with all of the generated libraries linked in; and pynamic-bigexe, an artificially large pyMPI executable that also has the libraries linked in.
In a non-MPI environment, one can directly invoke the code generator with the same options as above:
% python so_generator.py <num_files> <avg_num_functions> [options]
% python pynamic_driver.py
or for an MPI job (use your own equivalent mpirun command):
% mpirun pyMPI pynamic_driver.py
% mpirun pynamic-pyMPI pynamic_driver.py
Options and arguments are provided so that a tester can model certain static properties of a Python-based scientific application. For example, suppose the tester wants to model a code that has the following properties (these were taken from an important LLNL application):
+ Number of shared libraries: 495
- Python callable C-extension libraries: ~280
- Utility libraries (python "uncallable"): ~(495-280)=215
+ Aggregate total of shared libraries: 1.1GB
- Aggregate text size of shared libraries: 234MB
- Aggregate data size of shared libraries: 3.9MB
- Aggregate debug section size of shared libraries: 779MB
- Aggregate symbol table size of shared libraries: 11MB
- Aggregate string table size of shared libraries: 75MB
A tester may use:
% config_pynamic.py 495 1850 -e -u 215 1850 -n 100
Please examine other options to model a target code better.
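The arithmetic behind the example command line can be checked with a short sketch. The helper `module_counts` is hypothetical; it only encodes the relationship implied by the example's own figures (495 shared libraries in total, 215 of them utility libraries, leaving about 280 Python-callable ones), on the assumption that utility modules are counted within <num_files>.

```python
def module_counts(num_files, num_utility_mods):
    """Split the total shared-object count into Python-callable and utility modules."""
    if not 0 <= num_utility_mods <= num_files:
        raise ValueError("num_utility_mods must be between 0 and num_files")
    return {
        "python_modules": num_files - num_utility_mods,
        "utility_modules": num_utility_mods,
    }


# Values taken from the example application above.
num_files = 495
avg_num_functions = 1850
num_utility_mods = 215

counts = module_counts(num_files, num_utility_mods)
print(counts)

# Assemble the argument list for the example invocation shown above.
command = [
    "config_pynamic.py", str(num_files), str(avg_num_functions),
    "-e",
    "-u", str(num_utility_mods), str(avg_num_functions),
    "-n", "100",
]
print(" ".join(command))
```

Keeping the arguments in a list makes it easy to vary one parameter at a time when tuning the generated libraries toward a target code.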
When a Pynamic build is complete, it prints out a summary message about its own static properties.
Size of aggregate total of shared libraries: 1.4GB
Size of aggregate texts of shared libraries: 491.0MB
Size of aggregate data of shared libraries: 12.9MB
Size of aggregate debug sections of shared libraries: 710.0MB
Size of aggregate symbol tables of shared libraries: 35.7MB
Size of aggregate string table size of shared libraries: 172.0MB
When more details on static properties for individual shared libraries are desired, please look into the full report: "sharedlib_section_info"
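To compare a build against a target code, the summary lines can be turned into numbers. The sketch below is an illustration only: the parser `parse_sizes` is hypothetical, it assumes the summary keeps the "label: value unit" shape shown above, and it assumes binary units (1MB = 1024 * 1024 bytes), which the summary itself does not state.

```python
import re

# Assumption: binary units; the summary does not say which convention it uses.
UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

SUMMARY = """\
Size of aggregate total of shared libraries: 1.4GB
Size of aggregate texts of shared libraries: 491.0MB
Size of aggregate data of shared libraries: 12.9MB
Size of aggregate debug sections of shared libraries: 710.0MB
Size of aggregate symbol tables of shared libraries: 35.7MB
Size of aggregate string table size of shared libraries: 172.0MB
"""


def parse_sizes(text):
    """Map each summary label to its size in bytes."""
    sizes = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?):\s*([\d.]+)\s*(KB|MB|GB)\s*$", line)
        if match:
            label, value, unit = match.groups()
            sizes[label] = float(value) * UNITS[unit]
    return sizes


sizes = parse_sizes(SUMMARY)

# Compare the generated text size with the 234MB text size of the target code.
target_text_bytes = 234 * UNITS["MB"]
generated_text_bytes = sizes["Size of aggregate texts of shared libraries"]
print(f"text size ratio (generated / target): "
      f"{generated_text_bytes / target_text_bytes:.2f}")
```

A ratio well above or below 1 suggests adjusting the average number of functions or the function name length and rebuilding.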
Gregory L. Lee, Dong H. Ahn, Bronis R. de Supinski, John Gyllenhaal, Patrick Miller. Pynamic: the Python Dynamic Benchmark.
Patrick Miller. pyMPI - An Introduction to parallel Python using MPI.
Originally posted as http://computation.llnl.gov/casc/Pynamic/pynamic.htm UCRL-WEB-230211