Dask-Extended External Tasks for HPC/ML In transit Workflows

Abstract

In situ workflows are inescapable to fully leverage exascale architectures. They can be complex to build, however, because simulation and data analytics come from two different software ecosystems with their own paradigms and programming models. This work extends the deisa bridging model between MPI+X simulations and distributed task-based analytics; it introduces the concept of external tasks to support the description of task graphs spanning multiple timesteps ahead of time while improving scalability. The new approach leads to straightforward support for contracts in the task graph, which limit the transferred data to that actually analyzed in a given execution. We implement this approach using Dask and MPI and evaluate an end-to-end in-transit workflow that uses unsupervised ML for dimensionality reduction. We compare our approach to plain Dask postprocessing and to the previous version of deisa. Our approach performs better, up to ×7 and ×3 compared to plain Dask and the previous version of deisa, and is ×18 less costly than plain Dask, all with similar development efforts.
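The idea of external tasks and contracts can be illustrated with a minimal, library-free sketch. It is not the deisa or Dask API: the `EXTERNAL` marker, the `build_graph` and `contract` helpers, and the field names are all hypothetical, chosen only for illustration. The sketch declares a task graph over several timesteps before any data exists, marks simulation outputs as external tasks, and then derives a contract as the set of external tasks reachable from the requested analytics result.

```python
from typing import Dict, Iterable, Set, Tuple, Union

# Hypothetical marker: a key whose value is produced outside the task
# graph (here, by the simulation) rather than computed by a task.
EXTERNAL = "external"

# A task is either EXTERNAL or a tuple (operation_name, *dependency_keys).
Task = Union[str, Tuple[str, ...]]


def build_graph(timesteps: int, fields: Tuple[str, ...]) -> Dict[str, Task]:
    """Describe the whole workflow ahead of time, before any data exists."""
    graph: Dict[str, Task] = {}
    # One external task per (field, timestep) the simulation could expose.
    for t in range(timesteps):
        for field in fields:
            graph[f"{field}-{t}"] = EXTERNAL
    # Illustrative analytics: only the "temperature" field is analyzed.
    for t in range(timesteps):
        graph[f"reduce-{t}"] = ("reduce", f"temperature-{t}")
    graph["fit"] = ("fit", *[f"reduce-{t}" for t in range(timesteps)])
    return graph


def contract(graph: Dict[str, Task], targets: Iterable[str]) -> Set[str]:
    """Return the external keys that the requested targets depend on."""
    needed: Set[str] = set()
    seen: Set[str] = set()
    stack = list(targets)
    while stack:
        key = stack.pop()
        if key in seen:
            continue
        seen.add(key)
        task = graph[key]
        if task == EXTERNAL:
            needed.add(key)          # the simulation must send this block
        else:
            stack.extend(task[1:])   # follow dependencies of a regular task
    return needed


graph = build_graph(timesteps=3, fields=("temperature", "pressure"))
to_send = contract(graph, ["fit"])
print(sorted(to_send))
```

In this toy setting the simulation would only need to send the `temperature` blocks, since no requested result depends on `pressure`; anything outside the contract could stay on the simulation side.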