Z2 - Computing services
This project provides the necessary services for collaborative research within W2W. This includes:
- the provision of tools and data sets used across the consortium
- data stewardship and coordination of data management policies
- the provision of a Virtual Research Environment enabling seamless sharing of code, data, and advanced tools such as visualization among researchers at all locations.
Many research projects rely on and build upon three large ensemble model experiments: the "Regional Grand Ensemble" (RGE) led by B3, the "Global Control Ensemble" (GCE) led by C8, and the "Tropical Ensemble" (TE) led by C2. These simulations will be computationally very expensive and, with an output volume of more than a petabyte, will also pose a major challenge to data management. PhD students will either evaluate these experiments directly or base their own spin-off simulations on them, which means that the results must be available early in the project phase. We achieve this by having the simulations carried out centrally by experienced scientific programmers in Z2. In addition, a framework for setting up similarly designed spin-off experiments will be developed and provided to the individual research projects.
The strategy of providing a virtual research environment has paid off strongly, especially during the COVID crisis. A JupyterHub and remote visualization servers connected to our central data storage, our central code repository, and our internal wiki have made collaboration across institutions much easier. To build on this success, we plan to modernize and expand our capabilities to meet the increased demands of our scientists. Part of these demands is the handling of very large amounts of data, both for collaboration within W2W and for publication following the FAIR principles. The latter requirement is being addressed through close integration with the data management strategies and facilities of the local data centers and libraries, as well as through training events.
The effective use of very large amounts of data requires not only storage but also suitable tools. We plan to continue developing the interactive visual analysis software "Met.3D" as well as the post-processing and evaluation Python package "Ensemble Tools". The application and integration of the lossy-compression methods implemented in Phase 2 will make an important contribution to data handling. In addition, we support students and scientists in implementing new methods and in the use of modern software development tools.
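The specific codecs adopted in Phase 2 are not detailed here; as a generic illustration, lossy compression of floating-point model output can be sketched as mantissa bit truncation, a common approach for ensemble data. The function name `truncate_mantissa` and the choice of `keepbits=7` are illustrative assumptions, not part of the project's toolchain:

```python
import zlib
import numpy as np

def truncate_mantissa(a: np.ndarray, keepbits: int) -> np.ndarray:
    """Zero the trailing (23 - keepbits) mantissa bits of float32 data.

    The relative error introduced is below 2**-keepbits, and the zeroed
    bytes make the array far more compressible by lossless codecs.
    """
    drop = 23 - keepbits                              # float32: 23 mantissa bits
    bits = np.ascontiguousarray(a, dtype=np.float32).view(np.uint32)
    mask = np.uint32((0xFFFFFFFF >> drop) << drop)    # keep sign, exponent, top bits
    return (bits & mask).view(np.float32)

# Synthetic stand-in for model output: compare zlib sizes raw vs. truncated.
rng = np.random.default_rng(seed=0)
field = rng.normal(size=100_000).astype(np.float32)
rounded = truncate_mantissa(field, keepbits=7)

raw_size = len(zlib.compress(field.tobytes()))
lossy_size = len(zlib.compress(rounded.tobytes()))
```

Combining such bit truncation with a standard lossless codec trades a bounded, user-chosen precision loss for a substantially smaller archive, which is the basic design trade-off behind lossy compression of petabyte-scale ensemble output.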