The Data Mover is a software product called Nodeum, developed by MT-C. The goal of this service is to offer users of the Fenix Infrastructure programmable, high-speed, scalable and secure data movement between Active Data Repositories (ACDs) and Archival Data Repositories (ARDs). The same software is meant to be offered at all five ICEI project member sites (BSC, CEA, JSC, CINECA, CSCS), but it will only move data locally within each site.
The POSIX-compliant High-Performance Computing (HPC) filesystems represent the ACDs, and the object storage (accessed through the Swift API) represents the ARDs. Authentication is handled by the central Fenix AAI (for details, see also the news item on the Fenix Authentication and Authorization Infrastructure (AAI) status and updates). The service runs on dedicated nodes and exposes a REST interface. For user access, such as triggering a data movement, a CLI is provided.
Data Mover @ JSC
In addition to the CLI, users can trigger data movements via SLURM. The Data Mover Service is integrated into SLURM through the burst buffer plugin to offer:
- Automated stage-in of data from a selected ARD to a selected ACD before the execution of a batch job. The availability of the data is ensured before the start of the job.
- Automated stage-out of data from a selected ACD to a selected ARD following the execution of a batch job.
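The ordering guarantee described above (stage-in completes before the job starts, stage-out follows the job) can be illustrated with a small sketch. This is only a conceptual model, not the SLURM burst buffer plugin or the Data Mover API: the names `Transfer` and `plan_batch_job`, and the repository and path strings in the usage example, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Transfer:
    """Hypothetical description of one data movement (not a Data Mover type)."""
    source: str       # where the data is read from
    destination: str  # where the data is written to


def plan_batch_job(job_name: str,
                   stage_in: Optional[Transfer] = None,
                   stage_out: Optional[Transfer] = None) -> List[str]:
    """Return the steps of a batch job in the order they would have to run.

    Stage-in (ARD -> ACD) comes first so the data is available when the
    job starts; stage-out (ACD -> ARD) comes after the job has finished.
    """
    steps: List[str] = []
    if stage_in is not None:
        steps.append(f"stage-in: {stage_in.source} -> {stage_in.destination}")
    steps.append(f"run: {job_name}")
    if stage_out is not None:
        steps.append(f"stage-out: {stage_out.source} -> {stage_out.destination}")
    return steps


if __name__ == "__main__":
    # Hypothetical container and filesystem names, for illustration only.
    plan = plan_batch_job(
        "simulation",
        stage_in=Transfer("ARD:my-container/input", "ACD:/scratch/input"),
        stage_out=Transfer("ACD:/scratch/output", "ARD:my-container/output"),
    )
    for step in plan:
        print(step)
```

In the real integration, SLURM and the Data Mover service handle this sequencing; the sketch only makes the order of the three phases explicit.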
For now, the Data Mover service is available at JSC on JUDAC. Other sites are currently testing and evaluating the service. CSCS and BSC also plan to make the service available to users in the near future.
Overall, the Data Mover Service offers HPC users a simple yet rich connection between filesystems and the object store. This connection is easy to use, secure, fast and easily extendable. Login is available for all Fenix sites, and users get the same experience everywhere. This connection is especially beneficial for HBP/EBRAINS users, as workflows between ACD and ARD can easily be implemented and integrated into SLURM jobs.