Streaming Large-Scale Electron Microscopy Data to a Supercomputing Facility
July 03, 2024 · Declared Dead · arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Samuel S. Welborn, Chris Harris, Stephanie M. Ribet, Georgios Varnavides, Colin Ophus, Bjoern Enders, Peter Ercius
arXiv ID
2407.03215
Category
physics.ins-det
Cross-listed
cond-mat.mtrl-sci, cs.DC, cs.NI
Citations
0
Venue
arXiv.org
Last Checked
3 months ago
Abstract
Data management is a critical component of modern experimental workflows. As data generation rates increase, transferring data from acquisition servers to processing servers via conventional file-based methods is becoming increasingly impractical. The 4D Camera at the National Center for Electron Microscopy (NCEM) generates data at a nominal rate of 480 Gbit/s (87,000 frames/s) producing a 700 GB dataset in fifteen seconds. To address the challenges associated with storing and processing such quantities of data, we developed a streaming workflow that utilizes a high-speed network to connect the 4D Camera's data acquisition (DAQ) system to supercomputing nodes at the National Energy Research Scientific Computing Center (NERSC), bypassing intermediate file storage entirely. In this work, we demonstrate the effectiveness of our streaming pipeline in a production setting through an hour-long experiment that generated over 10 TB of raw data, yielding high-quality datasets suitable for advanced analyses. Additionally, we compare the efficacy of this streaming workflow against the conventional file-transfer workflow by conducting a post-mortem analysis on historical data from experiments performed by real users. Our findings show that the streaming workflow significantly improves data turnaround time, enables real-time decision-making, and minimizes the potential for human error by eliminating manual user interactions.
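The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not from the paper; it assumes decimal units (1 GB = 1e9 bytes, 1 Gbit = 1e9 bits), and the function names are hypothetical. It derives the nominal byte rate, the implied data volume per frame, and the average rate implied by a 700 GB dataset acquired in fifteen seconds.

```python
# Back-of-the-envelope check of the numbers quoted in the abstract.
# Assumption (not stated in the abstract): decimal units, 1 GB = 1e9 bytes.

NOMINAL_RATE_GBIT_S = 480   # nominal detector rate, from the abstract
FRAME_RATE_HZ = 87_000      # frames per second, from the abstract
DATASET_GB = 700            # dataset size, from the abstract
ACQUISITION_S = 15          # acquisition time, from the abstract


def nominal_rate_gbyte_s(rate_gbit_s: float) -> float:
    """Convert a rate in Gbit/s to GB/s (8 bits per byte)."""
    return rate_gbit_s / 8


def bytes_per_frame(rate_gbit_s: float, frame_rate_hz: float) -> float:
    """Data volume per frame implied by the nominal rate and frame rate."""
    return rate_gbit_s * 1e9 / 8 / frame_rate_hz


def average_rate_gbit_s(dataset_gb: float, seconds: float) -> float:
    """Average rate implied by a dataset size and an acquisition time."""
    return dataset_gb * 8 / seconds


if __name__ == "__main__":
    print(f"nominal rate:  {nominal_rate_gbyte_s(NOMINAL_RATE_GBIT_S):.1f} GB/s")
    print(f"per frame:     {bytes_per_frame(NOMINAL_RATE_GBIT_S, FRAME_RATE_HZ):.0f} bytes")
    print(f"average rate:  {average_rate_gbit_s(DATASET_GB, ACQUISITION_S):.1f} Gbit/s")
```

Under these assumptions the nominal rate works out to 60 GB/s, which would give 900 GB in fifteen seconds, while the quoted 700 GB corresponds to an average of roughly 373 Gbit/s. The abstract does not explain the difference; the 480 Gbit/s figure is described only as nominal.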
Similar Papers
In the same crypt: physics.ins-det
Calorimetry with Deep Learning: Particle Simulation and Reconstruction for Collider Physics · R.I.P. · 👻 Ghosted
Highly curved image sensors: a practical approach for improved optical performance · R.I.P. · 👻 Ghosted
Using LSTM recurrent neural networks for monitoring the LHC superconducting magnets · R.I.P. · 👻 Ghosted
Accelerated Charged Particle Tracking with Graph Neural Networks on FPGAs · R.I.P. · 👻 Ghosted
A Computational Model of a Single-Photon Avalanche Diode Sensor for Transient Imaging · R.I.P. · 👻 Ghosted
Died the same way: 👻 Ghosted
Language Models are Few-Shot Learners · R.I.P. · 👻 Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library · R.I.P. · 👻 Ghosted
XGBoost: A Scalable Tree Boosting System · R.I.P. · 👻 Ghosted