"We Have No Idea How Models will Behave in Production until Production": How Engineers Operationalize Machine Learning
March 25, 2024 ยท Declared Dead ยท ๐ Proc. ACM Hum. Comput. Interact.
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Shreya Shankar, Rolando Garcia, Joseph M Hellerstein, Aditya G Parameswaran
arXiv ID
2403.16795
Category
cs.HC: Human-Computer Interaction
Citations
23
Venue
Proc. ACM Hum. Comput. Interact.
Last Checked
3 months ago
Abstract
Organizations rely on machine learning engineers (MLEs) to deploy models and maintain ML pipelines in production. Due to models' extensive reliance on fresh data, the operationalization of machine learning, or MLOps, requires MLEs to have proficiency in data science and engineering. When considered holistically, the job seems staggering -- how do MLEs do MLOps, and what are their unaddressed challenges? To address these questions, we conducted semi-structured ethnographic interviews with 18 MLEs working on various applications, including chatbots, autonomous vehicles, and finance. We find that MLEs engage in a workflow of (i) data preparation, (ii) experimentation, (iii) evaluation throughout a multi-staged deployment, and (iv) continual monitoring and response. Throughout this workflow, MLEs collaborate extensively with data scientists, product stakeholders, and one another, supplementing routine verbal exchanges with communication tools ranging from Slack to organization-wide ticketing and reporting systems. We introduce the 3Vs of MLOps: velocity, visibility, and versioning -- three virtues of successful ML deployments that MLEs learn to balance and grow as they mature. Finally, we discuss design implications and opportunities for future work.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Human-Computer Interaction
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
Improving fairness in machine learning systems: What do industry practitioners need?
R.I.P.
๐ป
Ghosted
Identifying Stable Patterns over Time for Emotion Recognition from EEG
R.I.P.
๐ป
Ghosted
Questioning the AI: Informing Design Practices for Explainable AI User Experiences
R.I.P.
๐ป
Ghosted
Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges and Opportunities
R.I.P.
๐ป
Ghosted
Educational data mining and learning analytics: An updated survey
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted