Streaming Democratized: Ease Across the Latency Spectrum with Delayed View Semantics and Snowflake Dynamic Tables
April 14, 2025 Β· Declared Dead Β· π SIGMOD Conference Companion
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Daniel Sotolongo, Daniel Mills, Tyler Akidau, Anirudh Santhiar, Attila-PΓ©ter TΓ³th, Ilaria Battiston, Ankur Sharma, Botong Huang, Boyuan Zhang, Dzmitry Pauliukevich, Enrico Sartorello, Igor Belianski, Ivan Kalev, Lawrence Benson, Leon Papke, Ling Geng, Matt Uhlar, Nikhil Shah, Niklas Semmler, Olivia Zhou, Saras Nowak, Sasha Lionheart, Till Merker, Vlad Lifliand, Wendy Grus, Yi Huang, Yiwen Zhu
arXiv ID
2504.10438
Category
cs.DB: Databases
Citations
0
Venue
SIGMOD Conference Companion
Last Checked
3 months ago
Abstract
Streaming data pipelines remain challenging and expensive to build and maintain, despite significant advancements in stronger consistency, event time semantics, and SQL support over the last decade. Persistent obstacles continue to hinder usability, such as the need for manual incrementalization, semantic discrepancies across SQL implementations, and the lack of enterprise-grade operational features. While the rise of incremental view maintenance (IVM) as a way to integrate streaming with databases has been a huge step forward, transaction isolation in the presence of IVM remains underspecified, leaving the maintenance of application-level invariants as a painful exercise for the user. Meanwhile, most streaming systems optimize for latencies of 100 ms to 3 sec, whereas many practical use cases are well-served by latencies ranging from seconds to tens of minutes. We present delayed view semantics (DVS), a conceptual foundation that bridges the semantic gap between streaming and databases, and introduce Dynamic Tables, Snowflake's declarative streaming transformation primitive designed to democratize analytical stream processing. DVS formalizes the intuition that stream processing is primarily a technique to eagerly compute derived results asynchronously, while also addressing the need to reason about the resulting system end to end. Dynamic Tables then offer two key advantages: ease of use through DVS, enterprise-grade features, and simplicity; as well as scalable cost efficiency via IVM with an architecture designed for diverse latency requirements. We first develop extensions to transaction isolation that permit the preservation of invariants in streaming applications. We then detail the implementation challenges of Dynamic Tables and our experience operating it at scale. Finally, we share insights into user adoption and discuss our vision for the future of stream processing.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Databases
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
The Case for Learned Index Structures
R.I.P.
π»
Ghosted
Untangling Blockchain: A Data Processing View of Blockchain Systems
R.I.P.
π»
Ghosted
Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades
R.I.P.
π»
Ghosted
BLOCKBENCH: A Framework for Analyzing Private Blockchains
R.I.P.
π»
Ghosted
Data Synthesis based on Generative Adversarial Networks
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Language Models are Few-Shot Learners
R.I.P.
π»
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
π»
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
π»
Ghosted