Papers
A catalogue of papers I have read or am looking to read, with links to their sources. Some may include write-ups.
Legend
- ⏳ Currently Reading
- ✅ Finished Reading
Reading List
Concurrency and Memory
- Why Events Are A Bad Idea (for high-concurrency servers)
- What every systems programmer should know about concurrency
- What Every Programmer Should Know About Memory
Databases & Storage
- What If: Causal Analysis with Graph Databases
- An Overview of Query Optimization in Relational Systems
- Exploiting Cloud Object Storage for High-Performance Analytics
- SQLite: Past, Present, and Future
- OLTP Through the Looking Glass, and What We Found There
- ROSE: Robust Caches for Amazon Product Search
MIT 6.824 (Distributed Systems Spring 2024)
- MapReduce: Simplified Data Processing on Large Clusters
- The Google File System
- In Search of an Understandable Consensus Algorithm (Extended Version)
- ZooKeeper: Wait-free coordination for Internet-scale systems
- Grove: a Separation-Logic Library for Verifying Distributed Systems
- Spanner: Google’s Globally-Distributed Database ⏳
- Chardonnay: Fast and General Datacenter Transactions for On-Disk Databases
- No compromises: distributed transactions with consistency, availability, and performance
- Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service
- Ownership: A Distributed Futures System for Fine-Grained Tasks
- Scaling Memcache at Facebook
- On-demand Container Loading in AWS Lambda
- Boki: Stateful Serverless Computing with Shared Logs
- Secure Untrusted Data Repository (SUNDR)
- Practical Byzantine Fault Tolerance
- Bitcoin: A Peer-to-Peer Electronic Cash System
MIT 6.824 (Distributed Systems Spring 2020)
- The Design of a Practical System for Fault-Tolerant Virtual Machines
- Object Storage on CRAQ: High-throughput chain replication for read-mostly workloads
- Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases
- Frangipani: A Scalable Distributed File System
- Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
- Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
- Blockstack: A New Internet for Decentralized Applications
- Experiences with a Distributed, Scalable, Methodological File System: AnalogicFS
Distributed Systems
- Fast Paxos
- ARC: Analysis of Raft Consensus
- The Part-Time Parliament
- Paxos Made Simple
- Paxos Made Live - An Engineering Perspective
- How to Build a Highly Available System Using Consensus
- Distributed Computing Economics
- Rules of Thumb in Data Engineering
- Fallacies of Distributed Computing
- Impossibility of Distributed Consensus with One Faulty Process
- Unreliable Failure Detectors for Reliable Distributed Systems
- Lamport Clocks
- The Byzantine Generals Problem
- Lazy Replication: Exploiting the Semantics of Distributed Services
- Scalable Agreement - Towards Ordering as a Service
- Scalable Eventually Consistent Counters over Unreliable Networks
Systems Research
- Are Unikernels Ready for Serverless on the Edge?
- Shenango: Achieving High CPU Efficiency for Latency-sensitive Datacenter Workloads
System Design
- Zanzibar: Google’s Consistent, Global Authorization System
- On Designing and Deploying Internet-Scale Services
- Data on the Outside versus Data on the Inside
Programming Language Design
Scratch
Google
- MapReduce
- Chubby Lock Manager
- Google File System
- BigTable
- Data Management for Internet-Scale Single-Sign-On
- Dremel: Interactive Analysis of Web-Scale Datasets
- Large-scale Incremental Processing Using Distributed Transactions and Notifications
- Megastore: Providing Scalable, Highly Available Storage for Interactive Services
- Photon
- Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing