- List of Knowledge
- 1. Description
- 2. Recommended Basic Course and Books
- 3. SQL & Relational Algebra
- 4. DDL & DML
- 5. Relational Model
- 6. Storage management
- 7. Query Processing
- 8. SQL Parser
- 9. SQL Executor
- 9. SQL Optimization
- 10. Transaction management
- Lock manager
- 11. Network
- 12. Serialization
- 13. Concurrency Control
- 14. Crash Recovery management
- 15. NoSQL
- 16. NewSQL
- 17. Distributed & Paralleled
- 16. Graph Database
- 16. Project Source Code Analysis
- 17. Mini-Project Labs
- 18. AI4DB and DB4AI (frontier tech)
- Database Retrospective
- Courses
- Blogs
- Papers
- Activities
- Talks
- MIT The Missing Semester of Your CS Education
- C/C++
- Go
- Rust
University | ID | Name | Time | Comment |
CMU | 15-213/14-513/15-513 | Intro to Computer Systems(ICS) | Fall 2015 | CS:APP3e |
UC Berkeley | CS61C | Great Ideas in Computer Architecture (Machine Structures) | Spring 2017 | N/A |
UC Berkeley | CS 152/252A | Computer Architecture and Engineering | Spring 2022 | N/A |
N/A | N/A | Crash Course Computer Science | | bilibili link |
- Reference Course Resources
University | ID | Name | Time | Comment |
Stanford | CS110 | Principles of Computer Systems | Winter 2022 | |
MIT | 6.828 | Operating System Engineering | Fall 2018 | |
UC Berkeley | CS162 | Operating Systems and System Programming | Fall 2020 | |
MIT | 6.033 | Computer System Engineering | N/A | Covers four units of technical content: operating systems, networking, distributed systems, and security |
- Reference Course Resources
University | ID | Name | Time | Comment |
Stanford | CS144 | Introduction to Computer Networking | Fall 2021 | |
CMU | CS 15-744 | Computer Networks | Spring 2018 | |
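USTC (University of Science and Technology of China) | N/A | Computer Networks | N/A | undergraduate level |
USTC (University of Science and Technology of China) | N/A | Advanced Computer Networks | N/A | graduate level |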
- Reference Course Resources
University | ID | Name | Time | Comment |
UC Berkeley | CS 61B | Data Structures | Fall 2020 | |
CMU | CS 15-122 | Principles of Imperative Computation | Spring 2023 | The past-courses page returns 404 |
CMU | CS 15-121 | Introduction to Data Structures | Spring 2018 | |
- Reference Course Resources
University | ID | Name | Time | Comment |
UC Berkeley | CS 170 | Efficient Algorithms and Intractable Problems | ||
MIT | 6.006 | Introduction to Algorithms | Spring 2020 | Introduction to Algorithms |
CMU | CS 15-451/651 | Algorithm Design and Analysis | Fall 2021 | |
CMU | CS 15-850 | Advanced Algorithms | Fall 2020 | |
- Reference Course Resources
University | ID | Name | Time | Comment |
CMU | CS 15-445/645 | Database Systems | Fall 2022 | Instructor: Andy Pavlo |
UC Berkeley | CS 186 | Introduction to Database Systems | Spring 2022 | |
Penn State | CMPSC 431W | Database Management Systems | Fall 2015 | |
Stanford | CS346 | Database System Implementation | Spring 2015 | |
University of Waterloo | CS 448/648 | Database Systems Implementation | Winter 2009 | |
CMU | CS 15-721 | Advanced Database Systems | Spring 2020 | Instructor: Andy Pavlo |
CMU | CS 15-799 | Special Topics: Self-Driving Database Management Systems | Spring 2022 | Instructor: Andy Pavlo |
University of Waterloo | CS 856 | Distributed data management fundamentals (architectures, data placement, query optimization); Distributed transaction processing, concurrency control, recovery, interoperability | Fall 2002 | Only these two slide decks are needed, nothing else |
N/A | N/A | Let's Build a Simple Database: Writing a sqlite clone from scratch in C | N/A | Thanks to cstack |
Stanford | CS 345 | Topics in Database Management Systems | Winter 2014 | Paper-reading course, similar to CMU CS 15-721 |
Fundamentals of Database Systems, 7th Edition (solutions), by Ramez Elmasri and Shamkant B. Navathe
Database System Concepts, Seventh Edition, by Silberschatz, Korth and Sudarshan. Chinese edition: 《数据库系统概念》
Database Management Systems, by Ramakrishnan and Gehrke. Chinese edition: 《数据库管理系统原理与设计》
Database Internals: A Deep Dive into How Distributed Data Systems Work, by Alex Petrov. Chinese edition: 《数据库系统内幕》
Designing Data-Intensive Applications, by Martin Kleppmann
Database Design and Implementation, by Edward Sciore
Database System Implementation, by Hector Garcia-Molina, Jeff Ullman, and Jennifer Widom. Chinese edition: 《数据库系统实现》
- This book has been replaced by Database Systems: The Complete Book
Transaction Processing: Concepts and Techniques, by Jim Gray
Principles of Distributed Database Systems, by M. Tamer Özsu and Patrick Valduriez. Chinese edition: 《分布式数据库系统原理》
NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence, by Pramod J. Sadalage and Martin Fowler. Chinese edition: 《NoSQL精髓》
High Performance MySQL, Third Edition, by Baron Schwartz, Peter Zaitsev, Vadim Tkachenko. Chinese edition: 《高性能MySQL》
MySQL High Availability: Tools for Building Robust Data Centers, by Charles Bell, Mats Kindahl, Lars Thalmann. Chinese edition: 《高可用MySQL:构建健壮的数据中心》
CS 15-721 Topics Papers
Stanford CS 345 Winter 2014 Topic Papers
Paper 2017 : How to Build a Non-Volatile Memory Database Management System
Article : Main Memory Database Systems
Thesis 2018 : The Design and Implementation of a Non-Volatile Memory Database Management System
Harvard article : The Design and Implementation of Modern Column-Oriented Database Systems
This section covers distributed databases.
University | ID | Name | Time | Comment |
MIT | 6.824 | MIT Distributed Systems | Spring 2021 | Golang |
N/A | N/A | Distributed Systems | ||
Columbia University | COMS 4113 | Distributed Systems Fundamentals | ||
CMU | CS 15-440 | CMU Distributed Systems | ||
Princeton | COS 418 | Princeton Distributed Systems | Fall 2019 | Golang |
Columbia University | N/A | Advanced Distributed Systems | | Research papers |
MIT | 6.852 | MIT Distributed Algorithms | ||
ETHZ | N/A | Principles of Distributed Computing (lecture collection) | ||
Stanford | CS244b | Distributed Systems | Spring 2020 | |
Washington | CSE 490H | Distributed Systems | Autumn 2010 | |
Reference :
- Paper 1978 : Time, Clocks, and the Ordering of Events in a Distributed System
- Paper 2008 : Interval Tree Clocks: A Logical Clock for Dynamic Systems , ITC2008
- Consensus: Bridging Theory and Practice
- Raft Refloated: Do We Have Consensus?
- The Part-Time Parliament
- Memory Coherence in Shared Virtual Memory Systems
- TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
- Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System
- Resolving File Conflicts in the Ficus File System
- Consistency Analysis in Bloom: a CALM and Collected Approach
- Paper 2009 : Object Storage on CRAQ: High-Throughput Chain Replication for Read-Mostly Workloads , USENIX2009
- Distributed SQLite: a replicated SQLite service powered by RAFT
Data Stores
- Simplified Harp File System
- Google File System 2.0: A Modern Design and Implementation
- Replication in the Harp File System
- Simplified GFS
- Chunky: a distributed GFS-based file store
- Append-only Datastore
- Key-Value Store Using Chain Replication
- ACID Compliant Distributed Key-Value Store
- RAFT based Key-Value Store with Transaction Support
- Paper 2010 : The Declarative Imperative: Experiences and Conjectures in Distributed Logic , SIGMOD2010
- Paper 2004 : MapReduce: Simplified Data Processing on Large Clusters , OSDI2004
- Paper : Availability in Globally Distributed Storage Systems
Fault Tolerance
- Remus: High Availability via Asynchronous Virtual Machine Replication
- Paxos Made Simple , ACM SIGACT News 2001
- In Search of an Understandable Consensus Algorithm
- Harvest, Yield, and Scalable Tolerant Systems
- Paxos Made Moderately Complex
- Distributed Multi-Agent Consensus for Fault Tolerant Decision Making
- Paper 2003 : The Google File System , SOSP2003
- Paper 2006 : Bigtable: A Distributed Storage System for Structured Data , OSDI2006
- Paper 2007 : Dynamo: Amazon’s Highly Available Key-value Store , SOSP2007
- Peer-to-Peer Note Taking App
- factoryOS: A Distributed and Self-Organizing Planning System in a Supply-Chain Context
- Megastore: Providing Scalable, Highly Available Storage for Interactive Services , CIDR 2011
- PNUTS: Yahoo!’s Hosted Data Serving Platform , VLDB 2008
Oracle7 Server Concepts Manual : Distributed Databases
University of Cambridge : Distributed Systems
A Collection of Distributed Systems Learning Resources , Blogger : 木鸟杂记
Distributed System Resources , GitHub blogger : ty4z2008
SQL key points : Book Database System Concepts, Chapters 3, 4, and 5
Relational Algebra key points : Book Database System Concepts, Chapter 6
w3cSchools SQL:
slide : Relational Algebra and SQL
CS 15-445 course : Lecture 03, Lecture 04
Microsoft Research Paper : Faster: A Concurrent Key-Value Store with In-Place Updates, SIGMOD2018
How Lock-free Data Structures Perform in Dynamic Environments: Models and Analyses
CS 15-445 course : Lecture 05
Wikipedia : Database index
Google Cloud docs : Indexes
Wikipedia : B+ tree
CS 186 Spring notes pdf : B+ tree
University of Utah slide : Database Systems Index: B+ Tree
Paper 2004 : Query and Update Efficient B-Tree Based Indexing of Moving Objects
Course : Let's Build a Simple Database
Paper 2010 : Efficient B-tree Based Indexing for Cloud Data Processing , VLDB , National University of Singapore & IBM Watson Research Center
Wikipedia : Hash table
CMU CS15-445 slide : Hash table slide
- slide : Hash table & Extendible Hash & Linear Hash
Blog : An Introduction to B-Tree and Hash Indexes in PostgreSQL
slide : Log Structured Merge Tree, thanks to Pinglei Guo
AlibabaCloud Community Blog : Starting from Zero: Build an LSM Database with 500 Lines of Code
Paper : The Bw-Tree: A B-tree for New Hardware Platforms , Microsoft Research
Lock-free data structures. Inside. Memory management schemes
Blog : A Technical Explanation of the Bw-Tree
Blog : Bw-Tree, the Lock-Free B-Tree Variant Proposed by Microsoft , 木鸟杂记
MySQL 8.0 docs : Chapter 15 The InnoDB Storage Engine
MySQL 8.0 docs : Chapter 16 Alternative Storage Engines
Hybrid storage engine for geospatial data using NoSQL and SQL paradigms
Oracle docs : Chapter 13. Storage Engines
Paper 2021 : Designing a persistent-memory-native storage engine for SQL database systems , IEEE NVMSA 2021
Database SQL Tuning Guide : 3 SQL Processing
Top 10 SQL Query Optimization Tips to Improve Database Performance, by Avishek Singh
SQL Query Optimization: How to Tune Performance of SQL Queries
Oracle docs : 5 Query Optimization
Google Cloud docs : Optimizing Indexes
Optimize index maintenance to improve query performance and reduce resource consumption
How to create and optimize SQL Server indexes for better performance
MySQL Optimize Table: How to Keep Your Database Running Smoothly
MySQL docs : 8.5 Optimizing for InnoDB Tables
Buffer cache: What is it and how does it impact database performance?
EXPLAIN (ANALYZE) needs BUFFERS to improve the Postgres query optimization process
AWS Storage Blog : Storage for I/O-intensive SQL Server using Amazon EBS io2 Block Express
- MySQL docs : How Compression Works for InnoDB Tables
Reference : , thanks to henry liang
Paper : Orca: A Modular Query Optimizer Architecture for Big Data , SIGMOD 2014
Paper : An Overview of Cost-based Optimization of Queries with Aggregates
Paper : Optimizer plan change management: improved stability and performance in Oracle 11g , VLDB
Paper : Optimizing Queries over Partitioned Tables in MPP Systems , SIGMOD
Paper : Optimization of Common Table Expressions in MPP Database Systems , VLDB
OceanBase SQL Tuning Guide , Alibaba Ant Group
PolarDB-X SQL Tuning Guide , Alibaba Cloud
TiDB SQL Performance Tuning , PingCAP
Understanding Database Performance Optimization , thanks to TiDB
MySQL Tuning Notes , GitHub blogger : wardseptember
101 MySQL tuning and optimization tips , Alibaba Cloud
Google Cloud docs : Transactions
Paper 2006 : Cost-based Query Transformation in Oracle , VLDB
Blog : Using Database Transactions
Blog : 11. Database Transactions
Mass Tree Paper : Cache Craftiness for Fast Multicore Key-Value Storage
Palm Tree Paper : PALM: Parallel Architecture-Friendly Latch-Free Modifications to B+ Trees on Many-Core Processors
ARTree Paper : The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases
- Database · Frontier Developments · An Overview of NewSQL Databases
- How Did We Build a Distributed Database?
- How to Write a Distributed Database? , PingCAP CEO Liu Qi (刘奇)
- Understand the Differences Between NewSQL and Distributed SQL
Experiences with a Distributed, Scalable, Methodological File System: AnalogicFS
Monarch: Google’s Planet-Scale In-Memory Time Series Database
Paper 2012 : Spanner: Google's Globally-Distributed Database , OSDI
Wikipedia : Online analytical processing
AWS Blog : What Is Online Analytical Processing?
Azure Blog : Online analytical processing (OLAP)
Wikipedia : Online transaction processing
Azure Blog : Online transaction processing (OLTP)
Tsinghua University Database Group SIGMOD 2022 slide : HTAP database : A Tutorial
A Common Database Approach for OLTP and OLAP Using an In-Memory Column Database
PingCAP : How We Build an HTAP Database That Simplifies Your Data Platform
StoneDB articles
AlibabaCloud Community Blog : 400x Faster HTAP Real-time Data Analysis with PolarDB
Greenplum database Blog : World Class Open Source Distributed HTAP Database Based On PostgreSQL
Paper 2020 : TiDB: A Raft-based HTAP Database, VLDB
Paper 2022 : Kernel-Assisted Copy-on-Write Snapshots for Main-Memory HTAP Databases
Paper : Alibaba Hologres: A Cloud-Native Service for Hybrid Serving/Analytical Processing
AlibabaCloud Community Blog : What Is the Next Stop for Big Data? Hybrid Serving/Analytical Processing (HSAP)
- YouTube Video : Introduction to Graph Databases Series
Graph Databases: New Opportunities for Connected Data , Ian Robinson, Jim Webber & Emil Eifrem
Graph Databases For Beginners , Merkl Sasaki, Joy Chao & Rachel Howard
Graph Databases For Dummies , Dr. Jim Webber & Rik Van Bruggen
The Definitive Guide to Graph Databases for the RDBMS Developer , Michael Hunger, Ryan Boyd & William Lyon
- Paper 2022 : ByteGraph: A High-Performance Distributed Graph Database in ByteDance , VLDB
Meituan Tech Team Blog : Building Meituan's Graph Database Platform and Its Business Practice
Graph Databases for Beginners: Why Graph Technology Is the Future
Graph Databases: How They Work, When to Use Them & the Advantages They Offer
Developing a Small-Scale Graph Database: A Ten Step Learning Guide for Beginners
White paper : Top Ten Use Cases for Graph Database Technology
Database | Database Type | Blog | GitHub |
SQLite | a small relational database management system | SQLite source code analysis | |
LevelDB | fast key-value storage library | LevelDB source code analysis | |
MySQL | |||
PostgreSQL | | PostgreSQL source code walkthrough series | |
Redis | in-memory database that persists on disk | 1. How to read the Redis source code? 2. Redis source code analysis | |
MongoDB | cloud-native document database | MongoDB kernel source code analysis | |
TiDB | cloud-native, distributed, MySQL-compatible database | TiDB source code reading and analysis | |
TiKV | distributed key-value database | TiKV source code analysis series | |
PolarDB-X | cloud-native distributed SQL database | PolarDB-X source code walkthrough | |
OceanBase | distributed relational database | | |
openGauss | open-source relational database management system | openGauss database source code analysis | |
StoneDB | a real-time HTAP database | StoneDB source code walkthrough series | |
RocksDB | a persistent key-value store for flash and RAM storage | Official wiki documentation | |
ToplingDB | an enhanced fork of RocksDB | N/A | |
Greenplum | open-source massively parallel data platform for analytics, machine learning and AI | Inside the Greenplum Distributed Database Kernel (Part 1); Inside the Greenplum Distributed Database Kernel (Part 2) | |
YugabyteDB | high-performance, cloud-native, distributed SQL database that aims to support all PostgreSQL features | To be updated | |
Neo4j | graph database | To be updated | |
JanusGraph | open-source, distributed graph database | To be updated | |
OpenMLDB | an open-source machine learning database | To be updated | |
Taobao MySQL Database Kernel Monthly Report :
DeepDB :
CS 15-445 Labs
- University of Magdeburg slides:
- Tsinghua University Database Group GitHub :
- Tsinghua & MIT Paper : AI Meets Database: AI4DB and DB4AI , SIGMOD2021
- openGauss Blog : openGauss AI4DB and DB4AI
- MIT Database Group :
- Blogger GitHub :
- Databases in 2021: A Year in Review , Andy Pavlo
- Databases in 2022: A Year in Review , Andy Pavlo