Course Info

This course is a combination of lectures and programming assignments in which we will study the internals of modern database management systems. It will cover the core concepts and fundamentals of the components that are used in both high-performance transaction processing systems (OLTP) and large-scale analytical systems (OLAP). The class will stress both the efficiency and the correctness of implementations of these ideas. The course is appropriate for advanced undergraduate and graduate students interested in systems programming.

  • Instructor: Joy Arulraj
  • Time: Tue/Thu 3:30 – 4:45 PM
  • Location: Instructional Center 115
  • Discussion platform: Piazza
  • Grading platform: Gradescope
  • TAs: Gaurav Tarlok Kakkar

Syllabus

Format

The course is a combination of lectures and programming assignments in which we will study the internals of modern database management systems.

Prerequisites

Students are expected to have completed the following undergraduate-level computer systems courses:

  • Data Structures and Algorithms (CS 1332) (strict)
  • Computer Systems and Networks (CS 2200) (strict)
  • Design of Operating Systems (CS 3210) (recommended)
  • Introduction to Database Systems (CS 4400) (recommended)
  • Database System Implementation -- Part I (CS 4420/6422) (recommended)

and to be comfortable with programming in C or C++.

Academic Honesty

All students should adhere to the Georgia Tech Honor Code. University policies will be followed strictly in this course. Please pay particular attention to the policies on academic misconduct.

Educational Objectives

This is the second part of a two-part series of courses on the design and implementation of database management systems. This course has a heavy emphasis on programming assignments. There will be two exams. Upon successful completion of this course, the student should be able to:

  • Understand and apply state-of-the-art implementation techniques for database management systems following modern coding practices.
  • Identify trade-offs among database systems techniques and contrast alternatives for both on-line transaction processing and on-line analytical workloads.
  • Develop and justify design decisions in the context of a high-performance database system.
  • Implement and evaluate complex, scalable components of database systems, with emphasis on providing experimental evidence for design decisions.

This course will be mostly self-contained. We will cover the following topics in the second part of the series:

  • Query Compilation, Vectorization
  • Concurrency Control
  • Logging and Recovery Methods
  • Query Optimization
  • Leveraging Modern Hardware

Grading Scheme

The final grade for the course will tentatively be based on the following weights:

  • 35% Programming Assignments
  • 15% Exercise Sheets
  • 15% Mid-term Exam
  • 15% Final Exam
  • 10% Class Participation

Programming Assignments

The programming assignments are geared towards exploring the topics covered in the lectures. We will be using an end-to-end toy relational database management system for this course. This system has been developed for educational purposes and should not be used in production.

Exercise Sheets

The exercise sheets consist of a set of subjective problems on the topics covered in the lectures. They are representative of the questions that will appear in the exams.

Exams

There will be two in-person exams as specified in the schedule.

The exams will consist of subjective questions based on the topics discussed in class.

Textbook

Credits

This website is based on a design by Andy Pavlo.