CPSC 343 Database Theory and Practice Fall 2024

CPSC 343 Course Information

On this page:


Course Description and Objectives

Catalogue description:

Computer databases are used to store, organize, and retrieve large collections of information. This course introduces the theory and practice of relational databases and relational database management systems (RDBMS). Topics include data modeling and database design, query languages (SQL), and elements of RDBMS implementation such as file structure and data storage, indexing, and query evaluation. Additional topics may include Web-based access to databases, transaction management, reliability, security, and object-oriented databases.

This course, like many of the other 300- and 400-level computer science courses, explores a particular subdiscipline of computer science. Many aspects of modern society and commerce involve the management of large quantities of data, so databases are a crucial component of computer systems. Furthermore, the study of database systems brings together many areas of computer science (theory, data structures, algorithms, parallel computing, user interfaces, and more); it is particularly nice as an application of theoretical computer science topics.

By the end of the course, the successful student should be able to:

  • analyze and model the data needs of a real-world situation
  • use a relational database management system to effectively store and access data
  • construct a Web application to access a database
  • explain how relational database systems store information and process queries, and apply that knowledge to increasing database performance

The course material can be divided into five sections:

Introduction to Databases: The course starts with an introduction to databases, database management systems, and information management concepts.

Objectives:

  • explain what distinguishes the database approach from programming with data files
  • describe the components of a database system

Data Modeling and Database Design: A database is a structured collection of information; data modeling deals with capturing and expressing that structure. A well-designed database is easier to query, maintain, and extend. This section of the course will address the process of going from specifications to a collection of tables and columns. Specific topics include the entity-relationship (ER) model for data modeling, the relational model, ER-to-relational mapping, and principles of database design.

Objectives:

  • define the concept of data independence and explain its importance
  • explain the difference between different types of data models, such as conceptual data model, physical data model, and representational data model
  • be conversant in the terminology of the ER model
  • be conversant in the terminology of relational databases
  • identify the requirements of real-world situations
  • create ER diagrams to model the entities, relationships, and constraints of real-world situations
  • explain the tradeoffs in modeling in a certain way (e.g. entity vs. attribute) and when to pick what
  • create a relational schema from an ER model
  • identify and employ principles of good design

Query Languages: SQL is the standard language for working with relational database management systems, and this part of the course will introduce SQL for data definition, queries, and data manipulation.

Objectives:

  • express queries in SQL, and explain what queries expressed in SQL do
  • use INSERT, UPDATE, and DELETE statements to manipulate data
  • create relational database schemas in SQL demonstrating appropriate choices of data types and which incorporate key, entity integrity, and referential integrity constraints
  • explain what check constraints, assertions, and triggers are and give examples of when to use them
  • implement triggers

Database-Driven Applications: A database doesn't just sit by itself — application programs access the information stored in the database. This section of the course will introduce technologies for constructing an application that interacts with a database (HTML/PHP/mysqli or Java/JDBC) as well as the privacy, security, and integrity issues that become relevant when the outside world gets involved.

Objectives:

  • explain how a three-tier application works and what technologies address what aspects of the system
  • be conversant in the technologies introduced (i.e. be able to read and understand source code, modify an example to suit a similar purpose, and create basic examples from scratch)
  • explain issues relating to security and how to address them
  • explain what views are and give examples of when to use them
  • explain what stored routines are and give examples of when to use them
  • explain what transactions are and give examples of when to use them
  • use SQL to define and manipulate views, stored routines, and transactions
  • implement a robust, secure application which provides a basic, user-friendly interface and interacts with a database

Database Management Systems and Database Tuning: The last portion of the course will look at the software systems that make databases happen — the database management systems themselves. The goal is to make the student an informed user of database management systems by understanding some of the factors that affect performance and what control the database administrator/user has over these factors. Specific topics include file structure and data storage, indexing, and query evaluation.

Objectives:

  • be conversant in the terminology
  • explain how information is stored in a computer system, and what issues are relevant to performance
  • compare and contrast methods for indexing, and explain when they should be used
  • produce a set of plans for a moderately complex query, and evaluate which is the most efficient
  • use SQL to create and manipulate indexes

Prerequisites

CPSC 225 is required.
Your familiarity with one programming language (Java), with using typical programming constructs to solve problems, and, more generally, with "thinking like a programmer" will help you pick up the new languages (SQL, HTML, PHP) used in this course. In addition, Java may optionally be used for the course project.

CPSC 229 is recommended.
Relational databases are all about relations, a topic introduced in CPSC 229, and there is a great deal of theory underlying the design of relational databases and query languages for those databases. The necessary material will be introduced in this course, but previous exposure to the theoretical side of computer science is useful.


Class Format

This course will utilize a partially flipped format. What that means is that new material will be introduced through readings and/or videos plus a short self-assessment "quiz", to be reviewed/completed before class. Class meetings will often include a short lecture to address tricky points or add additional material, but will also emphasize application of the material as a class or in small groups.


Assignments and Evaluation

This course introduces many new terms and concepts, including multiple languages and sets of notation. This means that it is important to do the readings, to prepare for class and actively practice the material, and to keep up rather than letting things slide.

Daily Lessons: For most class sessions you will be expected to complete a lesson beforehand involving readings and/or videos and some questions about the material to help you self-assess what concepts you understand, learn what you don't, and identify where you have further questions. As the goal of these "quizzes" is self-assessment, grading is based on completeness (did you answer the questions?) rather than correctness.

Homework: Trying to solve problems and applying concepts for yourself is essential for mastery. In most cases the Canvas discussion boards will be used for homework problems in order to facilitate peer feedback and encourage deeper engagement with the problems. You must post your answer before you can see others' responses. Problems will be graded on a four-point scale, with 1 point for making an attempt to answer the question and 4 points for a fully correct answer. An additional point can be earned by posting a unique (not already posted by someone else), substantive, and correct reply to your post or someone else's e.g. reflecting on differences in approach if the answer is correct or explaining what the problem is and giving a correct solution if the answer is incorrect.

Project: A substantial course project involving the design and development of a database and application utilizing that database will provide a practical application for the course material. Work on the project will begin early in the semester; the timeline can be found on the schedule page.

Exams: There will be five in-class hour exams and a two-hour in-class final exam. Dates are on the schedule page. More about the topics covered and the format of the exams will be announced closer to the exam dates.

Engagement and Participation: Learning isn't a passive activity where you simply read about something or sit back and watch someone else do things; you need to work with the material by thinking about it, trying to apply it, asking questions about it, and so forth and you will get more out of the course and will be more successful if you are engaged in the course. "Engagement" covers aspects of participation in the course other than completing graded assignments and exams: daily lessons, attendance, contributing to class, and asking questions (in daily lessons, in class, at office hours). The minimum expectations for a passing engagement grade are that you complete 50% of the daily lessons and satisfy the attendance policy. For full credit (and your own learning), you should demonstrate greater engagement through at least some of the following: completing more daily lessons, missing fewer classes, participating in class, regularly responding to "do you have questions?" prompts, coming to study sessions and/or office hours, etc.

Grades: Grades in this course will be computed as follows:

  • Engagement and Participation: 10%
  • Homework: 15%
  • Project: 25%
  • Exams: 50% (approx. 11% each for exams 1-4, approx. 5.5% for exam 5, final exam up to 22%)

Time Expectations

You are expected to attend all scheduled class meetings (3 hours per week), and should expect to spend approximately 8 hours per week on average (*) on additional work (daily lessons, homework problems, the project, studying) outside of class. Readings will generally be posted a week or so in advance and you can expect new homework problems for most classes. Dates for the project deadlines and exams can be found on the schedule page.

In addition, you are expected to attend at least 5 hours of additional (outside of class) facilitated learning experiences over the course of the semester. This requirement can be met by attending scheduled study sessions and/or office hours. (5 hours = e.g. one hour of study sessions every three weeks, or a 20-minute visit to office hours every week)

(*) The assigned work is intended to take about this much time, though your experience may vary. If you routinely spend substantially more time, especially if you feel like you are spinning your wheels and not making progress, you should visit the study sessions and/or office hours for help.


Course Materials
Textbook

There is no textbook to purchase — material will be posted on the course webpage or Canvas. In addition, the 5th edition of Fundamentals of Database Systems by Elmasri and Navathe has been placed on reserve in the library for additional reference; relevant sections will be noted as appropriate.

Laptop

If you have a laptop that you are able to bring to class on lecture days, please do so. (It's OK if you don't; in-class activities will be done in small groups so you can team up with someone who does have a laptop.)

Software

All of the software needed for this course will be available on the Linux machines in Rosenberg 009, in the Math/CS department lab in Lansing 310, and remotely through the Linux VDI. It can also be downloaded for free and is available for Windows, Mac, and Linux so you can install it on your own computer if you want.