CPSC 343 Database Theory and Practice Fall 2020

CPSC 343 Course Information

On this page:


Course Description and Objectives

Catalogue description:

Computer databases are used to store, organize, and retrieve large collections of information. This course introduces the theory and practice of relational databases and relational database management systems (RDBMS). Topics include data modeling and database design, query languages (SQL), and elements of RDBMS implementation such as file structure and data storage, indexing, and query evaluation. Additional topics may include Web-based access to databases, transaction management, reliability, security, and object-oriented databases.

This course, like many of the other 300- and 400-level computer science courses, explores a particular subdiscipline of computer science. Many aspects of modern society and commerce involve the management of large quantities of data, so databases are a crucial component of computer systems. Furthermore, the study of database systems brings together many areas of computer science (theory, data structures, algorithms, parallel computing, user interfaces, and more); it is particularly nice as an application of theoretical computer science topics.

By the end of the course, the successful student should be able to:

  • analyze and model the data needs of a real-world situation
  • use a relational database management system to effectively store and access data
  • construct a Web application to access a database
  • explain how relational database systems store information and process queries, and apply that knowledge to increasing database performance

The course material can be divided into five sections:

Introduction to Databases: The course starts with an introduction to databases, database management systems, and information management concepts.

Objectives:

  • explain what distinguishes the database approach from programming with data files
  • describe the components of a database system

Data Modeling and Database Design: A database is a structured collection of information; data modeling deals with capturing and expressing that structure. A well-designed database is easier to query, maintain, and extend. This section of the course will address the process of going from specifications to a collection of tables and columns. Specific topics include the entity-relationship (ER) model for data modeling, the relational model, ER-to-relational mapping, and principles of database design.

Objectives:

  • define the concept of data independence and explain its importance
  • explain the difference between different types of data models, such as conceptual data model, physical data model, and representational data model
  • be conversant in the terminology of the ER model
  • be conversant in the terminology of relational databases
  • identify the requirements of real-world situations
  • create ER diagrams to model the entities, relationships, and constraints of real-world situations
  • explain the tradeoffs in modeling in a certain way (e.g. entity vs. attribute) and when to pick what
  • create a relational schema from an ER model
  • identify and employ principles of good design

Query Languages: SQL is the standard language for working with relational database management systems, and this part of the course will introduce SQL for data definition, queries, and data manipulation.

Objectives:

  • express queries in SQL, and explain what queries expressed in SQL do
  • use INSERT, UPDATE, and DELETE statements to manipulate data
  • create relational database schemas in SQL demonstrating appropriate choices of data types and which incorporate key, entity integrity, and referential integrity constraints
  • explain what check constraints, assertions, and triggers are and give examples of when to use them
  • implement triggers

Database-Driven Applications: A database doesn't just sit by itself - application programs access the information stored in the database. This section of the course will introduce technologies for constructing an application that interacts with a database (HTML/PHP/mysqli or Java/JDBC) as well as the privacy, security, and integrity issues that become relevant when the outside world gets involved.

Objectives:

  • explain how a three-tier application works and what technologies address what aspects of the system
  • be conversant in the technologies introduced (i.e. be able to read and understand source code, modify an example to suit a similar purpose, and create basic examples from scratch)
  • explain issues relating to security and how to address them
  • explain what views are and give examples of when to use them
  • explain what stored routines are and give examples of when to use them
  • explain what transactions are and give examples of when to use them
  • use SQL to define and manipulate views, stored routines, and transactions
  • implement a robust, secure application which provides a basic, user-friendly interface and interacts with a database

Database Management Systems and Database Tuning: The last portion of the course will look at the software systems that make databases happen - the database management systems themselves. The goal is to make the student an informed user of database management systems by understanding some of the factors that affect performance and what control the database administrator/user has over these factors. Specific topics include file structure and data storage, indexing, and query evaluation.

Objectives:

  • be conversant in the terminology
  • explain how information is stored in a computer system, and what issues are relevant to performance
  • compare and contrast methods for indexing, and explain when they should be used
  • produce a set of plans for a moderately complex query, and evaluate which is the most efficient
  • use SQL to create and manipulate indexes

Course Format

Because social distancing makes it difficult for groups to work together in class, this course will be fully remote this semester. New material will be introduced before class through readings and/or videos available on the web. Class meetings will emphasize application of the material, as a class or in small groups, with short lectures to go over questions or resolve tricky points. Interaction will take place online using Zoom and other tools, and class sessions will take place via Zoom.


Prerequisites

CPSC 225 is required.
Your familiarity with one programming language (Java), with using typical programming constructs to solve problems, and, more generally, with "thinking like a programmer" will help you pick up the new languages (SQL, HTML, PHP) used in this course. In addition, Java may optionally be used for the course project.

CPSC 229 is recommended.
Relational databases are all about relations, a topic introduced in CPSC 229, and there is a great deal of theory underlying the design of relational databases and query languages for those databases. The necessary material will be introduced in this course, but previous exposure to the theoretical side of computer science is useful.


Required Course Materials

This course is fully remote. Course material in the form of readings, videos, and other activities will be made available over the web. (There is no textbook to purchase.) Class meetings will be held via Zoom. Certain software will be needed during class and to complete homework and other assignments.

In order to participate fully, you will need:

  • a laptop or desktop computer (not a tablet or phone) with a webcam, microphone, and sound (the built-in microphone and sound capabilities are fine, or you can use an external microphone and/or headphones)
  • a Zoom client (app or application), a web browser, and some additional software (see below) installed on your computer
  • Internet access, with sufficient bandwidth and reliability to be able to stream audio and video
  • a quiet space where you can participate in class meetings without bothering others (headphones can be helpful, but you also need to be able to speak with others)

If you do not have access to any of these resources, or anticipate that you may lose access should you be quarantined or need to leave campus or if labs or other campus facilities are closed, please speak to me as soon as possible in order to determine what alternatives might be possible. Also keep in mind that the semester will conclude remotely, so you will need to make sure that you will still have access to the necessary resources after Thanksgiving.

All of the software needed for this course is free (nothing to purchase) and can be installed on your own computer. (Instructions will be provided.) It will also be available on the Linux machines in Lansing 310 and possibly some computers in Rosenberg 009 for in-person and (in the case of Lansing 310) remote access, however, given the need for interaction during class and the potential for interruptions in physical and/or remote access to the lab (potentially with little or no notice), you are strongly encouraged to install what you need on your own computer. Also keep in mind that the semester will conclude remotely, so you will need to make sure that you will have access to the necessary resources if you leave campus at Thanksgiving.


Time Expectations

You should expect steady effort spread throughout the week, with daily lessons before each class period and practice problems after. After the first weeks of the semester there will also be a project to work on - you will have at least two weeks for most phases in order to provide some flexibility, but you should expect to spend several chunks of time to complete each phase satisfactorily. The implementation phase will be the most time demanding. Midterm exams will require a 90-minute window of time within a 2-3 day period.

Dates for all of the major assignments are on the schedule page to aid in planning.