CPSC 327 Data Structures and Algorithms Spring 2010

CPSC 327 Course Information


Course Description

At the heart of computer science is the development of efficient algorithms for solving problems, and the choice of data structure to implement that algorithm can have a significant impact on the simplicity and efficiency of a program.

This course continues the study of data structures, their applications, and the algorithms associated with them. Topics include abstract data types, dictionaries, graphs, searching, and sorting. The design and analysis of algorithms is also covered, with topics such as efficiency and complexity, NP-completeness, dynamic programming, and amortized analysis.


Course Web Page

http://math.hws.edu/~bridgeman/courses/327/s10/
You are expected to regularly consult the course web page for announcements, assignments, and most handouts.


Text

How to Think About Algorithms
Jeff Edmonds
Cambridge University Press, 2008
ISBN 978-0-521-61410-8

Additional material will be handed out or posted on the course webpage.


Prerequisites

CPSC 225 is required.
Programming in this course will be done in Java, and students are expected to be able to write a program from pseudocode or a description of an algorithm. Also expected is familiarity with basic abstract data types (lists, stacks, queues, and binary trees), how those data types are implemented (using arrays, linked lists, and linked structures), and recursion.

CPSC 229 is recommended.
The most relevant topics from 229 are the idea of a formal mathematical proof, and specific proof techniques such as induction and proof by contradiction. Finite automata, context-free grammars, and some notions of computability will also make an appearance. We'll introduce all of these topics as needed, but prior exposure is useful.


Rationale, Aims, and Objectives

Introductory programming courses focus primarily on the syntax and semantics of a programming language - how to tell the computer what to do. More advanced programming courses begin to introduce the idea that there are typically multiple ways to solve a particular problem; a good programmer is concerned about finding a good solution to the problem.

This course addresses how to solve problems in a smart way, by designing efficient solutions and implementing those solutions in efficient and organized ways. This topic is at the very core of computer science - after all, you need an algorithm before you can write a program and, as you will see in this course, there can be enormous differences in the practicality of different algorithms.

The course has three main goals (and several subgoals):

  • developing the skill of analyzing a problem and creating an efficient and provably correct solution to that problem, which includes:
    • gaining a working knowledge of algorithmic efficiency, to inform the algorithm- and program-design process by providing a basis for comparing solutions and defining good solutions
    • developing a toolbox of known data structures and algorithmic strategies which can be used to solve many common problems
    • developing the knowledge of how to think about algorithms and data structures, for when a "canned" data structure or algorithm might not be sufficient
  • fostering an appreciation for the practical value of studying algorithms and data structures
  • developing other skills useful in computer science: abstract thinking, comfort with the idea of tradeoffs, and a habit of critical reflection

By the end of the course, the successful student will be able to:

  • identify the relevant characteristics of a problem
  • evaluate the suitability of multiple algorithmic techniques for solving the problem, and select an appropriate technique
  • evaluate the suitability of multiple abstract data types, and select appropriate ADT(s)
  • develop a correct algorithm to solve the problem
  • describe the algorithm at an appropriate level of abstraction
  • convincingly justify the correctness of the algorithm
  • determine the time and space requirements of the algorithm, and express those requirements in big-Oh notation
  • evaluate the suitability of multiple concrete implementations of the selected ADT(s), and make appropriate choice(s)
  • implement the algorithm and any necessary ADT(s)

Course Content Overview

The course material can be arranged into four main units:

Algorithm Analysis: Determining whether one solution is better than another requires some yardstick by which quality can be measured. Time- and space-efficiency will be discussed, and big-Oh notation will be introduced - concepts which will be the fundamental tools used in the rest of the course. Specific objectives include:

  • describe and discuss different ways in which the efficiency of an algorithm can be determined
  • define big-Oh notation
  • discuss the pros and cons of asymptotic measures (big-Oh) and experimental measures of algorithm efficiency
  • define the difference between best case, worst case, and average case running time/space
  • determine the (best, worst, average) running time/space of an iterative or recursive algorithm
  • arrange algorithms from fast to slow based on their asymptotic running times

Algorithm Design: The study of algorithm design will be organized around three main categories of algorithms: iterative algorithms, recursive algorithms, and algorithms for optimization problems. (A fourth category - randomized algorithms - may be mentioned briefly.) In each case, the emphasis will be on strategies for developing algorithms of that type rather than simply presenting long lists of specific algorithms. Specific objectives include:

  • define: iterative algorithm, recursive algorithm, divide-and-conquer, greedy, recursive backtracking, branch-and-bound, dynamic programming
  • give examples of algorithms utilizing each approach
  • for each approach, identify the problem characteristics that make that approach suitable
  • identify the steps for developing iterative and recursive algorithms
  • develop algorithms for a new problem using suitable approach(es)
  • convincingly justify the correctness of the resulting algorithm

ADTs and Data Structures: Appropriate abstract data types (ADTs) are important for simplifying the development and implementation of algorithms, and the performance of an algorithm can be hampered by an inefficient implementation of the ADTs used. This component of the course will focus on developing a toolbox of abstract data types, a set of guidelines for choosing an appropriate ADT, and a toolbox of concrete data structures for efficiently implementing the ADTs studied. Basic ADTs such as lists, stacks, queues, priority queues, and binary trees and their typical implementations will be quickly reviewed. New ADTs such as sets, systems of sets, dictionaries, ordered collections, and graphs will be introduced along with efficient implementations for these data types. Specific objectives include:

  • identify the key properties and typical operations of each ADT
  • give examples of applications of each ADT
  • match the needs of the problem to an appropriate ADT
  • compare and contrast the time- and space-efficiency of different implementations of ADTs and discuss situations in which each implementation is most appropriate
  • define and implement basic graph algorithms (e.g. depth-first search, breadth-first search, topological sort, shortest path), determine their running times, and discuss their applications

Complexity and NP-Completeness: After a semester-long focus on the efficiency of algorithms, some questions might come to mind: Are some problems fundamentally harder to solve than others? Are there problems that can't be solved? We'll delve briefly back into more theoretical topics to examine these questions (and what "hard" actually means), and to look at some of the hardest problems in computer science. Specific objectives include:

  • define P, NP, NP-hard, and NP-complete and give examples of relevant problems
  • describe several important NP-complete problems (e.g. 3SAT, vertex cover, clique, knapsack, TSP)
  • discuss and apply algorithmic strategies (e.g. backtracking, branch-and-bound) for dealing with NP-complete problems

Valid HTML 4.01!