CPSC 343 Database Theory and Practice Fall 2017

Practice Problems - File Organization

For the following problems, assume that a disk has block size B = 1024 bytes.

1. A file has r = 20,000 BOOK records of fixed length. Each record has the following fields: Book_id (4 bytes), Title (45 bytes), and Publisher_name (45 bytes). Book_id is the primary key; assume the values are 1-20000. Titles are not required to be unique but there are few duplicates; assume there are 19950 unique values. There are 50 unique publisher names.

1. Calculate the record size R in bytes.

2. Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned organization.

3. Assume an unordered file organization. Calculate the number of blocks accessed (read and written) in the following cases:

1. Searching for book ID 1000.
3. Searching for books with ID >= 15000.
4. Assume an ordered file organization with book ID as the ordering field. Calculate the number of blocks accessed (read and written) in the following cases:

1. Searching for book ID 1000.
2. Searching for books with the title "Foreigner".
4. Searching for books with ID >= 15000.
5. Assume an ordered file organization with publisher as the ordering field. Calculate the number of blocks accessed (read and written) in the following cases:

1. Searching for book ID 1000.
2. Searching for books with the title "Foreigner".
4. Searching for books with ID >= 15000.
6. Assume a hashed file organization with book ID as the hash field. Calculate the number of blocks accessed (read and written) in the following cases:

1. Searching for book ID 1000.
2. Searching for books with the title "Foreigner".
4. Searching for books with ID >= 15000.
5. Searching for books with ID >= 19950.
(solution)
2. Use the same file properties as the previous question. A block pointer is 6 bytes long.

1. Assume there is a primary index on Book_id. Calculate the number of block accesses needed to search for and retrieve a record given its Book_id.

2. Assume there is a clustering index on Publisher_name. Calculate the number of block accesses needed to search for and retrieve records for a particular Publisher_name.

3. Assume there is a clustering index on Title. Calculate the number of block accesses needed to search for and retrieve records with titles starting with A-D.

4. Assume there is a secondary index on Publisher_name. Calculate the number of block accesses needed to search for and retrieve records for a particular Publisher_name.

5. For each of the preceding problems, how many levels would be needed if a multilevel index was used?

(solution)

 last updated: --Mon Nov 20 00:52:53 EST 2017-- page owned by: bridgeman@hws.edu