CPSC 343 Database Theory and Practice Fall 2017

Practice Problems - Query Processing

Make use of the following DB schema and statistics where relevant. Key attributes are underlined.

  DEPARTMENT(Dname,Dnumber,Mgr_ssn,Mgr_startdate)
  DEPT_LOCATIONS(Dnumber,Dlocation)
  EMPLOYEE(Fname,Minit,Lname,Ssn,Bdate,Dno,Salary)
  PROJECT(Pname,Pnumber,Plocation,Dnum)
  WORKS_ON(Essn,Pno,Hours)
tablecolumn# distinct values (d) low valuehigh value
PROJECTPlocation2001500
Pnumber200010009999
Pname1800  
Dnum50150
DEPARTMENTDnumber50150
Mgr_ssn40  
EMPLOYEESsn10000  
Dno50150
Salary 1000099999
Bdate 1945-01-011989-12-31
WORKS_ONEssn10000  
Pno200010009999
table# records (r)# blocks (b)
PROJECT2000100
DEPARTMENT505
EMPLOYEE100002000
WORKS_ON120001200
index# levels (x)# level 1 blocks (b1)
PROJ_PLOC24
EMP_SSN250
EMP_SAL250

Also assume that all files are ordered by primary key.


  1. For each of the following, draw the query tree corresponding most directly to the query as stated.

    1. SELECT E.Fname,E.Lname,E.Salary
      FROM   EMPLOYEE E, WORKS_ON W, PROJECT P
      WHERE  E.Ssn=W.Essn AND W.Pno=P.Pnumber AND P.Pname='ProductX';
      
    2. SELECT E.Fname,E.Lname,E.Address
      FROM   EMPLOYEE E, DEPARTMENT D
      WHERE  D.Dname='Research' AND D.Dnumber=E.Dno;
      
    3. SELECT E.Fname,E.Lname,D.Dname,L.Dlocation
      FROM EMPLOYEE E JOIN ( DEPARTMENT D NATURAL JOIN DEPT_LOCATIONS L ) 
           ON E.Dno=D.Dnumber
      WHERE E.Bdate < '1960-01-01' AND E.Salary > 50000;
      

    (solution A, solution B, solution C)

  2. Apply heuristic optimization to each of the query trees from the previous problem. Use the database statistics given above; make reasonable assumptions (and state what they are) if the necessary information is lacking. Show your work - draw the query tree after each step of the heuristic optimization algorithm, label which step of the algorithm resulted in each tree, and provide a rationale for your actions in step 3.

    (solution A, solution B, solution C)

  3. Find an execution plan for the following query tree:

    query tree
    1. Identify which of SL, SB, SH, SP, SC, and/or SS are applicable for the SELECT and which of JNL, JSL, and/or JSM are applicable for the JOIN in this query tree. Also provide a rationale for your answers.

    2. Where there's a choice of algorithm, what choices do you expect to produce the best execution plan? Provide a rationale for your choices.

    3. Can anything be pipelined? Explain.

    (solution)

  4. For each of the following scenarios, compute the cost of the execution plan shown in terms of the number of blocks read/written. Show your work - it should be clear how you calculated the result, including what formulas you used. Assume 5 blocks of memory are available.

    • For the SELECTs, use an index if there is one, otherwise exploit the file ordering if it is relevant, otherwise use SL.
    • For the CROSS PRODUCTs, use JNL.
    1. Assume no pipelining.

    2. Assume pipelining is used wherever possible.

    (solution)

  5. For each of the following scenarios, compute the cost of the execution plan shown in terms of the number of blocks read/written. Show your work - it should be clear how you calculated the result, including what formulas you used. Assume 5 blocks of memory are available.

    • For the SELECTs, use an index if there is one, otherwise exploit the file ordering if it is relevant, otherwise use SL.
    • For the JOINs, use JSL if an index can be applied and JNL otherwise.
    1. Assume no pipelining.

    2. Assume pipelining is used wherever possible.

    (solution)


Valid HTML 4.01!