For each of the following scenarios, compute the cost of the execution plan shown in terms of the number of blocks read/written. Show your work - it should be clear how you calculated the result, including what formulas you used. Assume 5 blocks of memory are available.

  • For the SELECTs, use an index if there is one, otherwise exploit the file ordering if it is relevant, otherwise use SL.
  • For the JOINs, use JSL if an index can be applied and JNL otherwise.

No Pipelining

Without pipelining, results are written out after every operation.

σPname='Aquarius' SL SL cost: b (Pname is not known to be UNIQUE) = 100 blocks read
selectivity sl = 1/d = 1/1800
selection cardinality s = sl*r = (1/1800)*2000 = 2 rows selected (round up)
blocks written = ⌈(rows/bfrPROJECT)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2/(2000/100))⌉ = 1
Pnumber=Pno JNL (Pno is part of the key for WORKS_ON, but even if there's a primary index for PK (Essn,Pno), Pno being second means this index isn't useful) JNL cost: bR+(bRbS)/(n-2) = 1+(1*1200)/(5-2) = 401 blocks read
join selectivity js = min(sB/|S|,sA/|R|) = min((12000/2000)/12000,(1/2)*2) = 6/12000
selection cardinality s = js |R||S| = 6/12000 * 2 * 12000 = 12 rows result
blocking factor for the combined rows bfrPROJECT,WORKS_ON = ⌈bfrPROJECTbfrWORKS_ON/(bfrPROJECT+bfrWORKS_ON)⌉ = ⌈(2000/100)*(12000/1200)/((2000/100)+(12000/1200))⌉ = 7
blocks written = ⌈(rows/bfrPROJECT,WORKS_ON)⌉ = ⌈12/7⌉ = 2
σBdate>'1957-12-31' SL (EMPLOYEE is ordered by Ssn and there is no index on Bdate) SL cost: b = 2000 blocks read
selectivity sl = 12/45 (condition covers 12 years of the 45-year range; assuming uniform distribution)
selection cardinality s = sl*r = (12/45)*10000 = 2600 rows selected
blocks written = ⌈(rows/bfrEMPLOYEE)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2600/(10000/2000))⌉ = 520
Essn=Ssn JSL (the index on Ssn for EMPLOYEE is not useful for the results of the SELECT, but EMPLOYEE is in order by Ssn and thus the results of the SELECT are also in order so binary search can be used) JSL cost: bR+|R|(log2 bS) = 2+12*log2(520) = 111 blocks read
join selectivity js = min(sB/|S|,sA/|R|) = min(1/2600,((12000/10000)/12) = 1/2600
selection cardinality s = js |R||S| = (1/2600)*12*2600 = 12 rows result
blocking factor for the combined rows bfrPROJECT,WORKS_ON,EMPLOYEE = ⌈bfrPROJECT,WORKS_ONbfrEMPLOYEE/(bfrPROJECT,WORKS_ON+bfrEMPLOYEE)⌉ = ⌈7*(10000/2000)/(7+(10000/2000))⌉ = 3
blocks written = ⌈(rows/bfrPROJECT,WORKS_ON,EMPLOYEE)⌉ = ⌈2/3⌉ = 4
πLname cost: b = 4 blocks read
we don't know the size of the Lname column and thus can't compute the blocking factor for the final set of rows written, but we do know that one Lname record doesn't take up more space than an entire EMPLOYEE record so bfrEMPLOYEE can be used as a worst-case estimate
blocks written: ⌈(rows/bfrLname)⌉ ≤ ⌈(rows/bfrEMPLOYEE)⌉ = ⌈12/(10000/2000)⌉ = 3
total blocks read: 100+401+2000+111+4 = 2616
blocks written: 1+2+520+4+3 = 530
total: 3146 blocks read/written

With Pipelining

With pipelining, intermediate results are only written if necessary. PROJECT, SL, and the left side of JNL and JSL can be pipelined.

Differences from the unpipelined case are underlined italics.

σPname='Aquarius' SL SL cost: b (Pname is not known to be UNIQUE) = 100 blocks read
selectivity sl = 1/d = 1/1800
selection cardinality s = sl*r = (1/1800)*2000 = 2 rows selected (round up)
blocks produced = ⌈(rows/bfrPROJECT)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2/(2000/100))⌉ = 1
blocks written = 0 (pipelined)
Pnumber=Pno JNL (Pno is part of the key for WORKS_ON, but even if there's a primary index for PK (Essn,Pno), Pno being second means this index isn't useful) JNL cost: bR+(bRbS)/(n-2) = 0+(1*1200)/(5-2) = 400 blocks read (R does not need to be read, but S is still read bR/(n-2) times)
join selectivity js = min(sB/|S|,sA/|R|) = min((12000/2000)/12000,(1/2)*2) = 6/12000
selection cardinality s = js |R||S| = 6/12000 * 2 * 12000 = 12 rows result
blocking factor for the combined rows bfrPROJECT,WORKS_ON = ⌈bfrPROJECTbfrWORKS_ON/(bfrPROJECT+bfrWORKS_ON)⌉ = ⌈(2000/100)*(12000/1200)/((2000/100)+(12000/1200))⌉ = 7
blocks produced = ⌈(rows/bfrPROJECT,WORKS_ON)⌉ = ⌈12/7⌉ = 2 blocks written = 0 (pipelined)
σBdate>'1957-12-31' SL (EMPLOYEE is ordered by Ssn and there is no index on Bdate) SL cost: b = 2000 blocks read
selectivity sl = 12/45 (condition covers 12 years of the 45-year range; assuming uniform distribution)
selection cardinality s = sl*r = (12/45)*10000 = 2600 rows selected
blocks written = ⌈(rows/bfrEMPLOYEE)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2600/(10000/2000))⌉ = 520
Essn=Ssn JSL (the index on Ssn for EMPLOYEE is not useful for the results of the SELECT, but EMPLOYEE is in order by Ssn and thus the results of the SELECT are also in order so binary search can be used) JSL cost: bR+|R|(log2 bS) = 0+12*log2(520) = 109 blocks read (R does not need to be read, but S is still used |R| times)
join selectivity js = min(sB/|S|,sA/|R|) = min(1/2600,((12000/10000)/12) = 1/2600
selection cardinality s = js |R||S| = (1/2600)*12*2600 = 12 rows result
blocking factor for the combined rows bfrPROJECT,WORKS_ON,EMPLOYEE = ⌈bfrPROJECT,WORKS_ONbfrEMPLOYEE/(bfrPROJECT,WORKS_ON+bfrEMPLOYEE)⌉ = ⌈7*(10000/2000)/(7+(10000/2000))⌉ = 3
blocks produced = ⌈(rows/bfrPROJECT,WORKS_ON,EMPLOYEE)⌉ = ⌈2/3⌉ = 4
blocks written = 0 (pipelined)
πLname cost: 0 (pipelined)
we don't know the size of the Lname column and thus can't compute the blocking factor for the final set of rows written, but we do know that one Lname record doesn't take up more space than an entire EMPLOYEE record so bfrEMPLOYEE can be used as a worst-case estimate
blocks written: ⌈(rows/bfrLname)⌉ ≤ ⌈(rows/bfrEMPLOYEE)⌉ = ⌈12/(10000/2000)⌉ = 3
total blocks read: 100+400+2000+109+0 = 2609
blocks written: 0+0+520+0+3 = 523
total: 3132 blocks read/written