For each of the following scenarios, compute the cost of the execution plan shown in terms of the number of blocks read/written. Show your work - it should be clear how you calculated the result, including what formulas you used. Assume 5 blocks of memory are available.

 For the SELECTs, use an index if there is one, otherwise exploit the file ordering if it is relevant, otherwise use SL. For the JOINs, use JSL if an index can be applied and JNL otherwise.

#### No Pipelining

Without pipelining, results are written out after every operation.

 total σPname='Aquarius' SL SL cost: b (Pname is not known to be UNIQUE) = 100 blocks read selectivity sl = 1/d = 1/1800 selection cardinality s = sl*r = (1/1800)*2000 = 2 rows selected (round up) blocks written = ⌈(rows/bfrPROJECT)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2/(2000/100))⌉ = 1 ⋈Pnumber=Pno JNL (Pno is part of the key for WORKS_ON, but even if there's a primary index for PK (Essn,Pno), Pno being second means this index isn't useful) JNL cost: bR+(bRbS)/(n-2) = 1+(1*1200)/(5-2) = 401 blocks read join selectivity js = min(sB/|S|,sA/|R|) = min((12000/2000)/12000,(1/2)*2) = 6/12000 selection cardinality s = js |R||S| = 6/12000 * 2 * 12000 = 12 rows result blocking factor for the combined rows bfrPROJECT,WORKS_ON = ⌈bfrPROJECTbfrWORKS_ON/(bfrPROJECT+bfrWORKS_ON)⌉ = ⌈(2000/100)*(12000/1200)/((2000/100)+(12000/1200))⌉ = 7 blocks written = ⌈(rows/bfrPROJECT,WORKS_ON)⌉ = ⌈12/7⌉ = 2 σBdate>'1957-12-31' SL (EMPLOYEE is ordered by Ssn and there is no index on Bdate) SL cost: b = 2000 blocks read selectivity sl = 12/45 (condition covers 12 years of the 45-year range; assuming uniform distribution) selection cardinality s = sl*r = (12/45)*10000 = 2600 rows selected blocks written = ⌈(rows/bfrEMPLOYEE)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2600/(10000/2000))⌉ = 520 ⋈Essn=Ssn JSL (the index on Ssn for EMPLOYEE is not useful for the results of the SELECT, but EMPLOYEE is in order by Ssn and thus the results of the SELECT are also in order so binary search can be used) JSL cost: bR+|R|(log2 bS) = 2+12*log2(520) = 111 blocks read join selectivity js = min(sB/|S|,sA/|R|) = min(1/2600,((12000/10000)/12) = 1/2600 selection cardinality s = js |R||S| = (1/2600)*12*2600 = 12 rows result blocking factor for the combined rows bfrPROJECT,WORKS_ON,EMPLOYEE = ⌈bfrPROJECT,WORKS_ONbfrEMPLOYEE/(bfrPROJECT,WORKS_ON+bfrEMPLOYEE)⌉ = ⌈7*(10000/2000)/(7+(10000/2000))⌉ = 3 blocks written = ⌈(rows/bfrPROJECT,WORKS_ON,EMPLOYEE)⌉ = ⌈2/3⌉ = 4 πLname cost: b = 4 blocks read we don't know the size of the Lname column and thus can't compute the blocking factor for the final set of rows written, but we do know that one Lname record doesn't take up more space than an entire EMPLOYEE record so bfrEMPLOYEE can be used as a worst-case estimate blocks written: ⌈(rows/bfrLname)⌉ ≤ ⌈(rows/bfrEMPLOYEE)⌉ = ⌈12/(10000/2000)⌉ = 3 blocks read: 100+401+2000+111+4 = 2616 blocks written: 1+2+520+4+3 = 530 total: 3146 blocks read/written

#### With Pipelining

With pipelining, intermediate results are only written if necessary. PROJECT, SL, and the left side of JNL and JSL can be pipelined.

Differences from the unpipelined case are underlined italics.

 total σPname='Aquarius' SL SL cost: b (Pname is not known to be UNIQUE) = 100 blocks read selectivity sl = 1/d = 1/1800 selection cardinality s = sl*r = (1/1800)*2000 = 2 rows selected (round up) blocks produced = ⌈(rows/bfrPROJECT)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2/(2000/100))⌉ = 1 blocks written = 0 (pipelined) ⋈Pnumber=Pno JNL (Pno is part of the key for WORKS_ON, but even if there's a primary index for PK (Essn,Pno), Pno being second means this index isn't useful) JNL cost: bR+(bRbS)/(n-2) = 0+(1*1200)/(5-2) = 400 blocks read (R does not need to be read, but S is still read bR/(n-2) times) join selectivity js = min(sB/|S|,sA/|R|) = min((12000/2000)/12000,(1/2)*2) = 6/12000 selection cardinality s = js |R||S| = 6/12000 * 2 * 12000 = 12 rows result blocking factor for the combined rows bfrPROJECT,WORKS_ON = ⌈bfrPROJECTbfrWORKS_ON/(bfrPROJECT+bfrWORKS_ON)⌉ = ⌈(2000/100)*(12000/1200)/((2000/100)+(12000/1200))⌉ = 7 blocks produced = ⌈(rows/bfrPROJECT,WORKS_ON)⌉ = ⌈12/7⌉ = 2 blocks written = 0 (pipelined) σBdate>'1957-12-31' SL (EMPLOYEE is ordered by Ssn and there is no index on Bdate) SL cost: b = 2000 blocks read selectivity sl = 12/45 (condition covers 12 years of the 45-year range; assuming uniform distribution) selection cardinality s = sl*r = (12/45)*10000 = 2600 rows selected blocks written = ⌈(rows/bfrEMPLOYEE)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2600/(10000/2000))⌉ = 520 ⋈Essn=Ssn JSL (the index on Ssn for EMPLOYEE is not useful for the results of the SELECT, but EMPLOYEE is in order by Ssn and thus the results of the SELECT are also in order so binary search can be used) JSL cost: bR+|R|(log2 bS) = 0+12*log2(520) = 109 blocks read (R does not need to be read, but S is still used |R| times) join selectivity js = min(sB/|S|,sA/|R|) = min(1/2600,((12000/10000)/12) = 1/2600 selection cardinality s = js |R||S| = (1/2600)*12*2600 = 12 rows result blocking factor for the combined rows bfrPROJECT,WORKS_ON,EMPLOYEE = ⌈bfrPROJECT,WORKS_ONbfrEMPLOYEE/(bfrPROJECT,WORKS_ON+bfrEMPLOYEE)⌉ = ⌈7*(10000/2000)/(7+(10000/2000))⌉ = 3 blocks produced = ⌈(rows/bfrPROJECT,WORKS_ON,EMPLOYEE)⌉ = ⌈2/3⌉ = 4 blocks written = 0 (pipelined) πLname cost: 0 (pipelined) we don't know the size of the Lname column and thus can't compute the blocking factor for the final set of rows written, but we do know that one Lname record doesn't take up more space than an entire EMPLOYEE record so bfrEMPLOYEE can be used as a worst-case estimate blocks written: ⌈(rows/bfrLname)⌉ ≤ ⌈(rows/bfrEMPLOYEE)⌉ = ⌈12/(10000/2000)⌉ = 3 blocks read: 100+400+2000+109+0 = 2609 blocks written: 0+0+520+0+3 = 523 total: 3132 blocks read/written