For each of the following scenarios, compute the cost of the execution plan shown in terms of the number of blocks read/written. Show your work - it should be clear how you calculated the result, including what formulas you used. Assume 5 blocks of memory are available.

 For the SELECTs, use an index if there is one, otherwise exploit the file ordering if it is relevant, otherwise use SL. For the CROSS PRODUCTs, use JNL.

#### No Pipelining

Without pipelining, results are written out after every operation.

 total σBdate>'1957-12-31' SL (EMPLOYEE is ordered by Ssn and there is no index on Bdate) SL cost: b = 2000 blocks read selectivity sl = 12/45 (condition covers 12 years of the 45-year range; assuming uniform distribution) selection cardinality s = sl*r = (12/45)*10000 = 2600 rows selected blocks written = ⌈(rows/bfrEMPLOYEE)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2600/(10000/2000))⌉ = 520 σBdate>'1957-12-31' × WORKS_ON JNL JNL cost: bR+(bRbS)/(n-2) = 520+(520*1200)/(5-2) = 208520 blocks read selection cardinality = |R||S| = 2600*12000 = 31200000 rows result blocking factor for the combined rows bfrEMPLOYEE,WORKS_ON = ⌈bfrEMPLOYEEbfrWORKS_ON/(bfrEMPLOYEE+bfrWORKS_ON)⌉ = ⌈(10000/2000)*(12000/1200)/((10000/2000)+(12000/1200))⌉ = 3 blocks written = ⌈(rows/bfrEMPLOYEE,WORKS_ON)⌉ = ⌈31200000/3⌉ = 10400000 σEssn=Ssn SL (condition doesn't involve a constant value, so binary search or an index wouldn't help even if available) SL cost: b (may be many matches) = 10400000 blocks read join selectivity js = min(sB/|S|,sA/|R|) = min((12000/10000)/10000,1/2600) = 1/2600 (this condition is really a join condition rather than a equality or range condition, so we consider the inputs to the preceding CROSS PRODUCT to determine selectivity - Ssn is PK for EMPLOYEE, but Essn is only part of a PK for WORKS_ON and thus not a key by itself) selection cardinality s = js*r = (1/2600)*31200000 = 12000 rows selected blocks written = ⌈(rows/bfrEMPLOYEE,WORKS_ON)⌉ = ⌈12000/3⌉ = 4000 σPname='Aquarius' SL SL cost: b (Pname is not known to be UNIQUE) = 100 blocks read selectivity sl = 1/d = 1/1800 selection cardinality s = sl*r = (1/1800)*2000 = 2 rows selected (round up) blocks written = ⌈(rows/bfrPROJECT)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2/(2000/100))⌉ = 1 σEssn=Ssn × σPname='Aquarius' JNL JNL cost: bR+(bRbS)/(n-2) = 400+(400*1)/(5-2) = 534 blocks read selection cardinality = |R||S| = 12000*2 = 24000 rows result blocking factor for the combined rows bfrEMPLOYEE,WORKS_ON,PROJECT = ⌈bfrEMPLOYEE,WORKS_ONbfrPROJECT/(bfrEMPLOYEE,WORKS_ON+bfrPROJECT)⌉ = ⌈3*(2000/100)/(3+(2000/100))⌉ = 3 blocks written = ⌈(rows/bfrEMPLOYEE,WORKS_ON,PROJECT)⌉ = ⌈24000/3⌉ = 12000 σPnumber=Pno SL (condition doesn't involve a constant value, so binary search or an index wouldn't help even if available) SL cost: b (may be many matches) = 12000 blocks read join selectivity js = min(sB/|S|,sA/|R|) = min(1/2,(12000/2000)/12000) = 1/2000 (this condition is really a join condition rather than a equality or range condition, so we consider the inputs to the preceding CROSS PRODUCT to determine selectivity - Pnumber is PK for PROJECT, but there are 12000/2000 rows in WORKS_ON for each Pno) selection cardinality s = js*r = (1/2000)*24000 = 12 rows selected blocks written = ⌈(rows/bfrEMPLOYEE,WORKS_ON,PROJECT)⌉ = ⌈12/3⌉ = 4 πLname cost: b = 4 blocks read we don't know the size of the Lname column and thus can't compute the blocking factor for the final set of rows written, but we do know that one Lname record doesn't take up more space than an entire EMPLOYEE record so bfrEMPLOYEE can be used as a worst-case estimate blocks written: ⌈(rows/bfrLname)⌉ ≤ ⌈(rows/bfrEMPLOYEE)⌉ = ⌈12/(10000/2000)⌉ = 3 blocks read: 2000+208520+10400000+100+534+12000+4 = 10623158 blocks written: 520+10400000+4000+1+12000+4+3 = 10416528 total: 21039686 blocks read/written

#### With Pipelining

With pipelining, intermediate results are only written if necessary. PROJECT, SL, and the left side of JNL and JSL can be pipelined.

Differences from the unpipelined case are underlined italics.

 total σBdate>'1957-12-31' SL (EMPLOYEE is ordered by Ssn and there is no index on Bdate) SL cost: b = 2000 blocks read selectivity sl = 12/45 (condition covers 12 years of the 45-year range; assuming uniform distribution) selection cardinality s = sl*r = (12/45)*10000 = 2600 rows selected blocks produced = ⌈(rows/bfrEMPLOYEE)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2600/(10000/2000))⌉ = 520 blocks written = 0 (pipelined) σBdate>'1957-12-31' × WORKS_ON JNL JNL cost: bR+(bRbS)/(n-2) = 0+(520*1200)/(5-2) = 208000 blocks read (R does not need to be read, but S is still read bR/(n-2) times) selection cardinality = |R||S| = 2600*12000 = 31200000 rows result blocking factor for the combined rows bfrEMPLOYEE,WORKS_ON = ⌈bfrEMPLOYEEbfrWORKS_ON/(bfrEMPLOYEE+bfrWORKS_ON)⌉ = ⌈(10000/2000)*(12000/1200)/((10000/2000)+(12000/1200))⌉ = 3 blocks produced = ⌈(rows/bfrEMPLOYEE,WORKS_ON)⌉ = ⌈31200000/3⌉ = 10400000 blocks written = 0 (pipelined) σEssn=Ssn SL (pipelined, so can't binary search or use index even if otherwise applicable) SL cost: 0 (pipelined) join selectivity js = min(sB/|S|,sA/|R|) = min((12000/10000)/10000,1/2600) = 1/2600 (this condition is really a join condition rather than a equality or range condition, so we consider the inputs to the preceding CROSS PRODUCT to determine selectivity - Ssn is PK for EMPLOYEE, but Essn is only part of a PK for WORKS_ON and thus not a key by itself) selection cardinality s = js*r = (1/2600)*31200000 = 12000 rows selected blocks produced = ⌈(rows/bfrEMPLOYEE,WORKS_ON,PROJECT)⌉ = ⌈24000/3⌉ = 12000 blocks written = 0 (pipelined) σPname='Aquarius' SL SL cost: b (Pname is not known to be UNIQUE) = 100 blocks read selectivity sl = 1/d = 1/1800 selection cardinality s = sl*r = (1/1800)*2000 = 2 rows selected (round up) blocks produced = ⌈(rows/bfrPROJECT)⌉ = ⌈(rows/(r/b))⌉ = ⌈(2/(2000/100))⌉ = 1 blocks written: 1 (can't pipline right side of JNL) σEssn=Ssn × σPname='Aquarius' JNL JNL cost: bR+(bRbS)/(n-2) = 0+(400*1)/(5-2) = 534 blocks read (R does not need to be read, but S is still read bR/(n-2) times) selection cardinality = |R||S| = 12000*2 = 24000 rows result blocking factor for the combined rows bfrEMPLOYEE,WORKS_ON,PROJECT = ⌈bfrEMPLOYEE,WORKS_ONbfrPROJECT/(bfrEMPLOYEE,WORKS_ON+bfrPROJECT)⌉ = ⌈3*(2000/100)/(3+(2000/100))⌉ = 3 blocks produced = ⌈(rows/bfrEMPLOYEE,WORKS_ON,PROJECT)⌉ = ⌈24000/3⌉ = 12000 blocks written = 0 (pipelined) σPnumber=Pno SL (pipelined, so can't binary search or use index even if otherwise applicable) SL cost: 0 (pipelined) join selectivity js = min(sB/|S|,sA/|R|) = min(1/2,(12000/2000)/12000) = 1/2000 (this condition is really a join condition rather than a equality or range condition, so we consider the inputs to the preceding CROSS PRODUCT to determine selectivity - Pnumber is PK for PROJECT, but there are 12000/2000 rows in WORKS_ON for each Pno) selection cardinality s = js*r = (1/2000)*24000 = 12 rows selected blocks produced = ⌈(rows/bfrEMPLOYEE,WORKS_ON,PROJECT)⌉ = ⌈12/3⌉ = 4 blocks written = 0 (pipelined) πLname cost: 0 (pipelined) we don't know the size of the Lname column and thus can't compute the blocking factor for the final set of rows written, but we do know that one Lname record doesn't take up more space than an entire EMPLOYEE record so bfrEMPLOYEE can be used as a worst-case estimate blocks written: ⌈(rows/bfrLname)⌉ ≤ ⌈(rows/bfrEMPLOYEE)⌉ = ⌈12/(10000/2000)⌉ = 3 blocks written blocks read: 2000+208000+0+100+534+0+0 = 210634 blocks written: 0+0+0+1+0+0+3 = 4 total: 210638 blocks read/written (pipelining can make a big difference!)