CPSC 343 Database Theory and Practice Fall 2017

Homework 4: Access and Integrity

due Wed Nov 1 in class

For this homework you will be working with the following database schema:

  DEPARTMENT(deptname,deptnum,mgrssn,mgrstartdate)
  DEPENDENT(empssn,fname,sex,bdate,relationship)
  DEPT_LOCATIONS(deptnum,deptloc)
  EMPLOYEE(fname,minit,lname,ssn,bdate,address,sex,salary,superssn,deptnum)
  PROJECT(projname,projnum,projloc,deptnum)
  WORKS_ON(empssn,projnum,hours)
  HISTORY(ssn,salary,timestamp)

Key attributes are underlined. The foreign key constraints are the following:

If you want to try out your solutions on an actual database, you can use user_company (where user is your username).

Views

For each of the following, give the CREATE VIEW statement to create the view described. It is OK to define the view using MySQL Workbench and then copy the CREATE VIEW statement that it generates when you click "Apply".

  1. The employee address book: first name, middle initial, last name, address, and the name of the department they work for.

        CREATE VIEW EMPADDRBOOK AS
        SELECT fname,minit,lname,address,deptname
        FROM EMPLOYEE E LEFT JOIN DEPARTMENT D ON E.deptnum=D.deptnum
      

    NATURAL JOIN and JOIN ... ON E.deptnum=D.deptnum will omit any employees who don't have a department listed.

  2. Employees with children: first name, middle initial, last name, SSN for those employees with at least one dependent whose relationship is "Son" or "Daughter".

        CREATE VIEW PARENTS AS
        SELECT DISTINCT E.fname,E.minit,E.lname,E.ssn
        FROM EMPLOYEE E JOIN DEPENDENT D ON E.ssn=D.empssn
        WHERE D.relationship='Son' OR D.relationship='Daughter'    
      

    fname is in both tables, so it is necessary to qualify the reference. DISTINCT is needed for those with more than one child.

    Another option for the query:

        SELECT fname,minit,lname,ssn
        FROM EMPLOYEE
        WHERE ssn IN ( SELECT empssn FROM DEPENDENT WHERE relationship='Son' OR relationship='Daughter' )
      
  3. Managers: first name, middle initial, last name, SSN, birthdate, address, sex, salary, department name, and department number for those employees who manage a department.

        CREATE VIEW MANAGERS AS
        SELECT E.fname,E.minit,E.lname,E.ssn,E.bdate,E.address,E.sex,E.salary,D.deptname,D.deptnum
        FROM EMPLOYEE E JOIN DEPARTMENT D ON E.ssn=D.mgrssn
      

    The deptname and deptnum in the results should belong to the department that the employee manages, not the one they work for. NATURAL JOIN is not correct here, as it matches employees with the department they work in rather than the one they manage.

  4. Department statistics: for each department, the number of employees, the highest and lowest salaries of the people who work in that department, the number of different projects associated with that department, and the total number of hours employees work on projects associated with that department.

    The challenge here is not to omit things where the count is 0, for example departments with no employees or no projects or no one working on the department's projects. LEFT JOIN combines every row in the left table with the matching rows in the right table, or NULL values if there is no matching row in the right table.

        CREATE VIEW DEPTSTATS AS
        SELECT *
        FROM ( SELECT D.deptnum,D.deptname,COUNT(ssn) AS numemployees,MIN(salary) as minsalary,MAX(salary) as maxsalary
               FROM DEPARTMENT D LEFT JOIN EMPLOYEE E ON D.deptnum=E.deptnum
               GROUP BY D.deptnum ) A NATURAL JOIN
             ( SELECT D.deptnum,D.deptname,COUNT(DISTINCT P.projnum) AS numprojs,IFNULL(SUM(hours),0) AS numhours
               FROM DEPARTMENT D LEFT JOIN PROJECT P ON D.deptnum=P.deptnum
                                 LEFT JOIN WORKS_ON W ON P.projnum=W.projnum
               GROUP BY D.deptnum ) B
      

    IFNULL(SUM(hours),0) is a way to get around NULL values appearing for departments with no projects or no one working on its projects.

    We want hours on projects associated with the department, so it is important to join WORKS_ON to DEPARTMENT via the project number rather than through the employee.

    Another strategy to deal with departments with no employees, no projects, and/or no one working on their projects is using UNION to bring in rows with 0 entries for those departments - though this gets very long and involved...

        SELECT *
        FROM ( ( SELECT D.deptnum,D.deptname,COUNT(ssn) AS numemployees,MIN(salary) as minsalary,MAX(salary) as maxsalary
                  FROM DEPARTMENT D NATURAL JOIN EMPLOYEE E 
                  GROUP BY D.deptnum )
                 UNION
                ( SELECT D.deptnum,D.deptname,0 AS numemployees,NULL as minsalary,NULL as maxsalary
                  FROM DEPARTMENT D
                  WHERE NOT EXISTS ( SELECT * FROM EMPLOYEE E WHERE E.deptnum=D.deptnum ) ) ) A
           NATURAL JOIN
             ( ( SELECT D.deptnum,D.deptname,COUNT(DISTINCT P.projnum) AS numprojs
                 FROM DEPARTMENT D NATURAL JOIN PROJECT P 
                 GROUP BY D.deptnum )
                UNION
               ( SELECT D.deptnum,D.deptname,0 AS numprojs
                 FROM DEPARTMENT D 
                 WHERE NOT EXISTS ( SELECT * FROM PROJECT P WHERE D.deptnum=P.deptnum ) ) ) B
           NATURAL JOIN
             ( ( SELECT D.deptnum,D.deptname,SUM(hours) AS numhours
                 FROM DEPARTMENT D NATURAL JOIN PROJECT P 
                 NATURAL JOIN WORKS_ON W 
                 GROUP BY D.deptnum )
                UNION
               ( SELECT D.deptnum,D.deptname,0 AS numhours
                 FROM DEPARTMENT D 
                 WHERE NOT EXISTS ( SELECT *
                                    FROM PROJECT P NATURAL JOIN WORKS_ON W
                                    WHERE P.deptnum=D.deptnum ) ) ) C
      

    A third, much shorter, alternative. Note that care is needed to properly sum the hours for each department - think about how many rows contain information about a particular department or a particular employee or a particular project in the FROM table that is built.

        SELECT D.deptnum,D.deptname,COUNT(distinct ssn),max(salary),min(salary),count(distinct P.projnum),IFNULL(sum(hours)/count(distinct ssn),0)
        FROM DEPARTMENT D LEFT JOIN EMPLOYEE E ON D.deptnum=E.deptnum
                          LEFT JOIN PROJECT P ON D.deptnum=P.deptnum 
                          LEFT JOIN WORKS_ON W ON P.projnum=W.projnum
        GROUP BY deptnum
      

Stored Routines

For each of the following, write a stored procedure or function as appropriate. Give the CREATE PROCEDURE or CREATE FUNCTION statement to create the routine described. It is OK to define the routine using MySQL Workbench and then copy the CREATE PROCEDURE/FUNCTION statement that it generates when you click "Apply".

Assume that the views you defined above are available - make use of them as appropriate.

  1. Given a project number, get the total number of hours employees work on that project.

          CREATE FUNCTION getTotalHours ( proj TINYINT(3) )
          RETURNS DECIMAL(10,1)
          BEGIN
            DECLARE total DECIMAL(10,1) DEFAULT 0;
          
            SELECT SUM(hours)
            INTO total
            FROM WORKS_ON W
            WHERE W.projnum=proj;
          
            RETURN total;
          END
        
  2. Given a salary, find the lowest-paid manager (last name, SSN, salary) whose salary exceeds that threshold. Report NULL if there is no such manager, and any one if there's more than one such manager.

          CREATE PROCEDURE lowestPaidManager( IN threshold DECIMAL(9,2), OUT low_lname VARCHAR(45), OUT low_ssn CHAR(9), OUT low_salary DECIMAL(9,2) )
          BEGIN
            SELECT lname,ssn,salary
            INTO low_lname,low_ssn,low_salary
            FROM MANAGERS
            WHERE salary = ( SELECT MIN(salary)
                             FROM MANAGERS
                             WHERE salary > threshold )
            LIMIT 1;
          END
        

    The LIMIT deals with SELECT ... INTO failing if there are multiple rows.

    The following produce an empty result set rather than a row with NULL entries. LIMIT can be used to deal with ties.

          CREATE PROCEDURE lowestPaidManager( IN threshold DECIMAL(9,2) )
          BEGIN
            SELECT lname,ssn,salary
            FROM MANAGERS
            WHERE salary = ( SELECT MIN(salary)
                             FROM MANAGERS
                             WHERE salary > threshold )
            LIMIT 1;
          END
        
          CREATE PROCEDURE lowestPaidManager( IN threshold DECIMAL(9,2) )
          BEGIN
            SELECT lname,ssn,salary
            FROM MANAGERS
            WHERE salary > threshold
            ORDER BY salary ASC
            LIMIT 1;
          END
        
  3. Given an employee's name (first and last), look up their SSN. Signal an error if there is either no such employee or more than one.

        CREATE FUNCTION getSSN( firstname VARCHAR(30), lastname VARCHAR(45) ) RETURNS char(9)
        BEGIN
          DECLARE result CHAR(9) DEFAULT NULL;
    
          SELECT ssn
          INTO result
          FROM EMPLOYEE 
          WHERE fname=firstname AND lname=lastname;
    
          IF result IS NULL THEN
            SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT='No employee with that name';
          END IF;
    
          RETURN result;
        END
      

    SELECT INTO generates its own error (too many rows) if there's more than one row so there's not a need to do more to handle that case.

    Another alternative, which counts the number of employees with the specified name to make sure there is exactly one:

        CREATE FUNCTION getSSN( firstname VARCHAR(30), lastname VARCHAR(30) ) RETURNS char(9)
        BEGIN
          DECLARE num INT;
          DECLARE result CHAR(9) DEFAULT NULL;
    
          SELECT COUNT(*)
          INTO num
          FROM EMPLOYEE 
          WHERE fname=firstname AND lname=lastname;
    
          IF num = 0 THEN
            SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT='No employee with that name';
          ELSEIF num > 1 THEN
            SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT='Too many employees with that name';
          END IF; 
    
          SELECT ssn
          INTO result
          FROM EMPLOYEE 
          WHERE fname=firstname AND lname=lastname;
    
          RETURN result;
        END
      
  4. Given an employee's SSN, the project number, and a number of hours, update the time the employee spends working on the project. Add a new entry to WORKS_ON if needed.

          CREATE PROCEDURE updateHours ( IN ssn CHAR(9), IN proj INT, IN hrs INT)
          BEGIN
            IF EXISTS ( SELECT * FROM WORKS_ON WHERE empssn=ssn AND projnum=proj )
            THEN 
              UPDATE WORKS_ON
              SET hours=hrs
              WHERE empssn=ssn AND projnum=proj;
            ELSE 
              INSERT INTO WORKS_ON(empssn,projnum,hours) 
              VALUES(ssn,proj,hrs);
            END IF;
          END
        

    Another option is to count the number of entries in WORKS_ON for the specified employee and project, then use that value (0 or 1) to decide whether to update or insert.

  5. Given a department name and number, the manager's SSN and start date, and the department's location, add that department.

        CREATE PROCEDURE AddDept( IN dname VARCHAR(45), IN dnum TINYINT, IN ssn CHAR(9), IN sdate DATE, IN dloc VARCHAR(45) )
        BEGIN
          INSERT INTO DEPARTMENT(deptname,deptnum,mgrssn,mgrstartdate)
          VALUES(dname,dnum,ssn,sdate);
        
          INSERT INTO DEPT_LOCATIONS(deptnum,deptloc)
          VALUES(dnum,dloc);
        END    
    
  6. Given a department number and location, remove that location and set the location for any projects associated with that department and location to NULL. Signal an error if the specified department does not exist in that location.

        CREATE PROCEDURE RemoveLocation ( IN dept TINYINT, IN loc VARCHAR(45) )
        BEGIN
          IF NOT EXISTS ( SELECT *
                          FROM DEPT_LOCATIONS
                          WHERE deptnum=dept AND deptloc=loc ) THEN
            SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT='No such department/location';
          END IF;
        
          DELETE FROM DEPT_LOCATIONS
          WHERE deptnum=dept AND deptloc=loc;
    
          UPDATE PROJECT 
          SET projloc=NULL
          WHERE deptnum=dept AND projloc=loc;
        END    
      

Queries and Statements

Write queries/statements for each of the following. Assume that the views and stored routines you defined above are available - make use of them as appropriate. (Note that you will not get full credit if you do not use a view or stored routine that would simplify the query.)

  1. Find the total number of hours employees work on project 2.

          SELECT getTotalHours(2)
        
  2. Find the projects (project number and project name) where the total number of hours employees work on the project exceeds 40.

          SELECT projnum,projname
          FROM PROJECT
          WHERE getTotalHours(projnum) > 40
        
  3. Find the last name, SSN, and salary of the lowest-paid manager whose salary is above $50,000.

          CALL lowestPaidManager(50000,@name,@ssn,@salary);
          SELECT @name,@ssn,@salary;	
        
  4. Find the pay gap between the highest-paid and the lowest-paid employee in each department.

          SELECT deptnum,deptname,maxsalary-minsalary AS paygap
          FROM DEPTSTATS
        
  5. Find the number of female managers.

          SELECT COUNT(*) AS numfemale
          FROM MANAGERS
          WHERE sex='F'    
        
  6. Find the number of dependents that each manager has.

          SELECT M.ssn,M.fname,M.lname,COUNT(D.empssn) AS numdeps
          FROM MANAGERS M LEFT JOIN DEPENDENT D ON M.ssn=D.empssn
          GROUP BY M.ssn
        

    LEFT JOIN means that managers with no children are included with a count of 0.

  7. Set the number of hours James Borg spends working on project 20 to 30 hours.

          CALL updateHours(getSSN('James','Borg'),20,30)	
        

Triggers

For each of the following, identify how to implement the specified constraint in the database: via column-based settings (by choosing the domain or, default value, or by setting primary key, NOT NULL, or UNIQUE tags), foreign keys, or triggers. Be specific; don't just say "domain", for example, but indicate which column and what particular domain. For triggers, give the full CREATE TRIGGER statement to accomplish the task. (It is OK to define the trigger using MySQL Workbench and the copy the CREATE TRIGGER statement that is generates when you click "Apply".)

Modifications that violate the constraint should be disallowed unless otherwise specified. If there is more than one way to enforce a constraint, choose the best or most appropriate solution. (Only use triggers if there isn't another alternative.)

  1. A department must have a manager.

          DEPARTMENT.mgrssn NOT NULL      
        
  2. There cannot be two projects with the same name in one department.

          UNIQUE constraint for (projname,deptnum) in PROJECT      
        

    Note that a UNIQUE constraint just for PROJECT.projname says that no two projects can have the same name - but the desired constraint allows duplicate project names as long as they are in different departments.

  3. The location for a project must be one of the locations associated with the project's department.

          foreign key constraint from (projloc,deptnum) in PROJECT to (deptloc,deptnum) in DEPT_LOCATIONS
        

    Note that PROJECT.projloc → DEPT_LOCATIONS.deptloc is not sufficient - that requires project locations be department locations, but not that a project location needs to be one of its department's location.

          CREATE TRIGGER PROJECT_location1 BEFORE INSERT ON PROJECT FOR EACH ROW
          BEGIN
            IF NEW.projloc NOT IN ( SELECT deptloc
                                    FROM DEPT_LOCATIONS DL 
                                    WHERE DL.deptnum=NEW.deptnum ) 
            THEN
              SIGNAL SQLSTATE '45000'
              SET MESSAGE_TEXT='project location must match department location';
            END IF;
          END
    
          CREATE TRIGGER PROJECT_location2 BEFORE UPDATE ON PROJECT FOR EACH ROW
          BEGIN
            IF NEW.projloc NOT IN ( SELECT deptloc
                                    FROM DEPT_LOCATIONS DL 
                                    WHERE DL.deptnum=NEW.deptnum ) 
            THEN
              SIGNAL SQLSTATE '45000'
              SET MESSAGE_TEXT='project location must match department location';
            END IF;
          END
        
  4. Employees can only work on projects belonging to the department they work for.

          CREATE TRIGGER WORKS_ON_project BEFORE INSERT ON WORKS_ON FOR EACH ROW
          BEGIN
            IF ( SELECT deptnum FROM EMPLOYEE WHERE ssn=NEW.empssn ) <>
    	  ALL ( SELECT deptnum FROM PROJECT WHERE projnum=NEW.projnum ) 
            THEN SIGNAL SQLSTATE '45000'
      	  SET MESSAGE_TEXT='employee can only work in projects associated the department they work for';
            END IF;
          END
        
  5. Hours worked on a project can only be reported to the tenth of an hour.

          DECIMAL(4,1) for the type of WORKS_ON.hours      
        
  6. Employees cannot work more than 40 hours total on their projects.

    This involves checking that the insertion or update does not involve the employee going over 40 hours - which means considering both rows already in WORKS_ON as well as the row affected by the insert/update.

    Note that when a new row is inserted, there is not already an entry for that employee and project in WORKS_ON, so the sum in the following is summing for other projects only.

          CREATE TRIGGER WORKS_ON_totalhours1 BEFORE INSERT ON WORKS_ON FOR EACH ROW
          BEGIN
            IF ( SELECT SUM(hours)
                 FROM WORKS_ON W
                 WHERE W.empssn=NEW.empssn )+NEW.hours > 40
            THEN SIGNAL SQLSTATE '45000'
              SET MESSAGE_TEXT='employee can only work 40 hours max';
            END IF;
          END
        

    For an update, only the new value for the affected row should be considered in the sum.

          CREATE TRIGGER WORKS_ON_totalhours2 BEFORE UPDATE ON WORKS_ON FOR EACH ROW
          BEGIN
            IF ( SELECT SUM(hours)
                 FROM WORKS_ON W
                 WHERE W.empssn=NEW.empssn AND W.projnum <> NEW.projnum )+NEW.hours > 40
            THEN SIGNAL SQLSTATE '45000'
      	  SET MESSAGE_TEXT='employee can only work 40 hours max';
            END IF;
          END
        

Write triggers for the following:

  1. Managers do not have supervisors and must also work for the department they manage. If a person becomes a manager, their supervisor should be set to NULL and their deptnum set to the department they manage.

          CREATE TRIGGER DEPARTMENT_manager1 BEFORE UPDATE ON DEPARTMENT FOR EACH ROW
          BEGIN
            UPDATE EMPLOYEE
            SET superssn=NULL, deptnum=NEW.deptnum
            WHERE ssn=NEW.mgrssn;
          END
        
          CREATE TRIGGER DEPARTMENT_manager2 BEFORE INSERT ON DEPARTMENT FOR EACH ROW
          BEGIN
            UPDATE EMPLOYEE
            SET superssn=NULL, deptnum=NEW.deptnum
            WHERE ssn=NEW.mgrssn;
          END
        
  2. When an employee's salary is changed (or the employee leaves the company and is deleted from EMPLOYEE), the previous salary and a timestamp indicating when the change was made should be recorded in the HISTORY table. (You may find it useful to look up date and time functions in MySQL.)

          CREATE TRIGGER EMPLOYEE_history1 BEFORE UPDATE ON EMPLOYEE FOR EACH ROW
          BEGIN
            INSERT INTO HISTORY(ssn,salary,timestamp)
            VALUES (OLD.ssn,OLD.salary,NOW());
          END
        
          CREATE TRIGGER EMPLOYEE_history2 BEFORE DELETE ON EMPLOYEE FOR EACH ROW
          BEGIN
            INSERT INTO HISTORY(ssn,salary,timestamp)
            VALUES (OLD.ssn,OLD.salary,NOW());
          END
        

Valid HTML 4.01!