Previous Page Next Page Up One Level

Tutorial S2

Contents

Joining tables

Often the information required is contained in more than one table. You can specify more than one table in the FROM clause. For example the use of the two tables employee and jobhistory in the FROM clause will create a larger table with each row in employee combined with each row in jobhistory. Each of these new rows will have all the columns from the employee table and the jobhistory table. If there are 3 rows in employee and 5 rows in jobhistory this will create a new table of 3 times 5 (i.e. 15) rows. This is known as a Cartesian product.

A Cartesian product will contain many rows of no practical interest, such as rows containing the employee and jobhistory details for two different employees. It is therefore necessary to have some restriction on the join. Here a likely requirement is that the empno field in the employee table matches the empno of the jobhistory table. Each row in the resulting table will then contain employee and jobhistory data for only one employee.

List the employees number, surname, and current job title.

SELECT employee.empno, surname, position
FROM employee, jobhistory
WHERE enddate IS NULL
AND employee.empno = jobhistory.empno

A more modern syntax of this would be.

SELECT employee.empno, surname, position
FROM employee JOIN jobhistory ON (employee.empno = jobhistory.empno)
WHERE enddate IS NULL

Notice that the fields which are not unique must be explicitly referred to by use of the table name and a fullstop followed by the fieldname. For instance empno occurs in both the employee table and the jobhistory table and so it must be explicitly referred to. This also means that it must be explicitly referred to in the SELECT clause even though the values are the same for employee.empno and jobhistory.empno.

You can use more than two tables in the FROM clause. There is no theoretical limit, however there will be some limit placed on you by the system itself. If you have N tables in the FROM clause then you will normally need (N – 1) join conditions.

Aliases or correlation names.

Although table prefixes prevent ambiguity in a query, they can be tedious to enter. You can define temporary labels in the FROM clause and use them elsewhere in the query. Such temporary labels are sometimes known as temporary table aliases.

List the employee number, surname and department of each employee.

SELECT e.empno, surname, dname 
FROM employee e JOIN department d ON (e.depno = d.depno)

Notice that the table employee is given an alias e, and department an alias d. This can then be used during the query. It is also possible to use the actual name. Notice also that the join is on the two tables employee and department.

Equi-joins and non-equi joins

When you join the table department to the table employee, the join condition specifies the relationship between them. Such joins are known as equi-joins because the comparison operator is the equals operator. Any join that does not use this operator is known as a non-equi join.

GROUP BY

Conceptually GROUP BY rearranges the table designated in the FROM clause into partitions or groups, such that within any one group all rows have the same value for the GROUP BY field(s).

List the departments and their total current salary bill

SELECT depno, sum(salary) "Salary"
FROM employee JOIN jobhistory ON (employee.empno=jobhistory.empno)
WHERE enddate IS NULL
GROUP BY depno

In the above example, table employee is grouped so that one group contains all the rows for department 1, another contains all the rows for department 2, and so on.

The sum(salary) "Salary" renames the column Salary.

Each expression in the SELECT clause must be single-valued per group (i.e. it can be one of the GROUP BY fields or an arithmetic expression involving such a field), or a constant, or a function such as SUM that operates on all values of a given field within a group and reduces those values to a single value.

The purpose of such grouping is generally to allow some set function to be computed for each group.

HAVING

The GROUP BY clause may be qualified by a HAVING clause. The HAVING clause restricts the groups which are selected in the output. The groups that do not meet the search condition are eliminated.

Each expression in the HAVING clause must also be single-valued per group.

List the number of people who have been on each course numbered 1 to 6

SELECT courseno, COUNT (*)
FROM empcourse
GROUP BY courseno
HAVING courseno BETWEEN 1 AND 6

Execution of queries

From a conceptual standpoint, the subselect is evaluated in the following manner: First, the Cartesian product of all tables identified in the FROM clause is formed. From that product, rows not satisfying the search condition specified in the WHERE clause are eliminated. Next, the remaining rows are grouped in accordance with the specifications of the GROUP BY clause. Groups not satisfying the search condition in the HAVING clause are then eliminated. Then, the expressions specified in the SELECT clause are evaluated. Finally, the ORDER BY clause, if present, is evaluated and, if the key word DISTINCT has been specified, any duplicate rows are eliminated from the result table.

Joining Tables to Themselves - Self joins

Sometimes a table must be joined to itself. In this case, any references to fieldnames become ambiguous and aliases must be used to uniquely identify required fields.

List the surname and forename of all the employees who work in the same department as employee number 16. In this case two “versions” of the employee table must be used, one for employees other than 16, and one for employee 16 :-

SELECT x.surname, x.forenames
FROM employee x, employee y
WHERE x.depno = y.depno
AND y.empno = 16
AND x.empno != 16

You need to have one version of the table employee so that you can find the department number of employee 16. In the above example this table is called y. You then look through another version of the table employee, here calledx, to find people who are in the same department. Finally, you do not want employee number 16 to be displayed, so you should eliminate this case by adding  x.empno != 16.

Notice you have to make sure that you do not get the different tables confused and display y.surname and y.forenames since this will just display the surname and forename of employee 16 as many times as there are employees in their department.  If there is any risk of confusion you are advised to avoid cryptic labels and use meaningful labels , for example replace "x", "y" with "others", "emp16"; :-

SELECT others.surname, others.forenames
FROM employee others, employee emp16
WHERE others.depno = emp16.depno
AND emp16.empno = 16
AND others.empno != 16

Previous Page Next Page Up One Level