• home
  • forum
  • my
  • kt
  • download
  • Simple Queries in SQL Server

    Author: 2007-09-03 10:16:50 From:

    This chapter and the next describe the most important Transact-SQL statement - SELECT. In this chapter you will learn how to use the SELECT statement to perform simple queries. Every clause in this statement is described, and numerous examples using our sample database are given to demonstrate the practical use of each clause. The second part of this chapter introduces aggregate functions and the UNION operator.

    In this chapter, you will learn about:

    • SELECT Statement: A Basic Form
    • WHERE Clause
    • Simple Subqueries
    • GROUP BY Clause
    • Aggregate Functions
    • HAVING Clause
    • ORDER BY Clause
    • SELECT Statement and IDENTITY Property
    • Set Operators
    • CASE Expressions
    • COMPUTE Clause
    • Temporary Tables
    • Computed Columns

    SELECT Statement: A Basic Form
    The Transact-SQL language has one basic statement for retrieving information from a database: the SELECT statement. With this statement, it is possible to query informa­tion from one or more tables of a database (or even from multiple databases). The result of a SELECT statement is another table, which is also known as a result set.

    The simplest form of the SELECT statement contains both a SELECT and a FROM clause. This form of the SELECT statement has the following syntax:

    SELECT [ ALL |DISTINCT] column_list
       FROM tab_1 [tab_alias1] [{,tab_2 [tab_alias2]}...]

    tab_1, tab_2 ,. are names of tables from which information is retrieved. tab_alias1, tab_alias2 ,. provide aliases for the corresponding tables. An alias is another name for the corresponding table that can be used as a shorthand way of referring to the table or as a way to refer to two logical instances of the same physical table. Don't worry; this will become clearer as examples are presented.

    Note
    This chapter demonstrates the retrieval of information from a single table in a database. The next chapter describes the use of a join operation, and therefore the query of more than one table in a database.

    column_list contains one or more of the following specifications:

    • The asterisk symbol (*), which specifies all columns of the named tables in the FROM clause (or from a single table when qualified, as in: tab_2.*)
    • The explicit specification of column names to be retrieved
    • The specification column_name [as] column_heading, which is a way to replace the name of a column or to assign a new name to an expression
    • An expression
    • A system or an aggregate function
    Note
    In addition to the specifications listed above, there are other options that will be partly presented later in this chapter and in the next chapter.

    A SELECT statement can retrieve either certain columns or rows from a table. The first operation is called projection (or SELECT list), and the second one is called selection (or select operation). The combination of both operations is also possible in a SELECT statement.

    Note
    Before you start to execute queries in this chapter, please re-create the entire sample database.

    EXAMPLE 5.1
    Get full details of all departments.

    USE sample
       SELECT * from department

    The result is

    dept_nodept_namelocation
    d1ResearchDallas
    d2AccountingSeattle
    d3MarketingDallas

    The SELECT statement in Example 5.1 retrieves all rows and all columns from the department table. The symbol * is shorthand for a list of all column names in the table named in the FROM clause, in the order in which those columns are defined in the CREATE TABLE statement for that table. The column names serve as column headings of the resulting output.

    EXAMPLE 5.2
    Get full details of all departments.

    USE sample
       SELECT dept_no, dept_name, location FROM department

    The result is

    dept_nodept_namelocation
    d1ResearchDallas
    d2AccountingSeattle
    d3MarketingDallas

    The SELECT statement in Example 5.2 is equivalent to the SELECT statement in Example 5.1. Generally, the FROM clause contains several options concerning locks. All these options, together with the notion of the transaction, will be explained in detail in Chapter 14

    WHERE Clause
    The simplest form of the SELECT statement, described in the previous section, is not very useful for queries. In practice, there are always several more clauses in a SELECT statement than in the statements shown in Examples 5.1 and 5.2. The following is the syntax of the SELECT statement with (almost) all possible clauses:

    SELECT select_list
    [INTO new_table
    FROM table
    [WHERE search_condition]
    [GROUP BY group_by_expression]
    [HAVING search_condition]
    [ORDER BY order_expression [ASC | DESC] ]

    Note
    The clauses in the SELECT statement must be written in syntactical order¡ªfor example, the GROUP BY clause must come after the WHERE clause and before the HAVING clause.

    In this section we will start with the definition of the first clause (after the FROM clause): WHERE. Often, it is necessary to define one or more conditions that limit the selected rows. The WHERE clause specifies a Boolean expression (an expression that returns a value of TRUE or FALSE) that is tested for each row to be returned (potentially). If the expression is true, then the row is returned; if it is false, it is discarded (see Example 5.3).

    EXAMPLE 5.3
    Get the names and numbers of all departments located in Dallas.

    USE sample
    SELECT dept_name, dept_no
       FROM department
       WHERE location = 'Dallas'

    The result is

    dept_namedept_no
    Researchd1
    Marketingd3

    In addition to the equal sign, the WHERE clause can contain other comparison operators, including the following:

    <> (or !=)not equal
    <less than
    >greater than
    >=greater than or equal to
    <=less than or equal to
    !>not greeater than
    !<not less than

    Example 5.4 shows the use of a comparison operator in the WHERE clause.

    EXAMPLE 5.4
    Get the last and first names for all employees with employee numbers > 15000.

    USE sample
    SELECT emp_lname, emp_fname
       FROM employee
       WHERE emp_no >= 15000

    The result is

    emp_lnameemp_fname
    SmithMatthew
    BarrimoreJohn
    JamesJames
    MoserSybill

    An expression can also be a part of the condition in the WHERE clause. Example 5.5 shows this.

    EXAMPLE 5.5
    Get the project names for all projects with budgets > 60000 £. The current rate of exchange is 0.51 £ per $1.

    USE sample SELECT project_name
       FROM project
       WHERE budget*0.51 > 60000

    The result is

    project_name
    Apollo
    Mercury

    Comparisons of strings (that is, values of data types CHAR, VARCHAR, NCHAR, NVARCHAR, TEXT, or NTEXT) are executed in accordance with the collating sequence in effect (this is the "sort order" specified when SQL Server was installed). If two strings are compared using ASCII code (or any other code), each of the corre­sponding (first, second, third, and so on) characters will be compared. One character is smaller than the other if it appears in the code table before the other one. Two strings of different lengths will be compared, after the shorter one is padded at the right with blanks, so the length of both strings will be equal. Numbers compare algebraically. Values of data type DATETIME compare in chronological order.

    Note
    Columns with TEXT and IMAGE data types cannot be used in the WHERE clause. (The only exceptions are with the LIKE and IS NULL operators.)

    Boolean Operators
    WHERE clause conditions can either be simple or contain multiple conditions. Multiple conditions can be built using the Boolean operators AND, OR, and NOT (see Example 5.6). The behavior of these operators has been described in Chapter 3 using truth tables.

    EXAMPLE 5.6
    Get employee and project numbers of all clerks that work on project p2.

    USE sample
    SELECT emp_no, project_no
       FROM works_on
       WHERE project_no = 'p2'
       AND job = 'Clerk'

    The result is

    emp_noproject_no
    25348p2
    28559p2

    If two conditions are connected by the AND operator, rows are retrieved for which both conditions are true.

    EXAMPLE 5.7
    Get employee numbers for all employees that work either for project p1 or project p2 (or both).

    USE sample
    SELECT project_no, emp_no
       FROM works_on
       WHERE project_no = 'p1'
       OR project_no = 'p2'

    The result is

    project_noemp_no
    p110102
    p225348
    p218316
    p229346
    p19031
    p128559
    p228559
    p129346

    If two conditions are connected by the OR operator, all rows of a table are retrieved in which either the first or the second condition (or both) are true.

    The result of Example 5.7 contains some duplicate values of column emp_no. If this redundant information is to be eliminated, the DISTINCT option should be used, as shown here:

    USE sample
    SELECT DISTINCT emp_no
       FROM works_on
       WHERE project_no = 'p1'
       OR project_no = 'p2'

    In this case, the result

    emp_no
    9031
    10102
    18316
    25348
    28559
    29346

    Note
    Columns with TEXT and IMAGE data types cannot be retrieved with the DISTINCT option.

    Note that the DISTINCT clause can be used only once in a SELECT list, and it must precede all column names in that list. Therefore, Example 5.8 is wrong.

    EXAMPLE 5.8 (EXAMPLE OF AN ILLEGAL STATEMENT)

    USE sample
    SELECT emp_fname, DISTINCT emp_no
       FROM works_on
       WHERE project_no = 'pi'
       OR project_no = 'p2'

    The result is

    Server: Msg 156, Level 15, State 1, Line 1
    Incorrect syntax near the keyword 'DISTINCT'.

    Note
    When there is more than one column in the SELECTlist, the DISTINCT clause displays all rows where the combination of columns is distinct.

    The WHERE clause may include any number of the same or different Boolean opera­tions. You should be aware that the three Boolean operations have different priorities for evaluation: the NOT operation has the highest priority, AND is evaluated next, and the OR operation has the lowest priority. If you do not pay attention to these different priori­ties for Boolean operations, you will get unexpected results, as Example 5.9 shows.

    EXAMPLE 5.9

    USE sample
    SELECT *
       FROM employee
       WHERE emp_no = 25348 AND emp_lname = 'Smith'
       OR emp_fname = 'Matthew' AND dept_no = 'd1'

    SELECT * FROM employee
       WHERE ((emp_no = 25348 AND emp_lname = 'Smith') OR emp_fname ='Matthew') AND dept_no = 'd1'

    The result is

    emp_noemp_fnameemp_lnamedept_no
    25348MatthewSmithd3
        
    emp_noemp_fnameemp_lnamedept_no

    As the results of Example 5.9 show, the two SELECT statements display two different results. In the first SELECT statement, the system evaluates both AND operators first (from the left to the right), and then the OR operator is evaluated. In the second SELECT statement, the use of parentheses changes the operation execution, with all expressions within parentheses being executed first, in sequence from left to right. As you can see, the first statement returned one row, while the second one returned zero rows.

    The existence of several Boolean operations in a WHERE clause complicates the corresponding SELECT statement and makes it error prone. In such cases, the use of parentheses is highly recommended, even if they are not necessary. The readability of such SELECT statements will be greatly improved, and possible errors can be avoided.

    Here is the first SELECT statement from Example 5.9 (modified using the recom­mended form):

    USE sample
    SELECT *
       FROM employee
       WHERE (emp_no = 25348 AND emp_lname = 'Smith')
       OR (emp_fname = 'Matthew' AND dept_no = 'd1')

    The third Boolean operator, NOT, changes the logical value of the corresponding condition. The truth table for NOT in Chapter 3 shows that the negation of the true value is false and vice versa; the negation of the NULL value is also NULL.

    EXAMPLE 5.10
    Get employee numbers and first names of all employees who do not belong to the department d2.

    USE sample
    SELECT emp_no, emp_lname
       FROM employee
       WHERE NOT dept_no = 'd2'

    The result is

    emp_noemp_lname
    25348Smith
    10102Jones
    18316Barrimore
    28559Moser

    In this case, the NOT operator can be replaced by the comparison operator <> (not equal). Example 5.11 is thus equivalent to Example 5.10.

    Note
    In this book we use the operator < > (instead of !=)to remain consistent with the SQL standard.

    EXAMPLE5.11

    USE sample
    SELECT emp_no, emp_lname
       FROM employee
       WHERE dept_no <> 'd2'

    IN and BETWEEN Operators
    An IN operator allows the specification of two or more expressions to be used for a query search. The result of the condition returns true if the value of the corresponding column equals one of the expressions specified by the IN predicate.

    EXAMPLE 5.12
    Get all the columns for employees whose employee numbers equal either 29346 or 28559 or 25348.

    USE sample
    SELECT *
       FROM employee
       WHERE emp_no IN (29346, 28559, 25348)

    The result is

    emp_noemp_fnameemp_lnamedept_no
    25348MatthewSmithd3
    29346JamesJamesd2
    28559SybillMoserd1

    An IN operator is equivalent to a series of conditions, connected with one or more OR operators. (The number of OR operators is equal to the number of expressions following the IN operator minus one.) Example 5.13 is equivalent to Example 5.12.

    EXAMPLE 5.13

    USE sample
    SELECT *
       FROM employee
       WHERE emp_no = 29346
       OR emp_no = 28559
       OR emp_no = 25348

    The IN operator can be used together with the Boolean operator NOT (see Example 5.14). In this case, the query retrieves rows that do not include any of the listed values in the corresponding columns.

    EXAMPLE 5.14
    Get all columns for employees whose employee numbers are neither 10102 nor 9031.

    USE sample
    SELECT *
       FROM employee
       WHERE emp_no NOT IN (10102, 9031)

    The result is

    emp_noemp_fnameemp_lnamedept_no
    25348MatthewSmithd3
    18316JohnBarrimored1
    29346JamesJamesd2
    2581ElkeHanseld2
    28559SybillMoserd1

    In contrast to the IN operator, which specifies each individual value, the BETWEEN operator specifies a range, which determines the lower and upper bounds of qualifying values.

    EXAMPLE 5.15
    Get the names and budgets for all projects whose budgets are between $95,000 and $120,000 inclusive.

    USE sample
    SELECT project_name, budget
       FROM project
       WHERE budget BETWEEN 95000 AND 120000

    The BETWEEN operator searches for all values in the range inclusively; that is, qualifying values can be between or equal to the lower and upper boundary values.

    The BETWEEN operator is logically equal to two individual comparisons, which are connected with the Boolean operator AND. Example 5.16 is equivalent to Example 5.15.

    EXAMPLE 5.16

    USE sample
    SELECT project_name, budget
       FROM project
       WHERE budget >= 95000 AND budget <= 120000

    Like the BETWEEN operator, the NOT BETWEEN operator can be used to search for column values that do not fall within the specified range. The BETWEEN operator can also be applied to columns with character and date values (see Example 5.17).

    EXAMPLE 5.17
    Get employee numbers of all analysts who did not enter their project in 1998.

    USE sample
    SELECT emp_no
       FROM works_on
       WHERE job ='Analyst'
       AND enter_date NOT BETWEEN '01.01.1998' AND '12.31.1998'

    The result is

    emp_no
    10102
    2581

    Examples 5.18 and 5.19 show another query that can be written using two different but equivalent ways.

    EXAMPLE 5.18
    Get the names of all projects with budgets that are less than $100,000 and greater than $150,000.

    USE sample
    SELECT project_name
       FROM project
       WHERE budget NOT BETWEEN 100000 AND 150000

    The result is

    project_name
    Gemini
    Mercury

    Using comparison operators, the solution will look different than in Example 5.18.

    EXAMPLE 5.19
    Get the names of all projects with budgets that are less than $100,000 and greater than $150,000.

    USE sample
    SELECT project_name
       FROM project
       WHERE budget < 100000 OR budget > 150000

    Note
    Although the English phrasing suggests the use of the AND operator (see the text in Example 5.19), the logical meaning of the query demands the use of the OR operator, because if you use AND instead of OR, you will get no results at all. (The reason is that there cannot be a budget that is at the same time less than $100,000 and greater than $150,000.) Therefore, Example 5.19 shows a possible problem that can appear between English phrasing of an exercise and its logical meaning.

    Queries Involving Null Values
    A NULL in the CREATE TABLE or ALTER TABLE statement specifies that a special value called NULL (which usually represents unknown values) is allowed in the column. These values differ from all other values in a database. In the SELECT statement, the WHERE clause generally returns rows for which the comparison evaluates to true. Our concern regarding queries is, how will comparisons involving NULL values be evaluated in the WHERE clause?

    All comparisons with NULL values will return false (even when preceded by NOT). To retrieve the rows with NULL values in the column, Transact-SQL includes the operator feature IS [NOT] NULL. This specification in a WHERE clause of a SELECT statement has the following general form:

    column IS [NOT] NULL

    Example 5.20 shows the use of the operator IS NULL.

    EXAMPLE 5.20
    Get employee numbers and corresponding project numbers for employees with unknown jobs who work on project p2.

    USE sample
    SELECT emp_no, project_no
       FROM works_on
       WHERE project_no = 'p2'
       AND job IS NULL

    The result is

    emp_noproject_no
    18316p2
    29346p2

    Example 5.21 shows syntactically correct, but logically incorrect, usage of NULL.

    EXAMPLE 5.21

    USE sample
    SELECT project_no, job
       FROM works_on
       WHERE job <> NULL

    The result is

    emp_noproject_no

    The condition

    column IS NOT NULL

    is equivalent to the condition

    NOT (column IS NULL)

    The system function ISNULL allows a display of the specified value as substitu­tion for NULL (see Example 5.22).

    EXAMPLE 5.22

    USE sample
    SELECT emp_no, ISNULL(job, 'Job unknown') task
       FROM works_on
       WHERE project_no = 'p1'

    The result is

    emp_notask
    10102Analyst
    9031Manager
    28559Job unknown
    29346Clerk

    In Example 5.22, we use a column heading task for the job column.

    LIKE Operator
    LIKE is an operator that compares column values with a specified pattern. The data type of the column can be any character or the DATETIME data type. The general form of the LIKE operator is

    column [NOT] LIKE 'pattern'

    pattern may be a string or date constant or expression (including columns of tables) and must be compatible with the data type of the corresponding column. For the speci­fied column, the comparison between the value in a row and the pattern evaluates to true if the column value matches the pattern expression.

    Certain characters within the pattern¡ªcalled wildcard characters¡ªhave a specific interpretation. Two of them are

    % (percent sign)
    _ (underscore)

    The percent sign specifies any sequence of zero or more characters. The underscore specifies any single character. Example 5.23 shows the use of the percent sign.

    EXAMPLE 5.23
    Get the names and numbers of all employees whose last names begin with the letter J.

    USE sample
    SELECT emp_fname, emp_lname, emp_no
       FROM employee
       WHERE emp_lname LIKE 'J%'

    The result is

    emp_fnameemp_lnameemp_no
    AnnJones10102
    JamesJames29346

    Example 5.24 shows the use of the wildcard characters % and _.

    EXAMPLE 5.24
    Get the names and numbers of all employees whose first names contain the letter a as the second character.

    USE sample
    SELECT emp_fname, emp_lname, emp_no
       FROM employee
       WHERE emp_fname LIKE '_a%'

    The result is

    emp_fnameemp_lnameemp_no
    MatthewSmith25348
    JamesJames29346

    In addition to the percent sign and the underscore, Transact-SQL supports other characters that have a special meaning when used with the LIKE operator. These characters ([, ], and A) are best explained in the next two examples.

    EXAMPLE 5.25
    Get full details of all departments whose locations begin with a character in the range C through F.

    USE sample
    SELECT *
       FROM department
       WHERE location LIKE '[C-F]%'

    The result is

    dept_nodept_namelocation
    d1ResearchDallas
    d3MarketingDallas

    As shown in Example 5.25, the square brackets, [ and ], delimit a range or list of characters. The order in which characters appear in a range is defined by the collating sequence, which is determined when SQL Server is installed or later.

    The character ^ specifies the negation of a range or a list of characters. This character has this meaning only within a pair of square brackets (see Example 5.26).

    EXAMPLE 5.26
    Get the numbers and names of all employees whose last names do not begin with the letters J, K, L, M, N, or O and whose first names do not begin with the letters E or Z.

    USE sample
    SELECT emp_no, emp_fname, emp_lname
       FROM employee
       WHERE emp_lname LIKE '[AJ-O]%'
       AND emp_fname LIKE '[AEZ]%'

    The result is

    emp_noemp_fnameemp_lname
    25348MatthewSmith
    18316JohnBarrimore

    The condition

    column NOT LIKE 'pattern'

    is equivalent to the condition

    NOT (column LIKE 'pattern')

    Example 5.27 shows the use of the LIKE operator (together with NOT).

    EXAMPLE 5.27
    Get full details of all employees whose first names do not end with the character n.

    USE sample
    SELECT *
       FROM employee
       WHERE emp_fname NOT LIKE '%n'

    The result is:

    emp_noemp_fnameemp_lnamedept_no
    25348MatthewSmithd3
    29346JamesJamesd2
    2581ElkeHanseld2
    9031ElsaBertonid2
    28559SybillMoserd1

    Any of the wildcard characters (%, _, [, ], or ^) enclosed in square brackets stand for themselves. An equivalent feature is available through the ESCAPE option. Therefore, both SELECT statements in Example 5.28 have the same meaning.

    EXAMPLE 5.28

    USE sample
    SELECT project_no, project_name
       FROM project
       WHERE project_name LIKE ¡®%[_]%¡¯

    SELECT project_no, project_name
       FROM project
       WHERE project_name LIKE ¡®%!_%¡¯ ESCAPE ¡®!¡¯

    The result is

    emp_noproject_name

    emp_noproject_name

    Both SELECT statements search for the underscore as an actual character in the column project_name. In the first SELECT statement, this search is established by enclosing the sign _ in square brackets. The second SELECT statement uses a character (in the example, it is the character !) as an escape character. The escape character overrides the meaning of the underscore as the wildcard character and leaves it to be interpreted as an ordinary character. (Our result contains no rows because there are no project names including the underscore character.)

    Note
    The SQL standard only supports the use of %, _, and the ESCAPE operator. For this reason, we recommend using the ESCAPE operator instead of a pair of square brackets if any wildcard character must stand for itself.

     

    Simple Subqueries
    All previous examples in this chapter contain comparisons of column values with an expression or constant. Additionally, the Transact-SQL language offers the ability to compare column values with the result of another SELECT statement. Such SELECT statements, which are nested in the WHERE clause of another SELECT statement, are called subqueries. The first SELECT statement in a subquery is often called the outer query¡ªin contrast to the inner query, which denotes the second SELECT statement. The inner query will always be evaluated first, and the outer query receives the values of the inner query.

    Note
    A subquery can also be nested in an INSERT, UPDATE, or a DELETE statement, which will be discussed in Chapter 7.

    There are two types of subqueries:

    • Simple
    • Correlated

    In a simple subquery, the inner query is evaluated exactly once. A correlated subquery differs from a simple one in that its value depends upon a variable from the outer query. Therefore, the inner query of a correlated subquery is evaluated each time the system retrieves a new row from the outer query. The correlated subquery will be discussed in Chapter 6.

    A simple subquery can be used with the following operators:

    • Comparison operators
    • IN operator
    • ANY or ALL operator
    • EXISTS function

    Subqueries and Comparison Operators
    Example 5.29 shows the simple subquery that is used with the operator =.

    EXAMPLE 5.29
    Get the first and last names of employees who work in the research department.

    USE sample
    SELECT emp_fname, emp_lname
       FROM employee
       WHERE dept_no =
       (SELECT dept_no
          FROM department
          WHERE dept_name = ¡®Research¡¯)

    The result is

    emp_fnameemp_lname
    JohnBarrimore
    SybillMoser

    SQL Server first evaluates the inner query. That query returns the number of the research department (d1). Thus, after the evaluation of the inner query, the subquery in Example 5.29 can be represented with the following equivalent query:

    USE sample
    SELECT emp_fname, emp_lname
       FROM employee
       WHERE dept_no = ¡®d1¡¯

    A subquery can be used with other comparison operators, too. Example 5.30 shows the use of the operator <.

    EXAMPLE 5.30
    Get all project numbers of employees whose employee numbers are smaller than the number of the employee named Moser.

    USE sample
    SELECT DISTINCT project_no
       FROM works_on
       WHERE emp_no <
       (SELECT emp_no
          FROM employee
          WHERE emp_lname = ¡®Moser¡¯)

    The result is

    project_no
    p1
    p2
    p3

    Any comparison operator can be used, provided the inner query returns exactly one row. This is obvious, because the comparison between particular column values of the outer query and a set of values (as a result of the inner query) is not possible.

    Subqueries and IN Operator
    The IN operator allows the specification of a set of expressions (or constants) that are subsequently used for the query search. This operator can be applied to a subquery for the same reason¡ªthat is, when the result of an inner query contains a set of values.

    Example 5.31 shows the use of the IN operator.

    EXAMPLE 5.31
    Get full details of all employees whose departments are located in Dallas.

    USE sample
    SELECT *
       FROM employee
       WHERE dept_no IN
       (SELECT dept_no
          FROM department
          WHERE location = ¡®Dallas¡¯)

    The result is

    emp_noemp_fnameemp_lnamedept_no
    25348MatthewSmithd3
    10102AnnJonesd3
    18316JohnBarrimored1
    28559SybillMoserd1

    Each inner query may contain further queries. This type of subquery is called a subquery with multiple levels of nesting. The maximum number of inner queries in a subquery depends on the amount of memory SQL Server has for each SELECT statement. In the case of subqueries with multiple levels of nesting, the system first evaluates the innermost query and returns the result to the query on the next nesting level, and so on. Finally, the outermost query evaluates the final outcome.

    EXAMPLE 5.32
    Get the last names of all employees who work on the project Apollo.

    USE sample
    SELECT emp_lname
       FROM employee
       WHERE emp_no IN
       (SELECT emp_no
          FROM works_on
          WHERE project_no IN
          (SELECT project_no
             FROM project
             WHERE project_name = ¡®Apollo¡¯))

    The result is

    emp_lname
    Jones
    James
    Bertoni
    Moser

    The innermost query in Example 5.32 evaluates to the project_no value p1. The middle inner query compares this value with all values of the project_no column in the works_on table. The result of this intermediate query is the set of employee numbers: (10102, 29346, 9031, 28559). Finally, the outermost query displays the corresponding last names for the selected employee numbers.

    ANY and ALL Operators
    The operators ANY and ALL are always used in combination with one of the comparison operators. The general syntax of both operators is

    column operator [ANY | ALL] query

    where operator stands for a comparison operator.

    Note
    Do not use ANY and ALL operators! Every query using ANY or ALL can be better formulated with the EXISTS function (see Chapter 6). Additionally, the semantic meaning of the ANY operator can be easily confused with the semantic meaning of the ALL operator and vice versa.

    The ANY operator evaluates to true if the result of an inner query contains at least one row that satisfies the comparison. Example 5.33 shows the use of the ANY operator.

    EXAMPLE 5.33
    Get the employee numbers, project numbers, and job names for employees who have not spent the most time on one of the projects.

    USE sample
    SELECT DISTINCT emp_no, project_no, job
       FROM works_on
       WHERE enter_date > ANY
       (SELECT enter_date
          FROM works_on)

    The result is

    emp_noproject_nojob
    2581p3Analyst
    9031p1Manager
    9031p3Clerk
    10102p3Manager
    18316p2NULL
    25348P2Clerk
    28559p1NULL
    28559p2Clerk
    29346p1Clerk
    29346p2NULL

    Each value of the enter_date column in Example 5.33 is compared with all values of this column. For all dates of this column, except the oldest one, the comparison is evaluated to true at least once. The row with the oldest date does not belong to the result because the comparison does not evaluate to true in any case. In other words, the expression ¡°enter_date > ANY (SELECT enter_date FROM works_on)¡± is true if there are any (one or more) rows in the works_on table with a value of the enter_date column less than the value of enter_date for the current row. This will be true for all but the minimum (or earliest) value of enter_date in the table.

    The keyword SOME is the synonym for ANY.

    Example 5.34 shows the use of the ANY operator.

    EXAMPLE 5.34
    Get the first and last names for all employees who work on project p1.

    USE sample
    SELECT emp_fname, emp_lname
       FROM employee
       WHERE emp_no = ANY
          (SELECT emp_no
             FROM works_on
             WHERE project_no = ¡®p1¡¯)

    The result is

    emp_fnameemp_lname
    AnnJones
    JamesJames
    ElsaBertoni
    SybillMoser

    The ALL operator evaluates to true if the evaluation of the table column in the inner query returns all values of that column. Example 5.35 shows the use of the ALL operator.

    EXAMPLE 5.35
    Get jobs of the employee with the smallest employee number.

    USE sample
    SELECT job
       FROM works_on
       WHERE emp_no <= ALL
       (SELECT emp_no
          FROM employee)

    The result is

    Job
    Analyst

    EXISTS Function
    The EXISTS function checks the inner query of a subquery and evaluates to
    its result contains at least one row. The syntax of the EXISTS function is

    [NOT] EXISTS (query)

    Chapter 6 contains all examples regarding the EXISTS function.

    Queries in the FROM Clause
    The previous versions of SQL Server only allowed you to place a query in the WHERE clause of the SELECT statement, as shown in earlier examples. Generally, it should be possible to write a query any place in a SELECT statement where a table can appear. (The result of a query is always a table or, in a special case, an expression.) SQL Server now allows you to write a query as part of the FROM clause. Example 5.36 shows the use of a query inside the FROM clause.

    EXAMPLE 5.36
    Get the names of all employees with employee numbers greater than or equal to 10000.

    USE sample
    SELECT emp_fname, emp_lname
       FROM (SELECT *
          FROM employee
          WHERE emp_no >= 10000 ) AS empno_10000

    The result is

    emp_fnameemp_lname
    MatthewSmith
    AnnJones
    JohnBarrimore
    JamesJames
    SybillMoser

    The name empno_10000 is an alias table name for the result of the SELECT statement in the FROM clause. (The alias for a table must be specified if a query is placed in the FROM clause of a SELECT statement.)

    GROUP BY Clause
    The GROUP BY clause defines one or more columns as a group such that all rows within any group have the same values for those columns. Example 5.37 shows the use of the GROUP BY clause.

    EXAMPLE 5.37
    Get all jobs of employees.

    USE sample
    SELECT job
       FROM works_on
       GROUP BY job

    The result is

    Job
    NULL
    Analyst
    Clerk
    Manager

    In Example 5.37, the GROUP BY clause builds different groups for all possible values (NULL, too!) appearing in the job column.

    Note
    There is a restriction regarding the use of columns in the GROUP BY clause. Each column appearing in the SELECT list of the query must also appear in the GROUP BY clause. This does not hold for constants and for columns that are part of an aggregate function (see Example 5.45). This makes sense, because only columns in the GROUP BY clause are guaranteed to have a single value (for each group).

    A table can be grouped by any combination of its columns. Example 5.38 shows the grouping of rows of the works_on table using two columns.

    EXAMPLE 5.38
    Group all employees using their project numbers and jobs.Group all employees using their project numbers and jobs.

    USE sample
    SELECT project_no, job
       FROM works_on
       GROUP BY project_no, job

    The result is

    project_nojob
    p1Analyst
    p1Clerk
    p1Manager
    p1NULL
    p1NULL
    p2Clerk
    p3Analyst
    p3Clerk
    p3Manager

    The result of Example 5.38 shows that there are nine groups with different combinations of project numbers and jobs. The only two groups that contain more than one row are

    p2Clerk25348, 28559
    p2NULL18316, 29346

    The sequence of the column names in the GROUP BY clause need not correspond to the sequence of the names in the SELECT list.

    Note
    Columns of data types TEXT and IMAGE cannot be used in the GROUP BY clause.

    Aggregate Functions
    Aggregates are functions that are used to get summary values. All aggregate functions can be divided into three groups:

    • Convenient aggregate functions
    • Statistical aggregate functions
    • Superaggregates

    The following sections describe these groups.

    Convenient Aggregates
    The Transact-SQL language supports six aggregate functions (an aggregate function is one that acts upon a set of values rather than a single one):

    • MIN
    • MAX
    • SUM
    • AVG
    • COUNT
    • COUNT_BIG
    Note
    There are several other aggregate functions, such as GROUPING, which will be described in detail in Chapter 27.

    All aggregate functions operate on a single argument that can be a column or an expression. (The only exception is the second form of the COUNT and COUNT_BIG function: COUNT(*) and COUNT_BIG(*).) The result of each aggregate function is a constant value, which is displayed in a separate column of the result.

    The aggregate functions appear in the SELECT list, which can include a GROUP BY clause. If there is no GROUP BY clause in the SELECT statement, and the SELECT list includes at least one aggregate function, then no simple columns can be included in the SELECT list (other than as arguments of an aggregate function). Therefore, Example 5.39 is wrong.

    EXAMPLE 5.39 (EXAMPLE OF AN ILLEGAL STATEMENT)

    USE sample
    SELECT emp_lname, MIN (emp_no)
       FROM employee

    The emp_lname column of the employee table must not appear in the SELECT list of Example 5.39, because it is not the argument of an aggregate function. On the other hand, all column names that are not arguments of an aggregate function may appear in the SELECT list if they are used for grouping.

    The argument of an aggregate function can be preceded by one of two keywords:

    • ALL
    • DISTINCT

    ALL indicates that all values of a column are to be considered. DISTINCT eliminates duplicate values of a column before the aggregate function is applied. (ALL is the default value.)

    Note
    Aggregate functions cannot be used in the WHERE clause of the SELECT statement.

    Aggregate Functions MIN and MAX
    The aggregate functions MIN and MAX compute the smallest and the largest values in the column, respectively. If there is a WHERE clause, the MIN and MAX functions return the smallest or largest of values from selected rows. Example 5.40 shows the use of the aggregate function MIN.

    EXAMPLE 5.40
    Get the smallest employee number.

    USE sample
    SELECT MIN(emp_no) min_employee_number
       FROM employee

    The result is

    min_employee_number
    2581

    Example 5.40 shows that column headings can also be applied to aggregate functions to enhance the readability of the result.

    The result of Example 5.40 is not user friendly. For instance, the name of the employee with the smallest number is not known. As already shown in Example 5.39, the explicit specification of the emp_name column in the SELECT list is not allowed. To retrieve the name of the employee with the smallest employee number, we use a subquery in
    Example 5.41, where the inner query contains the SELECT statement of Example 5.40.

    EXAMPLE 5.41
    Get the number and the last name of the employee with the smallest employee number.

    USE sample
    SELECT emp_no, emp_lname
       FROM employee
       WHERE emp_no =
       (SELECT MIN(emp_no)
          FROM employee)

    The result is

    emp_noemp_lname
    2581Hansel

    Example 5.42 shows the use of the aggregate function MAX.

    EXAMPLE 5.42
    Get the employee number of the manager who was entered last in the works_on table.

    USE sample
    SELECT emp_no
       FROM works_on
       WHERE enter_date =
          (SELECT MAX(enter_date)
             FROM works_on
             WHERE job = ¡®Manager¡¯)

    The result is

    emp_no
    10102

    The argument of the functions MIN and MAX can also be a string value or a date. If the argument has a string value, the comparison between all values will be provided using the actual collating sequence. For all arguments of data type DATETIME, the earliest date specifies the smallest and the latest date the largest value in the column.

    The DISTINCT option cannot be used with the aggregate functions MIN and MAX. All null values in the column that are the argument of the aggregate function MIN or MAX are always eliminated before MIN or MAX is applied.

    Aggregate Function SUM
    The aggregate function SUM calculates the sum of the values in the column. The argument of the function SUM must be numeric. Example 5.43 shows the use of the aggregate function SUM.

    EXAMPLE 5.43
    Calculate the sum of all budgets of projects.

    USE sample
    SELECT SUM (budget) sum_of_budgets
       FROM project

    The result is

    sum_of_budgets
    401500.0

    The use of the DISTINCT option eliminates all duplicate values in the column before the function SUM is applied. Similarly, all null values are always eliminated before SUM is applied.

    Aggregate Function AVG
    The aggregate function AVG calculates the average of the values in the column. The argument of the function AVG must be numeric (see Example 5.44).

    EXAMPLE 5.44
    Calculate the average of all budgets with a money amount greater than $100,000.

    USE sample
    SELECT AVG(budget) avg_budget
       FROM project
       WHERE budget > 100000

    The result is

    avg_budget
    153250.0

    All null values are eliminated before the function AVG is applied.

    Aggregate Functions COUNT and COUNT_BIG
    The aggregate function COUNT has two different forms. The syntax of the first form is

    COUNT ([DISTINCT] col_name)

    This form of the function COUNT calculates the number of values in the col_name column. When the DISTINCT keyword is used, all duplicate values are eliminated before the function COUNT is applied. This form of COUNT does not count rows with null values for the column.

    Example 5.45 shows the use of the first form of the aggregate function COUNT.

    EXAMPLE 5.45
    Count all different jobs in each project.

    USE sample
    SELECT project_no, COUNT(DISTINCT job) job_count
       FROM works_on
       GROUP BY project_no

    The result is

    project_nojob_count
    p14
    p21
    p33

    As can be seen from the result of Example 5.45, all null values are eliminated before the function COUNT(DISTINCT col_name) or COUNT(col_name) is
    applied. (The sum of all values in the job_count column is 8 instead of 11.)

    The second form of the function COUNT has the form COUNT(*). This aggregate function counts the number of rows in the table. Or if there is a WHERE clause in the SELECT statement, it returns the number of rows for which the WHERE condition is true. Example 5.46 shows the use of the second form of the aggregate function COUNT.

    EXAMPLE 5.46
    How many employees work on each project?

    USE sample
    SELECT project_no, COUNT(*) emp_count
       FROM works_on
       GROUP BY project_no

    The result is

    project_noemp_count
    p14
    p24
    p33

    In contrast to the first form of the function COUNT, the second form does not eliminate null values (see Example 5.47).

    EXAMPLE 5.47
    Get the number of each job in all projects.

    USE sample
    SELECT job, COUNT(*) job_count
       FROM works_on
       GROUP BY job

    The result is

    Jobjob_count
    NULL3
    Analyst2
    Clerk4
    Manager2

    The COUNT_BIG function is analogous to the COUNT function. The only difference between them is their return values: COUNT_BIG always returns
    a value of the BIGINT data type, while the COUNT function always returns a value of the INTEGER data type.

    Statistical Aggre