How many PC models does a particular supplier produce? How the average price is defined
for computers with the same specifications? The answers to these and other questions
associated with some statistic information may be obtained by means of summarizing (aggregate) functions.
The following aggregate functions are assumed as standard:
Function | Description |
COUNT(*) | Returns the number of rows of the record source. |
COUNT(<column name>) | Returns the number of values in the specified column. |
SUM(<column name>) | Returns the sum of values in the specified column. |
AVG(<column name>) | Returns the average value in the specified column. |
MIN(<column name>) | Returns the minimal value in the specified column. |
MAX(<column name>) | Returns the maximum value in the specified column. |
All these functions return a single value. In so doing, the functions COUNT, MIN, and MAX are applicable to any data types, while the functions SUM and AVG are only used with numeric fields. The difference between the functions COUNT(*) and COUNT(<column name>) is that the second does not calculate NULL-values.
Example. Find the minimal and maximal prices for PCs:
SELECT MIN(price) AS Min_price, MAX(price) AS Max_price FROM PC; |
Min_price | Max_price |
350.0 | 980.0 |
Example. Find the number of available computers produced by the maker :
SELECT COUNT(*) AS Qty FROM PC WHERE model IN (SELECT model FROM Product WHERE maker = 'A'); |
Qty |
7 |
Example. If the number of different models produced by the maker A is needed, the query may be written as follows (taking into account the fact that each model in the Product table is shown once):
SELECT COUNT(model) AS Qty_model FROM Product WHERE maker = 'A'; |
Qty_model |
7 |
Example. Find the number of available different models produced by maker A. This query is similar to the preceding one for the total number of models produced by maker A. Now we need to find the number of different models in the PC table (available for sale).
To use only unique values in calculating the statistic, the parameter DISTINCT with an aggregate function argument may be used. ALL is another (default) parameter and assumes that all the column values returned are calculated. The statement
SELECT COUNT(DISTINCT model) AS Qty FROM PC WHERE model IN (SELECT model FROM Product WHERE maker = 'A'); |
Qty |
2 |
If we need the number of PC models produced by each maker, we will need to use the GROUP BY clause, placed immediately after the WHERE clause, if any.
SELECT model, COUNT(model) AS Qty_model, AVG(price) AS Avg_price FROM PC GROUP BY model; |
model | Qty_model | Avg_price |
1121 | 3 | 850.0 |
1232 | 4 | 425.0 |
1233 | 3 | 843.33333333333337 |
1260 | 1 | 350.0 |
Should the SELECT clause include date column these characteristics may be calculated for each date specified. For that, the date should be added as grouping column with the aggregate functions be calculated for each combination of (model−date).
There are some particular rules for executing aggregate functions:
In so doing, if the query does not include GROUP BY clause, the aggregate functions in the SELECT clause process all the result rows of this query. If the query includes the GROUP BY clause, each row set with the same value in the column or the same combination of values in several columns given in the GROUP BY clause forms a group with the aggregate functions being calculated for each group separately.
While WHERE clause gives predicate for filtering rows, the HAVING clause is applied after grouping that gives a similar predicate but filtering groups by the values of aggregate functions. This clause is nessesary for checking the values that are obtained by means of an aggregate function not from separate rows of record source in the FROM clause but from the groups of these rows. Therefore, this checking is not applicable to the WHERE clause.
Example. Get the count of PC and the average price for each model providing average price is less than $800:
SELECT model, COUNT(model) AS Qty_model, AVG(price) AS Avg_price FROM PC GROUP BY model HAVING AVG(price) < 800; |
model | Qty_model | Avg_price |
1232 | 4 | 425.0 |
1260 | 1 | 350.0 |
Note that the alias (Avg_price) for naming values of the aggregate function in the SELECT clause may not be used in the HAVING clause. This is because the SELECT clause forming the query result set is executed last but before the ORDER BY clause.
Below is the execution order of clauses in the SELECT operator:
This order does not correspond to the syntax order of SELECT operator generally formed as follows:
SELECT [DISTINCT | ALL]{*
| [<column expression> [[AS] <alias>]] [,]}
FROM <table name> [[AS] <alias>] [,]
[WHERE <predicate>]
[[GROUP BY <colunm list>]
[HAVING <condition on aggregate values>] ]
[ORDER BY <column list>]
Suggested exercises: 10, 11, 12, 14, 15, 20, 22, 33, 43, 51, 52.
Home | SELECT exercises (rating stages) | DML exercises | Developers |