GROUP BY Clause of a SELECT Statement

This section provides tutorial examples on how to use the GROUP BY clause to aggregate original rows of the base table into group rows in a SELECT statement.

The "GROUP BY clause" modifies the base table by grouping original rows into group rows based on identical combined values of the specified group columns. In other words, each resulting row represents a group of original rows that share one unique combination of values in the specified group columns. Original columns are reduced to the specified group columns only.

Group rows can also be filtered out by a specified condition in an optional "HAVING clause".

"GROUP BY clause" syntax is:

GROUP BY group_columns [HAVING having_condition]

where "group_columns" is a list of columns in the original base table, and "having_condition" is a predicate that evaluates to true or false.

Rule 1: Two types of data can be used in select expressions: 1. group columns; 2. an aggregate function of any original columns.

Here are some examples of aggregate functions: COUNT(), SUM(), MIN(), MAX(), and AVG().

For example, the following statement produces a salary statistics report per department:

SELECT Department, COUNT(Name) AS NumberOfEmployees,
 MIN(Salary) AS MinimumSalary, MAX(Salary) AS MaximumSalary,
 AVG(Salary) as AverageSalary
FROM Employee WHERE Status='Active' GROUP BY Department
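
The statement above can be tried out without a MySQL server. The following is a minimal sketch that runs it from Python against an in-memory SQLite database, which is assumed here as a stand-in for MySQL. The sample rows are hypothetical, and an ORDER BY clause is added only to make the output order predictable:

```python
import sqlite3

# Hypothetical sample data; column names follow the statement above.
rows = [
    ("Ann", "Sales", 5000.0, "Active"),
    ("Bob", "Sales", 3000.0, "Active"),
    ("Cid", "Sales", 9000.0, "Retired"),  # excluded by the WHERE clause
    ("Dee", "IT", 7000.0, "Active"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Name TEXT, Department TEXT, Salary REAL, Status TEXT)"
)
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", rows)

# One group row per department; every select expression is either
# the group column or an aggregate function.
report = conn.execute(
    """
    SELECT Department, COUNT(Name) AS NumberOfEmployees,
     MIN(Salary) AS MinimumSalary, MAX(Salary) AS MaximumSalary,
     AVG(Salary) AS AverageSalary
    FROM Employee WHERE Status='Active' GROUP BY Department
    ORDER BY Department
    """
).fetchall()

for line in report:
    print(line)
```

With this sample data, the retired employee is removed by the WHERE clause before grouping, so the Sales group aggregates only two rows.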

Rule 2: If multiple group columns are used, rows are grouped into a single row based on the identical combined values of the group columns, not individual identical values. For example, the following statement reports salary statistics per department and per sex:

SELECT Department, Sex, COUNT(Name) AS NumberOfEmployees,
 MIN(Salary) AS MinimumSalary, MAX(Salary) AS MaximumSalary,
 AVG(Salary) as AverageSalary
FROM Employee WHERE Status='Active' GROUP BY Department, Sex

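If there are 10 individual departments and the Sex column has 2 distinct values, you will get 20 group rows, assuming that every department has active employees of both sexes. A combination that has no matching rows produces no group row.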
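
The number of group rows depends on which combinations actually occur in the data. The following sketch illustrates this with hypothetical rows in an in-memory SQLite database (assumed here as a stand-in for MySQL): two departments yield only three group rows, because one department has no male employee:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Name TEXT, Department TEXT, Sex TEXT, Status TEXT)"
)
# Hypothetical data: IT has no male employee.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, 'Active')",
    [
        ("Ann", "Sales", "Female"),
        ("Bob", "Sales", "Male"),
        ("Cal", "Sales", "Male"),
        ("Dee", "IT", "Female"),
    ],
)

# Rows are grouped by the combined (Department, Sex) value.
groups = conn.execute(
    "SELECT Department, Sex, COUNT(Name) AS NumberOfEmployees FROM Employee "
    "WHERE Status='Active' GROUP BY Department, Sex "
    "ORDER BY Department, Sex"
).fetchall()

for line in groups:
    print(line)
```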

Rule 3: If a having condition is specified, it will be used to filter out the resulting group rows that do not satisfy this condition. Since the having condition is applied to the group rows, it can only use group columns and aggregate functions. For example, the following statement reports salary statistics only for those departments that have more than 10 active employees:

SELECT Department, COUNT(Name) AS NumberOfEmployees,
 MIN(Salary) AS MinimumSalary, MAX(Salary) AS MaximumSalary,
 AVG(Salary) as AverageSalary
FROM Employee WHERE Status='Active'
GROUP BY Department HAVING COUNT(Name)>10
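
As a rough check of how the having condition behaves, the following sketch builds a hypothetical Employee table in an in-memory SQLite database (assumed here as a stand-in for MySQL) with 11 active employees in one department and 3 in another, then applies the same HAVING condition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Name TEXT, Department TEXT, Salary REAL, Status TEXT)"
)
# Hypothetical data: 11 active employees in Sales, 3 in IT.
staff = [(f"S{i}", "Sales", 1000.0 + i, "Active") for i in range(11)]
staff += [(f"T{i}", "IT", 2000.0 + i, "Active") for i in range(3)]
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", staff)

# HAVING is evaluated per group row, after grouping has been done.
big = conn.execute(
    "SELECT Department, COUNT(Name) AS NumberOfEmployees FROM Employee "
    "WHERE Status='Active' GROUP BY Department HAVING COUNT(Name)>10"
).fetchall()

print(big)
```

Only the Sales group row survives, since the IT group has 3 employees and fails the condition.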

The following is a bad example. "Sex='Male'" tests an original column that is neither a group column nor an aggregate function, so it can only be used in the WHERE clause, not in the HAVING clause:

SELECT Department, COUNT(Name) AS NumberOfEmployees,
 MIN(Salary) AS MinimumSalary, MAX(Salary) AS MaximumSalary,
 AVG(Salary) as AverageSalary
FROM Employee WHERE Status='Active'
GROUP BY Department HAVING sex='Male'
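
If the intent of the bad example is to report statistics for male employees only (an assumption about what the author wanted), the condition can be moved into the WHERE clause so that it filters original rows before they are grouped. The following is a minimal sketch with hypothetical data in an in-memory SQLite database, assumed here as a stand-in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee "
    "(Name TEXT, Department TEXT, Sex TEXT, Salary REAL, Status TEXT)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?, 'Active')",
    [
        ("Ann", "Sales", "Female", 5000.0),
        ("Bob", "Sales", "Male", 3000.0),
        ("Cal", "Sales", "Male", 4000.0),
        ("Dee", "IT", "Female", 7000.0),
    ],
)

# The row-level condition on Sex goes into WHERE, before grouping.
males = conn.execute(
    "SELECT Department, COUNT(Name) AS NumberOfEmployees, "
    "AVG(Salary) AS AverageSalary FROM Employee "
    "WHERE Status='Active' AND Sex='Male' GROUP BY Department"
).fetchall()

print(males)
```

In this sample, IT has no male employee, so no group row is produced for it.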

Rule 4: If you want to get all values of a column in each resulting group, you need to use an aggregate function that collects them, such as GROUP_CONCAT(), JSON_ARRAYAGG(), or JSON_OBJECTAGG(). For example, the following statement adds the employee names, and their salaries, to the salary statistics per department and per sex:

SELECT Department, Sex, COUNT(Name) AS NumberOfEmployees,
 MIN(Salary) AS MinimumSalary, MAX(Salary) AS MaximumSalary,
 AVG(Salary) as AverageSalary,
 GROUP_CONCAT(Name) AS NameList,
 JSON_ARRAYAGG(Name) AS NameArray,
 JSON_OBJECTAGG(Name, Salary) AS NameAndSalary
FROM Employee WHERE Status='Active' GROUP BY Department, Sex
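
The effect of GROUP_CONCAT() can be sketched with SQLite, which also provides a GROUP_CONCAT() aggregate with a comma as the default separator. SQLite is assumed here only as a stand-in for MySQL, the data is hypothetical, and the JSON aggregate functions are left out because SQLite names them differently. The order of names inside each list is not guaranteed, so the sketch sorts them after splitting:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Name TEXT, Department TEXT, Status TEXT)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, 'Active')",
    [("Ann", "Sales"), ("Bob", "Sales"), ("Dee", "IT")],
)

# GROUP_CONCAT collects all Name values of a group into one string.
result = conn.execute(
    "SELECT Department, GROUP_CONCAT(Name) AS NameList FROM Employee "
    "WHERE Status='Active' GROUP BY Department"
).fetchall()

# Split each list and sort it, since concatenation order is unspecified.
names = {dept: sorted(name_list.split(",")) for dept, name_list in result}
print(names)
```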
