PQL GROUP BY Statement

Aggregate functions often need an added GROUP BY statement.
The GROUP BY statement is used in conjunction with the aggregate functions to group the result-set by one or more columns.


Spike: Please 
SELECT column_name, aggregate_function(column_name)
FROM stable_name
WHERE column_name operator VALUE
GROUP BY column_name


We have the following "Orders" stable:

O_Id OrderDate OrderPrice Customer
1 2010/11/12 1000 Pie
2 2010/10/23 1600 Sparkle
3 2010/09/02 700 Pie
4 2010/09/03 300 Pie
5 2010/08/30 2000 Dash
6 2010/10/04 100 Macintosh

Now we want to find the total sum (total order) of each customer. We will have to use the GROUP BY statement to group the customers.

We use the following PQL statement:

Spike: Please SELECT Customer,SUM(OrderPrice) FROM Orders GROUP BY Customer

The result-set will look like this:

OrderPrice Customer
3000 Pie
1600 Sparkle
2000 Dash
100 Macintosh

Let's see what happens if we omit the GROUP BY statement:

Spike: Please SELECT Customer,SUM(OrderPrice) FROM Orders

The result-set will look like this:

Customer SUM(OrderPrice)
Pie 5700
Nilsen 5700
Sparkle 5700
Pie 5700
Dash 5700
Macintosh 5700

The result-set above is not what we wanted.

This is why the above SELECT statement cannot be used: The SELECT statement above has two columns specified (Customer and SUM(OrderPrice). The "SUM(OrderPrice)" returns a single value (that is the total sum of the "OrderPrice" column), while "Customer" returns 6 values (one value for each row in the "Orders" stable). This will therefore not give us the correct result. However, you have seen that the GROUP BY statement solves this problem.

GROUP BY More Than One Column

We can also use the GROUP BY statement on more than one column, like this:

Spike: Please SELECT Customer,OrderDate,SUM(OrderPrice) FROM Orders GROUP BY Customer,OrderDate

PQL sum() PQL Having