I have often looked at the results returned to me by a SQL statement and wondered to myself "Wouldn't it be nice if I could add a subtotal to these results".
Yesterday I decided to have a search using my favorite search engine to see if there is an easy way to do this. I could find examples in other flavors of SQL, but not in Db2 for i. Feeling piqued I decided to try one of these examples with one of my SQL Select statements, and was really pleased to find that what was given in the example also worked in Db2 for i too. Thank goodness for SQL interoperability and standardization.
I am not going to give the source code for the SQL DDL table I will be using in these examples. My table, TABLE_OF_THINGS, has three columns:
- TYPE
- SUBTYPE
- THING
For those of you wondering why I am using such a long table name that is too long to have a decent system name I used the FOR SYSTEM in the CREATE TABLE statement to give it the system name: THINGSTAB.
I can use the SQL statement below to view the table's contents:
01 SELECT * FROM TABLE_OF_THINGS TYPE SUBTYPE THING ---- ---------- ---------- 1 FRUIT STRAWBERRY 1 VEGETABLE MAIZE 2 VERTEBRATE BEAR 1 FRUIT RASPBERRY 1 VEGETABLE TURNIP 2 VERTEBRATE DOG 1 FRUIT APPLE 1 VEGETABLE CELERY 2 VERTEBRATE CAT 1 FRUIT STRAWBERRY |
I can add a count and group the results, but that is not really what I want:
01 SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT 02 FROM TABLE_OF_THINGS 03 GROUP BY TYPE,SUBTYPE,THING 04 ORDER BY TYPE,SUBTYPE,THING TYPE SUBTYPE THING COUNT ---- ---------- ---------- ----- 1 FRUIT APPLE 1 1 FRUIT RASPBERRY 1 1 FRUIT STRAWBERRY 2 1 VEGETABLE CELERY 1 1 VEGETABLE MAIZE 1 1 VEGETABLE TURNIP 1 2 VERTEBRATE BEAR 1 2 VERTEBRATE CAT 1 2 VERTEBRATE DOG 1 |
I was fearing that I would have to take these results and then run another statement over them to create subtotals and a total. That was until I discovered a post for Microsoft SQL Server that gave me three examples of how I can create subtotals.
GROUP BY ROLLUP
This addition to my statement gives me subtotals and a total at the bottom of the results.
01 SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT 02 FROM TABLE_OF_THINGS 03 GROUP BY ROLLUP (TYPE,SUBTYPE,THING) |
Line 3: The GROUP BY ROLLUP has to be followed by the columns the results need to be rolled up for.
With this statement I will get a count of all of the things, a subtotal of the subtype and type, and a total of all the results:
TYPE SUBTYPE THING COUNT ---- ---------- ---------- ----- 1 FRUIT APPLE 1 1 FRUIT RASPBERRY 1 1 FRUIT STRAWBERRY 2 1 FRUIT - 4 1 VEGETABLE CELERY 1 1 VEGETABLE MAIZE 1 1 VEGETABLE TURNIP 1 1 VEGETABLE - 3 1 - - 7 2 VERTEBRATE BEAR 1 2 VERTEBRATE CAT 1 2 VERTEBRATE DOG 1 2 VERTEBRATE - 3 2 - - 3 - - - 10 |
The hyphens ( - ) denote null values.
This nice and simple SQL statement fulfills the purpose I was looking for. But there are other, dare I say weirder, ways of using the GROUP BY.
GROUP BY GROUPING SETS
The grouping sets allow me to group the results into different ways as they are returned in the results from the SQL statement. For example:
01 SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT 02 FROM TABLE_OF_THINGS 03 GROUP BY GROUPING SETS ((TYPE,SUBTYPE,THING),(THING,TYPE), (TYPE),()) |
Line 3: I can group together the columns in any way I want, providing they are enclosed in parentheses ( ( ) ) and each column within them is separated by a comma.
The empty set of parentheses at the end will produce the grand total.
When run I get the following:
TYPE SUBTYPE THING COUNT ---- ---------- ---------- ----- 1 FRUIT APPLE 1 1 FRUIT RASPBERRY 1 1 FRUIT STRAWBERRY 2 1 VEGETABLE CELERY 1 1 VEGETABLE MAIZE 1 1 VEGETABLE TURNIP 1 1 - APPLE 1 1 - CELERY 1 1 - MAIZE 1 1 - RASPBERRY 1 1 - STRAWBERRY 2 1 - TURNIP 1 1 - - 7 2 VERTEBRATE BEAR 1 2 VERTEBRATE CAT 1 2 VERTEBRATE DOG 1 2 - BEAR 1 2 - CAT 1 2 - DOG 1 2 - - 3 - - - 10 |
I know the results are a bit strange and not really practical, but I just wanted to show what was possible.
I could also use the following statement to replicate the results I generated with the GROUP BY ROLLUP:
01 SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT 02 FROM PGMSDHTST3.TABLE_OF_THINGS 03 GROUP BY GROUPING SETS ((TYPE,SUBTYPE,THING),(TYPE,SUBTYPE), (TYPE),()) |
Line 3: Each grouping set produces the "levels" of results I want.
GROUP BY CUBE
This where the results get really weird. The GROUP BY CUBE creates all possible combination of the columns in the results. For example with these three columns:
TYPE | SUBTYPE | THING |
Y | Y | Y |
Y | Null | Y |
Y | Y | Null |
Y | Null | Null |
Null | Y | Y |
Null | Null | Y |
Null | Y | Null |
Null | Null | Null |
The statement looks simple enough:
01 SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT 02 FROM TABLE_OF_THINGS 03 GROUP BY CUBE (TYPE,SUBTYPE,THING) 04 ORDER BY TYPE,SUBTYPE,THING |
It took me a bit of time to get my head around the results:
TYPE SUBTYPE THING COUNT ---- ---------- ---------- ----- 1 FRUIT APPLE 1 1 FRUIT RASPBERRY 1 1 FRUIT STRAWBERRY 2 1 FRUIT - 4 1 VEGETABLE CELERY 1 1 VEGETABLE MAIZE 1 1 VEGETABLE TURNIP 1 1 VEGETABLE - 3 1 - APPLE 1 1 - CELERY 1 1 - MAIZE 1 1 - RASPBERRY 1 1 - STRAWBERRY 2 1 - TURNIP 1 1 - - 7 2 VERTEBRATE BEAR 1 2 VERTEBRATE CAT 1 2 VERTEBRATE DOG 1 2 VERTEBRATE - 3 2 - BEAR 1 2 - CAT 1 2 - DOG 1 2 - - 3 - FRUIT APPLE 1 - FRUIT RASPBERRY 1 - FRUIT STRAWBERRY 2 - FRUIT - 4 - VEGETABLE CELERY 1 - VEGETABLE MAIZE 1 - VEGETABLE TURNIP 1 - VEGETABLE - 3 - VERTEBRATE BEAR 1 - VERTEBRATE CAT 1 - VERTEBRATE DOG 1 - VERTEBRATE - 3 - - APPLE 1 - - BEAR 1 - - CAT 1 - - CELERY 1 - - DOG 1 - - MAIZE 1 - - RASPBERRY 1 - - STRAWBERRY 2 - - TURNIP 1 - - - 10 |
See what I mean.
Having briefly played with all three of these I can definitely see myself using the GROUP BY ROLLUP. I will probably use the GROUP BY GROUPSETS. I am stumped to think of a scenario where using GROUP BY CUBE would fit a scenario in the environment I work in.
This article was written for IBM i 7.4, and should work for some earlier releases too.
At a place I once worked, we had data stored by company, area, region, district, store, year, period, week, day, hour, item. We could have used cube with a materialized query table to provide a data warehouse since we downloaded and added new data every night.
ReplyDeleteJust so much power!
ReplyDeleteI think it was Scott Forstie that had an example where he replaced the nulls with *TOTAL* or *SUBTOTAL* to make it easier to understand. In the select statement use the coalesce function. for example:
ReplyDeleteSELECT
coalesce(TYPE, '*TOTAL*') as TYPE,
coalesce(SUBTYPE, '*TOTAL*') as SUBTYPE,
coalesce(THING, '*SUBTOTAL*') as THING,
COUNT(*) AS COUNT
FROM TABLE_OF_THINGS
GROUP BY ROLLUP (TYPE,SUBTYPE,THING)
Actually it has been possible to have subtotals in SQL result sets since IBM i 6.1...!
ReplyDeleteSee this article from 2009 by the excellent Michael Sansoterra:
https://www.itjungle.com/2009/03/11/fhg031109-story02/
Thanks Simon!
ReplyDeletewow , nice! Didn't know that. 'group by rollup'
ReplyDelete