It was during a COMMON presentation discussing the new features of IBM i 7.3 that Steve Will, chief architect of the operating system, mentioned that a reason for this release was to accommodate the new Temporal tables. These tables promise the ability to see the data in SQL DDL tables as it was in the past. In fact, multiple users can use the same table at the same time and see the data as it was at different times, all while the table is still being updated with new data.
Having been involved in projects to recreate a "snapshot" of data for a file as it was at a specific time, I can say that it is a lengthy process to roll back changes using data from journals or trigger output files. After those projects I welcome the usefulness of Temporal tables.
Let me start by giving a very high level description of how Temporal tables work. Any changes made to the "master" table are written to a "history" table. When I perform a SELECT and ask for a specific time, the operating system is smart enough to work out which rows were active at that time and return that data set to me. How it does this will become more apparent as I describe how the tables are coded.
What I have found in experimenting with Temporal tables is that both the "master" and the "history" files need to be journaled. The documentation from IBM fails to mention this. On consulting someone far wiser than I in the ways of SQL, he suggested that I place my Temporal tables in their own schema. When I create a schema, a journal and journal receiver are created within it, and when I create my Temporal tables they are automatically added to that journal and receiver.
For those not familiar with SQL-speak, a schema is basically the same as a library. When you look for a schema on the IBM i it will be shown as a library. In my experience, every command you can use with a library you can use with a schema too.
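If, instead of using a schema, the tables were going to live in an existing library that has no journal, the journaling would have to be set up by hand once the tables had been created. Here is a rough sketch of how that could be done from SQL using QSYS2.QCMDEXC; the library, journal, and receiver names (MYLIB, MYJRN, MYJRNRCV) are made up for illustration only:

  -- Hypothetical names used for illustration only
  CALL QSYS2.QCMDEXC('CRTJRNRCV JRNRCV(MYLIB/MYJRNRCV)') ;
  CALL QSYS2.QCMDEXC('CRTJRN JRN(MYLIB/MYJRN) JRNRCV(MYLIB/MYJRNRCV)') ;
  -- Both the "master" and the "history" files need to be journaled
  CALL QSYS2.QCMDEXC('STRJRNPF FILE(MYLIB/PERSON MYLIB/PERSON_H) JRN(MYLIB/MYJRN)') ;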
So let me create my schema and look at what I find in it:
CREATE SCHEMA MYSCHEMA

WRKOBJPDM MYSCHEMA
What I see in my new schema is:
                    Work with Objects Using PDM
Library . . . . .   MYSCHEMA      Position to  . . . . . .
                                  Position to type . . . .

Opt  Object      Type      Attribute   Text
     QSQJRN0001  *JRNRCV               COLLECTION - created by
     QSQJRN      *JRN                  COLLECTION - created by
     SYSCHKCST   *FILE     LF          SQL catalog view
     SYSCOLUMNS  *FILE     LF          SQL catalog view
     SYSCST      *FILE     LF          SQL catalog view
     SYSCSTCOL   *FILE     LF          SQL catalog view
     SYSCSTDEP   *FILE     LF          SQL catalog view
The first two objects are the journal and journal receiver that were created for me. The rest of the objects are a collection of the same system views that are available in the library QSYS2. Personally, whenever I find them in my schema I delete them.
DLTF MYSCHEMA/SYS*
Now I have my schema I can create my Temporal table. In this example I am going to create a table called PERSON, which will contain people's names. The definitions of this table and its matching "history" table are in one source member. To make it easier to explain what is going on I am going to break the code up into parts, starting with the basic definition of the "master" table.
01  CREATE TABLE MYSCHEMA.PERSON (
02     PERSON_ID NUMERIC(10,0)
03        GENERATED ALWAYS AS IDENTITY
04        (START WITH 1 INCREMENT BY 1 NO CYCLE),
05     FIRST_NAME VARCHAR(25),
06     MID_INITIAL CHAR(1),
07     LAST_NAME VARCHAR(30),
08     BEGIN_TS TIMESTAMP(12) NOT NULL
09        GENERATED ALWAYS AS ROW BEGIN,
10     END_TS TIMESTAMP(12) NOT NULL
11        GENERATED ALWAYS AS ROW END,
12     TS_ID TIMESTAMP(12) NOT NULL
13        GENERATED ALWAYS AS TRANSACTION START ID,
14     PERIOD SYSTEM_TIME (BEGIN_TS,END_TS)
15  ) ;
Line 1: Every table is created using the CREATE TABLE statement. I am using SQL naming, therefore the schema (library) name is separated from the table (file) name by a dot.
Lines 2 – 4: The first column (field) in the table is PERSON_ID. I am making this an identity column. Its value is automatically generated as a unique number, thereby guaranteeing that it can be used as a unique key for the table.
Line 5: This column is for the first name. I decided to make this VARCHAR rather than CHAR as not everyone's first name needs 25 characters.
Line 6: Middle initial can be CHAR as it can only be one character maximum.
Line 7: Last name I have made a VARCHAR of a maximum of 30 characters.
The following columns have to be present for the Temporal table to work. The names I have used for these columns are not important; it is the keywords that follow them that are.
Lines 8 and 9: This column is used to indicate when the record became "active". It must be coded as a TIMESTAMP(12). GENERATED ALWAYS AS ROW BEGIN must be present and explains exactly what this column is used for.
Lines 10 and 11: If you have a column to say when the record became "active" you need another to say when it became "inactive". As with the previous column it must be a TIMESTAMP(12) and must have GENERATED ALWAYS AS ROW END to indicate its purpose.
Lines 12 and 13: If several rows were inserted, changed, or deleted by the same statement, all of the affected rows will have the same value in this column. This allows me to identify all the records that were changed by that one statement, as shown in the sketch after these notes. If a row is the only one to be inserted, changed, or deleted then its value will be the same as the row begin column.
Line 14: Defines the time period during which the row was "active".
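To illustrate the transaction start id column, here is a hypothetical query of my own, not part of the table definition, that counts how many rows were written by each statement. It uses the PERSON_H history table that I create later in this post:

  -- Hypothetical illustration: history rows written by the same statement
  -- share one TS_ID value, so grouping by it shows how many rows each
  -- statement changed
  SELECT TS_ID, COUNT(*) AS ROWS_CHANGED
    FROM MYSCHEMA.PERSON_H
    GROUP BY TS_ID ;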
The next part of my code gives my columns decent column headings. For more information about this subject see the post Changing column headings in output file.
16  LABEL ON COLUMN MYSCHEMA.PERSON (
17     PERSON_ID IS 'Person id',
18     FIRST_NAME IS 'First name',
19     MID_INITIAL IS 'Middle initial',
20     LAST_NAME IS 'Last name',
21     BEGIN_TS IS 'Row begin timestamp',
22     END_TS IS 'Row end timestamp',
23     TS_ID IS 'Timestamp id'
24  ) ;
In the next part I create the "history" table and establish the relationship between the "master" and the "history".
25  CREATE TABLE MYSCHEMA.PERSON_H LIKE MYSCHEMA.PERSON ;

26  ALTER TABLE MYSCHEMA.PERSON ADD VERSIONING USE
27     HISTORY TABLE MYSCHEMA.PERSON_H ;
Line 25: This is a nice thing you can do in SQL to create a copy of a table's definition. In this case the definition of PERSON is copied to create the table PERSON_H.
Lines 26 and 27: This defines the "versioning" relationship between the two tables. Now whenever a row is updated or deleted, the original version of the row is written to the "history" table.
The last two lines just give the two tables text descriptions that will be displayed when I look at them on the IBM i.
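If I ever needed to break this relationship again, for example before purging old history rows, my understanding is that the versioning can be removed with a matching ALTER statement. This is just a sketch; it is not something I do in this article:

  -- Sketch: detach the history table from the temporal table
  ALTER TABLE MYSCHEMA.PERSON DROP VERSIONING ;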
28  LABEL ON TABLE MYSCHEMA.PERSON IS 'PERSON temporal table' ;
29  LABEL ON TABLE MYSCHEMA.PERSON_H IS 'PERSON_H temporal table' ;
So now I need to add, change, and delete data in the "master" file. I am going to use an SQL RPG program to do this, although I could use a CL program and the RUNSQL command, or just plain STRSQL.
01  ctl-opt dftactgrp(*no) ;

02  dcl-pr sleep extproc('sleep') ;
03    *n uns(10) value ;
04  end-pr ;

05  exec sql SET OPTION COMMIT = *NONE ;

06  exec sql INSERT INTO PERSON (FIRST_NAME,MID_INITIAL,LAST_NAME)
              VALUES('ANNA','','CRUZ') ;
07  sleep(5) ;

08  exec sql INSERT INTO PERSON VALUES(DEFAULT,'JHON','D','SMITH',
              DEFAULT,DEFAULT,DEFAULT) ;
09  sleep(5) ;

10  exec sql INSERT INTO PERSON (FIRST_NAME,MID_INITIAL,LAST_NAME)
              VALUES('DARNESHA','A','GRAY') ;
11  sleep(5) ;

12  exec sql INSERT INTO PERSON (FIRST_NAME,MID_INITIAL,LAST_NAME)
              VALUES('RANJIT','','GUPTA') ;
13  sleep(5) ;

14  exec sql UPDATE PERSON SET LAST_NAME = 'HERNANDEZ'
              WHERE PERSON_ID = 1 ;
15  sleep(5) ;

16  exec sql UPDATE PERSON SET FIRST_NAME = 'JOHN'
              WHERE PERSON_ID = 2 ;
17  sleep(5) ;

18  exec sql DELETE FROM PERSON WHERE PERSON_ID = 4 ;
Line 1: I need the DFTACTGRP(*NO) control option as I am going to call an external procedure.
Lines 2 – 4: And this is the external procedure. sleep() does the same as the DLYJOB command: it pauses the program for the number of seconds it is passed. Most examples of sleep() include a return value; frankly, I cannot be bothered with it as I do not need it. There is only one parameter, the number of seconds, which needs to be passed as an unsigned integer.
Line 5: As I do not want to use commitment control with my SQL statements I am turning it "off" within my program. For more information about what other options you can use this statement for see Putting the SQL options into the source.
Line 6: First insert statement. In this statement I only list the columns I want to provide values for. The PERSON_ID identity column and the temporal columns, BEGIN_TS, END_TS, and TS_ID, are generated for me.
Line 7: I am going to sleep for 5 seconds.
Line 8: This is another way I could code the insert statement. In this scenario I do not give a list of columns; it is assumed I am providing values for all of them. The values I want to insert into the identity and temporal columns are to be the default values for those columns, therefore I use the value DEFAULT. More on using DEFAULT can be found in Easy way to initialize columns when using SQL insert. Notice how I have misspelled this person's first name.
Line 9: Again I sleep.
Lines 10 – 13: The above steps are repeated for two more individuals.
Line 14: Anna Cruz has just informed me that she wants to use her married last name of Hernandez. Therefore, I need to update the last name of the person with the PERSON_ID of 1.
Line 16: Oops, I notice that I misspelled John's first name, so I need to correct it.
Line 18: Ranjit has left, therefore, I need to delete his details from the table.
When this program has finished, the data in PERSON looks like this:
   Person  First     Middle   Last
   id      name      initial  name
 1      1  ANNA               HERNANDEZ
 2      2  JOHN      D        SMITH
 3      3  DARNESHA  A        GRAY

   Row begin                         Row end
   timestamp                         timestamp
 1 2016-09-30-03.37.21.964027000244  9999-12-30-00.00.00.000000000000
 2 2016-09-30-03.37.27.016426000244  9999-12-30-00.00.00.000000000000
 3 2016-09-30-03.37.11.888739000244  9999-12-30-00.00.00.000000000000

   Timestamp id
 1 2016-09-30-03.37.21.964027000244
 2 2016-09-30-03.37.27.016426000244
 3 2016-09-30-03.37.11.888739000244
I find it interesting that the end timestamp is only December 30, 9999, and not December 31, 9999.
The "history" file, PERSON_H, contains:
   Person  First     Middle   Last
   id      name      initial  name
 1      1  ANNA               CRUZ
 2      2  JHON      D        SMITH
 3      4  RANJIT             GUPTA

   Row begin                         Row end
   timestamp                         timestamp
 1 2016-09-30-03.37.01.815270000244  2016-09-30-03.37.21.964027000244
 2 2016-09-30-03.37.06.852713000244  2016-09-30-03.37.27.016426000244
 3 2016-09-30-03.37.16.924815000244  2016-09-30-03.37.32.056313000244

   Timestamp id
 1 2016-09-30-03.37.01.815270000244
 2 2016-09-30-03.37.06.852713000244
 3 2016-09-30-03.37.16.924815000244
Here I can see the time the rows were changed or deleted by the values in the row end column.
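A query I find handy here, and this is just an illustrative example of my own rather than anything special to temporal tables, is to union the "master" and "history" tables to see the whole life of one row, Anna's in this example:

  -- Illustrative only: show every version of PERSON_ID 1, past and present
  SELECT FIRST_NAME, LAST_NAME, BEGIN_TS, END_TS
    FROM PERSON
    WHERE PERSON_ID = 1
  UNION ALL
  SELECT FIRST_NAME, LAST_NAME, BEGIN_TS, END_TS
    FROM PERSON_H
    WHERE PERSON_ID = 1
  ORDER BY BEGIN_TS ;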
Now let me start my time travelling. I want to see what was "active" at 3:37:10 AM:
SELECT LAST_NAME,FIRST_NAME FROM PERSON
  FOR SYSTEM_TIME AS OF '2016-09-30-03.37.10.000000000000'

Last        First
name        name
CRUZ        ANNA
SMITH       JHON
********  End of data  ********
At that time only Anna's and John's rows were in the table.
Seven seconds later:
SELECT LAST_NAME,FIRST_NAME FROM PERSON
  FOR SYSTEM_TIME AS OF '2016-09-30-03.37.17.000000000000'

Last        First
name        name
GRAY        DARNESHA
CRUZ        ANNA
SMITH       JHON
GUPTA       RANJIT
********  End of data  ********
And as I slowly move through time...
SELECT LAST_NAME,FIRST_NAME FROM PERSON
  FOR SYSTEM_TIME AS OF '2016-09-30-03.37.22.000000000000'

Last        First
name        name
HERNANDEZ   ANNA
GRAY        DARNESHA
SMITH       JHON
GUPTA       RANJIT
********  End of data  ********
I can watch the data change...
SELECT LAST_NAME,FIRST_NAME FROM PERSON
  FOR SYSTEM_TIME AS OF '2016-09-30-03.37.28.000000000000'

Last        First
name        name
HERNANDEZ   ANNA
SMITH       JOHN
GRAY        DARNESHA
GUPTA       RANJIT
********  End of data  ********
Until it catches up to the present time:
SELECT LAST_NAME,FIRST_NAME FROM PERSON
  FOR SYSTEM_TIME AS OF '2016-09-30-03.37.33.000000000000'

Last        First
name        name
HERNANDEZ   ANNA
SMITH       JOHN
GRAY        DARNESHA
********  End of data  ********
Another thing I can do is give a range of times and see what data changes happened during that period.
SELECT LAST_NAME,FIRST_NAME FROM PERSON
  FOR SYSTEM_TIME BETWEEN '2016-09-30-03.37.20.000000000000'
                      AND '2016-09-30-03.37.27.000000000000'
  ORDER BY FIRST_NAME

Last        First
name        name
HERNANDEZ   ANNA
CRUZ        ANNA
GRAY        DARNESHA
SMITH       JHON
GUPTA       RANJIT
********  End of data  ********
If I use a range of times I have to be careful how I use the results. Anna's row was inserted and then changed within the time range, therefore her details appear twice.
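There is also a FROM ... TO form of the period specification. My understanding is that it behaves almost the same as BETWEEN ... AND, except that the end of the range is exclusive rather than inclusive. A sketch using the same times as above:

SELECT LAST_NAME,FIRST_NAME FROM PERSON
  FOR SYSTEM_TIME FROM '2016-09-30-03.37.20.000000000000'
                    TO '2016-09-30-03.37.27.000000000000'
  ORDER BY FIRST_NAME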
I can see many uses for Temporal tables. Now, after the daily, weekly, monthly, or yearly updates have happened, the users can return to using the applications, and by using Temporal tables I can run my reports without worrying that the users are changing the data I need. Or I can run a backlog report for last week without having to restore a copy of the backlog file. More uptime for the systems equals happier management.
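For example, a report program could read the data as it was at some earlier point while users carry on updating it. This is just a sketch of the idea, not code from a real report:

  -- Sketch: run this week's report against the data as it was seven days ago
  SELECT LAST_NAME,FIRST_NAME FROM PERSON
    FOR SYSTEM_TIME AS OF CURRENT TIMESTAMP - 7 DAYS ;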
You can learn more about this from the IBM website:
This article was written for IBM i 7.3.
As far as I can remember the recommendation from Scott Forstie is to put the tables & history tables into the same schema. It's easier to handle this way (save, restore, authority, ...).
Have to agree with tables & history in same schema, makes life a lot easier.
And another important thing: you can't use foreign keys with SET DEFAULT/CASCADE/SET NULL and temporal tables.
Only "restrict" works...
Thanks for the article, Simon.
This is one of my arguments for convincing managers of a 7.3 migration.
The views created by the system in the collection's library aren't created for fun but to improve the performance of the SQL engine. Deleting these files has a negative impact on system performance.
ReplyDeleteI don't think the views in the schema library provide much in the way of performance. Their purpose is to provide the same views as in QSYS2, but with WHERE clauses that limit the results to only objects in that Schema. Database engines like Oracle and DB2 for Z have schemas, but each is isolated from others with their own catalogs. On IBM i, you can treat the entire system as one big schema (with catalogs in QSYS2) or you can have the limited schema view by using catalogs in the Schema library. So, perhaps if you query for columns named MyColumn, you'll only get those in the schema versus those across the entire system, so to that extent it could affect performance. But not for performance of normal insert/update/delete activity on the tables.