Most of us who have been programming in RPG for many years are familiar with doing things at a record level. We read, write, update, and delete one record at a time. Moving to SQL allows us to work with a set of data at a time, more than one row/record. We can fetch multiple rows from a table/file, delete more than one row with a single statement, and add multiple rows/records to a table/file at once. You can still do row/record level access using SQL, but until you start thinking in sets you do not get to experience the file I/O speed that multiple row/record statements can give you.
In this post I am going to give two examples of performing multiple row inserts.
- Inserting more than one row/record at a time with just one statement with hard coded values
- Inserting just a few rows/records from one table to another, as you would do to create testing data
1. Inserting multiple hard coded rows
For the posts I wrote about regular expressions I used a table called PERSON.
CREATE TABLE MYLIB.PERSON (
  PERSON_ID FOR "PID" INTEGER GENERATED ALWAYS AS IDENTITY
            (START WITH 1, INCREMENT BY 1, NOCYCLE),
  FIRST_NAME FOR "FNAME" VARCHAR(20),
  LAST_NAME FOR "LNAME" VARCHAR(30)
)
The table contains three columns. The first is an identity column: Db2 for i generates its value when a row is inserted into the table, so I must not give a value for that column and instead use DEFAULT in its place. If I were inserting just one row I would use:
INSERT INTO PERSON VALUES(DEFAULT,'ROBERT','ALLEN')
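Because PERSON_ID is a GENERATED ALWAYS identity column, another option is to leave it out of the statement altogether by listing only the columns I am supplying; Db2 for i then generates the identity value itself. A minimal sketch:

INSERT INTO PERSON
  (FIRST_NAME, LAST_NAME)
  VALUES('ROBERT','ALLEN')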
I had 25 rows to add. I could have used 25 Insert statements to add the rows. Why would I do that when I can insert all 25 at once in one Insert statement?
INSERT INTO PERSON VALUES(DEFAULT,'ROBERT','ALLEN'),
                         (DEFAULT,'John','Austin'),
                         (DEFAULT,'don','bennett'),
                         (DEFAULT,'DIETMAR','BRUCK'),
                         (DEFAULT,'Brian','Caine'),
                         (DEFAULT,'george','curtis'),
                         (DEFAULT,'ALAN','DALEY'),
                         (DEFAULT,'Ron','Framer'),
                         (DEFAULT,'lol','harvey'),
                         (DEFAULT,'RON','HEWITT'),
                         (DEFAULT,'Brian','Hill'),
                         (DEFAULT,'peter','hill'),
                         (DEFAULT,'COLIN','HOLDER'),
                         (DEFAULT,'Stuart','Imlach'),
                         (DEFAULT,'eric','jones'),
                         (DEFAULT,'MICK','KEARNS'),
                         (DEFAULT,'Frank','Kletzenbauer'),
                         (DEFAULT,'arthur','lightening'),
                         (DEFAULT,'BILLY','MYERSCOUGH'),
                         (DEFAULT,'Brian','Nicholas'),
                         (DEFAULT,'reg','ryan'),
                         (DEFAULT,'KEN','SATCHWELL'),
                         (DEFAULT,'Nelson','Stiffle'),
                         (DEFAULT,'ray','straw'),
                         (DEFAULT,'BOB','WESSON')
The column values for each row are enclosed in parentheses ( ) and each row is separated from the next by a comma ( , ). I formatted the above statement in a way I thought would clearly show each row, and make it easier for another programmer to understand what this statement does. I could just as easily have had each row's columns follow straight on from the previous row:
INSERT INTO PERSON VALUES(DEFAULT,'ROBERT','ALLEN'),(DEFAULT,'John',
'Austin'),(DEFAULT,'don','bennett'),(DEFAULT,'DIETMAR','BRUCK'),
(DEFAULT,'Brian','Caine'),(DEFAULT,'george','curtis'),(DEFAULT,
'ALAN','DALEY'),(DEFAULT,'Ron','Framer'),(DEFAULT,'lol','harvey'),
(DEFAULT,'RON','HEWITT'),(DEFAULT,'Brian','Hill'),(DEFAULT,'peter',
'hill'),(DEFAULT,'COLIN','HOLDER'),(DEFAULT,'Stuart','Imlach'),
(DEFAULT,'eric','jones'),(DEFAULT,'MICK','KEARNS'),(DEFAULT,'Frank',
'Kletzenbauer'),(DEFAULT,'arthur','lightening'),(DEFAULT,'BILLY',
'MYERSCOUGH'),(DEFAULT,'Brian','Nicholas'),(DEFAULT,'reg','ryan'),
(DEFAULT,'KEN','SATCHWELL'),(DEFAULT,'Nelson','Stiffle'),(DEFAULT,
'ray','straw'),(DEFAULT,'BOB','WESSON')
Personally I think the first example is better, as it is easier to see what the statement does when each row is on its own line.
As this is one Insert statement all of the rows are added to the table in one file output operation, which is a lot faster than 25 individual file output operations.
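If I wanted to confirm in my program how many rows a multiple row Insert added, one way is GET DIAGNOSTICS. A sketch, assuming it sits in an RPG program with an integer host variable I have called RowsAdded:

exec sql INSERT INTO PERSON VALUES(DEFAULT,'ROBERT','ALLEN'),
                                  (DEFAULT,'John','Austin') ;

exec sql GET DIAGNOSTICS :RowsAdded = ROW_COUNT ;
// RowsAdded now contains 2, the number of rows the one statement inserted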
2. Inserting just a few records from one table to another
We all create test data to test our programs with. Most of the time I do not want to use all the records/rows from a file/table, as there is just too much data: the program can take a long time to run, and I then feel intimidated by the amount of data I need to check. Usually I just copy a few records/rows from the production file into a test file, and use the test file.
How to extract the data for my test file?
I could use the Copy File command, CPYF, to copy a number of records from my production file to my test file. I only want records where the field COLUMN1 is equal to A, B, or C, and only five of each.
CPYF FROMFILE(PRODLIB/PRODFILE)
     TOFILE(TESTLIB/TESTFILE)
     MBROPT(*ADD)
     FROMRCD(1)
     TORCD(15)
     INCREL((*IF COLUMN1 *EQ 'A')
            (*OR COLUMN1 *EQ 'B')
            (*OR COLUMN1 *EQ 'C'))
Using FROMRCD(1) makes my copy faster than if I had used FROMRCD(*START).
The problem with this command is that it will only look in the first 15 records for records that meet my selection criteria. If there are none in the first 15 records no records are copied, or I might end up with an unbalanced mix, 10 As and 5 Bs perhaps.
I could use SQL Insert statements to first select the eligible rows and then insert them into my test file. This code snippet is from an RPG program; the same could be done in a CL program using the RUNSQL command, or the statements could be put into a source member and executed using the Run SQL Statements command, RUNSQLSTM.
exec sql DELETE FROM TESTLIB.TESTFILE ;

exec sql INSERT INTO TESTLIB.TESTFILE
           SELECT * FROM PRODLIB.PRODFILE
            WHERE COLUMN_1 = 'A'
            FETCH FIRST 2 ROWS ONLY ;

exec sql INSERT INTO TESTLIB.TESTFILE
           SELECT * FROM PRODLIB.PRODFILE
            WHERE COLUMN_1 = 'B'
            FETCH FIRST 2 ROWS ONLY ;

exec sql INSERT INTO TESTLIB.TESTFILE
           SELECT * FROM PRODLIB.PRODFILE
            WHERE COLUMN_1 = 'C'
            FETCH FIRST 2 ROWS ONLY ;
The interesting part of these Insert statements is the FETCH FIRST 2 ROWS ONLY. Once the second eligible row is found the statement ends.
Col 1
  A
  A
  B
  B
  C
  C
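A word of caution about FETCH FIRST n ROWS ONLY: without an ORDER BY the two rows returned are simply whichever eligible rows Db2 for i happens to encounter first, so the pick is not guaranteed to be repeatable. If I wanted a predictable selection I could add an ORDER BY to the subselect. A minimal sketch, assuming PRODFILE contains a hypothetical ORDER_DATE column, that takes the two newest 'A' rows:

exec sql INSERT INTO TESTLIB.TESTFILE
           SELECT * FROM PRODLIB.PRODFILE
            WHERE COLUMN_1 = 'A'
            ORDER BY ORDER_DATE DESC
            FETCH FIRST 2 ROWS ONLY ;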
The problem with this approach is that I need to execute three statements, and PRODFILE is searched three times for the eligible records. It would be more efficient, and faster, if I could do it all in one statement.
DELETE FROM TESTLIB.TESTFILE ;

INSERT INTO TESTLIB.TESTFILE
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'A'
    FETCH FIRST 2 ROWS ONLY)
  UNION
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'B'
    FETCH FIRST 2 ROWS ONLY)
  UNION
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'C'
    FETCH FIRST 2 ROWS ONLY) ;
This code is from a source member I created; I then executed the statements within it using the RUNSQLSTM command.
The UNION combines the three statements into one. Therefore, when this Insert statement is run it only needs to search PRODFILE once. I can see that when I look in TESTFILE.
Col 1
  A
  A
  C
  B
  C
  B
The order of the inserted rows shows that the Insert ran as one statement, rather than three separate ones, which is clearly more efficient than using three Insert statements.
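One thing to bear in mind that is not mentioned above: UNION removes duplicate rows from the combined result. If the production file could contain identical rows that I want to keep in my test data, UNION ALL is the safer choice, and it also saves Db2 for i the work of checking for duplicates. A minimal sketch of the same statement using UNION ALL:

INSERT INTO TESTLIB.TESTFILE
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'A'
    FETCH FIRST 2 ROWS ONLY)
  UNION ALL
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'B'
    FETCH FIRST 2 ROWS ONLY)
  UNION ALL
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'C'
    FETCH FIRST 2 ROWS ONLY) ;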
I hope that the above examples help you see the benefits of starting to look at data in sets, rather than at a record level. If you can change your mindset you will quickly find how much easier this multiple row/record functionality is to use, and how much faster your programs execute.
You can learn more about the SQL Insert statement on the IBM website here.
This article was written for IBM i 7.3, and should work for earlier releases too.
I think it's better to include the column names so pre-existing inserts won't fail in the future when a new column is added. Columns not referenced are assigned their default values.
INSERT INTO PERSON
  (PERSON_ID, FIRST_NAME, LAST_NAME)
  VALUES(DEFAULT,'ROBERT','ALLEN')
Ringer
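Applied to the multiple row Insert from the post, that suggestion would look something like this (a sketch showing only the first three rows):

INSERT INTO PERSON
  (PERSON_ID, FIRST_NAME, LAST_NAME)
  VALUES(DEFAULT,'ROBERT','ALLEN'),
        (DEFAULT,'John','Austin'),
        (DEFAULT,'don','bennett')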
Coincidentally, we were just using this technique to load data into a new database from an older one. The older database stored date values as the number of days since 01-01-1901, which was a bit odd. Our SELECT statement performed the conversion as follows:
date('1901-01-01') + rntjdt days as todt, ...
The data-conversion operations were actually quite involved, but we managed that by building a series of cascading SQL views (views that are based on preceding ones), until we had a view that resembled the tables that we were loading data into.
I have done the same as you: built Views using other Views to get the data in the form I want. It is so cool that Db2 for i is smart enough to combine all the logic of the Views, and Views of Views, into one "statement" when those Views of Views are used.
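To give a flavour of the cascading approach, here is a minimal sketch with hypothetical library, table, and column names (OLDLIB.OLDORDERS, ORDER_NO, and the rntjdt day count from the comment above), where the second View is built over the first:

CREATE VIEW NEWLIB.ORDERS_DATES AS
  SELECT ORDER_NO,
         DATE('1901-01-01') + RNTJDT DAYS AS ORDER_DATE
    FROM OLDLIB.OLDORDERS ;

CREATE VIEW NEWLIB.ORDERS_TO_LOAD AS
  SELECT ORDER_NO, ORDER_DATE
    FROM NEWLIB.ORDERS_DATES
   WHERE ORDER_DATE >= '2010-01-01' ;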
See Build Views and Views of Views