I was asked how to update the current row if I am "reading" a file/table using a SQL cursor. At first this question took me aback, it was not that I did not know how to do it, but more a question of why?
What I consider to be one of the biggest benefits of using SQL to process (fetch, delete, and update) data from your database in sets, "chunks" of data. If I need to process a hundred rows/records of data why get each one individually? as that would be a hundred I/O operations. I can get all hundred rows in one I/O operation, thereby, reducing the amount of time and system resources needed to get the data.
I spent some time thinking of a scenario where I might want to fetch rows/records one at a time. I came up with the scenario of an interface. One program would add data to a file, and another would retrieve the data from the file and process the transaction. My example will be that second program.
The program will read the interface, see below, and process only those records that have not been previously processed, the field PROCESSED is blank. When the row/record has been read the PROCESSED field will be changed and the current timestamp will be moved to the PSTAMP field, so that we all know when this record was processed.
01 A UNIQUE 02 A R INTERFACER 03 A KEYFLD 10A 04 A SOMEDATA 200A 05 A PROCESSED 1A 06 A PSTAMP Z 07 A K KEYFLD 08 A K PSTAMP |
As this file will contain a mix of processed and unprocessed records if I was to design a purely RPG program to perform this scenario I would build a logical file to only select the unprocessed records, see line 3 below.
01 A R INTERFACER PFILE(INTERFACE) 02 A K KEYFLD 03 A S PROCESSED COMP(EQ ' ') |
My purely RPG program looks like:
01 **free 02 ctl-opt option(*nodebugio:*srcstmt:*nounref) ; 03 dcl-pr QCMDEXC extpgm ; 04 *n char(20) options(*varsize) const ; 05 *n packed(15:5) const ; 06 end-pr ; 07 dcl-f INTERFACE1 usage(*update) usropn ; 08 dcl-s StopPgm char(1) dtaara('STOP') ; 09 dow (1 = 1) ; 10 open INTERFACE1 ; 11 dow (2 = 2) ; 12 read INTERFACER ; 13 if (%eof) ; 14 leave ; 15 endif ; 16 PROCESSED = 'Y' ; 17 PSTAMP = %timestamp() ; 18 update INTERFACER %fields(PROCESSED:PSTAMP) ; 19 RecordProcessing() ; 20 enddo ; 21 close INTERFACE1 ; 22 QCMDEXC('DLYJOB DLY(60)':14) ; 23 in StopPgm ; 24 if (StopPgm = '1') ; 25 leave ; 26 endif ; 27 enddo ; 28 *inlr = *on ; |
Line 2: I always have these control options in my program as they make it easier for me to debug the program, and in the case of the *NOUNREF make the program smaller and more efficient.
Lines 3 – 6: I will be calling the QCMDEXC API, therefore, I need to define a procedure interface that will be used when calling the program.
Line 7: This is the file definition of the logical file. The file is defined as update, USAGE(*UPDATE), and input is inferred.
Line 8: This is the definition for a data area. I always like to have a way to end a "never ending program" in a controlled manner. If the data area contains the character 1 the program will end.
Line 9: This is the start of the first Do loop that will continuously loop unless the data area causes the logic to exit.
Line 10: The file is opened. I only added this to show its equivalent in SQL.
Line 11: The second, or inner, Do loop. This one is responsible for reading the unprocessed records from the file.
Line 12 – 15: I am not sure if I really need to describe this. The file is read and if there are no more records to process we exit this Do loop.
Line 16 – 18: I update the fields and then the file. By using the %FIELDS built in function only the fields listed are updated. The others in the file remain unchanged.
Line 19: This procedure would be where the data in the read record would be processed.
Line 21: When the second Do loop is left the file is closed.
Line 22: I pause the program for 60 seconds, by using the Delay Job command, DLYJOB, called by QCMDEXC.
Lines 23 – 26: After the pause I retrieve the value in the data area. If it is 1 then I exit the first Do loop, causing the program to end. If not then this Do loop returns to its top, line 9.
So what does my equivalent SQL RPG program look like:
01 **free 02 ctl-opt option(*nodebugio:*srcstmt:*nounref) ; 03 dcl-ds Record extname('INTERFACE') ; 04 end-ds ; 05 dcl-s StopPgm char(1) dtaara('STOP') ; 06 dow (1 = 1) ; 07 exec sql DECLARE C0 CURSOR FOR 08 SELECT * FROM INTERFACE 09 WHERE PROCESSED = ' ' 10 FOR UPDATE OF PROCESSED,PSTAMP 11 WITH NC ; 12 exec sql OPEN C0 ; 13 dow (2 = 2) ; 14 exec sql FETCH NEXT FROM C0 INTO :Record ; 15 if (SQLCOD <> 0) ; 16 leave ; 17 endif ; 18 exec sql UPDATE INTERFACE 19 SET PROCESSED = 'Y', 20 PSTAMP = CURRENT_TIMESTAMP 21 WHERE CURRENT OF C0 ; 22 RecordProcessing() ; 23 enddo ; 24 exec sql CLOSE C0 ; 25 exec sql CALL QSYS2.QCMDEXC('DLYJOB DLY(60)') ; 26 in StopPgm ; 27 if (StopPgm = '1') ; 28 leave ; 29 endif ; 30 enddo ; 31 *inlr = *on ; |
A lot of it is the same as the purely RPG program the differences are:
Lines 3 and 4: When I fetch the data from the file I will be fetching the entire row/record into this data structure. The easiest way to defined this externally defined data structure is by using the file in the EXTNAME keyword.
Lines 7 – 11: I don't need a logical file to select only the unprocessed records from the file. The declaration of the cursor does the same with the WHERE clause. On line 10 I could have just used FOR UPDATE, which allows for any of the fields in the file to be updated. By using FOR UPDATE OF only the field listed can be updated, a bit like the functionality of the %FIELDS built in function. Line 11 means that this cursor will not use commitment control. I could have used the SET OPTION COMMIT = *NONE SQL statement, but that turns off commitment control for all the files in this program. By using the WITH NC it is turned off only for this file.
Line 12: The cursor is opened.
Lines 14 – 17: The next row is fetched from the file into the data structure Record, as :Record is a variable within the program it has to start with the colon ( : ). If the SQL code is not zero then an error was encountered and we leave this Do loop. Usually this means end of file was encountered.
Lines 18 – 21: This SQL Update statement looks the same as many others I have mentioned in this web site, apart from line 21. WHERE CURRENT OF followed by the cursor's name means that the last record fetched by this cursor is the one that is updated.
Line 24: The cursor is closed.
Line 25: As this is a RPG program will SQL within it, I am using Db2 for i's version of QCMDEXC. Notice that there is no procedure definition for it, and I don't have to pass the length of the command string to it.
If you are going to update the current cursor you need to aware that your cursor:
- Cannot that contain joins (more than one file)
- Cannot use a View
- Definition cannot contain:
- GROUP BY
- HAVING
- DISTINCT
How do the two compare when it comes to speed performance? The SQL RPG program could be slower than the purely RPG one the first time the cursor is used, but eventually the Db2 for i engine will retain the access path to the file after a couple of times and the speed of processing of the two will be comparable. The more times the cursor is used you might see that eventually the SQL version will be ever so slightly faster.
I have noticed working on multiple partitions running IBM i 7.1 – 7.3 in my work in my opinion I have it that appears that the programs using SQL for database I/O do run slightly faster than those using the RPG file I/O.
In this scenario using a single row fetch might be best. In other situations you will want to use the multiple row fetches to speed up your programs.
This article was written for IBM i 7.3, and should work for earlier releases too.
Thanks mr simon, very interesting your post
ReplyDeleteI will practice this
The interesting thing about updating a cursor row using WHERE CURRENT OF is that after the UPDATE, the row is *still* locked. This is a different behavior than using native RPG I/O where an UPDATE releases the lock.
ReplyDeleteThis distinction may be important if program exits without explicitly closing the cursor (and compiled as CLOSQLCSR *ENDACTGRP) and the orphaned row lock causes another job to abend, waiting on the lock to release.
Ringer
This comment has been removed by the author.
DeleteAs an aside, there is no need to retrieve ALL columns of a record/row, it is not even necessary to retrieve the column to be updated.
ReplyDeleteExample:
Exec SQL Declare Cursor CsrC01
for Select Quantity from MyTable
Where ...
For Update Of Quantity, LastChange;
...
Exec SQL Update MyTable
Set Quantity = :NewQuantity,
LastChange = Current_Timestamp
Where Current Of Cursor CsrC01;
Birgitta
Birgitta, after seeing your example "Where Current Of Cursor CsrC01" I tried inserting "cursor" before cursor name, just out of curiosity. It gave me a syntax error. Since the IBM example did not have this.
DeleteJust curious :)
Was a typo! This happens if you write something by heard and do not check it.
DeleteThis comment has been removed by the author.
ReplyDeleteInteresting read! It's very helpful that you also mentioned that the update for will not work from join conditions. Do u have any suggestions - if we have a cursor with join defintion.
ReplyDeleteUpdate the columns in the tables using separate statements, one for each table that you joined.
Delete