I am using the word "program" in a generic way here. In reality what I am writing in this post can also be used for SQL packages (object type *SQLPKG) and service programs too.
What I wanted to know was which SQL statements executed within a program, and the tables and indexes these statements used. Fortunately, there is a command for that: PRTSQLINF, Print SQL Information. Before I show what this command can do, I need to have a table and program I can use to create the information that the command will return.
First I need a table. I know I will not win awards for originality of this table's name; I have called it TESTTABLE.
01 CREATE TABLE MYLIB.TESTTABLE 02 (FIRST_COLUMN FOR COLUMN "COL1" CHAR(10), 03 SECOND_COLUMN FOR COLUMN "COL2" INT) ; |
In my opinion it is a very simple with just two columns. Now I need to fill it. I am going to write 1 million rows to the table. FIRST_COLUMN will contain the row count, and SECOND_COLUMN will contain a random number. Why will this column contain random numbers? I am going to sort the table by SECOND_COLUMN and I want the result to not be close to the order it would be if I sorted by FIRST_COLUMN.
The program I created to do this uses the SQL RAND function to generate the random numbers.
01 **free 02 ctl-opt option(*nodebugio:*srcstmt) ; 03 dcl-s Counter int(10) ; 04 for Counter = 1 to 1000000 ; 05 exec sql INSERT INTO MYLIB.TESTTABLE VALUES(TO_CHAR(:Counter,'9G999G999'), ROUND(RAND()*100,0)) ; 06 endfor ; 07 *inlr = *on ; |
Line 1: Yes, my code is all free format RPG.
Line 2: My favorite control options.
Lines 3: This is the only variable I will be using in this program.
Line 4: Start of a For group that will be performed 1 million times.
Line 5: It occured to me that an SQL insert statement can contain any SQL function for the values of the columns inserted into the table.
For the first column I am using the TO_CHAR function to convert the integer value into a variable length character string, and adding thousand seperators to the string.
The second column needs to be a random number. I am going to use the RAND function to generate a random number. I want the random number to be an integer, therefore, I take the returned value from the RAND function, multiplying it by 10, and round it to zero decimal places.
When the program has finished I can check the contents of the table with the following SQL statement:
SELECT * FROM TESTTABLE LIMIT 10 |
I am limiting the results to the first ten columns which looks like:
FIRST_COLUMN SECOND_COLUMN ------------ ------------- 1 52 2 50 3 9 4 72 5 4 6 36 7 68 8 21 9 53 10 85 |
The values in SECOND_COLUMN look pretty "scattered" as opposed to being close together.
My thoughts on the best way to randomly access rows in TESTTABLE is not to use a cursor to retrieve all of the rows that fit a certain criteria. I decided to use the SQL equivalent of a RPG chain operation. I decided to put the SQL statement in a subprocedure that I would call multiple times. The main body of the program looks like:
01 **free 02 ctl-opt option(*nodebugio:*srcstmt) dftactgrp(*no) ; 03 dcl-pr GetData ; 04 *n packed(3:0) const ; 05 end-pr ; 06 GetData(26) ; 07 GetData(32) ; 08 GetData(101) ; 09 GetData(0) ; 10 *inlr = *on ; |
Lines 3 – 5: The procedure definition for the subprocedure, GetData, that will pass one parameter, the number to select from TESTTABLE. As I am going to be passing constant, rather than variables, I need to define the parameter with the CONST keyword.
Lines 6 – 9: The subprocedure is called multiple time. Three of the rows will be found, the other will not.
The procedure looks like:
11 dcl-proc GetData ; 12 dcl-pi *n ; 13 KeyField packed(3:0) const ; 14 end-pi ; 15 dcl-s Work packed(3:0) ; 16 exec sql SELECT SECOND_COLUMN INTO :Work 17 FROM MYLIB.TESTTABLE 18 WHERE SECOND_COLUMN = :KeyField 19 LIMIT 1 ; 20 dsply ('Key field : ' + %char(KeyField) + ' SQL code : ' + %char(SQLCOD)) ; 21 end-proc ; |
Lines 12 – 14: Procedure interface must match the procedure definition. I never bother to name my procedure interfaces as it is obvious which procedure this belongs to. Here I have named the parameter as I want to use it later.
Line 15: The value I find in the table will be returned into this variable. In reality I don't care about the returned value.
Lines 16 – 19: The SQL statement to return one row from a table.
Line 16: I have defined the column I want to return the result from, SECOND_COLUMN, and the variable into which I want it placed, :Work. The name of the variable starts with a colon ( : ) to denote that this is a RPG variable in the program.
Line 18: Here is where I say which row I want.
Line 19: I have to use the LIMIT 1 as I only want to return one result, as I do when I use a RPG chain operation, even though there maybe multiple rows with this value.
Line 20: I add this line so I can see which rows are found, and which are not.
After compiling the program I then run it, and the following is displayed:
DSPLY Key field : 26 SQL code : 0 DSPLY Key field : 32 SQL code : 0 DSPLY Key field : 101 SQL code : 100 DSPLY Key field : 0 SQL code : 0 |
All the rows with a SQL code of zero were found. The one row with the SQL code of 100 was not.
I have to had run the program at least once for PRTSQLINF to have information to retrieve.
PRTSQLINF OBJ(<library>/<object>) OBJTYPE(<object_type>) |
The allowed object types are:
- *PGM: Program
- *SQLPKG: SQL package
- *SRVPGM: Service program
The object type is optional and will default to *PGM if one is not given.
In this example I use the following:
PRTSQLINF OBJ(MYLIB/TESTPGM) |
If I look in my default output queue I find the following spool file:
Work with All Spooled Files Device or Total Opt File User Queue User Data Sts Pages _ TESTPGM RPGPGM MYOUTQ PRTSQLINF RDY 1 |
When I display the spool file it looks like:
5770SS1 V7R4M0 190621 Print SQL information Program MYLIB/TESTPGM Object name...............MYLIB/TESTPGM Object type...............*PGM CRTSQLRPGI OBJ(QTEMP/TESTPGM) SRCFILE(QTEMP/QSQLPRE) SRCMBR(TESTPGM) COMMIT(*NONE) OPTION(*SYS *NOEXTIND *SYSTIME *PERIOD) TGTRLS(V7R4M0) ALWCPYDTA(*OPTIMIZE) CLOSQLCSR(*ENDMOD) RDB(*LOCAL) DATFMT(*ISO) DATSEP('-') TIMFMT(*HMS) TIMSEP(':') DFTRDBCOL(*NONE) DYNDFTCOL(*NO) SQLPKG(MYLIB/TESTPGM) MONITOR(*USER) SQLCURRULE(*DB2) ALWBLK(*ALLREAD) DLYPRP(*NO) DYNUSRPRF(*USER) USRPRF(*NAMING) SRTSEQ(*HEX) LANGID(ENU) RDBCNNMTH(*DUW) TEXT('Test RPGLE program ') SQLPATH(*LIBL) DECRESULT(31 31 0) DECFLTRND(*HALFEVEN) CONACC(*DFT) STATEMENT TEXT CCSID(37) SELECT SECOND_COLUMN INTO : H FROM MYLIB . TESTTABLE WHERE SECOND_COLUMN = : H LIMIT 1 SQL4021 Access plan last saved on DD/DD/DD at TT:TT:TT. SQL4020 Estimated query run time is 0 seconds. SQL4017 Host variables implemented as reusable ODP. SQL4010 Table scan access for table 1. * * * * * E N D O F L I S T I N G * * * * |
The first part of the spool file contains all of the compile definitions that were used when to create this program.
I am interested in the following part:
SELECT SECOND_COLUMN INTO : H FROM MYLIB . TESTTABLE WHERE SECOND_COLUMN = : H LIMIT 1 SQL4021 Access plan last saved on DD/DD/DD at TT:TT:TT. SQL4020 Estimated query run time is 0 seconds. SQL4017 Host variables implemented as reusable ODP. SQL4010 Table scan access for table 1. * * * * * E N D O F L I S T I N G * * * * |
The SQL statements executed by this program are listed immediately followed by a series of SQL messages related to each statement. There is a link at the bottom of this post to the page in the IBM documentation website that lists all of these possible messages and what they mean.
The interesting message here is the SQL4010. This shows that a table scan was used to locate the result for this statement. If this had been a select statement selecting the majority of the rows from the table then this could be efficient. In this case I only wanted one row, and the table scan is inefficient.
How to overcome this inefficiency? Build an index over TESTABLE:
01 CREATE INDEX MYLIB.TESTTABLEX 02 ON MYLIB.TESTTABLE (SECOND_COLUMN) ; |
My new index, TESTTABLEX, is keyed by SECOND_COLUMN.
I call the program again to refresh the data now with the index present.
SQL4021 Access plan last saved on DD/DD/DD at TT:TT:TT. SQL4020 Estimated query run time is 0 seconds. SQL4017 Host variables implemented as reusable ODP. SQL4008 Index TESTTABLEX used for table 1. SQL4026 Index only access used on table number 1. SQL4011 Index scan-key row positioning used on table 1. |
This time there is no table scan. The message SQL4008 shows that the new index was used, and the following messages show how the index was used.
Rather than just getting the information for one very simple program what would I do if I wanted the results for all the SQL RPG programs, service programs, and packages in a library. Looking in various SQL views and table functions I found the table function OBJECT_STATISTICS gives me the infomration I desire.
In this example CL program I am getting the SQL information for objects in the library PRODLIB. I am showing the program in two parts to make it easier to follow:
01 DCL VAR(&LOOP) TYPE(*LGL) VALUE('1') 02 DCLF FILE(QTEMP/OUTFILE) 03 DLTF FILE(QTEMP/OUT*) 04 MONMSG MSGID(CPF0000) 05 RUNSQL SQL('CREATE TABLE QTEMP.OUTFILE + 06 (LIB,OBJ,TYPE,ATTRIB) + 07 AS(+ 08 SELECT CAST(OBJLIB AS CHAR(10)),+ 09 CAST(OBJNAME AS CHAR(10)),+ 10 CAST(OBJTYPE AS CHAR(10)),+ 11 USER_DEFINED_ATTRIBUTE + 12 FROM TABLE(QSYS2.OBJECT_STATISTICS(''PRODLIB'',+ 13 ''SRVPGM PGM SQLPKG'')) + 14 WHERE (OBJTYPE = ''*PGM'' + 15 AND USER_DEFINED_ATTRIBUTE = ''SQLRPI'') + 16 OR OBJTYPE IN (''*SRVPGM'',''*SQLPKG'')) + 17 WITH DATA') + 18 COMMIT(*NC) |
Line 2: The file OUTFILE will contain the list of all the objects that contain SQL statements.
Lines 3 and 4: My work files start with the characters "OUT", therefore, the delete statement deletes both of them from QTEMP.
Lines 5 – 18: I am using one of my favorite CL commands RUNSQL to execute a SQL statement that will build the table OUTFILE in QTEMP based upon the fields and selection criteria of the Select statement.
Lines 8 – 11: I have just selected the columns for library, object name, object type, and attribute for the new table. I have cast certain columns to convert them from variable to fixed length.
Lines 12 and 13: I am using the OBJECT_STATISTICS table function, and I am passing two parameters to it:
- Name of the library : PRODLIB
- List of object types I want in my results : SRVPGM, PGM, and SQLPKG
Lines 14 – 16: This is the selection criteria to only return the results I desire. Lines 14 and 15 only selects programs that have the attribute SQLRPGI, all other program objects will not be selected. Line 16 ensure that all service programs and SQL packages are included.
The next part of the program reads the data from the output table and generates the data I want with the PRTSQLINF command.
19 CRTPF FILE(QTEMP/OUTSPLF) RCDLEN(133) 20 DOWHILE COND(&LOOP) 21 RCVF 22 MONMSG MSGID(CPF0000) EXEC(LEAVE) 23 SELECT 24 WHEN COND(&TYPE = '*PGM') + 25 THEN(PRTSQLINF OBJ(&LIB/&OBJ) OBJTYPE(*PGM)) 26 WHEN COND(&TYPE = '*SRVPGM') + 27 THEN(PRTSQLINF OBJ(&LIB/&OBJ) OBJTYPE(*SRVPGM)) 28 OTHERWISE CMD(PRTSQLINF OBJ(&LIB/&OBJ) OBJTYPE(*SQLPKG)) 29 ENDSELECT 30 CPYSPLF FILE(&OBJ) TOFILE(QTEMP/OUTSPLF) + SPLNBR(*LAST) MBROPT(*ADD) 31 DLTSPLF FILE(&OBJ) SPLNBR(*LAST) 32 ENDDO |
Line 19: I am creating a file that will contain the data from the spool files generated by PRTSQLINF.
Line 20: I have this Do loop that I am using to read all of the OUTFILE, which contains the list of SQLRPGLE source members.
Line 21: The receive the next row from the table.
Line 22: If the receive encounters an error, most commonly end of file, the Do loop is exited.
Line 23: I am using the Select command so I do not have nested ifs.
Lines 24 and 25: Is only performed if the object type from OUTFILE is a program.
Lines 26 and 27: If the object is a service program.
Line 28: If it is neither of the above is true, the object has to be a SQL package.
Line 30: I copy the generated spool file into the file OUTSPLF using the Copy Spool File command, CPYSPLF.
Line 31: I delete the spool file. I do this so I do not end up with an output queue containing many spool files generated by the PRTSQLINF command.
The end result is that the file OUTSPLF will contain the SQL information for all the programs, SQL packages, and service programs, rather than just for one. I can now search the file for any information I want, a much nicer situation rather than having a collection of spool files to search.
I will use PRTSQLINF to see what a SQLRPGLE program, service program or SQL package does, or has done. I would not use it to make the "hard" decision as to what indexes need to be built. But with the information it gives me and some investigation I can certainly make a good assumption about which indexes could be built. I would also use the Index Advisor in ACS or the Index Advice view, SYSIDXADV, and use the information returned by the job log too.
You can learn more about this from the IBM website:
This article was written for IBM i 7.4, and should work for some earlier releases too.
No comments:
Post a Comment
To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.
Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.