Wednesday, April 10, 2019

Validating SQL statements

qsqschks api used to validate sql statements

This post is about an API I stumbled across when looking for another. The Syntax Check SQL Statement API, QSQCHKS, will validate any SQL statement passed to it. I can see this being useful when I build a SQL statement in a variable before executing it. If I was to validate the statement it would allow me identify any errors I may have made.

When I show my code, below, you might think it looks complicated, with several strangely names data structures, which is why I have placed it in its own procedure with just one parameter passed to it and one returned. K.I.S.S. .

I have created two procedures to achieve my goal: to return a character variable containing the SQL error message's text. To this end I have created two procedures, the first in RPG, the second in CL. I could have combined them into a single service program, but as this is an example I left them as modules, RPGMOD001 and CLMOD001, to be bound to the calling program at program creation (compile) time.

Let me start with the beginning of the RPG procedure, including the globally defined variables.

01  **free
02  ctl-opt nomain ;

03  /define GetMsgText
04  /define ValidateSql
05  /include mylib/devsrc,prototypes
06  /undefine ValidateSql
07  /undefine GetMsgText

08  /include qsysinc/qrpglesrc,qusec

Line 1: You know that I only create RPG using completely free RPG.

Line 2: As this is an external procedure that will be compiled to create a module it does not have a main procedure.

Lines 3 – 7: By defining certain compile options I copy in the procedure prototypes from the source member mentioned on line 5. I like to undefine after the code has been copied, it is not necessary to do so, it is just one of my foibles.

Line 8: This member copied from the QSYSINC library contains the standard API error data structure. I have put this in the golad definition section so that it can be used by all of the procedures in this source member.

The next section is the start of the procedure, and some of the definitions.

09  dcl-proc ValidateSql export ;
10    dcl-pi *n char(132) ;
11      Statement char(16322) value ;
12    end-pi ;

13    dcl-pr QSQCHKS extpgm ;
14      *n char(16322) ;  //SQL statement
15      *n int(10) ;      //Statement length
16      *n int(10) ;      //Number of records
17      *n char(10) ;     //Statement language
18      *n likeds(Qsq_Options) ;
19      *n likeds(Qsq_Statement_Info) ;
20      *n int(10) ;      //Size of DS
21      *n int(10) ;      //Number of records
22      *n likeds(QUSEC) ;
23    end-pr ;

24    dcl-s StmtLength int(10) ;
25    dcl-s NbrofRecords int(10) inz(1) ;
26    dcl-s Language char(10) inz('*NONE') ;
27    dcl-s DsSize int(10) inz(%size(Qsq_Statement_Info)) ;
28    dcl-s Filler int(10) ;
29    dcl-s FormattedMsgText char(132) ;

Line 9: This is the start of the procedure. The EXPORT is required as this procedure "exports" its results to a procedure (program) in a different module.

Lines 10 – 12: I never give my procedure interfaces names, therefore, I have to use *N in place of the name. This procedure will return a 132 character value, which is defined on the DCL-PI line. There is just one incoming parameter, which is the SQL statement from the calling programs/procedures.

Lines 13 – 23: This is the procedure prototype (parameters) for the QSQCHKS API, I have to use the EXTPGM keyword as this is a API program, as I am using the program's name as the prototype's name I do not need to give the name in the EXTPGM keyword. I never name the parameters in the prototype, and I have to use *N in place of the parameter names.

Lines 24 – 29: Definition of various variables. Line 27 is interesting as the variable is initialized with the value of the sixe of the Qsq_Statement_Info data structure.

These three data structures are used by the QSQCHKS API.

30    dcl-ds Qsq_Options qualified ;
31      NbrOfKeys int(10) inz(1) ;      //Number of keys
32      KeyNbr int(10) inz(1) ;         //Key number
33      KeyLength int(10) inz(10) ;     //Key length
34      KeyData char(10) inz('*SYS') ;  //Key data
35    end-ds;

36    dcl-ds Qsq_Statement_Info qualified ;
37      MsgfName char(10) ;   //Message file name
38      MsgfLib char(10) ;    //Message file library
39      Statements int(10) ;  //No. of statements processed
40      StatementsInfo likeds(Qsq_Statement_I) ;
41    end-ds ;

42    dcl-ds Qsq_Statement_I qualified ;
43      StmtLength int(10) ;     //Statement length
44      RcdStmtBegin int(10) ;   //Record statement begin
45      ColStmtBegin int(10) ;   //Column statement begin
46      RcdStmtEnd int(10) ;     //Record statement end
47      ColStmtEnd int(10) ;     //Column statement end
48      RcdStmtErr int(10) ;     //Record statement error
49      ColStmtErr int(10) ;     //Column statement error
50      MsgId char(7) ;          //Message id
51      SqlState char(5) ;       //SQL state
52      MsgLength int(10) ;      //Message length
53      MsgData char(100) ;      //Message data
54    end-ds ;

I started wanting to use the definitions for these data structures from the source member QSQCHKS in the source file QRPGLESRC in the library QSYSINC. But each of these data structures have a variable error at their ends, which is not defined in that source member. Therefore, I took the fixed format definitions of the data structures from that member, converted them to free format, included the variable parts of the definitions I needed, and gave the subfields more meaningful names. I named the data structures the same as the originals' long names from the source member.

Lines 30 – 35: In the documentation, there will be a link to it at the bottom of this post, these are the description as the option values. Fortunately for what I am doing here they can all be initialized with fixed values.

Lines 36 – 41: This is the statement information header data structure, the detailed is included at the end, line 40, as the next data structure. Basically this is where the message file, and library, information is returned from QSQCHKS.

Lines 42 – 54: The statement information detail is included in the header using a LIKEDS statement on line 40. The only two subfields I care about here are:

  1. MsgId (line 50)
  2. MsgData (line53)

Using these two subfields I can display the SQL error returned from the API.

After all the definition now we come to the part that does something.

55    StmtLength = %len(%trimr(Statement)) ;

56    if (StmtLength = 0) ;
57      return 'The local statement is empty';
58    endif;

59    clear QUSEI ;

60    QSQCHKS(Statement:
61            StmtLength:
62            NbrOfRecords:
63            Language:
64            Qsq_Options:
65            Qsq_Statement_Info:
66            DsSize:
67            Filler:
68            QUSEC) ;

69    if (QUSEI <> ' ') ;
70      return QUSEI;

       //SQL Error
71    elseif (Qsq_Statement_Info.StatementsInfo.MsgId <> ' ') ;
72      if (Qsq_Statement_Info.StatementsInfo.ColStmtErr > 0) ;
73        GetMsgText(Qsq_Statement_Info.StatementsInfo.MsgId:
74                   Qsq_Statement_Info.MsgfName:
75                   Qsq_Statement_Info.MsgfLib:
76                   Qsq_Statement_Info.StatementsInfo.MsgData:
77                   FormattedMsgText) ;

78        return FormattedMsgText ;
79      endif;
80    endif;

81    return 'OK' ;
82  end-proc ;

Line 55: I need to have the length of the SQL statement that will be validated.

Lines 56 – 58: If the incoming parameter is blank then a message is returned to the calling program.

Line 59: Initialize the part of the QUSEC data structure that will contain any system error messages returned from QSQCHKS.

Lines 60 – 68: Calling the API. I separated this statement so that each parameter was on its own line, to make it easier to understand which ones are used.

Lines 69 – 70: If a system error message is returned from QSQCHKS I return it to the calling program.

Line 71: If there is a message id in the statement information detail information then the SQL statement passed to the API is not valid. As all the data strucutures need to be qualified this makes for a long name for the subfield.

Line 72: If there is an error returned in the subfield in the statement information detail...

Lines 73 – 77: In the prototype, which I will show below I gave the CL module/procedure, CLMOD001, a more meaningful name, GetMsgText. The parameters used are:

  1. SQL message id
  2. Message file name
  3. Library the message file is in
  4. Message data returned from QSQCHKS
  5. The formatted message string returned from this procedure, which is returned from the procedure.

Line 78: I returned the formatted message string to the calling program.

Line 81: If there are no errors I return "OK" to the calling program.

The procedure prototypes included in RPGMOD001 by the code from lines 3 – 7 is:

01  /if defined(ValidateSql)
02  dcl-pr ValidateSql char(132) ;
03    *n char(16322) value ;
04  end-pr ;
05  /endif

06  /if defined(GetMsgText)
07  dcl-pr GetMsgText extproc('CLMOD001') ;
08    *n char(7) ;
09    *n char(10) ;
10    *n char(10) ;
11    *n char(80) ;
12    *n char(132) ;
13  end-pr ;
14  /endif

Lines 1: This defined matches to the defined statement in RPGMOD001, therefore, only the code between this /IF and the /ENDIF will be copied into RPGMOD001.

Lines 7 – 13: As I mentioned above the name I gave the procedure prototype for CLMOD001 is a more meaningful name. Therefore, I need to have CLMOD001 in a EXTPROC keyword, as it is a CL procedure in another module.

The CL procedure only has two executable commands.

01  PGM PARM(&MSGID &MSGF &MSGFLIB &MSGDATA &MSG)

02  DCL VAR(&MSGID) TYPE(*CHAR) LEN(7)
03  DCL VAR(&MSGF) TYPE(*CHAR) LEN(10)
04  DCL VAR(&MSGFLIB) TYPE(*CHAR) LEN(10)
05  DCL VAR(&MSGDATA) TYPE(*CHAR) LEN(80)
06  DCL VAR(&MSG) TYPE(*CHAR) LEN(132)

07  SNDPGMMSG MSGID(&MSGID) MSGF(&MSGFLIB/&MSGF) +
                MSGDTA(&MSGDATA)

08  RCVMSG PGMQ(*PRV) MSGQ(*PGMQ) RMV(*NO) MSG(&MSG)

09  ENDPGM

The purpose of this procedure is to convert the returned message data into a meaningful text.

Line 7: I take the message id and message data returned from QSQCHKS, and "write" the message to the program's message queue.

Line 8: I receive the message from the program message queue into the &MSG variable, and I now have a meaningful message I can return to the calling procedure, and in turn to the program that called that procedure.

I added both of these modules to a binding directory:

WRKBNDDIRE MYLIB/MYBNDDIR


Object       Type      Library
----------   --------  ----------
RPGMOD001    *MODULE   *LIBL
CLMOD001     *MODULE   *LIBL

The program to submit SQL statements looks like:

01  **free
02  ctl-opt main(Main) option(*srcstmt) dftactgrp(*no)
03            bnddir('MYLIB/MYBNDDIR') ;

04  /define ValidateSql
05  /include mylib/devsrc,prototypes
06  /undefine ValidateSql

07  dcl-s Statement char(16322) ;
08  dcl-s ReturnedText char(132) ;

09  dcl-proc Main ;
10    Statement = 'SELECT * FRMO PRODLIB.FILE1' ;
11    ReturnedText = ValidateSql(Statement) ;
12    dsply %subst(ReturnedText:1:50) ;

13    Statement = 'SELECT * FROM PRODLIB.FILE1' ;
14    ReturnedText = ValidateSql(Statement) ;
15    dsply %subst(ReturnedText:1:50) ;
16  end-proc ;

Lines 2 and 3: The control options tell me that:

  • The program has a Main procedure.
  • When the program is compiled it will use the source sequence numbers in the program.
  • The program cannot be in the default activation group as it calls procedures.
  • The given binding directory will be used to bind the modules to this program.

Lines 4 – 6: Here I only need to define the prototype for the RPG procedure.

Lines 7 and 8: These are the variables that will be used to be passed and returned from the RPG procedure.

Line 9: Start of the Main procedure. No procedure interface is needed as no parameters are passed to this program.

Line 10: I am moving an invalid SQL statement to the variable.

Line 11: I call the procedure passing to it the SQL statement I placed in the variable on line 10, and I am getting the returned value from the procedure.

Line 12: I display what has been returned from the procedure. The Display operation will not display all the 132 characters, so I just want the first 50.

Lines 13 – 15: The same as above, but with a valid SQL statement.

If I place a debug break in the RPG procedure I can see the values returned from QSQCHKS, and then what is displayed by the Display operation code.

For the first SQL statement I see:

QSQ_STATEMENT_INFO.STATEMENTSINFO.MSGID = 'SQL0104' 
QSQ_STATEMENT_INFO.MSGFNAME = 'QSQLMSG   '
QSQ_STATEMENT_INFO.MSGFLIB = 'QSYS      '
QSQ_STATEMENT_INFO.STATEMENTSINFO.MSGDATA = ' ?FRMO ?FROM INTO

DSPLY  Token FRMO was not valid. Valid tokens: FROM INTO.

As you can see the data returned in the Message Data subfield does not clearly explain the error. By using the CL procedure I get an understandable message returned.

With the second statement I see:

QSQ_STATEMENT_INFO.STATEMENTSINFO.MSGID = '       ' 
QSQ_STATEMENT_INFO.MSGFNAME = '          ' 
QSQ_STATEMENT_INFO.MSGFLIB = '          '  
QSQ_STATEMENT_INFO.STATEMENTSINFO.MSGDATA = '

DSPLY  OK

As the statement is valid, QSQCHKS returns blanks, which results in the procedure returning "OK".

 

You can learn more about the QSQCHKS API from the IBM website here.

 

This article was written for IBM i 7.3, and should work for earlier releases too.

6 comments:

  1. Good to know how the APIs work and how they can be called

    ... but why not simply running an SQL PREPARE Statement, checking the SQLCODE and receiving the error message information with GET DIAGNOSTICS.

    ReplyDelete
  2. STRSQL, paste SQL, F13, Statement processing *VLD , run

    Ringer

    ReplyDelete
    Replies
    1. Have you ever built a SQL statement in a variable and then used EXECUTE IMMEDIATE to execute it?
      That is when I think this API would be useful.

      Delete
  3. Stumbling upon stuff is a great way to learn, thanks for sharing, Simon.
    I've used STRSQL, changing the session attributes to *VLD from *RUN to check statements. Also, have used iNav RunSql, Run, Syntax Checker to verify my statements.
    I could see one use for this if you were creating a utility where you'd be dynamically creating the SQL and want to check it before getting an ugly error message.
    Thanks again, Simon, you keep us on our toes.

    ReplyDelete
  4. Occasionally. I prefer to use prepared SQL statements with parameter markers. Building a string from variables and running an EXECUTE IMMEDIATE opens the possibility to SQL injection.

    select * from employee_master
    where ID = '123' and city = 'Chicago'

    /* ID here is literally: 123' or 1=1 --
    select * from employee_master
    where ID = '123' or 1=1 -- and city = 'Chicago'

    Ringer

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.