Pages

Tuesday, January 9, 2024

Creating Excel spreadsheets with SQL

One of the more eye-catching additions to Db2 for i in the latest round of Technology Refreshes, IBM i 7.5 TR3 and 7.4 TR9, was a SQL scalar function that would create a Microsoft Excel compatible spreadsheet in the IFS.

For years we have been able to create Common Separated Variable length file, CSV, by copying data to the IFS. This can be opened using Microsoft Excel and then saved as an Excel compatible spreadsheet. This is the first time an easy-to-use method has been provided, as part of native IBM i, to allow us to create an Excel in the IFS from data in the Db2 database.

The new scalar function is called GENERATE_SPREADSHEET and is found in the library SYSTOOLS. It has a number of parameters that need to be passed to it:

  • PATH_NAME:  The location and file name, without extension, of where the file will be created
  • SPREADSHEET_QUERY:  The SQL statement that selects the data to be sent to the IFS file. Cannot be used if the following two parameters are given
  • LIBRARY_NAME and FILE_NAME:  If you want to dump the entire contents of a file to the spreadsheet you can use these two parameters in place of using a SQL statement in the above parameter
  • SPREADSHEET_TYPE:  What format do you want your output file to be in:
    • xlsx:  Microsoft Excel format
    • ods:  OpenDocument spreadsheet format
    • csv:  CSV format
    • If not given CSV is the default
  • COLUMN_HEADINGS:  What headings do you want the columns in the spreadsheet to have:
    • COLUMN:  Column names are used
    • LABEL:  Column labels are used
    • NONE:  No headings
    • If not given no column labels are added

The SQL statement looks like:

01  VALUES SYSTOOLS.GENERATE_SPREADSHEET(PATH_NAME => '/home/MyFolder/MyFile',
02                                       SPREADSHEET_QUERY => 'select * from mylib.file',
03                                       SPREADSHEET_TYPE => 'xlsx',
04                                       COLUMN_HEADINGS => 'COLUMN')

Line 1: This is the path where the file will be created, or it will overlay an existing file with the same name. Notice that no extension is given for the file.

Line 2: This is the SQL statement that generates the data that will be copied into the spreadsheet. I have no idea why I chose to show this example in lower case. Upper case works equally well.

Line 3: I want an Excel compatible file. This is appended to the path name. In other words, my file will be called: MyFile.xlsx

Line 4: I want the column headings to be the column names from the file.

When executed GENERATE_SPREADSHEET returns a return code. "1" means successful, anything else means it was not.

As I am copying all the records from the file to the spreadsheet I could use this statement instead:

01  VALUES SYSTOOLS.GENERATE_SPREADSHEET(PATH_NAME => '/home/MyFolder/MyFile',
02                                       LIBRARY_NAME => 'MYLIB',
03                                       FILE_NAME => 'FILE',
04                                       SPREADSHEET_TYPE => 'xlsx',
05                                       COLUMN_HEADINGS => 'LABEL')

Lines 2 and 3: These two lines replace the Spreadsheet query parameter in the previous statement.

Line 5: I have decided I want the columns' labels as their headings, rather than the columns' names.

To create an ODS format file I would just need to change the Spreadsheet type, on line 3:

01  VALUES SYSTOOLS.GENERATE_SPREADSHEET(PATH_NAME => '/home/MyFolder/MyFile',
02                             SPREADSHEET_QUERY => 'select * from mylib.file',
03                             SPREADSHEET_TYPE => 'ods',
04                             COLUMN_HEADINGS => 'COLUMN')

For a CSV I just need to remove the Spreadsheet type line:

01  VALUES SYSTOOLS.GENERATE_SPREADSHEET(PATH_NAME => '/home/MyFolder/MyFile',
02                             SPREADSHEET_QUERY => 'select * from mylib.file',
03                             COLUMN_HEADINGS => 'COLUMN')

If you run these you will notice that it takes some time for the statement to finish. The spreadsheets are created running a Java program, I will give the statement at the end of this post, and this takes time to complete.

If you execute any of the above statements using the STRSQL command you will be presented with the Java UI. You will have to wait until the message "Press ENTER to end terminal session." is displayed. You can then press Enter to return to the STRSQL interface. This is another reason why ACS's Run SQL Scripts is the better tool.

I can check to see if the files have been created either by using ACS's "Integrated File System" tool, or I can use the IFS Object Statistics table function:

01  SELECT PATH_NAME,OBJECT_TYPE,DATA_SIZE
02    FROM TABLE(QSYS2.IFS_OBJECT_STATISTICS('/home/MyFolder/','NO','*STMF'))

Which returns to me the three spreadsheet files I created:

                            OBJECT   DATA
PATH_NAME                   _TYPE    _SIZE
--------------------------  ------   -----
/home/MyFolder/MyFile.csv   *STMF       96
/home/MyFolder/MyFile.ods   *STMF     1570 
/home/MyFolder/MyFile.xlsx  *STMF     3578

While showing examples in Run SQL Scripts demonstrates how to use this, the useful example is to include it in a program. I decided to use a RPG program:

01  **free
02  dcl-s ReturnCode int(3) ;

03  exec sql SET :ReturnCode =
04               SYSTOOLS.GENERATE_SPREADSHEET(
05                  PATH_NAME => '/home/MyFolder/data',
06                  SPREADSHEET_QUERY =>
07                        'SELECT A.NAME,B.DATA +
08                           FROM MYLIB.PERSON A +
09                           LEFT JOIN MYLIB.DETAILS B +
10                             ON A.NAME = B.KEY +
11                          WHERE NAME = ''SIMON''',
12                  SPREADSHEET_TYPE => 'xlsx',
13                  COLUMN_HEADINGS => 'LABEL') ;

14  dsply %char(ReturnCode) ;

Line 1: I only write modern RPG.

Line 2: Definition for an integer variable to contain the return code value.

Line 3: I am using a SQL Set statement to capture the return code from GENERATE_SPREADSHEET into the RPG variable ReturnCode.

Lines 4 13: This is pretty much the same as I showed above. The only difference is the SQL statement, lines 6 - 11. Here I am joining the PERSON, which contains one record per person, to the DETAILS file, which could contain zero or many records for the person. Using a LEFT OUTER JOIN, line 9, I will return one result for each matching record in DETAILS where the person code is 'SIMON'. As SIMON is a string it needs to be surrounded by double apostrophes ( ' ), and an additional apostrophe is needed after SIMON as it is the end of the SQL statement string parameter.

Line 14: I use the Display Operation Code, DSPLY, to display the return code returned from GENERATE_SPREADSHEET.

After compiling the program when I call it the Java UI is displayed while the Java program called by the SQL scalar function is executed creating the spreadsheet. Like I did before I have to wait until I see the "Press ENTER to end terminal session." message to press Enter. The Java environment is exited, and control is returned to the program.

IMHO that is a nasty user experience. When I have encountered this before I found that if I redirect the STDOUT to a file the Java UI is not displayed, and I don't have to press Enter to exit it. I need to make some additions to my program:

01  **free
02  dcl-s ReturnCode int(3) ;

03  exec sql DROP TABLE IF EXISTS QTEMP.STDOUT ;

04  exec sql CALL QSYS2.QCMDEXC('OVRDBF FILE(STDOUT) TOFILE(QTEMP/STDOUT) +
                                          OVRSCOPE(*JOB)') ;

05  exec sql SET :ReturnCode =
                 SYSTOOLS.GENERATE_SPREADSHEET(
                    PATH_NAME => '/home/MyFolder/data',
                    SPREADSHEET_QUERY =>
                          'SELECT A.NAME,B.DATA +
                             FROM MYLIB.PERSON A +
                             LEFT JOIN MYLIB.DETAILS B +
                               ON A.NAME = B.KEY +
                            WHERE NAME = ''SIMON''',
                    SPREADSHEET_TYPE => 'xlsx',
                    COLUMN_HEADINGS => 'LABEL') ;

06  exec sql CALL QSYS2.QCMDEXC('DLTOVR FILE(STDOUT) LVL(*JOB)') ;

07  dsply %char(ReturnCode) ;

The additions to the program are:

Line 3: The Standard Output, STDOUT will be directed to a disk file. If one already exists I use the DROP SQL statement to delete it. By using the IF EXISTS if the file does not exist the statement will not error.

Line 4: I use the Override Database File command, OVRDBF, to redirect the STDOUT to a file. I also need to give this an override scope of *JOB.

Line 6: After the spreadsheet has been generated I want to delete the override for STDOUT.

After compiling this when I run it the Java UI is not displayed.

The file created by overriding STDOUT is a source file. The data listing what happened during the running of the Java program is listed within the SRCDTA field. I can display using the following:

01  SELECT SRCDTA FROM QTEMP.STDOUT

Which returns the following:

SRCDTA
-------------------------------
The java.version property wa...
Transfer request is complete.
Transfer statistics: 00:00:06
Rows transferred: 3

Rather than use a string for the SQL statement I can put my statement in a variable and use that in GENERATE_SPREADSHEET. Below is an example of doing that:

01  **free
02  dcl-s ReturnCode int(3) ;
03  dcl-s SQL varchar(250) ;

04  SQL = 'SELECT A.NAME,B.DATA +
             FROM MYLIB.PERSON A +
               LEFT JOIN MYLIB.DETAILS B +
               ON A.NAME = B.KEY +
            WHERE NAME = ''PATTY''' ;

05  exec sql DROP TABLE IF EXISTS QTEMP.STDOUT ;

06  exec sql CALL QSYS2.QCMDEXC('OVRDBF FILE(STDOUT) TOFILE(QTEMP/STDOUT) +
                                          OVRSCOPE(*JOB)') ;

07  exec sql SET :ReturnCode =
08           SYSTOOLS.GENERATE_SPREADSHEET(
09                PATH_NAME => '/home/MYLIB/data',
10                SPREADSHEET_QUERY => :SQL,
11                SPREADSHEET_TYPE => 'xlsx',
12                COLUMN_HEADINGS => 'LABEL') ;

13  exec sql CALL QSYS2.QCMDEXC('DLTOVR FILE(STDOUT) LVL(*JOB)') ;

14  dsply %char(ReturnCode) ;

Below is what is different from the previous program.

Line 3: I have defined a variable to contain the SQL statement I will be creating.

Line 4: This SQL is similar to the last one, except this is for Patty.

After compiling I run the program and I look in the STDOUT file in QTEMP, and its contents are:

SRCDTA
-------------------------------
The java.version property wa...
Transfer request is complete.
Transfer statistics: 00:00:07
Rows transferred: 1

It is as easy as that.

I did manage to find the Java program that is called behind the scenes:

01  JAVA CLASS('/QIBM/ProdData/Access/ACS/Base/acsbundle.jar')
02       PARM('/plugin=cldownload'
03            '/system=localhost'
04            '/sql=select * from mylib/person'
05            '/clientfile=java_out.xlsx'
06            '/colheadings=1'
07            '/usecollabels')

Line 3: localhist is my default IFS folder, /home/MyFolder.

Line 4: The SQL statement to execute.

Line 5: Name of the file to be created.

Line 6: I want column headings in my spreadsheet.

Line 7: use the columns' labels for the headings.

The GENERATE_SPREADSHEET is a lot easier to use than the Java program's parameters.

 

You can learn more about the GENERATE_SPREADSHEET SQL Scalar function from the IBM website here.

 

This article was written for IBM i 7.5 TR3 and 7.4 TR9.


Addendum

If you want to run the above in a batch job read this.

30 comments:

  1. Can you create combined columns longer than 32767 using a Join? Is there a limit to individual cell width?

    ReplyDelete
  2. Thanks for sharing, it's great to know, and I am sure many will find this technique valuable. I've been using a functionality to run DB2 for i SQL statements directly from Excel spreadsheet. In order to do so, one must add an IBM i Access ODBC Driver to the list of User Data Sources in their PC. Then, in Excel, go to Data/Get Data/From Other Sources/From ODBC. From the drop-down menu that would pop up, select the data source (you could have many targeting different DB2 for i databases, naming conventions, default schemas, etc.) and paste your SQL statement under the Advanced Options text box and click OK - as simple as that. I can't post the screenshots due to limited space here, but please let me know of interest, and we'll find a way to share that.

    ReplyDelete
    Replies
    1. I know of the method you describe.
      All of the files I generate for Excel, etc., are created by batch jobs. And the generated spreadsheet can be placed in an IFS folder, a Windows share folder, or emailed to the person who requested the report.

      Delete
    2. BTW, you can specify a SQL also with the stock client ACS data transfer. This method doesn't require additional ODBCs on the driver, apart of ACS itself of course.

      Delete
  3. Amazing.. This is very useful, thank you.

    ReplyDelete
  4. Glad to know that they implemented a SQL wrapper over the ACS, more convenient to use.
    Some I noticed don't realize that the ACS client can be used in windows but in IBMi as well, being java.
    Previously I've called /plugin=cldownload directly from a CL wrapper to generate in batch mass xlsx distributions. Nothing fancy, but good enough without external dependencies.

    ReplyDelete
  5. Hi Simon, as always thank you for your endless support and contributions to RPG programmers around the world.
    With the above new functionality that "creates an Excel in the IFS from data in the Db2 database", is there an equivalent functionality from excel to database file, the other way around?
    Thank you very much!

    ReplyDelete
    Replies
    1. Alas there is not. Convert XLS to CSV and then CPYTOIMPF to a file.

      Delete
    2. There is no sql wrapper. But you can just use the opposite /PLUGIN=upload of ACS called from CL ...don't know about error handling...

      Delete
  6. In your examples you use
    exec sql CALL QSYS2.QCMDEXC('DROP TABLE IF EXISTS QTEMP.STDOUT') ;

    Shouldn't it be
    exec sql DROP TABLE IF EXISTS QTEMP.STDOUT;

    ReplyDelete
    Replies
    1. But the qsys2.qcmdexc will fail, there are no command called drop, isnt that right ? it fails on my system.

      Delete
    2. You are correct. That is a mistake on my part.
      If I wanted it as a DROP I could just have a SQL statement to do it in the RPG code.
      Thanks for reporting this.

      Delete
  7. Great share Simon! Thanks! It would be great if they were release a read xls(x) function too, to avoid java stuff ... all HSSF/XSSF in one SQL function.

    ReplyDelete
  8. Hi Simon, wondering does this function could create a multi tab excel?

    ReplyDelete
  9. Hi Simon,
    Thank you for the examples. I am getting an error in my RPG - Character conversion between CCSID 65535 and CCSID 1200 not valid. I don't see any way to specify a CCSID conversion. Any ideas on how to make it to convert or set CCSID to 1208 prior to the SQL execution?
    Thank you again!

    ReplyDelete
    Replies
    1. Not sure what to try that will definitely fix it.
      I would try what is described here.

      Delete
    2. Hi, it seems that I have the same problem with conversion between CCSID 65535 and CCSID 1200. I dont know how it can fix, if I run the same SQL statement from SQL script on ACS it works fine.

      Delete
    3. You'll have to use the CAST statement to change the CCSID of the output of the column. ACS is forgiving and does those translations for you.

      Delete
    4. Thank you Simon. I can't see how to use cast in my select though. I have SPREADSHEET_QUERY => 'Select * From TABLE_NAME, so does it mean that to use cast I need to list all the columns and cast it or is there a better way? Thank you again!

      Delete
    5. CCSID 65535 is the worse CCSID to have as it means there is no CCSID defined.

      As you point out you cannot use a CAST with 'SELECT *'.

      When I have encountered a 'SELECT *' with CCSID 65535 like this I build a View over the Table or File. Within I list all the fields/columns and CAST, to CCSID 37, each one.

      Delete
  10. Hello Simon, would it be possible to format individual cells. For example, colour or font?

    ReplyDelete
  11. Hello is there something simular to add data to a new sheet of an existing excel?

    ReplyDelete
    Replies
    1. My understanding is there are no plans to allow the creation of multiple sheets in a spreadsheet.

      Delete
  12. Hello Simon, this is great and thank you for sharing. I am trying to use a sql statement that has CTE's. It seems to execute but does not produce the spreadsheet. Are the CTE's causing a problem? There are no errors in the joblog. I am running this in ACS.

    ReplyDelete
  13. Values into clause can be used in SQLRPGLE also, exec sql values SYSTOOLS.GENERATE_SPREADSHEET(
    PATH_NAME => '/home/MYLIB/data',
    SPREADSHEET_QUERY => :SQL,
    SPREADSHEET_TYPE => 'xlsx',
    COLUMN_HEADINGS => 'LABEL')
    INTO :returnCode
    where returnCode is a zoned (because returnCode can be negative) variable, I use zoned(5) for this.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.