The question was an interesting one: How is it possible, in SQL, to list all of the files in the IFS that contain a certain string?
There are a number of SQL table functions that allow me to get to information about the files in IFS folders, and others to "read" their contents. This is a good excuse to combine the two.
I am going to place three files in my personal IFS folder, MyFolder, then search those for the desired string. I created three DDL tables in my library, MYLIB. OK, I created one table and then cloned it two times to make the three tables. I populated them with just a few rows of data. As the tables are identical, I can easily combine the results from the three in one SQL statement using the UNION clause:
SELECT 'TABLE1',A.* FROM TABLE1 A UNION SELECT 'TABLE2',B.* FROM TABLE2 B UNION SELECT 'TABLE3',C.* FROM TABLE3 C ORDER BY 1,2 |
I have added an empty row after each table's results to make it easier to understand the results:
00001 FIRST SECOND THIRD FOURTH ------- ----- --------- -------------- ------------------ TABLE1 1 TEST SOMETHING SOMETHING MORE TABLE1 2 SOMETHING TEST SOMETHING MORE TABLE1 3 SOMETHING SOMETHING MORE TEST TABLE1 4 SOMETHING SOMETHING MORE YET MORE SOMETHING TABLE2 1 NOT HERE SOMETHING SOMETHING MORE TABLE2 2 SOMETHING NOT HERE SOMETHING MORE TABLE2 3 SOMETHING SOMETHING MORE NOT HERE TABLE3 1 SOMETHING SOMETHING MORE YET MORE SOMETHING TABLE3 2 SOMETHING SOMETHING MORE TEST TABLE3 3 SOMETHING TEST SOMETHING MORE TABLE3 4 TEST SOMETHING SOMETHING MORE |
I am going to be searching for the string "TEST" in these three tables. It is present in various columns of TABLE1 and TABLE3, but not in TABLE2.
I copied the content of these three tables using the Copy To Import File command, CPYTOIMPF:
CPYTOIMPF FROMFILE(MYLIB/TABLE1) + TOSTMF('/home/MyFolder/TABLE1.csv') + FROMCCSID(37) + STMFCCSID(*PCASCII) + RCDDLM(*CRLF) CPYTOIMPF FROMFILE(MYLIB/TABLE2) + TOSTMF('/home/MyFolder/TABLE2.csv') + FROMCCSID(37) + STMFCCSID(*PCASCII) + RCDDLM(*CRLF) CPYTOIMPF FROMFILE(MYLIB/TABLE3) + TOSTMF('/home/MyFolder/TABLE3.csv') + FROMCCSID(37) + STMFCCSID(*PCASCII) + RCDDLM(*CRLF) |
Depending upon the configuration of your IBM i and the CCSID it uses you might need to change these commands to fit your needs.
I can check that the files were copied to my folder using the IFS_OBJECT_STATISTICS table function:
SELECT PATH_NAME,OBJECT_TYPE FROM TABLE(QSYS2.IFS_OBJECT_STATISTICS('/home/MyFolder','NO','*STMF')) |
I have passed the table function:
- The path to folder I want to list
- Do not list any subfolders
- Only return steam files. The CSV files I generated are considered stream files in the IFS
The results show me that my three tables have been successfully copied to the IFS:
PATH_NAME OBJECT_TYPE -------------------------- ----------- /home/MyFolder/TABLE1.csv *STMF /home/MyFolder/TABLE2.csv *STMF /home/MyFolder/TABLE3.csv *STMF |
Now I can generate a list of files in my IFS folder, I now need to be able to "read" their contents. Fortunately there is. There are multiple similar SQL table functions to do this. The one I normally use is IFS_READ.
SELECT * FROM TABLE(QSYS2.IFS_READ('/home/MyFolder/TABLE1.csv')) |
The only parameter I needed to give this table function is the path name of the file I want to read. The results look like:
LINE_NUMBER LINE ----------- --------------------------------------------------------- 1 1 ,"TEST","SOMETHING","SOMETHING MORE" 2 2 ,"SOMETHING","TEST","SOMETHING MORE" 3 3 ,"SOMETHING","SOMETHING MORE","TEST" 4 4 ,"SOMETHING","SOMETHING MORE","YET MORE SOMETHING" |
The line number is generated by the table function.
I am only interested in the rows that contain the string "TEST", therefore, I can refine my search by changing the Select statement:
SELECT * FROM TABLE(QSYS2.IFS_READ('/home/MyFolder/TABLE1.csv')) WHERE LINE LIKE '%TEST%' |
By using the where clause with the wildcard "%TEST%" I am returned only those rows that contain that string:
LINE_NUMBER LINE ----------- ------------------------------------------- 1 1 ,"TEST","SOMETHING","SOMETHING MORE" 2 2 ,"SOMETHING","TEST","SOMETHING MORE" 3 3 ,"SOMETHING","SOMETHING MORE","TEST" |
In the interest of keeping things K.I.S.S. I decided to create the final statement as a Common Table Expression, CTE:
-- Get list of files from IFS folder 01 WITH T1 (PATH_NAME) 02 AS 03 (SELECT PATH_NAME FROM TABLE(QSYS2.IFS_OBJECT_STATISTICS('/home/MyFolder', 04 'NO','*STMF'))), -- See if any of those files contain string "TEST" 05 T2 (PATH_NAME,LINE_NUMBER) 06 AS 07 (SELECT T1.PATH_NAME,B.LINE_NUMBER FROM T1, 08 LATERAL 09 (SELECT LINE_NUMBER FROM TABLE(QSYS2.IFS_READ(T1.PATH_NAME)) 10 WHERE LINE LIKE '%TEST%') B) 11 SELECT * FROM T2 ORDER BY 1,2; |
Lines 1 – 4: In the first part of the CTE I am creating a virtual table that contains a list of all the files ("*STMF") in my folder, and not in any subfolders it may contain. As there is a second part of the CTE the first part must end with a comma ( , ).
Lines 5 – 10: In this second part I am taking the list of files and then checking to see if any contain the string "TEST". This I do by joining the view, T1, to the table function, IFS_READ, with a LATERAL. The results of the lateral are place in the virtual table T2. As only the SELECT line follows this part of the CTE does not end with a comma.
Line 11: Last, but not least, is the Select statement that is used to show the results from, in this case, the virtual table T2.
My results show the path name of the file that has rows that contain "TEST" and the line number of the row that does.
PATH_NAME LINE_NUMBER -------------------------- ----------- /home/MyFolder/TABLE1.csv 1 /home/MyFolder/TABLE1.csv 2 /home/MyFolder/TABLE1.csv 3 /home/MyFolder/TABLE3.csv 2 /home/MyFolder/TABLE3.csv 3 /home/MyFolder/TABLE3.csv 4 |
This article was written for IBM i 7.4, and should work for some earlier releases too.
Simon, great read and thanks for sharing. ‘grep’ in the shell tool can also give you some of the same results. It’s great to have so many tools to get the job completed. Thanks for sharing.
ReplyDeleteAnd i think it would be a lot faster.
DeleteBut it's great that and how Simon shows that more and more is possible. (Although the shown technique is not always the way one should really go.)
Thank you forward
ReplyDeletehello simon, is there an easy way to delete Files in IFS-Folder with SQL? for example alle files in IFS-Folder older than two month.
ReplyDeleteThere is not a command that allows you to delete with a date range in it.
DeleteI would try using the QCMDEXC scalar function. If you add it to your SQL Select statement with the DEL or RMVLNK command it would delete that row in the results. Many rows = many deletes.
Very useful. I have users ask for this all the time.
ReplyDelete