With the latest IBM i Technical Refreshes to 7.2, TR6, and 7.3, TR2, a number regular expressions functions were added to Db2 for i (SQL). In this post I am going to describe the REGEXP_COUNT function.
Regular readers of this blog will know that I have already written about the REGEXP_LIKE. Rather than repeat a lot of the things I wrote about in the post I am just going to refer you to it at various time in this one.
The format for this function is as follows:
REGEXP_COUNT(source-string, search-pattern, start-position, regexp-flags) |
Only the first two parameters: source-string and search-pattern are mandatory, the other two are optional.
Let me start with some simple examples. In the example below I want to know how many times the search pattern 'rpg' appears in the search string.
SELECT REGEXP_COUNT('RPGPGM.COM is the best website for rpg examples', 'rpg') FROM SYSIBM.SYSDUMMY1 |
I have to use the dummy file SYSIBM.SYSDUMMY1 as this is a Select statement where the source string is not a row/field in a table/file.
My result is:
REGEXP_COUNT 1 |
It is only 1 as I searched for the lower case pattern 'rpg' that only occurs once in the source string. If I wanted my result to ignore case I need the 'i' in the regexp flags. For more example of ignoring case see the post about the REGEXP_LIKE.
SELECT REGEXP_COUNT('RPGPGM.COM is the best website for rpg examples', 'rpg','i') FROM SYSIBM.SYSDUMMY1 |
Now I get the count of both occurrences of 'rpg'.
REGEXP_COUNT 2 |
Notice that I used only the first, second, and fourth parameters, I did not have to give a value for the starting position parameter. Db2 is smart enough to know I had omitted it. If I was to include it the statement would look like.
SELECT REGEXP_COUNT('RPGPGM.COM is the best website for rpg examples', 'rpg',1,'i') FROM SYSIBM.SYSDUMMY1 |
I can use the start position to start my search for the search pattern in the second position:
SELECT REGEXP_COUNT('RPGPGM.COM is the best website for rpg examples', 'rpg',2,'i') FROM SYSIBM.SYSDUMMY1 |
I only return a count of 1 for my result, as the first "RPG" is ignored as it starts in the first position.
REGEXP_COUNT 1 |
I can even take the table I used in the previous post and count the number of vowels in the players' names who have a "t" in their last names:
SELECT PID,FNAME || ' ' || LNAME AS NAME, REGEXP_COUNT(FNAME || LNAME),'[aeiou]','i') AS VOWELS FROM PERSON WHERE REGEXP_LIKE(LNAME,'t','i') |
Which gives me:
Person NAME VOWELS id 2 John Austin 4 3 don bennett 3 6 george curtis 5 10 RON HEWITT 3 17 Frank Kletzenbauer 6 18 arthur lightening 5 22 KEN SATCHWELL 3 23 Nelson Stiffle 4 24 ray straw 2 |
These examples are good to show you what the returned results are, but how would I code them in a program. Let me take the search string from the first examples and count the number of vowels in it.
01 **free 02 dcl-s SourceString char(50) inz('RPGPGM.COM is the best website for rpg examples') ; 03 dcl-s Vowels int(5) ; 04 exec sql SET :Vowels = REGEXP_COUNT(:SourceString, '[aeiou]','i') ; 05 dsply ('Number of vowels = ' + %char(Vowels)) ; 06 *inlr = *on ; |
Line 1: This program is written in totally free RPG.
Line 2: I have defined the search string as a variable, Searchstring.
Line 3: The variable Vowels will be used for the returned value of the number of vowels in the search string.
Line 4: Rather than use a Select statement, I decided to use a Set statement instead.
Line 5: I am using the Display operation code, DSPLY, to show the number in the variable Vowels.
When I run this program I get the following value displayed:
DSPLY Number of vowels = 11 |
In the next example I only want the first row/record from the table/file that fits the selection criteria I used in the above examples with the table PERSON.
01 **free 02 dcl-s PlayerId packed(9) ; 03 dcl-s PlayerName varchar(30) ; 04 dcl-s Vowels int(5) ; 05 exec sql SELECT PID, 06 FNAME || ' ' || LNAME, 07 REGEXP_COUNT(FNAME || LNAME,'[aeiou]','i') 08 INTO :PlayerId,:PlayerName,:Vowels 09 FROM PERSON 10 WHERE REGEXP_LIKE(LNAME,'t','i') 11 FETCH FIRST ROW ONLY ; 12 *inlr = *on ; |
Line 1: I am using totally free RPG.
Lines 2 – 4: I am declaring the variables I will be getting the results from the Select statement. I hope the names are descriptive enough for you to understand how the variables will be used.
Lines 5 – 11: My Select statement.
Lines 5 – 7: I broke the columns/fields I want returned onto three lines to make it easier to understand my result set.
Line 8: The results are moved into these variables.
Line 9: Name of the table.
Line 10: Is where I define that I only want player's whose last name contains the letter "t".
Line 11: I only want to return the first row of the results.
If I debug the program and break at line 12 I can see that the variables I defined contain the values I expected.
PLAYERID = 000000002. PLAYERNAME = 'John Austin ' VOWELS = 4 |
Other Db2 for i regular expressions:
You can learn more about the REGEXP_COUNT Db2 for i function from the IBM website here.
This article was written for IBM i 7.3 TR2 and 7.2 TR6.
the statement
ReplyDeleteREGEXP(FNAME || LNAME),'[aeiou]','i') AS VOWELS
generates REGEXP in *LIBL type *N was not found
was REGEXP supposed to be an different expression?
No that was a typo on my part [sad face]
DeleteThe correction has been made.
It should be REGEXP_COUNT, not just REGEXP.
awesome....totally awesome!!!
ReplyDeleteThis is another great example of coverage of Modern pattern matching using regular expressions. To keep up you need to read up on the new powerful tools you can use in modern RPG on the IBM i...it is a very POWERFUL system.
ReplyDeleteAnother great article, please keep up the good work!
ReplyDelete