Wednesday, July 29, 2020

Removing apostrophe in a field using SQL

replace 1 character with another using rpg and sql

There are occasions when interfacing data to non-IBM i database where an apostrophe ( ' ) in a field can cause a conversion error. In this post I am going to give some simple examples of how apostrophes can be converted to something else using SQL and RPG.

Let me start with the DDS file I will be using in these examples. The file is called TESTFILE and contains one field:

A          R TESTFILER
A            COMPNAME      35A

The field contains the following three records:

COMPNAME
------------------
SMITH BUTCHERS
LLOYD'S BAKERY
JOHNS GREENGROCERS

I am sure you have noticed that the apostrophe is in the second record.

How would I remove this if I was to write a RPG program:

01  **free
02  dcl-f TESTFILE usage(*update) ;

03  dow (1 = 1) ;
04    read TESTFILER ;
05    if (%eof) ;
06      leave ;
07    endif ;

08    COMPNAME = %scanrpl('''':'':COMPNAME) ;
09    update TESTFILER %fields(COMPNAME) ;
10  enddo ;

11  *inlr = *on ;

Line 1: Everything I write these days is in totally free RPG.

Line 2: Here I have defined the file for update.

Line 8: Here is where I replace the apostrophe. I use the Scan and replace built in function, %SCANRPL. The built in function's parameters are:

  1. Scan string, what I want to replace
  2. What I want to replace it with
  3. Variable or field that the string is contain within

It took some trial and error to determine the correct number of apostrophes in the first parameter to denote a single apostrophe in the string (that is four in the code).

The second parameter contains two apostrophes next to each other, this denotes a value of null. If I replace any part of the string with null it removes that part of the string from the string, as you will see below.

Line 9: My update operation is followed by the %FIELDS built in function. That is a moot point with this file as it only contains one field, but in a multi-field file I would list the fields I want to update, and the other would remain unchanged.

After compiling this program, I run it and the apostrophe is removed from the second record:

COMPNAME
------------------
SMITH BUTCHERS
LLOYDS BAKERY
JOHNS GREENGROCERS

Before I make any change to a file using SQL I always like to test my statement to make sure my results are what I desire. In this example I am using the Replace function to replace the apostrophe with null:

01  SELECT COMPNAME,
02         REPLACE(COMPNAME,'''','')
03    FROM TESTFILE

The replace function has remove the apostrophe from the results, not from the file's field.

COMPNAME             REPLACE
------------------   ------------------------
SMITH BUTCHERS       SMITH BUTCHERS    
LLOYD'S BAKERY       LLOYDS BAKERY     
JOHNS GREENGROCERS   JOHNS GREENGROCERS

I could also use the replace regular expression to do the same.

01  SELECT COMPNAME,
02    REGEXP_REPLACE(COMPNAME,'''','',1)
03    FROM TESTFILE

The results are the same as above.

Rather than display the data I need to change the record in the file to remove the apostrophe.

01  UPDATE TESTFILE
02     SET COMPNAME = REPLACE(COMPNAME,'''','')

When I query the file I see that the apostrophe was replaced by the null, therefore, the "S" is right next to the "D":

COMPNAME
------------------
SMITH BUTCHERS
LLOYDS BAKERY
JOHNS GREENGROCERS

Let me try the same with the regular expression replace:

01  UPDATE TESTFILE
02     SET COMPNAME = REGEXP_REPLACE(COMPNAME,'''','',1)

Alas, here I receive the following message:

Argument 03 of function REGEXP_REPLACE not valid.

When I look in the job log I see a message before the one above:

Parameter 3 of function REGEXP_REPLACE not valid.
Argument 03 of function REGEXP_REPLACE not valid.

When I display the first message I see:

Message ID . . . . . . :   CPD439A
Severity . . . . . . . :   40
Message type . . . . . :   Diagnostic

Message . . . . :   Parameter 3 of function REGEXP_REPLACE not
valid.
Cause . . . . . :   Parameter 3 specified in function
REGEXP_REPLACE is not valid for use for reason code 6. The 
reason codes and their meanings follow:

    6 -- Parameter must be a valid CCSID.

The update using the regular expression will not accept null. I can replace the apostrophe with any character, for example with a hyphen ( - ), but not with null:

01  UPDATE TESTFILE
02     SET COMPNAME = REGEXP_REPLACE(COMPNAME,'''','-',1)

The result shows that the apostrophe was replaced with the hyphen:

COMPNAME
------------------
SMITH BUTCHERS
LLOYD-S BAKERY
JOHNS GREENGROCERS

As you can see it easy to remove an apostrophe or any other character using SQL's replace function, or even the replace regular expression if you are replacing one character with another.

 

This article was written for IBM i 7.4, and should work for some earlier releases too.

3 comments:

  1. I don't know Simon, you must have some other issue going on, something with your CCSID it seems. That rexex_replace works just fine for me to strip the ' out:

    LNAME FNAME
    Barkley 'Charles'
    Claus Santa
    C'laus Misses'
    Curry Steph
    Jackson Michael''
    Jordan Michael
    K'e'm'p'' Shawn
    Miller Reggie
    'Payton Gary
    Schrempf Detlef

    UPDATE people
    SET LNAME = REGEXP_REPLACE(LNAME,'''','',1),
    FNAME = REGEXP_REPLACE(FNAME,'''','',1)

    LNAME FNAME
    Barkley Charles
    Claus Santa
    Claus Misses
    Curry Steph
    Jackson Michael
    Jordan Michael
    Kemp Shawn
    Miller Reggie
    Payton Gary
    Schrempf Detlef

    Display System Value

    System value . . . . . : QCCSID
    Description . . . . . : Coded character set identifier

    DSPSYSVAL QCCSID:

    Coded character set
    identifier . . . . . : 37 1-65535


    DSPJOB, 2, PGDN, PGDN, PGDN :

    Coded character set identifier . . . . . . . . . : 37
    Default coded character set identifier . . . . . : 37

    V7R4 TR2

    Thanks for everything you do for the IBMi community!!


    Dan D

    ReplyDelete
  2. Relative to regex_replace, you can achieve the removal (or replace with nothing) by not passing the replacement string.

    -- match any non-numeric character - remove it
    regexp_replace(field, '\D' )
    -- match any numeric character - remove it
    regexp_replace(field, '\d' )

    if field contains 2011TX, the result would be 2011
    if field contains 2011TX, the result would be TX

    ReplyDelete
  3. 01 UPDATE TESTFILE
    02 SET COMPNAME = Translate(COMPNAME,'','''')

    Ringer

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.