As I have mentioned in previous posts there are several processes I have written and responsible for that takes data from a CSV (Comma Separated Values) files and updates files on the IBM i (AS400). One vendor has trouble sticking to the agreed standard of not having double quotes ( " ) in the description fields to denote inches.
The double quotes are interpreted as the start of an alphanumeric field which does not end until another double quote in encountered. This results in the extra fields being created at the end of the record or a loss of several records which are combined into one big field. I prevented this from happening by using the %XLATE built in function in RPGLE/RPG IV to replace all occurrences of the double quote with a space.
But this week they sent a right double quotation mark ( ” ), which were translated to some strange hexadecimal characters by the ASCII to EBCDIC translation table.
How can I replace these hexadecimal characters?
I tried cut-n-pasting the displayed characters into a %XLATE statement, but that did not work.
Then it struck me… if I could find what those hexadecimal values are I could try to use them in a %XLATE. There are many third part tools, for example DBU, that allow you to view the hexadecimal values, but I endeavor in this blog to use just the IBM commands and features so everyone can do it. Therefore, I used the DSPF command, see below:
DSPF FILE(MYLIB/TESTPF) |
The first screen displays the file in character mode. I pressed F10 to display hex.
The interesting part of the screen is displayed below:
- - - - + - - - - * - - - - + - - - - * ----+----*----+----* 40404040 40404040 40404040 405C5C5C 5C5C5C5C ******* F3BA2014 40E740F3 BA201440 E740F1F2 BA201440 3[ X 3[ X 12[ 40404040 40404040 40404040 405C5C5C 5C5C5C5C ******* |
Comparing the character representation, on the right, I am able to find the hexadecimal values on the left, highlighted in red.
Now I know that the hexadecimal values I need to replace are:
- BA
- 20
- 14
Now I could write the lines I needed to replace these in my RPGLE program:
01 FLD01 = %xlate(x'BA':' ':FLD01) ; 02 FLD01 = %xlate(x'20':' ':FLD01) ; 03 FLD01 = %xlate(x'14':' ':FLD01) ; |
The x before the hexadecimal value denotes that it is a hex value.
The result is shown below:
- - - - + - - - - * - - - - + - - - - * ----+----*----+----* 40404040 40404040 40404040 405C5C5C 5C5C5C5C ******* F3404040 40E740F3 40404040 E740F1F2 40404040 3 X 3 X 12 40404040 40404040 40404040 405C5C5C 5C5C5C5C ******* |
Those hexadecimal codes have been translated to blank, 40.
You can learn more about these on the IBM website:
This article was written for IBM i 7.1, and it should work with earlier releases too.
You can also do it in one line of code:
ReplyDeleteFLD01 = %xlate(x'BA2014':' ':FLD01) ;
My thoughts exactly although the "to" constant should have three blanks since there is a one-to-one relationship with the characters in the "from" and "to". I assume doing it in one line should be more efficient too.
DeleteIt does have three blanks, although it's hard to see that, but it's good that it was mentioned for the sake of clarity.
DeleteThank you for your time in posting. I am trying to understand the example you provided here. I do not see where you had a double quote on your DSPF result. Are you saying a double quote shows as an open parenthesis ?
ReplyDeleteThe problem was that a curly double quote, “ which is the equivalent of HTML character code “, not a standard double quote, HTML character code "
DeleteThe curly double quote character code was translated to three hexadecimal values, which I give above. One of which was the equivalent of [.
I did try & %XLATE the [ to blank, alas, this [ was stubborn & did not want to %XLATE. When I tested with a [ character I entered into the program the %XLATE would work.
That is why I decided to try and %XLATE the characters as hex.
This is the technique we use to clean up all unprintable characters (< X'40').
ReplyDeleteD from C X'000102030405060708090A0B0C0D0E0F-
D 101112131415161718191A1B1C1D1E1F-
D 202122232425262728292A2B2C2D2E2F-
D 303132333435363738393A3B3C3D3E3F'
D to C ' -
D '
/FREE
FLD01 = %xlate(from:to:FLD01);
/END-FREE
HI All,
ReplyDeleteI like the technique of the %XLATE biff and it shows it versatility. There are other techniques as well, for instance I do quite a bit of Unix socket connections to ASCII devices like RF reader where we measure theater time for billing processes. Here I use the XLATE api and here is the coding :
* Convert to ASCII for send and to EBCIDIC for receive
DTranslate pr ExtPgm('QDCXLATE')
D len 5p 0 const
D string 2000a options(*varsize)
D xlatetype 10a const
D Data_Length s 10i 0
D Data_String s 2000a options(*varsize)
/Free
// Convert to ASCII
Data_Length = %len( %trim( Data_String ) );
Callp Translate( Data_Length
: Send_Data_String
: 'QTCPASC'
);
// Translate received data to EBCDIC
Callp Translate( Data_Length
: Send_Data_String
: 'QTCPEBC'
);
/end-free
In the early 90's, before EDI translators, I used to XLATE to parse raw data from trading partners. Imagine my delight (and frustrating) after accepting a job for a larger company that had invested in GENTRAN - when discovering someone else had already done the grunt work.
ReplyDeleteI like the XLATE command as it is quite useful. In this particular case is there a reason why you did not used a translation table ?
ReplyDeleteTables can be used in DDS and some of the copy commands such as CPYTOPCD. One great trick when you need to convert a fixed length OS/400 db file to a variable record length PC text file is to use the X'41' non-blank blank character ... just translate it to X'40' in the CPYTOPCD referenced table. You then put a Hex 41 in the file at the point where you want the CR/LF and then you get variable length text records. The translation table puts the X41's back to X40 in the PC final file .
I use %XLATE to remove specail Hex characters the old Stone & Chisel (a.k.a. SEU) developers use. The old Windows would show special characters as a "BLOCK" inside of LPEX (Live Parasing Extensible Editor) in RDi 9.0.1. But Microsoft changed the fonts, so that these special characters no longer show up. So, I do a remove. Since RDi 9.0.1 already has colors in it's editors....(You can use the font "Terminal" to see the special Hex characters below x'40')
ReplyDeleteSo, if the Developer put in x'22' (Color White) for comment lines. I do an override the source member and run an %XLATE like this:
SRCDTA = %Xlate(x'2021222324252627282930':x'4040404040404040404040':SRCDTA).
Often I get why does RDi 9.0.1. Indent "funny". It is because of those special characters. So I remove them when editing my code.