I saw this mentioned in a Facebook discussion where someone said he was using what I will describe below to determine the difference between the records in two files. I had not heard of this method so I wanted to try it out for myself and determine whether it is something I could use in the future. Two SQL Select statements joined with an EXCEPT clause. But before I show examples of that I need data.
I am going to:
- Create a file
- Add data to it
- "Clone" the file to make an exact copying, including the data
- Change the data in the "clone"
- Compare the two files
In these examples I am just going to use a file with one field. The same principles work with one field as it would with a hundred fields. I decided to K.I.S.S. and just have one field in my file. The layout for the file is:
A R FILE0R A FIRST_NAME 20A |
I created the file:
CRTPF FILE(MYLIB/FILE0) SRCFILE(*LIBL/DEVSRC) SIZE(*NOMAX) |
Inserted data into the file using an SQL Insert statement:
INSERT INTO FILE0 VALUES('ANDREA'),('BETTY'), ('CAROLE'),('DOROTHY'), ('EVELYN'),('FRANCESCA'), ('GEORGINA') |
I can then view the contents of my file, including the file's internal RRN, Relative Record Number, so I can tell which records I am doing things to.
01 SELECT RRN(A) AS RRN, 02 A.FIRST_NAME 03 FROM FILE0 A RRN FIRST_NAME --- ---------- 1 ANDREA 2 BETTY 3 CAROLE 4 DOROTHY 5 EVELYN 6 FRANCESCA 7 GEORGINA |
Line 1: I have to qualify the column names from the file so that I can use the qualifying character, or name, with the RRN function as shown.
I am going to "clone" the file using the Create Duplicate Object command, CRTDUPOBJ:
CRTDUPOBJ OBJ(FILE0) FROMLIB(*LIBL) OBJTYPE(*FILE) NEWOBJ(FILE1) DATA(*YES) CST(*NO) TRG(*NO) |
I am not going to show the contents of the new file, FILE1, as it contains the same data the original file, FILE0, does.
Now to make some changes to the data in the new file:
01 DELETE FROM FILE1 WHERE FIRST_NAME = 'DOROTHY' 02 INSERT INTO FILE1 VALUES('HARRIET') 03 UPDATE FILE1 SET FIRST_NAME = 'BARBARA' WHERE FIRST_NAME = 'BETTY' |
Now I am going to show you the data in the new file as now it is different from the original file:
01 SELECT RRN(B) AS RRN, 02 B.FIRST_NAME 03 FROM FILE1 B RRN FIRST_NAME --- ---------- 1 ANDREA 2 BARBARA 3 CAROLE 5 EVELYN 6 FRANCESCA 7 GEORGINA 8 HARRIET |
As I now have two files I can compare them.
As I said before all I need to do is to combine two SQL Select statements with the EXCEPT clause. In this example I want to know what is in FILE0 and not in FILE1. To do this the first SQL Select statement has to be for FILE0 and the second for FILE1:
01 SELECT RRN(A) AS RRN,A.* 02 FROM FILE0 A 03 EXCEPT 04 SELECT RRN(B),B.* 05 FROM FILE1 B |
The results show that there are two differences:
RRN FIRST_NAME --- ---------- 4 DOROTHY 2 BETTY |
Dorothy's record is in the results as I deleted it from FILE1.
Betty's record is shown as I changed the name to "Barbara" in FILE1.
Now let me do the comparison the other way around:
01 SELECT RRN(B) AS RRN,B.* 02 FROM FILE1 B 03 EXCEPT 04 SELECT RRN(A),A.* 05 FROM FILE0 A |
These are the results:
RRN FIRST_NAME --- ---------- 8 HARRIET 2 BARBARA |
Harriet's record is shown as I added it to FILE1, therefore, it is not present in FILE0.
Barbara's record is displayed as I changed Betty's name "Barbara", therefore, it is displayed.
Having deleted Dorothy's record above, what would happen if I added a record for Dorothy into FILE1 again?
INSERT INTO FILE1 VALUES('DOROTHY') |
As this recorded was inserted it will have a new relative record number, not the same one that the original Dorothy record had:
RRN FIRST_NAME --- ---------- 1 ANDREA 2 BARBARA 3 CAROLE 5 EVELYN 6 FRANCESCA 7 GEORGINA 8 HARRIET 9 DOROTHY |
Let me run the comparison SQL statement again listing the differences in FILE1 when compared to FILE0.
01 SELECT RRN(B) AS RRN,B.* 02 FROM FILE1 B 03 EXCEPT 04 SELECT RRN(A),A.* 05 FROM FILE0 A RRN FIRST_NAME --- ---------- 8 HARRIET 2 BARBARA 9 DOROTHY |
The following SQL Selects with a UNION clause show that the two Dorothy records have different relative record numbers and this makes them different:
01 SELECT 'FILE0' AS FILE, 02 RRN(A) AS RRN, 03 A.FIRST_NAME 04 FROM FILE0 A 05 WHERE A.FIRST_NAME = 'DOROTHY' 06 UNION 07 SELECT 'FILE1',RRN(B),B.FIRST_NAME 08 FROM FILE1 B 09 WHERE B.FIRST_NAME = 'DOROTHY' FILE RRN FIRST_NAME ----- --- ---------- FILE0 4 DOROTHY FILE1 9 DOROTHY |
This experiment has shown me that I can use this to determine if there are differences in the records in two files. For example: are the records in the file in "production" the same as the one in the "test" environment?
You can learn more about the EXCEPT clause from the IBM website here.
This article was written for IBM i 7.4, and should work for some earlier releases too.
This is on 7.4 only, right?
ReplyDeleteNo. Runs on 7.3 for sure as I tested it at that release.
DeleteI believe it will run on 7.2 too.
Hi Simon,
ReplyDeleteJust to understand. If you have same data with different RRN in both files, then data will be identified as different. but actually it is same. Is that so.
You are correct.
DeleteIf you use 7.4 and have the latest TRs this would be a better approach.
Nice example.
ReplyDeleteAs you said, very useful to compare Test and Production tables.
Other Suggestion:
I use very often EXCEPTION JOIN to find integrity errors. F.e. missing records in a Mandatory - Optional relation where records exist in the child table but are missing in the parent while they should be there.
Motivation:
Very often databases don't implement Referential integrity. Which is the case in many (old) IBM i environments.