Pages

Wednesday, June 28, 2017

Using relative record number in SQL

sql rrn function to retrieve relative record number from file

I have written about using the Relative Record Number, RRN, in RPG and said that I could not think of a reason why I would ever need to use RRN in one of my programs. Well, I have to take that statement back as I recently found a really good reason for using it. I needed a quick way to retrieve the last record from several different "flat files", and check if it started with the characters "END:". I could do it in RPG, but what is the fun with doing that when I could do the same using SQL?

What is my definition of what a "flat file" is? See here.

I am not describe these "flat files" as it is not really relevant to what I am going to show. I will say that I had a variety of these files from several sources, the layout of the contents is different, and the maximum record lengths are different to. They all have the following in common:

  • The columns within are all CSV, Comma Separated Variable length.
  • They all contain three different record types:
    1. One header record, HDR:, as the first record.
    2. One or more detail records.
    3. One end record, END:, as the last record.

I want to create one program that would be able to check if the last record started with "END:", if it does not I do not have a complete file. I want to do is pass the name of the file to the program, and return if an end record was found. As the program I am going to show is just an example, rather than return a found/not found value, I am going to display the number of records in the file and the first four characters of the record instead.

SQL has a RRN function that will return the RRN of the row retrieved, and I can use this to retrieve the last record in the file.

01  dcl-s FileName char(10) ;
02  dcl-s RRN int(10) ;
03  dcl-s Row char(1000) ;
04  dcl-s String char(100) ;

05  String = 'CREATE ALIAS QTEMP.SOME_ALIAS FOR ' + FileName ;

06  exec sql EXECUTE IMMEDIATE :String ;

07  exec sql SELECT RRN(A),A.*
08             INTO :RRN,:Row
09             FROM QTEMP.SOME_ALIAS A
10            ORDER BY RRN(A) DESC
11            FETCH FIRST ROW ONLY ;

12  exec sql DROP ALIAS QTEMP.SOME_ALIAS ;

13  dsply (%triml(FileName) + ': +
           Value=<' + %subst(Row:1:4) + '> +
           RRN=<' + %trim(%editc(RRN:'J')) + '>') ;

Lines 1 – 4: These are the definitions of the variables I will be using in this program.

Lines 5 and 6: I thought about how I could handle multiple files, it dawned on me that I could use the CREATE ALIAS statement, which I have used for multiple member files, and if I do not give a member name the first member in the file is used for the Alias. I make my CREATE ALIAS statement in a variable, line 5, and then execute the statement in that variable using the EXECUTE IMMEDIATE statement to create the Alias. Notice that I created the Alias in QTEMP, because we should all use QTEMP for our work files, Aliases, etc.

Lines 7 – 11: Now I am going to get the last record and its RRN.

Line 7: I am selecting the RRN using the RRN function and all the fields from my file, I say "all fields" in reality it is just one field whose name is differs depending upon the "flat file".

Line 8: I am placing these retrieve values into two variables. As the "flat file" contains only one field I can place it in just one variable.

Line 9: I am retrieving this data from the Alias I previously created.

Line 10: If I sort the contents of the file by the RRN in descending order then the last record will be retrieved first.

Line 11: I only want to retrieve one row/record, which is the last record in the "flat flat".

Line 12: As I am finished using the Alias I delete it.

Line 13: Using the DSPLY operation code I can make a string and display it. All the fields I substring together must be character, therefore, I need to convert the numeric value in the variable RRN to character. If I use the %EDITC built in function it will convert the number to character applying edit code J to it. I need to trim the result to remove the leading blanks and the blank where the negative sign would be from the end.

When I run this over five different "flat files" I get the following results.

DSPLY  TFRFILE001: Value=<END:> RRN=<16>
DSPLY  TFRFILE002: Value=<END:> RRN=<212>
DSPLY  TFRFILE003: Value=<END:> RRN=<42,614>
DSPLY  TFRFILE004: Value=<END:> RRN=<439>
DSPLY  TFRFILE005: Value=<3,"1> RRN=<4>

The first four files are all complete. As the fifth file does not end with an "END:" record it is not complete, and I need to get a new version this file sent from the vendor.

I could also use the RRN to retrieve where in a file a particular record is. For example:

dcl-s RRN int(10) ;

exec sql SELECT RRN(A)
           INTO :RRN
           FROM ORDHDR A
          WHERE ORDNBR = '0425213' ;

dsply ('RRN=<' + %trim(%editc(RRN:'J')) + '>') ;

When this program it called the following is displayed:

DSPLY  RRN=<1,171>

While I will be using the RRN function to determine if the last record, I am not sure if I would use it for anything else. Do you use the RRN? If so to do what and how?

 

You can learn more about SQL's RRN function from the IBM website here.

 

This article was written for IBM i 7.3, and should work for earlier releases too.

 

What is a flat file?

Let me quickly define what I consider a "flat file" to be. It is a file that has no field definitions. Its file name, record format, and field name are all the same. If I was to create one I would use the following command:

CRTPF FILE(SOME_LIB/FLATFILE) RCDLEN(1000)

I normally use them when I am transferring data from another data source in a CSV format, or something similar.

HDR:
1,"G2025","PAPER TOWEL SINGLE ROLL",1.0892
10,"G2059","USA MADE STEEL THREAD 12 MICRON",212.2264
100,"G2061","LOCTITE GLUE 19ML TWIN PACK",5.8564
END:
****** END OF DATA ******

Return

17 comments:

  1. What about deletes and reusing deleted records? Could you delete a record (lets say rrn = 51) in a file with 123 records so max rrn = 123 and then use that deleted record (lets say rrn = 51) as the last record inserted to the file?

    ReplyDelete
  2. You're a powerful and talented man, Simon. That's a great technique. Thanks.

    ReplyDelete
  3. Personally, I prefer to use the row_number() function instead of RRN() since the latter is not always a sequential integer. Try it out!

    SELECT row_number() over () as rowNum,
    personID,
    firstName,
    lastName,
    birthDate
    FROM MySchema.MyTable
    order by rowNum desc
    fetch first row only

    ReplyDelete
    Replies
    1. As an aside, ROW_NUMBER() OVER() does not guarantee to build a running number over the table, the query optimizer might use an index and then the running number is built over the key sequence

      Birgitta

      Delete
    2. @Birgitta; Thank you for the clarification.

      Delete
  4. I've used SELECT * FROM myfile WHERE RRN(myfile) = 12345 when I want to see the specific record that is shown in a job's open files. Have also used it when I wanted to create a sample output file (every 100th record): CREATE TABLE mylib/myfile
    AS
    (SELECT * FROM yourlib/yourfile
    WHERE MOD((RRN(yourfile),100) = 0) with data

    ReplyDelete
    Replies
    1. As an aside before relase 7.1 or when the CQE must be used a tables scan is performed to find the relative record no. This is no problem for a table with 500 rows, but becomes a performance problem for a 500 Billion row table.
      Meanwhile values lists are used to find the relative record no which performs much faster.
      ... and since Release 7.1 TR 5 it is also possible to create an index over the relative record no.

      Delete
    2. Hi, may i know how can i create index on a physical file's rrn?

      Delete
    3. CREATE INDEX QTEMP.INDEX1
      ON file name
      (RRN(file name))

      Delete
    4. Thanks Simon. I have a query that runs on RRN, it's a range query and it refuses to use the RRN index.

      Something like this
      Select a, b, c, from amcmst where rrn(amcmst) >= 1 and rrn(amcmst) <= 13678

      I have index created
      Create index amcmst 01 on amcmst (rrn(amcmst))
      Fetch statement

      Exec SQl fetch relative :wrrn from c1 for :hdnrec into :dswrow

      Hdnrec 999 rows

      The above execution won't trigger index

      Delete
    5. Yes, it is a design "oversight" with Db2 for i and Query that you cannot query an Index.

      Retrieve the SQL from the Query and make a SQLRPGLE to do what you need.

      Getting the SQL statement out of Query/400

      Delete
  5. I have used rrn to help me delete duplicates. I find my duplicates then I delete the ones that have the higher rrn.

    ReplyDelete
  6. Hi Simon, Nice blog, I have a small request, can we select records based on RRN using WRKQRY.

    ReplyDelete
    Replies
    1. Query does not have the ability to use RRN.

      The only suggestion I have is to build a work file using the technique described above and then use Query on the work file.

      Delete
  7. I've used RRN to help quickly generate files for performance testing with a unique key:

    CREATE TABLE T ( A INT)
    INSERT INTO T VALUES(1)
    INSERT INTO T SELECT * FROM T -- Repeat this to double number of records to get as large as you want.
    UPDATE T SET A = RRN(T)
    -- Then either add a unique constraint or copy file to a similar file but with unique key defined.

    ReplyDelete
  8. Simon, thanks for sharing, retrieval of a record by the RRN, is a very good ideal sometimes. Thanks for another learning moment.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.