Using derived columns to sort files @ RPGPGM.COM

Wednesday, August 30, 2017

Using derived columns to sort files

using sql views and derived columns to sort the data in the results

My employer uses "descriptive" part numbers for the parts they manufacturer. Without going into too much information, "descriptive" part numbers contain information about the item within the part number, it could contain characters that describe the material used to make it and/or characters to denote its end use. I am often asked to create a report for certain collections of parts. These "collections" are identified by a range of characters in certain positions of the part number, and the sort order desired may not be in a simple sort order, I will explain what I mean by that later.

The "descriptive" part numbers I will be using in these example consists of three parts:

Positions	Description
1 – 2	Finished product line
3	Dash
4 - 8	Raw material made from
9	Dash
10 – 30	Customer end use

For example: HN-C0434-S15

The Part Master file is called PARTMST, and contains the following fields:

01  A                                      UNIQUE
02  A          R RPARTMST
03  A            PARTNBR       30A
04  A            PARTTYPE       2A
05  A            PARTORIGIN     3A
06  A            PARTCOST       7P 2
07  A          K PARTNBR

I can use this simple SQL Select statement to view its contents:

  SELECT * FROM PARTMST ORDER BY PARTNBR


PARTNBR        PARTTYPE  PARTORIGIN   PARTCOST
HN-C0434-S15      AB        USA            .12
HN-C0677-S22      A2        MEX          14.20
HS-C0211-PX       A1        CAN           3.21
HS-C0424-PP       A         USA          11.50
HS-C0899-PP       A         USA           9.39
SS-C1133-Q01      AB        USA           7.00

I need to sort by the raw material part of the Part Number. I can easily create a Logical file, with a new field, SORT1, created from substring the raw material information from the Part Number field.

01  A          R RPARTMSTL0                PFILE(PARTMST)
02  A            PARTNBR
03  A            PARTTYPE
04  A            PARTORIGIN
05  A            PARTCOST
06  A            SORT1              I      SST(PARTNBR 4 5)
07  A          K SORT1

I can use a SQL Select statement to see and sort the data by in this Logical file:

  SELECT * FROM PARTMSTL0 ORDER BY SORT1


PARTNBR        PARTTYPE  PARTORIGIN   PARTCOST   SORT1
HS-C0211-PX       A1        CAN           3.21   C0211
HS-C0424-PP       A         USA          11.50   C0424
HN-C0434-S15      AB        USA            .12   C0434
HN-C0677-S22      A2        MEX          14.20   C0677
HS-C0899-PP       A         USA           9.39   C0899
SS-C1133-Q01      AB        USA           7.00   C1133

The request is that I must sort the report by the Part Type and then the raw material. The Part Type must be sorted in the order "AB", "A1", "A2", and then everything else. This is where using a Logical file no longer is efficient. I switch to using a SQL View. By using derived columns I can assign any value to the new column(s), and then sort using the new column(s).

01  CREATE OR REPLACE VIEW MYLIB.PARTMSTV0
02  (PARTNBR,PARTTYPE,PARTORIGIN,PARTCOST,
03   SORT1,SORT2)
04  AS SELECT A.*, 
05            SUBSTR(A.PARTNBR,4,5),
06            CASE WHEN A.PARTTYPE = 'AB' THEN 10
07                 WHEN A.PARTTYPE = 'A1' THEN 20
08                 WHEN A.PARTTYPE = 'A2' THEN 30
09                 ELSE 99
10            END
11  FROM PARTMST A ;

Lines 2 and 3: This is the definition of the columns that will be in the View.

Line 4: By using A.* I am including all the fields from the Physical file in the Select statement.

Line 5: The Substring parses out the raw material section of the Part Number.

Lines 6 – 10: As I mentioned above the sort order of the Part Type is not a simple sort, not just an ascending or descending sort. This is where I am using the CASE to give a value to the SORT2 column depending upon the value in the PARTTYPE field. Notice that I have an ELSE, line 9, this is what I call a "catch all" in other words if the value is not any of the above the value of SORT2 is 99.

Now the SQL Select’s ORDER BY is very simple that everyone can understand it.

  SELECT * FROM PARTMSTV0 ORDER BY SORT2,SORT1


PARTNBR        PARTTYPE  PARTORIGIN   PARTCOST   SORT1   SORT2
HN-C0434-S15      AB        USA            .12   C0434      10
SS-C1133-Q01      AB        USA           7.00   C1133      10
HS-C0211-PX       A1        CAN           3.21   C0211      20
HN-C0677-S22      A2        MEX          14.20   C0677      30
HS-C0424-PP       A         USA          11.50   C0424      99
HS-C0899-PP       A         USA           9.39   C0899      99

Rather than have multiple sort columns I can use a more complex CASE to create one sort column:

01  CREATE OR REPLACE VIEW MYLIB.PARTMSTV1
02  (PARTNBR,PARTTYPE,PARTORIGIN,PARTCOST,
03   SORT)
04  AS SELECT PARTNBR,PARTTYPE,PARTORIGIN,
05            PARTCOST,
06            CASE WHEN PARTORIGIN = 'USA' AND PARTTYPE = 'AB' THEN 10
07                 WHEN PARTORIGIN = 'USA' AND PARTTYPE = 'A1' THEN 20
08                 WHEN PARTORIGIN = 'USA' THEN 30
09                 WHEN PARTORIGIN = 'MEX' THEN 40
10                 WHEN PARTORIGIN = 'CAN' THEN 50
11                 ELSE 99
12            END
13  FROM PARTMST ;

The SQL Select statement is, again, very simple:

  SELECT * FROM PARTMSTV1 ORDER BY SORT


PARTNBR        PARTTYPE  PARTORIGIN   PARTCOST   SORT
HN-C0434-S15      AB        USA            .12     10
SS-C1133-Q01      AB        USA           7.00     10
HS-C0424-PP       A         USA          11.50     30
HS-C0899-PP       A         USA           9.39     30
HN-C0677-S22      A2        MEX          14.20     40
HS-C0211-PX       A1        CAN           3.21     50

When considering whether to use a Logical file or a SQL View you need to consider that when the Physical file is updated the all associated Logical files’ pointers are also updated. If the Logical file is just used for one or two reports then this, in my opinion, slows down any program that updates the Physical or associated Logical files. The advantage of a SQL View is that its pointers are not updated until the View is used. I can have hundreds of Views over a Physical file and the speed of updating the Physical is not impacted.

When I say "report" I am rarely asked for a paper report these days. Most of my requests are for a CSV file that can be opened in Microsoft Excel. I create a file/table in QTEMP, copy the file to the IFS, and then email it to the requester. The requester can then manipulate the data into any form they want using Excel.

This article was written for IBM i 7.3, and should work for earlier releases too.

13 comments:

chris haydenAugust 30, 2017 at 9:49 AM
excellent
ReplyDelete
Replies
Dan LovellAugust 30, 2017 at 10:08 AM
Simon, as usual, very good write-up. However I would suggest you not use DDS in your examples. Use DDL, so this will encourage your audience to do things right and for the purpose of writing modern code. Your point was not missed...good tip you have there!
ReplyDelete
Replies
JoseenocperezAugust 30, 2017 at 11:12 AM
Hi Simon like always good article, however if this was done just to explain the use of derived columns that's good but wouldn't it be easier to do this in this way.
Select * from PARTMST
order by TRIM(Parttype)||Trim(substr(PARTNBR,4,5))

if the order is by Partype and Partnbr
instead if the order is by PARTORIGIN and PARTTYPE then replace the order by accordingly.
ReplyDelete
Replies
ScottAugust 30, 2017 at 11:04 PM
Good ideas for getting to the all important DATA SET you need for your application. Thanks for pointing out the difference between Logical files and Views and their impact on updating data in a table.
ReplyDelete
Replies
AnonymousAugust 31, 2017 at 5:47 AM
Another consideration for using views is that LFs can take up considerable disk space over large tables (unless they share access paths).
ReplyDelete
Replies
Mike SSeptember 4, 2017 at 9:35 AM
Of course, if E.F. Codd had designed the database table he would have realized that he has dealing with encoded information, and he would have split the partnumber into multiple columns when creating the table and then used a view to bring the information back together at presentation time. No encode data rule.
ReplyDelete
Replies
Mike MoeglingNovember 7, 2021 at 7:04 PM
Simon, of course this is excellent.

I use derived columns quite a bit in SQL to make the data extensible and "come alive" for the task at hand. Using extensible columns during the SQL process is an excellent way to add additional dimensions to the data. In LOB and ERP systems the data is often the servant of the application that it belongs to, but using SQL to read the data and in a temp table or inline SQL you can join and create the perfect scenario for your purpose.

The last couple of years I have had to pull off some miracles with ERP and associated systems data to help get through the adverse business conditions brought on by COVID and shortages. Without SQL I could never have been able to do this.

I encourage all developers to be SQL magicians and have as many SQL techniques in their toolboxes as possible.

On the IBM i systems and the great DB2 database, Simon keep leading the charge!

Cheers, Mike
ReplyDelete
Replies
Mark TaylorDecember 20, 2021 at 9:13 AM
Saving this one.
So many times this would have come in handy!
ReplyDelete
Replies

Add comment

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.

RPGPGM.COM - From AS400 to IBM i

Pages

Wednesday, August 30, 2017

Using derived columns to sort files

13 comments: