Validation: CHAIN versus SETLL @ RPGPGM.COM

Wednesday, August 28, 2013

Validation: CHAIN versus SETLL

I recently received a message about the post No More Number Indicators. The person asked why I had used a SETLL operation code instead of a CHAIN to check for a record on a file in one of the examples.

  setll Z1DEPT DPTMAST ;
  if not(%equal) ;
    ErrDept = *on ;
  endif ;

I have to give credit to my wife for this. Years ago she attended an IBM conference in San Diego, California, and she went to a presentation by John Sears about improving your code's performance. This was one of his suggestions.

If you think about what the two operation codes do, it does make sense.

  chain Key FILE ;
  if not(%found) ;
     Error = *on ;
  endif ;

  setll Key FILE ;
  if not(%equal) ;
     Error = *on ;
  endif ;

The CHAIN retrieves a record from the file, and if successful, places the record into the input fields.

The SETLL positions the file pointer at the first record whose key is equal to or greater than the search key. The %EQUAL built-in function (BIF) is "on" when the key searched for exactly matches one on the file. The input fields are not loaded.

Because the SETLL does not load the input fields, John Sears explained, it is faster, which makes it the better choice when all you need to know is whether the record exists.
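To make the difference concrete, here is a minimal sketch of the two checks side by side, built on the snippet at the top of this post. It assumes that DPTMAST is the name of a keyed file (not of a record format) and that Z1DEPT is compatible with its key; both assumptions come from reading that snippet, not from the file's actual definition. Naming the file inside %EQUAL and %FOUND is optional, but it ties the test to that file rather than to whichever file operation ran most recently.

```rpgle
// Existence check only: SETLL positions the file pointer.
// No record is read, so the input fields of DPTMAST keep
// whatever values they held before the operation.
setll Z1DEPT DPTMAST ;
if not(%equal(DPTMAST)) ;
  ErrDept = *on ;
endif ;

// The same check with CHAIN: when the key is found the record
// is retrieved and its fields are loaded into the program,
// replacing any values those fields held before.
chain Z1DEPT DPTMAST ;
if not(%found(DPTMAST)) ;
  ErrDept = *on ;
endif ;
```

Either form sets ErrDept when the department is missing. The practical difference is what happens to the program's fields when the key does exist.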

The AS400 has become the IBM i, and newer servers and operating system releases mean that programs and database functions run a lot faster. But the logic behind his statement is still valid.

After "digging deep" I did find a reference to this on the IBM website:

SETLL does not cause the system to access a data record. If you are only interested in verifying that a key actually exists, SETLL with an equal indicator (positions 58-59) is a better performing solution than the CHAIN operation in most cases. Under special cases of a multiple format logical file with sparse keys, CHAIN can be a faster solution than SETLL.

You can learn more about this from the IBM website:

This article was written for IBM i 7.1, and it should work with earlier releases too.

Sunday, September 1: After all the discussion on this subject I decided to run my own test to see which is faster, CHAIN or SETLL.

I have created a new post, CHAIN versus SETLL the results, to detail my findings.

59 comments:

Sumit, August 28, 2013 at 12:52 PM
Very useful...thanks for sharing.

Shantanu, August 29, 2013 at 9:21 AM
Agreed -- if it is only between CHAIN and SETLL.

However -- unless you are "maintaining" legacy code AND you have very strict standards for preserving RPG opcodes (double whammy!) -- I will take the SQL route (select count, if exist, etc.) on this one. It is common belief that IBM stopped optimizing the RPG opcodes long time ago (probably around the same time your wife attended the conference), but they continued working on the SQL engine. So from a performance standpoint, and frankly SQL is more readable with /free code than "setll Z1DEPT DPTMAST" -- I highly recommend ditching SETLL/CHAIN in favor of SQL.

That is my humble opinion, of course.

Anonymous, August 29, 2013 at 10:46 AM
Performance is usually (not always) less important than other considerations such as clarity. And if (if!) IBM's current wonks are to be trusted and obeyed to the nth degree, this should be accomplished with SQL in any case.
-kh

James Denton, August 29, 2013 at 2:47 PM
This is an interesting article. We (AS400 people) know, of course, that "speed" is relative. Performance is always limited by Input/Output, while the best written code always wins. Performance still is limited by I/O. But the SETLL is interesting in that it does not bring data into the buffer, while the traditional CHAIN does.

Paul Putkowski, August 29, 2013 at 2:47 PM
If I don't need the data, I always SETLL. In the early days (S/38) the performance difference was noticeable.

Venkatesan Srinivasan, August 29, 2013 at 2:48 PM
If the file was opened for Update or Add, then CHAIN needs the No-lock option coded. SETLL makes clear the intention of your need to just validate.

Steve Miall, August 29, 2013 at 2:48 PM
If you don't need any data from the record, then SETLL is the way to go because it has no danger of changing any field values in your program. And it is a bit quicker, since it only involves the database index and doesn't require data transfer.

Tommy Holden, August 29, 2013 at 2:49 PM
if you are only checking for existence then SETLL will be faster and will not retrieve the record but you're talking nano-seconds but as a general rule of thumb i use SETLL to check for existence if I'm not actually retrieving the data.

Jeff O'Bryan, August 29, 2013 at 2:49 PM
Use the SETLL, it is worth it based on the speed aspect alone.

Susan Gantner, August 29, 2013 at 2:50 PM
I had always heard that SETLL was faster than CHAIN as well, but a while back we did a little testing and we didn't find a difference in our tests. Now, our tests were just informal - by no means under controlled conditions - no formal benchmarking procedures were followed and I'm by no means claiming these as actual benchmark results. Even so, our results led us to the belief that if there is a performance difference it must be pretty small - probably not worth losing any sleep over. We probably didn't have the file declared as update-capable in our tests - I'd think that locking might have caused more of a noticeable difference. Of course, we could have always done CHAIN(N) anyway.

Logically, it seems there should be a difference. And I would still do SETLL, %Equal in those cases anyway - if only because it makes more sense to those following the logic afterwards and to avoid populating fields I don't intend to use in the program. It just makes sense to me, regardless of performance differences that may (or may not) exist.

If anyone has seen evidence that SETLL actually performs faster than CHAIN I'd love to hear about it. I was very surprised at our results.

Johnny Skaggs, August 29, 2013 at 2:50 PM
I agree with Steve and I have used SETLL for validating if a record is in the file for as long as I can remember.

John Voris, August 29, 2013 at 2:51 PM
Also, the SETLL is important when creating a 3=Copy function.
bring into the program the original record,
load up the key fields changing what you need to change
Before writing, use SETLL to ensure that no one has snuck in a record with these keys
If it is not found ( using not %Equal after the Setll) you can now safely write the record.

If you were to use CHAIN instead of SETLL in the step above, the behavior of the CHAIN will replace the key values you just loaded.

Alvaro Roberto Meoño Wong, August 29, 2013 at 2:51 PM
I have used SETLL a lot because it is faster when searching through a large number of records, since it positions the pointer directly at the record being searched for, whereas with CHAIN the difference is small if the file is small. The important point is that it is better to work with SETLL. I have used it with RPG II, RPG III, and in the native environment.

Rocky Marquiss, August 29, 2013 at 2:52 PM
I like SETLL as record locks don't play into the picture and it's easy to start reading from that point if desired. While if reading MASSIVE amounts of data performance can be a consideration but with today's machines the difference between a CHAIN and SETLL is so small anymore I don't know that I would use performance as the full determination which is better.... but even so SETLL should be faster, less issues with record locks (yes I know you can avoid them with CHAIN as well) and is a good launching pad for I/O if deemed necessary.

Christopher Lively, August 29, 2013 at 2:53 PM
I have always followed what Tommy suggests. I was told when I first started learning RPG that SETLL was more efficient when just checking for existence of a record.

Paolo Bordin, August 29, 2013 at 2:53 PM
In my experience SETLL is slightly faster, but it must then be followed by a READ. CHAIN is a single instruction, so logically fewer instructions to write means faster processing.

Swati Torsekar, August 29, 2013 at 2:53 PM
Yes, in the given situation, SETLL is more efficient for 2 reasons - first is the speed with which file pointer can get to the required record. SETLL merely sets the indicator on, chain will bring into buffer memory or program variables all the field info of the found record - at times such info may overwrite the values of the variables when the same is not intended. Hence SETLL would save on memory and AVOID undesired overwrite errors.

Steve Landess, August 29, 2013 at 2:54 PM
SETLL is inherently faster, as noted by others.

Unfortunately, due to design changes sometimes you need to go ahead and retrieve the record to get some field values, and when /that/ change is made, you need to do volume testing in addition to the regular testing on the program to determine how much it will cost you in terms of runtime - been there...

Dan Lovell, August 29, 2013 at 2:54 PM
If the record is only accessed for existence, then a SETLL is less expensive than a CHAIN. Always better to reset the cursor and test an indicator. This may not matter for small amounts of I/O but for a job that needs to do this many times, the time savings can add up fast. Simon, I like using *in42 ;-)

Pascal Jacquemain, August 29, 2013 at 2:55 PM
SETLL is most often far less expensive than CHAIN. I know of one exception. A number of years ago in my previous employment, we found out that doing a SETLL on a logical select/omit file could be very very costly, far slower than CHAIN. If I remember correctly, most of the rows were excluded by the select/omit.

Larry Massie, August 29, 2013 at 2:56 PM
This is a no-brainer, SETLL is always the best method to simply validate that a record exists in a file. Programs should be designed to maximize functionality. They should also be designed to be used by foreign applications as well as local ones. I have spent over 25 years designing applications in RPG, RPG/400, & RPGLE. The SETLL method ensures that variable values in your program do not get modified during a simple record validation. Once you have validated the existence of a record, you can then move on to process the request as needed.

Anura Ariyaratne, August 29, 2013 at 8:42 PM
Above is a great explanation covering everything to refresh some memory.

Michael Catalani, August 29, 2013 at 8:43 PM
SETLL is even more efficient when you are dealing with logical files and indexes. With these file types, a chain would need to position to the search argument in the logical file, retrieve the location of the record in the physical file, position to the record in the physical, then move the data record into the IO buffer. A SETLL would simply need to position to the search argument in the logical.

Another bonus is that, when dealing with locally scoped files, the SETLL does not require an IO data structure.

Manoj Kumar Verma, August 30, 2013 at 6:11 AM
Absolutely CHAIN is much better.

Jeshua, August 30, 2013 at 9:34 AM
I believe both commands have been created for different purposes, perhaps its functions are similar, but SETLL is to approximate a pointer to read records under a criterion, and chain is to locate records randomly.

I can use a knife to screw, but the knife is to cut.

I don't believe that SETLL be most express than setll....

Anyway, everyone does a program according to his style...

Jon Paris, August 30, 2013 at 6:01 PM

Speed should really not be the gating factor here - clarity should be.

I personally would always use CHAIN because I think it makes the intent of the code more obvious. A SETLL by definition is intended to position the cursor. The fact that it can detect an exact match is incidental to its primary use. When I see a SETLL I assume that the programmer is positioning for a READ sequence - when I see CHAIN I assume that the existence of a record is being checked. Which of the two scenarios best matches the OP's "validate" requirement? For me the choice is obvious.

P.S. While in theory a SETLL should always be faster, I tested it some years ago on a regular database (not multi-format) and found that in several tests CHAIN was faster. Go figure.

Rocky Marquiss, August 30, 2013 at 6:08 PM
This is ultimately one of those questions really that is "it depends" - if there's a strong possibility I will need the data anyway I'll use the CHAIN - if it's strictly for verification of the data (user wants to ADD a record .... verify it doesn't exist first type of thing) I use SETLL - why read in data if you're not going to use it? Why use SETLL if you're going to use the data anyway? In otherwords - the goal dictates which is more logical to use.

Given the stated criteria - record validation... SETLL makes sense - why have the system retrieve data that is not going to be used? That's kind of like driving to the grocery store to see if there's milk, take it home and say "Yup, they have milk" when you can simply call the store and ask...

Mark (Dan) Roberts, August 30, 2013 at 6:09 PM
I am a self-taught RPG programmer and I have learned more from this short discussion than I can say. I always "knew" the difference, but I never considered the ramifications of the choice... thank you all for enlightening me...

Hugh Brady, August 31, 2013 at 12:25 AM
For new development I prefer SQL SELECT rather than native IO, the list of advantages for using SQE over CQE is long. But you'll find many angles on the tip of a pin.

Alok Kumar, August 31, 2013 at 1:23 PM
I think SETLL is better, because by using CHAIN other fields of record format get changed.
And in SETLL nothing is changed, and one more thing we also can use SETGT in lieu of SETLL.

Vigna Luciano, August 31, 2013 at 1:24 PM
I use one or more functions in a SRVPGM that return whether a record exists (or other stuff too) for whatever file and for the proper key field, for example:
Input parameters
Library name
File name
Key field (usually UID or GUID)
Output param
-1 not exists / 0 exists

The SRVPGM uses SQL statement.

Guido Faecke, August 31, 2013 at 1:28 PM
To add my $.02 to that discussion: at the end it doesn't matter which one is better. What matters is which approach is the best for the application.
SETLL, CHAIN, SELECT COUNT, ServiceProgram, stored procedures... who cares?
Sure, if you work with LAWSON/infor, you are stuck with procedure calls. If you believe SQL is the best thing since sliced bread, go with SELECT COUNT.
As far as I am concerned, there is no "best practice", there is a "best approach" for that project.

Charles Siu, August 31, 2013 at 7:58 PM
I can provide the source of a multi-format database file on request.
SETLL gives better performance, but if you want the actual information on file, you will use CHAIN. Despite the better performance of SETLL, CHAIN does have the following advantage. Given that you will use a display file to show records that are not found, you can use the same indicator for the "Not Found" indicator for CHAIN as the RI attribute of the display field in the display file.

Stefan Uytterhoeven, September 1, 2013 at 1:29 PM
you can see a CHAIN as a combination of SETLL and READE ... I assume that when a record doesn't exist, CHAIN and SETLL would be equally fast. When records are found, CHAIN would be slower, as it retrieves the values of the found record.
I prefer SETLL to test for existence, and in my experience it's much much faster...
Just try a million times CHAINING an existing record, versus SETLL an existing record...

John Driscoll, September 1, 2013 at 5:07 PM
There are very few occasions when I do not want something from the record. I use CHAIN as it is clear and makes the fields available if required. I like to use prefixes on files to prevent making a mess of existing field values. I would only consider using SETLL in cases where it is executed many times and would make a significant difference. I have no doubt that SETLL is faster, and would be interested to learn whether it reads the data at all, or simply determines whether it exists in the index.

J. Michael G., September 1, 2013 at 8:27 PM
SETLL may be also used to initially determine that one record or more of a set is available, and then sometime later do a READE to start retrieving actual data, if that's what is required.

For an interactive program, this technique may be used to get a /feedback panel back to the user ASAP while the program goes on to pull data and compose a substantive response and send a followup panel. This arrangement decreases apparent response time as perceived by users.

Geoff Boreland, September 3, 2013 at 7:05 AM
I used to always ask this question when interviewing programmers and always got some interesting answers. To simply check the existence of a record, a SETLL will set the file pointer at the appropriate record and will then allow the programmer to determine if the record exists. If you need to return some data then you will either use the CHAIN (which is effectively a Setll and a Reade combined) or simply use the Reade after the Setll.

John Brandt jr, September 3, 2013 at 2:51 PM
I suspect SETLL is likely to perform worse than CHAIN when you have a sparse select/omit file with DYNSLT specified. In that case, SETLL has to read forward in the file to find the next record so it can set %Found() properly, which could involve skipping a lot of omitted records.

MIDNIGHT_RIDER_1961, September 4, 2013 at 10:09 AM
I have always used CHAIN. Perhaps not the best of choice. But for smaller files it didn't really matter. Now if this were large files I might re-evaluate the decisions.

Tai Wong, September 6, 2013 at 2:49 PM
Rocky Marquiss and others that have hopped into this simple discussion on SETLL and CHAIN, to help inform Newbies of the performance trade offs and when to use which based on the merits of the situation, have been answered. In closing, any design where interactive use of data and locking protocol can be handled more elegantly with never-ending programs with exclusive locks on files in batch subsystems working with data queues, et cetera. It is not a matter of one language being better than another but which works best performance, maintenance, support, and growth-wise, ease of use with an ROI for all parties to win.

We all know that ILE on the IBM power systems, or what ever platform it reinvents itself on was object oriented before the industry made that a buzz word. It is our goal to provide service, education and ease for services to be viewed as sound and agile. It is too often we see language wars that go nowhere and discussions on performance that borders on being crazy to our readers, but as Rocky pointed out if you can not maintain or read the documentation well enough to use or reuse or enhance it, it becomes obfuscated.

I still have code working and programmers thanking me for documenting my work. Keep up the great work in your jobs, and make the people you meet think how neat you gave them what they wanted without a whole lot of hassle. We as techies, can admire the machinery and the processes, but our customers want the job done well, quickly, and at a reasonable price. Sadly, we don't do so great a job at showing the total cost of ownership is better than the competition or we lapse into computerese and lose our audience. Take time to sharpen your saw and perhaps join Toastmasters to improve yourself as well.

Bhargav Korrapati, September 21, 2013 at 5:07 AM
Is there any performance difference between 'SETLL followed by READ' and 'CHAIN' ?

BTW, i am beginner in RPG, and i never knew that i can use SETLL followed by %equal to check if a record exists. this really helps in avoiding bugs

Anonymous, September 24, 2013 at 8:51 AM
This is an interesting discussion and the use of CHAIN, SETLL or even SQL stmt in an RPG program is largely dependent on the overall logic and process. Interactive, batch, repetitive/sequential logic, single/multiple records, record locks, response time, wait time, single/multiple use and shop programming standards are some of the considerations. Unless you can physically count single CPU cycles, I doubt if anyone sees the difference. I only object to someone dictating to me how to code based on personal preferences instead of a logical explanation. I once received a disciplinary write-up by my manager b/c I wrote code in RPGIV. His reasoning was because it looked 'foreign' to the other two senior programmers who favored RPGIII and resisted the change. Of course, the manager could not write code if his life depended on it.

Anonymous, September 24, 2013 at 9:40 AM
John Sears' Design for Performance lectures are timelessly invaluable for every IT professional regardless of your concentration. His background in operating systems, database design, programming languages and computer science history was evident in every lecture, discussion panel and informal conversation. His major emphasis was that software performance and security are most effective when addressed in the initial design instead of as an afterthought.

Phillip Slessor, October 3, 2013 at 4:48 PM
If I am validating an Item and need its description I use Chain. But if all I need to know is if the item exists I always use SETLL and test the %Equal. It does not bring in any data to the buffers and is much more efficient than a chain.

J. Michael G. Jones, October 14, 2013 at 7:12 AM
Limiting any batch program which needs only to do 'record exists' tests to using SETLL really does speed it up for large datasets. For interactive programs or very small datasets, performance change between SETLL and CHAIN may be negligible.

Actually fetching a record into program space when it's unnecessary to do so also enables data cross-contamination. Especially for 1st normal form data coming from an exterior source.

Anonymous, January 19, 2021 at 12:40 PM
I'm working on a huge system that handles 15+ million transactions per day and has billions of records in its tables, so every file I/O added to the process counts. With this in view:
During a CHAIN operation, if the record is not found, will it still do the actual I/O on the file? Does this count as file I/O?