In last week's post, here, I gave an example of how I determine the 250 biggest objects on my IBM i server. I created the information about the objects by using the Display Object Description command, DSPOBJD, and asked anyone who knew of a better way to get the same information to contact me.
Bill Gravelle did just that, suggesting that I look for a table in the QSYS2 library, see here. After poking around I did find a promising table, SYSTABLESTAT, which contains information for every user table/file. This would not have identified all of the biggest objects, as 28 of them were Journal Receivers, *JRNRCV. But the table does contain useful information. I am not going to list all the columns/fields in this table, as IBM does such a good job on their Knowledge Center web site here.
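If you want to see the full column list for yourself, the QSYS2.SYSCOLUMNS catalog view describes every column of every table and view on the system. A query like this, which is just a sketch you can run from any SQL interface, lists SYSTABLESTAT's columns:

```sql
SELECT COLUMN_NAME, DATA_TYPE, LENGTH
  FROM QSYS2.SYSCOLUMNS
 WHERE TABLE_SCHEMA = 'QSYS2'
   AND TABLE_NAME = 'SYSTABLESTAT'
 ORDER BY ORDINAL_POSITION
```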
As we have been talking about releasing wasted disk space, the column that grabbed my attention was NUMBER_DELETED_ROWS. Most of the files on the IBM i I use at work do not reuse deleted records; therefore, many of them have ever increasing numbers of deleted records that are just wasting disk space. If I could identify the worst offenders I could purge those deleted records and release the disk space. With this in mind I am only interested in the following columns:
| Column name | Description |
| --- | --- |
| DATA_SIZE | Size of table/file in bytes |
| SYSTEM_TABLE_NAME | Table/file name |
| SYSTEM_TABLE_SCHEMA | Library that contains the table/file |
| NUMBER_ROWS | Number of valid rows/records |
| NUMBER_DELETED_ROWS | Number of deleted rows/records in the file |
I can put all of that into a SQL statement in a CL program like this:
```
PGM

RUNSQL SQL('CREATE TABLE MYLIB.OUTFILE AS +
              (SELECT DATA_SIZE,SYSTEM_TABLE_NAME, +
                      SYSTEM_TABLE_SCHEMA,NUMBER_ROWS, +
                      NUMBER_DELETED_ROWS +
                 FROM QSYS2.SYSTABLESTAT +
                 ORDER BY NUMBER_DELETED_ROWS DESC +
                 FETCH FIRST 250 ROWS ONLY) +
              WITH DATA') +
         COMMIT(*NONE) NAMING(*SQL)

ENDPGM
```
If you are not familiar with CREATE TABLE in SQL you ought to read the post Creating SQL table on the fly.
I have decided to use the SQL naming convention, NAMING(*SQL), just because I can and to show you that it is not much different from the standard IBM i convention. In the SELECT statement I have selected the columns mentioned in the table above, and sorted (ORDER BY) by the number of deleted rows in descending order, i.e. largest first. Then the first 250 rows will be written to my output file, MYLIB.OUTFILE. This will include all the tables/files in IBM libraries too, e.g. QSYS.
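For anyone unfamiliar with the difference, here is the same table reference written under both conventions (MYLIB.OUTFILE is the example output file from above):

```sql
-- SQL naming convention, NAMING(*SQL): period between library and object
SELECT * FROM MYLIB.OUTFILE

-- System naming convention, NAMING(*SYS): slash between library and object
SELECT * FROM MYLIB/OUTFILE
```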
When run, this took about five times as long as the program in the previous post that used DSPOBJD.
The output looks like this:
```
DATA_SIZE   SYSTEM_TABLE_NAME  SYSTEM_TABLE_SCHEMA  NUMBER_ROWS  NUMBER_DELETED_ROWS
2116046848  FILE1              LIB1                          75            5,351,845
1981816832  FILE2              LIB1                      11,781            4,988,364
1361072128  FILE3              LIB1                         890            3,504,467
 421531648  FILE4              LIB1                      19,130            3,438,994
 304144384  FILE5              LIB2                     855,966            3,423,437
```
Only the names have been changed to protect the innocent files and libraries.
There is a multitude of other information in this table that may prove useful in the future, such as:
- LOGICAL_READS - Number of logical read operations since the last IPL.
- PHYSICAL_READS - Number of physical read operations since the last IPL.
- SEQUENTIAL_READS - Number of sequential read operations since the last IPL.
- RANDOM_READS - Number of random read operations since the last IPL.
You can learn more about QSYS2's SYSTABLESTAT table on the IBM website here.
This article was written for IBM i 7.2, and should work for earlier releases too.
A couple of items to note:
We had to remove the "SQL" keyword from the RUNSQL command.
We had to update the "xyzlib.file" to "xyzlib/file".
We had to remove / comment out the "commit" portion of the RUNSQL command.
We could not locate file QSYS2.SYSTABLESTAT, even though once we were able to compile the CLLE, the program is running and selecting records. Waiting on results but wanted to post ahead of the results.
We're at V7R1. Wonder if some of this is a PTF issue, since we're in a 24x7x365 hospital environment and getting any PTF window in a multi-partition setting is difficult at best.
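Based on the changes this commenter describes, the RUNSQL command rewritten for the system naming convention would look something like this (an untested sketch; the example names are the ones from the post):

```
RUNSQL SQL('CREATE TABLE MYLIB/OUTFILE AS +
              (SELECT DATA_SIZE,SYSTEM_TABLE_NAME, +
                      SYSTEM_TABLE_SCHEMA,NUMBER_ROWS, +
                      NUMBER_DELETED_ROWS +
                 FROM QSYS2/SYSTABLESTAT +
                 ORDER BY NUMBER_DELETED_ROWS DESC +
                 FETCH FIRST 250 ROWS ONLY) +
              WITH DATA') +
         NAMING(*SYS)
```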
The file is not called SYSTABLESTAT. That is what the view is known as within SQL.
I am surprised that you were not able to run this with the SQL naming convention. I tried this on an IBM i running 7.1 TR7 and it worked as I have described.
SYSTABLESTAT is the SQL name for the SQL View QSYS2/SYSTSTAT.
If you look in QSYS2 its object attribute is "LF", which is how the IBM i stores Views.
This is a problem that can be controlled systematically. I wrote a CL program that did a DSPFD to an output file, then read the output file and checked the number of deleted records. If it came out to be a certain percentage (this could be a variable) of the total size of the file, the CL would automatically reorganize the file, eliminating the deleted records. You may want to do some analysis on this method beforehand. Files with millions of records will take a very long time to reorganize, especially if there are many logical files attached. You may also want to change the file to reuse deleted records.
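The approach this commenter describes could be sketched in CL something like this. This is an untested sketch: the field names &MBFILE, &MBLIB, &MBNAME, &MBNRCD, and &MBNDTR come from the DSPFD TYPE(*MBR) outfile model file QAFDMBR, MYLIB is an example library, and the 20% threshold is just an example value.

```
PGM

DCLF FILE(QAFDMBR)                /* Model file for DSPFD TYPE(*MBR) output */
DCL VAR(&PCT) TYPE(*DEC) LEN(5 2)

/* Dump member descriptions for every file in the library to an outfile */
DSPFD FILE(MYLIB/*ALL) TYPE(*MBR) OUTPUT(*OUTFILE) +
        OUTFILE(QTEMP/DSPFDOUT)
OVRDBF FILE(QAFDMBR) TOFILE(QTEMP/DSPFDOUT)

LOOP:       RCVF
            MONMSG MSGID(CPF0864) EXEC(GOTO CMDLBL(DONE))

            /* Reorganize when deleted records reach 20% of all records */
            IF COND((&MBNRCD + &MBNDTR) *GT 0) THEN(DO)
               CHGVAR VAR(&PCT) +
                        VALUE((&MBNDTR * 100) / (&MBNRCD + &MBNDTR))
               IF COND(&PCT *GE 20) THEN(+
                  RGZPFM FILE(&MBLIB/&MBFILE) MBR(&MBNAME))
            ENDDO
            GOTO CMDLBL(LOOP)

DONE:       DLTOVR FILE(QAFDMBR)
ENDPGM
```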
I personally think this is the cleanest, easiest way to do that.
I will be taking this approach.
With files containing millions of deleted records I have found that it is quicker to CPYF the contents of the file to another, clear the original, then CPYF the data back into the original.
And I will be changing the file to reuse deleted records too.
Make sure to do the CPYF by RRN(*BEGIN).
Just make sure you do the CPYF by relative record #
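The copy, clear, and copy-back sequence the commenters describe might look something like this in CL. An untested sketch: MYLIB/BIGFILE and QTEMP/WORKFILE are example names, and FROMRCD(1) forces CPYF to copy in arrival sequence, i.e. by relative record number, rather than by key.

```
/* Copy the active records out in arrival (RRN) order */
CPYF FROMFILE(MYLIB/BIGFILE) TOFILE(QTEMP/WORKFILE) +
       MBROPT(*REPLACE) CRTFILE(*YES) FROMRCD(1)

/* Clear the original member, removing active and deleted records */
CLRPFM FILE(MYLIB/BIGFILE)

/* Copy the data back; the member now contains no deleted records */
CPYF FROMFILE(QTEMP/WORKFILE) TOFILE(MYLIB/BIGFILE) +
       MBROPT(*REPLACE) FROMRCD(1)
```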
We're on 7.1 and this worked like a charm, sort of. I'm surprised that it takes so long to run. It must have taken almost an hour. Why is that?
My guess is...
- There are a heck of a lot of files/tables on any IBM i when this includes all the IBM (operating system) libraries.
- As this is a View it is generated when it is used, unlike an Index, which is more like a LF.
It was the same for me.
But when I submitted the job, it took just a few seconds.
The first time I tried it was with a tool using JDBC.
By the way, NUMBER_ROWS and NUMBER_DELETED_ROWS are totals for the file. Using this output as a driver for a RGZPFM will not work for a multi-member file if the deleted rows are not in the first member. Since we have a package that uses multi-member files, it would be useful to see the above with member-level statistics.
Alas, this does not give you member level statistics.
But the NUMBER_PARTITIONS column does give the number of partitions (= members) that the file has.
I would need to check whether NUMBER_PARTITIONS is greater than 1, and do something else to determine the members in the file and the active and deleted records within them.
Perhaps DSPFD FILE(lib/file) TYPE(*MBR) OUTPUT(*OUTFILE)...
Then RTVMBRD using the records from the outfile
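The DSPFD-then-RTVMBRD idea above could be sketched like this for a single member. An untested sketch: MYLIB/BIGFILE and member MBR1 are example names, and NBRCURRCD and NBRDLTRCD are the command's return variables for the current and deleted record counts.

```
DCL VAR(&NBRCUR) TYPE(*DEC) LEN(10 0)
DCL VAR(&NBRDLT) TYPE(*DEC) LEN(10 0)

/* Retrieve current and deleted record counts for one member */
RTVMBRD FILE(MYLIB/BIGFILE) MBR(MBR1) +
          NBRCURRCD(&NBRCUR) NBRDLTRCD(&NBRDLT)
```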
Hi Simon, great article, great blog!!
Do you know if there is a catalog Table/View where I can browse the different data that can be got from the DSPFD command?
I'd like to combine this SQL statement with another table/view, to check whether big files with lots of deleted records have REUSEDLT set or not.
I'm trying to find something related to that, and I guess it exists!
Thanks a lot!
Have you tried the View SYSTABLES?
Yes, I did. But there is no info about that.
I'm trying to surf through the different procedures/Views/tables provided by the system (I'm on 7.2 TR2).
If I find something, I'll post it here.
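One non-catalog option for getting at the REUSEDLT setting is the DSPFD attribute outfile. A sketch, untested: the FILEATR(*PF) value and the model file QAFDPHY, which contains a field for the reuse deleted records attribute, are my assumptions from the DSPFD documentation, so check the outfile field names on your release before relying on them.

```
/* Write the physical file attributes, including the reuse +
   deleted records setting, to an output file for querying */
DSPFD FILE(MYLIB/*ALL) TYPE(*ATR) FILEATR(*PF) +
        OUTPUT(*OUTFILE) OUTFILE(QTEMP/PFATTR)
```

The resulting QTEMP/PFATTR file could then be joined against the SYSTABLESTAT output to find big files full of deleted records that do not have REUSEDLT(*YES).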