Wednesday, November 21, 2018

Using command to zip and unzip files


In the past I have written about zipping files in the IFS using Java jar and APIs. This week a colleague showed me a simpler way to do it using a couple of IBM i commands.

He explained that these commands were introduced in IBM i 7.2, and can zip and unzip files in the IFS as well as files in what I call the native IBM i environment. With these commands the zipped, or compressed, files are called "archive files", and IBM named the commands accordingly: CPYTOARCF, Copy To Archive File, and CPYFRMARCF, Copy From Archive File.

 

CPYTOARCF copy to archive file

The CPYTOARCF command has three parameters:

  1. FROMFILE:  Name of the file that will be compressed in the archive file
  2. TOARCF:  Name of archive file
  3. SUBTREE:  Whether all the subfolders should be archived too

The first thought I had when I was shown this command was: how much space could the compression save?

I created a file of one million records, and then used the command:

01  CPYTOARCF FROMFILE('/qsys.lib/qtemp.lib/bigfile.file') +
02            TOARCF('/MyFolder/zipped.zip')

Line 1: The path name of my file QTEMP/BIGFILE has to be given in "directory format". The IBM i native environment is all in the QSYS.LIB folder, the library QTEMP needs to be entered as QTEMP.LIB, and my file BIGFILE has to be entered with the file extension, as BIGFILE.FILE. Unlike the IFS the IBM i environment is not case sensitive.

Line 2: The location of the archived file, which needs to be in the IFS, which is case sensitive. Here the archive file is called zipped.zip which is in the IFS folder MyFolder.

I do not need the SUBTREE parameter in this case as I am only saving one file.

The archived file, zipped.zip, is created by the command.

When the command had finished I compared the size of the file BIGFILE in QTEMP to the size of the archive file:

BIGFILE 719,388,672 bytes
zipped.zip 39,075,409 bytes
Difference 680,313,263 bytes

That is an enormous difference: the archive file is only about 5.4% of the size of the original file. I know that the saving will vary depending on the file I copy, but this does show how much space could be saved.

I can also use this command to compress a file in the IFS.

01  CPYTOIMPF FROMFILE(QTEMP/EXTRACT1) +
02              TOSTMF('MyFolder/extract1.csv') +
03              MBROPT(*REPLACE) +
04              FROMCCSID(37) +
05              STMFCCSID(*PCASCII) +
06              RCDDLM(*CRLF) 

07  CPYTOARCF FROMFILE('MyFolder/extract1.csv') +
08              TOARCF('MyFolder/zipfile.zip')

Lines 1 – 6: First I want to copy my data from the IBM i environment to the IFS. You might need to do something different with the CCSID, line 4, as that depends on your IBM i's default CCSID.

Lines 7 and 8: Then I can compress the file that was copied to the IFS.

I could easily attach this file to an email and send it.
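If I wanted to run both steps from a program, a minimal CL sketch might look like the following. It reuses the file and path names from the example above. The error handling is my assumption: I am assuming that a failure in either command is signalled with a CPF escape message, so I monitor for the generic CPF0000 rather than for specific message IDs.

```cl
PGM
  /* Copy the data from the IBM i environment to a CSV file in the IFS */
  CPYTOIMPF  FROMFILE(QTEMP/EXTRACT1) +
               TOSTMF('MyFolder/extract1.csv') +
               MBROPT(*REPLACE) +
               FROMCCSID(37) +
               STMFCCSID(*PCASCII) +
               RCDDLM(*CRLF)
  /* Assumption: any failure arrives as a CPF escape message */
  MONMSG     MSGID(CPF0000) EXEC(DO)
    SNDPGMMSG  MSG('Copy to the IFS failed, file was not zipped')
    RETURN
  ENDDO

  /* Only compress the CSV file if the copy worked */
  CPYTOARCF  FROMFILE('MyFolder/extract1.csv') +
               TOARCF('MyFolder/zipfile.zip')
  MONMSG     MSGID(CPF0000) EXEC(DO)
    SNDPGMMSG  MSG('Zip failed, see the job log for details')
    RETURN
  ENDDO
ENDPGM
```

Stopping after a failed copy means I never zip a stale or missing CSV file. The FROMCCSID value may need to be different on your partition, as mentioned above.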

It is not possible to add to an archive file, or give the names of multiple files in the FROMFILE parameter. Therefore, if I want to archive more than one file I would need to place them in the same IFS folder and then use the following:

01  CPYTOARCF FROMFILE('/MyFolder') +
02            TOARCF('/MyFolder/zipfile.zip')

By only giving the name of the folder all of the files within it are compressed into the zip file.
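Gathering the files into one folder first could be sketched like this. The folder /Staging and the file extract2.csv are hypothetical names I made up for the illustration, and I am assuming the staging folder does not already exist, as CRTDIR would fail if it did.

```cl
PGM
  /* Hypothetical staging folder, used only to collect the files */
  CRTDIR     DIR('/Staging')

  /* Copy each file I want in the archive into the staging folder */
  CPY        OBJ('/MyFolder/extract1.csv') TODIR('/Staging')
  CPY        OBJ('/MyFolder/extract2.csv') TODIR('/Staging')

  /* Archive everything in the staging folder in one command */
  CPYTOARCF  FROMFILE('/Staging') +
               TOARCF('/MyFolder/zipfile.zip')
ENDPGM
```

In this sketch I write the archive file to a different folder from the one being archived, so that the folder being zipped contains only the files I copied into it.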

But that only does the current folder. If I wanted to compress the contents of all the subfolders too I would need to use the SUBTREE parameter:

01  CPYTOARCF FROMFILE('/MyFolder') +
02              TOARCF('/MyFolder/zipfile.zip') +
03              SUBTREE(*ALL)

 

CPYFRMARCF copy from archive file

Having archived/zipped/compressed a file or files how do I unarchive them? The CPYFRMARCF command does that.

The command has three parameters:

  1. FROMARCF:  Name of the archive file
  2. TODIR:  The folder where the uncompressed file will be placed.
  3. RPLDTA:  Whether to replace the existing data

The format of the command is pretty much the same whether you are unarchiving IFS files or IBM i environment files.

01  CPYFRMARCF FROMARCF('/MyFolder/zipfile.zip') +
02               TODIR('.') +
03               RPLDTA(*YES)

Line 1: I need to give the archive file's name and folder it is in.

Line 2: I have found that if I use the dot/period/full stop ( . ) in the to directory parameter the archived file(s) are unarchived into the IFS folder or IBM i library they were archived from.

Line 3: If I want to replace the existing files in the folder or library I would use *YES in the replace data parameter.

Experimenting with values in the TODIR parameter, I have found that if I want to restore the IFS file(s) to another IFS folder I must enter the folder's name without a leading slash ( / ). If the folder does not exist it is created by the command.

01  CPYFRMARCF FROMARCF('/MyFolder/zipfile.zip') +
02               TODIR('AnotherFolder')

If I give a folder name with a leading slash, then a subfolder is created in the folder that the file was archived from.

01  CPYFRMARCF FROMARCF('/MyFolder/zipfile.zip') +
02               TODIR('/SubFolder')

In this example the archived file(s) are restored to the subfolder /MyFolder/SubFolder.

I wish that restoring to the IBM i environment was as easy. I need to have a copy of the physical file I originally archived in the library I archived it from, with the same name as the original file. I then have to put the dot in the TODIR parameter, and the data is restored to the library and file it was archived from.

01  CRTDUPOBJ  OBJ(BIGFILE) FROMLIB(*LIBL) +
02               OBJTYPE(*FILE) +
03               TOLIB(QTEMP) +
04               CST(*NO) TRG(*NO) ACCCTL(*NONE)

05  CPYFRMARCF FROMARCF('/MyFolder/zipped.zip') +
06               TODIR('.') +
07               RPLDTA(*YES)

If the original file is not present in the library then a source physical file is created. Alas, the source data field of the source file is wrapped, so each record contains less than one record of data and the remainder is wrapped to the next record.

I have tried other values in the TODIR parameter, even the path that the original file was archived from.

01  CPYFRMARCF FROMARCF('/PGMSDH/zipped.zip') +
02                TODIR('/qsys.lib/qtemp.lib/bigfile.file') +
03                RPLDTA(*YES)

Every time I see the following in the job log:

File BIGFILE created in library QTEMP.
Object QSYS in database file BIGFILE not found.
0 objects were replaced and 0 objects were newly
  created during decompression.
Information passed to this operation was not valid.

When I look at the last message I have to admit I do not find it useful in diagnosing the issue.

Message ID . . . . . . :   CPFA0A2
Message type . . . . . :   Escape

Message . . . . :   Information passed to this operation was 
   not valid.
Cause . . . . . :   Possible causes:
    --The operation could not use the data passed to it.
    --A name may not be correct.
    --Directory was expected, but a file was specified.
    --File was expected, but a directory was specified.
    --The function requested is not supported by the file system.
Recovery  . . . :   Check the input data to determine the cause
  of the problem.  Correct the error, and retry the operation.
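As CPFA0A2 is an escape message, a CL program that runs the command would end in error unless it monitors for the message. A minimal sketch of how that could be handled is below. The message text I send is my own invention, and the sketch only reports the failure; it does nothing to fix the underlying problem.

```cl
PGM
  CPYFRMARCF FROMARCF('/MyFolder/zipfile.zip') +
               TODIR('.') +
               RPLDTA(*YES)
  /* CPFA0A2 = Information passed to this operation was not valid */
  MONMSG     MSGID(CPFA0A2) EXEC(DO)
    SNDPGMMSG  MSG('Unzip failed, see the job log for details')
    RETURN
  ENDDO
ENDPGM
```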

If you know how to overcome this problem I would be grateful if you would share the details with me. You can use the Contact Form on the right side of this page, or add a comment to this post.

 

Having spent a couple of days playing with these commands I can see how useful they are for IFS files. In their current form I know I will not be using these commands for files in the IBM i environment.

 

 

This article was written for IBM i 7.3, and should work for 7.2 too.

12 comments:

  1. Very nice. Thanks for share it.

  2. The problem is that you cannot zip more than 4gb file.

  3. Hello,
    I'm using the following syntax command to create a zip file in the same folder as the original CSV file :
    CPYTOARCF FROMFILE('/home/a/b/file1.csv') TOARCF('/home/a/b/file1.zip')
    The archive is created without problem, however the CSV file within is buried inside a series of folders corresponding to the complete IFS path. This could be solved by using a relative path in the TOARCF parameter, but I'm not sure how to set/indicate the current folder with a CL command...
    Thanks in advance.

    Replies
    1. @JJ, I assume you mean Using the CD command before zipping and then specifying the from-file relatively, not from root. That will solve this problem when using the 'Jar' command to zip but apparently not for CPYTOARCF where FROMFILE in the archive will still be buried inside a nest of folders as described above. I too would like to know how to get round this if possible. The problem with 'Jar' is that you have to use the QSH interface which is asynchronous so you don't know if the archive has been created yet before you do something with it.

  4. I discovered those two commands thanks to your post, thanks for that.

  5. Never heard about these commands... Thank you

  6. Regarding the decompression of a DB file, you have to add the relevant member.
    Example:
    CPYFRMARCF FROMARCF('/MyFolder/zipfile.zip') TODIR('/qsys.lib/qtemp.lib/Bigfile.file/Bigfile.mbr') RPLDTA(*YES)
    However, then you receive another message stating:
    Parameter passed to decompress a file is not valid, reason code 8..
    Reason code 8 is as follows:
    The path specified to place the decompressed files is not a directory.


    You wrote " Alas, the source data field the source file is wrapped so the file contains less than one record of data, the remainder is wrapped to the next record.".
    If you'd look into the archived file (by Notepad) you'll see the same there - the file archived contains only one record.
    In any case, I tried several scenarios but didn't succeed in decompressing a DB file, whether into the original library or into another one, with the file existing or not. I keep getting a message that "Archive file decompressed." but it doesn't really show into any DB file..

    It seems that the commands work successfully one way or another when you're dealing with IFS documents, but not when you're trying to decompress a DB file.

  7. There's also a 7Zip command available for PASE as well.

  8. I have experimented that in some situations it is necessary to change the code-page (ccsid) of the unzipped document to be able to read it properly. For example, after you run the cmd CPYFRMARCF, run the following: CHGATR OBJ('mydir/mydoc') ATR(*CCSID) VALUE(1208)

  9. The zip file contains the information to TODIR - so just:

    01 CPYFRMARCF FROMARCF('/PGMSDH/zipped.zip')
    02 TODIR('/')
    03 RPLDTA(*YES)

  10. Thank you for posting all these tips.
    After reading your blog post, I used CPYTOARCF yesterday. It worked fine.
    I tried then to use it today on another file but the file was too big and the command failed (The size of the file xxx is greater than 4,294,967,295 bytes). So there is a limitation on the file size.
    I then used the following which worked fine and is also easy to use : STRQSH CMD('Jar cfM MyFolder/extract.zip MyFolder/extract.csv')
    Sources : https://www.semiug.org/2016/docs/05HowtoCompressLARGEFiles.pdf
    https://www.rpgpgm.com/2013/11/zipping-files-in-ifs.html
