ZIP Data Compression with Java
By Chád Darby
e-Business Advisor, August 1998
Do you need the ability to add data compression to your enterprise application? Well, look no further because Java 1.1 provides a package for zip-compatible data compression. The new package, java.util.zip, allows software developers to read, create, modify and write PKZIP and GZIP compatible files. This article will provide step-by-step examples for reading data from a ZIP file and writing data to a ZIP file. Also, the article will discuss the methods available for accessing property information for the compressed entries in a ZIP file.
By using the java.util.zip package, you will be able to incorporate sophisticated compression technology into your Java applications with minimal work. The package not only provides single-file compression but also lets you create multi-file archives. Because the ZIP files created by the package adhere to the ZIP standard, you will be able to use these files with the PKZIP utility.
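For the single-file case, a GZIP round trip might look roughly like the sketch below. It is only an illustration: the class name GzipSketch and the helper names gzip() and gunzip() are made up for this example, and the data is kept in memory for brevity instead of being read from a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the article's listings.
class GzipSketch {

    // Compresses a byte array into GZIP format (one stream, no archive entries).
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        GZIPOutputStream out = new GZIPOutputStream(buffer);
        out.write(raw, 0, raw.length);
        out.close(); // finishes the compressed data and writes the GZIP trailer
        return buffer.toByteArray();
    }

    // Expands GZIP data back into the original bytes.
    static byte[] gunzip(byte[] packed) throws IOException {
        GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(packed));
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] block = new byte[1024];
        int count;
        while ((count = in.read(block, 0, block.length)) != -1) {
            buffer.write(block, 0, count);
        }
        in.close();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello hello hello hello".getBytes();
        byte[] restored = gunzip(gzip(original));
        System.out.println(new String(restored));
    }
}
```

Unlike a ZIP archive, a GZIP stream holds a single compressed stream with no entry table, which is why no ZipEntry objects appear here.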
Smart E-mailer
You can use ZIP data compression to create a smart e-mail program. The e-mail program can be developed to have the following features:
Users often attach very large word-processing or spreadsheet files to e-mail messages, so the e-mail program assists the user with compressing them. Before sending a message, the program checks whether the attached files are in a compressed format. If not, it automatically compresses them in ZIP format.
The e-mail program also automatically unzips incoming attachments. The user simply sets a program option for unzipping files into a specific directory.
In order to conserve valuable disk space, an archive feature is available to compress old e-mail messages.
Because ZIP data compression is built in, the user no longer has to fumble with an external ZIP utility.
Before we get into the nuts-and-bolts of compressing and uncompressing data, we will discuss the components of a ZIP file. A ZIP file is composed of one or more compressed files. Refer to figure 1 for the ZIP file structure. A compressed file is described by a zip entry. The zip entry contains information for the compressed file such as the file name, original size, compressed size and additional file details. In a later section, we will see how to access this information.
Uncompressing a ZIP file is simply a matter of reading data from an input stream. In this example, we will write the code fragments to unzip the contents of the file secrets.zip. The five-step process follows.

Step 1: Get the ZIP input stream
The java.util.zip package provides the class ZipInputStream for sequentially reading ZIP files. The constructor for this class accepts an InputStream object. As you can see, a ZipInputStream is created in the same way as other input streams.

    fis = new FileInputStream("secrets.zip");
    source = new ZipInputStream(fis);

Step 2: Get the zipped entries
Once the ZIP input stream is opened, you can get the zip entries one at a time. The getNextEntry() method returns a ZipEntry object. When there are no more entries in the archive, getNextEntry() returns null.
After the getNextEntry() method is called, you can begin reading the data associated with this ZipEntry.

Step 3: Prepare the uncompressed output stream
Now is a good time to set up the uncompressed output stream. This code is placed inside the while-loop from step 2.

    fos = new FileOutputStream(theEntry.getName());
    targetStream = new BufferedOutputStream(fos, DATA_BLOCK_SIZE);

A file output stream is created using the entry's name. The method theEntry.getName() returns the name of the entry, which is the file name of the compressed file.

Step 4: Read the source zipped data and write it to the uncompressed stream
Inside the main while-loop, you read the source zipped data and write it to the uncompressed target stream.
The read() method retrieves data from the source zipped input stream and returns the number of bytes read. When the end of the current entry is reached, read() returns -1. Each block of data that is read is then written to the uncompressed target stream.

Step 5: Close the input/output streams
After processing the data for each entry (i.e., reading and writing), the target stream is flushed and closed.

    targetStream.flush();
    targetStream.close();

Finally, after all entries are processed, the source input stream is also closed.

    source.close();

Complete code example: JKUNZIP
Listing 1 contains a complete code example. The file jkunzip.java incorporates the above steps and works as a command-line decompression utility. You can compile the program by typing:

    javac jkunzip.java

To test the jkunzip utility, enter the command below. You must provide a zip file name.

    java jkunzip zipfile
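The five steps above can be pulled together roughly as follows. This is a minimal sketch, not the author's Listing 1: the class name UnzipSketch, the targetDir parameter, the block size of 2048 and the check that rejects entries pointing outside the target directory are all assumptions added for illustration. It is also written for a current JDK (try-with-resources), not for JDK 1.1.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical class; it only mirrors the steps described in the text.
class UnzipSketch {
    static final int DATA_BLOCK_SIZE = 2048; // assumed block size

    // Extracts every entry of zipFileName beneath targetDir; returns the entry count.
    static int unzip(String zipFileName, File targetDir) throws IOException {
        int entries = 0;
        byte[] data = new byte[DATA_BLOCK_SIZE];
        String root = targetDir.getCanonicalPath() + File.separator;
        // Step 1: get the ZIP input stream
        try (ZipInputStream source = new ZipInputStream(new FileInputStream(zipFileName))) {
            ZipEntry theEntry;
            // Step 2: getNextEntry() returns null when no entries remain
            while ((theEntry = source.getNextEntry()) != null) {
                entries++;
                File target = new File(targetDir, theEntry.getName());
                // Assumption (not in the article): refuse names that escape targetDir
                if (!(target.getCanonicalPath() + File.separator).startsWith(root)) {
                    throw new IOException("Entry outside target directory: " + theEntry.getName());
                }
                if (theEntry.isDirectory()) {
                    target.mkdirs();
                    continue;
                }
                File parent = target.getParentFile();
                if (parent != null) {
                    parent.mkdirs();
                }
                // Step 3: prepare the uncompressed output stream
                try (OutputStream targetStream =
                         new BufferedOutputStream(new FileOutputStream(target), DATA_BLOCK_SIZE)) {
                    int byteCount;
                    // Step 4: read() returns -1 at the end of the current entry
                    while ((byteCount = source.read(data, 0, DATA_BLOCK_SIZE)) != -1) {
                        targetStream.write(data, 0, byteCount);
                    }
                } // Step 5: leaving the try block flushes and closes targetStream
            }
        } // Step 5: leaving the try block closes the source stream
        return entries;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java UnzipSketch zipfile");
            return;
        }
        System.out.println(unzip(args[0], new File(".")) + " entries extracted");
    }
}
```

Writing each entry beneath an explicit directory, instead of straight to theEntry.getName(), is a design choice of this sketch: entry names come from the archive and may contain path components.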
In order to compress data to a ZIP file, you can use the ZipOutputStream class. This class provides the functionality of writing data in a compressed format, so only a small amount of work is required of the developer to create a ZIP file. The simple six-step process is outlined below.

Step 1: Create the zip output stream
The java.util.zip package provides the class ZipOutputStream for writing ZIP files. The constructor for this class accepts an OutputStream object, so you can pass the output stream of the file you are writing to. Here is an example of creating a ZIP file titled "modules.zip":
Notice the call to setMethod(). By passing the value ZipOutputStream.DEFLATED, you tell the ZipOutputStream object to store the files in a compressed manner. This is also the default compression method. You also have the option of storing the files in an uncompressed format by passing ZipOutputStream.STORED to setMethod(). However, when storing files in an uncompressed format, you must specify the CRC-32 checksum and the file size for each entry. Please refer to the JDK 1.1 API for details on the CRC32 class. You can also set the level of compression by calling the setLevel(int aLevel) method. The compression levels range from 0 to 9: 0 applies no compression, 1 is the fastest but compresses the least, and 9 is the slowest but compresses the most.

Step 2: Open the source data file
Now that the target zip output stream is created, you can open the source data file. In this code example, the file "java_intro.ppt" is the source data file:
Step 3: Create the zip entry
You will need to create a zip entry for each data file that is read. Recall from an earlier section that a zip entry contains file information such as the file name, original size, compressed size and additional details. Most of the zip entry information will be updated once the data is written to the zip output stream. Here is the code for creating a ZipEntry object:

    theEntry = new ZipEntry(dataFileName);

Step 4: Put the zip entry into the archive
Before you can write information to the zip output stream, you must first put the zip entry object that was created in Step 3 into the stream.

    targetStream.putNextEntry(theEntry);

Step 5: Read source and write the data to the zip output stream
Finally, the coast is clear for you to read the source file and write the data. Since you are writing to a zip output stream, the data will be written in a compressed format without any additional work by you.

    data = new byte[DATA_BLOCK_SIZE];
    targetStream.flush();

Step 6: Close the zip entry and other open streams
You can close the zip entry when you are finished writing the data. By closing the zip entry, the ZipEntry object is updated with the compressed file size, uncompressed file size and other file-related information. It is also a good idea to close any open streams.

    targetStream.closeEntry();

Complete code example: JKZIP
Listing 2 contains a complete code example. The file jkzip.java incorporates the above steps and works as a command-line compression utility. You can compile the program by typing:

    javac jkzip.java

To test the jkzip utility, enter the command below. You must provide a zip file name and an optional list of files to compress.

    java jkzip zipfile [files ]
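The six steps can be combined into a single method roughly as follows. This is a sketch under stated assumptions, not the author's Listing 2: the class name ZipSketch, the store flag and the crcOf() helper are invented for illustration, and the entry is named after the file's last path component instead of the full dataFileName. The store branch shows the extra work that STORED entries need, since their size and CRC-32 must be set before putNextEntry() is called.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical class; it only mirrors the steps described in the text.
class ZipSketch {
    static final int DATA_BLOCK_SIZE = 2048; // assumed block size

    // Computes the CRC-32 checksum of a file; needed up front for STORED entries.
    static long crcOf(File file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] data = new byte[DATA_BLOCK_SIZE];
        try (InputStream in = new FileInputStream(file)) {
            int byteCount;
            while ((byteCount = in.read(data, 0, DATA_BLOCK_SIZE)) != -1) {
                crc.update(data, 0, byteCount);
            }
        }
        return crc.getValue();
    }

    // Writes the named files into zipFileName, either DEFLATED or STORED.
    static void zip(String zipFileName, String[] dataFileNames, boolean store) throws IOException {
        byte[] data = new byte[DATA_BLOCK_SIZE];
        // Step 1: create the zip output stream
        try (ZipOutputStream targetStream = new ZipOutputStream(new FileOutputStream(zipFileName))) {
            targetStream.setMethod(store ? ZipOutputStream.STORED : ZipOutputStream.DEFLATED);
            targetStream.setLevel(9); // slowest, best compression; ignored for STORED data
            for (String dataFileName : dataFileNames) {
                File dataFile = new File(dataFileName);
                // Step 3: create the zip entry (assumption: drop directory components)
                ZipEntry theEntry = new ZipEntry(dataFile.getName());
                if (store) {
                    // STORED entries must carry size and CRC-32 before putNextEntry()
                    theEntry.setSize(dataFile.length());
                    theEntry.setCrc(crcOf(dataFile));
                }
                // Step 4: put the zip entry into the archive
                targetStream.putNextEntry(theEntry);
                // Steps 2 and 5: open the source file, then copy it block by block
                try (InputStream sourceStream = new FileInputStream(dataFile)) {
                    int byteCount;
                    while ((byteCount = sourceStream.read(data, 0, DATA_BLOCK_SIZE)) != -1) {
                        targetStream.write(data, 0, byteCount);
                    }
                }
                // Step 6: close the entry; the streams close with their try blocks
                targetStream.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java ZipSketch zipfile file...");
            return;
        }
        String[] files = new String[args.length - 1];
        System.arraycopy(args, 1, files, 0, files.length);
        zip(args[0], files, false);
    }
}
```

A call such as ZipSketch.zip("modules.zip", new String[] { "java_intro.ppt" }, false) would correspond to the example files named in the text.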
At any time, you can find out the properties of a given ZIP file. Recall that a ZIP file is composed of zip entries, one for each compressed file. The ZipEntry class has a number of useful methods for accessing compressed file details (see figure 2).

FIGURE 2: ZIP ENTRY CLASS DIAGRAM

Here is a description of the popular methods available in the ZipEntry class:
Here is a code sample that uses some of the above methods to display information for a given ZipEntry:

    System.out.print(theEntry.getSize() + "\t\t");

Complete code example: JKUNZIP
In the jkunzip program, functionality was added to display the contents of a given zip file. Refer to Listing 1 and view the listContents() method. You can use the "-v" option of the jkunzip program to list the zip file contents.
The java.util.zip package is compatible with PKZIP version 2.04G. However, I encountered problems when using other zip utilities. For example, earlier versions of WinZip could not extract ZIP files created with java.util.zip. This problem was solved when I downloaded the latest version of WinZip, version 6.3. If you are going to share your ZIP files with others then be sure to test for compatibility with the various zip utilities.
As you can see, the java.util.zip package is easy to use. By following a simple six-step process, you can create, read and write ZIP files. You can easily tune the compression level of files to favor speed or size. The package allows you to take advantage of the many benefits of reading and writing ZIP files. Currently software companies are selling ZIP controls for Visual Basic and Delphi. I wonder who will ship the first JavaBean with ZIP functionality? Well, armed with the information in this article it could easily be you!
Chád (shod) Darby is a Java consultant for J9 Consulting, www.j-nine.com. He specializes in developing server-side Java applications and database applications. Chád received a B.S. in Computer Science from Carnegie Mellon University. You can reach him at darby@j-nine.com.