import performance compress vs uncompressed

Posted: Thu Apr 26, 2012 3:32 pm
by ingvar
Hello,

This is just an observation, which may be useful to others, on the relative performance of importing compressed vs uncompressed files from the command line.

On a set of 82 MRC files, in total about 2.5 GB compressed/5 GB uncompressed, it took 179 min to import the compressed files, but only 17 min to do the steps uncompress/import/re-compress. Obviously this is only one data point for one file format on one server (Dell T5500 running RHEL6.2WS).

From the Blitz log, it looks like most of the time for the compressed files is spent in ome.api.RawPixelStore.setTile.
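For anyone who wants to script the uncompress/import steps described above, here is a minimal Python sketch. It is a variation, not exactly what I ran: instead of re-compressing afterwards, it keeps the original .gz and deletes the temporary uncompressed copy once the import has finished. The function names `gunzip_file` and `import_uncompressed` are made up for illustration, and the import command is passed in as a parameter because its exact form depends on the installation.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def gunzip_file(gz_path):
    """Sequentially decompress e.g. stack.mrc.gz to stack.mrc and return the new path."""
    gz_path = Path(gz_path)
    out_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # single streaming pass over the file
    return out_path


def import_uncompressed(gz_path, import_cmd):
    """Decompress, run import_cmd on the plain file, then remove the plain copy.

    import_cmd is a list of command-line arguments (an assumption of this
    sketch); the path of the uncompressed file is appended as the last argument.
    """
    plain = gunzip_file(gz_path)
    try:
        subprocess.run(list(import_cmd) + [str(plain)], check=True)
    finally:
        # The original .gz is left untouched, so no re-compression is needed.
        plain.unlink()
```

Usage would be something like `import_uncompressed("stack.mrc.gz", ["bin/omero", "import"])`, assuming the OMERO command-line importer is started as `bin/omero import <file>` on your system; any login or server options would need to be added to the list.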

Cheers,
Ingvar

Re: import performance compress vs uncompressed

Posted: Thu Apr 26, 2012 5:54 pm
by mlinkert
Hi Ingvar,

How are these files being compressed? To my knowledge, MRC files don't support internal compression (at least, Bio-Formats/OMERO doesn't support it), so I'd assume that the files were compressed externally with gzip or similar?

If that is indeed the case, then it makes perfect sense for the import to be much slower. Importing the files when they are compressed means random access I/O across data that is being decompressed on the fly; there really isn't much we can do to make that more efficient than a single (sequential access) decompression operation.
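The effect described above can be reproduced in a small Python sketch. To be clear, this is not the Bio-Formats or OMERO code; it only uses Python's standard `gzip` module to show the general behaviour. A gzip stream has no index, so seeking backwards in a `GzipFile` rewinds to the start and decompresses again up to the requested offset. The function names, the tile size, and the backwards read order are assumptions chosen to make the cost visible.

```python
import gzip
import io
import time


def read_tiles_backwards(blob, size, tile):
    """Read fixed-size tiles from last to first straight out of the gzip stream.

    Every backward seek makes GzipFile restart decompression from offset 0.
    """
    parts = []
    with gzip.GzipFile(fileobj=io.BytesIO(blob)) as gz:
        for offset in reversed(range(0, size, tile)):
            gz.seek(offset)
            parts.append(gz.read(tile))
    parts.reverse()
    return b"".join(parts)


def decompress_then_slice(blob, size, tile):
    """Decompress once, sequentially, then take the same tiles from memory."""
    data = gzip.decompress(blob)
    parts = [data[o:o + tile] for o in reversed(range(0, size, tile))]
    parts.reverse()
    return b"".join(parts)


def timed(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start
```

Calling `timed(read_tiles_backwards, blob, size, tile)` and `timed(decompress_then_slice, blob, size, tile)` on the same gzip-compressed buffer should show the first taking longer, and the gap grows with the number of tiles, since the amount of data decompressed grows roughly with the square of the tile count. The actual ratio depends on the data and the machine.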

-Melissa

Re: import performance compress vs uncompressed

Posted: Fri Apr 27, 2012 11:43 am
by ingvar
Hi Melissa,

I should have mentioned that the files were gzip-ed.
And as I said, this was just an observation that you may want to uncompress at least some file types before importing them into OMERO. While I would expect some difference in performance between importing compressed and uncompressed files, an order of magnitude was beyond my expectations.

Cheers,
Ingvar