
Read/Write/Memory Problems

Historical discussions about the Bio-Formats library. Please look for and ask new questions at https://forum.image.sc/tags/bio-formats

If you are having trouble with image files, there is information about reporting bugs in the Bio-Formats documentation. Please send us the data and let us know what version of Bio-Formats you are using. For issues with your code, please provide a link to a public repository, ideally GitHub.

Read/Write/Memory Problems

Postby kdean » Wed Jul 22, 2015 7:52 pm

Hello,

We use LabVIEW to acquire data from multiple custom light-sheet microscopes, and we are having significant issues with both read/write speed and memory. I am not an expert in Bio-Formats, or in computer programming in general, so please excuse my naivety.

Following data acquisition, LabVIEW must reopen each individual image stack, read the temporary metadata, and write the cumulative OME metadata for the entire image sequence. Lately, the number of image planes that must be handled ranges from 50,000 to 300,000. Writing the OME metadata for that many planes becomes prohibitively slow, taking hours to days. Often we cancel the OME rewrite process after it has completed the first image so that we can continue imaging. However, this is not a good solution.

Importantly, because we also have problems with the command-line and MATLAB-based tools, I do not think that this is purely a LabVIEW problem.

I have uploaded two representative files, 1_CAM01_000000.tif and 1_CAM01_000499.tif. The first file has the OME metadata, whereas the second does not. Each file is ~75 MB. Because the second file does not have the OME metadata, it opens quickly using the MATLAB bfopen function (~1.1 seconds). The first file, however, cannot be opened in MATLAB, even after increasing the Java heap memory to the maximum amount. Below is the error that I receive.

...............................Reading IFDs
Populating metadata
Caught "std::exception" Exception message is:
Message Catalog MATLAB:services was not loaded from the file. Please check file location, format or contents
An error was encountered while saving the command history
java.io.FileNotFoundException: /Users/kdean/.matlab/R2014b/History.xml (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
at com.mathworks.mde.cmdhist.AltHistoryCollection

Indeed, while keeping an eye on my memory usage in the Activity Monitor, bfopen is immensely memory-intensive. I am a bit lost as to why it takes so long to open the file, and I am beginning to believe this may be related to why we cannot save the OME-TIFF information in a reasonable amount of time in the first place.

Thank you, and I apologize if this is unclear.

Kevin
kdean
 
Posts: 8
Joined: Mon Dec 10, 2012 1:35 am

Re: Read/Write/Memory Problems

Postby sbesson » Wed Jul 22, 2015 9:53 pm

Hi Kevin,

thanks for the post and for sharing the data. My feeling is that the two issues you are encountering are separate: one concerns OME-TIFF writing performance, while the second concerns OME-TIFF reading performance.

For the reading problem described in this thread, the issue likely lies in the nature of bfopen, since this function both initializes a reader for the selected file and reads all of its pixel data.

The 1_CAM01_000499.tif file does not contain any OME-XML metadata and is initialized as a regular TIFF file with the following dimensions:

Code: Select all
$ showinf -nopix /ome/apache_repo/11318/1_CAM01_000499.tif
Checking file format [Tagged Image File Format]
Initializing reader
TiffDelegateReader initializing /ome/apache_repo/11318/1_CAM01_000499.tif
Reading IFDs
Populating metadata
Checking comment style
Populating OME metadata
Initialization took 1.302s

Reading core metadata
filename = /ome/apache_repo/11318/1_CAM01_000499.tif
Series count = 1
Series #0 :
   Image count = 138
   RGB = false (1)
   Interleaved = false
   Indexed = false (false color)
   Width = 512
   Height = 512
   SizeZ = 1
   SizeT = 138
   SizeC = 1
   Thumbnail size = 128 x 128
   Endianness = intel (little)
   Dimension order = XYCZT (uncertain)
   Pixel type = uint16
   Valid bits per pixel = 16
   Metadata complete = true
   Thumbnail series = false


The 1_CAM01_000000.tif file is initialized as an OME-TIFF file, including file grouping, and has the following dimensions:

Code: Select all
$ showinf -nopix /ome/apache_repo/11318/1_CAM01_000000.tif
Checking file format [OME-TIFF]
Initializing reader
OMETiffReader initializing /ome/apache_repo/11318/1_CAM01_000000.tif
Reading IFDs
Populating metadata
Initialization took 4.67s

Reading core metadata
filename = /ome/apache_repo/11318/1_CAM01_000000.tif
Used files:
   /ome/apache_repo/11318/1_CAM01_000000.tif
   /ome/apache_repo/11318/1_CAM01_000499.tif
Series count = 1
Series #0 :
   Image count = 69000
   RGB = false (1)
   Interleaved = false
   Indexed = false (false color)
   Width = 512
   Height = 512
   SizeZ = 138
   SizeT = 500
   SizeC = 1
   Thumbnail size = 128 x 128
   Endianness = intel (little)
   Dimension order = XYZCT (certain)
   Pixel type = uint16
   Valid bits per pixel = 16
   Metadata complete = true
   Thumbnail series = false


While calling bfopen in MATLAB, using 1_CAM01_000499.tif would load 138 planes while using 1_CAM01_000000.tif would load 138x500 planes. This would likely explain the resource and file descriptor exhaustion you reported.
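A rough back-of-envelope sketch of that difference, assuming the 512 x 512 uint16 planes reported by showinf above (illustrative Python, not part of Bio-Formats, and counting only the raw pixel bytes before any Java-side or MATLAB conversion overhead):

```python
# Approximate memory needed just to hold the raw pixel data.
BYTES_PER_PIXEL = 2  # uint16

def fileset_bytes(width, height, n_planes):
    """Raw pixel bytes for n_planes planes of width x height uint16 data."""
    return width * height * BYTES_PER_PIXEL * n_planes

# 1_CAM01_000499.tif read alone: 138 planes
single_stack = fileset_bytes(512, 512, 138)
# 1_CAM01_000000.tif read as a grouped OME-TIFF fileset: 138 x 500 planes
full_fileset = fileset_bytes(512, 512, 138 * 500)

print(f"single stack : {single_stack / 2**20:.0f} MiB")  # 69 MiB
print(f"full fileset : {full_fileset / 2**30:.1f} GiB")  # 33.7 GiB
```

Tens of gigabytes of pixel data is far beyond any reasonable Java heap, which fits the out-of-memory behaviour reported above.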

The MATLAB equivalent of the command above, which only initializes the reader without loading its pixel data, is bfGetReader. Could you try running the following commands:

Code: Select all
bfGetReader('/ome/apache_repo/11318/1_CAM01_000000.tif');
bfGetReader('/ome/apache_repo/11318/1_CAM01_000499.tif');


I would expect the first call to have an overhead but to return in a timely fashion.

Best,
Sebastien
sbesson
Team Member
 
Posts: 421
Joined: Tue Feb 28, 2012 7:20 pm

Re: Read/Write/Memory Problems

Postby kdean » Wed Jul 22, 2015 10:43 pm

Hey Seb,

You are correct. The reading is significantly faster with bfGetReader.

1_CAM01_000000.tif completed in 5.067s, whereas 1_CAM01_000499.tif completed in 0.179s.

Both provided a strange output:

loci.formats.ChannelSeparator@170836e0

loci.formats.ChannelSeparator@4cb58c7f
kdean
 
Posts: 8
Joined: Mon Dec 10, 2012 1:35 am

Re: Read/Write/Memory Problems

Postby sbesson » Thu Jul 23, 2015 1:44 pm

Hi Kevin,

the output of the bfGetReader call should be an initialized reader of type ChannelSeparator. So the MATLAB output is expected and it looks like you have reasonable reading performance.

On the OME-TIFF writing side, while trying to assess the time it would take to embed the OME-XML metadata from the first TIFF file into the second TIFF file using Bio-Formats command line tools, I got the following error:

Code: Select all
$ tiffcomment 1_CAM01_000000.tif > 1_CAM01_000000.xml
$ tiffcomment -set 1_CAM01_000000.xml 1_CAM01_000499.tif
loci.formats.FormatException: Tag not found (IMAGE_DESCRIPTION)
sbesson@necromancer ~ $


So the IMAGE_DESCRIPTION tag, which is required for embedding the OME-XML, is not present in the original TIFF file and cannot be recreated. Is it possible, on the acquisition side, to have this TIFF tag created when the original TIFF files are saved?
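To illustrate what "creating the tag at save time" means at the file-format level, here is a minimal pure-Python sketch (illustrative only; not LabVIEW or Bio-Formats code) of a single-strip TIFF whose IFD includes an ImageDescription (tag 270) placeholder from the start:

```python
import struct

def write_minimal_tiff(path, width, height, pixels, description):
    """Write a single-strip, little-endian, uncompressed uint16 TIFF that
    carries an ImageDescription (tag 270) entry from the moment the file
    is first saved, so later tools have an existing tag to target."""
    assert len(pixels) == width * height * 2, "expect raw uint16 bytes"
    desc = description.encode("ascii") + b"\x00"  # ASCII values are NUL-terminated
    desc_count = len(desc)
    if desc_count % 2:
        desc += b"\x00"            # pad to an even offset (count excludes padding)

    data_off = 8                   # pixel data directly after the 8-byte header
    desc_off = data_off + len(pixels)
    ifd_off = desc_off + len(desc)

    def short_entry(tag, value):   # type 3 = SHORT, value inlined in the entry
        return struct.pack("<HHIHH", tag, 3, 1, value, 0)

    def long_entry(tag, value):    # type 4 = LONG, value inlined in the entry
        return struct.pack("<HHII", tag, 4, 1, value)

    entries = [
        short_entry(256, width),                             # ImageWidth
        short_entry(257, height),                            # ImageLength
        short_entry(258, 16),                                # BitsPerSample
        short_entry(259, 1),                                 # Compression = none
        short_entry(262, 1),                                 # Photometric = BlackIsZero
        struct.pack("<HHII", 270, 2, desc_count, desc_off),  # ImageDescription (ASCII)
        long_entry(273, data_off),                           # StripOffsets
        short_entry(277, 1),                                 # SamplesPerPixel
        short_entry(278, height),                            # RowsPerStrip
        long_entry(279, len(pixels)),                        # StripByteCounts
    ]
    with open(path, "wb") as f:
        f.write(struct.pack("<2sHI", b"II", 42, ifd_off))    # little-endian TIFF header
        f.write(pixels)
        f.write(desc)
        f.write(struct.pack("<H", len(entries)))
        f.write(b"".join(entries))
        f.write(struct.pack("<I", 0))                        # no further IFDs

# Reserve placeholder space for OME-XML to be written into the tag later
write_minimal_tiff("stub.tif", 4, 4, bytes(4 * 4 * 2), " " * 64)
```

The key point is only that tag 270 exists in the IFD when the file is first written; the actual acquisition software would of course write real pixel data and its own placeholder text.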

Sebastien
sbesson
Team Member
 
Posts: 421
Joined: Tue Feb 28, 2012 7:20 pm

Re: Read/Write/Memory Problems

Postby kdean » Thu Jul 23, 2015 4:26 pm

One quick note in response to "While calling bfopen in MATLAB, using 1_CAM01_000499.tif would load 138 planes while using 1_CAM01_000000.tif would load 138x500 planes. This would likely explain the resource and file descriptor exhaustion you reported."

For testing purposes, I downloaded only the _000000.tif and _000499.tif files from a remote server. Images _000001...488.tif were not on the machine at the time, nor was there an active connection to the remote server. In this case, I don't know where bfopen was actually 'getting' the data for the intermediate image stacks...

I will have more info regarding the ImageDescription writing soon...

Kevin
kdean
 
Posts: 8
Joined: Mon Dec 10, 2012 1:35 am

Re: Read/Write/Memory Problems

Postby sbesson » Fri Jul 24, 2015 3:03 pm

Hi Kevin,

when creating the reader from the file containing the OME-XML metadata, the reader uses this metadata, notably to determine the image dimensions in XYZCT, and then tries to group the files required to construct the fileset.

As you can see from my previous command, if only the first and last TIFF files of the fileset are present, then only those files are registered in the fileset:

Code: Select all
$ showinf -nopix /ome/apache_repo/11318/1_CAM01_000000.tif
...
Used files:
   /ome/apache_repo/11318/1_CAM01_000000.tif
   /ome/apache_repo/11318/1_CAM01_000499.tif
Series count = 1
...


Then, when reading the pixel data from the image using bfGetPlane(planeIndex) under MATLAB, if the TIFF file for the requested plane is not present, the function will return an array of size sizeX x sizeY filled with zeros.

Sebastien
sbesson
Team Member
 
Posts: 421
Joined: Tue Feb 28, 2012 7:20 pm

Re: Read/Write/Memory Problems

Postby kdean » Mon Jul 27, 2015 8:33 pm

That makes perfect sense. Thank you, Seb. I can see why the computer would be unhappy performing the MATLAB equivalent of data = zeros(512, 512, 138, 500)...

After some analysis, it appears that the major bottleneck is XML generation, where the data is loaded into memory and the XML string is created. Moving forward, we will try to create the 'stub' XML file at the beginning of the data acquisition and save the TIFF files immediately with the stub XML embedded within them. To do this, we will need to calculate the master XML UUID at the beginning of the data acquisition.

The master XML file will need to be written after data acquisition is complete. This will be built up in parallel during the acquisition, and should be quick.
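The plan above can be sketched as follows (illustrative Python; class and attribute names are hypothetical, and the cut-down XML is not schema-valid OME, which would also need TiffData/Channel elements): fix the master UUID before acquisition so every per-file stub can reference it immediately, record cheap per-stack bookkeeping as stacks complete, and serialize the cumulative XML only once at the end.

```python
import uuid
import xml.etree.ElementTree as ET

OME_NS = "http://www.openmicroscopy.org/Schemas/OME/2015-01"
ET.register_namespace("", OME_NS)  # serialize with a default xmlns

class MasterMetadata:
    def __init__(self):
        # Known before the first plane is written, so per-file stubs
        # can reference the master document from the very start.
        self.master_uuid = f"urn:uuid:{uuid.uuid4()}"
        self.images = []

    def add_stack(self, image_id, size_z, size_t):
        # Called once per stack during acquisition; cheap bookkeeping only,
        # no XML string-building in the acquisition loop.
        self.images.append((image_id, size_z, size_t))

    def to_xml(self):
        # Single serialization pass after acquisition completes.
        root = ET.Element(f"{{{OME_NS}}}OME", {"UUID": self.master_uuid})
        for image_id, size_z, size_t in self.images:
            img = ET.SubElement(root, f"{{{OME_NS}}}Image", {"ID": image_id})
            ET.SubElement(img, f"{{{OME_NS}}}Pixels", {
                "SizeX": "512", "SizeY": "512", "SizeC": "1",
                "SizeZ": str(size_z), "SizeT": str(size_t),
                "Type": "uint16", "DimensionOrder": "XYZCT",
            })
        return ET.tostring(root, encoding="unicode")

meta = MasterMetadata()
meta.add_stack("Image:0", 138, 500)
print(meta.to_xml())
```

The design point is that the expensive step (building one large XML string) happens exactly once, rather than being repeated for every TIFF file in the sequence.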
kdean
 
Posts: 8
Joined: Mon Dec 10, 2012 1:35 am

Re: Read/Write/Memory Problems

Postby mlinkert » Tue Jul 28, 2015 10:55 pm

Hi Kevin,

If you haven't already, you might consider using the BinaryOnly/MetadataOnly feature of OME-TIFF in this case; see:

https://www.openmicroscopy.org/site/sup ... ata-blocks
http://www.openmicroscopy.org/Schemas/D ... BinaryOnly

This would require writing the fully-assembled OME-XML once to a text file, with each of the OME-TIFF files carrying a small XML stub that references that file. Depending upon the total number of files and the size of the complete XML, this may be faster than writing the full XML to multiple TIFFs after acquisition.
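As a sketch of what each per-file stub could look like under that scheme (the file name and UUIDs here are placeholders; see the BinaryOnly schema documentation linked above for the authoritative attribute definitions):

```xml
<!-- Per-file stub: the complete OME-XML lives once in the metadata file -->
<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2015-01"
     UUID="urn:uuid:11111111-1111-1111-1111-111111111111">
  <BinaryOnly MetadataFile="1_CAM01.companion.ome"
              UUID="urn:uuid:22222222-2222-2222-2222-222222222222"/>
</OME>
```

Each stub is a few hundred bytes regardless of how many planes the fileset contains, which is what makes this approach scale to the 50,000-300,000 plane sequences described above.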

-Melissa
mlinkert
Team Member
 
Posts: 353
Joined: Fri May 29, 2009 2:12 pm
Location: Southwest Wisconsin

