
Importing a large number of files

Posted: Mon Jan 31, 2011 10:46 am
by helmerj
Hi,

For the last two weeks I have been struggling to import large numbers of files into the Omero system. For a client I am setting up an automated import system which is supposed to handle a vast number of images every night. We are talking about up to 118000 images in one session.

This is our current workflow:
  • transfer of files to the machine running the Omero server
  • adding metadata
  • import to Omero

I have attempted to use the drop box system (http://www.openmicroscopy.org/community/viewtopic.php?f=4&t=598) but it lacks features that are essential for what I need to accomplish.
Recently I have attempted to make the command-line importer work. Either way the import fails after a varying number of files (between 400 and 1300).

Error:
java.io.FileNotFoundException: /home/helmerj/rails/metaxpress/transfer-script/out0500/20100825 rbilly_C13_w376F74D57-F0AE-4C0F-A273-A573DC6A1237.ome.tif (Too many open files)
   at java.io.RandomAccessFile.open(Native Method)
   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
   at loci.common.NIOFileHandle.<init>(NIOFileHandle.java:100)
   at loci.common.NIOFileHandle.<init>(NIOFileHandle.java:111)
   at loci.common.NIOFileHandle.<init>(NIOFileHandle.java:119)
   at loci.common.Location.getHandle(Location.java:196)
   at loci.common.Location.getHandle(Location.java:167)
   at loci.common.RandomAccessInputStream.<init>(RandomAccessInputStream.java:71)
   at loci.formats.in.OMETiffReader.openBytes(OMETiffReader.java:206)
   at loci.formats.FormatReader.openBytes(FormatReader.java:739)
   at loci.formats.ImageReader.openBytes(ImageReader.java:370)
   at loci.formats.ChannelFiller.getLookupTableComponentCount(ChannelFiller.java:262)
   at loci.formats.ChannelFiller.setId(ChannelFiller.java:245)
   at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:480)
   at loci.formats.ChannelSeparator.setId(ChannelSeparator.java:238)
   at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:480)
   at ome.formats.importer.ImportLibrary.open(ImportLibrary.java:245)
   at ome.formats.importer.ImportLibrary.importImage(ImportLibrary.java:483)
   at ome.formats.importer.ImportLibrary.importCandidates(ImportLibrary.java:223)
   at ome.formats.importer.cli.CommandLineImporter.start(CommandLineImporter.java:128)
   at ome.formats.importer.cli.CommandLineImporter.main(CommandLineImporter.java:366)
2011-01-31 11:44:20,272 657081     [      main] ERROR     ome.formats.importer.cli.ErrorHandler  - FILE_EXCEPTION: /home/helmerj/rails/metaxpress/transfer-script/out0500/20100825 rbilly_C13_w376F74D57-F0AE-4C0F-A273-A573DC6A1237.ome.tif
java.io.FileNotFoundException: /home/helmerj/rails/metaxpress/transfer-script/out0500/20100825 rbilly_C13_w376F74D57-F0AE-4C0F-A273-A573DC6A1237.ome.tif (Too many open files)
   at java.io.RandomAccessFile.open(Native Method)
   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
   at loci.common.NIOFileHandle.<init>(NIOFileHandle.java:100)
   at loci.common.NIOFileHandle.<init>(NIOFileHandle.java:111)
   at loci.common.NIOFileHandle.<init>(NIOFileHandle.java:119)
   at loci.common.Location.getHandle(Location.java:196)
   at loci.common.Location.getHandle(Location.java:167)
   at loci.common.RandomAccessInputStream.<init>(RandomAccessInputStream.java:71)
   at loci.formats.in.OMETiffReader.openBytes(OMETiffReader.java:206)
   at loci.formats.FormatReader.openBytes(FormatReader.java:739)
   at loci.formats.ImageReader.openBytes(ImageReader.java:370)
   at loci.formats.ChannelFiller.getLookupTableComponentCount(ChannelFiller.java:262)
   at loci.formats.ChannelFiller.setId(ChannelFiller.java:245)
   at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:480)
   at loci.formats.ChannelSeparator.setId(ChannelSeparator.java:238)
   at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:480)
   at ome.formats.importer.ImportLibrary.open(ImportLibrary.java:245)
   at ome.formats.importer.ImportLibrary.importImage(ImportLibrary.java:483)
   at ome.formats.importer.ImportLibrary.importCandidates(ImportLibrary.java:223)
   at ome.formats.importer.cli.CommandLineImporter.start(CommandLineImporter.java:128)
   at ome.formats.importer.cli.CommandLineImporter.main(CommandLineImporter.java:366)
2011-01-31 11:44:20,276 657085     [      main] INFO         ome.formats.importer.ImportLibrary  - Exiting on error



Here are my questions:

  • Which parameters can be tuned to improve import speed and memory management? I have increased the maximum heap size to 4 GB, but importing more than 342 images still fails.
  • What is the reason for the error I am getting above? I suppose I could import the images in batches of 100 or 200 files, but that would be really cumbersome with 118000 files to import...
  • Is it possible to speed up the import by pre-processing the image files so that no pixel conversion has to be applied to the image data?
  • What is the internal file format in the /OMERO/Pixels/ and /OMERO/Thumbnails directories?
  • How can I file bug reports? I have searched the Omero website and I know about the trac system, but how do I get an account so I can file proper bug reports with full debug information?

Any help on the matter would be greatly appreciated. Is anybody using the Omero system able to import large numbers of files? With work being done towards an HCS module, I would have assumed that large numbers of files pose no problem...

Cheers Juergen

Re: Importing a large number of files

Posted: Mon Jan 31, 2011 2:34 pm
by wmoore
Hi Juergen,

We think this issue might have been fixed already (in the 4.2.2 release). It is a problem with too many file handles (OMERO is not releasing them after importing each file). Are you using an older version?
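To check whether an import process really is running out of file handles, a minimal Python sketch along these lines may help. It is Linux only, since it reads /proc, and the function names are made up for illustration; nothing here is part of OMERO.

```python
import os
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(pid="self"):
    """Count the open file descriptors of a process via /proc (Linux only).

    Pass the PID of a running importer to inspect it; reading another
    user's process requires sufficient permissions.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    print(f"descriptors open in this process: {count_open_fds()}")
```

If the count for the importer's PID climbs steadily with each imported file and approaches the soft limit, that points to a handle leak rather than a heap size problem. Raising the limit (for example with ulimit -n in the shell that starts the import) would only postpone the failure.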

Images in OMERO are stored as pure pixel data on disk, with the metadata (image dimensions, pixel type, etc.) in the database. If you look in /OMERO/ (or wherever you store the binary data), you will see Pixels and Thumbnails folders containing files named by image ID.
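As a rough sanity check on a file in the Pixels folder, one can compare its size on disk with the size implied by the image dimensions. The sketch below assumes the file holds nothing but the raw pixel values, with no header or padding; the function name and the pixel-type table are illustrative, not an OMERO API, and the dimensions would have to come from the database or a client.

```python
# Bytes per pixel for some common pixel types (names chosen for illustration).
BYTES_PER_PIXEL = {
    "int8": 1, "uint8": 1,
    "int16": 2, "uint16": 2,
    "int32": 4, "uint32": 4,
    "float": 4, "double": 8,
}


def expected_pixels_size(size_x, size_y, size_z, size_c, size_t, pixel_type):
    """Size in bytes of a headerless raw pixel file with the given dimensions."""
    planes = size_z * size_c * size_t
    return size_x * size_y * planes * BYTES_PER_PIXEL[pixel_type]


if __name__ == "__main__":
    # A 512 x 512 image, one Z section, three channels, one time point, 16-bit.
    print(expected_pixels_size(512, 512, 1, 3, 1, "uint16"))  # prints 1572864
```

Comparing this number with os.path.getsize() on the corresponding file is a quick way to test the assumption for a given installation.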

We are happy to receive bug reports on the forum or e-mail lists. If you submit errors from the clients using the "Error dialogs", they end up here: http://qa.openmicroscopy.org.uk/ but for server issues, just use the forum or e-mail lists and include a stack trace etc. You can always e-mail an individual later if you need to attach a load of server logs etc.

Cheers,

Will.

Re: Importing a large number of files

Posted: Mon Jan 31, 2011 3:19 pm
by helmerj
Hi Will,

I am using version 4.2.2beta (the latest from the website). Right now I am importing in batches of 100 files, and that does work. It would be great, though, if I could increase the batch size in order to minimize the overhead of starting the Java VM for each import session.
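The batching workaround can be scripted roughly as follows. This is only a sketch: it assumes the importer accepts several file paths per invocation, the helper names are made up, and the command in the example is an echo placeholder that has to be replaced with the real importer call and its connection options.

```python
import subprocess
from pathlib import Path


def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def import_in_batches(files, base_command, batch_size=100):
    """Run `base_command` once per batch, appending the batch's file paths.

    Returns the batches whose command exited with a non-zero status, so
    they can be retried or inspected later.
    """
    failed = []
    for batch in batches(list(files), batch_size):
        result = subprocess.run(list(base_command) + [str(f) for f in batch])
        if result.returncode != 0:
            failed.append(batch)
    return failed


if __name__ == "__main__":
    # Hypothetical directory and placeholder command: substitute the real
    # importer invocation for your installation.
    files = sorted(Path("out0500").glob("*.ome.tif"))
    failed = import_in_batches(files, ["echo", "would import:"], batch_size=100)
    print(f"{len(failed)} batch(es) failed")
```

Each batch starts a fresh process, so any file handles leaked during one batch are released when it exits. The price is one Java VM start-up per batch, which is why a larger batch_size is attractive once the leak itself is fixed.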

Cheers Juergen

Re: Importing a large number of files

Posted: Tue Feb 01, 2011 11:47 am
by wmoore
Hi Juergen,

We have created a ticket for this bug. We'll see if we can reproduce and fix it!
https://trac.openmicroscopy.org.uk/omero/ticket/4195

Will.

Re: Importing a large number of files

Posted: Tue Feb 01, 2011 8:29 pm
by wmoore
Hi Juergen,

I see that Chris closed the ticket for this bug, so hopefully you'll find it fixed in the next release.

Will.

Re: Importing a large number of files

Posted: Thu Feb 03, 2011 12:39 pm
by cxallan
The Bio-Formats fixes (a file handle leak) have been backported to the 4.2 branch. You can get access to a build that has these changes here:

http://hudson.openmicroscopy.org.uk/vie ... eta4.2/71/

If you're using the CLI import via bin/omero import you will need to upgrade your server installation.