Problem importing partial datasets

General user discussion about using the OMERO platform to its fullest. Please ask new questions at https://forum.image.sc/tags/omero
Post by PhilippeP » Mon May 23, 2016 11:27 am

Hi List,

We acquire datasets with the Micro-Manager HCS plugin. A typical dataset consists of 1200 individual .tif images, corresponding to one 20x20 grid (1z) in each of 3 wells. Each .tif file comes with its own metadata .txt file. Altogether, that makes 2400 files, more than the 2000-file limit for Insight import...

- If we "split" the data set in 3 folders (one per well, with 400 .tif and 400 corresponding metadata .txt), OMERO does not see any data to import (I guess because we tinkered with the original dataset?).

- If we try to import all the 1200 .tif files without the .txt metadata, it is weird: we do see 1200 files in the Insight importer, but the first "file" being imported is said to be 25GB (which is the size of the whole 1200 images). After the loooong 45 min import of the "first" .tif file, OMERO starts processing it, and fails because of missing metadata... The "second" .tif image, also said to be 25GB, then starts being imported, and the same problem repeats. We have to interrupt Insight.

Is there a way to import part of big datasets (or whole datasets) with a GUI?

Thanks for your time.

Insight (Windows) and OMERO (Linux) are both v5.2
Last edited by PhilippeP on Wed May 25, 2016 7:10 am, edited 1 time in total.

Re: Problem importing partial datasets

Post by sbesson » Tue May 24, 2016 11:50 am

Hi Philippe,

> If we "split" the data set in 3 folders (one per well, with 400 .tif and 400 corresponding metadata .txt), OMERO does not see any data to import (I guess because we tinkered with the original dataset?).

Depending on the initial data structure and how it was split, it might be that file links are broken and Bio-Formats cannot recognize the whole fileset.

> If we try to import all the 1200 .tif files without the .txt metadata, it is weird: we do see 1200 files in the Insight importer, but the first "file" being imported is said to be 25GB (which is the size of the whole 1200 images). After the loooong 45 min import of the "first" .tif file, OMERO starts processing it, and fails because of missing metadata... The "second" .tif image, also said to be 25GB, then starts being imported, and the same problem repeats. We have to interrupt Insight.

Assuming the original data is saved as image file stacks, each TIF will be recognized as an OME-TIFF fileset and all TIFF files will be grouped together at import - see notes at the bottom of http://www.openmicroscopy.org/site/supp ... nager.html. Selecting multiple TIFF files in the Insight importer will indeed trigger multiple imports of the same fileset at the moment. To work around this issue and import only one fileset, you can select either one TIFF file or even the parent folder.
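
As a rough illustration of the "select one TIFF file" workaround, the sketch below picks a single entry file from a folder instead of passing every TIFF to the importer. The helper name `pick_import_target` and the `*.tif` pattern are hypothetical, and the sketch assumes that all TIFF files in the folder belong to the same fileset, so that any one of them can serve as the entry point.

```python
from pathlib import Path


def pick_import_target(folder, pattern="*.tif"):
    """Return one TIFF from `folder` to hand to the importer, or None.

    Assumption: every TIFF in the folder belongs to the same fileset,
    so importing a single file pulls in the rest of the group.
    """
    # Sort so that the choice is reproducible between runs.
    candidates = sorted(Path(folder).glob(pattern))
    return candidates[0] if candidates else None
```

Calling `pick_import_target` on the dataset folder returns one path (the alphabetically first TIFF) that can then be selected for import, rather than all 1200 files.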

Do you have a more detailed error message for the server-side processing failures? Alternatively, do you have a smaller Micro-Manager HCS dataset that could be uploaded at https://www.openmicroscopy.org/qa2/qa/upload/? That would help us reproduce the failure.

> Is there a way to import part of big datasets (or whole datasets) with a GUI?

Sadly, we do not have a solution for partial imports at the moment. As you saw, our graphical clients have limitations when it comes to importing large datasets. For such operations, we usually recommend using the command-line importer, either directly or invoked from other tools (dropbox, scripts).

Best,
Sebastien

Re: Problem importing partial datasets

Post by sbesson » Wed Jun 08, 2016 1:19 pm

Hi Philippe,

Following a recent upload of a file listing for an HCS fileset generated by Micro-Manager, I have spent some time investigating this thread. Using both the Micro-Manager HCS plugin and the demo configuration and following the initial description, I generated a fileset made of 2400 files: 1200 .ome.tif and 1200 metadata.txt.

I tried the import of this fileset into an OMERO server using the command-line importer with various advanced options:

$ bin/omero import --skip all /ome/data_repo/inbox/micromanager/Test_Dataset_MultiPosition_Stack_7/Test_Dataset_MultiPosition_Stack_7_MMStack_A1-Site_0.ome.tif -- --transfer=ln_s
...
==> Summary
1200 files uploaded, 1 fileset created, 1200 images imported, 0 errors in 0:05:52.400

$ bin/omero import --skip thumbnails /ome/data_repo/inbox/micromanager/Test_Dataset_MultiPosition_Stack_7/Test_Dataset_MultiPosition_Stack_7_MMStack_A1-Site_0.ome.tif -- --transfer=ln_s
...
1200 files uploaded, 1 fileset created, 1200 images imported, 0 errors in 0:08:02.600

$ bin/omero import /ome/data_repo/inbox/micromanager/Test_Dataset_MultiPosition_Stack_7/Test_Dataset_MultiPosition_Stack_7_MMStack_A1-Site_0.ome.tif -- --transfer=ln_s
...
==> Summary
1200 files uploaded, 1 fileset created, 1200 images imported, 0 errors in 0:04:16:55.539
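
For anyone who wants to drive these invocations from a script, here is a minimal Python sketch that assembles the same argument list and hands it to `subprocess`. The function names `build_import_command` and `run_import` are hypothetical, and the sketch assumes it is started from the OMERO server directory so that `bin/omero` resolves; it only reuses the `--skip` and `--transfer` options shown above.

```python
import subprocess


def build_import_command(target, omero_bin="bin/omero", skip=None, transfer=None):
    """Return the argument list for one command-line import of `target`."""
    cmd = [omero_bin, "import"]
    if skip:
        # e.g. skip="all" or skip="thumbnails", as in the commands above
        cmd += ["--skip", skip]
    cmd.append(str(target))
    if transfer:
        # The commands above place --transfer after a bare "--" separator.
        cmd += ["--", f"--transfer={transfer}"]
    return cmd


def run_import(target, **options):
    """Run the import and return the exit code of the CLI process."""
    return subprocess.run(build_import_command(target, **options)).returncode
```

For example, `build_import_command(path, skip="all", transfer="ln_s")` reproduces the first invocation for a given `path`, and `run_import` with the same arguments executes it.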


First of all, I was not able to reproduce the missing metadata error you reported previously, and I cannot spot any obvious issue in the file listing that was uploaded. To investigate further, we would need a complete representative fileset that fails to import, as mentioned in my previous answer.

Note that in all cases above the import completed successfully, but the completion time depends heavily on which individual steps are performed. Based on the timings above, it seems transfer time and thumbnail generation could be the major bottlenecks for these filesets. The basic and advanced documentation of the command-line importer describes various options that might be worth investigating to tune import performance for large filesets.
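
To put the three timings side by side, the short Python sketch below converts the durations reported in the summaries into seconds and expresses each as a multiple of the `--skip all` run. The durations are copied from the output above; the helper name `parse_elapsed` and the labels are only illustrative.

```python
from datetime import timedelta


def parse_elapsed(text):
    """Convert an 'H:MM:SS.mmm' duration from the import summary to a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


# Durations copied from the three import summaries above.
runs = {
    "--skip all": "0:05:52.400",
    "--skip thumbnails": "0:08:02.600",
    "no --skip": "4:16:55.539",
}

baseline = parse_elapsed(runs["--skip all"])
for label, text in runs.items():
    elapsed = parse_elapsed(text)
    # Dividing two timedeltas gives a plain float ratio.
    print(f"{label:18s} {elapsed.total_seconds():9.1f} s  x{elapsed / baseline:.1f}")
```

On these numbers the full import takes roughly 44 times as long as the `--skip all` run, whereas skipping only thumbnails stays within about 1.4 times of it.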

Best,
Sebastien

