OMERO.insight: import directory skips files

Postby evenhuis » Thu Sep 27, 2018 8:46 am

Hi,

Sorry for the limited details; a user raised an issue just as I was leaving for a long weekend. I’d like to see if this is a known and/or resolved issue so I can give users some idea of the scope of the problem (which OSes, which file types).

The Problem

• When a user imports a directory in insight only about 3/4 of the image files that should have been imported appear on the import list. They are not missing due to failing on import with a red cross.

• If they go into the directory and select all files to import, then all the images are imported (and there’s a bunch of failed imports for random text files).

• The same files are always missed (it’s not random).

Details

• The operating system is CentOS and the OMERO.insight client is about 6 months old. OMERO.web is on 5.4.1-ice36-b75.

• The files are DeltaVision R3D.dv and D3D.dv files plus their log files. It seems to be just the R3D files that are being missed.

Once I get back I can get the rest of the information, check the logs, and update the installer to see if it still fails.

Cheers,

Chris
evenhuis
 
Posts: 61
Joined: Tue Jan 30, 2018 4:47 am

Re: OMERO.insight: import directory skips files

Postby Dominik » Thu Sep 27, 2018 9:32 am

Hi Chris,

we just released version 5.4.8 a few days ago: https://www.openmicroscopy.org/omero/downloads/

There have been some changes with respect to the import component. Might be worth a try, maybe it's not a problem any longer with the new version. OMERO.Insight 5.4.8 will be compatible with your server too.

If it still happens with 5.4.8, would it be possible to upload an example file (dv and log) ( https://www.openmicroscopy.org/qa2/qa/upload/ ) ? Would make it much easier to replicate the issue.

Regards,
Dominik
Dominik
Team Member
 
Posts: 149
Joined: Mon Feb 10, 2014 11:26 am

Re: OMERO.insight: import directory skips files

Postby evenhuis » Fri Sep 28, 2018 5:25 am

Hi Dominik,

Thanks for the reply. This is looking like a serious data loss for our facility. We have been encouraging users to upload files from acquisition computers to OMERO and then to delete files to free up space.

Test set
I've uploaded a directory with 48 image files and their associated log files to here
https://www.openmicroscopy.org/qa2/qa2/ ... f33d1888d3

Versions tested
I've tested this with the following insight clients.
windows 7 5.4.4-ice36-b82
mac OS 10.13 5.4.5-ice36-b83
CentOS 5.4.3-ice36-b77
CentOS 5.4.8-ice36-b99
They all show this behaviour of dropping files.


Testing
I made a series of directories of subsets of the images to see how the issue behaves. As you can see:
• Version of Insight does not affect the problem
• OS does affect the problem; CentOS is much worse
• which files are dropped depends on the number of files in the directory.
• only the raw R3D.dv files are dropped.

The results are organised like this:
Number of files : index of files dropped

Files dropped by CentOS (0 based index)
04 newInsight: 1
08 newInsight: 1
08 newInsight: 1
12 newInsight: 1
24 newInsight: 1, 13, 19
24 oldInsight: 1, 13, 19
48 oldInsight: 9, 13, 15, 17, 23, 27, 39, 47

Mac OS files dropped
24 : 15, 23
48 : 15, 23, 39, 41, 45, 47

Windows files dropped:
24 : 15, 23
48 : 15, 23, 39, 41, 45, 47

Thanks,

Chris
evenhuis
 
Posts: 61
Joined: Tue Jan 30, 2018 4:47 am

Re: OMERO.insight: import directory skips files

Postby Dominik » Fri Sep 28, 2018 2:23 pm

Hi Chris,

thanks for providing the files. We were able to replicate the problem.
It's an issue with the Bio-Formats DeltavisionReader. The *.R3D.dv and the *.R3D_D3D.dv files are related to each other. So if, for example, 'Coupon_dead02_snapshot_R3D_D3D.dv' is imported first, then the DeltavisionReader groups the following files together (output of the bftools showinf command):
filename = ../21768/Coupon_dead02_snapshot_R3D_D3D.dv
Used files:
../21768/Coupon_dead02_snapshot_R3D_D3D.dv
../21768/Coupon_dead02_snapshot_R3D.dv.log
../21768/Coupon_dead02_snapshot_R3D_D3D_log.txt

Then later when 'Coupon_dead02_snapshot_R3D.dv' is checked for import, the Importer detects that 'Coupon_dead02_snapshot_R3D.dv.log' is already marked for import with 'Coupon_dead02_snapshot_R3D_D3D.dv' and skips 'Coupon_dead02_snapshot_R3D.dv'.

If you select the individual files in the Importer, the files are checked in order and *.R3D.dv is always checked before *.R3D_D3D.dv, which is why no files are skipped in that case. But if a whole directory is selected, its contents are obtained by a filesystem call that returns the files unordered, so the case above can occur and files are skipped (this is also why the results differ between operating systems).
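The ordering point can be illustrated with a small Python sketch. It assumes that "in order" means plain lexicographic ordering of the file names, which this thread does not confirm; the file names are the ones from the uploaded test set. Python's os.listdir, like the filesystem call described above, makes no promise about order, so any ordering has to be applied explicitly.

```python
# Illustration only: lexicographic sorting puts each *_R3D.dv file ahead of
# its deconvolved *_R3D_D3D.dv counterpart, because "." (0x2E) sorts before
# "_" (0x5F). Whether Insight orders files this way is an assumption.
names = [
    "Coupon_dead02_snapshot_R3D_D3D.dv",
    "Coupon_dead02_snapshot_R3D_D3D_log.txt",
    "Coupon_dead02_snapshot_R3D.dv.log",
    "Coupon_dead02_snapshot_R3D.dv",
]

ordered = sorted(names)
for name in ordered:
    print(name)
```

With this ordering the raw R3D.dv file is always examined before the deconvolved file that shares its log.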

We're going to fix that immediately. Unfortunately, this issue must have been around for a long time.

Regards,
Dominik
Dominik
Team Member
 
Posts: 149
Joined: Mon Feb 10, 2014 11:26 am

Re: OMERO.insight: import directory skips files

Postby evenhuis » Fri Sep 28, 2018 6:49 pm

Hi Dominik,

Thanks for tracking that down. We have advised our users to stop using the import folder method and to open the directory and select all files.

Could you please provide a list of all file types impacted by this bug?

Thanks,

Chris
evenhuis
 
Posts: 61
Joined: Tue Jan 30, 2018 4:47 am

Re: OMERO.insight: import directory skips files

Postby evenhuis » Tue Oct 02, 2018 3:19 am

Hi Dominik,

we have quite a menagerie of .dv files stored in OMERO:
    R3D.dv
    D3D_R3D.dv
    D3D_ALX.dv
    SIR.dv
    SIR_D3D.dv
    SIR_ALX.dv
    SIR_PRJ.dv
    SIR_ALX_PRJ.dv
    SIR_VOL.dv
and I'm sure there's more with other imaging modes.

• Would these file types be similarly impacted by the bug?

• Is the bug restricted to .dv files or are there other microscope formats that have associated log files which are affected too?

Thanks,

Chris
evenhuis
 
Posts: 61
Joined: Tue Jan 30, 2018 4:47 am

Re: OMERO.insight: import directory skips files

Postby sbesson » Tue Oct 02, 2018 2:47 pm

Hi Chris,

Sorry for the delay. We have been quite busy over the last few days trying to investigate in order to be as precise as possible.

Summarizing our findings, the primary issue seems to be related to the detection of filesets in OMERO. In the case of multi-file formats where some files are shared between filesets, it is possible that some files are lost from the import candidates detection. This only happens under some conditions:
  • the detected filesets need to have partial overlap in terms of files
  • the files in the filesets must be grouped from any file
  • the order in which the files are scanned matters

This is the case for a DeltaVision folder containing both the original and the deconvolved data: the original log file is detected as part of both the original image fileset and the deconvolved image fileset, with the result that only one of those filesets is imported.
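As a rough illustration of the first condition, the Python sketch below flags pairs of filesets that share some files without being identical. It is not OMERO's detection code; the function name partially_overlapping and the reading of "partial overlap" as "non-empty intersection, but not the same set" are assumptions made for this example. The file names are those reported for the DeltaVision test data in this thread.

```python
from itertools import combinations


def partially_overlapping(filesets):
    """Return (image_a, image_b, shared_files) for every pair of filesets
    that share at least one file without being identical."""
    hits = []
    for (a, files_a), (b, files_b) in combinations(sorted(filesets.items()), 2):
        shared = set(files_a) & set(files_b)
        if shared and set(files_a) != set(files_b):
            hits.append((a, b, sorted(shared)))
    return hits


# Filesets as reported for one raw/deconvolved pair in this thread.
filesets = {
    "Coupon_dead02_snapshot_R3D.dv": [
        "Coupon_dead02_snapshot_R3D.dv",
        "Coupon_dead02_snapshot_R3D.dv.log",
    ],
    "Coupon_dead02_snapshot_R3D_D3D.dv": [
        "Coupon_dead02_snapshot_R3D_D3D.dv",
        "Coupon_dead02_snapshot_R3D.dv.log",
        "Coupon_dead02_snapshot_R3D_D3D_log.txt",
    ],
}

for a, b, shared in partially_overlapping(filesets):
    print(f"{a} and {b} share {shared}")
```

Any pair reported here is a candidate for the skipping behaviour, subject to the other two conditions.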

To the best of our knowledge, we have had no community report of other file formats with this behavior. We have not been able to perform a thorough audit of all potentially affected file formats. The conservative but pessimistic assumption is that any format composed of multiple files as documented in the dataset structure table could be at risk. Looking into the details, only a much smaller subset of these file formats can actually lead to this scenario. So far, we have only been able to reproduce the issue with synthetic NRRD filesets.

Given the severity of the problem, we are treating it with the highest priority. We are investigating a fix which should result in a non-ambiguous detection of import candidates independent of the client, operating system or file format. We hope to have this fix reviewed, tested and shipped in a release of OMERO next week.

As a general rule of caution, in this use case like others, OMERO does not hide any images: multiple DeltaVision datasets should be viewable as separate images in any of the OMERO clients. We strongly recommend that users check that all their data has been successfully imported before deleting the original source of the data, especially if no backup exists.

Thanks again for bearing with us. We will update this thread as soon as a robust solution is found and released publicly.

Best,
Sebastien
sbesson
Team Member
 
Posts: 421
Joined: Tue Feb 28, 2012 7:20 pm

Re: OMERO.insight: import directory skips files

Postby Dominik » Wed Oct 17, 2018 9:54 am

Hi Chris,

sorry, I forgot to post an update to this thread. Just in case you haven't seen it already: we released version 5.4.9 a few days ago, which includes the respective bug fix: https://www.openmicroscopy.org/2018/10/ ... 5-4-9.html

Kind Regards,
Dominik
Dominik
Team Member
 
Posts: 149
Joined: Mon Feb 10, 2014 11:26 am

Re: OMERO.insight: import directory skips files

Postby evenhuis » Wed Oct 17, 2018 10:48 am

Thanks for the update Dominik,

I did an audit of the files stored on OMERO to check which D3D files didn't have the corresponding R3D. Out of 7078 D3D files, 1888 lacked the R3D file, a loss of about 27%. The larger the dataset imported, the higher the chance of loss.
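An audit of this kind can be sketched in a few lines of Python. This is not the script used for the numbers above; the function name missing_raw_files is hypothetical, the pairing rule (X_R3D_D3D.dv belongs with X_R3D.dv) is taken from the file names in this thread, and how the list of image names is obtained from OMERO is left out.

```python
def missing_raw_files(names, raw_suffix="_R3D.dv", decon_suffix="_R3D_D3D.dv"):
    """Return the raw file names that should exist for each deconvolved
    file in `names` but are absent."""
    present = set(names)
    missing = []
    for name in sorted(present):
        if name.endswith(decon_suffix):
            raw = name[: -len(decon_suffix)] + raw_suffix
            if raw not in present:
                missing.append(raw)
    return missing


# Hypothetical listing: the raw file for dead02 was skipped on import.
names = [
    "Coupon_dead01_snapshot_R3D.dv",
    "Coupon_dead01_snapshot_R3D_D3D.dv",
    "Coupon_dead02_snapshot_R3D_D3D.dv",
]
print(missing_raw_files(names))  # ['Coupon_dead02_snapshot_R3D.dv']

# Loss rate for the figures quoted in this post.
loss = 1888 / 7078
print(f"{loss:.1%}")  # 26.7%
```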

Now that the bug is understood, do you have any idea how ALX.dv, SIR.dv files etc. are affected? We have a lot of these files too and I don't have the time to test this.

One thing that would make life a lot easier is if the importer would skip an upload when a file with the same name already exists in the same project/dataset (maybe checking the checksum too).

This would allow us to just point the importer at the directories to pick up the missed files rather than painstakingly tracking them down.

The other case where this would be handy is when an upload is interrupted for some reason: users could then pick up where they left off. At the moment, rather than deleting the duplicates, I get users to delete the whole dataset and upload again.
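To make the request concrete, here is a minimal Python sketch of the behaviour I have in mind. It is not an existing Insight feature; the function files_to_upload and both dictionaries are hypothetical, and SHA-1 is used only as an example checksum.

```python
import hashlib


def files_to_upload(local_files, imported):
    """Return the local file names that still need uploading.

    local_files: name -> file contents (bytes) on the acquisition machine
    imported:    name -> SHA-1 hex digest of files already in the dataset
    A file is skipped only if both its name and its checksum match.
    """
    pending = []
    for name, data in sorted(local_files.items()):
        digest = hashlib.sha1(data).hexdigest()
        if imported.get(name) != digest:
            pending.append(name)
    return pending


local_files = {"a_R3D.dv": b"raw pixels", "a_R3D_D3D.dv": b"deconvolved pixels"}
imported = {"a_R3D_D3D.dv": hashlib.sha1(b"deconvolved pixels").hexdigest()}
print(files_to_upload(local_files, imported))  # ['a_R3D.dv']
```

Re-running an import over the same directory would then only pick up the files that were missed or interrupted.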

Thanks,

Chris
evenhuis
 
Posts: 61
Joined: Tue Jan 30, 2018 4:47 am

Re: OMERO.insight: import directory skips files

Postby Dominik » Thu Oct 18, 2018 2:06 pm

Hi Chris,

as far as I know, there is no option in Insight/Importer to skip previously imported files. The command line importer has this option ("exclude: clientpath"), https://docs.openmicroscopy.org/omero/5 ... er-options . I agree, that would be a very useful option for the GUI importer as well. I will bring that up for discussion.

I don't know how your other *.dv files might have been affected. You could check a few of them with the bftools showinf command. If you always have a unique pair of xyz.dv / xyz.dv.log then everything should be fine. But if you have something like this with the D3D.dv example:

$ ./showinf -nopix ../21768/Coupon_dead02_snapshot_R3D_D3D.dv
filename = ../21768/Coupon_dead02_snapshot_R3D_D3D.dv
Used files:
../21768/Coupon_dead02_snapshot_R3D_D3D.dv
../21768/Coupon_dead02_snapshot_R3D.dv.log
../21768/Coupon_dead02_snapshot_R3D_D3D_log.txt

$ ./showinf -nopix ../21768/Coupon_dead02_snapshot_R3D.dv
filename = ../21768/Coupon_dead02_snapshot_R3D.dv
Used files:
../21768/Coupon_dead02_snapshot_R3D.dv
../21768/Coupon_dead02_snapshot_R3D.dv.log

That is, if the R3D.dv.log file is associated with both the R3D_D3D.dv and the R3D.dv file, then some of these files could have been skipped.
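If you have many files to check, the comparison can be scripted. The Python sketch below is only an illustration: it assumes the showinf output has the shape quoted above ("filename = ..." followed by a "Used files:" list that ends at a blank line or at the next "key = value" line), and the helper names parse_used_files and shared_files are made up for this example.

```python
from collections import defaultdict


def parse_used_files(text):
    """Map each 'filename = ...' entry to the paths listed under 'Used files:'."""
    used = {}
    current = None
    in_list = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("filename = "):
            current = line[len("filename = "):]
            used[current] = []
            in_list = False
        elif line == "Used files:":
            in_list = current is not None
        elif in_list and line and " = " not in line and not line.startswith("$"):
            used[current].append(line)
        else:
            in_list = False  # blank line or unrelated output ends the list
    return used


def shared_files(used):
    """Return the files that appear in the used-files list of more than one image."""
    owners = defaultdict(list)
    for image, files in used.items():
        for path in files:
            owners[path].append(image)
    return {path: images for path, images in owners.items() if len(images) > 1}


sample = """\
filename = ../21768/Coupon_dead02_snapshot_R3D_D3D.dv
Used files:
../21768/Coupon_dead02_snapshot_R3D_D3D.dv
../21768/Coupon_dead02_snapshot_R3D.dv.log
../21768/Coupon_dead02_snapshot_R3D_D3D_log.txt

filename = ../21768/Coupon_dead02_snapshot_R3D.dv
Used files:
../21768/Coupon_dead02_snapshot_R3D.dv
../21768/Coupon_dead02_snapshot_R3D.dv.log
"""

for path, images in shared_files(parse_used_files(sample)).items():
    print(path, "is used by", images)
```

Any file reported as shared between two images marks a pair that may have been affected.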

Kind Regards,
Dominik
Dominik
Team Member
 
Posts: 149
Joined: Mon Feb 10, 2014 11:26 am

