OMERO Integrity check

Postby Lipplab » Fri Jan 13, 2017 12:54 pm

Hi,
we've kind of stumbled into a self-made, possible inconsistency between the Postgres database and the image repository.
Before attempting any repair I would like to check the integrity between the db and the repository.
In particular, I want to know (i) whether all image pointers in the db actually point to corresponding data and (ii) (if possible) whether each piece of imaging data in the repository has a corresponding "link" in the db.
Any help would be most welcome,
Peter
Lipplab
 
Posts: 54
Joined: Thu Sep 12, 2013 9:56 am

Re: OMERO Integrity check

Postby jmoore » Mon Jan 16, 2017 10:53 am

Hi Peter,

sorry to hear you're having troubles. Before we begin digging, do you have a backup of the current state of your server? (See "Backup and Restore")

Lipplab wrote: Before attempting any repair I would like to check the integrity between the db and the repository.
In particular, I want to know (i) whether all image pointers in the db actually point to corresponding data and (ii) (if possible) whether each piece of imaging data in the repository has a corresponding "link" in the db.


There's no ready-made solution for this yet, though it's been discussed before: https://www.openmicroscopy.org/community/viewtopic.php?f=4&t=7666

A first thing to check is the "cleanse" tool (see https://www.openmicroscopy.org/site/support/omero5.2/developers/Modules/Delete.html#binary-data). Run it with the --dry-run option as the operating-system user that owns /OMERO, logging in as OMERO's root user, e.g.:

Code:
bin/omero -s root@localhost admin cleanse --dry-run /OMERO


This will show you which files on disk OMERO does not have in the database. The reverse check doesn't exist as a script in the OMERO server, though I'm pretty sure the community has written versions of it. Either a response may appear here, or I'll post an example shortly.

Cheers,
~Josh.
jmoore
Site Admin
 
Posts: 1591
Joined: Fri May 22, 2009 1:29 pm
Location: Germany

Re: OMERO Integrity check

Postby carandraug » Tue Jan 17, 2017 12:51 pm

Lipplab wrote: I would like to check the integrity between the db and the repository.
In particular, I want to know (i) whether all image pointers in the db actually point to corresponding data and (ii) (if possible) whether each piece of imaging data in the repository has a corresponding "link" in the db.


Hi Peter

We had an issue some time ago that required a tool like that. OMERO
covers #2, as Josh already replied. For #1 we never got around to
creating a script, but I have the series of commands and psql queries
we used. The commands below use a simple Python script named
check-omero-data, which we have online (note that it requires omeropy
to be on PYTHONPATH).

Code:
## Get list of pixel ids (filenames in Pixels) that are missing.
$ psql -d omerodb -c \
    "COPY (SELECT id
             FROM pixels
            WHERE repo IS NULL OR repo = '')
     TO STDOUT" | ./check-omero-data /srv/OMERO/ Pixels -

## Get list of originalfile ids (filenames in Files) that are missing.
$ psql -d omerodb -c \
     "COPY (SELECT id
              FROM originalfile
             WHERE (repo IS NULL OR repo = '')
                    AND mimetype != 'Repository')
      TO STDOUT" | ./check-omero-data /mnt/OMERO/ Files -

## This will include omero scripts of previous omero installs and you
## may want to ignore those
## http://lists.openmicroscopy.org.uk/pipermail/ome-users/2016-September/006172.html
## You can ignore all with mimetype 'text/x-python' or, if you are
## afraid of having other python files that are not the official omero
## scripts, you can filter them out later.  This will show the python
## files only.
$ psql -d omerodb -c \
    "COPY (SELECT f.id, f.hash, f.name
             FROM originalfile f JOIN checksumalgorithm h on f.hasher = h.id
            WHERE f.mimetype = 'text/x-python' AND f.repo IS null)
     TO STDOUT"


This only looks at the files inside the Files and Pixels
directories. It checks nothing within ManagedRepository.
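
For readers who do not have the check-omero-data script at hand, here is a minimal Python sketch of the kind of check the commands above perform. It is not the script referenced in this post. It assumes the legacy binary-repository layout in which ids above 999 are nested in Dir-NNN subdirectories (one level per extra group of three digits), so verify that against your own Files and Pixels directories before trusting the result. The function names legacy_path and find_missing are hypothetical.

```python
import os


def legacy_path(root, kind, obj_id):
    """Build the expected on-disk path for an id under Files/ or Pixels/.

    ASSUMPTION: ids above 999 live in nested Dir-NNN directories,
    e.g. id 1234567 -> <root>/Files/Dir-001/Dir-234/1234567.
    Check this against your own repository before relying on it.
    """
    parts = []
    remaining = obj_id
    while remaining > 999:
        remaining //= 1000
        # Each extra group of three digits adds one directory level,
        # with the most significant group outermost.
        parts.insert(0, "Dir-%03d" % (remaining % 1000))
    return os.path.join(root, kind, *(parts + [str(obj_id)]))


def find_missing(root, kind, ids):
    """Return the ids whose expected file does not exist on disk."""
    return [i for i in ids if not os.path.isfile(legacy_path(root, kind, i))]
```

To use it with the queries above, one could read one id per line from the psql output, convert each to int, and pass the list to find_missing together with the repository root and either "Files" or "Pixels".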

I have a bunch of other notes and SQL commands that I used to create
reports on the missing files. These include getting the list of
checksums of the missing files and looking for duplicates in OMERO (it
seems common for different people from the same lab to upload the same
image), the dates of the missing images, and the number of missing
images per group and person. I can share them with you if you need; I
just need to organize them, as they are an absolute mess of snippets
with cryptic comments. Let me know, and good luck.
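
As a rough illustration of the reports described above, the following Python sketch takes rows of (file id, checksum, owner) and a set of missing ids, then lists which missing files share a checksum with a file that is still present and counts missing files per owner. These are not the original notes or SQL. The row format and the name summarize_missing are hypothetical, and the rows are assumed to come from a query you write yourself.

```python
from collections import Counter, defaultdict


def summarize_missing(rows, missing_ids):
    """Report recoverable duplicates and per-owner counts of missing files.

    rows: iterable of (file_id, checksum, owner) tuples (hypothetical format).
    missing_ids: set of file ids that were not found on disk.
    """
    rows = list(rows)

    # Index files that are still on disk by their checksum.
    present_by_hash = defaultdict(list)
    for file_id, checksum, _owner in rows:
        if file_id not in missing_ids and checksum:
            present_by_hash[checksum].append(file_id)

    recoverable = {}      # missing id -> ids of present files with the same checksum
    per_owner = Counter() # owner -> number of missing files
    for file_id, checksum, owner in rows:
        if file_id in missing_ids:
            per_owner[owner] += 1
            if checksum in present_by_hash:
                recoverable[file_id] = present_by_hash[checksum]
    return recoverable, per_owner
```

A missing file that appears in the first result has at least one surviving copy with an identical checksum, which may make it a candidate for restoring from that copy.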
carandraug
 
Posts: 15
Joined: Mon Sep 06, 2010 8:50 pm

Re: OMERO Integrity check

Postby Lipplab » Tue Jan 17, 2017 2:13 pm

Dear Josh and carandraug,
many thanks for all your detailed responses.
In the meantime I managed, in a rescue effort lasting about 100 hours, to partially and manually "recover" a repository version that "should" correspond to our Postgres database version. I'm still inclined to give both of your suggestions a try.
I'll try both and report back to the forum once we have succeeded, failed, or got stuck ;)
Thanks again
Peter
Lipplab
 
Posts: 54
Joined: Thu Sep 12, 2013 9:56 am
