Open Microscopy Environment

by **ClayB** » Fri Jun 26, 2015 9:35 pm

New machines, similar problem.

After copying the ManagedRepository to a new location, setting the new location with the "bin/omero config" command, and restarting the server, I tried to import an image and got this message:

Code:
    2015-06-26 12:32:54,384 12185 [ main] ERROR ome.formats.importer.ImportLibrary - Error on import
    java.lang.RuntimeException: Cannot exclusively use the managed repository.

Looking deeper, I found that the confirmation message of the data directory move was NOT in the Blitz-0.log. I tried the config command again and no message was printed on the screen. However, looking into the Blitz-0.log file I saw:

Code:
    2015-06-26 14:03:37,375 ERROR [ o.s.blitz.repo.AbstractRepositoryI] (2-thread-1) Failed during repository takeover

(Most recent 5000 lines of Blitz-0.log file have been attached.)

There were no image files imported into the ManagedRepository before the move was attempted. (On a sister server that is having the same problem now, a test image was loaded and copied over to the new location.) Every directory on the path from / down to the OMERO directory that I want to hold the ManagedRepository files is owned by root (Linux), all with drwxr-xr-x permissions. The OMERO directory itself is owned by the (Linux) omero account:

Code:
    drwxr-xr-x 3 omero ccc 4096 Jun 26 11:47 OMERO

Any ideas on how to get this set up as I need or what I did wrong (this time)?

by **jmoore** » Mon Jun 29, 2015 10:41 am

Hi Clay,

A couple of questions before a full response:

From where to where are you moving the data.dir?
What type of file systems are involved?
Can you describe the exact steps you took?

Cheers,
~Josh

by **ClayB** » Mon Jun 29, 2015 3:29 pm

jmoore wrote: A couple of questions before a full response:

jmoore wrote: From where to where are you moving the data.dir?

The move is from the original location specified during installation (/mnt/app_hdd/omero/omero_server) to /cluster_share/tools/imaging/OMERO.

jmoore wrote: What type of file systems are involved?

The original site is NFS while the target file system is Lustre.

jmoore wrote: Can you describe the exact steps you took?

a. Create OMERO directory in /cluster_share

> mkdir /cluster_share/tools/imaging/OMERO

b. Copy current ManagedRepository to shared area

> cp -r omero_server/ManagedRepository /cluster_share/tools/imaging/OMERO

c. Configure OMERO server to point at new MR location

> OMERO.server/bin/omero config set omero.managed.dir /cluster_share/tools/imaging/OMERO/ManagedRepository

d. Restart OMERO server (with new location of MR)

> OMERO.server/bin/omero admin restart

by **ClayB** » Mon Jun 29, 2015 3:51 pm

ClayB wrote: The original site is NFS while the target file system is Lustre.

My bad. I just checked with the sysadmin. The original file system is EXT4.

The move is necessary since the compute nodes in the cluster don't have access to /mnt/app_hdd, but do have access to everything in the /cluster_share system.

by **jmoore** » Tue Jun 30, 2015 9:46 am

ClayB wrote:
jmoore wrote: A couple of questions before a full response:

From where to where are you moving the data.dir?

The move is from the original location specified during installation (/mnt/app_hdd/omero/omero_server) to /cluster_share/tools/imaging/OMERO.

Makes sense. And there's no change to ${omero.data.dir} itself, correct?

What type of file systems are involved?

The original site is EXT4 while the target file system is Lustre.

Thanks. I was worried that we were running into NFS issues.

Can you describe the exact steps you took?
a. Create OMERO directory in /cluster_share

> mkdir /cluster_share/tools/imaging/OMERO

b. Copy current ManagedRepository to shared area

> cp -r omero_server/ManagedRepository /cluster_share/tools/imaging/OMERO

c. Configure OMERO server to point at new MR location

> OMERO.server/bin/omero config set omero.managed.dir /cluster_share/tools/imaging/OMERO/ManagedRepository

d. Restart OMERO server (with new location of MR)

> OMERO.server/bin/omero admin restart

Thanks for the detailed steps, Clay! I've tried to reproduce with the following:

Code:
    # default.sh
    NAME=ome9
    OMERO=`pwd`/dist/bin/omero
    rm -rf `pwd`/dist/var
    $OMERO admin stop
    set -e
    set -u
    $OMERO version
    dropdb $NAME
    createdb $NAME
    $OMERO db script --password ome -f- | psql $NAME
    rm -rf /tmp/$NAME
    mkdir /tmp/$NAME
    cd /tmp/$NAME
    mkdir data
    $OMERO config set omero.data.dir `pwd`/data
    $OMERO admin start
    $OMERO admin waitup
    $OMERO -s root@localhost -w ome fs repos

and

Code:
    # copied.sh
    set -e
    set -u
    NAME=ome9
    OMERO=`pwd`/dist/bin/omero
    $OMERO admin stop
    cd /tmp/$NAME
    COPIED=`pwd`/copied.dir/OMERO
    mkdir -p $COPIED
    $OMERO config set omero.managed.dir $COPIED/ManagedRepository
    cp -r data/ManagedRepository $COPIED
    $OMERO admin start
    $OMERO admin waitup
    $OMERO -s root@localhost -w ome fs repos

But on doing so, I see this in my logs:

Code:
    /opt/ome9$ grep "updated to" dist/var/log/Blitz-0.log
    2015-06-30 11:30:11,841 WARN [ o.s.blitz.repo.AbstractRepositoryI] (2-thread-3) Data directory moved: /tmp/ome9/data/ManagedRepository updated to /tmp/ome9/copied.dir/OMERO/ManagedRepository

There may be locking issues with Lustre similar to those we see with NFS. Could you attach your logs zipped? (I'm wondering if there are any other WARNs or ERRORs.)

It might also be useful to have a jstack output from the Blitz process:

Code:
    jstack $(bin/omero admin ice server pid Blitz-0)

If this is related to the filesystem & locking, then likely you will need to move /cluster_share/tools/imaging/OMERO/ManagedRepository/.omero onto a non-Lustre file system unless there's someone who can fix locking directly in Lustre itself.

ClayB wrote: The move is necessary since the compute nodes in the cluster don't have access to /mnt/app_hdd, but do have access to everything in the /cluster_share system.

That also makes sense. If this isn't a file locking issue as it is with NFS, then perhaps you could either:

create the ManagedRepository directory yourself and set the property before your first startup?
use a symlink from the old location to the new (omero_server/ManagedRepository -> /cluster/ ....) and not set the property?

Thanks for helping us to track this down.
~Josh.
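
The symlink alternative suggested above could be sketched roughly as follows. This is only an illustration: `link_managed_repo` is a hypothetical helper and not part of OMERO, it assumes the server is stopped and that the target already holds the copied data, and it only redirects the path, so any locking problem on the target file system would remain.

```shell
#!/bin/sh
# Hypothetical helper (not part of OMERO): replace DATA_DIR/ManagedRepository
# with a symlink to TARGET, keeping any existing real directory as a .bak copy.
# Assumes the OMERO server is stopped and TARGET already contains the data.
link_managed_repo() {
    data_dir=$1
    target=$2
    link="$data_dir/ManagedRepository"

    # Refuse to link to something that is not an existing directory.
    [ -d "$target" ] || { echo "no such directory: $target" >&2; return 1; }

    # Move a real directory out of the way instead of deleting it.
    if [ -d "$link" ] && [ ! -L "$link" ]; then
        if [ -e "$link.bak" ]; then
            echo "backup already exists: $link.bak" >&2
            return 1
        fi
        mv "$link" "$link.bak" || return 1
    fi

    # -n stops ln from descending into an existing symlink to a directory.
    ln -sfn "$target" "$link"
}
```

With the paths from this thread, the call would look like `link_managed_repo /mnt/app_hdd/omero/omero_server /cluster_share/tools/imaging/OMERO/ManagedRepository`, with omero.managed.dir left unset.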

by **ClayB** » Tue Jun 30, 2015 4:20 pm

jmoore wrote: Makes sense. And there's no change to ${omero.data.dir} itself, correct?

The etc/grid/config.xml file shows

Code:
    <property name="omero.managed.dir" value="/cluster_share/tools/imaging/OMERO/ManagedRepository" />

Attached are the first parts of the Blitz-0.log file. (I had to 'split' the file and then used BZIP2 to compress each piece [having to split the 00 log once more] to get in under the 256MB file limit. The remainder of the log file and the 'jstack' output are in the following message.)

We can (probably) reinstall and reset the ManagedRepository before start up. We are trying to set up an automated installation, so we'd need to modify the script(s) for this if that is the solution we need to work. The symlink variation might be a better solution for that.

For the current installation, I was hoping to ultimately store the image files in-place. I guess this means that I don't really need to move the ManagedRepository, but it does mean that ALL files would need to be stored in the shared location and imported in-place. This is likely going to be harder to enforce than setting things up right from the start.

I've tried to do an import in-place, but the error is preventing this since it can't get exclusive access to the ManagedRepository directory. (I'm hoping that solving the MR placement will fix this issue, too.)

by **ClayB** » Tue Jun 30, 2015 4:24 pm

rest of files

by **jmoore** » Wed Jul 01, 2015 9:53 am

Hi Clay,

Ah, finally a smoking gun! Thanks for the logs:

Code:
    2015-06-29 13:58:37,381 INFO [ ome.services.util.ServiceHandler] (2-thread-2) Rslt: java.io.IOException: Function not implemented
    2015-06-29 13:58:37,383 ERROR [ o.s.blitz.repo.AbstractRepositoryI] (2-thread-2) Failed during repository takeover
    java.io.IOException: Function not implemented
        at sun.nio.ch.FileDispatcherImpl.lock0(Native Method) ~[na:1.7.0_79]
        at sun.nio.ch.FileDispatcherImpl.lock(FileDispatcherImpl.java:91) ~[na:1.7.0_79]
        at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:1022) ~[na:1.7.0_79]
        at java.nio.channels.FileChannel.lock(FileChannel.java:1052) ~[na:1.7.0_79]
        at ome.services.blitz.repo.FileMaker.getLine(FileMaker.java:95) ~[blitz.jar:na]
        at ome.services.blitz.repo.AbstractRepositoryI$GetOrCreateRepo.doWork(AbstractRepositoryI.java:310) ~[blitz.jar:na]

And indeed I find comments along the lines of "The issue is that parallel distributed file systems such as Lustre and NFS do not implement lock0, but ... seems to rely on it."

I'd try one of two things first:

place the ManagedRepository/.omero directory on a non-parallel file system on the server node and symlink things as needed
use the FS template configuration to leave ManagedRepository in place, but add a top-level directory which can be symlinked to the parallel file system. See http://downloads.openmicroscopy.org/presentations/2014/Paris-Workshops/OMERO-FS-Workshop/#/16 for more information.

Cheers,
~Josh.
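
Whether a given directory supports locking at all can be checked before pointing OMERO at it. The sketch below is a rough probe, not an OMERO tool: `probe_lock` is a hypothetical name, it assumes `flock(1)` from util-linux is installed, and that utility takes a flock(2)-style lock rather than going through Java's `FileChannel.lock` as in the stack trace, so a success is a hint rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical probe (not part of OMERO): try to take an exclusive lock on a
# scratch file inside DIR.
# Returns 0 if the lock was obtained, 1 if locking failed,
# 2 if the scratch file could not be created.
# Assumption: flock(1) from util-linux is available on this machine.
probe_lock() {
    dir=$1

    # Create a uniquely named scratch file in the directory under test.
    probe=$(mktemp "$dir/.lock-probe.XXXXXX" 2>/dev/null) || return 2

    # -x exclusive lock, -n fail immediately instead of waiting;
    # "true" is the command run while the lock is held.
    if flock -x -n "$probe" true 2>/dev/null; then
        result=0
    else
        result=1
    fi

    rm -f "$probe"
    return "$result"
}
```

For example, `probe_lock /cluster_share/tools/imaging/OMERO/ManagedRepository && echo "locking works here"` run as the omero user would exercise the target directory from this thread.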

by **ClayB** » Wed Jul 01, 2015 2:59 pm

jmoore wrote: I'd try one of two things first:
place the ManagedRepository/.omero directory on a non-parallel file system on the server node and symlink things as needed
use the FS template configuration to leave ManagedRepository in place, but add a top-level directory which can be symlinked to the parallel file system. See http://downloads.openmicroscopy.org/presentations/2014/Paris-Workshops/OMERO-FS-Workshop/#/16 for more information.

Not sure I've gotten enough info from the second option URL, so let me try the first option first.

I can leave the ManagedRepository/.omero in its original directory with a symlink in the parallel file system pointing back to that from the copied location. Once that's done, do I need to try the "config set" command again or just restart the server?

--clay

by **jmoore** » Thu Jul 02, 2015 7:55 am

If your managed.dir configuration points at the /cluster_share and that directory contains a .omero symlink going to a local filesystem, a restart should suffice.

~Josh.
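
The arrangement described in this reply could be set up along the following lines. This is a minimal sketch under stated assumptions: `relocate_dot_omero` is a hypothetical helper and not part of OMERO, the local target directory is a placeholder of your choosing, both paths should be absolute, and the server is assumed to be stopped while it runs.

```shell
#!/bin/sh
# Hypothetical helper (not part of OMERO): move MANAGED_DIR/.omero to
# LOCAL_DIR/.omero on a local file system and leave a symlink behind.
# Assumes the OMERO server is stopped and both arguments are absolute paths.
relocate_dot_omero() {
    managed=$1
    local_dir=$2
    link="$managed/.omero"

    # Nothing to do if a previous run already left a symlink in place.
    if [ -L "$link" ]; then
        return 0
    fi

    # Never clobber an existing .omero on the local side.
    if [ -e "$local_dir/.omero" ]; then
        echo "refusing to overwrite $local_dir/.omero" >&2
        return 1
    fi

    mkdir -p "$local_dir" || return 1

    if [ -d "$link" ]; then
        # mv copies and deletes when source and target are on different file systems.
        mv "$link" "$local_dir/.omero" || return 1
    else
        # No .omero yet: create an empty one on the local side.
        mkdir "$local_dir/.omero" || return 1
    fi

    ln -s "$local_dir/.omero" "$link"
}
```

A possible sequence is `OMERO.server/bin/omero admin stop`, then `relocate_dot_omero /cluster_share/tools/imaging/OMERO/ManagedRepository <local directory>`, then `OMERO.server/bin/omero admin start`, with omero.managed.dir still pointing at the /cluster_share location.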

Moving ManagedRepository directory II
