Moving ManagedRepository directory II

Having a problem deploying OMERO? Please ask new questions at https://forum.image.sc/tags/omero

Postby ClayB » Fri Jun 26, 2015 9:35 pm

New machines, similar problem.

After copying the ManagedRepository to a new location, setting the new location with the "bin/omero config" command, and restarting the server, I tried to import an image and got this message:

Code:
2015-06-26 12:32:54,384 12185      [      main] ERROR        ome.formats.importer.ImportLibrary - Error on import
java.lang.RuntimeException: Cannot exclusively use the managed repository.


Looking deeper, I found that the confirmation message of the data directory move was NOT in the Blitz-0.log. I tried the config command again and no message was printed on the screen. However, looking into the Blitz-0.log file I saw:

Code:
2015-06-26 14:03:37,375 ERROR [      o.s.blitz.repo.AbstractRepositoryI] (2-thread-1) Failed during repository takeover


(Most recent 5000 lines of Blitz-0.log file have been attached.)

There were no image files imported into the ManagedRepository before the move was attempted. (On a sister server that now has the same problem, a test image was loaded and copied over to the new location.) The directories on the path from / down to the OMERO directory that I want to hold the ManagedRepository files are owned by the (Linux) root account, all with drwxr-xr-x permissions. The OMERO directory itself is owned by the (Linux) omero account:

Code:
drwxr-xr-x 3 omero ccc 4096 Jun 26 11:47 OMERO
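
One way to rule out permissions is to walk every component of the target path and confirm that the server account can traverse it and write to the final directory. The following is only a sketch, assuming it is run as the account that runs OMERO; the function name check_path_access is made up for illustration.

```shell
# check_path_access DIR: hypothetical helper, run as the OMERO server account.
# Reports the first path component that cannot be traversed, then checks
# that the final directory is writable.
check_path_access() {
    target=$1
    path=$target
    # Walk from the target up to the root, testing the search (x) bit on each.
    while :; do
        if [ ! -x "$path" ]; then
            echo "cannot traverse: $path"
            return 1
        fi
        # Stop at the filesystem root (or "." for a relative path).
        case $path in
            /|.) break ;;
        esac
        path=$(dirname "$path")
    done
    if [ ! -w "$target" ]; then
        echo "not writable: $target"
        return 1
    fi
    echo "ok: $target"
}
```

For the layout described here, the call would be check_path_access /cluster_share/tools/imaging/OMERO. A clean result only rules out permissions; it says nothing about other filesystem behaviour.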


Any ideas on how to get this set up as I need or what I did wrong (this time)?

Re: Moving ManagedRepository directory II

Postby jmoore » Mon Jun 29, 2015 10:41 am

Hi Clay,

A couple of questions before a full response:
  • From where to where are you moving the data.dir?
  • What type of file systems are involved?
  • Can you describe the exact steps you took?

Cheers,
~Josh

Re: Moving ManagedRepository directory II

Postby ClayB » Mon Jun 29, 2015 3:29 pm

jmoore wrote:A couple of questions before a full response:

  • From where to where are you moving the data.dir?

    The move is from the original location specified during installation (/mnt/app_hdd/omero/omero_server) to /cluster_share/tools/imaging/OMERO.

  • What type of file systems are involved?

    The original site is NFS while the target file system is Lustre.

  • Can you describe the exact steps you took?
    a. Create OMERO directory in /cluster_share

    > mkdir /cluster_share/tools/imaging/OMERO

    b. Copy current ManagedRepository to shared area

    > cp -r omero_server/ManagedRepository /cluster_share/tools/imaging/OMERO

    c. Configure OMERO server to point at new MR location

    > OMERO.server/bin/omero config set omero.managed.dir /cluster_share/tools/imaging/OMERO/ManagedRepository

    d. Restart OMERO server (with new location of MR)

    > OMERO.server/bin/omero admin restart
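
Plain cp -r does not preserve permissions or timestamps, so it can be worth copying with cp -a while the server is stopped and comparing the two trees before pointing the server at the copy. A minimal sketch of steps a and b under that assumption; copy_and_verify is a hypothetical name, not part of OMERO.

```shell
# copy_and_verify SRC DEST_PARENT: hypothetical helper.
# Copies SRC into DEST_PARENT preserving attributes, then compares the trees.
copy_and_verify() {
    src=$1
    dest_parent=$2
    name=$(basename "$src")
    mkdir -p "$dest_parent" || return 1
    # -a keeps modes, timestamps and symlinks (and ownership when run as root).
    cp -a "$src" "$dest_parent"/ || return 1
    # diff -r exits non-zero if any file differs or exists on one side only.
    if diff -r "$src" "$dest_parent/$name" >/dev/null; then
        echo "copy verified: $dest_parent/$name"
    else
        echo "copy differs: $dest_parent/$name"
        return 1
    fi
}
```

With the paths from the steps above this would be called as copy_and_verify omero_server/ManagedRepository /cluster_share/tools/imaging/OMERO.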

Re: Moving ManagedRepository directory II

Postby ClayB » Mon Jun 29, 2015 3:51 pm

ClayB wrote:The original site is NFS while the target file system is Lustre.


My bad. Just checked with the sysadmin. The original site file system is EXT4.
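
The filesystem type can also be checked directly from the shell rather than from memory. A small sketch, assuming GNU coreutils stat is available; the helper names fs_type and needs_lock_check are hypothetical, and the list of types to flag is simply the two filesystems discussed in this thread.

```shell
# fs_type PATH: print the filesystem type name for PATH (GNU coreutils stat).
fs_type() {
    stat -f -c %T -- "$1"
}

# needs_lock_check PATH: succeed if PATH sits on one of the network or
# parallel filesystems mentioned in this thread (NFS or Lustre).
needs_lock_check() {
    case $(fs_type "$1") in
        nfs*|lustre) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, fs_type /mnt/app_hdd/omero/omero_server and fs_type /cluster_share/tools/imaging/OMERO would report the source and target types. The exact names printed depend on the coreutils version; ext4 is typically reported as ext2/ext3 because those filesystems share a magic number.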

The move is necessary since the compute nodes in the cluster don't have access to /mnt/app_hdd, but do have access to everything in the /cluster_share system.

Re: Moving ManagedRepository directory II

Postby jmoore » Tue Jun 30, 2015 9:46 am

ClayB wrote:
jmoore wrote:A couple of questions before a full response:

  • From where to where are you moving the data.dir?

    The move is from the original location specified during installation (/mnt/app_hdd/omero/omero_server) to /cluster_share/tools/imaging/OMERO.


Makes sense. And there's no change to ${omero.data.dir} itself, correct?



  • What type of file systems are involved?

    The original site is EXT4 while the target file system is Lustre.


Thanks. I was worried that we were running into NFS issues.


  • Can you describe the exact steps you took?
    a. Create OMERO directory in /cluster_share

    > mkdir /cluster_share/tools/imaging/OMERO

    b. Copy current ManagedRepository to shared area

    > cp -r omero_server/ManagedRepository /cluster_share/tools/imaging/OMERO

    c. Configure OMERO server to point at new MR location

    > OMERO.server/bin/omero config set omero.managed.dir /cluster_share/tools/imaging/OMERO/ManagedRepository

    d. Restart OMERO server (with new location of MR)

    > OMERO.server/bin/omero admin restart



Thanks for the detailed steps, Clay! I've tried to reproduce with the following:

Code:
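# default.sh: fresh server with the default ManagedRepository location
NAME=ome9
OMERO=`pwd`/dist/bin/omero
# Clear state from any previous run (before "set -e", so failures here are ignored)
rm -rf `pwd`/dist/var
$OMERO admin stop

set -e
set -u

$OMERO version

# Recreate the database and load the schema
dropdb $NAME
createdb $NAME
$OMERO db script --password ome -f- | psql $NAME

rm -rf /tmp/$NAME
mkdir /tmp/$NAME
cd /tmp/$NAME

# Start with data.dir under /tmp and list the repositories
mkdir data
$OMERO config set omero.data.dir `pwd`/data
$OMERO admin start
$OMERO admin waitup
$OMERO -s root@localhost -w ome fs repos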


and

Code:
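# copied.sh: copy the ManagedRepository and point omero.managed.dir at the copy
set -e
set -u

NAME=ome9
OMERO=`pwd`/dist/bin/omero

$OMERO admin stop

cd /tmp/$NAME
COPIED=`pwd`/copied.dir/OMERO
mkdir -p $COPIED
# Set the new location, then copy the repository while the server is stopped
$OMERO config set omero.managed.dir $COPIED/ManagedRepository
cp -r data/ManagedRepository $COPIED

# Restart and list the repositories again
$OMERO admin start
$OMERO admin waitup
$OMERO -s root@localhost -w ome fs repos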


But on doing so, I see this in my logs:
Code:
/opt/ome9$ grep "updated to" dist/var/log/Blitz-0.log
2015-06-30 11:30:11,841 WARN  [      o.s.blitz.repo.AbstractRepositoryI] (2-thread-3) Data directory moved: /tmp/ome9/data/ManagedRepository updated to /tmp/ome9/copied.dir/OMERO/ManagedRepository
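
The same check can be wrapped so that it reports either outcome seen in this thread. This is a sketch only: takeover_status is a hypothetical helper, and it relies solely on the two message strings quoted above ("Data directory moved" and "Failed during repository takeover").

```shell
# takeover_status LOG: hypothetical helper that classifies a Blitz-0.log
# using the two messages quoted in this thread.
takeover_status() {
    log=$1
    if grep -q "Failed during repository takeover" "$log"; then
        echo "takeover failed"
        # Show each failure with a few lines of the stack trace that follows.
        grep -n -A 3 "Failed during repository takeover" "$log"
        return 1
    elif grep -q "Data directory moved" "$log"; then
        echo "move confirmed"
        grep -n "Data directory moved" "$log"
    else
        echo "no takeover messages found"
    fi
}
```

Called as takeover_status dist/var/log/Blitz-0.log. Because the log keeps entries from earlier restarts, a failure reported here may predate the most recent configuration change, so the timestamps on the matching lines still need a look.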


There may be similar issues with Lustre. Could you attach your logs zipped? (I'm wondering if there are any other WARNs or ERRORs)

It might also be useful to have a jstack output from the Blitz process:
Code:
jstack $(bin/omero admin ice server pid Blitz-0)


If this is related to the filesystem & locking, then likely you will need to move /cluster_share/tools/imaging/OMERO/ManagedRepository/.omero onto a non-Lustre file system unless there's someone who can fix locking directly in Lustre itself.
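
A quick way to test whether a directory supports locking at all is to try taking a lock on a scratch file inside it. The sketch below assumes flock(1) from util-linux is installed; probe_lock is a hypothetical name. Note the caveat: flock(1) uses the flock(2) call, whereas the JVM, as far as I know, requests fcntl-style locks, so a success here is a rough indicator rather than proof that the server will be able to lock.

```shell
# probe_lock DIR: hypothetical helper. Tries to take an exclusive lock on a
# scratch file inside DIR using flock(1) from util-linux.
probe_lock() {
    dir=$1
    probe=$(mktemp "$dir/.lockprobe.XXXXXX") || return 1
    # -n: fail at once instead of waiting; "true" is the command run under the lock.
    if flock -n "$probe" true; then
        echo "locking works in $dir"
        rc=0
    else
        echo "locking failed in $dir"
        rc=1
    fi
    # Remove the scratch file whatever the outcome.
    rm -f "$probe"
    return $rc
}
```

Running probe_lock /cluster_share/tools/imaging/OMERO/ManagedRepository as the server account and comparing it with the same call on a local directory should show whether the two filesystems behave differently.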

ClayB wrote:The move is necessary since the compute nodes in the cluster don't have access to /mnt/app_hdd, but do have access to everything in the /cluster_share system.


That also makes sense. If this isn't a file locking issue as it is with NFS, then perhaps you could either:

  • create the ManagedRepository directory yourself and set the property before your first startup?
  • use a symlink from the old location to the new? (omero_server/ManagedRepository -> /cluster/ ....) and not set the property?
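
For an automated installation, the first option could look roughly like the sketch below: create the directory and store the property before the server is started for the first time. The helper name prepare_managed_dir is hypothetical; the only OMERO commands it uses are "config set" and "config get" on the omero.managed.dir key shown earlier in the thread.

```shell
# prepare_managed_dir OMERO_CMD DIR: hypothetical helper for an install script.
# Creates DIR and records it as omero.managed.dir. Intended to be called
# before the first "admin start".
prepare_managed_dir() {
    omero_cmd=$1
    managed_dir=$2
    mkdir -p "$managed_dir" || return 1
    "$omero_cmd" config set omero.managed.dir "$managed_dir" || return 1
    # Read the value back so the install log shows what was stored.
    "$omero_cmd" config get omero.managed.dir
}
```

In an install script this would be followed by the usual start, for example prepare_managed_dir OMERO.server/bin/omero /cluster_share/tools/imaging/OMERO/ManagedRepository and then OMERO.server/bin/omero admin start. It does not help if the underlying problem is file locking on the target filesystem.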

Thanks for helping us to track this down.
~Josh.

Re: Moving ManagedRepository directory II

Postby ClayB » Tue Jun 30, 2015 4:20 pm

jmoore wrote:Makes sense. And there's no change to ${omero.data.dir} itself, correct?


The etc/grid/config.xml file shows

Code:
<property name="omero.managed.dir" value="/cluster_share/tools/imaging/OMERO/ManagedRepository" />


Attached are the first parts of the Blitz-0.log file. (I had to 'split' the file and then used BZIP2 to compress each piece [having to split the 00 log once more] to get it under the 256MB file limit. The remainder of the log file and the 'jstack' output are in the following message.)

We can (probably) reinstall and reset the ManagedRepository before start up. We are trying to set up an automated installation, so we'd need to modify the script(s) for this if that is the solution we need to work. The symlink variation might be a better solution for that.

For the current installation, I was hoping to ultimately store the image files in-place. I guess this means that I don't really need to move the ManagedRepository, but it does mean that ALL files would need to be stored in the shared location and imported in-place. This is likely going to be harder to enforce than setting things up right from the start.

I've tried to do an import in-place, but the error is preventing this since it can't get exclusive access to the ManagedRepository directory. (I'm hoping that solving the MR placement will fix this issue, too.)
Attachments
Blitz-0.log01.bz2: split log file 01 (155.65 KiB)
Blitz-0.log00b.bz2: split log file 00 part 2 (92.83 KiB)
Blitz-0.log00a.bz2: split log file 00 part 1 (191.71 KiB)

Re: Moving ManagedRepository directory II

Postby ClayB » Tue Jun 30, 2015 4:24 pm

rest of files
Attachments
jstack.zip (2.32 KiB)
Blitz-0.log03.bz2 (102.92 KiB)
Blitz-0.log02.bz2 (155.75 KiB)

Re: Moving ManagedRepository directory II

Postby jmoore » Wed Jul 01, 2015 9:53 am

Hi Clay,

Ah, finally a smoking gun! Thanks for the logs:

Code:
2015-06-29 13:58:37,381 INFO  [        ome.services.util.ServiceHandler] (2-thread-2)  Rslt:    java.io.IOException: Function not implemented
2015-06-29 13:58:37,383 ERROR [      o.s.blitz.repo.AbstractRepositoryI] (2-thread-2) Failed during repository takeover
java.io.IOException: Function not implemented
        at sun.nio.ch.FileDispatcherImpl.lock0(Native Method) ~[na:1.7.0_79]
        at sun.nio.ch.FileDispatcherImpl.lock(FileDispatcherImpl.java:91) ~[na:1.7.0_79]
        at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:1022) ~[na:1.7.0_79]
        at java.nio.channels.FileChannel.lock(FileChannel.java:1052) ~[na:1.7.0_79]
        at ome.services.blitz.repo.FileMaker.getLine(FileMaker.java:95) ~[blitz.jar:na]
        at ome.services.blitz.repo.AbstractRepositoryI$GetOrCreateRepo.doWork(AbstractRepositoryI.java:310) ~[blitz.jar:na]


And indeed I find comments along the lines of "The issue is that parallel distributed file systems such as Lustre and NFS do not implement lock0, but ... seems to rely on it."

I'd try one of two things first:

Cheers,
~Josh.

Re: Moving ManagedRepository directory II

Postby ClayB » Wed Jul 01, 2015 2:59 pm

jmoore wrote:I'd try one of two things first:


Not sure I've gotten enough info from the second option URL, so let me try the first option first.

I can leave the ManagedRepository/.omero in its original directory, with a symlink in the parallel file system (at the copied location) pointing back to it. Once that's done, do I need to run the "config set" command again or just restart the server?

--clay

Re: Moving ManagedRepository directory II

Postby jmoore » Thu Jul 02, 2015 7:55 am

If your managed.dir configuration points at the /cluster_share and that directory contains a .omero symlink going to a local filesystem, a restart should suffice.
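
The arrangement described here can be sketched as follows, with the server stopped: the .omero directory stays on the local filesystem and the ManagedRepository on the shared filesystem gets a symlink to it. The helper name link_dot_omero and the .omero.bak backup name are made up for illustration, and the local path should be given as an absolute path.

```shell
# link_dot_omero MANAGED_DIR LOCAL_DOT_OMERO: hypothetical helper; run with
# the server stopped. Makes MANAGED_DIR/.omero a symlink to an existing
# .omero directory on a local filesystem.
link_dot_omero() {
    managed_dir=$1
    local_dot_omero=$2
    link=$managed_dir/.omero
    if [ ! -d "$local_dot_omero" ]; then
        echo "missing local directory: $local_dot_omero"
        return 1
    fi
    if [ -L "$link" ]; then
        # Replace an earlier symlink.
        rm "$link" || return 1
    elif [ -e "$link" ]; then
        if [ -e "$link.bak" ]; then
            echo "backup already exists: $link.bak"
            return 1
        fi
        # Keep the copied directory as a backup instead of deleting it.
        mv "$link" "$link.bak" || return 1
    fi
    ln -s "$local_dot_omero" "$link" || return 1
    echo "linked $link -> $local_dot_omero"
}
```

With the paths from this thread the call would be link_dot_omero /cluster_share/tools/imaging/OMERO/ManagedRepository /mnt/app_hdd/omero/omero_server/ManagedRepository/.omero, followed by OMERO.server/bin/omero admin restart. The local path is inferred from the original install location given earlier, so adjust it if the original ManagedRepository lives elsewhere.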

~Josh.