question/thoughts on inplace import options

Post by dsudar » Tue Aug 22, 2017 8:04 pm

Hi team,

I've successfully used inplace import for a few years, almost exclusively with hardlink transfers. On one of the servers I run, it is not possible to keep both the users' directories (where people drop their images, SPWs, etc. and where they want to keep access to those data) and OMERO's ManagedRepo on the same file system, so I'm looking at symlinks instead. However, managing the permissions so as to avoid data loss looks non-trivial. So I was wondering about another inplace transfer option, but before trying to write a new transfer subclass for this, I wanted to get your opinion:
The transfer option I'm considering is: copy the data to the ManagedRepo, delete the file from the original location, then create a symlink in the original location to the new file in the ManagedRepo, and (if needed) do the appropriate chmod to allow the user read-only access. I was thinking that this gives the same safety as hardlinks but provides more file system flexibility.
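
For illustration only, the copy / delete / symlink sequence described above could look roughly like this in plain Java NIO. This is a minimal sketch, not an OMERO transfer class: the names InverseSymlinkSketch and copyDeleteLink are made up, it assumes a POSIX file system, and it assumes the importing process is allowed to delete the source file and create a link in the source directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical helper, not part of OMERO.
public class InverseSymlinkSketch {

    /**
     * Copies source to target, removes source, and leaves a symlink at the
     * old source path that points to target.
     */
    public static void copyDeleteLink(Path source, Path target) throws IOException {
        // 1. Copy the data; this throws if target already exists.
        Files.copy(source, target, StandardCopyOption.COPY_ATTRIBUTES);
        // 2. Make the copy read-only for everyone (assumes POSIX permissions).
        Files.setPosixFilePermissions(target,
                PosixFilePermissions.fromString("r--r--r--"));
        // 3. Remove the original file.
        Files.delete(source);
        // 4. Put a symlink to the copy where the original used to be.
        Files.createSymbolicLink(source, target.toAbsolutePath());
    }
}
```

Whichever step fails, the data still exists in at least one of the two locations, but this sketch leaves any cleanup to the caller.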

Finally, a thought for consideration for a future desktop/web importer: support for optional inplace imports for end users who have already uploaded the files to a mounted filesystem on the OMERO server.

Cheers,
- Damir

Re: question/thoughts on inplace import options

Post by jmoore » Wed Aug 23, 2017 1:19 pm

dsudar wrote: Hi team,


Hi Damir,

dsudar wrote: The transfer option I'm considering is: copy the data to the ManagedRepo, delete the file from the original location, then create a symlink in the original location to the new file in the ManagedRepo, and (if needed) do the appropriate chmod to allow the user read-only access. I was thinking that this gives the same safety as hardlinks but provides more file system flexibility.


Inverse symlinks, interesting. We've avoided providing transfer classes that need to make too many assumptions. In this case, the trickiest assumption would likely be the client process's authorization to chmod (etc.) the source file. If you can encode it for your use case, we could certainly look to generalize it for wider consumption.


dsudar wrote: Finally, a thought for consideration for a future desktop/web importer: support for optional inplace imports for end users who have already uploaded the files to a mounted filesystem on the OMERO server.


Definitely. This is a likely first step of the web importer work, where so far it has been referred to as "remote import". (More client-side imports would then likely make use of that mechanism by first uploading to the server.)

Cheers,
~Josh

Re: question/thoughts on inplace import options

Post by dsudar » Tue Aug 29, 2017 12:49 am

Hi Josh,

Mostly as a draft, I implemented the copy, delete, and reverse symlink concept as a subclass of AbstractExecFileTransfer:
Code: Select all
/*
* Copyright (C) 2015 University of Dundee & Open Microscopy Environment.
* All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*/

package ome.formats.importer.transfers;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
* Local-only file transfer mechanism which makes use of the platform
* copy and ln commands.
*
* This is only useful where the commands "cp (or ln -s) source target" (Unix) or
* "copy (or mklink) source target" (Windows) will work.
*
* @since 5.3.x
*/
public class CopyReverseSymlinkFileTransfer extends AbstractExecFileTransfer {

    /**
     * Executes "cp file location" (Unix) or "copy file location" (Windows),
     * then creates a temporary symlink in the source location,
     * and fails on non-0 return codes.
     *
     * @param file File to be copied
     * @param location Location to copy to.
     * @throws IOException
     */
    protected ProcessBuilder createProcessBuilder(File file, File location) {
        ProcessBuilder pb = new ProcessBuilder();
        List<String> args = new ArrayList<String>();
        if (isWindows()) {
            // no idea about this, just guessing this might work
            args.add("cmd");
            args.add("/c");
            args.add("copy");
            args.add(file.getAbsolutePath());
            args.add(location.getAbsolutePath());
            args.add("&&");
            args.add("mklink");
            args.add(String.format("%s.tmp", file.getAbsolutePath()));
            args.add(location.getAbsolutePath());
        } else {
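            // ProcessBuilder does not start a shell, so a bare "&&" would be
            // handed to cp as a literal argument. Run both steps through
            // "sh -c" and pass the paths as positional parameters ($1, $2)
            // so that no extra quoting of the paths is needed.
            args.add("sh");
            args.add("-c");
            args.add("cp \"$1\" \"$2\" && ln -s \"$2\" \"$1.tmp\"");
            args.add("sh");
            args.add(file.getAbsolutePath());
            args.add(location.getAbsolutePath());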
        }
        pb.command(args);
        return pb;
    }

    /**
     * Delete all copied files if there were no errors and change the temporary symlink name
     */
    @Override
    public void afterTransfer(int errors, List<String> srcFiles) throws CleanupFailure {
        deleteTransferredFiles(errors, srcFiles);
        // rename the symlink
        renameSymlinks(errors, srcFiles);
    }

    /**
     * Method used by this subclass during {@link FileTransfer#afterTransfer(int, List)}
     * if they would like to rename symlinks to the files transferred in the set.
     */
    protected void renameSymlinks(int errors, List<String> srcFiles)
        throws CleanupFailure {

        if (errors > 0) {
            printLine();
            log.error("{} error(s) found.", errors);
            log.error("{} symlink rename not performed!", getClass().getSimpleName());
            log.error("The following <filename>.tmp symlinks will *not* be renamed:");
            for (String srcFile : srcFiles) {
                log.error("\t{}", srcFile);
            }
            printLine();
            return;
        }

        List<File> failedFiles = new ArrayList<File>();
        for (String path : srcFiles) {
            File tmpSymlink = new File(String.format("%s.tmp", path));
            File realSymlink = new File(path);
            try {
                log.info("Removing .tmp from symlink {}...", tmpSymlink);
                if (!tmpSymlink.renameTo(realSymlink)) {
                    throw new RuntimeException("Failed to rename.");
                }
            } catch (Exception e) {
                log.error("Failed to rename temporary symlink {}", tmpSymlink);
                failedFiles.add(tmpSymlink);
            }
        }

        if (!failedFiles.isEmpty()) {
            printLine();
            log.error("Renaming failed!");
            log.error("{} files could not be renamed and will need to " +
                "be handled manually", failedFiles.size());
            for (File failedFile : failedFiles) {
                log.error("\t{}", failedFile.getAbsolutePath());
            }
            printLine();
            throw new CleanupFailure(failedFiles);
        }
    } 
}


In order to mostly use existing functionality, the implementation is effectively doing a copy and creating a reverse symlink with a temporary name, and then relies on the existing afterTransfer delete followed by a rename of the symlinks. Maybe a bit clumsy but seemed most compatible with the existing code.
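
To make the two phases described above easier to discuss, here is a stand-alone sketch of the same flow using only java.nio, outside of the OMERO class hierarchy. The names StagedInverseLink, stage, and commit are hypothetical, and the sketch assumes a file system that supports symlinks and a process that may write to the source directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of "copy + temporary symlink, then delete + rename".
public class StagedInverseLink {

    /** Phase 1: copy the data and create "<source>.tmp" pointing at the copy. */
    public static Path stage(Path source, Path target) throws IOException {
        Files.copy(source, target);
        Path tmpLink = source.resolveSibling(source.getFileName() + ".tmp");
        Files.createSymbolicLink(tmpLink, target.toAbsolutePath());
        return tmpLink;
    }

    /**
     * Phase 2: only when the import reported no errors, delete the original
     * and rename the temporary symlink to the original name.
     */
    public static boolean commit(int errors, Path source, Path tmpLink)
            throws IOException {
        if (errors > 0) {
            // Leave the original file and the .tmp link for manual inspection.
            return false;
        }
        Files.delete(source);
        // Moving a symlink moves the link itself, not the file it points to.
        Files.move(tmpLink, source);
        return true;
    }
}
```

Until commit succeeds, the original file is untouched, which is the property the temporary name is meant to provide.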

Agreed that there are a number of assumptions before this can work but they are pretty much the same as are needed for hardlinks with all modern Linuxes.

I haven't yet compiled/run this, mostly because I don't know much about Java and how to build jars. While I figure that out, does this approach look workable?

Cheers,
- Damir

Re: question/thoughts on inplace import options

Post by jmoore » Wed Aug 30, 2017 5:24 pm

dsudar wrote: Hi Josh,


Hi Damir,

dsudar wrote: Mostly as a draft, I implemented the copy, delete, and reverse symlink concept as a subclass of AbstractExecFileTransfer:
...cut...

In order to mostly use existing functionality, the implementation is effectively doing a copy and creating a reverse symlink with a temporary name, and then relies on the existing afterTransfer delete followed by a rename of the symlinks. Maybe a bit clumsy but seemed most compatible with the existing code.


Looks generally as expected, though I noticed that there may be some refactoring that can be done in this hierarchy to help you out, depending on how far we take this.

dsudar wrote: I haven't yet compiled/run this, mostly because I don't know much about Java and how to build jars. While I figure that out, does this approach look workable?


I think so. It occurs to me that as soon as you start adding more than a single atomic step the error handling could become an issue. One option I can think of is to replace each of the multi-step commands with a call to a single external script:

Code: Select all
bin/omero import --transfer=/tmp/inverse_link.sh ...


so that you can more quickly work around any issues that arise. I think this would require adding something like:

Code: Select all
if (arg.contains(File.separator)) {
    return new ExternalFileTransfer(arg);
}


at https://github.com/openmicroscopy/openmicroscopy/blob/develop/components/blitz/src/ome/formats/importer/transfers/AbstractFileTransfer.java#L79. The ExternalFileTransfer class would then need to pass all the arguments to the script on stdout.
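
As a rough idea of what such a class would have to do, the following stand-alone sketch builds the external command for one file. It is not existing OMERO code: ExternalTransferSketch, looksLikeScript, and buildCommand are hypothetical names, and for simplicity the sketch hands the source and target paths to the script as command-line arguments, which is an assumption rather than a settled design.

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical sketch; a real version would extend the existing transfer classes.
public class ExternalTransferSketch {

    /** True when the --transfer value looks like a path rather than a known name. */
    public static boolean looksLikeScript(String arg) {
        return arg.contains(File.separator);
    }

    /** Builds "<script> <source> <target>" without going through a shell. */
    public static ProcessBuilder buildCommand(String script, File file, File location) {
        ProcessBuilder pb = new ProcessBuilder(Arrays.asList(
                script, file.getAbsolutePath(), location.getAbsolutePath()));
        // Merge stderr into stdout so the caller only has one stream to log.
        pb.redirectErrorStream(true);
        return pb;
    }
}
```

The script itself would then own the copy, delete, and link steps and signal failure through a non-zero exit code.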



~Josh

