Batch download of 'public' data & scripts.

General and open developer discussion about using OMERO APIs from C++, Java, Python, Matlab and more!
Please note: these are historical discussions about OMERO. Please look for and ask new questions at https://forum.image.sc/tags/omero

If you are having trouble with custom code, please provide a link to a public repository, ideally GitHub.

Post by i.munro » Wed Mar 19, 2014 12:37 pm

Dear all,

We are hoping to use the web client & a public group to allow us to share data.
We have just realised that the only way to do a batch download is via a script.
Currently our configuration blocks the public user from scripts, & our sysadmin has
expressed the following concern: "... exposing that script might mean that anyone could trivially perform an effective denial-of-service attack on the server by launching lots of batch exports".

Does anyone have any suggestions?

Ian

Re: Batch download of 'public' data & scripts.

Post by wmoore » Wed Mar 19, 2014 1:03 pm

If you know ahead of time what you are going to allow 'public' users to download, this could be prepared in advance using a script etc. E.g. prepare a zip of all images in a Dataset and attach it to the Dataset (as Batch_Image_Export does). Then this could be downloaded by public users.

I guess you'd want some way to run this script on any new data once it is ready to go public.
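
The "prepare it in advance" idea can be sketched roughly as below. This is only a minimal illustration using the Python standard library: it assumes the images of a Dataset have already been exported to a local directory, and the function name `build_dataset_zip` and its paths are hypothetical. Attaching the resulting zip to the Dataset in OMERO is not shown.

```python
import zipfile
from pathlib import Path


def build_dataset_zip(export_dir, zip_path):
    """Zip every regular file under export_dir into zip_path.

    Returns the archive member names in the order they were written.
    Hypothetical helper: the export directory is assumed to already
    hold the images that should become publicly downloadable.
    """
    export_dir = Path(export_dir)
    zip_path = Path(zip_path)
    # Collect the files first so the archive itself is never picked up
    # if it happens to be written inside export_dir.
    files = [
        p for p in sorted(export_dir.rglob("*"))
        if p.is_file() and p.resolve() != zip_path.resolve()
    ]
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            arcname = path.relative_to(export_dir).as_posix()
            zf.write(path, arcname)
            names.append(arcname)
    return names
```

Because the zip is built once, ahead of time, public users would only ever fetch a static file, so no script has to run on their behalf.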

Re: Batch download of 'public' data & scripts.

Post by i.munro » Wed Mar 19, 2014 9:30 pm

Thanks Will. There may be concerns about storage space though. We're now looking at three copies of the data on the server: the original, a copy in the public group, and a zipped copy.

Do you think it might be possible to add an anti-robot to the batch download script?

Ian

Re: Batch download of 'public' data & scripts.

Post by manics » Thu Mar 20, 2014 9:21 am

At present there's no throttling on the Processor service. If you're feeling adventurous you could try out a multi-node configuration so the Processor is on a different host, and give us any feedback since this is a new addition to the docs:
https://www.openmicroscopy.org/site/sup ... iple-hosts

Simon

Re: Batch download of 'public' data & scripts.

Post by i.munro » Thu Mar 20, 2014 1:22 pm

Thanks Simon.
I'll pass that along to our sysadmin in the hope that, unlike me, he understands it.


Ian

Re: Batch download of 'public' data & scripts.

Post by mwoodbri » Thu Mar 20, 2014 1:34 pm

Hi Simon,

Would it be feasible to run a second processor on the same box, one that handles only requests from the "public" user and is throttled to run at most one job at a time?

Mark.

Re: Batch download of 'public' data & scripts.

Post by manics » Thu Mar 20, 2014 3:11 pm

Hi Mark

Unfortunately, we don't have a way of throttling the number of concurrent script jobs for a single processor, nor is it possible to restrict jobs by OMERO user. However, if it's just a case of preventing the Processor from slowing down the server, then in principle you could run the Processor service under a different OS user and limit its resources using ulimit, nice or some other functionality provided by the OS, but this isn't something we've tried.
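
As a rough illustration of the ulimit/nice idea, the sketch below starts an arbitrary command with a CPU-time limit and a lowered scheduling priority, using only the Python standard library on a Unix-like OS. The helper name `run_limited` and its default values are assumptions made for the example, and this has not been tried with the OMERO Processor itself.

```python
import os
import resource
import subprocess


def run_limited(cmd, cpu_seconds=60, niceness=10):
    """Run cmd with a CPU-time cap and lowered priority (Unix only).

    Hypothetical helper: cmd is any argument list; the defaults are
    placeholders, not recommended values.
    """

    def apply_limits():
        # Executed in the child just before exec, so the limits apply
        # to the launched command and not to the calling process.
        os.nice(niceness)  # like `nice -n <niceness>`
        # Like `ulimit -t <cpu_seconds>`: soft and hard CPU-time limit.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(
        cmd, preexec_fn=apply_limits, capture_output=True, text=True
    )
```

Limits set this way are inherited by any processes the command spawns, which is what makes the approach attractive for a job runner. Note that `preexec_fn` is not safe to use from a multi-threaded parent process.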

We've been thinking about how OMERO Processor could be improved, so it's useful to hear how you'd like to use it.

Thanks

Simon

Re: Batch download of 'public' data & scripts.

Post by mwoodbri » Fri Mar 21, 2014 2:31 pm

Thanks - that's a good idea. But I think we would at least need to be able to target jobs by user - so that we could run jobs from unauthenticated or external users under tighter resource constraints.

Re: Batch download of 'public' data & scripts.

Post by jmoore » Fri Mar 21, 2014 2:36 pm

The Processor does take some options as to who it will serve. See:
bin/omero script serve -h
usage: dist/bin/omero script serve [-h] [--verbose] [-b] [-t TIMEOUT] [-C]
                                   [-s SERVER] [-p PORT] [-g GROUP] [-u USER]
                                   [-w PASSWORD] [-k KEY]
                                   [who [who ...]]

Start a usermode processor for scripts

Positional Arguments:
  who                               Who to execute for: user, group, user=1, group=5 (default=official)
...

which you can run locally as an unauthenticated user to have your scripts run. The same could be launched in the backend. (That being said, yes, all of this definitely could use more extensive features!)

All the best,
~Josh.

Re: Batch download of 'public' data & scripts.

Post by mwoodbri » Wed Mar 26, 2014 6:38 pm

Thanks guys. That sounds promising. So perhaps we could:

* Prevent the OMERO.web public user from seeing/running 'global' scripts
* Create equivalent user scripts for the features that we wish to provide to the public user
* Configure the server to execute jobs for this user on a separate script processor that is resource constrained (e.g. using ulimit/cgroups/kvm).

Is this approach possible at the moment?
