
Download complete dataset from IDR website

Posted: Tue Nov 15, 2016 9:38 am
by Julien
Hi all,

I’m trying to download a complete dataset from the website http://idr-demo.openmicroscopy.org/
It’s easy to download a single image, but downloading a complete dataset (a plate, for example) requires a specific account (not a public one).
Has anyone tried downloading data from this website before, for example with a script? Any help would be welcome.

Cheers,
Julien

Re: Download complete dataset from IDR website

Posted: Wed Nov 16, 2016 5:09 pm
by EleanorWilliams
Hi Julien

Thank you for enquiring about downloading image data from the Image Data Resource (IDR). At the moment, the IDR is not designed for raw data download because some of the image sets are several terabytes in size. However, we have a virtual analysis environment in which your own analysis can be run on any dataset. It includes example tools and scripts to work with the IDR. We can provide further details about this and get you started in this cloud environment if you are interested. The analysis environment is designed to be a secure research resource that can be customised to individual researchers' needs.

Further information about what you are looking to do with the downloads would be useful.

For the particular screen, idr0012, the images are actually available on the CellMorph web site http://www.ebi.ac.uk/huber-srv/cellmorph/.

Best regards
Eleanor

Re: Download complete dataset from IDR website

Posted: Fri Nov 25, 2016 1:05 pm
by Julien
Hi Eleanor,

Thank you very much for your answer. I managed to get the data through the website you indicated, thanks for the link!

I was wondering if there is a way to get the metadata associated with each image as seen on the IDR website, more specifically the Well Details, Attributes and Tables. Do you know if this information is downloadable somewhere for the whole dataset?

best regards,
Julien

Re: Download complete dataset from IDR website

Posted: Fri Nov 25, 2016 7:55 pm
by EleanorWilliams
Hi Julien

That's great you got the images ok.

In the IDR, the main metadata in the Attributes section, such as the siRNA Pool Identifier, Gene Identifier etc., can be found in the file idr0012-screenA-annotation.csv, which can be downloaded from here
https://github.com/IDR/idr-metadata/blo ... tation.csv.

This file is converted to HDF5 format in OMERO and can be downloaded from http://idr-demo.openmicroscopy.org/webc ... on/3466432 (it's an attachment to the screen). We then copy some of the values from this file into the Attributes section so that they display better and are searchable.
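A minimal sketch of working with such an annotation CSV in Python, assuming the file has been saved locally. The column names `Plate` and `Well` are assumptions (check the header of the real file), and the helper name `index_by_well` is made up for illustration. `Gene Identifier` is one of the attribute names mentioned above.

```python
import csv
import io


def index_by_well(csv_file, plate_col="Plate", well_col="Well"):
    """Map (plate, well) -> list of annotation rows (each row is a dict).

    plate_col and well_col are ASSUMED column names, not taken from the
    real file; adjust them to match its header.
    """
    index = {}
    for row in csv.DictReader(csv_file):
        key = (row[plate_col], row[well_col])
        index.setdefault(key, []).append(row)
    return index


# Tiny made-up sample standing in for the downloaded file.
sample = io.StringIO(
    "Plate,Well,Gene Identifier\n"
    "P1,A1,GENE_X\n"
    "P1,A2,GENE_Y\n"
)
annotations = index_by_well(sample)
print(annotations[("P1", "A1")][0]["Gene Identifier"])
```

With the real file, the same function could be called on an open file handle, e.g. `with open("idr0012-screenA-annotation.csv", newline="") as f: index_by_well(f)`, once the column names have been checked.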

As I’m sure you discovered, the CellMorph web site also has the original versions of the annotations e.g. http://www.ebi.ac.uk/huber-srv/cellmorp ... 7+HGNC.tab and http://www.ebi.ac.uk/huber-srv/cellmorp ... prints.tab.

The Well level information is nothing other than image details. You can script access to it by first getting a list of all the plate IDs http://idr-demo.openmicroscopy.org/webc ... =0&group=3 and then, for each plate ID, getting the plate grid, e.g. http://idr-demo.openmicroscopy.org/webg ... te/4287/0/. Increment the last URL parameter to loop through each field. Then, for each image ID, you can get image and pixel information from URLs like http://idr-demo.openmicroscopy.org/webc ... a/1000858/
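The field loop described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the URLs in this post are truncated, so the sketch takes a URL template as a parameter instead of guessing the full paths, and the JSON layout (a `grid` list of rows whose cells are either empty or carry an image `id`) is assumed, not confirmed here. All function names are hypothetical, and stopping at the first empty grid is a design choice of the sketch.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download a URL and decode the response body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def image_ids_in_grid(plate_grid):
    """Collect image IDs from one plate-grid response.

    ASSUMPTION: the response looks like {"grid": [[cell, ...], ...]},
    where each cell is None (empty well) or a dict with an "id" key.
    """
    return [
        cell["id"]
        for row in plate_grid.get("grid", [])
        for cell in row
        if cell is not None
    ]


def image_ids_for_plate(plate_id, grid_url_template, max_fields, fetch=fetch_json):
    """Try field indices 0..max_fields-1, stopping at the first empty grid."""
    ids = []
    for field in range(max_fields):
        url = grid_url_template.format(plate_id=plate_id, field=field)
        found = image_ids_in_grid(fetch(url))
        if not found:
            break
        ids.extend(found)
    return ids


# Offline demonstration: a fake fetcher and a placeholder URL template
# (both hypothetical) stand in for the real server.
def fake_fetch(url):
    field = int(url.rstrip("/").rsplit("/", 1)[-1])
    pages = {
        0: {"grid": [[{"id": 1}, None], [{"id": 2}, {"id": 3}]]},
        1: {"grid": [[{"id": 4}, None], [None, None]]},
    }
    return pages.get(field, {"grid": []})


template = "https://example.invalid/plate/{plate_id}/{field}/"
print(image_ids_for_plate(4287, template, max_fields=10, fetch=fake_fetch))
```

Passing the fetcher in as an argument keeps the loop testable without network access. Against the real server, the default `fetch_json` would be used together with the full plate-grid URL pattern.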

For each well you can also download bulk_annotation using http://idr-demo.openmicroscopy.org/webg ... ell-554552

If you need further help, or have questions or suggestions let us know.

Best regards
Eleanor

Re: Download complete dataset from IDR website

Posted: Wed Dec 07, 2016 1:56 pm
by Julien
Dear Eleanor,

Thank you very much for the information. I did manage to get the information on each well from the tab file. It's useful to know that the well-level information can be accessed via a script.

best regards,
Julien

Re: Download complete dataset from IDR website

Posted: Thu Dec 22, 2016 4:13 pm
by Julien
Dear Eleanor,

I've been in touch with Wolfgang Huber about downloading data of his lab that is hosted on the IDR website. He told me that the purpose of IDR was precisely to make such data available and in particular for download. I was wondering if there is a way to download a particular dataset via a Python script and the omero python library ? I guess one issue with that will be the need for a user account and password, am I right ?

kind regards,
Julien

Re: Download complete dataset from IDR website

Posted: Fri Dec 23, 2016 2:35 pm
by jmoore
Hi Julien,

Julien wrote:I've been in touch with Wolfgang Huber about downloading data of his lab that is hosted on the IDR website.


What studies other than CellMorph (idr0012) are you interested in working with?


He told me that the purpose of IDR was precisely to make such data available and in particular for download.


The purpose of the current IDR server (http://idr-demo.openmicroscopy.org) isn’t to provide download of the tens of terabytes of raw image data, but to demonstrate that it's possible & useful to integrate these data through the curation of their metadata and to make that added value available for re-use, including by download.

But, it's clear that with everything on one system, there's going to be value in working with the raw data. That's why once the review process is complete, we'll start by providing logins to a Jupyter environment that enables the reverse -- moving analysis to the data.

Details on these two strategies are in our current preprint, http://biorxiv.org/content/early/2016/11/24/089359


I was wondering if there is a way to download a particular dataset via a Python script and the omero python library ?


Not currently, no.

I guess one issue with that will be the need for a user account and password, am I right ?


As the system currently stands, correct. That would be a prerequisite.

kind regards,
Julien


All the best for the holidays,
~Josh

Re: Download complete dataset from IDR website

Posted: Tue Jan 03, 2017 4:41 pm
by Julien
Dear Josh,

I'm interested in the following dataset : idr0017-breinig-drugscreen/screenA. The data should be downloadable via the biostudies website : http://wwwdev.ebi.ac.uk/biostudies/studies/S-BSMS-PGPC1.
I will give it a try.

Just out of curiosity, will image analysis be possible in Jupyter? If yes, with which backend tools (CellProfiler, Fiji/ImageJ)?

best regards,
Julien

Re: Download complete dataset from IDR website

Posted: Wed Jan 04, 2017 4:09 pm
by jmoore
Julien wrote:I'm interested in the following dataset : idr0017-breinig-drugscreen/screenA. The data should be downloadable via the biostudies website : http://wwwdev.ebi.ac.uk/biostudies/studies/S-BSMS-PGPC1.
I will give it a try.


Sounds good. Let us know how things go.

Just out of curiosity, will image analysis be possible in Jupyter? If yes, with which backend tools (CellProfiler, Fiji/ImageJ)?


Sufficient resources haven't yet been made available for an individual to do large-scale image analysis in Jupyter. Instead, what we're currently doing is running the slow feature calculation on a cluster and making those features available via the API. This strategy will have to evolve over time, so if you have any particular wishes, we're certainly keen to hear them.

All the best,
~Josh

Re: Download complete dataset from IDR website

Posted: Wed Oct 24, 2018 2:58 pm
by manics
At least one person has recently arrived here via a search so I thought I'd provide an update.

Our long-term goal is to support downloads via the IDR and OMERO. In the meantime, bulk download is possible using Aspera. We've got a preconfigured Docker image here along with example instructions: https://github.com/IDR/aspera-client-docker