Bulk import of Perkin Elmer Operetta data

Postby alexr » Mon May 13, 2019 7:54 am

Dear forum readers,
I have a problem importing PerkinElmer Operetta data with the command-line in-place import, using a bulk import driven by a YAML file.
My YAML file looks like this:
continue: "true"
transfer: "ln_s"
checksum_algorithm: "File-Size-64"
logprefix: "logs/"
output: "yaml"
path: "/OMERO/ManagedRepository/ipimp-54474.tsv"
columns:
- target
- path

My approach was to generate a file list from the exported Operetta images folder, which contains the images as well as the metadata in the Index.idx.xml file.
My file list contains 3360 images plus the Index.idx.xml file.
My screen consisted of 10 wells with 16 fields each, 7 z-slices and 3 colors, so 3360 files (10 x 16 x 7 x 3) is the correct number, and the imported files are displayed correctly in the OMERO web GUI.
When I start the import using:
Code:
/home/omero/OMERO.server/bin/omero import --bulk /OMERO/ManagedRepository/bulki-54474.yml --skip upgrade

OMERO starts to import my images correctly. I get a dataset that contains the images as well as plate layouts of the imported images / plate.
What is strange is that the importer does not stop once the dataset is imported, but imports the files repeatedly. I have now cancelled the import manually after 4 days. By then I had 22080 files imported in my dataset folder, and the plate had been imported 138 times (the names run from 1 to 138).
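The numbers in this post can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are made up, and it assumes that every repeated import created one full plate of 160 images (10 wells x 16 fields).

```python
# Acquisition layout as described in the post.
wells = 10
fields_per_well = 16
z_slices = 7
channels = 3

# One image file per plane: wells x fields x z-slices x channels.
image_files = wells * fields_per_well * z_slices * channels

# OMERO shows one image per well/field combination.
images_per_plate = wells * fields_per_well

# Assumption: each repeated import produced one complete plate.
imported_images = 22080
plate_copies = imported_images // images_per_plate

print(image_files)       # 3360
print(images_per_plate)  # 160
print(plate_copies)      # 138
```

Under that assumption, 22080 images correspond to exactly 138 copies of the plate, which matches the plate names running from 1 to 138.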
Here is an image from my OMERO web interface:
[img] Screen Shot 2019-05-13 at 09.46.06.png (62.64 KiB) [/img]
Inside my imported dataset I have entries called "Index.idx.xml [Well x, Field x]", each of which contains many duplicated images.
Am I doing something wrong? Should I just point the importer to the Index.idx.xml file instead of to both the metadata and the images? Can I somehow check how this import was generated?
Thanks for the help,
Alex
alexr
 
Posts: 46
Joined: Tue Jun 12, 2018 12:20 pm

Re: Bulk import of Perkin Elmer Operetta data

Postby Dominik » Mon May 13, 2019 9:19 am

Hi Alex,

This could be either an import issue or an issue with the file format, or rather with its reader.
To narrow it down a bit, could you post the content of the ipimp-54474.tsv file (or better, upload it to http://qa.openmicroscopy.org.uk/qa/upload/ if it's a larger file)? The output of 'import -f' would also be interesting. The '-f' option is a kind of dry run: it won't kick off the import, but it will list in detail what would happen (which files would be imported, into how many plates, etc.).
It would be great if you could capture and send us the output of:
Code:
omero import -f --bulk /OMERO/ManagedRepository/bulki-54474.yml

Kind Regards,
Dominik
Dominik
Team Member
 
Posts: 149
Joined: Mon Feb 10, 2014 11:26 am

Re: Bulk import of Perkin Elmer Operetta data

Postby alexr » Mon May 13, 2019 10:22 am

Hi Dominik,
Thanks for the fast response. I uploaded both the ipimp-54474.tsv file and the console output of the command
Code:
omero import -f --bulk /OMERO/ManagedRepository/bulki-54474.yml

Please note that the console output is not complete; the last entries are repeated again and again.
Thanks
Alex

Re: Bulk import of Perkin Elmer Operetta data

Postby alexr » Mon May 13, 2019 4:36 pm

Hi Dominik,
Just one additional piece of information: the last command has still not finished, and if I check the filesets with
Code:
bin/omero fs sets

I get the following list:
Code:
#  | Id  | Prefix                           | Images | Files | Transfer
---+-----+----------------------------------+--------+-------+----------
0  | 713 | AlexR_2/2019-05/13/09-02-40.066/ | 160    | 3046  | ln_s     
1  | 712 | AlexR_2/2019-05/13/08-24-01.442/ | 160    | 3046  | ln_s     
2  | 711 | AlexR_2/2019-05/13/07-45-20.546/ | 160    | 3046  | ln_s     
3  | 710 | AlexR_2/2019-05/13/07-06-41.235/ | 160    | 3046  | ln_s     
4  | 709 | AlexR_2/2019-05/13/06-28-04.421/ | 160    | 3046  | ln_s     
5  | 708 | AlexR_2/2019-05/13/05-49-26.910/ | 160    | 3046  | ln_s     
6  | 707 | AlexR_2/2019-05/13/05-10-42.893/ | 160    | 3046  | ln_s     
7  | 706 | AlexR_2/2019-05/13/04-32-05.787/ | 160    | 3046  | ln_s     
8  | 705 | AlexR_2/2019-05/13/03-52-49.194/ | 160    | 3046  | ln_s     
9  | 704 | AlexR_2/2019-05/13/03-13-55.541/ | 160    | 3046  | ln_s     
10 | 703 | AlexR_2/2019-05/13/02-35-08.014/ | 160    | 3046  | ln_s     
11 | 702 | AlexR_2/2019-05/13/01-56-28.709/ | 160    | 3046  | ln_s     
12 | 701 | AlexR_2/2019-05/13/01-17-38.558/ | 160    | 3046  | ln_s     
13 | 700 | AlexR_2/2019-05/13/00-38-49.877/ | 160    | 3046  | ln_s     
14 | 699 | AlexR_2/2019-05/12/23-59-58.226/ | 160    | 3046  | ln_s     
15 | 698 | AlexR_2/2019-05/12/23-21-17.378/ | 160    | 3046  | ln_s     
16 | 697 | AlexR_2/2019-05/12/22-42-37.295/ | 160    | 3046  | ln_s     
17 | 696 | AlexR_2/2019-05/12/22-03-46.220/ | 160    | 3046  | ln_s     
18 | 695 | AlexR_2/2019-05/12/21-25-04.230/ | 160    | 3046  | ln_s     
19 | 694 | AlexR_2/2019-05/12/20-46-15.442/ | 160    | 3046  | ln_s     
20 | 693 | AlexR_2/2019-05/12/20-07-28.382/ | 160    | 3046  | ln_s     
21 | 692 | AlexR_2/2019-05/12/19-28-33.157/ | 160    | 3046  | ln_s     
22 | 691 | AlexR_2/2019-05/12/18-49-43.550/ | 160    | 3046  | ln_s     
23 | 690 | AlexR_2/2019-05/12/18-10-58.112/ | 160    | 3046  | ln_s     
24 | 689 | AlexR_2/2019-05/12/17-32-16.646/ | 160    | 3046  | ln_s   

I do not know if this is helpful.
I can also upload the corresponding log file, but it is 32 MB.
Best wishes
Alex

Re: Bulk import of Perkin Elmer Operetta data

Postby Dominik » Tue May 14, 2019 9:10 am

Hi Alex,

I think the problem is that you list every image file separately in the TSV file. Each line is an import.
Only the first line is needed:
Code:
SAMHD1-SC35   /mnt/CCHL-User/Alex/03-Microscopy/2018/2018-08-09-Operetta_SamHD1_SC35_AGS/plate01_SAMHD1_SC35__2018-08-09T13_50_53-Measurement1/Images/Index.idx.xml

You can point to the 'Index.idx.xml' or actually just to the 'Images' directory itself. The importer will figure out automatically which image files to import. And as you're importing plates, you should remove the "Dataset:name:" part.
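Reducing an existing file list to one line per plate can be done with a small filter. The sketch below is an illustration only: `plate_rows` and `to_tsv` are made-up helper names, the `/data/plate01/...` paths and image file names are hypothetical, and it assumes the list holds (target, path) pairs as in the two-column TSV above.

```python
from pathlib import PurePosixPath

INDEX_NAME = "Index.idx.xml"  # Operetta metadata file named in the thread


def plate_rows(rows):
    """Keep one (target, path) row per plate.

    Only rows whose path points at an Index.idx.xml file are kept,
    and each index file is kept once.
    """
    seen = set()
    kept = []
    for target, path in rows:
        if PurePosixPath(path).name != INDEX_NAME:
            continue  # plain image file: the importer finds these itself
        if path in seen:
            continue  # same index file listed twice
        seen.add(path)
        kept.append((target, path))
    return kept


def to_tsv(rows):
    """Render rows as tab-separated lines, one import per line."""
    return "".join(f"{target}\t{path}\n" for target, path in rows)


# Hypothetical file list: one index file plus two image files.
example = [
    ("SAMHD1-SC35", "/data/plate01/Images/Index.idx.xml"),
    ("SAMHD1-SC35", "/data/plate01/Images/image_0001.tiff"),
    ("SAMHD1-SC35", "/data/plate01/Images/image_0002.tiff"),
]
print(to_tsv(plate_rows(example)), end="")
```

With this filter, a list of 3360 image files plus one index file shrinks to a single line, so the bulk import runs once per plate instead of once per file.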

You could look at an example from IDR:
The files idr0037-screenA-bulk.yml and idr0037-screenA-plates.tsv at https://github.com/IDR/idr0037-vigilant ... er/screenA show how http://idr.openmicroscopy.org/webclient ... creen-2051 was imported. Also see the option exclude: "clientpath" in the bulk.yml. Although it is commented out in this example, you could use it to prevent accidentally importing image files multiple times. With that option you would get a warning if you try to import a single image file that has already been imported previously as part of a plate.
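To show where the exclude option would sit, here is a small sketch that writes out a bulk YAML file as text. It uses only keys quoted in this thread; `bulk_yaml` is a made-up helper name, and whether this exact combination of keys suits a given server is an assumption to check against the OMERO import documentation.

```python
def bulk_yaml(tsv_path, exclude_clientpath=True):
    """Return bulk-import YAML text built from keys quoted in this thread."""
    lines = [
        'continue: "true"',
        'transfer: "ln_s"',
        'checksum_algorithm: "File-Size-64"',
        'logprefix: "logs/"',
        'output: "yaml"',
    ]
    if exclude_clientpath:
        # Option mentioned above to guard against repeated imports.
        lines.append('exclude: "clientpath"')
    lines += [f'path: "{tsv_path}"', "columns:", "- target", "- path"]
    return "\n".join(lines) + "\n"


print(bulk_yaml("/OMERO/ManagedRepository/ipimp-54474.tsv"), end="")
```

Generating the file from one function keeps the TSV path and the exclude setting in a single place when several plates are imported in turn.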

Kind Regards,
Dominik

Re: Bulk import of Perkin Elmer Operetta data

Postby alexr » Wed May 15, 2019 8:48 am

Dear Dominik,
Thanks for the suggestion. This worked well. What I realised from my previous import is that importing the full directory resulted in a large number of huge images. If I check with
Code:
bin/omero fs images

#  | Image | Name                             | FS  | # Files | Size
---+-------+----------------------------------+-----+---------+--------
0  | 22259 | Index.idx.xml [Well 8, Field 16] | 713 | 3046    | 5.4 GB
1  | 22258 | Index.idx.xml [Well 8, Field 15] | 713 | 3046    | 5.4 GB
2  | 22257 | Index.idx.xml [Well 8, Field 14] | 713 | 3046    | 5.4 GB
3  | 22256 | Index.idx.xml [Well 8, Field 13] | 713 | 3046    | 5.4 GB
4  | 22255 | Index.idx.xml [Well 8, Field 12] | 713 | 3046    | 5.4 GB
5  | 22254 | Index.idx.xml [Well 8, Field 11] | 713 | 3046    | 5.4 GB
6  | 22253 | Index.idx.xml [Well 8, Field 10] | 713 | 3046    | 5.4 GB
7  | 22252 | Index.idx.xml [Well 8, Field 9]  | 713 | 3046    | 5.4 GB
8  | 22251 | Index.idx.xml [Well 8, Field 8]  | 713 | 3046    | 5.4 GB
9  | 22250 | Index.idx.xml [Well 8, Field 7]  | 713 | 3046    | 5.4 GB
10 | 22249 | Index.idx.xml [Well 8, Field 6]  | 713 | 3046    | 5.4 GB

I see the images imported from the Index.idx.xml listed with a size of 5.4 GB, although I expected each image (7 z-planes and 3 colors per field, with a single frame being 1.5 MB) to be only approx. 32 MB.
Any idea how this was generated?
By the way, how can I delete the files from my OMERO database?
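The expected size can be worked out from the figures in this post. The sketch below also estimates the size of all planes of the plate together; comparing that with the listing is only a guess at what the Size column might mean, based on every row showing the same fileset (713) and the same file count (3046), and is not a confirmed description of the 'fs images' output.

```python
plane_mb = 1.5        # size of a single frame, as stated in the post
z_planes = 7
channels = 3
wells = 10
fields_per_well = 16

# Expected size of one image (one field): all z-planes and channels.
expected_field_mb = z_planes * channels * plane_mb

# Rough size of every plane of the plate taken together.
all_planes = wells * fields_per_well * z_planes * channels
whole_plate_gb = all_planes * plane_mb / 1000

print(expected_field_mb)  # 31.5
print(whole_plate_gb)     # 5.04
```

The per-field estimate of about 32 MB matches the expectation above, while the whole-plate estimate of about 5 GB is in the same range as the reported 5.4 GB. That would be consistent with the size being reported per fileset, but this remains an assumption.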
Thanks
Alex

Re: Bulk import of Perkin Elmer Operetta data

Postby Dominik » Wed May 15, 2019 9:56 am

The easiest way would be to use the 'delete' command. For the various options see:
Code:
./omero delete --help


With respect to the image sizes: I have to check myself what the 'fs' command actually reports there; it might not be what you'd expect (or it might well be a bug).

Regards,
Dominik

Re: Bulk import of Perkin Elmer Operetta data

Postby Dominik » Wed May 15, 2019 11:23 am

I've been playing around a bit with the 'fs' command, and I'm not sure the 'fs images' command is the right tool if you want to check the import. To check the size of an imported plate when you have its ID (copied from the web client, for example), this might be useful:
Code:
./omero fs usage --report --human-readable Plate:2052


To check which files have been imported for a particular plate:
Code:
./omero fs ls 8751


You need the Fileset ID for the previous command, which you can get by running
Code:
./omero obj get Image:47739

on one of the images in the plate.

Kind Regards,
Dominik