We're Hiring!

OMERO table reading time

General and open developer discussion about using OMERO APIs from C++, Java, Python, Matlab and more! Please new questions at https://forum.image.sc/tags/omero
Please note:
Historical discussions about OMERO. Please look for and ask new questions at https://forum.image.sc/tags/omero

If you are having trouble with custom code, please provide a link to a public repository, ideally GitHub.

OMERO table reading time

Postby bhcho » Mon Dec 19, 2011 7:37 pm

Hi all,

I have an OMERO.table saved on the OMERO DB and the table has around 4000 rows with 170 columns.
In this case, when I tried to read the data into a python variable using the following code, it took 10~11 seconds.

Code: Select all
values = table.read(range(len(table.getHeaders())),0,table.getNumberOfRows())


is this normal? I'm asking this because I think it takes toooooo much time (we'd like to expand the table to almost hundreds of thousands rows). If this is normal, we better look for other repository type for image content (other than OMERO.tables). It will be greatly appreciated if you have any suggestion on this.

Best,
BK
bhcho
 
Posts: 236
Joined: Mon Apr 05, 2010 2:15 pm

Re: OMERO table reading time

Postby jmoore » Mon Dec 19, 2011 7:55 pm

Can you perhaps post the Tables-0.log file which should contain some timing information?

Thanks,
~Josh
User avatar
jmoore
Site Admin
 
Posts: 1591
Joined: Fri May 22, 2009 1:29 pm
Location: Germany

Re: OMERO table reading time

Postby bhcho » Tue Dec 20, 2011 3:28 am

attached.

BK
Attachments
Table.tgz
(167.76 KiB) Downloaded 198 times
bhcho
 
Posts: 236
Joined: Mon Apr 05, 2010 2:15 pm

Re: OMERO table reading time

Postby jmoore » Tue Dec 20, 2011 8:45 am

Hi BK,

I couldn't find any calls that were taking 10+ seconds, but definitely found some that were taking 2+ seconds:
Code: Select all
/tmp/bk $ python lib.py Tables-0.log 
2.27400016785   Method:Meth: TableI.read   2011-12-19 04:35:45,373
2.15200018883   Method:Meth: TableI.read   2011-12-19 04:32:38,100
2.14900016785   Method:Meth: TableI.read   2011-12-19 10:37:58,768
2.13100004196   Method:Meth: TableI.read   2011-12-19 04:22:56,177
2.13100004196   Method:Meth: TableI.read   2011-12-19 04:23:20,656
2.12999987602   Method:Meth: TableI.read   2011-12-19 09:42:17,279
2.11800003052   Method:Meth: TableI.read   2011-12-19 04:38:50,549
2.10800004005   Method:Meth: TableI.read   2011-12-19 09:43:23,274
2.10700011253   Method:Meth: TableI.read   2011-12-19 10:38:46,944
2.09600019455   Method:Meth: TableI.read   2011-12-19 09:47:26,472

which I would still consider took long for reading 4000 rows. Could you also upload sample code and a sample file that I could look at?

Also, it looks like you are frequently opening, reading, and closing the file. Is this due to the multi-process nature, or would it be possible to keep the file open longer, so that PyTables internal caching could play a role?

Code: Select all
654 close
745 getHeaders
745 getNumberOfRows
744 getTable
654 getWhereList
745 read


Cheers,
~Josh.
User avatar
jmoore
Site Admin
 
Posts: 1591
Joined: Fri May 22, 2009 1:29 pm
Location: Germany

Re: OMERO table reading time

Postby bhcho » Tue Dec 20, 2011 2:51 pm

The following is the part of the table reading from Tables log
2011-12-20 09:41:50,740 INFO [ omero.remote] (Dummy-3 ) Meth: TablesI.getTable
2011-12-20 09:41:50,741 INFO [ omero.tables.TablesI] (Dummy-3 ) getTable: 11400 {}
2011-12-20 09:41:50,749 INFO [ omero.tables.HdfStorage] (Dummy-3 ) Size: 1790 - Attaching Table-2E938ABE-C7DD-477E-A536-8CBAE8DA56F2 to /usr1/OMERO432/Files/Dir-011/11400
2011-12-20 09:41:50,750 INFO [ omero.remote] (Dummy-3 ) Rslt: 2E938ABE-C7DD-477E-A536-8CBAE8DA56F2 -t @ Tables-0.TablesAdapter
2011-12-20 09:41:50,761 INFO [ omero.remote] (Dummy-10 ) Meth: TableI.getHeaders
2011-12-20 09:41:50,774 INFO [ omero.tables.TableI] (Dummy-10 ) Table-2E938ABE-C7DD-477E-A536-8CBAE8DA56F2.getHeaders() => size=169
2011-12-20 09:41:50,775 INFO [ omero.remote] (Dummy-10 ) Rslt: [object #0 (::omero::grid::LongColumn)
{
name = INDEX
description =
values = {}
}, object #0 (::omero::grid
2011-12-20 09:41:50,791 INFO [ omero.remote] (Dummy-9 ) Meth: TableI.getNumberOfRows
2011-12-20 09:41:50,791 INFO [ omero.tables.TableI] (Dummy-9 ) Table-2E938ABE-C7DD-477E-A536-8CBAE8DA56F2.getNumberOfRows() => 3998
2011-12-20 09:41:50,792 INFO [ omero.remote] (Dummy-9 ) Rslt: 3998
2011-12-20 09:41:50,793 INFO [ omero.remote] (Dummy-7 ) Meth: TableI.read
2011-12-20 09:41:50,793 INFO [ omero.tables.TableI] (Dummy-7 ) Table-2E938ABE-C7DD-477E-A536-8CBAE8DA56F2.read([0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, $
2011-12-20 09:41:52,736 INFO [ omero.remote] (Dummy-7 ) Rslt: object #0 (::omero::grid::Data)
{
lastModification = 1324253065942
rowNumbers =
{
[0] = 0
[


Could you also upload sample code and a sample file that I could look at?

Could you tell me how to export the OMERO.table on DB (not attached to anything) as a file?

Also, it looks like you are frequently opening, reading, and closing the file. Is this due to the multi-process nature, or would it be possible to keep the file open longer, so that PyTables internal caching could play a role?


it might be because tables were created for 1,000 images at one process (without using queue manager).
and we are open,read, and close at least two tables for one search query.

BK
bhcho
 
Posts: 236
Joined: Mon Apr 05, 2010 2:15 pm

Re: OMERO table reading time

Postby jmoore » Wed Dec 21, 2011 9:40 am

bhcho wrote:The following is the part of the table reading from Tables log


Right, which shows one of your call sequences to take about 2 seconds, right?

Could you tell me how to export the OMERO.table on DB (not attached to anything) as a file?


If you have the table id: table.getOriginalFile().id.val then you can just return the file with that name from under /OMERO/Files.

it might be because tables were created for 1,000 images at one process (without using queue manager).and we are open,read, and close at least two tables for one search query.


I checked the logs and couldn't find multiple concurrent threads. It may be that some of your threads are hanging waiting on the file opening. It would help to see the code (both that which is launching the 1000 processes and that which is accessing the tables).

Cheers,
~Josh.
User avatar
jmoore
Site Admin
 
Posts: 1591
Joined: Fri May 22, 2009 1:29 pm
Location: Germany

Re: OMERO table reading time

Postby bhcho » Thu Dec 22, 2011 2:57 pm

hi Josh,

the file size is too big, so I sent it by email.

BK
bhcho
 
Posts: 236
Joined: Mon Apr 05, 2010 2:15 pm

Re: OMERO table reading time

Postby jmoore » Mon Jan 02, 2012 2:38 pm

Hi BK,

thanks for the file. I've created a ticket: https://trac.openmicroscopy.org.uk/ome/ticket/7711, and am looking at possible reasons for this. As it says in the the ticket, the times on the server-side look quite ok (as I saw looking at your logs), but once the objects are transferred, things look much worse.

Cheers,
~Josh.

P.S. Though it's not implemented yet, for your use, it will eventually be much more efficient to have a float-matrix rather than many float-arrays.
User avatar
jmoore
Site Admin
 
Posts: 1591
Joined: Fri May 22, 2009 1:29 pm
Location: Germany

Re: OMERO table reading time

Postby bhcho » Tue Jan 03, 2012 4:48 pm

Thanks Josh,

I have another question about the table. Today, I could not retrieve a OMERO.table suddenly.
I tried to get the values of the table using the following code
Code: Select all
In [20]: table = conn.getSharedResources().openTable( omero.model.OriginalFileI( fid, False ) )
In [22]: pixels=0
In [23]: channel=0
In [24]: zslice=0
In [25]: timepoint=0
In [26]: query = "(pixels ==" + str(pixels) + ") & (channel ==" + str(channel) + ") & (zslice ==" + str(zslice) + ") & (timepoint =="  + str(timepoint) + ")"
In [27]: ids = table.getWhereList( query, None, 0, table.getNumberOfRows(), 1 )
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)

.../omeroweb/searcher/<ipython console> in <module>()

AttributeError: 'NoneType' object has no attribute 'getWhereList'


when I looked at the Tables log, it says something like...

2012-01-03 11:26:42,657 INFO [ omero.tables.TablesI] (Dummy-3 ) getTable: 2393 {}
2012-01-03 11:26:42,661 INFO [ omero.remote] (Dummy-3 ) Excp: local variable 'lock' referenced before assignment
2012-01-03 11:26:42,661 WARNI [ omero.remote] (Dummy-3 ) <function getTable at 0x3853a28> raised a non-ServerError (<type 'exceptions.UnboundLocalError'>): local$
2012-01-03 11:37:53,337 INFO [ omero.remote] (Dummy-4 ) Meth: TablesI.getTable
2012-01-03 11:37:53,337 INFO [ omero.tables.TablesI] (Dummy-4 ) getTable: 2740 {}
2012-01-03 11:37:53,340 INFO [ omero.remote] (Dummy-4 ) Excp: local variable 'lock' referenced before assignment
2012-01-03 11:37:53,341 WARNI [ omero.remote] (Dummy-4 ) <function getTable at 0x3853a28> raised a non-ServerError (<type 'exceptions.UnboundLocalError'>): local$
2012-01-03 11:39:12,293 INFO [ omero.remote] (Dummy-8 ) Meth: TablesI.getTable
2012-01-03 11:39:12,293 INFO [ omero.tables.TablesI] (Dummy-8 ) getTable: 2740 {}
2012-01-03 11:39:12,297 INFO [ omero.remote] (Dummy-8 ) Excp: local variable 'lock' referenced before assignment
2012-01-03 11:39:12,297 WARNI [ omero.remote] (Dummy-8 ) <function getTable at 0x3853a28> raised a non-ServerError (<type 'exceptions.UnboundLocalError'>): local$


do you have any idea?

After I restarted the server, it works okay again. Is this because of another memory issue?

BK
bhcho
 
Posts: 236
Joined: Mon Apr 05, 2010 2:15 pm

Re: OMERO table reading time

Postby jmoore » Tue Jan 03, 2012 6:06 pm

The real exception is being masked by a small bug (local variable 'lock' referenced before assignment), but it looks like there is concurrent access on the HDF file. Any idea how that could be happening?
User avatar
jmoore
Site Admin
 
Posts: 1591
Joined: Fri May 22, 2009 1:29 pm
Location: Germany

Next

Return to Developer Discussion

Who is online

Users browsing this forum: No registered users and 0 guests