
Re: OMERO table reading time

PostPosted: Tue Jan 03, 2012 6:49 pm
by bhcho
If you think there was concurrent access to the HDF file because of the following two lines in the log:

2012-01-03 11:37:53,337 INFO [ omero.tables.TablesI] (Dummy-4 ) getTable: 2740 {}
...
2012-01-03 11:39:12,293 INFO [ omero.tables.TablesI] (Dummy-8 ) getTable: 2740 {}

these are not concurrent accesses. You can see there is a time difference of 2 minutes, meaning I tried to open the table again 2 minutes later.
Otherwise, I don't have any clue about this.

Re: OMERO table reading time

PostPosted: Tue Jan 10, 2012 2:29 pm
by jmoore
Hi BK,

I don't know what might have caused the hanging of your tables service. If it happens again, please let me know, along with any info about what else was happening at the time.

As for the read times, my only suggestion for the moment is to read fewer columns and rows at a time. In testing on your file, I got the best performance by reading half the columns for the first 2000 rows, then the second half of the columns for the same 2000 rows, then going on to the next 2000 rows, and so on.
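
Roughly, the access pattern I mean looks like the sketch below. It's not exact code, just an illustration: 'conn' and 'fid' stand for your BlitzGateway connection and the OriginalFile id of the table, and you may want to tune the column split and block size.

Code:
import omero
import omero.model

# Sketch only: 'conn' is an existing BlitzGateway connection and 'fid' the
# OriginalFile id of the HDF-backed table.
table = conn.getSharedResources().openTable(omero.model.OriginalFileI(fid, False))
try:
    num_col = len(table.getHeaders())
    num_row = table.getNumberOfRows()

    first_half = range(0, num_col / 2)
    second_half = range(num_col / 2, num_col)

    block = 2000
    for start in range(0, num_row, block):
        stop = min(start + block, num_row)
        # two reads per block of rows: one for each half of the columns
        left = table.read(first_half, start, stop)
        right = table.read(second_half, start, stop)
        # left.columns and right.columns hold the column data (.values)
finally:
    table.close()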

I'll keep looking into how to increase the speeds as we work on 4.4.

Cheers,
~Josh

Re: OMERO table reading time

PostPosted: Tue Jan 10, 2012 3:11 pm
by bhcho
Thanks Josh,

There was a DDoS attack during that period. I don't think the attack directly caused the symptom, but I guess it could have been an indirect source of the problem. I'll let you know if it happens again.

And thanks for your suggestion of splitting the reading process. How much did you cut the reading time by doing that?

Best,
BK

Re: OMERO table reading time

PostPosted: Tue Jan 10, 2012 4:16 pm
by jmoore
I got down as low as 2 seconds, which is still admittedly too long.

~J.

Re: OMERO table reading time

PostPosted: Tue Jan 10, 2012 8:09 pm
by bhcho
For me, it took almost the same time as before (10~12 sec.). Here's my code.

Code:
import omero
import omero.model

def chunks(l, n):
    """Split list l into consecutive slices of at most n elements."""
    return [l[i:i + n] for i in range(0, len(l), n)]

# 'conn' is the BlitzGateway connection and 'fid' the OriginalFile id of the table
table = conn.getSharedResources().openTable(omero.model.OriginalFileI(fid, False))

num_col = len(table.getHeaders())
num_row = table.getNumberOfRows()

chunk_col = chunks(range(num_col), 100)    # read 100 columns at a time
chunk_row = chunks(range(num_row), 1000)   # read 1000 rows at a time

data = []
for rows in chunk_row:
    temp_col = []
    for cols in chunk_col:
        # read this block of columns for the current block of rows
        values = table.read(cols, rows[0], rows[-1] + 1)
        for col in values.columns:
            temp_col.append(list(col.values))
    # transpose so that each entry of 'data' is one full row
    for r in zip(*temp_col):
        data.append(r)


FYI, when I timed only the 'table.read' calls from the code above (without the lines below them), it took almost the same time. This suggests (I guess) that the time is spent in 'table.read' itself (or on our server).
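
For reference, this is roughly how I timed the reads alone (a sketch, not my exact script; 'table', 'chunk_col' and 'chunk_row' are as in the code above):

Code:
import time

t0 = time.time()
for rows in chunk_row:
    for cols in chunk_col:
        table.read(cols, rows[0], rows[-1] + 1)   # read only, discard the result
print "read-only time: %.1f sec." % (time.time() - t0)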

BK