Page 1 of 1

Server waits for file access??

PostPosted: Fri Jun 10, 2011 4:37 pm
by graeme_b
We have installed an OMERO server for use in our imaging facility (Oxford biochemistry / Micron advanced microscopy unit). The server is deployed on the department's hierarchical storage system, which means that data not accessed for >60 days remain only on tape. Obviously this means that access times for such archived data are slower than normal hard disk access - which is fine - the problem is, that the OMERO server seems to wait! Not only is the offending user's session frozen while OMERO waits for access to the file, but so are all other users' sessions, and nobody else can log in to the server! :o Can't some sort of multithreading solve this problem? (have we failed to set some flag properly during the install?) These sort of freezes become a very serious usability issue when the server is relied on by more than a handful of users - isn't there a way to ensure that no one user can freeze up the whole server?

Cheers,

Graeme

Re: Server waits for file access??

PostPosted: Fri Jun 10, 2011 5:51 pm
by jmoore
Hi Graeme,

graeme_b wrote:...the problem is, that the OMERO server seems to wait! Not only is the offending user's session frozen while OMERO waits for access to the file, but so are all other users' sessions, and nobody else can log in to the server! :o


That one request hangs is expected, since there is no built in knowledge of the FS. But that the whole session for a user and much less for the whole server hang is definitely not expected. Would you be willing to provide some information to help us figure out just what is going on?

The Blitz-0.log, master.err, and master.out files from var/log would be a great start. However, the output of
Code: Select all
jstack `bin/omero admin ice server pid Blitz-0`

during such a hang would also be very useful. And if possible, such a stack trace from the client while it is hung. Assuming this is one of the Java clients, then you can also use jstack and the pid, which you can find with "jps -m". Or depending on platform, you may be able to attach to the process via "jconsole".

Thanks for bringing it to our attention!

Cheers,
~Josh

Re: Server waits for file access??

PostPosted: Fri Jun 10, 2011 6:06 pm
by graeme_b
Dear Josh,

Thanks for your reply - I will try on Monday to put together the information you need.

Best Regards,

Graeme

Re: Server waits for file access??

PostPosted: Mon Jun 13, 2011 4:16 pm
by graeme_b
Dear Josh,

I have only had time to obtain the log files so far (attached) - I think the server hang happened some time after 10:55 looking through the Blitz.log output from that day (i.e. 2011-06-09) - the other two files don't appear to be much use. I can try to arrange a hang-up and get more diagnostic information next week when I'm back from the OME user's meeting if you can't see anything obvious in the material I've given you today.

Cheers,

Graeme

Re: Server waits for file access??

PostPosted: Tue Jun 14, 2011 7:39 am
by jmoore
Hi Grame,

the only two slow downs I see in the log are:

Code: Select all
~/Downloads/info $ grep ERROR Blitz-0.log.2011-06-09
2011-06-09 11:00:54,086 ERROR [        ome.services.util.ServiceHandler] (l.Server-5) Method interface omeis.providers.re.RenderingEngine.renderCompressed invocation took 246491
2011-06-09 11:05:02,431 ERROR [        ome.services.util.ServiceHandler] (l.Server-6) Method interface omeis.providers.re.RenderingEngine.renderCompressed invocation took 166195


After both of them, there are several other entries, e.g. here from the one at 11:00:54:
Code: Select all

2011-06-09 11:00:54,089 INFO  [       ome.security.basic.CurrentDetails] (l.Server-0) Adding log:INSERT,class ome.model.meta.Session,1281
2011-06-09 11:00:54,102 INFO  [                 org.perf4j.TimingLogger] (l.Server-0) start[1307613653979] time[123] tag[omero.call.success.ome.services.sessions.SessionManagerImpl$2.doWork]
2011-06-09 11:00:54,102 INFO  [        ome.services.util.ServiceHandler] (l.Server-0)  Rslt:   (ome.model.meta.Experimenter:Id_102, ome.model.meta.ExperimenterGroup:Id_6, [1, 4, 6, 0], ... 4 more)
2011-06-09 11:00:54,108 INFO  [ ome.services.blitz.fire.SessionManagerI] (l.Server-0) Created session ServiceFactoryI(session-6fde25af-71f1-4dd0-ac15-e92e01a628c9/727ae449-7783-4f07-9b07-b699053ae706) for user graemeb (agent=OMERO.java)
2011-06-09 11:00:54,294 INFO  [ ome.services.blitz.impl.ServiceFactoryI] (l.Server-3) Added servant to adapter: 727ae449-7783-4f07-9b07-b699053ae706/6fde25af-71f1-4dd0-ac15-e92e01a628c9omero.api.IConfig(omero.api._IConfigTie@77efc077)
2011-06-09 11:00:54,317 INFO  [        ome.services.util.ServiceHandler] (l.Server-2)  Meth:   interface ome.api.IConfig.getConfigValue
2011-06-09 11:00:54,318 INFO  [        ome.services.util.ServiceHandler] (l.Server-2)  Args:   [omero.router.insecure]
2011-06-09 11:00:54,324 INFO  [        ome.services.util.ServiceHandler] (l.Server-4)  Meth:   interface ome.api.IMetadata.loadAnnotations
2011-06-09 11:00:54,325 INFO  [        ome.services.util.ServiceHandler] (l.Server-4)  Args:   [class ome.model.core.Image, (40), (), (), ome.parameters.Parameters@7d40f6ec]
2011-06-09 11:00:54,336 INFO  [         ome.security.basic.EventHandler] (l.Server-4)  Auth:   user=5,group=4,event=null(User),sess=9110eb91-bbda-4197-a8f8-f30a7073fec1
2011-06-09 11:00:54,339 INFO  [         ome.security.basic.EventHandler] (l.Server-2)  Auth:   user=102,group=6,event=11214(User),sess=727ae449-7783-4f07-9b07-b699053ae706
2011-06-09 11:00:54,341 INFO  [                 org.perf4j.TimingLogger] (l.Server-2) start[1307613654318] time[23] tag[omero.call.success.ome.logic.ConfigImpl.getConfigValue]
2011-06-09 11:00:54,341 INFO  [        ome.services.util.ServiceHandler] (l.Server-2)  Rslt:   OMERO.Glacier2/router:tcp -p 4063 -h @omero.host@
2011-06-09 11:00:54,353 INFO  [                 org.perf4j.TimingLogger] (l.Server-4) start[1307613654325] time[28] tag[omero.call.success.ome.logic.MetadataImpl.loadAnnotations]
2011-06-09 11:00:54,353 INFO  [        ome.services.util.ServiceHandler] (l.Server-4)  Rslt:   {40=(ome.model.annotations.FileAnnotation:Id_47)}
2011-06-09 11:00:54,361 INFO  [        ome.services.util.ServiceHandler] (l.Server-8)  Meth:   interface ome.api.IQuery.findAllByQuery
2011-06-09 11:00:54,362 INFO  [        ome.services.util.ServiceHandler] (l.Server-8)  Args:   [select link from ImageAnnotationLink as link left outer join fetch link.child child left outer join fetch child.details.owner left outer join fetch link.details.owner where link.child.id in (:childIDs) and link.parent.id = :parentID, ome.parameters.Parameters@2e2a4369]
2011-06-09 11:00:54,363 INFO  [         ome.security.basic.EventHandler] (l.Server-8)  Auth:   user=5,group=4,event=null(User),sess=9110eb91-bbda-4197-a8f8-f30a7073fec1
2011-06-09 11:00:54,367 INFO  [                 org.perf4j.TimingLogger] (l.Server-8) start[1307613654362] time[5] tag[omero.call.success.ome.logic.QueryImpl.findAllByQuery]
2011-06-09 11:00:54,367 INFO  [        ome.services.util.ServiceHandler] (l.Server-8)  Rslt:   (ome.model.annotations.ImageAnnotationLink:Id_47)
2011-06-09 11:00:54,374 INFO  [ ome.services.blitz.fire.SessionManagerI] (l.Server-9) Found session locally: 727ae449-7783-4f07-9b07-b699053ae706
2011-06-09 11:00:54,378 INFO  [ ome.services.blitz.fire.SessionManagerI] (l.Server-9) Rejoining session ServiceFactoryI(session-9fa52046-ec83-4d24-a1e1-acf0c1a85e3f/727ae449-7783-4f07-9b07-b699053ae706) (agent=OMERO.insight;secure=false)
2011-06-09 11:00:54,396 INFO  [ ome.services.blitz.impl.ServiceFactoryI] (l.Server-5) Added servant to adapter: 727ae449-7783-4f07-9b07-b699053ae706/6fde25af-71f1-4dd0-ac15-e92e01a628c9omero.api.IAdmin(omero.api._IAdminTie@395baa8c)
2011-06-09 11:00:54,397 INFO  [        ome.services.util.ServiceHandler] (l.Server-0)  Meth:   interface ome.api.IMetadata.loadAnnotations
2011-06-09 11:00:54,398 INFO  [        ome.services.util.ServiceHandler] (l.Server-0)  Args:   [class ome.model.core.Image, (40), (), (), ome.parameters.Parameters@1c533552]


so at least not everything is hung. So yes, please, do try to arrange for a hang, as any extra information would be greatly appreciated.

Cheers & see you in Paris!
~Josh