MessageSize and server OOM (possibly) related problems

Post by dpwrussell » Tue Nov 05, 2013 2:06 pm

Public continuation of discussions and OMERO QAs 7651 and 7659.

Basically, there is a client-side Ice::MemoryLimitException (see 7651). After this client error, the server continues to operate.

There is also a server-side java.lang.OutOfMemoryError: Java heap space, which is likely related. I've uploaded two server OOM logs in a tar.bzip2 file here: https://www.openmicroscopy.org/qa2/qa/feedback/7693/

Existing relevant settings:

<property name="Ice.MessageSizeMax" value="131072"/>
<option>-Xmx2048M</option>
<option>-XX:MaxPermSize=1024M</option>

Re: MessageSize and server OOM (possibly) related problems

Post by jmoore » Tue Nov 05, 2013 3:42 pm

Here are the OOMs in the provided log files:
Code:
OOM.log:2013-10-08 07:39:34,525
OOM2.log:2013-10-14 17:34:59,807
OOM2.log:2013-10-14 17:35:08,463
OOM2.log:2013-10-14 17:42:03,802
OOM2.log:2013-10-14 17:45:36,377


Ignoring the one from the 8th, I noticed that this starts during a delete operation (perhaps by another user). Helio, can you try to describe specifically the steps that led to the crash? Also, at what times did the other failures occur?
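
For anyone who wants to pull the same kind of OOM listing out of their own logs, here is a minimal sketch. It assumes each log entry starts with a timestamp such as "2013-10-14 17:34:59,807" and that the OOM shows up as the text "java.lang.OutOfMemoryError"; the function names are made up for illustration and are not part of OMERO.

```python
import re
from pathlib import Path

# Timestamp at the start of a log entry, e.g. "2013-10-14 17:34:59,807".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")
OOM_MARKER = "java.lang.OutOfMemoryError"


def find_ooms(lines):
    """Return the timestamp of each log entry that mentions an OOM.

    Continuation lines (stack traces) inherit the timestamp of the entry
    they belong to, and each entry is reported at most once.
    """
    hits = []
    current = None    # timestamp of the entry being read
    counted = False   # whether that entry was already reported
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            current, counted = match.group(1), False
        if OOM_MARKER in line and current is not None and not counted:
            hits.append(current)
            counted = True
    return hits


def report(paths):
    """Print "<file name>:<timestamp>" for every OOM in the given files."""
    for path in map(Path, paths):
        with path.open(errors="replace") as handle:
            for stamp in find_ooms(handle):
                print(f"{path.name}:{stamp}")
```

Calling report(["OOM.log", "OOM2.log"]) on the extracted files would print one line per OOM entry, in the same "file:timestamp" shape as the listing above.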

Cheers,
~Josh

Re: MessageSize and server OOM (possibly) related problems

Post by hroque » Tue Nov 05, 2013 4:25 pm

The deletion operation was made by me at the same time. The deletion concluded successfully while the import failed.
I've tried these imports at different times of day, mostly in the evening, leaving them running overnight (I have also run them during the day, and they fail then too). I've tried this with both Importer and Insight-importer.
The import spends many hours analyzing the data before it tries to import anything.
Sorry for not being more precise, but I hope this helps.
Helio

Re: MessageSize and server OOM (possibly) related problems

Post by jmoore » Wed Nov 06, 2013 11:49 am

Looking through the related QA feedback items, I can only assume that, for now, getting this dataset in is actually going to require more memory. If you can provide a heap dump of the OOM, then we can investigate how to prevent this in the future. You can activate heap dumps via:
Code:
bin/omero admin deploy heap-dump

or
Code:
bin/omero admin deploy heap-dump-tmp


See https://github.com/openmicroscopy/openmicroscopy/blob/v.4.4.9/etc/grid/templates.xml#L184

This will restart the server. On the next restart or the next call to:
Code:
bin/omero admin deploy

(with no options), heap dumps will be disabled again to save disk space.
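
Once a dump has been written, something like the following can help locate it before uploading. This is only a sketch: it assumes the dumps are HotSpot .hprof files (the JVM's default file name is java_pid<pid>.hprof), and the directory you search is up to you, since the exact location used by the two deploy targets is not spelled out here. The function name is hypothetical.

```python
from pathlib import Path


def find_heap_dumps(root):
    """Return (path, size in MiB) for each *.hprof file under root, largest first."""
    dumps = [
        (path, path.stat().st_size / 2**20)
        for path in Path(root).rglob("*.hprof")
        if path.is_file()
    ]
    return sorted(dumps, key=lambda item: item[1], reverse=True)


def print_heap_dumps(root):
    """Print one line per dump so the largest files are easy to spot."""
    for path, size_mib in find_heap_dumps(root):
        print(f"{size_mib:10.1f} MiB  {path}")
```

Heap dumps are roughly as large as the used heap, so with a 16 GB -Xmx it is worth checking the sizes before trying to send one anywhere.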

Cheers,
~Josh

Re: MessageSize and server OOM (possibly) related problems

Post by dpwrussell » Wed Nov 06, 2013 4:09 pm

OK, I've changed the settings to:
Code:
<option>-Xmx16384M</option>
<option>-XX:MaxPermSize=1024M</option>
<property name="Ice.MessageSizeMax" value="524288"/>


I've left -XX:MaxPermSize alone, as it's obviously not the classes themselves that are overloading memory.
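
For reference, here is what the old and new values work out to. This is a small sketch that assumes Ice.MessageSizeMax is interpreted in kilobytes of 1024 bytes and that the M suffix on -Xmx means mebibytes; the helper names are just for illustration.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def message_size_max_bytes(value_kb):
    """Bytes allowed per Ice message, assuming the setting is in 1024-byte kilobytes."""
    return value_kb * KIB


def xmx_bytes(value_m):
    """Maximum Java heap in bytes for an -Xmx<value>M option."""
    return value_m * MIB


old_msg, new_msg = message_size_max_bytes(131072), message_size_max_bytes(524288)
old_heap, new_heap = xmx_bytes(2048), xmx_bytes(16384)

print(f"Ice.MessageSizeMax: {old_msg // MIB} MiB -> {new_msg // MIB} MiB")  # 128 MiB -> 512 MiB
print(f"-Xmx: {old_heap // GIB} GiB -> {new_heap // GIB} GiB")              # 2 GiB -> 16 GiB
```

So the message limit goes up by a factor of 4 and the heap by a factor of 8.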

To recap:

This should deal with the Ice::MemoryLimitException because the metadata is being sent as a single message that exceeds Ice's current limit? I guess something to look at would be breaking up large metadata into multiple messages or whatever Ice magic can give you.
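
To illustrate what "breaking up into multiple messages" could look like in principle, here is a generic sketch. It is not Ice or OMERO API; the function name and the idea of sending each chunk as its own call are assumptions made purely for illustration.

```python
def split_into_chunks(payload: bytes, max_chunk_bytes: int):
    """Yield consecutive slices of payload, each at most max_chunk_bytes long."""
    if max_chunk_bytes <= 0:
        raise ValueError("max_chunk_bytes must be positive")
    for start in range(0, len(payload), max_chunk_bytes):
        yield payload[start:start + max_chunk_bytes]


# Hypothetical usage: each chunk would go out as a separate message, and the
# receiver would concatenate them in order to rebuild the original payload.
chunks = list(split_into_chunks(b"x" * 10, 4))
assert b"".join(chunks) == b"x" * 10
```

The catch is that the receiver still has to hold the reassembled data somewhere, so chunking helps with the per-message limit but not necessarily with server heap usage.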

The OOM still remains somewhat of a mystery. I've activated heap-dump, so next time it happens we'll hopefully have more data, although I guess it may now never happen if it's a concurrency problem and not a leak. At least not until more users start doing more stuff like this at the same time.

Re: MessageSize and server OOM (possibly) related problems

Post by jmoore » Thu Nov 07, 2013 7:48 am

Thanks, Douglas. Sounds like a plan. FYI: in OMERO5, there will be a server-side import queue, so there should not be an uncontrolled number of "saveToDB" actions at any one time.

Cheers,
~Josh

Re: MessageSize and server OOM (possibly) related problems

Post by hroque » Thu Nov 07, 2013 9:42 am

Hi all,

I redid the import and it failed again. I started importing last evening around 19h and it failed sometime during the night. The server wasn't down, from what I can tell.
Here is the error:

Code:
Ice.ConnectionLostException
    error = 0
   at IceInternal.Outgoing.invoke(Outgoing.java:147)
   at omero.api._ServiceFactoryDelM.getAdminService(_ServiceFactoryDelM.java:627)
   at omero.api.ServiceFactoryPrxHelper.getAdminService(ServiceFactoryPrxHelper.java:705)
   at omero.api.ServiceFactoryPrxHelper.getAdminService(ServiceFactoryPrxHelper.java:677)
   at ome.formats.OMEROMetadataStoreClient.initializeServices(OMEROMetadataStoreClient.java:413)
   at ome.formats.OMEROMetadataStoreClient.createRoot(OMEROMetadataStoreClient.java:1049)
   at ome.formats.importer.ImportLibrary.importImage(ImportLibrary.java:769)
   at org.openmicroscopy.shoola.env.data.OMEROGateway.importImage(OMEROGateway.java:6736)
   at org.openmicroscopy.shoola.env.data.OmeroImageServiceImpl.importCandidates(OmeroImageServiceImpl.java:230)
   at org.openmicroscopy.shoola.env.data.OmeroImageServiceImpl.importFile(OmeroImageServiceImpl.java:1475)
   at org.openmicroscopy.shoola.env.data.views.calls.ImagesImporter.importFile(ImagesImporter.java:77)
   at org.openmicroscopy.shoola.env.data.views.calls.ImagesImporter.access$000(ImagesImporter.java:53)
   at org.openmicroscopy.shoola.env.data.views.calls.ImagesImporter$1.doCall(ImagesImporter.java:102)
   at org.openmicroscopy.shoola.env.data.views.BatchCall.doStep(BatchCall.java:144)
   at org.openmicroscopy.shoola.util.concur.tasks.CompositeTask.doStep(CompositeTask.java:226)
   at org.openmicroscopy.shoola.env.data.views.CompositeBatchCall.doStep(CompositeBatchCall.java:126)
   at org.openmicroscopy.shoola.util.concur.tasks.ExecCommand.exec(ExecCommand.java:165)
   at org.openmicroscopy.shoola.util.concur.tasks.ExecCommand.run(ExecCommand.java:276)
   at org.openmicroscopy.shoola.util.concur.tasks.AsyncProcessor$Runner.run(AsyncProcessor.java:91)
   at java.lang.Thread.run(Thread.java:695)

   at org.openmicroscopy.shoola.env.data.OMEROGateway.importImage(OMEROGateway.java:6790)
   at org.openmicroscopy.shoola.env.data.OmeroImageServiceImpl.importCandidates(OmeroImageServiceImpl.java:230)
   at org.openmicroscopy.shoola.env.data.OmeroImageServiceImpl.importFile(OmeroImageServiceImpl.java:1475)
   at org.openmicroscopy.shoola.env.data.views.calls.ImagesImporter.importFile(ImagesImporter.java:77)
   at org.openmicroscopy.shoola.env.data.views.calls.ImagesImporter.access$000(ImagesImporter.java:53)
   at org.openmicroscopy.shoola.env.data.views.calls.ImagesImporter$1.doCall(ImagesImporter.java:102)
   at org.openmicroscopy.shoola.env.data.views.BatchCall.doStep(BatchCall.java:144)
   at org.openmicroscopy.shoola.util.concur.tasks.CompositeTask.doStep(CompositeTask.java:226)
   at org.openmicroscopy.shoola.env.data.views.CompositeBatchCall.doStep(CompositeBatchCall.java:126)
   at org.openmicroscopy.shoola.util.concur.tasks.ExecCommand.exec(ExecCommand.java:165)
   at org.openmicroscopy.shoola.util.concur.tasks.ExecCommand.run(ExecCommand.java:276)
   at org.openmicroscopy.shoola.util.concur.tasks.AsyncProcessor$Runner.run(AsyncProcessor.java:91)
   at java.lang.Thread.run(Thread.java:695)
Caused by: Ice.ConnectionLostException
    error = 0
   at IceInternal.Outgoing.invoke(Outgoing.java:147)
   at omero.api._ServiceFactoryDelM.getAdminService(_ServiceFactoryDelM.java:627)
   at omero.api.ServiceFactoryPrxHelper.getAdminService(ServiceFactoryPrxHelper.java:705)
   at omero.api.ServiceFactoryPrxHelper.getAdminService(ServiceFactoryPrxHelper.java:677)
   at ome.formats.OMEROMetadataStoreClient.initializeServices(OMEROMetadataStoreClient.java:413)
   at ome.formats.OMEROMetadataStoreClient.createRoot(OMEROMetadataStoreClient.java:1049)
   at ome.formats.importer.ImportLibrary.importImage(ImportLibrary.java:769)
   at org.openmicroscopy.shoola.env.data.OMEROGateway.importImage(OMEROGateway.java:6736)
   ... 12 more

Re: MessageSize and server OOM (possibly) related problems

Post by jmoore » Fri Nov 08, 2013 7:27 am

Could I get your ~/omero/log/omeroinsight.log file as well as the server logs and the heap dump if available? Thanks, ~Josh

Re: MessageSize and server OOM (possibly) related problems

Post by hroque » Fri Nov 08, 2013 10:11 am

I'm not sure how to upload the files here; it does not seem to work. Is there a size limit?
Anyway, here is a link to the log:

https://www.dropbox.com/s/mrd347eyjylyy ... nsight.log

Re: MessageSize and server OOM (possibly) related problems

Post by jmoore » Fri Nov 08, 2013 4:11 pm

Thanks for the log, Helio.
Code:
...SNIP...
2013-11-06 18:58:01,300 INFO  [   ome.formats.importer.ImportCandidates] ( Thread-33) 74589 file(s) parsed into 1 group(s) with 2 call(s) to setId in 183097ms. (304743ms total) [1 unknowns]
2013-11-06 18:58:01,590 INFO  [       ome.formats.importer.ImportConfig] ( Thread-33) OMERO Version: 4.4.8-ice33-b256
2013-11-06 18:58:01,590 INFO  [       ome.formats.importer.ImportConfig] ( Thread-33) Bioformats version: 4.4.8 revision: 660f607 date: 1 May 2013
...SNIP...
Caused by: Ice.ConnectionLostException
    error = 0
   at IceInternal.Outgoing.invoke(Outgoing.java:147)
   at omero.api._ServiceFactoryDelM.getAdminService(_ServiceFactoryDelM.java:627)
   at omero.api.ServiceFactoryPrxHelper.getAdminService(ServiceFactoryPrxHelper.java:705)
   at omero.api.ServiceFactoryPrxHelper.getAdminService(ServiceFactoryPrxHelper.java:677)
   at ome.formats.OMEROMetadataStoreClient.initializeServices(OMEROMetadataStoreClient.java:413)
   at ome.formats.OMEROMetadataStoreClient.createRoot(OMEROMetadataStoreClient.java:1049)
   at ome.formats.importer.ImportLibrary.importImage(ImportLibrary.java:769)
   at org.openmicroscopy.shoola.env.data.OMEROGateway.importImage(OMEROGateway.java:6736)
   ... 12 more
Exception in thread "Thread-33"

The above looks like an error in the long-running connection code, which we fixed in 4.4.9. Could you try upgrading to the latest release (4.4.9) and see if that gets you past this issue?

Thanks for your patience!
~Josh