We're Hiring!

DatabaseBusyException: how to analyze and overcome?

General user discussion about using the OMERO platform to its fullest. Please ask new questions at https://forum.image.sc/tags/omero
Please note:
Historical discussions about OMERO. Please look for and ask new questions at https://forum.image.sc/tags/omero

There are workflow guides for various OMERO functions on our help site - http://help.openmicroscopy.org

You should find answers to any basic questions about using the clients there.

DatabaseBusyException: how to analyze and overcome?

Postby rpoehlmann » Mon Apr 23, 2018 1:12 pm

Dear all,

seems like we are running in database connection pool issues:
recently, we started to receive several errors from user activities with omero::DatabaseBusyException:
Code: Select all
Internal Server Error: /webclient/
Traceback (most recent call last):
  File "/home/omeronas/omeroweb/lib/python2.7/site-packages/django/core/handlers/base.py", line 132, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/home/omeronas/OMERO.server/lib/python/omeroweb/decorators.py", line 474, in wrapped
    retval = f(request, *args, **kwargs)
  File "/home/omeronas/OMERO.server/lib/python/omeroweb/decorators.py", line 524, in wrapper
    context = f(request, *args, **kwargs)
  File "/home/omeronas/OMERO.server/lib/python/omeroweb/webclient/views.py", line 463, in load_template
    url=url, **kwargs)
  File "/home/omeronas/OMERO.server/lib/python/omeroweb/webclient/views.py", line 388, in _load_template
    conn.getEventContext().groupId)
  File "/home/omeronas/OMERO.server/lib/python/omero/gateway/__init__.py", line 2208, in getEventContext
    self._ctx = self._proxies['admin'].getEventContext()
  File "/home/omeronas/OMERO.server/lib/python/omero/gateway/__init__.py", line 4528, in __call__
    return self.handle_exception(e, *args, **kwargs)
  File "/home/omeronas/OMERO.server/lib/python/omeroweb/webclient/webclient_gateway.py", line 2025, in handle_exception
    e, *args, **kwargs)
  File "/home/omeronas/OMERO.server/lib/python/omero/gateway/__init__.py", line 4525, in __call__
    return self.f(*args, **kwargs)
  File "/home/omeronas/OMERO.server/lib/python/omero_api_IAdmin_ice.py", line 2354, in getEventContext
    return _M_omero.api.IAdmin._op_getEventContext.invoke(self, ((), _ctx))
DatabaseBusyException: exception ::omero::DatabaseBusyException
{
    serverStackTrace = ome.conditions.DatabaseBusyException: Cannot acquire connection
   at ome.services.db.SelfCorrectingDataSource.callWithRetries(SelfCorrectingDataSource.java:113)
   at ome.services.db.SelfCorrectingDataSource.getConnection(SelfCorrectingDataSource.java:60)
   at org.springframework.jdbc.datasource.DataSourceTransactionManager.doBegin(DataSourceTransactionManager.java:203)
   at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:371)
   at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:354)
   at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:103)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at ome.tools.hibernate.ProxyCleanupFilter$Interceptor.invoke(ProxyCleanupFilter.java:249)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at ome.services.util.ServiceHandler.invoke(ServiceHandler.java:121)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
   at com.sun.proxy.$Proxy81.getEventContext(Unknown Source)
   at sun.reflect.GeneratedMethodAccessor452.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:498)
   at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
   at ome.security.basic.BasicSecurityWiring.invoke(BasicSecurityWiring.java:93)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at ome.services.blitz.fire.AopContextInitializer.invoke(AopContextInitializer.java:43)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
   at com.sun.proxy.$Proxy81.getEventContext(Unknown Source)
   at sun.reflect.GeneratedMethodAccessor715.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:498)
   at ome.services.blitz.util.IceMethodInvoker.invoke(IceMethodInvoker.java:172)
   at ome.services.throttling.Callback.run(Callback.java:56)
   at ome.services.throttling.InThreadThrottlingStrategy.callInvokerOnRawArgs(InThreadThrottlingStrategy.java:56)
   at ome.services.blitz.impl.AbstractAmdServant.callInvokerOnRawArgs(AbstractAmdServant.java:140)
   at ome.services.blitz.impl.AdminI.getEventContext_async(AdminI.java:217)
   at sun.reflect.GeneratedMethodAccessor714.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:498)
   at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
   at omero.cmd.CallContext.invoke(CallContext.java:78)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
   at com.sun.proxy.$Proxy82.getEventContext_async(Unknown Source)
   at omero.api._IAdminTie.getEventContext_async(_IAdminTie.java:192)
   at omero.api._IAdminDisp.___getEventContext(_IAdminDisp.java:1881)
   at omero.api._IAdminDisp.__dispatch(_IAdminDisp.java:2041)
   at IceInternal.Incoming.invoke(Incoming.java:221)
   at Ice.ConnectionI.invokeAll(ConnectionI.java:2536)
   at Ice.ConnectionI.dispatch(ConnectionI.java:1145)
   at Ice.ConnectionI.message(ConnectionI.java:1056)
   at IceInternal.ThreadPool.run(ThreadPool.java:395)
   at IceInternal.ThreadPool.access$300(ThreadPool.java:12)
   at IceInternal.ThreadPool$EventHandlerThread.run(ThreadPool.java:832)
   at java.lang.Thread.run(Thread.java:748)

    serverExceptionClass = ome.conditions.DatabaseBusyException
    message = Cannot acquire connection
    backOff = 0


The OMERO config has a "omero.db.poolsize=80" defined. Postgres is still on the default value of 100 connections.

Any ideas how I could further analyze/debug this issue including hints how to overcome?
Any additional info/output needed?

Thanks a lot in advance, best,
-Rainer

P.S.: by the way, talking about OMERO.server-5.3.4-ice36-b69 and psql (PostgreSQL) 9.4.17 on CentOS 7.4.1708
User avatar
rpoehlmann
 
Posts: 42
Joined: Thu Feb 09, 2012 2:04 pm

Re: DatabaseBusyException: how to analyze and overcome?

Postby mtbc » Tue Apr 24, 2018 9:40 am

Dear Rainer,

Thank you, good to know that you are on CentOS 7. I think that you would normally expect rather high limits in places like /etc/security/limits.conf and /proc/sys/fs/file-max but maybe it's worth quickly checking while we are trying to figure this problem out. I am afraid that we are a bit short-handed this week.

The Monitoring and Metrics parts of https://docs.openmicroscopy.org/omero/5 ... mance.html may be helpful for you in keeping an eye on how many files are being held open. After an OMERO.server restart or an OMERO.web restart with luck you will see them start off low and perhaps there is some clue in the pattern of their growth and then you can also get warning that it might be time to restart the server. There's also the question of which server: Are both OMERO.server and OMERO.web running on the same system, as the same user? Could OMERO.web sessions (Django or OMERO) being held open exhaust file handles for OMERO.server's connections to PostgreSQL?

Over recent months we have worked on reducing leakage of file handles. Some of the issues were exacerbated by configuring a public user for OMERO.web: do you have one enabled? Other issues were NumPy-related, I think: which version of that do you have installed? OMERO 5.4.6 should include another round of such leakage fixes but there were earlier ones too so you may already find 5.4.5 an improvement and when we release 5.4.6 that may certainly be worth your consideration.

Sorry to have more questions than answers at this point! I appreciate that restarts can be very disruptive for your user community.

Cheers,
Mark
User avatar
mtbc
Team Member
 
Posts: 282
Joined: Tue Oct 23, 2012 10:59 am
Location: Dundee, Scotland

Re: DatabaseBusyException: how to analyze and overcome?

Postby mtbc » Tue Apr 24, 2018 9:50 am

(A small housekeeping note: QA 20629 thru 20632 are client-side views of this trouble.)
User avatar
mtbc
Team Member
 
Posts: 282
Joined: Tue Oct 23, 2012 10:59 am
Location: Dundee, Scotland

Re: DatabaseBusyException: how to analyze and overcome?

Postby rpoehlmann » Tue Apr 24, 2018 1:04 pm

Dear Mark,

thanks a lot for your reply.

regarding open file limits, there seems to be one: /etc/security/limits.conf
Code: Select all
omeronas  soft  nofile  10000
omeronas  hard  nofile  12000

and
Code: Select all
[omeronas@omero02 ~]$ cat  /proc/sys/fs/file-max
26212486


Mmmh, maybe I need to increase?

I installed VisualVM, and I also could start JConsole. In terms of making sense out of those info I feel slightly overwhelmed since I am not really deeply familiar with JVM monitoring details.
jconsole_threading_attributes.png
jconsole_threading_attributes.png (58.15 KiB) Viewed 1936 times


E.g. is this ThreadCount the actual no. of threads issued by Blitz? So far, I could observe numbers between 50-<80.

VisualVM_Blitz_Screen.png
VisualVM_Blitz_Screen.png (32.76 KiB) Viewed 1936 times


The upgrade to OMERO 5.4.5 is planned, I will asap rehearse it on our TEST system before we will also upgrade the PROD system afterwards. Therefore it is good to hear that this might improve the situation as well.

Please do not hesitate to request additional info if needed.

Thanks a lot, best,
-Rainer
User avatar
rpoehlmann
 
Posts: 42
Joined: Thu Feb 09, 2012 2:04 pm

Re: DatabaseBusyException: how to analyze and overcome?

Postby rpoehlmann » Wed Apr 25, 2018 6:05 am

Addendum:
Interestingly, the no. of open files is reported as defined in "/etc/security/limits.conf"

Code: Select all
[omeronas@omero02 systemd]$ prlimit -n
RESOURCE DESCRIPTION               SOFT  HARD UNITS
NOFILE   max number of open files 10000 12000


However, when I take a closer look at the main Blitz process, this one seems to have different restrictions:

Code: Select all
[omeronas@omero02 systemd]$ prlimit -p 9699 -n
RESOURCE DESCRIPTION              SOFT HARD UNITS
NOFILE   max number of open files 4096 4096

Code: Select all
[omeronas@omero02 systemd]$ ps -ewf | grep 9699
omeronas  9699  2775  5 Apr23 ?        02:15:52 java -Xmx96000m -XX:MaxPermSize=1g -XX:+IgnoreUnrecognizedVMOptions -Djava.awt.headless=true -Dlogback.configurationFile=etc/logback.xml -Domero.logfile=var/log/${omero.name}.log -Domero.logbase=/OMERO/ManagedRepository/ -Domero.name=Blitz-0 ome.services.blitz.Entry --Ice.Config=/home/omeronas/OMERO.server-5.3.4-ice36-b69/var/master/servers/Blitz-0/config/config
omeronas 24595 12805  0 08:02 pts/0    00:00:00 grep --color=auto 9699


Puzzled ...
User avatar
rpoehlmann
 
Posts: 42
Joined: Thu Feb 09, 2012 2:04 pm

Re: DatabaseBusyException: how to analyze and overcome?

Postby jmoore » Thu Apr 26, 2018 2:20 am

Hi Rainer,

a quick suggestion being one of the those who have left the others short-handed...

when & if your server gets into this state, can you check both what the server is doing as well as what PostgreSQL is doing via:

Code: Select all
jstack $(bin/omero admin ice server pid Blitz-0)


and

Code: Select all
psql omero -c "select * from pg_stat_activity"


Cheers,
~Josh
User avatar
jmoore
Site Admin
 
Posts: 1591
Joined: Fri May 22, 2009 1:29 pm
Location: Germany


Return to User Discussion

Who is online

Users browsing this forum: No registered users and 1 guest