DHIS2 server unresponsive after heavy concurrent requests
Arnau Sanchez edited this page Aug 11, 2017 · 2 revisions
The problem appears when many concurrent requests hit an endpoint that stresses the database. For example, this ab command (50 concurrent requests) leaves the server unresponsive:
$ ab -q -s 9999 -A admin:district -n 50 -c 50 -m GET \
https://play.dhis2.org/android-previous1/api/reportTables/xIWpSo5jjT1/data.html
The problem needs further investigation. As a starting point, some notes:
- Neither the tomcat nor the DHIS2 logs contain errors that could give hints about the problem.
- When the server is unresponsive, there are as many "postgres: dhis previous1 127.0.0.1(33822) idle in transaction" processes as simultaneous requests. This may be a sign of a deadlock in transactions opened by the dhis2/hibernate code (caveat: it may also be the consequence of an error somewhere else!).
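To see which user and database the stuck backends belong to, the process titles can be grouped. This is a minimal sketch, assuming the title format quoted in the note above; the function name `summarize_idle_in_tx` is hypothetical and not part of DHIS2 or PostgreSQL:

```shell
# Hypothetical helper: count "idle in transaction" backends per user/database.
# Reads ps-style lines on stdin.
# Assumed title format: "postgres: <user> <database> <client> idle in transaction"
summarize_idle_in_tx() {
  awk '
    # The [i] bracket keeps this awk process from matching its own ps entry.
    /[i]dle in transaction/ {
      for (i = 1; i <= NF; i++) {
        if ($i == "postgres:") {
          # The two fields after the marker are user and database.
          count[$(i + 1) " " $(i + 2)]++
          break
        }
      }
    }
    END {
      for (key in count) print count[key], key
    }
  ' | sort -k2
}
```

Usage: `ps awx | summarize_idle_in_tx` prints one line per user/database pair, with the number of backends first.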
Testing snippet:
watch '
echo -n "pg_locks: "
echo "select * from pg_locks;" | sudo -u postgres psql -t previous1 | grep "." | wc -l
echo -n "idle in transaction: "
ps awx | grep "[i]dle in transaction" | wc -l
'
When idle:
pg_locks: 2
idle in transaction: 0
With 40 requests from ab, the server freezes and the snippet reports:
pg_locks: 4102
idle in transaction: 40
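The pg_locks total alone does not show whether any backend is actually waiting for a lock. A possible next step, sketched here under assumptions, is to split the count by the `granted` column of pg_locks. The helper name `lock_summary` and the overridable `PSQL` variable are hypothetical; the default command mirrors the one used in the testing snippet:

```shell
# Hypothetical helper: split the pg_locks count into granted and waiting locks.
# PSQL can be overridden; the default is an assumption taken from the snippet.
PSQL="${PSQL:-sudo -u postgres psql}"

lock_summary() {
  db="$1"
  # -A: unaligned output, -t: rows only, -F " ": space as field separator.
  # psql prints booleans as t/f, so each row looks like "t 123" or "f 4".
  $PSQL -A -t -F " " -c "select granted, count(*) from pg_locks group by granted;" "$db" |
    awk '
      $1 == "t" { granted = $2 }
      $1 == "f" { waiting = $2 }
      END { printf "granted: %d, waiting: %d\n", granted, waiting }
    '
}
```

Usage: `lock_summary previous1`. If `waiting` stays at 0 while the server is frozen, no backend is blocked on a lock inside PostgreSQL, which would suggest the sessions are being held open on the application side instead.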
Maybe relevant:
https://stackoverflow.com/questions/32255557/postgresql-hang-forever-on-serializable-transaction
https://dba.stackexchange.com/questions/118922/select-1-idle-in-transaction