#241 closed defect (fixed)
too many open files
Reported by: | dmn | Owned by: | kjdon |
---|---|---|---|
Priority: | very high | Milestone: | 3.04 Release |
Component: | Greenstone3 Runtime | Severity: | major |
Keywords: | Cc: |
Description
Tomcat falls over with "too many open files".
These are typically open .ldb files. In normal operation GS3 keeps several .ldb files open per collection (3, 4, 5, 6...).
Linux systems usually have a limit of 1024 open files per process. Over time the number of open files grows, and once it exceeds the limit Tomcat and GS3 crash.
Here is sample output from lsof showing several (6) ldb files open at once for the collection ikrgrxv:
java 23255 daven 191rR REG 0,18 143581 19743405 /home/daven/research/greenstone3/web/sites/localsite/collect/ikrgrxv/index/text/ikrgrxv.ldb (goblin:/export/home/staff/daven)
java 23255 daven 192rR REG 0,18 143581 19743405 /home/daven/research/greenstone3/web/sites/localsite/collect/ikrgrxv/index/text/ikrgrxv.ldb (goblin:/export/home/staff/daven)
java 23255 daven 193rR REG 0,18 143581 19743405 /home/daven/research/greenstone3/web/sites/localsite/collect/ikrgrxv/index/text/ikrgrxv.ldb (goblin:/export/home/staff/daven)
java 23255 daven 443rR REG 0,18 143581 19743405 /home/daven/research/greenstone3/web/sites/localsite/collect/ikrgrxv/index/text/ikrgrxv.ldb (goblin:/export/home/staff/daven)
java 23255 daven 444rR REG 0,18 143581 19743405 /home/daven/research/greenstone3/web/sites/localsite/collect/ikrgrxv/index/text/ikrgrxv.ldb (goblin:/export/home/staff/daven)
java 23255 daven 445rR REG 0,18 143581 19743405 /home/daven/research/greenstone3/web/sites/localsite/collect/ikrgrxv/index/text/ikrgrxv.ldb (goblin:/export/home/staff/daven)
This output was produced mainly by reconfiguring the collection repeatedly, e.g. with a URL such as
http://kiwi.cs.waikato.ac.nz:8090/greenstone3/classic?a=s&sa=c
Guess: somewhere in the collection reconfigure code (or the caching code?) we are holding onto references that keep file handles open, and if you have enough collections this will eventually kill Tomcat. Because most of the open file handles (approx 75%) are ldb files, the problem is probably in the GDBM code, either the Java wrapper or the native code.
You will see errors in the Tomcat logs like this:
SEVERE: Error reading tld listeners java.io.FileNotFoundException: /home/daven/research/greenstone3/packages/tomcat/work/Catalina/localhost/greenstone3/tldCache.ser (Too many open files)
java.io.FileNotFoundException: /home/daven/research/greenstone3/packages/tomcat/work/Catalina/localhost/greenstone3/tldCache.ser (Too many open files)
java.io.FileNotFoundException: /home/daven/research/greenstone3/packages/tomcat/conf/web.xml (Too many open files)
in
greenstone3/packages/tomcat/logs/catalina<date>.log
The actual files named in these errors vary, because once the limit is reached any attempt to open a file can fail, which produces unpredictable errors.
This Bash script gives you an idea of how to check the number of open files:
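# Find the PID of the Tomcat JVM (second column of the ps output).
TOMCAT_ID=$(ps ux | grep tomcat | grep java | grep -v grep | awk '{ print $2 }')
echo "Tomcat_ID: $TOMCAT_ID"
# Count its open files (the total includes lsof's one header line).
NUM_OPEN_FILES=$(/usr/sbin/lsof -p "$TOMCAT_ID" | wc -l)
echo "open files: $NUM_OPEN_FILES"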
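The same check can also be made from inside the JVM, which avoids depending on lsof being installed. The following is a minimal sketch, assuming a Unix-like platform and a HotSpot/OpenJDK runtime that exposes `com.sun.management.UnixOperatingSystemMXBean`; the class name `FdCountSketch` and the method `fdUsage` are made up for illustration and are not part of Greenstone. It reports on the JVM it runs in, so to measure Tomcat it would have to be called from code running inside the webapp.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper, not part of Greenstone.
class FdCountSketch {

    /**
     * Returns {open descriptors, descriptor limit} for the current JVM,
     * or {-1, -1} if the platform bean does not expose these values.
     */
    static long[] fdUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return new long[] { -1L, -1L };
    }

    public static void main(String[] args) {
        long[] usage = fdUsage();
        System.out.println("open files: " + usage[0] + " (limit " + usage[1] + ")");
    }
}
```

Logging these two numbers after each reconfigure would show whether the count keeps climbing towards the limit.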
Change History (6)
comment:1 by , 16 years ago
comment:2 by , 16 years ago
Owner: | changed from | to
---|
comment:3 by , 16 years ago
Status: | new → assigned |
---|
comment:4 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I have fixed the ever-increasing number of file handles. On a reconfigure, the old collection object wasn't being cleaned up properly. Now it is (hopefully). Multiple reconfigures shouldn't increase the number of file handles now.
However, there is still a problem: if the number of collections is too large, there could still be too many file handles. See new ticket #250.
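The kind of leak described in this comment can be illustrated with a small model. This is only a sketch under stated assumptions, not the actual Greenstone change: `Collection`, `Registry`, `reconfigure` and `handlesAfter` are hypothetical names, and a counter stands in for real .ldb file handles. It contrasts a reconfigure that simply replaces the old collection object with one that calls `cleanUp()` on it first.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the reconfigure leak; names are illustrative only.
class ReconfigureSketch {

    /** Stand-in for a collection object that holds one open database handle. */
    static class Collection {
        private final AtomicInteger openHandles;
        private boolean open = true;

        Collection(AtomicInteger openHandles) {
            this.openHandles = openHandles;
            openHandles.incrementAndGet();   // "opens" the .ldb file
        }

        void cleanUp() {
            if (open) {
                open = false;
                openHandles.decrementAndGet();   // "closes" the .ldb file
            }
        }
    }

    /** Stand-in for whatever maps collection names to live collection objects. */
    static class Registry {
        final AtomicInteger openHandles = new AtomicInteger();
        private final Map<String, Collection> collections = new HashMap<>();

        void reconfigure(String name, boolean cleanUpOld) {
            Collection old = collections.remove(name);
            if (old != null && cleanUpOld) {
                old.cleanUp();   // the step that was effectively missing
            }
            collections.put(name, new Collection(openHandles));
        }
    }

    /** Number of handles left open after reconfiguring one collection repeatedly. */
    static int handlesAfter(int reconfigures, boolean cleanUpOld) {
        Registry registry = new Registry();
        for (int i = 0; i < reconfigures; i++) {
            registry.reconfigure("ikrgrxv", cleanUpOld);
        }
        return registry.openHandles.get();
    }

    public static void main(String[] args) {
        System.out.println("without cleanUp: " + handlesAfter(5, false));
        System.out.println("with cleanUp:    " + handlesAfter(5, true));
    }
}
```

In this model the handle count grows by one per reconfigure when the old object is dropped without cleanup, and stays at one per collection when it is cleaned up. That also shows the remaining problem: the total still grows with the number of collections.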
comment:5 by , 9 years ago
Severity: | → major |
---|
It seems that the open ldb files are connected to the number of open services, so in here:
http://trac.greenstone.org/browser/greenstone3/trunk/src/java/org/greenstone/gsdl3/service/GS2Browse.java?rev=14185
this code keeps an open file handle via
protected GDBMWrapper gdbm_src
If you force a cleanUp() at the end of the configure() method then the open files are reduced but the web UI (and maybe the SOAP UI?) falls over.
If the architecture we have is an open file for each classifier (or 1 per browse and 1 per search or similar) then I think it will not scale and needs to be fundamentally rewritten.
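One possible direction, offered only as a sketch and not as anything proposed or implemented in this ticket, is to let all services that read the same database file share a single reference-counted handle, so the number of open files scales with the number of distinct .ldb files rather than with the number of services. `HandlePool` and `Handle` below are hypothetical; a real version would wrap the actual database wrapper and would need to confirm that one handle can safely be shared between threads.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a shared, reference-counted handle pool.
class SharedHandleSketch {

    /** Stand-in for a database wrapper that owns one OS file handle. */
    static class Handle {
        final String path;
        int refCount = 0;
        boolean open = true;

        Handle(String path) {
            this.path = path;
        }
    }

    static class HandlePool {
        private final Map<String, Handle> handles = new HashMap<>();

        /** Returns the shared handle for this path, opening it on first use. */
        synchronized Handle acquire(String path) {
            Handle h = handles.computeIfAbsent(path, Handle::new);
            h.refCount++;
            return h;
        }

        /** Drops one reference; the handle is closed when no service uses it. */
        synchronized void release(Handle h) {
            if (!h.open) {
                return;   // already closed, ignore a stray release
            }
            if (--h.refCount == 0) {
                h.open = false;
                handles.remove(h.path);
            }
        }

        synchronized int openCount() {
            return handles.size();
        }
    }

    public static void main(String[] args) {
        HandlePool pool = new HandlePool();
        String db = "collect/ikrgrxv/index/text/ikrgrxv.ldb";

        // Three services of one collection share one underlying handle.
        Handle browse = pool.acquire(db);
        Handle search = pool.acquire(db);
        Handle classifier = pool.acquire(db);
        System.out.println("open files with 3 services: " + pool.openCount());

        pool.release(browse);
        pool.release(search);
        pool.release(classifier);
        System.out.println("open files after release:   " + pool.openCount());
    }
}
```

With this shape, each service's cleanup releases its reference instead of closing the file outright, so cleaning up one service would not pull the database out from under the others.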