How to analyze leaky webapps
Web applications have a lifecycle – they can be deployed, started, stopped, restarted and undeployed. Administrators expect that this happens in a clean way and all resources allocated by a webapp are freed when it’s undeployed. But with a growing complexity of applications and a growing number of third-party libraries, it becomes harder for apps to cleanly shut down due to leaky components. This leads to OutOfMemory errors and other weird behaviors of web applications. Find out how to analyze leaks in Java web applications using a memory analyzer and heap dumps.
Two related types of leaks are ClassLoader leaks and Thread leaks. When a web application starts a new thread, the thread inherits the context ClassLoader of the thread that created it, which inside a web application is normally the WebappClassLoader. If the thread never dies (e.g. TimerTask threads, schedulers, cache cleanup threads), the WebappClassLoader cannot be garbage collected, and neither can the classes it loaded. Every redeployment then leaves another copy of the application in memory until the VM runs out of it.
To find such leaks, you can use the Eclipse Memory Analyzer (MAT), which I introduced in my last post. Start your application server with the -XX:+HeapDumpOnOutOfMemoryError command line option (Sun/Oracle VM only), then deploy your application a couple of times (multiple rounds of deploy/undeploy or start/stop). When an OutOfMemoryError occurs, the VM will create an .hprof heap dump file. Load it into MAT.
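To see why a thread started by a web application holds on to its ClassLoader, here is a minimal sketch in plain Java. It does not use Tomcat at all: the URLClassLoader is only a hypothetical stand-in for the WebappClassLoader, and the class and method names are made up for this illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ContextLoaderInheritanceDemo {

    /** Starts a thread that never finishes on its own, like a forgotten cleanup thread. */
    static Thread startWorker() {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // stands in for a timer or cleanup loop
            } catch (InterruptedException e) {
                // interrupted: fall through and let the thread end
            }
        }, "leaky-worker");
        t.setDaemon(true); // only so that this demo JVM can always exit
        t.start();
        return t;
    }

    /** Returns true if a thread created while 'loader' is the context ClassLoader inherits it. */
    static boolean workerInherits(ClassLoader loader) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        current.setContextClassLoader(loader); // roughly what a container does around webapp code
        try {
            Thread worker = startWorker();
            boolean inherited = worker.getContextClassLoader() == loader;
            worker.interrupt(); // a real leak is the case where nobody ever does this
            return inherited;
        } finally {
            current.setContextClassLoader(original);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the container's WebappClassLoader.
        ClassLoader fakeWebappLoader = new URLClassLoader(
                new URL[0], ContextLoaderInheritanceDemo.class.getClassLoader());
        System.out.println("inherited: " + workerInherits(fakeWebappLoader));
    }
}
```

As long as such a worker thread is alive, it is a GC root, and its reference to the context ClassLoader keeps the loader and every class it defined reachable.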
MAT now analyzes the heap dump file. It builds a set of highly optimized indices that allow extremely quick browsing, even for very large dumps. It does this only once per dump; the next time you load the same dump, it opens immediately. When MAT is ready, it shows a wizard. You can safely skip the dialog by pressing Cancel. We’re not interested in the standard reports right now, as they’re not very suitable for our case. (We already know what we’re looking for, but if you analyze your own application using MAT, the Leak Suspects report may be very informative.) In the toolbar, select the Histogram icon and group the data by ClassLoader. This will give us a list of all ClassLoaders in the system at the time of the OutOfMemoryError.
You will then receive a list of ClassLoaders, and we’re interested in the specific ones for Web Applications. For an application running in Tomcat, we’re looking for the org.apache.catalina.loader.WebappClassLoader entries.
If you select one of the entries, the Inspector view will show details about the ClassLoader object. For a Tomcat ClassLoader, we can use the started attribute to distinguish web applications which were still running and web applications which should have been shut down. Look for instances where started is false and some of the attributes have been set to null like in the following picture:
Now, if you have found such instances of WebappClassLoader, right-click on one and select Merge Shortest Paths to GC Roots > Exclude all … references. A new tab will open showing you the objects that prevent the ClassLoader from being garbage collected. If you follow the path, you may find the culprit and fix it. For leaked threads, try to avoid starting the thread in the first place, or find a way to shut it down when the web application receives the STOP event from the application server. Spring-based applications usually do that by shutting down the Spring application context, which calls the destroy-method on beans. Of course, there are plenty of other ways to tell components that they should shut down now and clean up their resources.
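As an illustration of a component that can be told to shut down, here is a minimal sketch of a background job with explicit start and stop methods. The class name CacheCleanupService, the one-minute interval and the ten-second timeout are all assumptions made for this example; they do not come from the application analyzed above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CacheCleanupService {

    private ScheduledExecutorService scheduler;

    /** Starts the background thread; call this when the web application starts. */
    public synchronized void start() {
        if (scheduler != null) {
            return; // already running
        }
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::cleanUp, 1, 1, TimeUnit.MINUTES);
    }

    /**
     * Stops the background thread; call this when the web application stops.
     * Returns true if the thread terminated within the timeout.
     */
    public synchronized boolean stop() {
        if (scheduler == null) {
            return true; // nothing to stop
        }
        scheduler.shutdownNow(); // cancels pending runs and interrupts the worker
        boolean terminated;
        try {
            terminated = scheduler.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            terminated = false;
        }
        scheduler = null;
        return terminated;
    }

    private void cleanUp() {
        // placeholder for the real periodic work
    }

    public static void main(String[] args) {
        CacheCleanupService service = new CacheCleanupService();
        service.start();
        System.out.println("stopped cleanly: " + service.stop());
    }
}
```

In a servlet container, one way to wire this up is to call start() from ServletContextListener.contextInitialized and stop() from contextDestroyed. In a Spring XML configuration, declaring stop as the bean's destroy-method should have the same effect.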
In the example, we found the culprit: EhCache’s automatic update check. By the way, Terracotta (the vendor of EhCache and Quartz) recommends disabling this feature in production environments anyway.
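If you want to switch the update check off, the following sketch shows one way to do it from code. It assumes an EhCache 2.x version that reads the net.sf.ehcache.skipUpdateCheck system property; please check the documentation of the version you actually use. The class name is hypothetical, and the property has to be set before the first CacheManager is created.

```java
public class DisableEhcacheUpdateCheck {

    // Assumption: EhCache 2.x skips its update check when this property is "true".
    static final String SKIP_UPDATE_CHECK = "net.sf.ehcache.skipUpdateCheck";

    /** Sets the system property; must run before any CacheManager is created. */
    static void apply() {
        System.setProperty(SKIP_UPDATE_CHECK, "true");
    }

    public static void main(String[] args) {
        apply();
        System.out.println(SKIP_UPDATE_CHECK + "=" + System.getProperty(SKIP_UPDATE_CHECK));
    }
}
```

The same property can be passed on the command line as -Dnet.sf.ehcache.skipUpdateCheck=true. As far as I know, EhCache 2.x also accepts updateCheck="false" on the root element of ehcache.xml.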
Happy Leak Hunting!