An OOM or OOME (OutOfMemoryError) simply means that the JVM ran out of memory. When this occurs, you basically have 2 choices:
- Allow the JVM to use more memory using the -Xmx VM argument. For instance, to allow the JVM to use 1 GB (1024 MB) of memory:
java -Xmx1024m ...
- Improve/Fix the application so that it uses less memory
In many cases, like in the case of a memory leak, the second option is the only sound choice. A memory leak happens when the application keeps more and more references to objects and never releases them. The garbage collector will therefore never collect those objects and less and less free memory will be available until we reach the point where not enough free memory is available for the application to function normally. At this point, the JVM will throw an OOM.
A memory leak can be very latent. For instance, the application might behave flawlessly during development and QA. However, it suddenly throws a OOM after several days in production at customer site. To solve that issue, you first need to find the root cause of it. The root cause can be very hard to find in development if it cannot be reproduced in-house. Here are the steps to follow in order to find the root cause and fix that issue:
- Start the application with the VM argument -XX:+HeapDumpOnOutOfMemoryError. This will tell the VM to produce a heap dump when a OOM occurs:
java -XX:+HeapDumpOnOutOfMemoryError ...
- Reproduce the problem. Well, if you cannot reproduce in dev, you will have to use the production environment.
- Use VisualVM to read the heap dump file and diagnose the issue.
First of all, a heap dump is the dump of the heap (duh!). It will allow you to navigate the heap and see what objects use all the heap memory and which are the ones that still keep a reference on them, and so on and so forth. This will give you very strong hints and you will (hopefully) be able to find the root cause of the problem. The problem could be a cache that grows indefinitely, a list that keeps collecting business-specific data in memory, a huge request that tries to load almost all data from database in memory, etc.
Once you know the root cause of the problem, you can elaborate solutions to fix it. In case of a cache that grows indefinitely, a good solution could be to set a reasonable limit to that cache. In case of a query that tries to load almost all data from database in memory, you may have to change the way you manipulate data; you could even have to change the behavior of some functionalities of the application.
If you do not want to wait for a OOM or if you just want to see what is in memory, you can still generate heap dump. To manually trigger a heap dump, you have 2 choices:
- Use VisualVM, right-click on the process on the left pane and select Heap Dump
- If you do not have a graphical environment and can't use vnc (VisualVM needs a graphical environment), use jps and jmap to generate the heap dump file. Then copy the file to your workstation and use VisualVM to read the heap dump (File -> Load...):
Here is what VisualVM looks with a heap dump:user@host:~$ jps
user@host:~$ jmap -dump:live,format=b,file=heap.bin 21734
Dumping heap to /home/user/heap.bin ...
Heap dump file created
Alternatively, you can also use jhat to read heap dumps.