How to investigate deadlock issues
For a typical hang that may be caused by a deadlock, collect at least 3 javacore dumps, with a 5-minute interval between one and the next. The best way to gather this data is to follow the instructions in the WAS MustGather technical document for the platform. For example, for a WAS hang on AIX:
Open the javacores with any text editor and search for the string "Deadlock detected".
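When several javacores have been collected, the text search can be scripted. The following is a minimal sketch, not part of any IBM tool: the function names and the file name pattern javacore*.txt are assumptions made for illustration, and only the search string comes from the step above.

```python
from pathlib import Path

# The string to look for, as given in the step above.
MARKER = "Deadlock detected"


def deadlock_lines(lines):
    """Return (line_number, text) pairs for every line containing MARKER."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(lines, start=1)
        if MARKER in line
    ]


def scan_javacores(directory, pattern="javacore*.txt"):
    """Map each matching file name to its MARKER hits.

    The default pattern is an assumption; adjust it to the actual file names.
    """
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        # Replace undecodable bytes instead of failing on them.
        with path.open(encoding="utf-8", errors="replace") as handle:
            hits = deadlock_lines(handle)
        if hits:
            results[path.name] = hits
    return results
```

Calling scan_javacores on the directory that holds the dumps returns an empty dictionary when no javacore contains the string. In that case the hang has to be investigated through the thread and monitor analysis.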
Open the javacores with the IBM tool "Thread and Monitor Dump Analyzer". Find the Thread Status Analysis in the report generated for the first javacore:
Pay particular attention to the number of "Waiting on condition" and "Blocked" threads. At this point it is useful to review the details of the waiting and blocked threads, looking at their stack traces to understand what they are waiting for and why. Points to investigate are:
Lock information is the key to finding a deadlock. A lock is a resource that can be owned by only one thread at a time. Other threads waiting for that lock are blocked until the thread that owns it releases it.
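To make the role of lock information concrete, here is a minimal sketch of how owner and waiter data reveals a deadlock. A deadlock is a cycle in the "waits for" relation between threads. The input format (a dictionary mapping a lock name to its owner and its waiting threads) and all names are hypothetical, and the data is assumed to have already been read from a javacore by hand or by a tool.

```python
def find_deadlock_cycle(monitors):
    """Return a list of threads that wait on each other in a cycle, or None.

    monitors maps a lock name to (owner_thread, [waiting_threads]).
    """
    # A blocked thread waits for exactly one lock, hence for one owner.
    waits_for = {}
    for owner, waiters in monitors.values():
        for waiter in waiters:
            waits_for[waiter] = owner

    for start in waits_for:
        chain = [start]
        current = waits_for.get(start)
        # Follow the chain of owners until it ends or repeats a thread.
        while current is not None and current not in chain:
            chain.append(current)
            current = waits_for.get(current)
        if current is not None:
            # The chain came back to a thread already seen: a cycle.
            return chain[chain.index(current):]
    return None
```

For example, if thread T1 owns lockA while waiting for lockB, and T2 owns lockB while waiting for lockA, the function returns both threads. If many threads simply wait for one owner that is itself not blocked, it returns None. That situation is a hang or contention problem rather than a true deadlock.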
Thread and Monitor Dump Analyzer shows which threads are holding locks on resources. They are identified with special icons:
As you can see, the icon is the monitor image just beside the name of the thread. As soon as you select the thread, you can see what it is locking.
So, for example, this thread owns the monitor lock on com/ibm/ws/util/BoundedBuffer$GetQueueLock@0x0...
The investigation should continue by looking for threads that hold a monitor for a long time, and by checking whether a significant number of threads are waiting for, or blocked by, these threads.
A possible further step in the investigation is to compare the threads and the monitors across different javacores. Thread and Monitor Dump Analyzer makes this easy: after selecting the collected javacores, choose "Analysis > Compare threads" or "Analysis > Compare monitors".
This feature offers a view of each thread's status over the period covered by the collected javacores (which is why it is important to collect several javacores). The best way to describe this feature is to show an example:
The image above shows 4 running threads and their status across 4 different javacores (the columns). The border indicates the status (green is Running) and the background color (red in this case) indicates a suspect thread. Several conclusions can be drawn from the image. For example, the WebContainer thread is blocked in the first two javacores but is then just waiting on condition in the next two javacores, for different tasks. You can tell that the tasks are different by looking at the thread's stack trace, which Thread and Monitor Dump Analyzer shows on the right side of the window. Since the task requested by the thread differs from one javacore to another, this thread (or rather its pool) should not be treated as a suspect.
The analysis should therefore look for threads (WebContainer threads in particular) that are waiting or blocked from the first javacore to the last one while requesting the same operation.
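The selection rule described here can be written down as a small filter. This is only a sketch of the reasoning, not what the tool does internally. It assumes that each javacore has already been reduced to a dictionary mapping a thread name to its state and the top frame of its stack trace. The function name, the data shape, and the use of the top frame as a stand-in for "the same operation" are assumptions made for illustration.

```python
def persistent_suspects(snapshots, suspect_states=("Blocked", "Waiting on condition")):
    """Return threads stalled on the same operation in every snapshot.

    snapshots is a list, oldest first, of dictionaries that map
    thread name -> (state, top_stack_frame).
    """
    if not snapshots:
        return []
    suspects = []
    for name, (state, frame) in snapshots[0].items():
        if state not in suspect_states:
            continue
        # The thread must appear in every later snapshot, still stalled,
        # and still on the same frame as in the first snapshot.
        if all(
            name in snap
            and snap[name][0] in suspect_states
            and snap[name][1] == frame
            for snap in snapshots[1:]
        ):
            suspects.append(name)
    return sorted(suspects)
```

A thread that is blocked in the first javacores and later waits on a different frame is filtered out, which matches the WebContainer case discussed for the image. A thread that stays on the same frame in every snapshot is reported.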
At this point the investigation can continue, or the earlier findings can be confirmed, by using the compare monitors feature:
The image above shows that the tool detects a suspect thread, DRSThreadPool: DMN0, which holds a specific monitor throughout the last three javacores. The Waiting Threads tab lists the threads that are waiting for this thread to release the lock. This thread is therefore a good candidate for investigation: look at its stack trace and try to understand what it is doing and why it does not release the lock for the duration of the last 3 javacores.
Detecting a hang or deadlock is unfortunately a task that can take time to investigate. The most important things are:
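The same check can be expressed as a short filter over monitor data taken from consecutive javacores. This is a sketch under stated assumptions, not the tool's algorithm. Each javacore is assumed to be already reduced to a dictionary mapping a monitor name to its owner and its waiting threads. The function name and the default of 3 dumps (taken from the example above) are illustrative.

```python
def long_held_monitors(snapshots, min_dumps=3):
    """Return (monitor, owner, waiter_count) for monitors held by the same
    owner throughout the last min_dumps snapshots, with waiters in the latest.

    snapshots is a list, oldest first, of dictionaries that map
    monitor name -> (owner_thread, [waiting_threads]).
    """
    if len(snapshots) < min_dumps:
        return []
    recent = snapshots[-min_dumps:]
    report = []
    for monitor, (owner, waiters) in recent[-1].items():
        # Same monitor, same owner, in each of the recent snapshots.
        held_throughout = all(
            monitor in snap and snap[monitor][0] == owner for snap in recent
        )
        if held_throughout and waiters:
            report.append((monitor, owner, len(waiters)))
    # Most contended monitors first.
    return sorted(report, key=lambda item: item[2], reverse=True)
```

The owners at the top of the returned list are the threads whose stack traces deserve a closer look, since they hold a lock across several dumps while other threads queue behind them.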