Bert's Blog

Jul 15

Profiling Production

I work a lot in Java, and one thing it does well is profiling. IDEs like NetBeans have profiling options that are also available as plugins to VisualVM, which comes with the JDK. This is great for development, but I think its main use is to settle technical arguments.

Things are a little different in production, where hundreds of processes are running with all manner of dependencies. By far the most common problem is that something has “slowed down”. So, without a profiler, how would you find out what’s happening?

First you need to find out what’s going on in the JVM, and unless you have an amazing logging framework, that will require some digging. If you’ve ever done this sort of thing before then you’ll know all about thread dumps. Taking a thread dump is as easy as the following, where pid is the process ID of the JVM:

jstack pid > threaddump.txt

or on Solaris (and other Unix-like systems) you can get a JVM to output a thread dump to its stdout by sending it SIGQUIT via

kill -3 pid

This will output the stack of every thread running in a particular JVM. If you have an application server that’s running 1000+ threads, then it’ll output a lot of information to wade through. However, this can be to your advantage, simply because slow code “hangs around”, and if it’s being noticed then it’s probably holding up multiple threads. It’s then time to try one of my favourite commands:

sort threaddump.txt | uniq -c | sort -n

This simply says:

  1. sort all lines in the file in alphabetical order. 
  2. count the occurrences of each line.
  3. sort by the number of occurrences, so the most frequent lines end up at the bottom.

This will give you a hit count for every line in the dump, and so for every stack frame that threads are currently sitting in. Slow calls will tend to stack up, and if you have a sensible naming convention then multiple calls to methods named “send”, “read” or “write” will jump out at you to suggest an external dependency is to blame. Another thing that sticks out will be recursive calls that might be on their way to a stack overflow.
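If the raw counts are too noisy, one option is to count only the stack frame lines. Here is a minimal sketch of that idea, not a definitive recipe. It assumes frames are the lines that start with whitespace followed by "at ", as in typical HotSpot thread dumps. The function name frame_counts and the sample dump are made up for illustration.

```shell
# Count how often each stack frame occurs in a thread dump, most frequent last.
# Assumption: frame lines look like "    at package.Class.method(File.java:123)".
frame_counts() {
    grep -E '^[[:space:]]+at ' "$1" \
        | sed 's/^[[:space:]]*//' \
        | sort | uniq -c | sort -n
}

# Made-up sample dump so the sketch runs without a real JVM.
sample=$(mktemp)
cat > "$sample" <<'EOF'
"worker-1" prio=5 tid=0x01 nid=0x0a runnable
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at com.example.Client.send(Client.java:42)

"worker-2" prio=5 tid=0x02 nid=0x0b runnable
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at com.example.Client.send(Client.java:42)

"worker-3" prio=5 tid=0x03 nid=0x0c waiting on condition
    at java.lang.Thread.sleep(Native Method)
EOF

# Show the ten most frequent frames.
frame_counts "$sample" | tail -n 10
```

With the sample above, the two frames shared by worker-1 and worker-2 get a count of 2 and land at the bottom, which is where the suspicious calls tend to collect on a real system too.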

If this doesn’t give you something to go on, and things are still slow, then you might have a deadlock. To verify this you’ll need to take a second thread dump and compare it to the first. Every thread in Java is given a name (normally a unique one), so you can see which lines appear in both dumps as follows, where prefix is the naming prefix of the threads you’re interested in:

cat threaddump.txt threaddump2.txt | grep prefix | sort | uniq -d

Hopefully this’ll return only a handful of matches, giving you a good chance at finding out what’s going wrong.
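Matching whole lines can miss threads when the header line carries details that change between dumps. As a rough sketch of an alternative, the following compares only each thread’s name and its top stack frame. It assumes thread headers start with a quoted name, frames start with "at ", and names are unique within a dump. The function names thread_tops and stuck_threads and the two sample dumps are hypothetical.

```shell
# Print one "thread name | top frame" line per thread in a dump.
# Assumption: thread headers start with a quoted name, frames with "at ".
thread_tops() {
    awk '
        /^"/ { split($0, parts, "\""); name = parts[2]; want = 1; next }
        want && $1 == "at" { sub(/^[ \t]+/, ""); print name " | " $0; want = 0 }
    ' "$1"
}

# Threads whose name and top frame are the same in both dumps.
stuck_threads() {
    { thread_tops "$1" | sort -u; thread_tops "$2" | sort -u; } | sort | uniq -d
}

# Made-up dumps: worker-1 has not moved, worker-2 has.
first=$(mktemp)
second=$(mktemp)
cat > "$first" <<'EOF'
"worker-1" prio=5 tid=0x01 nid=0x0a runnable
    at com.example.Client.send(Client.java:42)
    at com.example.Job.run(Job.java:10)

"worker-2" prio=5 tid=0x02 nid=0x0b waiting on condition
    at java.lang.Thread.sleep(Native Method)
EOF
cat > "$second" <<'EOF'
"worker-1" prio=5 tid=0x01 nid=0x0a runnable
    at com.example.Client.send(Client.java:42)
    at com.example.Job.run(Job.java:10)

"worker-2" prio=5 tid=0x02 nid=0x0b runnable
    at com.example.Parser.parse(Parser.java:7)
EOF

stuck_threads "$first" "$second"
```

A thread that shows up here is not necessarily deadlocked, since idle pool threads also sit on the same frame between dumps, but it does shorten the list of suspects.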

Happy Hunting!