Thursday, January 14, 2010

Can you believe Java APIs can fool you sometimes?

Recently I ran into an issue streaming large files (specifically PDF files) from WebSphere Application Server (versions 5.x and 6.x). I have a servlet that reads a large file (300 MB) from the application server and streams the bytes to the browser. Naturally, I used a buffered output stream with a buffer size of 1024 bytes. Every time 1024 bytes were written to the servlet output stream, I called the "flush" API to clear the buffer, thinking it would send the bytes to the browser immediately. Unfortunately, I observed a strange error. Can you guess what it was?
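For reference, the copy loop described above looked roughly like the sketch below. This is a reconstruction under assumptions, not the original servlet code: the class and method names are made up for illustration, and plain java.io streams stand in for the file input and for the servlet output stream (in the servlet, "out" would come from response.getOutputStream()).

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; the name is illustrative only.
final class FlushingCopy {

    // Copies 'in' to 'out' in 1024-byte chunks and flushes after every chunk.
    // Returns the number of bytes copied.
    static long copyWithFlush(InputStream in, OutputStream out) throws IOException {
        BufferedOutputStream buffered = new BufferedOutputStream(out, 1024);
        byte[] chunk = new byte[1024];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffered.write(chunk, 0, n);
            buffered.flush(); // the per-chunk flush discussed in this post
            total += n;
        }
        return total;
    }
}
```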
Are you thinking of an Out of Memory error? Hung threads? An ArrayIndexOutOfBoundsException? In my case, the server restarted automatically without any errors! It was a clean shutdown and an instant restart, and the log files gave no clue about what caused it. I had a tough time debugging the issue. I tried changing the buffer size like a lunatic to all possible values, but had no luck. The restart happened in the middle of streaming, after about 150 MB had been sent (JVM max memory was 512 MB and minimum 256 MB). I suspected an Out of Memory error, so I started monitoring the JVM heap memory statistics every 10 seconds. I observed that memory shot up every time, irrespective of the buffer size. I wondered why memory should shoot up even though I was using a 1024-byte buffer. Later I found an article from IBM that, to a large extent, clarified the issue. You can take a look at the explanation here.
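If you want to watch heap usage yourself at a fixed interval, something like the sketch below would do. It is only an illustration of the idea: the class name is hypothetical, it uses the standard Runtime and ScheduledExecutorService APIs with modern Java syntax (lambdas), and it is not the tooling I actually used on the WebSphere server.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the name is illustrative only.
final class HeapMonitor {

    // Heap currently in use: allocated heap minus the free part of it.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // One line of statistics in megabytes.
    static String snapshot() {
        long mb = 1024L * 1024L;
        long max = Runtime.getRuntime().maxMemory();
        return "used=" + (usedHeapBytes() / mb) + "MB max=" + (max / mb) + "MB";
    }

    // Prints a snapshot every 'periodSeconds' seconds on a daemon thread.
    static ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "heap-monitor");
            t.setDaemon(true); // do not keep the JVM alive just for monitoring
            return t;
        });
        timer.scheduleAtFixedRate(
                () -> System.out.println(snapshot()), 0, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

Calling HeapMonitor.start(10) would print one line every 10 seconds until the returned executor is shut down.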

The article describes the trade-off between asynchronous and synchronous methods depending on the available memory, but it gives no concrete answer for fixing the issue. With this clue in hand, I commented out the "flush" call on the servlet output stream. To my surprise, the whole issue was resolved. Why would this API cause a server shutdown? Strange behavior from an API, isn't it?
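In code, the change amounts to writing each chunk and never calling flush yourself. The sketch below is a minimal illustration under assumptions: the class names are hypothetical, plain java.io streams stand in for the servlet output stream, and the FlushCounter wrapper exists only to show that the loop issues no flush calls. In a servlet, my understanding is that the container finishes sending the response once the service method returns.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; the names are illustrative only.
final class NoFlushCopy {

    // Demo wrapper that counts how often flush() is called on it.
    static final class FlushCounter extends FilterOutputStream {
        int flushes = 0;

        FlushCounter(OutputStream out) {
            super(out);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len); // pass the whole chunk through in one call
        }

        @Override
        public void flush() throws IOException {
            flushes++;
            super.flush();
        }
    }

    // Copies 'in' to 'out' in 1024-byte chunks without any explicit flush.
    // Returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[1024];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n); // no flush here; the receiver decides when to send
            total += n;
        }
        return total;
    }
}
```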

Summary: Even with the "flush" call commented out, memory still shoots up on the server, whether I use a 1024-byte buffer or no buffer at all. I am sure the above article can explain that.

