I've added dynamic thread pools to the JWS, using the java.util.concurrent thread pool APIs. The way these APIs work is not necessarily intuitive, so let me explain a bit. Basically, there's a distinction made between threads (which actually execute tasks) and Runnable objects (which are the tasks to be executed). In the JWS, the main thread creates a Runnable object for each incoming connection. That object goes into a thread-safe queue (the task queue) and awaits a thread that will run it.
And this is where the thread pool comes in. The thread pool (ThreadPoolExecutor) has N threads in it. Its job is to find an idle thread, pull a Runnable object from the task queue, and use the thread to execute the object's run() method. The thread continues to execute run() until it completes and exits. That object is then presumed to be completely processed, so it is thrown away. Also, the thread is idle again. So now the thread pool can recycle that thread to process the next object from the task queue. And so it goes.
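Here's a minimal sketch of that cycle. The names PoolBasics, ConnectionTask and runTasks are made up for illustration and are not the JWS's actual classes, and the unbounded LinkedBlockingQueue is only there to keep the sketch short. It hands a batch of Runnable objects to a ThreadPoolExecutor with a fixed number of threads and records which threads ended up running them.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolBasics {
    // Hypothetical stand-in for a per-connection task.
    static class ConnectionTask implements Runnable {
        private final Set<String> threadsSeen;
        private final AtomicInteger completed;

        ConnectionTask(Set<String> threadsSeen, AtomicInteger completed) {
            this.threadsSeen = threadsSeen;
            this.completed = completed;
        }

        @Override
        public void run() {
            // A real server would read the request and write the response here.
            threadsSeen.add(Thread.currentThread().getName());
            completed.incrementAndGet();
        }
    }

    // Runs taskCount tasks on poolSize threads.
    // Returns {tasks completed, distinct threads used}.
    static int[] runTasks(int taskCount, int poolSize) {
        Set<String> threadsSeen = ConcurrentHashMap.newKeySet();
        AtomicInteger completed = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>()); // the task queue
        for (int i = 0; i < taskCount; i++) {
            // One short-lived Runnable per "connection".
            pool.execute(new ConnectionTask(threadsSeen, completed));
        }
        pool.shutdown(); // accept no new tasks, but finish the queued ones
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { completed.get(), threadsSeen.size() };
    }

    public static void main(String[] args) {
        int[] result = runTasks(20, 4);
        System.out.println(result[0] + " tasks ran on " + result[1] + " thread(s)");
    }
}
```

However many tasks you submit, the number of distinct threads never exceeds the pool size. The threads get recycled; the Runnable objects don't.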
Now, I chose to use a dynamic thread pool, which can also do things like notice when there aren't enough threads to keep up with the objects coming in and start up more, or notice that there are way too many threads and kill some of them. These are all variables you can control when you construct the ThreadPoolExecutor.
(Incidentally, I also used an ArrayBlockingQueue for the JWS's thread-safe task queue, because I wanted a hard upper bound on the number of incoming connections that could be queued before the JWS decided it was too busy and began rejecting new connections.)
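To make those knobs concrete, here's a sketch of a bounded dynamic pool. Every number in it is an illustrative assumption rather than the JWS's real settings, and BoundedPool, create and flood are names I made up. It builds a ThreadPoolExecutor with a minimum and maximum thread count and an ArrayBlockingQueue, then floods it with tasks that all block, so you can count how many get rejected once the pool is saturated.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPool {
    static ThreadPoolExecutor create(int minThreads, int maxThreads, int queueCapacity) {
        return new ThreadPoolExecutor(
                minThreads,                            // corePoolSize
                maxThreads,                            // maximumPoolSize
                30L, TimeUnit.SECONDS,                 // idle time before extra threads are killed
                new ArrayBlockingQueue<Runnable>(queueCapacity), // bounded task queue
                new ThreadPoolExecutor.AbortPolicy()); // throw when threads and queue are both full
    }

    // Submits `attempts` tasks that block until released.
    // Returns {tasks rejected, peak thread count}.
    static int[] flood(int minThreads, int maxThreads, int queueCapacity, int attempts) {
        ThreadPoolExecutor pool = create(minThreads, maxThreads, queueCapacity);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a slow connection
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // a web server would refuse the connection here
            }
        }
        int peak = pool.getLargestPoolSize();
        release.countDown(); // let the blocked tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { rejected, peak };
    }

    public static void main(String[] args) {
        int[] r = flood(2, 4, 3, 10);
        System.out.println("rejected=" + r[0] + ", peak threads=" + r[1]);
    }
}
```

One thing that surprised me: ThreadPoolExecutor only starts threads beyond the core size once the queue is full. So with 2 core threads, 4 max threads and a queue of 3, ten blocking tasks should fill 2 threads, then the 3 queue slots, then 2 more threads, and the last 3 get rejected.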
When I set out to add dynamic thread pools to the JWS, I didn't think it would be that hard. I was familiar with the Runnable interface from previous programs, and was pretty sure thread pools wouldn't be too much more difficult. As it turns out, thread pools aren't hard, but you'd never know that because the docs on Java thread pools really aren't very good! Putting "java thread tutorial" into Google will give you Sun's concurrency tutorial page. This is a great tutorial as far as concepts go, but it's rather light on actual code examples. It also glosses over a lot of practical issues that you run into when you actually start writing thread pools in Java. Particularly wrongheaded IMO is its recommendation to use Executors.newCachedThreadPool() to create your dynamic thread pool. newCachedThreadPool() is a great convenience method, but I think it does a couple of things very wrong...
First and foremost, when you create a dynamic thread pool with newCachedThreadPool(), you don't get to specify either the minimum or maximum number of threads in the pool. The lack of a minimum is not so annoying - it might mean a few hundred ms delay to kick off new threads when the load suddenly spikes. But that lack of an upper bound on the number of threads that can be dynamically created? Absolutely unacceptable in my mind. If someone DoS's your web server and it's written using newCachedThreadPool(), then it could theoretically spawn off an unbounded number of threads! And that's a recipe for thrash 'n crash disaster.
Secondly, there's a related resource starvation issue. newCachedThreadPool() must create a ThreadPoolExecutor internally, but the docs for that method don't say what kind of task queue it gives to that ThreadPoolExecutor. It turns out to be a SynchronousQueue, which has no capacity at all: each submitted task is handed straight to an idle thread, and if no thread is idle, a new one is created. So nothing ever waits in a queue, and the only thing that absorbs a load spike is more threads. Since every thread brings its own stack along with it, you're looking at potential unbounded thread growth AND unbounded memory growth!
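You can watch the thread growth happen with a few lines of code. The sketch below assumes OpenJDK's implementation, where newCachedThreadPool() amounts to a ThreadPoolExecutor with a core size of 0, a maximum of Integer.MAX_VALUE, a 60 second idle timeout and a SynchronousQueue. I build that pool by hand (CachedPoolGrowth and its methods are made-up names) so the peak thread count can be read back afterwards.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CachedPoolGrowth {
    // Assumption: mirrors what Executors.newCachedThreadPool() builds in OpenJDK.
    static ThreadPoolExecutor cachedStylePool() {
        return new ThreadPoolExecutor(
                0, Integer.MAX_VALUE,            // no minimum, effectively no maximum
                60L, TimeUnit.SECONDS,           // idle threads die after a minute
                new SynchronousQueue<Runnable>()); // zero-capacity handoff queue
    }

    // Submits `tasks` tasks that all block, and returns the peak thread count.
    static int peakThreads(int tasks) {
        ThreadPoolExecutor pool = cachedStylePool();
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // simulate a connection that never finishes
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int peak = pool.getLargestPoolSize();
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("peak threads: " + peakThreads(50));
    }
}
```

Because no thread is ever idle while the tasks are blocked, every submission starts a fresh thread: 50 stuck tasks means 50 threads, and nothing in the pool will ever say no.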
Now, all this wouldn't be so bad if the docs would just give an example of how to create your own ThreadPoolExecutor objects, instead of relying on newCachedThreadPool(). But no such example is in Sun's tutorial! And such examples are also pretty rare elsewhere. It's not hard to construct your own ThreadPoolExecutor if you understand the concepts. But with no real documentation on what those concepts are... that learning curve gets steep fast. Fortunately, there are a few sources that give examples of how to use ThreadPoolExecutor directly. Including (now) the JWS.
I have one last objection about the thread pool APIs. It's one that will get me called grey haired and crotchety, but that's alright. I enjoy yelling at those darn kids to get off my lawn.
As a resource-conscious embedded systems programmer, I don't like the idea of allocating a new Runnable for each incoming connection. Now, I realize that the way the Socket API works, you have to make a new object for each incoming connection anyway. So complaining that we make not just one but (gasp!) two objects per connection is a little silly. Like complaining that some welds in the hull of the Titanic were weak, after the iceberg had already hit. Still, I feel the need to argue the point. And I'll tell you why: I believe I have a more efficient architecture for doing multi-threaded computing in Java.
Basically, what I'd advocate is that you make one Runnable class whose run() method is an infinite loop. This run() method grabs objects to process from a BlockingQueue of some sort - easy with the java.util.concurrent classes. It will keep running and processing whatever it's supposed to process, until it receives a ThreadTerminationException or similar. Then, the thread pool starts up N threads, and M instances of your Runnable class. Finally, the thread pool shares the threads among the runnables. And that's all there is to it. This is, in fact, the exact architecture that I used when I wrote a version of the JWS between .2 and .3, which used a static array of threads and a queue of Sockets to do its work.
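Here's roughly what I mean, as a sketch rather than the actual JWS code. I've made some simplifying assumptions: there is exactly one Runnable per thread (N equals M), plain integers stand in for the Sockets, and a sentinel value (STOP) shuts the workers down instead of a termination exception. WorkerLoops, Worker and process are made-up names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class WorkerLoops {
    static final int STOP = -1; // sentinel telling a worker to exit

    // One long-lived object per thread, reused for every item it handles.
    static class Worker implements Runnable {
        private final BlockingQueue<Integer> queue;
        private final AtomicInteger handled;

        Worker(BlockingQueue<Integer> queue, AtomicInteger handled) {
            this.queue = queue;
            this.handled = handled;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    int connectionId = queue.take(); // blocks until work arrives
                    if (connectionId == STOP) {
                        return;
                    }
                    handled.incrementAndGet(); // "service" the connection
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // treat interruption as a stop request
            }
        }
    }

    // Feeds `items` work items to `workers` threads; returns how many were handled.
    static int process(int items, int workers) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        AtomicInteger handled = new AtomicInteger();
        Thread[] threads = new Thread[workers]; // static array of threads
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(new Worker(queue, handled), "worker-" + i);
            threads[i].start();
        }
        try {
            for (int id = 0; id < items; id++) {
                queue.put(id); // blocks when the queue is full
            }
            for (int i = 0; i < workers; i++) {
                queue.put(STOP); // one sentinel per worker
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println("handled " + process(100, 4) + " items");
    }
}
```

The only allocations that scale with load here are the work items themselves. The workers and threads are created once, up front.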
I like this approach better for two reasons. One is that you're not constantly new'ing and then throwing away any more objects than you absolutely have to. If you new two objects for every connection and the connection rate gets high, garbage collection churn will start to take away cycles from servicing incoming connections. The other thing I like about having a bunch of infinite loops all pulling off the same queue is that it improves cache locality. And that's often a big key to execution speed.
Yeah, yeah, I know - only an old C programmer would be so distrustful of TEH OBJEKT ORIENTATED PARADIGMZ OMG!!!!1!!! to suggest that, hey, maybe we don't have to create a new Runnable object for every single connection just because we can. Just because Java can be used inefficiently doesn't mean we have to use it that way. (Yeah, I know - "That's crazy talk!")
So just be quiet and bring me my walker, sonny! When I wuz your age, we didn't have none of these fancy garbage collectors! We hand-tweaked the microcode in our floating point co-processors! With a paper clip! And we liked it!!! I remember back in the summer of '87... we had just gotten in a shiny new 386-DX2 40 MHz...
Edit 2007/06/05: Yay, LJ finally beat the DDoSers, so this post now contains the full text as I originally intended!