Bob Balaban's Blog

     

     

    Geek-o-Terica 6: Now it Gets Complicated: Java, Garbage-Collection, Notes, and Threads

    Bob Balaban  May 17 2009 12:00:00 PM
    Greetings, Geeks!

    If you're not interested in the swirling, sexy synergy at the intersection of Java memory management and Notes back-end classes and multi-threaded programs, you might want to go catch up on events elsewhere. Don't worry, I don't take this stuff personally.

    Ok! For the 2 of you still here, have you already scanned my previous post on the topic of garbage collection (gc) in Java? The point of this gloss on the topic is to try to explain what happens to gc, recycle() and so on when you also mix in multi-threading. As Jeff Foxworthy says, "And then it got WEIRD..."

    You probably already know that the Notes back-end classes, regardless of what programming language you use to access them, are implemented internally as a bunch of C++ code, which in turn manipulates the product (Notes/Domino) via a rather large C API. Thus, the needs and requirements of the C API drive some of the fundamental behavior of the C++ code that implements the back-end classes, and, to one extent or another, also impose constraints on the various language-specific implementations of your agents, standalone programs, etc.

    One of these "requirements" (or constraints, if you prefer) has to do with a nasty (IMHO) little programming technique called "thread-local storage", or TLS. TLS is one of those ideas that sounds cool, if you're a codegeek, but which can actually really tie you in knots in some situations. Basically, TLS means that any given "thread of execution" within a process can allocate some memory that only it can "see". In other words, your code has to be running on the thread that allocated the TLS in order for that code to be able to access that memory.
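
If you've never bumped into TLS, Java has its own flavor of it in java.lang.ThreadLocal, and a tiny experiment shows the "only the allocating thread can see it" behavior. This is just a sketch in plain Java (no Notes involved); the class name TlsDemo and the string values are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

final class TlsDemo {
    // Every thread that touches CONTEXT gets its own private slot.
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /**
     * Stores a value on the calling thread, then asks a second thread what it
     * sees. Returns { valueSeenByCaller, valueSeenByOtherThread }.
     */
    static String[] demo() {
        CONTEXT.set("caller-value");
        AtomicReference<String> seenByOther = new AtomicReference<>("not-run");

        // The other thread never called set(), so its slot is empty (null).
        Thread other = new Thread(() -> seenByOther.set(CONTEXT.get()));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        String seenByCaller = CONTEXT.get();
        CONTEXT.remove(); // clean up on the same thread that set the value
        return new String[] { seenByCaller, seenByOther.get() };
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println("caller sees: " + seen[0]);
        System.out.println("other thread sees: " + seen[1]);
    }
}
```

The second thread gets null back, even though it is reading the very same variable. Notice too that the cleanup (remove()) has to happen on the thread that did the set(), which is the same shape as the Notes rule discussed next.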

    Why would you do that? Well, it's a convenient way to "remember" context in certain situations. If you're a server, for example, and you bind a thread to each client session, so that all client requests are serviced by a single thread, you can use TLS to easily remember stuff like, who is this client anyway? Is she authenticated? What databases is she accessing, and so on.

    Where it gets nasty is that the use of TLS in Notes imposes two non-breakable requirements on multi-threaded programs which use the CAPI to get things done: 1) Any TLS allocated on a given thread must also be de-allocated on that same thread, and 2) All threads accessing the Notes C API must explicitly initialize (and terminate) themselves before using any CAPI services. Why? Because Notes uses TLS. Since only the thread that creates TLS can "see" it, it's logical that no other thread can de-allocate it. And, like most storage/memory systems in any complicated program (like Notes or Domino), TLS requires per-thread initialization, and therefore, termination.

    Now, when you're using (or writing) a single-threaded program using the CAPI (or back-end classes), there's no problem, because everything happens on the same thread. Examples of single-threaded programs using the Notes C API include: all LotusScript programs; a Notes CAPI program where you don't create extra threads; Java agents or standalone programs where you don't create extra ("child") threads; most of the Notes Client.

    It gets weird fast in Java, though, partly because it's so easy to spin off child threads in that language. So what happens if you create an instance of the java.lang.Thread class and run some code that accesses the back-end classes on that thread? The answer is: it won't work, unless you do the right thread initialization. There are 2 ways to do that:

       1) Make your code use NotesThread instead of Thread (your class can extend it, or you can launch it in any of the ways you can launch Thread). NotesThread extends Thread, and adds just a little bit of logic: it calls the correct Notes CAPI entry points to do the required initialization when it starts running, and when it terminates, it does the required call for termination.

       2) If you don't want to extend NotesThread (or can't, for some reason), any instance of java.lang.Thread can be explicitly initialized (and terminated) with a pair of "static" methods on the NotesThread class. "Static" means that you can call the methods without having an actual instance of the class around. You can call NotesThread.sinitThread() and NotesThread.stermThread() from any thread instance, and then go use back-end classes as you wish.
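
Here's a rough sketch of the shape the second option usually takes: initialize first, do the work inside a try block, and terminate in a finally block so the two calls always pair up on the same thread. To keep it runnable without Notes.jar on the classpath, I'm using a stand-in class. FakeNotesRuntime and useBackEnd() are hypothetical, invented for this example; in real code you would call NotesThread.sinitThread() and NotesThread.stermThread() in those spots, and a real uninitialized thread will not necessarily fail in the tidy way this stand-in does.

```java
import java.util.concurrent.atomic.AtomicReference;

final class InitTermSketch {
    /** Hypothetical stand-in for the static init/term pair on NotesThread. */
    static final class FakeNotesRuntime {
        private static final ThreadLocal<Boolean> INITIALIZED =
                ThreadLocal.withInitial(() -> false);

        static void sinitThread() { INITIALIZED.set(true); }

        static void stermThread() { INITIALIZED.remove(); }

        /** Pretend back-end call: refuses to run on an uninitialized thread. */
        static String useBackEnd() {
            if (!INITIALIZED.get()) {
                throw new IllegalStateException("thread not initialized");
            }
            return "ok";
        }
    }

    /** Runs the pretend back-end call on a plain java.lang.Thread. */
    static String runOnPlainThread(boolean initialize) {
        AtomicReference<String> result = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            if (initialize) {
                FakeNotesRuntime.sinitThread(); // real code: NotesThread.sinitThread()
            }
            try {
                result.set(FakeNotesRuntime.useBackEnd());
            } catch (IllegalStateException e) {
                result.set("failed: " + e.getMessage());
            } finally {
                if (initialize) {
                    FakeNotesRuntime.stermThread(); // real code: NotesThread.stermThread()
                }
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(runOnPlainThread(true));
        System.out.println(runOnPlainThread(false));
    }
}
```

The point of the finally block is that termination happens on the same thread as initialization no matter how the work in between ends.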

    Ok. Still with me? Next question: What the heck does this have to do with recycle()? The bottom line: remember from my previous post how I attributed the need to explicitly recycle() to the fact that Java gc doesn't have a way to invoke the Notes API to free up resources? Well, it's actually worse than that. Even if there WERE such a way (e.g., if the finalize() call in Java were actually reliable), it would still be NO GOOD!

    Why? Because of TLS, and because in the Java virtual machine garbage collection (and any finalize() call it triggers) runs on the JVM's own threads, not on the thread that created the object. So even if there were a way to reliably invoke the Notes API when an object (say, an instance of lotus.domino.Database) was being gc'ed, it would violate the rule that TLS MUST be de-allocated on the thread that created it!

    So what happens, then, when you create a Notes object (such as a Database) on one thread, and then use it on another? It works! Why? Because the back-end classes code internally figures out that you're accessing the internal data structures of that object (a CAPI thing called a DBHANDLE, in the case of a Database) on another thread, and it compensates by creating new TLS for that object on your thread. When your thread terminates, the back-end classes logic has to go find EVERY object that has TLS allocated on that thread and reclaim it. This happens automatically if you're using NotesThread, or explicitly when you use the static stermThread() call.

    And now, to get FULLY weird (and, I hope, your head will not explode): what happens if you create (say) a Database object on ThreadA, then access it on ThreadB and ThreadC, and then you decide you're ready to recycle it? Think about it (but not TOO hard)! Which thread should you use to call Database.recycle()? Whichever one it is, doesn't that leave 2 threads' worth of TLS still allocated?

    So, when is a "recycle" not a "recycle"? Answer: when you have an object "open" on multiple threads! The only way to avoid memory leaks in the above situation is to not fully destroy the object until you're done with it on ALL threads that have accessed it.
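
To make the ThreadA/ThreadB/ThreadC puzzle concrete, here's a toy model in plain Java. It is NOT how the back-end classes are actually implemented, and every name in it (PerThreadStateModel, touch, releaseOnCurrentThread) is invented for illustration. All it does is track which threads have touched an object, and show that releasing on one thread leaves the other threads still attached.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class PerThreadStateModel {
    // Threads that have touched this object and so hold per-thread state for it.
    private final Set<Thread> attachedThreads = ConcurrentHashMap.newKeySet();

    /** Models "using the object on this thread": attaches the current thread. */
    void touch() {
        attachedThreads.add(Thread.currentThread());
    }

    /** Models a release that can only clean up the calling thread's state. */
    void releaseOnCurrentThread() {
        attachedThreads.remove(Thread.currentThread());
    }

    int threadsStillAttached() {
        return attachedThreads.size();
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns { attached count before release, attached count after release }. */
    static int[] demo() {
        PerThreadStateModel handle = new PerThreadStateModel();
        handle.touch();                        // "ThreadA" (the caller)
        Thread b = new Thread(handle::touch);  // "ThreadB"
        Thread c = new Thread(handle::touch);  // "ThreadC"
        b.start();
        c.start();
        join(b);
        join(c);

        int before = handle.threadsStillAttached();
        handle.releaseOnCurrentThread();       // release on the caller only
        int after = handle.threadsStillAttached();
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("attached before release: " + counts[0]);
        System.out.println("attached after release:  " + counts[1]);
    }
}
```

In this model three threads are attached before the release and two are still attached afterwards, which is the leak scenario in miniature.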

    How much of this does the everyday Java developer using Notes back-end classes have to worry about? Some, but not a whole lot (again, IMHO and YMMV). Here are a few "best practices" I have developed over the years to keep my own code relatively (if not squeaky) clean:

       1) Don't write multi-threaded programs. They're harder to code, harder to debug and harder to maintain. And you can get unintended memory leaks...
       2) If you MUST write multi-threaded programs, make sure you really have to. If you don't REALLY have to, see point #1. Yes, you can derive big performance gains with multi-threading. But ask yourself this: is it worth the extra pain?
       3) Don't share Notes objects across threads.
       4) If you MUST share Notes objects across threads, make sure you really have to. If you don't REALLY have to, see point #3.
       5) Adopt this convention: If a thread created the object, that thread "owns" the object and ONLY that thread may recycle it. Of course there are a couple of corollaries to this:
            a) ALL objects should be recycled when you're done with them
            b) When you go to recycle an object on the owning thread, make sure no other threads expect that object to still be there. Ideally, you'd have the non-owning threads terminate before the owning thread, so that the last thread standing can recycle the object(s) safely, and all TLS gets cleaned up.
       6) Try to avoid writing multi-threaded programs, unless you really know what you're doing.
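
For the brave souls who skip straight past points 1 through 4, the convention in point 5 can be sketched like this: the owning thread creates the object, hands it to workers, waits for every worker to finish, and only then recycles. Again this is plain Java with a stand-in; FakeNotesObject, read() and successfulReads() are hypothetical names for illustration, not the lotus.domino API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class OwnerRecyclesLast {
    /** Hypothetical stand-in for a Notes back-end object. */
    static final class FakeNotesObject {
        private final Thread owner = Thread.currentThread();
        private volatile boolean recycled;

        String read() {
            if (recycled) {
                throw new IllegalStateException("object already recycled");
            }
            return "data";
        }

        /** Enforces the convention: only the creating thread may recycle. */
        void recycle() {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("only the owning thread may recycle");
            }
            recycled = true;
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Shares one object with workerCount threads; returns how many reads succeeded. */
    static int successfulReads(int workerCount) {
        FakeNotesObject shared = new FakeNotesObject(); // caller is the owner
        AtomicInteger ok = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        try {
            for (int i = 0; i < workerCount; i++) {
                Thread t = new Thread(() -> {
                    if ("data".equals(shared.read())) {
                        ok.incrementAndGet();
                    }
                });
                workers.add(t);
                t.start();
            }
            // Non-owning threads finish first...
            for (Thread t : workers) {
                join(t);
            }
        } finally {
            // ...then the owner, the last thread standing, recycles.
            shared.recycle();
        }
        return ok.get();
    }

    public static void main(String[] args) {
        System.out.println("successful reads: " + successfulReads(3));
    }
}
```

Joining the workers before recycling is what guarantees that no worker ever trips over an already-recycled object. In a real Notes program each worker would also need its own thread initialization and termination, as described earlier.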

    Isn't this fun?

    Geek ya later!

    (Need expert application development architecture/coding help? Contact me at: bbalaban, gmail.com)
    Follow me on Twitter @LooseleafLLC
    This article © Copyright 2009 by Looseleaf Software LLC, all rights reserved. You may link to this page, but may not copy without prior approval.