Bob Balaban's Blog




    Geek-o-Terica 6: Now it Gets Complicated: Java, Garbage-Collection, Notes, and Threads

    Bob Balaban  May 17 2009 12:00:00 PM
    Greetings, Geeks!

    If you're not interested in the swirling, sexy synergy at the intersection of Java memory management and Notes back-end classes and multi-threaded programs, you might want to go catch up on events elsewhere. Don't worry, I don't take this stuff personally.

    Ok! For the 2 of you still here, have you already scanned my previous post on the topic of garbage collection (gc) in Java? The point of this gloss on the topic is to try to explain what happens to gc, recycle() and so on when you also mix in multi-threading. As Jeff Foxworthy says, "And then it got WEIRD..."

    You probably already know that the Notes back-end classes, regardless of what programming language you use to access them, are implemented internally as a bunch of C++ code, which in turn manipulates the product (Notes/Domino) via a rather large C API. Thus, the needs and requirements of the C API drive some of the fundamental behavior of the C++ code that implements the back-end classes, and, to one extent or another, also impose constraints on the various language-specific implementations of your agents, standalone programs, etc.

    One of these "requirements" (or constraints, if you prefer) has to do with a nasty (IMHO) little programming technique called "thread-local storage", or TLS. TLS is one of those ideas that sounds cool, if you're a codegeek, but which can actually really tie you in knots in some situations. Basically, TLS means that any given "thread of execution" within a process can allocate some memory that only it can "see". In other words, your code has to be running on the thread that allocated the TLS in order for that code to be able to access that memory.
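    To make the "only the allocating thread can see it" idea concrete, here is a minimal sketch using Java's own java.lang.ThreadLocal. This is only an analogy: the TLS that Notes uses lives down in native code, not in a Java ThreadLocal, and the class and method names (TlsDemo, seenByOtherThread) are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch: a value stored in a ThreadLocal is visible only to the thread that stored it.
class TlsDemo {
    // Each thread gets its own independent slot; a thread that never set it reads null.
    static final ThreadLocal<String> CLIENT = new ThreadLocal<>();

    // Stores a value on the calling thread, then reports what a second thread reads.
    static String seenByOtherThread(String value) {
        CLIENT.set(value);
        AtomicReference<String> seen = new AtomicReference<>("unset");
        Thread other = new Thread(() -> seen.set(String.valueOf(CLIENT.get())));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        String other = seenByOtherThread("alice");
        System.out.println("owning thread sees: " + CLIENT.get());
        System.out.println("other thread sees:  " + other);
    }
}
```

    The second thread reads the same static field but gets its own empty slot, which is the property that makes TLS handy for remembering per-client context and awkward everywhere else.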

    Why would you do that? Well, it's a convenient way to "remember" context in certain situations. If you're a server, for example, and you bind a thread to each client session, so that all client requests are serviced by a single thread, you can use TLS to easily remember stuff like, who is this client anyway? Is she authenticated? What databases is she accessing, and so on.

    Where it gets nasty is that the use of TLS in Notes imposes two non-breakable requirements on multi-threaded programs which use the CAPI to get things done: 1) Any TLS allocated on a given thread must also be de-allocated on that same thread, and 2) All threads accessing the Notes C API must explicitly initialize (and terminate) themselves before using any CAPI services. Why? Because Notes uses TLS. Since only the thread that creates TLS can "see" it, it's logical that no other thread can de-allocate it. And, like most storage/memory systems in any complicated program (like Notes or Domino), TLS requires per-thread initialization, and therefore, termination.

    Now, when you're using (or writing) a single-threaded program using the CAPI (or back-end classes), there's no problem, because everything happens on the same thread. Examples of single-threaded programs using the Notes C API include: all LotusScript programs; a Notes CAPI program where you don't create extra threads; Java agents or standalone programs where you don't create extra ("child") threads; most of the Notes Client.

    It gets weird fast in Java, though, partly because it's so easy to spin off child threads in that language. So what happens if you create an instance of the java.lang.Thread class and run some code that accesses the back-end classes on that thread? The answer is: it won't work, unless you do the right thread initialization. There are 2 ways to do that:

       1) Make your code use NotesThread instead of Thread (your class can extend it, or you can launch it in any of the ways you can launch Thread). NotesThread extends Thread, and adds just a little bit of logic: it calls the correct Notes CAPI entry points to do the required initialization when it starts running, and when it terminates, it does the required call for termination.

       2) If you don't want to extend NotesThread (or can't, for some reason), any instance of java.lang.Thread can be explicitly initialized (and terminated) with a pair of "static" methods on the NotesThread class. "Static" means that you can call the methods without having an actual instance of the class around. You can call NotesThread.sinitThread() and NotesThread.stermThread() from any thread instance, and then go use back-end classes as you wish.
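    The second option boils down to bracketing your work with an init call and a term call on the same thread. The sketch below shows that shape with a hypothetical stand-in class (FakeNotesRuntime) so it runs without Notes.jar; in real code the two calls would be NotesThread.sinitThread() and NotesThread.stermThread(). Putting the terminate call in a finally block is my own habit, not something the API forces on you.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the per-thread setup described above. With Notes.jar on the
// classpath, the two calls would be NotesThread.sinitThread() and NotesThread.stermThread().
class FakeNotesRuntime {
    static final ThreadLocal<Boolean> INITIALIZED = ThreadLocal.withInitial(() -> false);
    static final List<String> LOG = Collections.synchronizedList(new ArrayList<>());

    static void initThread() { INITIALIZED.set(true); LOG.add("init"); }
    static void termThread() { INITIALIZED.set(false); LOG.add("term"); }

    // Stands in for any back-end call; fails on a thread that was never initialized.
    static void useBackEnd() {
        if (!INITIALIZED.get()) throw new IllegalStateException("thread not initialized");
        LOG.add("work");
    }
}

class InitTermDemo {
    // Runs the work on a plain java.lang.Thread, bracketed by init/term on that same thread.
    static List<String> runOnWorker() {
        FakeNotesRuntime.LOG.clear();
        Thread worker = new Thread(() -> {
            FakeNotesRuntime.initThread();
            try {
                FakeNotesRuntime.useBackEnd();
            } finally {
                // Always terminate on the thread that initialized, even if the work throws.
                FakeNotesRuntime.termThread();
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(FakeNotesRuntime.LOG);
    }

    public static void main(String[] args) {
        System.out.println(runOnWorker());
    }
}
```

    If you skip the init call in this sketch, useBackEnd() throws on the worker thread, which mimics the "it won't work" outcome described above.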

    Ok. Still with me? Next question: What the heck does this have to do with recycle()? The bottom line: remember from my previous post how I attributed the need to explicitly recycle() to the fact that Java gc doesn't have a way to invoke the Notes API to free up resources? Well, it's actually worse than that. Even if there WERE such a way (e.g., if the finalize() call in Java were actually reliable), it would still be NO GOOD!

    Why? Because of TLS. And, because in the Java virtual machine, garbage collection takes place on its own thread. So even if there were a way to reliably invoke the Notes API when an object (say an instance of lotus.domino.Database) was being gc'ed, it would violate the rule that TLS MUST be de-allocated on the thread that created it!

    So, what happens, then, when you create a Notes object (such as a Database) on one thread, and then use it on another? It works! Why? Because the back-end classes code internally figures out that you're accessing the internal data structures of that object (a CAPI thing called a DBHANDLE, in the case of a Database) on another thread, and it compensates by creating new TLS for that object on your thread. When your thread terminates, the back-end classes logic has to go find EVERY object that has TLS allocated on that thread and reclaim it. This happens automatically if you're using NotesThread, or explicitly when you use the static stermThread() call.

    And now, to get FULLY weird (and, I hope, your head will not explode): what happens if you create (say) a Database object on ThreadA, then access it on ThreadB and ThreadC, and then you decide you're ready to recycle it? Think about it (but not TOO hard)! Which thread should you use to call Database.recycle()? Whichever one it is, doesn't that leave 2 threads' worth of TLS still allocated?

    So, when is a "recycle" not a "recycle"? Answer: when you have an object "open" on multiple threads! The only way to avoid memory leaks in the above situation is to not fully destroy the object until you're done with it on ALL threads that have accessed it.

    How much of this does the everyday Java developer using Notes back-end classes have to worry about? Some, but not a whole lot (again, IMHO and YMMV). Here are a few "best practices" I have developed over the years to keep my own code relatively (if not squeaky) clean:

       1) Don't write multi-threaded programs. They're harder to code, harder to debug and harder to maintain. And you can get unintended memory leaks...
       2) If you MUST write multi-threaded programs, make sure you really have to. If you don't REALLY have to, see point #1. Yes, you can derive big performance gains with multi-threading. But ask yourself this: is it worth the extra pain?
       3) Don't share Notes objects across threads.
       4) If you MUST share Notes objects across threads, make sure you really have to. If you don't REALLY have to, see point #3.
       5) Adopt this convention: If a thread created the object, that thread "owns" the object and ONLY that thread may recycle it. Of course there are a couple of corollaries to this:
            a) ALL objects should be recycled when you're done with them
            b) When you go to recycle an object on the owning thread, make sure no other threads expect that object to still be there. Ideally, you'd have the non-owning threads terminate before the owning thread, then that last-thread-standing can recycle the object(s) safely, and all TLS gets cleaned up.
       6) Try to avoid writing multi-threaded programs, unless you really know what you're doing.
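    Convention #5 and its corollaries can be sketched in plain Java. SharedHandle is a hypothetical stand-in for a Notes object such as a Database (the real classes do not enforce ownership like this); the point is only the ordering: the creating thread waits for every other thread to finish, then recycles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a Notes object such as lotus.domino.Database.
class SharedHandle {
    private final Thread owner = Thread.currentThread(); // the creating thread owns the handle
    private volatile boolean recycled = false;
    final AtomicInteger uses = new AtomicInteger();

    void use() {
        if (recycled) throw new IllegalStateException("used after recycle");
        uses.incrementAndGet();
    }

    // Enforces the convention: only the owning thread may recycle.
    void recycle() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("only the owning thread may recycle");
        }
        recycled = true;
    }

    boolean isRecycled() { return recycled; }
}

class OwnerRecyclesDemo {
    // Creates the handle on the calling thread, shares it with workers, recycles it last.
    static SharedHandle runWorkers(int workerCount) {
        SharedHandle handle = new SharedHandle();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            Thread t = new Thread(handle::use);
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait until no other thread still expects the object to exist
            } catch (InterruptedException e) {
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        handle.recycle(); // last thread standing is the owner, so this is safe
        return handle;
    }

    public static void main(String[] args) {
        SharedHandle handle = runWorkers(3);
        System.out.println("uses=" + handle.uses.get() + " recycled=" + handle.isRecycled());
    }
}
```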

    Isn't this fun?

    Geek ya later!

    (Need expert application development architecture/coding help? Contact me at: bbalaban,
    Follow me on Twitter @LooseleafLLC
    This article © Copyright 2009 by Looseleaf Software LLC, all rights reserved. You may link to this page, but may not copy without prior approval.

    1. Daniele Vistalli  05/17/2009 4:44:18 PM

    Thanks Bob, this is great stuff.

    And now? How does it work when the Java API is invoked through DIIOP? Is the DIIOP server taking care of thread initialization? (I've never needed to initialize TLS in DIIOP programs.)

    Also ... Are you going for a geek-post about DIIOP ?

    2. Simon  05/17/2009 10:06:15 PM

    Very interesting stuff, this whole series is good. There is so little consolidated info on this kind of stuff that I reckon these posts will become 'Notes Lore' over time.

    I do a fair bit of multi threading stuff, I stick to the idea of never sharing across threads, every thread = 1 session (when accessing locally natch). I agree with the previous poster, the recycling situation is so different with DIIOP that, if you felt so inclined, a bit of light shed there would be awesome.

    Thanks very much for these


    3. Andrew Pollack  05/18/2009 1:21:18 AM

    Bob - we've talked about this, I think.

    I solved this problem for a many-threaded, long-running Java agent that handled lots of documents in many databases.

    I created a static class used for check-in/check-out of all Notes objects. The rule was, any time I used any method to get a new Notes object handle (set doc=, etc.) the very next call I made was to check "out" that handle with something like myclass.checkOut(doc); When I was finished with the object, instead of calling doc.recycle(), I called myclass.checkIn(doc);.

    Each time I checked out an object, a counter was incremented for a hash value of that object's identifiers (e.g. server:database:unid for documents). If more than one thread had the document "checked out" the counter would be higher. When the last thread checked in the handle, the counter would drop to zero, and the static class would then call .recycle() on the object.
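    A minimal sketch of that kind of check-out/check-in counter follows. It is not the original implementation: the names (HandleRegistry, Recyclable) and the choice to pass the identifier in as a string key are assumptions made for illustration, and Recyclable stands in for any Notes object with a recycle() method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for any object that exposes recycle().
interface Recyclable {
    void recycle();
}

// Sketch of a static check-out/check-in registry keyed by an identifier string,
// e.g. "server:database:unid" for a document.
final class HandleRegistry {
    private static final class Entry {
        final Recyclable handle;
        int count;
        Entry(Recyclable handle) { this.handle = handle; }
    }

    private static final Map<String, Entry> ENTRIES = new HashMap<>();

    private HandleRegistry() {}

    // Records one more user of the object identified by key; returns the new count.
    static synchronized int checkOut(String key, Recyclable handle) {
        Entry e = ENTRIES.computeIfAbsent(key, k -> new Entry(handle));
        return ++e.count;
    }

    // Records that one user is done; recycles when the last user checks in.
    // Returns the number of users still holding the object.
    static synchronized int checkIn(String key) {
        Entry e = ENTRIES.get(key);
        if (e == null) throw new IllegalStateException("not checked out: " + key);
        if (--e.count == 0) {
            ENTRIES.remove(key);
            e.handle.recycle();
        }
        return e.count;
    }
}

class RegistryDemo {
    public static void main(String[] args) {
        int[] recycled = {0};
        Recyclable doc = () -> recycled[0]++;
        HandleRegistry.checkOut("srv:db:unid1", doc);
        HandleRegistry.checkOut("srv:db:unid1", doc);
        HandleRegistry.checkIn("srv:db:unid1");
        System.out.println("recycled after first check-in:  " + recycled[0]);
        HandleRegistry.checkIn("srv:db:unid1");
        System.out.println("recycled after second check-in: " + recycled[0]);
    }
}
```

    Note that in this sketch whichever thread checks in last is the one that calls recycle(), so it trades the "only the creating thread recycles" convention for a "last user recycles" one.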

    Worked like a charm across tens - even hundreds - of thousands of documents in hundreds of databases and over several hours. So far as I know, this is still running in production in some very, very high volume settings.

    4. Brent Henry  05/18/2009 9:41:37 PM


    I'm curious to know what happens if a VB.Net application were to spawn two threads, each of which created its own Domino.NotesSession object.

    Do you think there would be memory or object contention issues, or would each session object be independent and isolated?

    I always enjoy reading your blog. The technical content is much appreciated.



    5. John Smart  05/18/2009 10:38:57 PM

    This is a great series, Bob. Thanks much.

    6. Bob Balaban  05/20/2009 9:31:17 AM

    @1 - Good question! I'll make CORBA the Part 3 of garbage collection. Luckily, it won't be as long or complicated a post... Thanks for the suggestion.

    @2 - One thread per session is ok, but realize that within a single Agent (or standalone process), the Session is likely to be a singleton instance anyway. Another way to approach it is to bind one Database instance per thread.

    @3 - Thanks for that description Andrew, sounds perfectly valid to me, though I imagine it was a bit tedious to get it right at implementation time. Having said that, I think it's also true that other possible solutions exist.

    @4 - Interesting question. Here's how I approach that: From VB you're using the COM version of the back-end interfaces. I don't know if the COM classes enforce the same singleton Session object that the LotusScript version does. If so, then you aren't really "creating" 2 Session instances, there's only 1, referenced by 2 variables.

    In that case, you can freely mix and match "lower level" objects across "Sessions". If, however, you are really getting 2 individual Session instances, you have to worry about maintaining a strict hierarchy: you can't pass an object created under 1 Session as a parameter to an object created by the other. Really bad things will happen if you do. Maybe.

    @5 - Thanks for the feedback John!

    7. Charles Robinson  05/22/2009 9:26:48 AM

    @4 - Bob would know better than I do, but I think once Notes loads into memory it creates one session object, period. The COM interface is a wrapper for the C API, so on the Notes side of things it all boils down to however the C API behaves.

    If you create two COM references to NotesSession in two separate threads they will act independently. I don't know what happens when the Notes C API portion gets invoked, but I do know that destroying one has no effect on the other. If you pass a handle to a NotesSession object from one thread to another, then Dispose of it in the second thread, the handle is invalid in all threads that reference it. In this respect it's the same in Java.

    One key difference is that .Net only passes a handle to COM objects across threads. It does not create a local copy of it. You would have to do that manually if you truly wanted a second copy. Incidentally ByRef only passes a reference to the handle, so it's just a pointer to a pointer. You might as well use ByVal (which is the default).

    For the purposes of garbage collection in VB.Net, you don't have the issue of thread local storage causing memory leaks with COM objects, but you should still Dispose of the COM objects or set them to Nothing to flag them for collection.

    8. Daniel Lehtihet  06/16/2009 7:56:10 AM


    Do you know if version 8.5 supports the Java Virtual Machine Profiler Interface (JVMPI), and whether there are any JVM profiling tools that can handle the Domino JVM?

    One thing I am missing is a way to actually "see" what is going on inside the Domino JVM (i.e. browsing threads, objects, heap, etc.). Right now, it is rather like a black hole where stuff happens behind the scenes.

    I think a JVMPI tool would help in tracking down problems.

    kind regards


    9. Bob Balaban  06/18/2009 8:13:35 PM

    @Daniel - Not sure if it does or not. But have you tried the profiler that's built into Designer?