Bob Balaban's Blog

     

    Bob Balaban

     

    Geek-o-Terica 5: Taking out the Garbage (Java)

    Bob Balaban  May 4 2009 02:00:00 PM
    Greetings, Geeks!

    I posted a short article last week on garbage collection in LotusScript. This post is about a whole different set of issues you need to be aware of if you're writing Java code for Notes or Domino.

    Like LotusScript, the Java language creates objects with a built-in "new" operator, and, as in LotusScript, sweeping up the no-longer-used memory ("garbage collection", or gc) is supposed to be automatic. But. The gc mechanism in Java is nothing like the one in LotusScript -- you have to do more work to avoid memory leaks, and you have to know more about how it all works. The basic saying people like to quote about gc in Java is: "You never have to worry about memory." I generally add to that: "Until you run out."

    There are two major differences in the way LotusScript and Java each handle allocated memory: the first has to do with how many kinds of memory there are, and the second has to do with when each language does gc.

    How many kinds of memory are involved with a Notes/Domino Java program? At least 2. First there's all the memory your Java program allocates directly in the Java Virtual Machine (JVM): space for your code, space for your objects, space for the JVM itself to use. This all comes from the Java "heap": a big pool of memory that the JVM gets from the operating system and parcels out to your program, and to any other Java programs that might be running in that JVM at the time.

    The second set of memory that your Java program is going to consume comes from a completely different place: It comes from the "Notes runtime heap" -- a different big pool of memory that the Notes/Domino core (which, after all, has nothing to do with any JVMs) gets from the operating system. This pool is what the Notes back-end classes use, as well as the Notes core itself. So, for example, when you instantiate a lotus.domino.Session object (or Notes does it for you, if you're running an Agent), a number of things happen:

      - If the JVM isn't running yet, it starts up. A bunch of .jar files are pre-loaded, using up some JVM heap.
      - Notes (or Domino) creates an instance of the Session class for the agent to use. This uses a bit more memory from the JVM heap. The new Java Session object calls into the back-end classes DLL ("lsxbe") to initialize the corresponding C++ Session object. Every Notes Java object is linked to a corresponding lsxbe back-end C++ object.
      - The C++ object that gets created to go with the new Java object uses up some memory of its own. This memory, however, comes from the "other" heap -- what I called the "Notes runtime heap" above. ALL of the lsxbe objects that are linked to the Java objects you create and use in your Agent come from this other heap. Some of them (Documents, Databases, and others) consume Notes CAPI resources, such as NOTEHANDLEs and DBHANDLEs, which might represent lots and lots of other allocated memory in the Notes core (just as one example, an "open" Notes document that consumes 10 MB of disk space might also consume 10 MB of memory).

    So here's where it gets interesting: The JVM has a background thread that runs all the time (at lower priority than "normal" application threads), looking for objects which have been allocated out of the JVM heap and which are no longer used anywhere. When it finds such, it frees the memory used by those objects, and that memory in the JVM heap is then available for re-use. HOWEVER, there is no automatic mechanism by which the C++ objects associated with those Java objects can be notified to free up the memory THEY are consuming (which is often far larger). If nothing is done, all of that memory taken from the Notes runtime heap is "leaked": it never gets released.
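
    To make the two-heap problem concrete, here is a minimal sketch. It does not use the real lotus.domino classes: NotesHeap and JavaWrapper are hypothetical stand-ins invented for illustration, and the "Notes runtime heap" is reduced to a counter of live back-end objects. The point is only that dropping a Java reference never touches that counter unless recycle() is called first.

```java
/** Illustration only: hypothetical stand-ins, not part of lotus.domino. */
final class TwoHeapsSketch {

    /** Plays the role of the Notes runtime heap: counts live back-end objects. */
    static final class NotesHeap {
        private int nextHandle = 1;
        private int live = 0;

        int allocate() { live++; return nextHandle++; }  // hands out a fake handle
        void free(int handle) { live--; }
        int live() { return live; }
    }

    /** Plays the role of a Java object linked to a back-end C++ object. */
    static final class JavaWrapper {
        private final NotesHeap heap;
        private int handle;  // link to the back-end object; 0 means "released"

        JavaWrapper(NotesHeap heap) {
            this.heap = heap;
            this.handle = heap.allocate();
        }

        void recycle() {
            if (handle != 0) {
                heap.free(handle);
                handle = 0;
            }
        }
    }

    /** Creates `count` wrappers and drops each one; returns live back-end objects. */
    static int liveAfterLoop(int count, boolean recycle) {
        NotesHeap heap = new NotesHeap();
        for (int i = 0; i < count; i++) {
            JavaWrapper w = new JavaWrapper(heap);
            if (recycle) {
                w.recycle();
            }
            // w becomes unreachable here. The JVM may reclaim the wrapper,
            // but nothing informs the heap stand-in unless recycle() ran.
        }
        return heap.live();
    }

    public static void main(String[] args) {
        System.out.println("without recycle: " + liveAfterLoop(1000, false));
        System.out.println("with recycle:    " + liveAfterLoop(1000, true));
    }
}
```

    In this toy model the first call leaves 1000 back-end objects outstanding and the second leaves none, no matter what the Java garbage collector does in between.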

    Thus, because there's no way for Notes to know for sure when a given Java object is being garbage-collected, we need another way to tell the back-end classes to clean up and take out the garbage. Unfortunately, the only way is to enlist the aid of the developer: you have to TELL lsxbe to clean up with the notorious recycle() call. What makes this unfortunate, from a product point of view, is that developers have to know to do this, and have to know how to do it correctly, so that they don't mess themselves up accidentally.

    So what, exactly, does recycle() do? First, it finds the link stored in the Java object to the corresponding C++ object. Then it invokes some code in lsxbe to destroy that C++ object. The "destructor" code in the back-end classes does a couple of things: it first finds and destroys any "owned" objects that it knows about. When you invoke recycle() on a lotus.domino.Database object, for example, any Document objects that were instantiated out of that database are also destroyed. The object's destructor also knows how to link back to the JVM and tell it to invalidate (and garbage-collect) the corresponding Java object. It also tells the Notes CAPI to release any temporary memory it has consumed, and then the memory taken up by the C++ object itself is reclaimed.
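
    The ownership cascade described above can be sketched roughly as follows. This is a model of the idea, not the lsxbe implementation: BackEndObject, own() and isUsable() are hypothetical names made up for this example, and the Database and Document classes here are simplified stand-ins rather than the lotus.domino ones.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: models "recycling an owner recycles what it owns". */
final class RecycleCascadeSketch {

    /** Hypothetical base class for anything linked to a back-end object. */
    static class BackEndObject {
        private final List<BackEndObject> owned = new ArrayList<>();
        private boolean usable = true;

        /** Registers a child so that recycling this object also recycles the child. */
        <T extends BackEndObject> T own(T child) {
            owned.add(child);
            return child;
        }

        boolean isUsable() { return usable; }

        void recycle() {
            if (!usable) {
                return;  // already recycled, nothing left to release
            }
            for (BackEndObject child : owned) {
                child.recycle();  // owned objects are destroyed first
            }
            owned.clear();
            usable = false;  // the Java side must not be used after this point
        }
    }

    static final class Database extends BackEndObject {
        Document createDocument() { return own(new Document()); }
    }

    static final class Document extends BackEndObject { }

    /** Returns how many of `count` documents are still usable after db.recycle(). */
    static int usableDocsAfterDatabaseRecycle(int count) {
        Database db = new Database();
        List<Document> docs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            docs.add(db.createDocument());
        }
        db.recycle();
        int stillUsable = 0;
        for (Document d : docs) {
            if (d.isUsable()) {
                stillUsable++;
            }
        }
        return stillUsable;
    }

    public static void main(String[] args) {
        System.out.println(usableDocsAfterDatabaseRecycle(3));
    }
}
```

    The practical consequence is the one the paragraph above states: once you recycle a Database, treat every Document you obtained from it as gone too.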

    Try this experiment: Find (or create) a Notes database that contains a view with 50,000 or so documents in it (the number of fields in the document is relatively unimportant for this purpose). Then write a Java agent that walks the view, and accesses each document (just use View.getFirstDocument/getNextDocument). Don't use recycle(). If your view is big enough, you might actually crash Notes this way, because every time you assign a new Document object to your local variable inside the iteration loop, the previous object referred to by that variable will be gc'ed, but the corresponding C++ document object will not.
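
    Here is a small simulation of that experiment, along with the usual fix. The View and Document classes below are hypothetical stand-ins (only the method names getFirstDocument, getNextDocument and recycle mirror the real API), and Stats is an invented helper that tracks how many "back-end" documents are open at once. The loop shape is the part worth copying: fetch the next document BEFORE recycling the current one.

```java
/** Illustration only: stand-in View/Document classes, not lotus.domino. */
final class ViewWalkSketch {

    /** Hypothetical helper: counts open back-end documents and the peak. */
    static final class Stats {
        int open = 0;
        int peak = 0;

        void opened() { open++; peak = Math.max(peak, open); }
        void closed() { open--; }
    }

    static final class Document {
        final int index;
        private final Stats stats;
        private boolean recycled = false;

        Document(int index, Stats stats) {
            this.index = index;
            this.stats = stats;
            stats.opened();
        }

        void recycle() {
            if (!recycled) {
                recycled = true;
                stats.closed();
            }
        }
    }

    static final class View {
        private final int size;
        private final Stats stats;

        View(int size, Stats stats) {
            this.size = size;
            this.stats = stats;
        }

        Document getFirstDocument() {
            return size > 0 ? new Document(0, stats) : null;
        }

        Document getNextDocument(Document current) {
            int next = current.index + 1;
            return next < size ? new Document(next, stats) : null;
        }
    }

    /** Walks the whole view; returns the peak number of open documents. */
    static int peakOpenDocuments(int viewSize, boolean recycleAsYouGo) {
        Stats stats = new Stats();
        View view = new View(viewSize, stats);
        Document doc = view.getFirstDocument();
        while (doc != null) {
            // ... read whatever you need from doc here ...
            Document next = view.getNextDocument(doc);  // get next BEFORE recycling
            if (recycleAsYouGo) {
                doc.recycle();
            }
            doc = next;
        }
        return stats.peak;
    }

    public static void main(String[] args) {
        System.out.println("no recycle:   " + peakOpenDocuments(50000, false));
        System.out.println("with recycle: " + peakOpenDocuments(50000, true));
    }
}
```

    In the simulation the un-recycled walk ends with all 50,000 documents open at once, while the recycling walk never holds more than two.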

    Of course, even when you leak megabytes of memory in this way, Notes (and the operating system) get it all back when the Agent terminates. Why? Because the CurrentDatabase and the Session objects that Notes created for your Agent to use get recycled automatically when the Agent is done, and therefore all other objects "owned" by that Session and Database (i.e., all of them) get recycled automatically. Of course, if you run out of memory before that point, you're screwed.

    So: recycle early, recycle often!

    Believe it or not, there's actually a bunch more to say on this topic. Look for my next blog post in a couple of days: "So Now it Gets Complicated: Java, Garbage-Collection, Notes, and Threads".

    Geek ya later.

    (Need expert application development architecture/coding help? Contact me at: bbalaban, gmail.com)
    Follow me on Twitter @LooseleafLLC
    This article ©Copyright 2009 by Looseleaf Software LLC, all rights reserved. You may link to this page, but may not copy without prior approval.
    Comments

    1Dan Schwarz  5/5/2009 9:05:22 AM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Bob,

    Have you seen { Link }? A valiant attempt to deal with the recycle() issue once and for all, and add a higher level of abstraction to the Lotus Notes Java API. I am a sometime contributor to the project.

    2Bob Balaban  5/5/2009 5:56:26 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    @1 - Thanks Dan, looks like your posted link got messed up, but I found it here:

    { Link }

    Looks interesting, if I ever get some time, I'd like to see how the code works.

    3Scott Leis  5/8/2009 2:44:16 AM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Great post, Bob.

    How about a mention of LS2J and its extra complications with memory use?

    E.g. see this thread on the ND6/7 forum:

    { Link }

    4Kerr  5/14/2009 4:54:02 AM  Geek-o-Terica 5: Taking out the Garbage (Java)

    A bit late to the party on this one.

    I had a quick look at the source for domingo and it looks like it handles the recycle issue by overriding finalize in a base class. This sets off alarm bells straight away. Finalize is notorious for causing more problems than it solves. See Josh Bloch's Effective Java { Link } Item 7: Avoid Finalizers.

    This is also a great article from GC expert Tony Printezis { Link } which gives a good explanation of avoiding problems with finalizers and how to use Weak Refs instead.

    Now the domingo guys might have done a great job and it might work really well. But I'd want to poke around a lot more before accepting that this was the best way to handle recycle().

    5Bob Balaban  5/14/2009 7:15:52 AM  Geek-o-Terica 5: Taking out the Garbage (Java)

    @4 - Thanks for the post, Kerr! I was going to investigate Domingo myself, but haven't got around to it yet.

    I echo your concern about Java finalize(). I investigated using it as one would use a C++ "destructor" way back in 1996 when I first wrote the Java version of the back-end classes for Notes 4.6. I read the Java doc, and saw that it more or less said something like, "Finalize() MIGHT be called before the object is garbage-collected". Which, IMHO, makes it frackin' useless.

    A couple of years ago, while I was still at IBM, several of us on the Domino server team had a conversation with the team in the UK who are responsible for IBM's Java VM development and support, and they confirmed that we should not rely on finalize().

    There are additional issues with gc and finalize(), on which I'm preparing a new post. Watch for that in the next few days.

    6Kerr  5/14/2009 12:51:18 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Well, a compliant JVM will always call an overridden finalize() before garbage collecting an object. The problem is that there is no guarantee that it will ever be GC'd. It might just push it back and back and then dump everything on exit.

    7Simon O’Doherty  5/24/2009 6:11:38 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Finalize will only run when the gc sees no more references to the object in question.

    However, if an uncaught exception is thrown within the finalize() then the exception is ignored and the finalize() method terminates.

    To add to that finalize() only ever runs once for the life cycle of the object.

    ...

    Good write up Bob!

    The majority of notes/domino crashes in relation to Java are because of the misuse of the recycle() method. So it clears it up quite nice.

    The one other thing to note in relation to memory being taken up is that the Domino objects are JNI calls. As such they also take up space on the JVM Heap. More details here.

    { Link }

    8Dan  6/6/2009 8:18:49 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Bob-

    Thanks for taking a look at Domingo. My work has been mainly in the Groupware layer, adding a high level interface to access Notes address books, etc. Point taken about the use of finalize()! I'll forward your comments to the project leads and see what they have to say.

    Best,

    Dan

    9S. Macgowan  1/18/2011 9:01:26 AM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Quick question, hope it's not too late...

    From Technote #1097861:

    "When using objects in an agent, all objects (both Java and C++) are destroyed when the agent ends. When using servlets, .jsp's, or standalone applications, recycle must be used since Domino will never clean up these backend objects."

    When using Java Web Service Provider, do they behave exactly like agents? (all Java and C++ objects are destroyed when the Web Service ends so no need to recycle if it's a simple Web Service with no loop).

    Or do they behave more like Servlets where I should recycle everything manually?

    Thanks!

    10Bob Balaban  1/18/2011 12:03:12 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Web Service provider and consumer design elements in Domino are basically forms of Agents. The root Session instance is provided by Notes/Domino, and is therefore destroyed/recycled automatically when your program exits.

    Having said that, however, you are still responsible for ensuring that the program doesn't run out of memory before it gets to the end. If your web service is iterating over thousands of documents in a large view, you should really recycle the document objects as you go.

    As with all forms of Java garbage collection: You never have to worry about memory, until, that is, you run out.

    11Matthew  4/8/2011 12:07:12 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    I learnt about recycle on looseleaf.net (which sadly just seems to serve up a blank page now).

    I have often wondered about transient Notes objects and if we have to worry about garbage collection. e.g. Consider a loop that iterates over thousands of documents and includes code something like:

    if (lookupView.getDocumentByKey(key, true) != null) {
        // do whatever
    }

    Is this a problem, because lookupView.getDocumentByKey(key, true) returns a NotesDocument but we never have a handle on it and so don't explicitly recycle it when we are done?

    Would it be better to do something like:

    doc = lookupView.getDocumentByKey(key, true);
    if (doc != null) {
        // do whatever
        // we are done with doc, recycle it
        doc.recycle();
    }

    Maybe I should loop up the two different ways and see if I get memory problems with the first way.

    12Bob Balaban  4/8/2011 3:53:02 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Matthew - there are no blank pages that I know of on this site. What URL is failing for you?

    To answer your recycle() question, yes, you must recycle that document inside the loop, or you will have a memory leak.

    13Matthew  4/11/2011 6:59:12 AM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Thanks Bob for the confirmation on the need to recycle.

    No blanks on this site - I was referring to your other site { Link } The home page just brings up a blank page for me. But I am seeing a lot of your great content here at bobzblog so maybe you decommissioned looseleaf.net. Still occasionally go back to your book "Programming Domino 4.6 With Java" after all these years.

    14Mike Woolsey  3/28/2013 12:46:10 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    So ... do I get this right that the agent (NotesMain, single-threaded) will clean up on agent closure, if I don't recycle? But that if I'm doing something hefty with even a few hundred documents, I'm spiking the Domino memory pool usage until the agent's done.

    I've also noticed the Java heap follows the same path of climbing, climbing, climbing memory ... which leads to my next question

    About Runtime.gc(). Is it accurate that the Java garbage collector doesn't recycle(), ie there are no destructor actions in the Java objects to destroy Domino memory use?

    Again, just trying to get clear exactly what happens with some agents I have known.

    15Bob Balaban  4/3/2013 5:44:09 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    @Mike - Yes, Notes/Domino will clean up all the back-end objects when the agent ends. The problem is that you might run out of memory before the agent is done, so "best practice" is to recycle as you go. The JVM gc() call asks the JVM to collect unused Java objects (strictly speaking it is a request, not a guarantee) -- but Notes back-end (C++) memory will NOT be freed, that's the whole problem. Also, recycle() does not gc the Java memory, it only makes it "available" to be garbage collected. HTH

    16Sam Elperin  9/20/2013 4:29:21 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Bob!

    Thanks for the great post.

    I am struggling though to find anywhere the explanation whether or not profile documents need to be recycled.

    The big profile usage causes server crashes with "Out of Memory" errors, in 8.5 it's better, but still the case.

    To patch the issue IBM suggests using the NSF_DOCCACHE_THREAD=1 ini variable, which helps somewhat but doesn't resolve the issue at its core.

    I remember having code that forcefully recycled profiles at the end of the Java agent, but it was causing event-driven crashes if some other agent or form referenced the same profile while my agent was recycling it (this is per the IBM support team).

    So I took the code for recycling profiles out to resolve the event driven crashes.

    Can you please shed any light into this?

    Thanks, Sam

    17Bob Balaban  9/23/2013 8:31:58 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Hey Sam,

    My knowledge on this may be a bit out of date, but here's what I (used to) know about this: In client agents you can (probably should) freely recycle profile documents just like any other. The issue is a little more complicated with server agents, because profile documents are cached in memory on servers (and not on clients, at least so far as I know). Recycle is fairly brutal: it forces the document "HANDLE" (the C API-level object that represents an "open" document, whether it be a regular document or a profile document in memory) to close. That's where you get the memory reclaimed.

    However, if the profile document is cached (as it is on servers), the same in-memory copy of the document is available to multiple threads, whether those threads represent client sessions, or server agents running side by side. If one thread forces the in-memory copy of the document to shut down, other threads/agents which have accessed it may be left holding an invalid pointer, which will almost certainly cause a crash, somewhere, somewhen.

    My recommendation would be to leave server-side profile document instances un-recycled. If this causes you out-of-memory issues, first look to other sources of memory leak, and only start recycling profile docs as a last resort.

    hth

    18Art  10/18/2014 3:42:07 PM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Hello Bob,

    I need some more clarification about the scope of Java recycle(). Does it affect only one user or all users? What I mean is, if one user is editing a document in an XPage and another user calls recycle() on this document, would it destroy all handles or just the one it was called upon?

    Thanks in advance,

    Art

    19Bob Balaban  10/28/2014 3:52:39 AM  Geek-o-Terica 5: Taking out the Garbage (Java)

    Recycle only operates within the scope of a Session's object hierarchy, and only within an individual JVM process (memory) space. Calling recycle() only affects objects in memory, not in a database. If you delete (remove()) a Document from a database, it's gone for everyone, but if you recycle a Document object, it only kills the memory in your local process.

    HTH