Bob Balaban's Blog

     

    Bob Balaban

     

    Yes, I really did need NotesDocumentCollection.GetNth that one time....

    Bob Balaban  August 18 2008 02:47:42 AM
    Greetings, Geeks!

    I promised you a follow up to my "Spawn of the Devil" post of a couple of days ago, and this is it. I sincerely hope this will be my last-ever discourse on the GetNth topic.

    I wanted to be fair, and mention that there was one time, recently, in fact, when GetNth was the right solution to a problem I was having, and I really needed it.

    But, to tell you that story, I have to tell you this: Back when the NotesDocumentCollection class was first created (for Notes 4.0, mid-1990s), it was essentially a LotusScript wrapper for a C API data structure called an IDTable. The IDTable "object" is a compressed list of NOTEIDs, which is optimized for two things: efficient storage of a potentially large number of IDs, very fast lookup to determine if a given ID is in the list or not, and very fast traversal (if you have an ID and want to know the next one, NOT if you have to count N entries from the beginning every time....). Ok, that was three things. IDTable made operations such as getting a list of all the documents in a database very, very fast (expressed in LotusScript as NotesDatabase.AllDocuments, for example).
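
To put a rough number on the "count N entries from the beginning every time" cost, here is a minimal sketch in Python. It is a toy model only: ToyIdList, get_nth, walk and the two steps_* helpers are hypothetical names invented for illustration, and nothing here reflects how the real IDTable is stored or accessed.

```python
class ToyIdList:
    """Hypothetical stand-in for an ID list that can only be read from the front."""

    def __init__(self, note_ids):
        self._ids = list(note_ids)
        self.steps = 0  # total number of entries visited so far

    def get_nth(self, n):
        """Return the n-th ID (1-based), counting from the start on every call."""
        for position, note_id in enumerate(self._ids, start=1):
            self.steps += 1
            if position == n:
                return note_id
        return None

    def walk(self):
        """Yield every ID once, in order (the 'get next' style of traversal)."""
        for note_id in self._ids:
            self.steps += 1
            yield note_id


def steps_with_get_nth(count):
    """Visit all entries by position and report how many entries were touched."""
    ids = ToyIdList(range(count))
    for n in range(1, count + 1):
        ids.get_nth(n)
    return ids.steps


def steps_with_walk(count):
    """Visit all entries with a single pass and report how many were touched."""
    ids = ToyIdList(range(count))
    for _ in ids.walk():
        pass
    return ids.steps
```

In this model, visiting 1,000 entries by position touches 1 + 2 + ... + 1000 = 500,500 entries, while a single pass touches 1,000. The real numbers depend on the actual implementation, but the shape of the growth is the point.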

    The thing about db.AllDocuments (and the C API that underlies that function) is that normally it returns an IDTable that contains both actual document IDs AND the NOTEIDs of any deletion stubs which are also in the database. Originally, the code underlying .AllDocuments stripped out the deletion stubs, leaving only actual data document IDs, so you could be pretty sure that when iterating over a DocumentCollection you would always get a valid NotesDocument (assuming someone else didn't go and delete something on a shared copy of the database between the time when you executed AllDocuments and when you accessed everything in the resulting DocumentCollection).

    That all changed sometime in between Notes 5 and Notes 7 (I don't know exactly when, I was off doing other things...), and deletion stubs are no longer stripped out. Now when you want to iterate over all the documents in a database using NotesDatabase.AllDocuments, you have to check each NotesDocument instance with either (both is probably a better choice) the NotesDocument.IsDeleted or .IsValid properties before you try to access the document contents. Do I know the difference between the two? No, I do not.

    Ok, back to the story. So, there I was, needing to check all documents in a database that happened to have a lot of deleted documents in it. As always, I got a DocumentCollection using Database.AllDocuments, and wrote my NotesDocumentCollection.GetFirst/GetNext loop (I was actually using the COM classes from C#, but they're the same as the LotusScript and Java classes anyway, so that difference is meaningless for the purposes of this sad story). Inside the loop I coded a test so that if .IsDeleted was true or .IsValid was false, I could skip the current document and go on to the next one.

    HOWEVER (this is the sad part), it turned out that DocumentCollection.GetNextDocument() would fail if the current document was not "valid". (Gasp!) What to do? I thought maybe someone was messing with the database on the server while I was iterating, and that the document had been deleted out from under me. But no! The same thing happened when I ran a test on a local copy of the database. I was stuck, couldn't get the next document, ever.

    AHA! But then GetNth came to my rescue. Why? Because, GetNth doesn't NEED the previous document to get the next one, it just counts IDs in the IDTable. Problem solved, worked like a charm. Sure, slower. But how do you measure speed when the "correct" way of doing something runs you into a brick wall? I'll take correct behavior that's a bit slower over a "faster" approach that doesn't work, any day.
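
The difference between the two loops can be sketched with a toy model in Python. Everything below is hypothetical and only mimics the behavior described above: ToyDoc, ToyCollection and the two valid_ids_* helpers are illustration names, not the Notes classes, and the assumption that "get next" refuses to step past an invalid document is taken from this story, not from any documentation.

```python
class ToyDoc:
    """Hypothetical document: just an ID plus a validity flag."""

    def __init__(self, note_id, valid=True):
        self.note_id = note_id
        self.valid = valid


class ToyCollection:
    """Hypothetical collection whose get_next() depends on the current document."""

    def __init__(self, docs):
        self._docs = list(docs)

    def count(self):
        return len(self._docs)

    def get_first(self):
        return self._docs[0] if self._docs else None

    def get_next(self, current):
        # Assumption for this sketch: stepping from an invalid document fails.
        if not current.valid:
            raise RuntimeError("cannot step past an invalid document")
        i = self._docs.index(current) + 1
        return self._docs[i] if i < len(self._docs) else None

    def get_nth(self, n):
        # Positional access (1-based) never looks at the previous document.
        return self._docs[n - 1] if 1 <= n <= len(self._docs) else None


def valid_ids_by_cursor(collection):
    """GetFirst/GetNext style loop: raises as soon as it must step past a bad entry."""
    found = []
    doc = collection.get_first()
    while doc is not None:
        if doc.valid:
            found.append(doc.note_id)
        doc = collection.get_next(doc)
    return found


def valid_ids_by_position(collection):
    """GetNth style loop: skips bad entries and keeps going."""
    found = []
    for n in range(1, collection.count() + 1):
        doc = collection.get_nth(n)
        if doc is not None and doc.valid:
            found.append(doc.note_id)
    return found
```

With a collection of three entries where the middle one is invalid, the cursor loop raises when it tries to move on from the invalid entry, while the positional loop returns the two valid IDs.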

    (Need expert application development architecture/coding help? Contact me at: bbalaban, gmail.com)
    Follow me on Twitter @LooseleafLLC
    This article © Copyright 2009 by Looseleaf Software LLC, all rights reserved. You may link to this page, but may not copy without prior approval.
    Comments

    1. Bart  8/18/2008 4:50:56 AM

    Maybe you could have created your collection with db.Search({Form!=""}). I suppose this leaves out the deletion stubs since they don't have any items (afaik).

    Of course it would take some time to build the collection but you don't need GetNthDocument :-)

    2. David Leedy  8/18/2008 7:26:47 AM

    Bob,

    I thought the fact that the documentcollection returned deletion stubs was a "bug" that was fairly quickly corrected in a point release.

    I guess from reading your post that the bug was they let the deletion stubs through, but I think in a point release or two Lotus put it back the way it was.

    I haven't checked yet but I'd be surprised if stubs are still being included in the documentcollection.

    3. Mike Miller  8/18/2008 8:17:43 AM

    Bob,

    I ran into a similar situation. My db had a DECS connection and for a while it was throwing errors when retrieving certain records (I think it was some kind of regression bug, as it's not doing it anymore). In a loop, the script would error out and wouldn't complete. In order to log the error and then continue on, I had to use GetNth. Not ideal, but it was a scheduled nightly agent, so I didn't really care about performance.

    4. Jan Van Puyvelde  8/18/2008 8:31:39 AM

    I also had some problems with deletion stubs being in the collection. My workaround was to check (doc.NoteID = "") as nothing else worked.

    GetNextDocument always worked for me though.

    I too have not seen that behaviour for some time. Guess it was fixed.

    5. Karl-Henry Martinsson  8/18/2008 10:02:24 AM

    @Bob: I had exactly the same issue recently, probably 5-6 months ago. I ended up using GetNthDocument as well. One thing I will try next time is to use a counter to keep track of the current document. Then I use GetNextDocument, but trap any error, redirect it to where I use GetNthDocument(counter+1) and then continue:

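    On Error GoTo errHandler
    Set col = db.AllDocuments
    cnt = 0
    Set doc = col.GetFirstDocument()
    tryagain:
    Do While Not doc Is Nothing
        cnt = cnt + 1    ' cnt is the 1-based position of doc in the collection
        ' Process doc here
        Set doc = col.GetNextDocument(doc)
    Loop
    Exit Sub

    errHandler:
    ' GetNextDocument failed on document number cnt, so fetch the one after it by position
    Set doc = col.GetNthDocument(cnt + 1)
    ' Resume at the loop test so a Nothing result ends the loop cleanly
    Resume tryagain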

    @1: Does not db.Search (which is slow) or db.FTSearch (faster for indexed databases) have a 5000 document limit on what they return, unless you modify the server's notes.ini?

    6. John Kingsley  8/18/2008 10:42:01 AM

    It is always something. Which is why I love and hate Notes - while investigating this whole thing, I used a categorized view at one point. And much to my surprise, getting Nth document in a categorized view returned the Nth CATEGORY, not document. My example ran out of documents well before the view did!

    7. Bob Balaban  8/18/2008 4:31:09 PM

    Greetings, Geeks!

    Below are some replies/comments to your postings about GetNth. I suppose I should just reiterate that GetNth is really not a problem for small collections. When does a collection become big enough to be problematic? I don't know! How about some of you doing some testing and letting me know what you find?

    @1- Interesting idea! You're right, though, that db.search() would be much slower.

    @2 - I'm using 7.0.2, I should check the Knowledge Base and try a later build. Probably will, when I get some time....

    @4- NotesDocument.IsDeleted should work, and should be quicker than testing the NoteID string. My problem seemed more related to documents where doc.IsValid was FALSE. What's the difference between "deleted" and "invalid"? I don't know! I wish someone would splain it to me.

    @5 - Nice workaround!

    @6 - Another case where better documentation would be very helpful.

    8. David Leedy  8/19/2008 8:13:04 AM

    You're getting deletion stubs on 7.0.2? Really? Wow! I could have sworn that they "fixed" it a while ago....

    Hmmm... Maybe there's a difference in using the AllDocuments property vs generating the collection in another manner.

    As I sit here I think I saw the problem pop up using the Responses property. At the time I had to put in some IsValid checking and all that to get around it, but I don't think I've used IsValid or IsDeleted in a long time... (Though that probably should be a best practice)...

    Interesting...