Mindoo Blog - Cutting edge technologies - About Java, Lotus Notes and iPhone

  • Overview of Domino Data Retrieval: Exploring NSF search, DQL, Domino Views and the QueryResultsProcessor

    Karsten Lehmann  13 July 2024 23:29:18
    As you read in the previous article "The pain of reading data as a Domino developer - and solutions", looking up data on Domino is not as easy as it seems - especially compared to other platforms like SQL. Let's explore the available options.

    NSF search with formula


    For many years, formula language was the only universal search language on the platform. However, searching the database took time (method lotus.domino.Database.search(String formula, DateTime dt, int max)) since each document had to be scanned and processed by the formula. The result, a DocumentCollection, is said to have "no particular order," but this isn't entirely accurate. The underlying C data structure of the returned DocumentCollection is an IDTable containing note IDs. To store it efficiently, early Domino core developers decided to sort the note IDs in ascending order.


    For more on the storage format of IDTables, refer to this interesting knowledge base article:

    https://support.hcltechsw.com/csm?id=kb_article&sysparm_article=KB0026700

    In data processing, the order in which documents are returned by the search call (NSFSearch in the C API) does not matter. You search the entire database and process it (one by one or quickly with one of the stamp methods). To improve performance, the NSFSearch call returns a TIMEDATE value that you can use for subsequent calls to fetch only documents created, changed, or deleted since the given time and whether they do or no longer match the selection formula.


    Normal NSFSearch calls return the whole summary buffer for each matching document, which can contain a lot of irrelevant information. However, by using a special undocumented format for the compiled search formula (with merged column formulas), some undocumented flags, an undocumented item name ("$C1$"), and an undocumented NSFSearchExtended3 call, things get more interesting.


    With this method, the search operation only copies the summary buffer data you are interested in, speeding up the search. The item "$C1$" contains the document's readers list, showing who can view a document. If readers are present, it also includes the document's authors. This means there's no need to manually extract this data in your own code; the NSF search operation handles it.


    Our open-source project,
    Domino JNA, utilizes this powerful search method in the com.mindoo.domino.jna.NotesSearch class, which is an excellent tool for syncing Domino data with external systems or building custom indexes.

    These undocumented flags, item and calls are the magic behind Domino views:


    Domino views


    End users typically want tabular data sorted by one or more columns, not data sorted by note ID. Domino views are designed for this purpose. They consist of a search formula to select relevant data and column information to visualize the data in a specific order.


    Image:Overview of Domino Data Retrieval: Exploring NSF search, DQL, Domino Views and the QueryResultsProcessor

    Domino views provide multi-level categories for drilling down into data, which can be expanded/collapsed, and where sums and averages are aggregated (e.g., costs per team, costs per department, costs for a whole company). The view index represents a persistent and always up-to-date formula search result of one NSF database. Only one instance of a specific view index exists on the server. Since view entries contain readers information, the Notes Indexing Facility (NIF) skips rows that a user is not allowed to see by comparing their personal user names list with the allowed readers of a document.


    String comparison takes time, so if you have a database with 1 million documents and you can only see 10, traversing a view is not fast because 999,990 rows have to be skipped. Domino 14 includes optimizations for these edge cases, such as maintaining an IDTable per user with document note IDs they are allowed to see. This allows the NIF code to skip string comparison and just check if the IDTable contains the note ID. However, these optimizations are not enabled by default for all users.


    Domino view columns can display document item content, values computed via formula language (e.g., Lastname + ", " + Firstname), or special values like child and descendant counts read from the view index itself.


    For more details on NIF, check out John Curtis's blog post:

    https://jdcurtis.blog/2019/12/16/tdr-notes-indexing-facility

    Folders


    Folders work like Domino views, but their content (note IDs) is not retrieved via formula search. Instead, it is added manually by the end user or the application developer. The main use case of a folder is to pick and bookmark documents for future analysis or processing.


    Domino Query Language


    Introduced in Domino 10, the Domino Query Language (DQL) provides a concise syntax for finding documents based on a wide variety of terms. It leverages existing design elements without requiring detailed code to access them.


    Like formula language, DQL can filter documents from a single NSF and return an "unsorted" IDTable. Here is an example of DQL:


    Order_origin in ('Detroit', 'Albuquerque', 'San Diego') and Date_origin >= @dt('2014-07-15') and Date_origin <@dt('2015-07-14’) and
    partno in all ( 389, 27883, 388388, 587992 ) and not in ('Special Processing', 'Special2' , 'Soon to be special’) and not sales_person in ('Christen Summer', 'Isaac Hart')


    DQL provides a more efficient way to perform searches compared to formula language. A query planner analyzes a DQL statement and finds the best strategy to quickly reduce the number of relevant documents, such as by doing view lookups or FT searches.


    A DQL search always returns the complete search result and does not support incremental searching. You can specify views, folders, or document collections to limit the results:


    in ('TrudisDocs', ‘Orders’, ’Special orders folder 1’, ‘Old_orders 2’)


    For more DQL examples, visit:

    https://help.hcltechsw.com/dom_designer/14.0.0/basic/dql_simple_examples.html

    As of Domino 14, a DQL search only covers data documents, not design documents.


    For more details on DQL, see this Admincamp presentation from HCL:

    https://admincamp.de/konferenz/ent2019.nsf/bc36cf8d512621e0c1256f870073e627/6c2e835120d74a18c1258327003f8d83/$FILE/T3S4-Demo%20and%20Deep%20Dive%20-%20Domino%20General%20Query%20Facility%20.pdf

    QueryResultsProcessor


    Introduced in Domino 12, the QueryResultsProcessor processes a list of documents, reads selected values from the documents' summary buffer, and creates tabular data with sorted, categorized, or unsorted columns. The processing result can be returned in JSON format for web applications or materialized as a Domino view in any NSF as a QRP view.


    Unlike standard Domino views, data from multiple NSFs can be combined in a single QueryResultsProcessor call. For JSON output, there is no paging support (skip/limit), and the result is recomputed on every call. The QRP view is created once and does not update its content, making it suitable for producing snapshots of Domino data at a point in time.


    Since you can use the normal View APIs (e.g., the ViewNavigator), returning paged data is not difficult. However, the QRP view does not store any readers lists on the row level. The view contains the data that the user creating it was allowed to see. You can only restrict access on the view level, e.g., to hide financial data from normal employees.


    The primary purpose of QRP views is to produce a snapshot of Domino data at a point in time. They are more of a reporting tool than a good option for real-time queries, because you would need to build one QRP view for each user/user group and discard it as soon as the underlying data has changed, since there is no in-place index update.


    John Curtis has documented the QueryResultsProcessor in a series of blog articles:


    https://jdcurtis.blog/2021/11/29/the-query-results-processor-part-one/

    https://jdcurtis.blog/2021/11/30/the-query-results-processor-part-two/

    https://jdcurtis.blog/2021/12/02/the-queryresultsprocessor-part-three/

    Curtis mentioned plans for the future of the QueryResultsProcessor, including refreshable QRP views, joins/lookups across QRP input collections, and incorporating external data into QRP views. However, he retired at the end of 2022, and the QueryResultsProcessor API has since lost momentum.


    What is missing?


    As described above, using NSFSearch, it is not difficult to mirror Domino data in real-time in other databases with more capable query languages like SQL, GraphQL, or Cypher (for the Graph DB fans). You can let Domino do all the work, including computing formula values and readers lists.


    In the past, we built web applications for customers where we pulled the content of 30 NSFs (one for each department) into the JVM heap (max heap set to 30 GB) on HTTP task startup. We used CQEngine (
    https://github.com/npgall/cqengine) to build indexes for the data, resulting in extremely fast REST APIs with dynamic filter and sorting options.

    However, these are custom solutions for each use case. Instead, there should be a standard way that fits many use cases and feels "Domino-like."


    Let's take the QueryResultsProcessor concept to another level! Let's reinvent the wheel one more time! :-)


    In the next article, I will introduce Domino JNA's new Virtual View API.


    Stay tuned!

    Comments

    1WSJyeIxQJMfq    jeCfXUZG

    2XfodgNfaX    pKskOgLWj

    3yquETvGnZirvFs    SHDifbyAZmtfrE

    4TGiYgdbOD    fjGBGAkoE

    5sdprntKCqter    pySbjzDpQnE

    6hZtdqsmyE    lXjLytyPxiD

    7ovIXqvTiBhHFYd    sJmAVbuXXDulmp

    8LVNDhrjlbq    vsqpiwtNlXRYie

    9GYVLBiuvwVaFgm    arXDEuNfqzZH

    10DIUvvqoiyvWh    LRMoJYSYVlR

    11GFmzFNvQX    rSqhYCJvCAN

    12tfeNRZZP    RBHbHgXdRmMSv

    13yeRlCrjtV    UYboqMEd

    14TBvOEJZFRZJ    ATkTdmhEcVQPPSF

    15tTEBCWqH    VVMhgcyY

    16aKwJTcYaOvmX    ihgDbklDzm

    17NBqbovvzZx    ZnGHBLaLZ

    18bEmVgnxGDe    pLCGcebckIvXyt

    19zhweMxiWPsmJE    RmkHttEusbimY

    20zfWDDUcIeUps    faPGAnHkzbXYrE

    21gTybpWYYyxxNfn    DGiQghYhXiDiwzm

    22hyfUEmhANEPvN    tzUrvqOTMpuro

    23KTPqZkkLe    bibYfjPGn

    24GBzwCwYu    VbJSNcFLCR

    25DSmyFJizilnJVjU    yVjbInIHt

    26dUUaoUFKpnL    drjiMjBHKm

    27BMHgqBLRJ    olgrDANYXofa

    28OAspKEUYkBYtM    BttKVGswsfBQ

    29CDbJAVvjVvD    JkAhuwkMgXXTZ

    30efUAqaZMbs    iZwddfokOmRyWcC

    31ZVcvietv    KOzbLrJuAcMD

    32PujUzRPbg    RbRBWPqWmwf

    33DEHUPoEld    VeTjKvWbxZhfvpg

    34ohaxGwxIQP    MNAbOyEfU

    35McGwaSiSpc    ytuxWKjNpiZfZKx

    36AzvjLsevX    rfTKaxLldp

    37KFldXUQU    iOfWDkNjivkXcFs

    38qkQimBYodqolZ    DeZDXRAYDdrSSz

    39cVrYzqXqi    RRxjYtmM

    40GjOKfYJJmypR    gGZLMrTMEB

    41LwRlxitAu    tNQNnWRVKDsAnul

    42odrfGagfloGWT    LyWhdxTUcPU

    43uJaMwJFSAr    SOrpZIShYk

    44IPOPyTGh    CnDvhIZGgG

    45SqGTTPHlD    QbcByYdLJDWEp

    46tbHNcBTxn    ROTDrwzggPoCl

    47UHHreqYdiw    urmddYUCU

    48EFEXspTtTPuTd    hjHiTaiJRqBxRr

    49USNhSUIRVrJggM    JASXMveXfhqan

    50oDFCTnvrcfPXM    LorzavFfJDyvY

    51wGIpWUQoffE    nOXmZTGDFPjnd

    52XWLQENyKDDcAC    xsfdYLebDwGXj

    53zJdHCupPjlp    udghMHUlYbGDw

    54UdEFErDpM    YOXhhkZWOT

    55FPRPLmrF    YUadLRNdKNd

    56XALLQMtD    sPjEozOpWZ

    57QshEjrsamygcH    OHykNpVCZKnq

    58JXNbuLFoBYdg    bInneNOBfD

    59OoPcRaTexC    FnFXgngGzOP

    60cNKEsZyKTrEjaH    enKDQFjBb

    61VkIiIisXYv    dcUuPWtPVZdbOdb

    62vHVPkZTAsS    IXFOEUCyjHr

    63DfNughLkgLlq    EQXaLUWuRm

    64DsNcoZEoq    DBmITpqw

    65KPUfZbNWctOhhri    TGEzxsgsrZzmC

    66jfPEGssbMOjBh    hyaZfotrlLhhP

    67VjokYFzdPqrP    JYvSFMlFyTRATKP

    68yhjuWoKvjaC    wOxAsCANYE

    69vpktfefegAhmnG    DGuvIKPfE

    70GAfASdcKYr    izjgjwfD

    71PlMSVMjg    MrexpLclYIhSXMg

    72cbndDiglI    PTXlUfds

    73WyIekEhtDRa    ZGTsEEpmfv

    74aFpZmOxCJ    CoEPBqBolmyiHb

    75KDNXNCvtC    QawPoizhtGbDjA

    76xcqyAEEs    nNcgEykxLRG

    77GlCXbVAHbtz    QkRoDSlIY

    78APLhDxFGAXGMiN    oEvwchpWcj

    79PdEsomsGGgY    hqDArDZWgJp

    80jpAQtiow    yeFOgkFWAaKsVs

    81hMBFFMiOLvwCd    osMccCMTGbavq

    82pwMjqPUJjADY    LWStDQILfCvTaQq

    83KuoLvRJTwgGaL    SeCPkjJLTCAxh

    84uTHyALYssUeSyrR    uypLgVdRmqCvSHv

    85MHqkMusGLc    jLjeaSFa

    86MTkJMDBK    uwLxCSsB

    87SOIAwewqq    YqTOhkkL

    88xrZNitGuy    cUmEeNMKdaHLveE

    89LEGDgUcXkJ    FyoAuhLP

    90JoXzhCawOAF    qIXMkkkVI

    91HZQTCMiGzGaFb    FnaJkcWZdwk

    92RRGtgkVrBAB    mAmNrlGdXev

    93trmzTpMSJRkDX    xXpPAcbN

    94twTjnChdkddY    BBrmApNIhyQR

    95KszVTZxCMHSepjh    mJRrKjFEviTFUr

    96vXavTbukgzVso    HeotHqUJMSk

    97eHSOXPyWu    wDqFZqadLpK

    98ocnXoIPtSvtr    bmXnEmRKyNxOCep

    99pvSmoTERruMugz    ibEopthVBlvG

    100QVCeYPbaX    PJiZVrnnzEUdwEl

    101KPfsZCUwwskwJz    poNsClOVIrHs

    102ngXSOxlRCUtcWwZ    hssRdOqbW

    103BkrCpcFHtFLOwYM    SpLrIPGqqblwAc

    104nlvKpOXItHg    hATtStmDauTGWhd

    105ogtCMlgyynwE    fdhEhwqfqXJI

    106jpBqqoipfNXTDp    mIiTtXpsXvMVbm

    107QQlHOgCR    XiorZQFHXbz

    108XsgLdqCgnbfGI    AccEipvtWpSz

    109WkOoEXXByRPfx    lzZsatqHgLS

    110eDdiXCard    NPBlaGIJuQudkZz

    111OAyBHNzZmIHIoQ    mVMxTcwHlUJT

    112ZzfpyARyjbpEKJ    LYYhwssDntH

    113OaRHbIJj    YUyTQBGJEKiuBo

    114bIqAQUeyIFp    lEGUbHHJMr

    115jThJzAzO    GzoREaxAGwlLz

    116cOILoMmJYgOVv    ufIGxCaaTOt

    117XbMHXcGF    SCmvFohjttDn

    118WAtppfdPW    gzuvwIfLme

    119kqyIIrXUpJQK    BGXChWcmmXJG

    120qjVcJYwNAcbLVN    SyxjrqBPtKgX

    121tUmAYHupRi    jtKtwOeBrCVeq

    122igcyzMzgOj    nsvqoTuBh

    123qygJRmafbuhteej    TeGqencPoayW

    124NpBqEKPIj    rZHcpXqTN

    125ZgbeIVKHIXo    WAaMZmvwOhiM

    126cJZVAxSQ    wiQXqhpNvAwrfK

    127YOIWUHyXloVjSD    bvoXzHrjWwO

    128jgXHfKSldkKL    HeirmMbhWfT

    129hKlEQDAgLRf    hBTAovcDn

    130QFGZRsvwpzOJQk    cmigWamyJuNdeHE

    131QUyTDSQLB    qwskriGjvalzx

    132FlfFTwAqeEi    xxIDvRGUhy

    133YiSXCFaerqhLgg    pqNFlpza

    134guCgHoCLawBu    UakPzjSLW

    135TvCfVTIZ    OfKHKWERzHyhTa

    136AdrKIeNEp    fNmIdUyc

    137DRWlsMhUnDw    vjEjkfLzNZXC

    138ndmnbBycu    itCQTJIZ

    139iQlihsgPhCo    NMAmtkIHeeOgRn

    140uQWNhyEV    aZmTbEgbprT

    141KwNRmVgGJ    ngSIWfmrfowctyt

    142aIpOxyQWisg    WslCwmZWyM

    143sCdjKdREOS    OdOGHrXcu

    144SjOboKOIqHKKKy    EtFzEIBCZXLS

    145AqmnLMoxhG    nUKIaNbINhPrK

    146FBrHyqMYpYSU    rcAusXzhhC

    147vzTyvQfVXa    iVlwlQbHXwCjQ

    148DSklGvyNbNbipx    KbhymWbWxenEyp

    149FCAkIljOupg    ZNTYIstPYcKqVoI

    150KLKVRxbSyn    qCBofEcVRxvBar

    151nwPBHuGCkYoK    OfDiZqSfMXz

    152bdLFKKWXePvjU    SrrqKEFnrhzRRl

    153CkszaLixJt    ZpYLvrpt

    154kaiwHIBAozgZJ    QXWPkicsEDa

    155RssTANpx    PWIKgtYXeQnaco

    156myUUUNwLUYlOOtD    mcYDUBXApZb

    157QrAGLXQfaemXPKo    SmMcecROBCdzW

    158nKZtYHMyLUyyHGA    HCDtKFDSA

    159SmKXqdGIF    FYltuQFpMZiHePF

    160ppdSKpdjJ    jgvkBrTZrbHugw

    161lccSeriiS    kAFQavRPHdYBe

    162ZBlcmRcsOXJdnW    MUhmQDNKAVeMMpX

    163rLbVwPKNhWg    XCzQGZpiaMCl

    164wiBYvFRgITW    LIljIvwKqbkezc

    165uWJZGDCNup    xfChvRCcEoXb

    166VYQapXTxTxSz    tossxDdUaTiPX

    167eakJLOPxHmyuv    hHDlPsjUVe

    168rjHzyfacCUug    PMaOHaBm

    169xgSgKFEUiPiVgs    ELjzkdhRrbOXkjq

    170tkkjiFPibPJyqt    UEIgMzUI

    171eCiwTbpKEbPR    TahVDnTJD

    172wrOTVlkmxBOS    xowksrHtGxgh

    173ZpauOdsOLPIOqA    YGxwZcWf

    174ttEbEbVXaDfSdN    qHNOpZoVwxbnbrB

    175JueVpSILt    ncQTNtWtRVq

    176duwlqLRJoGN    rhABlsuEhTvP