Python Nucular API Guide

The following discussion walks through a sequence of small examples which illustrate ways to interact with a Nucular archive using the Python API. This discussion is intended to help a programmer understand the API. Please consult the API summary for detailed discussion of the operations illustrated here.

To keep each example easy to follow on its own, the code examples deliberately repeat a great deal of setup code.

Creating a new archive

To create a new archive, establish a session for the archive (even if the directory doesn't exist yet) and execute the create method:

def makeArchive():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "initializing archive"
    archive.create()
    print "archive created!"

In this example the directory should be empty if it exists. If the directory doesn't exist, the parent directory (in this case ../testdata) must exist.
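
The two conditions on the archive directory can be checked before creating the session. The following is a minimal sketch using only the standard library; archive_path_problem is a hypothetical helper written for this guide, not part of Nucular, and it encodes only the two conditions stated above.

```python
import os

def archive_path_problem(path):
    """Return None if path looks usable for a new archive, else a reason.

    Hypothetical helper (not part of Nucular).
    """
    if os.path.isdir(path):
        # An existing directory must be empty.
        if os.listdir(path):
            return "directory exists and is not empty"
        return None
    if os.path.exists(path):
        return "path exists but is not a directory"
    # The directory is missing, so its parent must already exist.
    parent = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(parent):
        return "parent directory does not exist"
    return None
```

A caller could test archive_path_problem("../testdata/APIExamples") and only call archive.create() when the result is None.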

Adding some entries using dictionaries

The following interaction adds some entries about frogs, bunnies, and kittens to the archive using dictionaries. The strings 123FROG, 456BUNNY, 789KITTEN are the identity strings used to uniquely identify the entries in the archive.

def addSomeDictionaries():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    D = {
        "name": "frog",
        "food": "tastes delicious, like chicken",
        "descr": "little green slimy things",
        }
    print "adding", D
    archive.indexDictionary("123FROG", D)
    D = {
        "name": "bunny rabbit",
        "food": "just delicious with garlic",
        "descr": "cute and cuddly",
        }
    print "adding", D
    archive.indexDictionary("456BUNNY", D)
    D = {
        "name": "kitten",
        "note": "not edible",
        "descr": "cute and cuddly",
        }
    print "adding", D
    archive.indexDictionary("789KITTEN", D)
    print "SAVING session and making new entries visible now."
    archive.store(lazy=False)

The last step of the function stores the entries with immediate visibility -- this means that the entries are made permanent and visible to all subsequent sessions.

Finding the cuddly entries

The following query looks for entries that mention the word "cuddly" anywhere in the entry (case insensitive).

def cuddlyQuery():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "making a query"
    query = archive.Query()
    print "look for entries with cuddly"
    query.anyWord("CUDDLY")
    print "getting query result dictionaries"
    dictionaries = query.resultDictionaries()
    count = 0
    for D in dictionaries:
        print "found", D
        count+=1
    print "total found", count

Here the query is derived from the session object, and the query.anyWord method specifies that the entry should contain "cuddly" as a prefix to any word in any attribute. The result of the query is extracted as a list of dictionaries using the query.resultDictionaries method. When evaluated over the archive created above, the function prints

creating session
making a query
look for entries with cuddly
getting query result dictionaries
found {'food': 'just delicious with garlic', 'i': '456BUNNY', 
 'name': 'bunny rabbit', 'descr': 'cute and cuddly'}
found {'i': '789KITTEN', 'note': 'not edible', 'name': 'kitten',
 'descr': 'cute and cuddly'}
total found 2

Note that the identities for the entries show up in the extracted dictionary as the value associated with the key "i".
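
To make the matching rule concrete, the following sketch applies the same rule (a case insensitive prefix of any word in any attribute) to plain Python dictionaries. It does not use Nucular and is not how Nucular evaluates queries. The functions words_of and any_word_matches are hypothetical names, and the way text is split into words here is an assumption.

```python
import re

def words_of(text):
    # Assumption: words are runs of letters and digits, compared in lower case.
    return re.findall(r"[a-z0-9]+", text.lower())

def any_word_matches(entry, prefix):
    """True if some word in some attribute value of entry starts with prefix."""
    prefix = prefix.lower()
    for value in entry.values():
        for word in words_of(value):
            if word.startswith(prefix):
                return True
    return False

# The three entries added earlier, with the identity under the key "i".
ENTRIES = [
    {"i": "123FROG", "name": "frog",
     "food": "tastes delicious, like chicken",
     "descr": "little green slimy things"},
    {"i": "456BUNNY", "name": "bunny rabbit",
     "food": "just delicious with garlic",
     "descr": "cute and cuddly"},
    {"i": "789KITTEN", "name": "kitten",
     "note": "not edible",
     "descr": "cute and cuddly"},
]

cuddly_ids = sorted(e["i"] for e in ENTRIES if any_word_matches(e, "CUDDLY"))
# cuddly_ids is ['456BUNNY', '789KITTEN']
```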

Finding the cuddly but not delicious entries using boolean queries

There is another way to get results out of a session that avoids explicitly creating a query object by using boolean query strings. It is easier to state static queries using boolean query strings but it is harder to write programs that construct dynamic queries using boolean query strings.

For example

def cuddlyNotDelicious():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "look for entries with cuddly but without delicious"
    dictionaries = archive.dictionaries("cuddly ~delicious")
    print "total found", len(dictionaries)
    for D in dictionaries:
        print D

Here the query string "cuddly ~delicious" looks for all entries containing the word cuddly anywhere but not containing the word delicious.

When evaluated the function prints

creating session
look for entries with cuddly but without delicious
total found 1
{'i': '789KITTEN', 'note': 'not edible', 'name': 'kitten', 'descr': 'cute and cuddly'}

The Boolean Query Specification describes the syntax and usage of boolean queries.
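
Constructing a query string in a program mostly means joining words and prefixing the unwanted ones with "~". The following sketch shows one way to do that for the simple form used above. The function boolean_query is a hypothetical helper, not part of Nucular. It accepts only plain alphanumeric words because the quoting rules of the query language are not covered in this guide.

```python
def boolean_query(required=(), excluded=()):
    """Build a query string such as "cuddly ~delicious".

    Hypothetical helper: handles only the word and ~word forms shown above.
    """
    for word in list(required) + list(excluded):
        # Reject anything that might need quoting or carry operator characters.
        if not word.isalnum():
            raise ValueError("unsupported query word: %r" % (word,))
    parts = list(required) + ["~" + word for word in excluded]
    return " ".join(parts)

query_string = boolean_query(["cuddly"], ["delicious"])
# query_string is "cuddly ~delicious"
```

The resulting string could then be passed to archive.dictionaries as in cuddlyNotDelicious.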

Finding cuddly and delicious entries

Queries can place many conditions on entries. For example the following function evaluates a query looking for entries containing "cuddly" and "delicious".

def cuddlyAndDeliciousQuery():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "making a query"
    query = archive.Query()
    print "look for entries with cuddly"
    query.anyWord("CUDDLY")
    print "look for entries with delicious"
    query.anyWord("delicious")
    print "getting query result dictionaries"
    dictionaries = query.resultDictionaries()
    count = 0
    for D in dictionaries:
        print "found", D
        count+=1
    print "total found", count

When evaluated the function prints

creating session
making a query
look for entries with cuddly
look for entries with delicious
getting query result dictionaries
found {'food': 'just delicious with garlic', 'i': '456BUNNY',
 'name': 'bunny rabbit', 'descr': 'cute and cuddly'}
total found 1

The equivalent boolean query string "cuddly delicious" will also find the cuddly and delicious entries.

Adding more entries

The following function adds more entries to the archive to make the discussion a bit more interesting:

def addPhoneEntries():
    PHONEBOOK = [
        {"i": "Sally Smithers", "c":"uses too much salt", "p": "111-3333", "g":"female"},
        {"i": "Joe Smithers", "c":"can't cook", "p": "111-3333", "g":"male"},
        {"i": "Sandy Waller", "c":"delicious pizza", "p": "333-2222", "g":"female"},
        {"i": "Joe Blow", "c":"great at a grill", "p": "333-2222", "g":"male"},
        {"i": "Lola Waller", "n":"thinks snails are delicious", "p": "333-2222", "g":"female"},
        ]
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    for D in PHONEBOOK:
        i = D["i"]
        print "now indexing", i
        archive.indexDictionary(i, D)
    print "storing updates with deferred visibility"
    archive.store()

Oops! Here we used archive.store() without specifying lazy=False, so the updates have been stored in deferred mode and will not be visible until the archive is aggregated.

Looking for delicious anywhere

Now if we look for "delicious" anywhere in entries we see just the animals, not the people:

def deliciousQuery():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "making a query"
    query = archive.Query()
    print "look for entries with delicious"
    query.anyWord("delicious")
    print "getting query result dictionaries"
    dictionaries = query.resultDictionaries()
    count = 0
    for D in dictionaries:
        print "found", D
        count+=1
    print "total found", count

When evaluated the function prints

creating session
making a query
look for entries with delicious
getting query result dictionaries
found {'i': '123FROG', 'food': 'tastes delicious, like chicken',
 'name': 'frog', 'descr': 'little green slimy things'}
found {'food': 'just delicious with garlic', 'i': '456BUNNY', 
 'name': 'bunny rabbit', 'descr': 'cute and cuddly'}
total found 2

The people entries don't show up because they are not visible yet.

Aggregating the updates

To optimize the indexing structures and also to make deferred updates visible we may aggregate the archive:

def partiallyAggregate():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "aggregating update operations to temporary storage"
    archive.aggregateRecent()
    print "unlinking retired files"
    archive.cleanUp()

After executing partiallyAggregate(), if we execute deliciousQuery() again, the function prints

creating session
making a query
look for entries with delicious
getting query result dictionaries
found {'i': '123FROG', 'food': 'tastes delicious, like chicken',
 'name': 'frog', 'descr': 'little green slimy things'}
found {'food': 'just delicious with garlic', 'i': '456BUNNY',
 'name': 'bunny rabbit', 'descr': 'cute and cuddly'}
found {'i': 'Lola Waller', 'p': '333-2222', 'g': 'female',
 'n': 'thinks snails are delicious'}
found {'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}
total found 4

The additional entries show up because they have become visible after the aggregation operation.
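
The difference between immediate and deferred visibility can be pictured with a small toy model. The class below is only an illustration of the behavior described in this guide: it is not Nucular and says nothing about how Nucular stores data on disk. ToyArchive and its three dictionaries are hypothetical, and one object stands in for the archive shared by all sessions.

```python
class ToyArchive(object):
    """Toy model of entry visibility only (not Nucular's implementation)."""

    def __init__(self):
        self.pending = {}   # indexed in this session, not yet stored
        self.deferred = {}  # stored lazily, waiting for aggregation
        self.visible = {}   # what queries can see

    def indexDictionary(self, identity, dictionary):
        self.pending[identity] = dict(dictionary)

    def store(self, lazy=True):
        # lazy=False makes entries visible at once; otherwise they wait.
        target = self.deferred if lazy else self.visible
        target.update(self.pending)
        self.pending = {}

    def aggregateRecent(self):
        # Aggregation makes deferred entries visible.
        self.visible.update(self.deferred)
        self.deferred = {}

toy = ToyArchive()
toy.indexDictionary("123FROG", {"name": "frog"})
toy.store(lazy=False)                   # visible immediately
toy.indexDictionary("Sandy Waller", {"c": "delicious pizza"})
toy.store()                             # deferred
before = sorted(toy.visible)            # ['123FROG']
toy.aggregateRecent()
after = sorted(toy.visible)             # ['123FROG', 'Sandy Waller']
```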

Looking for delicious cooking

The following function refines the "delicious" query by looking only in the "c" attribute for "delicious" ("c" for "cooking").

def deliciousCookQuery():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "making a query"
    query = archive.Query()
    print "look for entries with delicious in 'c'"
    query.attributeWord("c", "delicious")
    print "getting query result dictionaries"
    dictionaries = query.resultDictionaries()
    count = 0
    for D in dictionaries:
        print "found", D
        count+=1
    print "total found", count

When we restrict the search to the "c" attribute the function prints

creating session
making a query
look for entries with delicious in 'c'
getting query result dictionaries
found {'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}
total found 1

The equivalent boolean query string "c :.. delicious" will also find the same result.

Phone numbers starting "333"

A query may also just look for an attribute which matches a prefix. The following function looks for entries where the "p" attribute (for phone) starts with "333":

def phoneStarts333Query():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "making a query"
    query = archive.Query()
    print "look for entries where 'p' STARTS WITH '333'"
    query.prefixAttribute("p", "333")
    print "getting query result dictionaries"
    dictionaries = query.resultDictionaries()
    count = 0
    for D in dictionaries:
        print "found", D
        count+=1
    print "total found", count

When evaluated the function prints

creating session
making a query
look for entries where 'p' STARTS WITH '333'
getting query result dictionaries
found {'i': 'Joe Blow', 'p': '333-2222', 'c': 'great at a grill', 'g': 'male'}
found {'i': 'Lola Waller', 'p': '333-2222', 'g': 'female', 'n': 'thinks snails are delicious'}
found {'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}
total found 3

The equivalent boolean query string "p: 333" will also find the same result.

...but only the girls

A query may also specify that an attribute must match a given value exactly. For example the following function looks for entries where the "g" attribute has the value "female" and the "p" attribute starts with the prefix "333".

def femaleAndPhoneStarts333Query():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "making a query"
    query = archive.Query()
    print "look for entries where 'p' STARTS WITH '333'"
    query.prefixAttribute("p", "333")
    print "look for entries where 'g' is 'female'"
    query.matchAttribute("g", "female")
    print "getting query result dictionaries"
    dictionaries = query.resultDictionaries()
    count = 0
    for D in dictionaries:
        print "found", D
        count+=1
    print "total found", count

When evaluated the function prints

creating session
making a query
look for entries where 'p' STARTS WITH '333'
look for entries where 'g' is 'female'
getting query result dictionaries
found {'i': 'Lola Waller', 'p': '333-2222', 'g': 'female', 'n': 'thinks snails are delicious'}
found {'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}
total found 2

The equivalent boolean query string "p:333 g=female" will also find the same result.
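
The way conditions combine can be sketched over plain dictionaries: every condition added to a query must hold for an entry to be returned. The code below does not use Nucular. The functions prefix_attribute, match_attribute, and select are hypothetical stand-ins for the behavior described above, and the case sensitive comparison is an assumption.

```python
PHONEBOOK = [
    {"i": "Sally Smithers", "c": "uses too much salt", "p": "111-3333", "g": "female"},
    {"i": "Joe Smithers", "c": "can't cook", "p": "111-3333", "g": "male"},
    {"i": "Sandy Waller", "c": "delicious pizza", "p": "333-2222", "g": "female"},
    {"i": "Joe Blow", "c": "great at a grill", "p": "333-2222", "g": "male"},
    {"i": "Lola Waller", "n": "thinks snails are delicious", "p": "333-2222", "g": "female"},
]

def prefix_attribute(entry, name, prefix):
    # True if the attribute is present and its value starts with prefix.
    return name in entry and entry[name].startswith(prefix)

def match_attribute(entry, name, value):
    # True if the attribute is present and equals value exactly.
    return entry.get(name) == value

def select(entries, *conditions):
    # Keep the entries that satisfy every condition.
    return [e for e in entries if all(c(e) for c in conditions)]

def starts_333(entry):
    return prefix_attribute(entry, "p", "333")

def is_female(entry):
    return match_attribute(entry, "g", "female")

phone_ids = sorted(e["i"] for e in select(PHONEBOOK, starts_333))
girl_ids = sorted(e["i"] for e in select(PHONEBOOK, starts_333, is_female))
# phone_ids is ['Joe Blow', 'Lola Waller', 'Sandy Waller']
# girl_ids is ['Lola Waller', 'Sandy Waller']
```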

Fully aggregating the archive

At some point after an archive has been updated the archive should be fully aggregated. This function does the full aggregation for the example archive:

def completelyAggregate():
    from nucular import Nucular
    print "creating session"
    archive = Nucular.Nucular("../testdata/APIExamples")
    print "aggregating update operations to temporary storage"
    archive.aggregateRecent()
    print "aggregating temporary with permanent storage"
    archive.moveTransientToBase()
    print "unlinking retired files"
    archive.cleanUp()
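
If several programs need to aggregate archives, the three steps can be wrapped in a function that takes an already opened session. The sketch below uses only the method names shown in completelyAggregate. The function fully_aggregate is a hypothetical helper, and RecordingArchive is a stand-in object used here so the example runs without Nucular installed.

```python
def fully_aggregate(archive):
    """Run the full aggregation steps, in order, on an open session."""
    archive.aggregateRecent()      # aggregate updates to temporary storage
    archive.moveTransientToBase()  # merge temporary with permanent storage
    archive.cleanUp()              # unlink retired files

class RecordingArchive(object):
    """Stand-in for a session: records which methods were called."""

    def __init__(self):
        self.calls = []

    def aggregateRecent(self):
        self.calls.append("aggregateRecent")

    def moveTransientToBase(self):
        self.calls.append("moveTransientToBase")

    def cleanUp(self):
        self.calls.append("cleanUp")

stub = RecordingArchive()
fully_aggregate(stub)
# stub.calls is ['aggregateRecent', 'moveTransientToBase', 'cleanUp']
```

With Nucular available, the same function would presumably be called as fully_aggregate(Nucular.Nucular("../testdata/APIExamples")).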