return to index
nucular project page with download links

The nTableSpace wrapper API

It is often convenient to store, retrieve, and modify data which closely models an SQL relational database table set using a Nucular archive. The nTableSpace wrapper makes it easy to store data organized into tables of rows identified by primary keys in the manner of SQL databases. This document briefly describes the nTableSpace module and its usage.

This discussion will be most easily understood by readers with some previous background in SQL relational database technology and the Python programming language.

Quick reference comparison chart

The following table compares methods in the tablespace API with similar SQL language constructs. The similarity is generally not exact -- for example SQL commit frequently guarantees transaction serializability whereas the tablespace.store operation does not guarantee serializability with respect to concurrent accesses to the same archive.

Tablespace Operation Similar SQL construct
Connect to a table space
 
# Connect to a new archive and new table space
ts = nTableSpace.newTableSpace("/tmp/myarchive", "db1")

# OR connect to an existing archive and 
# possibly new table space
ts = nTableSpace.getTableSpace(archive, "db1")
 
CREATE DATABASE DB1;
Define a table
 
# "L" is a unique internal abbreviation for the table
ts.defTable("Sells", "L", "SNO", "PNO")
 
CREATE TABLE SELLS (
... 
PRIMARY KEY (SNO, PNO) );
Insert a row into a table
 
ts.insert("Sells", sno=1, pno=1, misc="always late")
 
INSERT INTO SELLS(SNO, PNO, MISC)
VALUES (1, 1, 'always late');
Retrieve rows matching a simple query
 
dictionaryList = ts.match("Supplier", sname='bongo')
 
SELECT *
FROM SUPPLIER
WHERE SNAME='bongo';
Delete rows matching a simple query.
 
ts.deleteByValues("Supplier", sname='bongo')
 
DELETE FROM SUPPLIER 
WHERE SNAME='bongo';
Delete rows with matching primary key
 
# optimized delete: sno is the primary key
ts.delete("Supplier", sno=31)
 
DELETE FROM SUPPLIER 
WHERE SNO=31;
Emulating table joins
 
# joins must be emulated using python code...
def partsForSupplier(sname):
        for sdict in ts.match("Supplier", sname=sname):
            snum = sdict["sno"]
            for spdict in ts.match("Sells", sno=snum):
                pnum = spdict["pno"]
                for pdict in ts.match("Part", pno=pnum):
                    yield pdict["pname"]
pblake = list(partsForSupplier("blake"))
 
SELECT P.PNAME
FROM (SUPPLIER AS S INNER JOIN SELLS AS L
      ON S.SNO=L.SNO) INNER JOIN PARTS AS P
      ON P.PNO=L.PNO
WHERE S.SNAME='blake';
Constructing complex queries
 
# this comparison is very loosely analogous
boiledSelect = ts.selectByValues("Supplier")
boiledSelect.getQuery().anyWord("boil")
boiledSelect()
print "boiled suppliers"
print bb
 
SELECT *
FROM SUPPLIER
WHERE SNAME LIKE "%boil%"
   OR CITY LIKE "%boil%"
   OR ...and so forth for all string columns...
Update rows matching a complex query
 
ts.selectByValues("Supplier", sname="blake").updateSet(city="chico")
 
UPDATE SUPPLIER
SET CITY='chico'
WHERE SNAME='blake';
Delete rows using a complex query
 
ts.selectByValues("Supplier", sname="blake").delete()
 
DELETE FROM SUPPLIER 
WHERE SNAME='blake';
Drop a table
 
ts.dropTable("Sells")
 
DROP TABLE SELLS;
Erase a table space
 
ts.dropAll()
 
DROP DATABASE DB1;
Store modifications
 
ts.store()
 
COMMIT;
Optimize the archive
 
# aggregate and optimize updates.
ts.cleanUp(full=True)
 
-- no standard SQL analogue.


Note that there is no method for creating indexes on tables because the underlying Nucular archive infrastructure automatically indexes everything by default.

The remainder of this document discusses these operations in greater detail.

Creating a brand new archive and populating a table space

To make a new archive and a new tablespace at once use
    ts = nTableSpace.newTableSpace(archive, prefix)
where archive is a directory path for the archive and prefix is a unique prefix to use for this table space within the archive. Bear in mind the following cautions The interaction below initializes a new archive at the directory path /tmp/supplier_parts and creates a table space with prefix SP in the archive using the nTableSpace.newTableSpace function.
    from nucular import nTableSpace
    archive = "/tmp/supplier_parts"
    prefix = "SP"
    ts = nTableSpace.newTableSpace(archive, prefix)
    ts.defTable("Supplier", "S", "SNO")
    ts.insert("Supplier", sno=1, sname="smith", city="london", misc="a very unpleasant person")
    print ts.dumpString()
    ts.store()

Defining a table

The ts.defTable("Supplier", "S", "SNO") operation defines a table named Supplier to be managed by the table space ts with primary key attributes SNO. The calling sequence for defTable is:
    tablespace.defTable(tableName, tableAbbrev, *keyAttributes):
where

Inserting a row

Once the table is created the insert operation
    ts.insert("Supplier", sno=1, sname="smith", city="london", 
                          misc="a very unpleasant person")
Adds a row to the Supplier table, defining the primary key and a number of other descriptive attributes. Each insert must define the primary key but there are no restrictions on the number or names for other attributes in a table row.

Dumping the tablespace to a string for testing

The print ts.dumpString() produces the dump output
TableSpace dump ts prefix='SP'
    table: 'Supplier'
        prefix='S'
        keylist=['SNO']
        'SPS1':
            'city' : 'london'
            'i' : 'SPS1'
            'misc' : 'a very unpleasant person'
            'sname' : 'smith'
            'sno' : 1
            'tablename' : 'Supplier'
which lists the table space content -- showing one table with one row.

Always store any modifications!

Any modifications to a table space must be stored or the modifications will not be made permanent.
    ts.store()
If a program modifies a table space and fails to call the store method the modifications will be discarded.

Attaching to a tablespace in an existing archive

If you want to use an existing archive to store your new table space or if you want to access an existing table space in an existing archive use the getTableSpace function. The operation ts = nTableSpace.getTableSpace(archive, prefix) will use an existing table space in the archive which uses the prefix, or it will create one if there is no such table space in the archive.

The following interaction reconnects to the table space created above and adds two more tables as well as many more rows to the table space.

    archive = "/tmp/supplier_parts"
    prefix = "SP"
    ts = nTableSpace.getTableSpace(archive, prefix)
    ts.insert("Supplier", sno=2, sname="jones", city="paris", 
                          misc="likes bunnies boiled broiled or in a stew")
    ts.insert("Supplier", sno=3, sname="adams", city="vienna", 
                          misc="owns and sells bunnies as a hobby")
    ts.insert("Supplier", sno=31, sname="adams", city="vienna", 
                          misc="this is a typo")
    ts.insert("Supplier", sno=4, sname="blake", city="rome", 
                          misc="has no discernible personality")
    ts.insert("Supplier", sno=5, sname="bongo", city="mahwah", 
                          misc="erroneous")

    ts.defTable("Sells", "L", "SNO", "PNO")
    ts.insert("Sells", sno=1, pno=1, misc="always late")
    ts.insert("Sells", sno=1, pno=2)
    ts.insert("Sells", sno=2, pno=4)
    ts.insert("Sells", sno=3, pno=1)
    ts.insert("Sells", sno=3, pno=3)
    ts.insert("Sells", sno=3, pno=2)
    ts.insert("Sells", sno=4, pno=3)
    ts.insert("Sells", sno=4, pno=4)

    ts.defTable("Part", "P", "PNO")
    ts.insert("Part", PNO=1, PNAME="Screw", PRICE=10)
    ts.insert("Part", PNO=2, PNAME="Nut", PRICE=8)
    ts.insert("Part", PNO=3, PNAME="Bolt", PRICE=15)
    ts.insert("Part", PNO=4, PNAME="Cam", PRICE=25)

Extracting rows matching a pattern

To extract rows from a table based on values for attributes from a table space, use the tablespace.match method. The following example extracts all entries from the Supplier table where the sname is "bongo".
    test = ts.match("Supplier", sname='bongo')
    print "bongo match", test
The result generated by the method is a list of dictionaries representing the matching rows. The print statement displays this list as follows:
    bongo match [{'city': 'mahwah', 'sname': 'bongo', 'i': 'SPS5', 
        'sno': 5, 'misc': 'erroneous', 'tablename': 'Supplier'}]
The calling sequence for match allows the specification of values for any number of attributes as follows
   ts.match(tableName, **argsDict)
Here the argsDict maps attribute names to the desired values. Note that the attributes and values must match on exact equality. See the selectByValues method for a more general query mechanism which permits matching by ranges of values, prefixes of strings, or contained word prefixes.

Deleting rows matching a pattern

The deleteByValues removes rows with attributes that match specified values from a table.
    ts.deleteByValues("Supplier", sname='bongo')
    test = ts.match("Supplier", sname='bongo')
    print "bongo match after delete", test
In this case the print statement reports
    bongo match after delete []
The calling sequence for deleteByValues is similar to the match calling sequence
    ts.deleteByValues(tableName, **argsDict)
Note that the important special case ts.deleteByValues("Supplier") deletes all rows from the supplier table.

The deleteByValues method only supports attribute equality matches. The selectByValues query mechanism described below provides a more general method for deleting rows based on other predicates.

Deleting a row by primary key

In the case where the primary key for the table is known use the tablespace.delete method to delete rows faster (by a constant factor) than using selectByValue
    ts.delete("Supplier", sno=31)

Traversing multiple tables (without joins)

The nTableSpace module is a thin wrapper over the underlying Nucular archive infrastructure, and as such provides no methods for automatically compiling complex join operations, developing query plans, and so forth. However by combining the single table query operations offered by the tablespace abstraction programs can emulate join operations, for example, as follows.
    def partsForSupplier(sname):
        for sdict in ts.match("Supplier", sname=sname):
            snum = sdict["sno"]
            for spdict in ts.match("Sells", sno=snum):
                pnum = spdict["pno"]
                for pdict in ts.match("Part", pno=pnum):
                    yield pdict["pname"]

    pblake = list(partsForSupplier("blake"))
    print "parts for blake", pblake
More complex single table queries with full text searching As mentioned above the selectByValues method allows the creation of more complex queries than the methods described above. In particular the method call
    sel = ts.selectByValues(Tablename, **attributeValues)
Yields an unevaluated selection object sel which can be refined by adding additional constraints. For example, the boiledSelect query below selects all entries in the Supplier table which contain the word prefix boil in any attribute.
    boiledSelect = ts.selectByValues("Supplier")
    q = boiledSelect.getQuery()
    q.anyWord("boil")

    bb = boiledSelect()
    print "boiled suppliers", bb
The method call boiledSelect.getQuery() extracts the Nucular query object associated with the table space selection and the method call q.anyWord("boil") restricts the query to match only entries with the word prefix boil in free text (i.e., in any value in the entry). In addition to anyWord, Nucular queries support other restrictions as described in the Nucular API Summary Document. A selection can be restricted by any of those methods in a similar manner.

The call boiledSelect() evaluates the selection and the resulting list of dictionaries value prints as:

    boiled suppliers
    [{'city': 'paris', 'sname': 'jones', 'i': 'SPS2', 'sno': 2, 
      'misc': 'likes bunnies boiled broiled or in a stew', 'tablename': 'Supplier'}]

Updating or deleting rows matching a complex query

A program can also use selection objects created by the ts.selectByValues method to update or delete matching rows. For example the following interaction sets the city to
chico for all rows matching the boiledSelect selection (all rows in the Supplier table containing the word prefix boil anywhere).
    boiledSelect.updateSet(city="chico")
    print "updated boiled suppliers"
    print boiledSelect()
The printed selection evaluation reflects the change as follows:
updated boiled suppliers
[{'city': 'chico', 'sname': 'jones', 'i': 'SPS2', 'sno': 2, 
  'misc': 'likes bunnies boiled broiled or in a stew', 'tablename': 'Supplier'}]
The following delete removes all rows matching the boiledSelect selection.
    boiledSelect.delete()

Dropping tables and whole table spaces

The table space abstraction also provides methods for dropping a table definition (together with any rows in the table) or a whole table space
    ts.dropTable("Sells")
    ts.dropAll()

Again: always store changes

Just for emphasis, please remember that changes to a table space will only be made permanent in the containing archive when the store operation is called.
    ts.store()
Any unstored operations will be discarded.

Archive cleanup

The table space interface also provides a simplified interface which permits the optimization of the underlying archive using the aggregation operations explained in the overview document.
     ts.cleanUp(full=True)
A non-full cleanUp will only aggregate saved changes to the transient level. A full cleanUp will aggregate the saved changes to the final index.


End of The nTableSpace wrapper API
return to index