
Boolean query syntax and API

Nucular sessions offer a streamlined interface for extracting information from an archive using strings formatted in a special-purpose boolean query syntax.

This document describes the syntax and usage of boolean query strings, illustrated with examples that use the Nucular archive built in the API Examples document.

Boolean query API

A program may use a boolean query string to extract a result object from a Nucular session as follows:
result = session.result(booleanString)
As explained in the API summary document, result objects allow a program to retrieve some of the matches to a query without extracting all of them, which can allow the program to run more quickly.

For smaller queries it is often more convenient to get all matches in one step. To find all matching dictionaries for a boolean query string, use the dictionaries method of the session object. For example, the following Python command prompt interaction finds all dictionaries mentioning "pizza" in the example archive:

$ python
Python 2.5 ...
>>> from nucular import Nucular
>>> session = Nucular.Nucular("../testdata/APIExamples")
>>> for d in session.dictionaries("pizza"):
...     print d
... 
{'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}
>>> 

High level boolean query syntax

The boolean query string syntax follows standard conventions for constructing boolean expressions: an expression is a disjunction of conjunctions, and each conjunction is formed from atoms, negations of atoms, or parenthesized expressions. Atoms encode the basic pattern matching primitives supported by Nucular.

Disjunctions

A boolean query string may request the union of results matching a sequence of alternative expressions using the disjunction notation.



For example the following interaction uses the query string "bunny | snail" to find all dictionaries mentioning "bunny" or "snail" anywhere.
>>> for d in session.dictionaries("bunny | snail"):
...     print d
... 
{'food': 'just delicious with garlic', 'i': '456BUNNY', 'name': 'bunny rabbit', 'descr': 'cute and cuddly'}
{'i': 'Lola Waller', 'p': '333-2222', 'g': 'female', 'n': 'thinks snails are delicious'}
>>> 
Here the matches for the query string "bunny | snail" are the matches for the query string bunny unioned with the matches for the query string snail.
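The union behavior can be illustrated with a toy in-memory model. This is a sketch only, not how Nucular evaluates queries: RECORDS, mentions and ids_matching are hypothetical names, the records are abbreviated copies of entries from the example output, and the matching rule (a case-insensitive prefix test on whitespace-separated words) is an assumption chosen because "snail" matches "snails" above.

```python
# Toy model of a disjunction; not Nucular's implementation.
RECORDS = [
    {'i': '456BUNNY', 'name': 'bunny rabbit', 'descr': 'cute and cuddly'},
    {'i': 'Lola Waller', 'n': 'thinks snails are delicious'},
    {'i': '789KITTEN', 'name': 'kitten', 'descr': 'cute and cuddly'},
]

def mentions(record, word):
    """True if some word in some attribute value starts with `word` (assumed rule)."""
    word = word.lower()
    for value in record.values():
        for token in value.lower().split():
            if token.startswith(word):
                return True
    return False

def ids_matching(word):
    """Set of 'i' identifiers of the records that mention `word`."""
    return set(r['i'] for r in RECORDS if mentions(r, word))

# "bunny | snail" behaves like the union of the two single-word result sets.
union_ids = ids_matching("bunny") | ids_matching("snail")
print(sorted(union_ids))
```

The kitten record is left out of the union because it mentions neither word.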

Conjunctions

A boolean query string may request the intersection of results matching a sequence of expressions using the conjunction notation.



For example the following interaction uses the query string "delicious ~snail" to request all dictionaries which mention "delicious" but don't mention "snail".
>>> for d in session.dictionaries("delicious ~snail"):
...     print d
... 
{'food': 'tastes delicious, like chicken', 'i': '123FROG', 'name': 'frog', 'descr': 'little green slimy things'}
{'food': 'just delicious with garlic', 'i': '456BUNNY', 'name': 'bunny rabbit', 'descr': 'cute and cuddly'}
{'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}
>>> 
Here the query string "delicious ~snail" evaluates to the matches for the query string delicious intersected with the matches for the query string ~snail.

Atoms

The basic pattern matching primitives supported by Nucular are expressed using atomic expressions, or atoms.



Note that larger expressions may be grouped together using parentheses. For example the following interaction extracts the dictionaries matching either "delicious" or "cuddly" but not matching "snail".
>>> for d in session.dictionaries("(delicious | cuddly) ~snail"):
...     print d
... 
{'food': 'tastes delicious, like chicken', 'i': '123FROG', 'name': 'frog', 'descr': 'little green slimy things'}
{'food': 'just delicious with garlic', 'i': '456BUNNY', 'name': 'bunny rabbit', 'descr': 'cute and cuddly'}
{'i': '789KITTEN', 'note': 'not edible', 'name': 'kitten', 'descr': 'cute and cuddly'}
{'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}
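The effect of the parentheses can be illustrated with ordinary Python sets. The sketch below is not Nucular code: the three sets are written out by hand from the 'i' values in the example outputs in this document (the real archive may hold more entries), and it assumes, following the grammar description above, that conjunction binds more tightly than disjunction when no parentheses are given.

```python
# Hand-written result sets of 'i' values, taken from the example outputs.
delicious = set(['123FROG', '456BUNNY', 'Sandy Waller', 'Lola Waller'])
cuddly = set(['456BUNNY', '789KITTEN'])
snail = set(['Lola Waller'])

# "(delicious | cuddly) ~snail": take the union first, then drop the "snail" matches.
grouped = (delicious | cuddly) - snail

# "delicious | cuddly ~snail" (assumed parse): only "cuddly" is restricted.
ungrouped = delicious | (cuddly - snail)

print(sorted(grouped))
print(sorted(ungrouped - grouped))
```

Under that assumption the two forms differ only in the Lola Waller entry, which mentions both "delicious" and "snails".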
The discussions to follow present the individual atomic notations.

Searching for a word anywhere

A query string containing a single word matches archive entries containing that word anywhere.



For example, the following interaction finds all dictionaries containing the word "cute":
>>> for d in session.dictionaries("cute"):
...     print d
... 
{'food': 'just delicious with garlic', 'i': '456BUNNY', 'name': 'bunny rabbit', 'descr': 'cute and cuddly'}
{'i': '789KITTEN', 'note': 'not edible', 'name': 'kitten', 'descr': 'cute and cuddly'}
The boolean query parser recognizes any alphanumeric string, such as this1, as a name. It also recognizes a string enclosed in double quotes, such as "this one", as a name.
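The two forms of name can be pictured with a small regular expression. This is a hypothetical sketch of the rule as stated above, not the parser Nucular uses; the pattern NAME and the helper names are invented for illustration, and how the real parser treats escapes or non-ASCII characters is not covered.

```python
import re

# Hypothetical pattern: a double-quoted string, or a run of letters and digits.
NAME = re.compile(r'"([^"]*)"|([A-Za-z0-9]+)')

def names(text):
    """Return the names found in `text`, with surrounding quotes removed."""
    found = []
    for quoted, bare in NAME.findall(text):
        # Exactly one of the two groups is non-empty for a non-empty name.
        found.append(quoted if quoted else bare)
    return found

print(names('this1 "this one"'))
```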

Attribute prefix matches

A query string may look for entries containing an attribute whose value starts with a given prefix.



For example the following interaction finds all dictionaries where the "p" attribute starts with "3".
>>> for d in session.dictionaries("p=3.."):
...     print d
... 
{'i': 'Joe Blow', 'p': '333-2222', 'c': 'great at a grill', 'g': 'male'}
{'i': 'Lola Waller', 'p': '333-2222', 'g': 'female', 'n': 'thinks snails are delicious'}
{'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}

Matching attribute words

A query string may look for a word prefix inside an attribute.



For example the following interaction finds all dictionaries where the "p" attribute contains "33" as a word prefix.
>>> for d in session.dictionaries("p:33"):
...     print d
... 
{'i': 'Joe Blow', 'p': '333-2222', 'c': 'great at a grill', 'g': 'male'}
{'i': 'Joe Smithers', 'p': '111-3333', 'c': "can't cook", 'g': 'male'}
{'i': 'Lola Waller', 'p': '333-2222', 'g': 'female', 'n': 'thinks snails are delicious'}
{'i': 'Sally Smithers', 'p': '111-3333', 'c': 'uses too much salt', 'g': 'female'}
{'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}
>>> 
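The difference between the two notations, p=3.. and p:33, shows up in the '111-3333' entries, which match the word prefix query but not the value prefix query. The sketch below models that difference in plain Python. It is not Nucular code: value_prefix and word_prefix are hypothetical helpers, and splitting words on non-alphanumeric characters is an assumption made to reproduce the outputs shown.

```python
import re

PEOPLE = [
    {'i': 'Joe Blow', 'p': '333-2222'},
    {'i': 'Joe Smithers', 'p': '111-3333'},
]

def value_prefix(record, attr, prefix):
    """Model of attr=prefix.. : the whole attribute value starts with prefix."""
    return record.get(attr, '').startswith(prefix)

def word_prefix(record, attr, prefix):
    """Model of attr:prefix : some word inside the value starts with prefix.
    Splitting on non-alphanumeric characters is an assumption."""
    words = re.findall(r'[A-Za-z0-9]+', record.get(attr, ''))
    return any(w.startswith(prefix) for w in words)

print([r['i'] for r in PEOPLE if value_prefix(r, 'p', '3')])
print([r['i'] for r in PEOPLE if word_prefix(r, 'p', '33')])
```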

Exact attribute matches

A query string may look for entries containing an attribute with an exact value.



For example, the following interaction finds all dictionary entries containing a food attribute whose value exactly matches "tastes delicious, like chicken".
>>> for d in session.dictionaries('food="tastes delicious, like chicken"'):
...     print d
... 
{'food': 'tastes delicious, like chicken', 'i': '123FROG', 'name': 'frog', 'descr': 'little green slimy things'}
>>> 
This example also illustrates that a name may be enclosed in double quotes like "tastes delicious, like chicken".

Attribute value range matches

A boolean query string may look for entries containing attribute values that fall inside an alphabetical range.



For example the following query finds entries where the i attribute lies between Sally and Silly.
>>> for d in session.dictionaries("i=[Sally:Silly]"):
...     print d
... 
{'i': 'Sally Smithers', 'p': '111-3333', 'c': 'uses too much salt', 'g': 'female'}
{'i': 'Sandy Waller', 'p': '333-2222', 'c': 'delicious pizza', 'g': 'female'}
>>> 
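The range test can be modeled with ordinary string comparison. This is a sketch under assumptions, not Nucular's implementation: in_range is a hypothetical helper, and treating both bounds as inclusive is a guess that the example output neither confirms nor rules out.

```python
NAMES = ['Joe Blow', 'Lola Waller', 'Sally Smithers', 'Sandy Waller']

def in_range(value, low, high):
    """Plain string comparison; inclusive bounds are an assumption."""
    return low <= value <= high

# Model of i=[Sally:Silly] applied to a few 'i' values from the example archive.
print([n for n in NAMES if in_range(n, 'Sally', 'Silly')])
```

'Sally Smithers' falls inside the range because a string sorts after any of its own proper prefixes, so 'Sally' is less than 'Sally Smithers'.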

Proximity search

A boolean query may search for patterns where words occur in a specific order separated by a limited number of other words.



For example the following query finds the entries containing the word "cute" followed by the word "cuddly" separated by no more than 3 intervening words.
>>> for d in session.dictionaries("<3>cute..cuddly"):
...     print d
... 
{'food': 'just delicious with garlic', 'i': '456BUNNY', 'name': 'bunny rabbit', 'descr': 'cute and cuddly'}
{'i': '789KITTEN', 'note': 'not edible', 'name': 'kitten', 'descr': 'cute and cuddly'}
>>> 
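The proximity rule described above can be sketched over a single text value. This is a toy model, not Nucular's algorithm: near is a hypothetical helper, and it compares whole lowercase words split on whitespace, which is a simplifying assumption.

```python
def near(text, first, second, limit):
    """True if `first` is followed by `second` with at most `limit`
    other words between them."""
    words = text.lower().split()
    for i, word in enumerate(words):
        if word == first:
            # The next limit + 1 words cover every allowed position of `second`.
            if second in words[i + 1:i + limit + 2]:
                return True
    return False

print(near('cute and cuddly', 'cute', 'cuddly', 3))
print(near('cuddly and cute', 'cute', 'cuddly', 3))
```

The second call fails because the order of the two words matters.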

Negations

A query may remove matches to another query from the result set using a negation notation.



For example the following query finds the entries containing the word cute and the word cuddly but not containing the word delicious.
>>> for d in session.dictionaries("cute cuddly ~delicious"):
...     print d
... 
{'i': '789KITTEN', 'note': 'not edible', 'name': 'kitten', 'descr': 'cute and cuddly'}
>>> 
Negations are permitted only in conjunction with positive constraints. For example, you cannot search for entries that don't contain the word delicious unless you also provide a positive search condition:
>>> for d in session.dictionaries("~delicious"):
...     print d
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nucular/Nucular.py", line 551, in dictionaries
    result = self.result(queryString)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nucular/Nucular.py", line 543, in result
    result = booleanQuery.booleanResult(queryString, self)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nucular/booleanQuery.py", line 13, in booleanResult
    result = getResult(parse, session, queryString)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nucular/booleanQuery.py", line 18, in getResult
    assert indicator!="NOT", "unrestricted negation not permitted "+repr((qs, parse))
AssertionError: unrestricted negation not permitted ('~delicious', ['NOT', ['anyWord', 'delicious']])
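A program that accepts query strings from users may prefer to handle this failure rather than let the traceback escape. The sketch below shows one way to do that. The helper dictionaries_or_none is hypothetical and not part of Nucular; it relies only on the behavior visible in the traceback above, namely that session.dictionaries raises AssertionError. FakeSession is a stand-in so the example runs without an archive.

```python
def dictionaries_or_none(session, query):
    """Return a list of matches, or None if the session rejects the query
    with an AssertionError (as the unrestricted negation above does)."""
    try:
        return list(session.dictionaries(query))
    except AssertionError:
        return None


class FakeSession(object):
    """Stand-in for a Nucular session, for demonstration only."""

    def dictionaries(self, query):
        # Mimic the rejection of a query that starts with a bare negation.
        assert not query.strip().startswith("~"), "unrestricted negation not permitted"
        return [{'i': '789KITTEN', 'descr': 'cute and cuddly'}]


session = FakeSession()
print(dictionaries_or_none(session, "~delicious"))
print(len(dictionaries_or_none(session, "cute ~delicious")))
```

Catching AssertionError this broadly would also hide unrelated assertion failures, so a real program might inspect the message before deciding to ignore it.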


