Sift API Reference

Koji Smoky Dingo - Filtering Language Sifty Dingo

This is a mini-language based on S-Expressions used for filtering sequences of dict data. The core language only supports some simple logical constructs and a facility for setting and checking flags. To become useful, the language must be extended with predicates specific to the schema of the data being filtered.

The Sifty Dingo mini-language has nothing to do with the Sifty project, nor the Sieve email filtering language. I just thought that Sifter and Sieve were good names for something that filters stuff.

author:

Christopher O'Brien <obriencj@gmail.com>

license:

GPL v3

class Flagged(sifter, *exprs)

Bases: VariadicSieve

Usage: (flagged NAME [NAME...])

filters for info dicts which have been marked with any of the given named flags

aliases: Sequence[str] = ('?',)
check(session, info)

Override to return True if the predicate matches the given info dict.

This is used by the default run implementation in a filter. Only the info dicts which return True from this method will be included in the results.

Parameters:

info -- The info dict to be checked.

name = 'flagged'

class Flagger(sifter, flag, *exprs)

Bases: LogicAnd

Usage: (flag NAME EXPR [EXPR...])

filters for info dicts which match all of the sub expressions, and marks them with the given named flag.

name = 'flag'
run(session, info_dicts)

Use this Sieve instance to select and return a subset of the info_dicts sequence.

class IntStrSieve(sifter, *tokens)

Bases: Sieve

A Sieve that requires all of its arguments to be ints or strs. Calls ensure_all_int_or_str on tokens.

class ItemPathSieve(sifter, path, *values)

Bases: Sieve

Usage: (item PATH [VALUE...])

Resolves the given PATH on each element and checks that any of the given values match. If any do, the element passes.
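As a rough illustration of the resolve-then-compare behaviour described above, the following stand-alone sketch walks a path through nested data and tests the result against candidate values. It does not use the library: resolve_path and item_path_check are hypothetical helpers, the path is simplified to a tuple of keys and indexes rather than the PATH syntax the sieve actually parses, and treating an empty value list as an existence test is an assumption.

```python
def resolve_path(data, path):
    """Follow each step of path into data; return None if a step is missing."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return None
    return current


def item_path_check(data, path, *values):
    """True if the resolved item equals any of the given values."""
    found = resolve_path(data, path)
    if found is None:
        return False
    if not values:
        # Assumption: with no values, only require that the path resolves.
        return True
    return any(found == value for value in values)


build = {"id": 7, "extra": {"source": {"kind": "git"}}, "tags": ["a", "b"]}

kind_ok = item_path_check(build, ("extra", "source", "kind"), "svn", "git")
tag_ok = item_path_check(build, ("tags", 1), "b")
missing = item_path_check(build, ("extra", "nope"), "git")
```

Here kind_ok and tag_ok are True because one of the candidate values equals the resolved item, while missing is False because the path cannot be resolved.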

check(session, data)

Override to return True if the predicate matches the given info dict.

This is used by the default run implementation in a filter. Only the info dicts which return True from this method will be included in the results.

Parameters:

data -- The info dict to be checked.

name = 'item'

class ItemSieve(sifter, *exprs)

Bases: VariadicSieve

A VariadicSieve which performs a comparison by fetching a named key from the info dict.

Subclasses must provide a field attribute which will be used as a key to fetch a comparison value from any checked info dicts.

If a pattern is specified, then the predicate matches if the info dict has an item by the given field key, and the value of that item matches the pattern.

If a pattern is absent, then this predicate will only check that the given field key exists and is not None.
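The two cases above (with and without a pattern) can be sketched as a plain function. This is an illustration only, not the library's code: item_matches is a hypothetical name, and the pattern is assumed to be either a plain value compared by equality or a compiled regular expression from the standard re module, which stands in for the library's own matcher types.

```python
import re


def item_matches(info, field, pattern=None):
    """Mimic the described check for a single field and optional pattern."""
    value = info.get(field)
    if value is None:
        # Covers both a missing key and an explicit None.
        return False
    if pattern is None:
        # No pattern: existence of a non-None value is enough.
        return True
    if isinstance(pattern, re.Pattern):
        return pattern.search(str(value)) is not None
    return value == pattern


task = {"id": 10, "method": "build", "owner": None}

has_method = item_matches(task, "method")
has_owner = item_matches(task, "owner")
is_build = item_matches(task, "method", "build")
starts_with_b = item_matches(task, "method", re.compile(r"^b"))
```

In this sketch has_owner is False even though the key is present, because its value is None.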

check(session, info)

Override to return True if the predicate matches the given info dict.

This is used by the default run implementation in a filter. Only the info dicts which return True from this method will be included in the results.

Parameters:

info -- The info dict to be checked.

abstract property field

class Logic(sifter, *exprs)

Bases: Sieve

check = None

class LogicAnd(sifter, *exprs)

Bases: Logic

Usage: (and EXPR [EXPR...])

filters for info dicts which match all sub expressions.

name = 'and'
run(session, info_dicts)

Use this Sieve instance to select and return a subset of the info_dicts sequence.

class LogicNot(sifter, *exprs)

Bases: Logic

Usage: (not EXPR [EXPR...])

filters for info dicts which match none of the sub expressions.

aliases: Sequence[str] = ('!',)
name = 'not'
run(session, info_dicts)

Use this Sieve instance to select and return a subset of the info_dicts sequence.

class LogicOr(sifter, *exprs)

Bases: Logic

Usage: (or EXPR [EXPR...])

filters for info dicts which match any of the sub expressions.

name = 'or'
run(session, info_dicts)

Use this Sieve instance to select and return a subset of the info_dicts sequence.
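The selection rules of the three logic sieves (and, not, or) can be compared side by side with a small stand-alone sketch. It is not the library's implementation: the functions below are hypothetical, sub-expressions are modeled as plain one-argument predicates, and each function works on the whole sequence in the way the run overrides do.

```python
def logic_and(predicates, info_dicts):
    """Keep the items for which every predicate is true."""
    return [i for i in info_dicts if all(p(i) for p in predicates)]


def logic_or(predicates, info_dicts):
    """Keep the items for which at least one predicate is true."""
    return [i for i in info_dicts if any(p(i) for p in predicates)]


def logic_not(predicates, info_dicts):
    """Keep the items for which no predicate is true."""
    return [i for i in info_dicts if not any(p(i) for p in predicates)]


def is_even(info):
    return info["id"] % 2 == 0


def is_big(info):
    return info["id"] > 2


items = [{"id": n} for n in range(1, 6)]

both = [i["id"] for i in logic_and([is_even, is_big], items)]
either = [i["id"] for i in logic_or([is_even, is_big], items)]
neither = [i["id"] for i in logic_not([is_even, is_big], items)]
```

With ids 1 through 5, both is [4], either is [2, 3, 4, 5], and neither is [1].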

class MatcherSieve(sifter, *tokens)

Bases: Sieve

A Sieve that requires all of its arguments to be matchers. Calls ensure_all_matcher on tokens.

class Sieve(sifter, *tokens, **options)

Bases: object

The abstract base type for all Sieve expressions.

A Sieve is a callable instance which is passed a session and a sequence of info dicts, and returns a filtered subset of those info dicts.

The default run implementation will trigger the prep method first, and then use the check method on each info dict to determine whether it should be included in the results or not. Subclasses can therefore easily write just the check method.

The prep method is there in the event that additional queries should be called on the whole set of incoming data (enabling multicall optimizations).

Sieves are typically instantiated by a Sifter when it compiles the sieve expression string.

Sieve subclasses must provide a name class property or attribute. This name is the key by which the source invokes the Sieve. For example, a source of (check-enabled X) expects the Sifter to have a Sieve class available with a name of "check-enabled".

Parameters:

sifter (Sifter)
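The prep-then-check flow of the default run implementation can be sketched without the library. This is a minimal stand-alone illustration under stated assumptions, not the actual class: MiniSieve and HasOwner are hypothetical names, the constructor does not take a sifter, and the session is simply None.

```python
class MiniSieve:
    """Hypothetical stand-in showing the prep/check/run contract."""

    def prep(self, session, info_dicts):
        # Hook for bulk pre-loading; does nothing by default.
        pass

    def check(self, session, info):
        raise NotImplementedError

    def run(self, session, info_dicts):
        # Materialize once so prep and check see the same items.
        work = list(info_dicts)
        self.prep(session, work)
        return [info for info in work if self.check(session, info)]

    # Calling the instance is the same as calling run.
    __call__ = run


class HasOwner(MiniSieve):
    """Hypothetical predicate: keeps dicts whose owner is one of the given names."""

    def __init__(self, *owners):
        self.wanted = set(owners)
        self.prepped = 0

    def prep(self, session, info_dicts):
        # A real sieve might issue one batched query here.
        self.prepped = len(info_dicts)

    def check(self, session, info):
        return info.get("owner") in self.wanted


builds = [
    {"id": 1, "owner": "alice"},
    {"id": 2, "owner": "bob"},
    {"id": 3, "owner": "alice"},
]
sieve = HasOwner("alice")
kept = sieve(None, builds)
```

The subclass only supplies check (and optionally prep); the filtering loop itself lives in the base class.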

aliases: Sequence[str] = ()
check(session, info)

Override to return True if the predicate matches the given info dict.

This is used by the default run implementation in a filter. Only the info dicts which return True from this method will be included in the results.

Parameters:
  • info (ST) -- The info dict to be checked.

  • session (ClientSession)

Return type:

bool

get_cache(key)

Gets a cache dict from the sifter using the name of this sieve and the given key (which must be hashable)

The same cache dict will be returned for this key until the sifter has its reset method invoked.

Parameters:

key (str)

Return type:

dict

get_info_cache(info)

Gets a cache dict from the sifter using the name of this sieve and the sifter's designated key for the given info dict. The default sifter key will get the "id" value from the info dict.

The same cache dict will be returned for this info dict until the sifter has its reset method invoked.

Parameters:

info (ST)

Return type:

dict

abstract property name: str
prep(session, info_dicts)

Override if some bulk pre-loading operations are necessary.

This is used by the default run implementation to allow bulk operations to be performed over the entire set of info dicts to be filtered, rather than one at a time in the check method

Parameters:
  • session (ClientSession)

  • info_dicts (Iterable[ST])

run(session, info_dicts)

Use this Sieve instance to select and return a subset of the info_dicts sequence.

Parameters:
  • session (ClientSession)

  • info_dicts (Iterable[ST])

Return type:

Iterable[ST]

class Sifter(sieves, source, key='id', params=None)

Bases: object

A flagging data filter, compiled from an s-expression syntax.

Sifter instances are callable, and when invoked with a session and a list of info dicts will perform filtering tests on the data to determine which items match the predicates from the source syntax.

Parameters:
  • sieves (Dict[str, Type[Sieve]] | Iterable[Type[Sieve]]) -- list of classes to use in compiling the source str. Each class should be a subclass of Sieve. The name attribute of each class is used as the lookup value when compiling a sieve expression

  • source (str | Reader) -- Source from which to parse Sieve expressions

  • key (Callable[[Any], Any] | Any) -- Unique hashable identifier key for the info dicts. This is used to deduplicate or otherwise correlate the incoming information. By default, the "id" value is used.

  • params (Dict[str, str]) -- Map of text substitutions for quoted strings

get_cache(cachename, key)

Flexible storage for caching data in a sifter. Sieves can use this to record data about individual info dicts, or to cache results from arbitrary koji session calls.

This data is cleared when the reset method is invoked.

Return type:

dict
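The caching contract (the same dict comes back for the same name and key until reset is called) can be illustrated with a small stand-alone sketch. CacheBook is a hypothetical class, and storing the caches in a single dict keyed by the (cachename, key) pair is an assumption about one possible layout, not a description of the library's internals.

```python
class CacheBook:
    """Hypothetical stand-in for the sifter's cache storage."""

    def __init__(self):
        self._caches = {}

    def get_cache(self, cachename, key):
        # Create the dict on first request, then keep handing back the same one.
        return self._caches.setdefault((cachename, key), {})

    def reset(self):
        # Drop every cache; later requests start from empty dicts.
        self._caches.clear()


book = CacheBook()
first = book.get_cache("owner", "user-lookups")
first["alice"] = 101
second = book.get_cache("owner", "user-lookups")
same_before_reset = first is second

book.reset()
after_reset = book.get_cache("owner", "user-lookups")
```

After the reset, after_reset is a fresh empty dict and no longer the object held in first.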

get_info_cache(cachename, data)

Cache associated with a particular info dict.

This data is cleared when the reset method is invoked

Return type:

dict

is_flagged(flagname, data)

True if the data has been flagged with the given flagname, either via a (flag ...) sieve expression, or via set_flag

Parameters:
  • flagname (str)

  • data (ST)

Return type:

bool

reset()

Clears flags and data caches

run(session, info_dicts)

Clears existing flags and runs contained sieves on the given info_dicts.

Parameters:
  • session (ClientSession)

  • info_dicts (Iterable[ST])

Return type:

Dict[str, List[ST]]

set_flag(flagname, data)

Records the given data as having been flagged with the given flagname.

Parameters:
  • flagname (str)

  • data (ST)
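How set_flag, is_flagged, and reset relate to one another can be shown with a stand-alone sketch. FlagBook is a hypothetical class, and recording flags against each info dict's "id" value (mirroring the default key) is an assumption made for the illustration.

```python
from collections import defaultdict


class FlagBook:
    """Hypothetical stand-in for the sifter's flag bookkeeping."""

    def __init__(self, key="id"):
        self.key = key
        self._flags = defaultdict(set)

    def set_flag(self, flagname, data):
        # Assumption: remember the item by its key value, not the dict itself.
        self._flags[flagname].add(data[self.key])

    def is_flagged(self, flagname, data):
        # .get avoids creating an empty entry for unknown flags.
        return data[self.key] in self._flags.get(flagname, ())

    def reset(self):
        self._flags.clear()


book = FlagBook()
build = {"id": 42, "name": "example-1.0"}
other = {"id": 43, "name": "example-1.1"}

book.set_flag("obsolete", build)
flagged_before = book.is_flagged("obsolete", build)
other_flagged = book.is_flagged("obsolete", other)

book.reset()
flagged_after = book.is_flagged("obsolete", build)
```

Only the flagged item reports True, and the flag is gone once reset has been called.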

sieve_exprs()

The list of Sieve expressions in this Sifter

Return type:

List[Sieve]

exception SifterError

Bases: BadDingo

complaint: str = 'Error compiling Sifter'

class SymbolSieve(sifter, *tokens)

Bases: Sieve

A Sieve that requires all of its arguments to be symbols. Calls ensure_all_symbol on tokens.

class VariadicSieve(sifter, *exprs)

Bases: Sieve

Utility class which automatically applies an outer (or ...) when presented with more than one argument.

This allows for example (name foo bar baz) to automatically become (or (name foo) (name bar) (name baz)) while the name sieve only needs to be written to check for a single value.
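The rewrite described above can be sketched on a toy representation. This is an illustration only: expand_variadic is a hypothetical function, and expressions are modeled as plain tuples rather than the objects the library produces when it parses source.

```python
def expand_variadic(expr):
    """Rewrite (name a b c) as (or (name a) (name b) (name c)).

    A single-argument (or zero-argument) expression is returned unchanged.
    """
    head, *args = expr
    if len(args) <= 1:
        return expr
    return ("or", *[(head, arg) for arg in args])


expanded = expand_variadic(("name", "foo", "bar", "baz"))
single = expand_variadic(("name", "foo"))
```

Here expanded is ("or", ("name", "foo"), ("name", "bar"), ("name", "baz")), while single stays ("name", "foo") because there is nothing to wrap.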

ensure_all_int_or_str(values, msg=None)

Checks that all values are either an int, Number, str, or Symbol. Returns each as an int or str as appropriate in a new list. If any value is not an int, Number, str, or Symbol, raises a SifterError.

Parameters:
  • values (Iterable[Any]) -- sequence of values to ensure or convert

  • msg (str | None) -- optional error message for exception raised if a portion of values could not be coerced to an int or str

Return type:

List[int | str]
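The coerce-or-raise pattern shared by the ensure helpers can be sketched as follows. None of these names come from the library: Symbol and SketchError are hypothetical stand-ins for the parser's Symbol type and for SifterError, numbers.Real from the standard library stands in for the parser's Number type, and the coerce functions are illustrative only.

```python
from numbers import Real


class Symbol(str):
    """Hypothetical stand-in for the parser's Symbol type."""


class SketchError(Exception):
    """Hypothetical stand-in for SifterError."""


def coerce_int_or_str(value, msg=None):
    if isinstance(value, Real):
        # Covers int as well as other real numbers.
        return int(value)
    if isinstance(value, str):
        # Covers plain str and the Symbol stand-in.
        return str(value)
    raise SketchError(msg or f"expected int or str, got {value!r}")


def coerce_all_int_or_str(values, msg=None):
    # Build a new list; the first unacceptable value raises.
    return [coerce_int_or_str(value, msg) for value in values]


coerced = coerce_all_int_or_str([3, "4", Symbol("five"), 6.0])

try:
    coerce_all_int_or_str([1, None])
    failed = False
except SketchError:
    failed = True
```

In this sketch the string "4" stays a string; whether the library converts digit-only strings is not covered here.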

ensure_all_matcher(values, msg=None)

Checks that all of the elements in values are either a str, Symbol, Regex, or Glob instance, and returns them as a new list. If not, raises a SifterError.

Parameters:
  • values

  • msg

Return type:

List[str | Matcher]

ensure_all_sieve(values, msg=None)

Checks that all of the elements in values are Sieve instances, and returns them in a new list. If not, raises a SifterError.

Parameters:
  • values

  • msg

Return type:

List[Sieve]

ensure_all_symbol(values, expand=True, msg=None)

Checks that all of the elements in values are Symbols, and returns them as a new list. If not, raises a SifterError.

If expand is True then any SymbolGroup instances will be expanded to their full combination of Symbols and inlined. Otherwise, the inclusion of a SymbolGroup is an error.

Parameters:
  • expand (bool) -- convert any SymbolGroups into their combinant Symbols

  • values (List[Any])

  • msg (str | None)

Return type:

List[Symbol]

ensure_int(value, msg=None)

Checks that value is an int or Number, and returns it as an int. If value is not an int or Number, raises a SifterError.

Parameters:
  • value (Any)

  • msg (str | None)

Return type:

int

ensure_int_or_str(value, msg=None)

Checks that value is either an int, Number, str, or Symbol. Returns an int or str as appropriate. If value is not an int, Number, str, or Symbol, raises a SifterError.

Parameters:
  • value (Any) -- the value to coerce into an int or str

  • msg (str | None) -- optional error message if value cannot be coerced

Return type:

int | str

ensure_matcher(value, msg=None)

Checks that value is either a str, or a Matcher instance, and returns it. If not, raises a SifterError.

Parameters:
  • value (Any)

  • msg (str | None)

Return type:

str | Matcher

ensure_sieve(value, msg=None)

Checks that value is a Sieve instance, and returns it. If not, raises a SifterError.

Parameters:
  • value (Any)

  • msg (str | None)

Return type:

Sieve

ensure_str(value, msg=None)

Checks that value is either an int, str, or Symbol, and returns a str version of it. If value is not an int, str, or Symbol, raises a SifterError.

Parameters:
  • value (Any)

  • msg (str | None)

Return type:

str

ensure_symbol(value, msg=None)

Checks that value is a Symbol, and returns it. If value is not a Symbol, raises a SifterError.

Parameters:
  • value (Any)

  • msg (str | None)

Return type:

Symbol
