Sift API Reference¶
Koji Smoky Dingo - Filtering Language Sifty Dingo
This is a mini-language based on S-Expressions, used for filtering sequences of dict data. The core language supports only simple logical constructs and a facility for setting and checking flags. To be useful, it must be extended with predicates specific to the schema of the data being filtered.
The Sifty Dingo mini-language has nothing to do with the Sifty project, nor the Sieve email filtering language. I just thought that Sifter and Sieve were good names for something that filters stuff.
- author:
Christopher O'Brien <obriencj@gmail.com>
- license:
GPL v3
- class Flagged(sifter, *exprs)[source]¶
Bases:
VariadicSieve
Usage:
(flagged NAME [NAME...])
filters for info dicts which have been marked with any of the given named flags
- check(session, info)[source]¶
Override to return True if the predicate matches the given info dict.
This is used by the default run implementation in a filter. Only the info dicts which return True from this method will be included in the results.
- Parameters:
info -- The info dict to be checked.
- name = 'flagged'¶
- class Flagger(sifter, flag, *exprs)[source]¶
Bases:
LogicAnd
Usage:
(flag NAME EXPR [EXPR...])
filters for info dicts which match all of the sub expressions, and marks them with the given named flag.
- name = 'flag'¶
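The interplay between (flag ...) and (flagged ...) can be illustrated with a small standalone sketch. It does not import kojismokydingo. The flags dict and the helpers flag_matching and flagged_any are hypothetical stand-ins, and items are correlated by their "id" value on the assumption that the default Sifter key is in use.

```python
from collections import defaultdict

# Hypothetical stand-in for the sifter's flag storage: flag name -> set of ids
flags = defaultdict(set)


def flag_matching(name, infos, *predicates):
    """Mimics (flag NAME EXPR [EXPR...]): keep the dicts that match
    all predicates, and mark each of them with the named flag."""
    matched = [i for i in infos if all(p(i) for p in predicates)]
    for info in matched:
        flags[name].add(info["id"])
    return matched


def flagged_any(infos, *names):
    """Mimics (flagged NAME [NAME...]): keep the dicts that carry
    any of the named flags."""
    return [i for i in infos if any(i["id"] in flags[n] for n in names)]


# Illustrative data only
builds = [
    {"id": 1, "name": "foo", "state": 1},
    {"id": 2, "name": "bar", "state": 3},
    {"id": 3, "name": "foo", "state": 3},
]

complete_foo = flag_matching(
    "good", builds,
    lambda b: b["name"] == "foo",
    lambda b: b["state"] == 1,
)
marked = flagged_any(builds, "good", "other")
```

Only the build with id 1 satisfies both predicates, so it is the only one marked "good" and the only one returned by the later flagged_any call.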
- class IntStrSieve(sifter, *tokens)[source]¶
Bases:
Sieve
A Sieve that requires all of its arguments to be int or str values. Calls ensure_all_int_or_str on tokens.
- class ItemPathSieve(sifter, path, *values)[source]¶
Bases:
Sieve
Usage:
(item PATH [VALUE...])
Resolves the given PATH on each element and checks that any of the given values match. If any do, the element passes.
- check(session, data)[source]¶
Override to return True if the predicate matches the given info dict.
This is used by the default run implementation in a filter. Only the info dicts which return True from this method will be included in the results.
- Parameters:
info -- The info dict to be checked.
- name = 'item'¶
- class ItemSieve(sifter, *exprs)[source]¶
Bases:
VariadicSieve
A VariadicSieve which performs a comparison by fetching a named key from the info dict.
Subclasses must provide a field attribute which will be used as a key to fetch a comparison value from any checked info dicts.
If a pattern is specified, then the predicate matches if the info dict has an item by the given field key, and the value of that item matches the pattern.
If a pattern is absent, then this predicate only checks that the given field key exists and is not None.
- check(session, info)[source]¶
Override to return True if the predicate matches the given info dict.
This is used by the default run implementation in a filter. Only the info dicts which return True from this method will be included in the results.
- Parameters:
info -- The info dict to be checked.
- abstract property field¶
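The two ItemSieve behaviours described above (with and without a pattern) can be sketched as a plain function. This is a standalone illustration, not the library's implementation: item_check is a hypothetical name, and glob matching via the standard fnmatch module is an assumption standing in for whatever matcher types the real sieve accepts.

```python
from fnmatch import fnmatchcase


def item_check(info, field, pattern=None):
    """Hypothetical sketch of an ItemSieve-style check."""
    value = info.get(field)
    if value is None:
        # missing key, or key present with a None value
        return False
    if pattern is None:
        # no pattern: existence and non-None is enough
        return True
    # assumption: treat the pattern as a case-sensitive glob
    return fnmatchcase(str(value), pattern)


has_owner = item_check({"owner": "alice"}, "owner")
no_name = item_check({"name": None}, "name")
glob_hit = item_check({"name": "foo-1.0"}, "name", "foo-*")
glob_miss = item_check({"name": "bar-1.0"}, "name", "foo-*")
```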
- class LogicAnd(sifter, *exprs)[source]¶
Bases:
Logic
Usage:
(and EXPR [EXPR...])
filters for info dicts which match all sub expressions.
- name = 'and'¶
- class LogicNot(sifter, *exprs)[source]¶
Bases:
Logic
Usage:
(not EXPR [EXPR...])
filters for info dicts which match none of the sub expressions.
- name = 'not'¶
- class LogicOr(sifter, *exprs)[source]¶
Bases:
Logic
Usage:
(or EXPR [EXPR...])
filters for info dicts which match any of the sub expressions.
- name = 'or'¶
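The semantics of the three logic sieves (and, not, or) can be sketched with ordinary predicate combinators. The function names below are hypothetical and the sketch is independent of the library. Note that (not ...) with several sub expressions is modelled as "matches none of them", following the description above.

```python
def logic_and(*preds):
    """Matches when every sub-predicate matches."""
    return lambda info: all(p(info) for p in preds)


def logic_or(*preds):
    """Matches when at least one sub-predicate matches."""
    return lambda info: any(p(info) for p in preds)


def logic_not(*preds):
    """Matches when none of the sub-predicates match."""
    return lambda info: not any(p(info) for p in preds)


# Illustrative predicates and data
is_foo = lambda info: info["name"] == "foo"
is_done = lambda info: info["state"] == 1

infos = [
    {"id": 1, "name": "foo", "state": 1},
    {"id": 2, "name": "foo", "state": 3},
    {"id": 3, "name": "bar", "state": 1},
    {"id": 4, "name": "bar", "state": 3},
]

both = [i["id"] for i in infos if logic_and(is_foo, is_done)(i)]
either = [i["id"] for i in infos if logic_or(is_foo, is_done)(i)]
neither = [i["id"] for i in infos if logic_not(is_foo, is_done)(i)]
```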
- class MatcherSieve(sifter, *tokens)[source]¶
Bases:
Sieve
A Sieve that requires all of its arguments to be matchers. Calls ensure_all_matcher on tokens.
- class Sieve(sifter, *tokens, **options)[source]¶
Bases:
object
The abstract base type for all Sieve expressions.
A Sieve is a callable instance which is passed a session and a sequence of info dicts, and returns a filtered subset of those info dicts.
The default run implementation will trigger the prep method first, and then use the check method on each info dict to determine whether it should be included in the results or not. Subclasses can therefore easily write just the check method.
The prep method is there in the event that additional queries should be called on the whole set of incoming data (enabling multicall optimizations).
Sieves are typically instantiated by a Sifter when it compiles the sieve expression string.
Sieve subclasses must provide a name class property or attribute. This property is the key used to define how the Sieve is invoked by the source. For example, a source of (check-enabled X) is going to expect that the Sifter has a Sieve class available with a name of "check-enabled".
- Parameters:
sifter (Sifter)
- check(session, info)[source]¶
Override to return True if the predicate matches the given info dict.
This is used by the default run implementation in a filter. Only the info dicts which return True from this method will be included in the results.
- Parameters:
info (ST) -- The info dict to be checked.
session (ClientSession)
- Return type:
- get_cache(key)[source]¶
Gets a cache dict from the sifter using the name of this sieve and the given key (which must be hashable).
The same cache dict will be returned for this key until the sifter has its reset method invoked.
- get_info_cache(info)[source]¶
Gets a cache dict from the sifter using the name of this sieve and the sifter's designated key for the given info dict. The default sifter key will get the "id" value from the info dict.
The same cache dict will be returned for this info dict until the sifter has its reset method invoked.
- Parameters:
info (ST)
- Return type:
- prep(session, info_dicts)[source]¶
Override if some bulk pre-loading operations are necessary.
This is used by the default run implementation to allow bulk operations to be performed over the entire set of info dicts to be filtered, rather than one at a time in the check method.
- Parameters:
session (ClientSession)
info_dicts (Iterable[ST])
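The prep-then-check flow described for the default run implementation can be sketched as follows. This is a standalone approximation that does not subclass the real Sieve: SketchSieve, FakeSession, its list_enabled method, and the CheckEnabled class are all hypothetical, and the argument that a real (check-enabled X) expression would carry is ignored for brevity.

```python
class SketchSieve:
    """Hypothetical base mirroring the described run/prep/check flow."""

    name = None

    def prep(self, session, info_dicts):
        # optional bulk pre-loading hook; does nothing by default
        pass

    def check(self, session, info):
        raise NotImplementedError

    def run(self, session, info_dicts):
        info_dicts = list(info_dicts)
        self.prep(session, info_dicts)  # one pass over the whole set
        return [i for i in info_dicts if self.check(session, i)]

    def __call__(self, session, info_dicts):
        return self.run(session, info_dicts)


class FakeSession:
    """Hypothetical session that counts how often it is queried."""

    def __init__(self, enabled_ids):
        self.enabled_ids = set(enabled_ids)
        self.calls = 0

    def list_enabled(self, ids):
        self.calls += 1
        return self.enabled_ids & set(ids)


class CheckEnabled(SketchSieve):
    name = "check-enabled"

    def prep(self, session, info_dicts):
        # a single bulk query instead of one query per info dict
        self.enabled = session.list_enabled(i["id"] for i in info_dicts)

    def check(self, session, info):
        return info["id"] in self.enabled


session = FakeSession([2, 3])
sieve = CheckEnabled()
result = sieve(session, [{"id": 1}, {"id": 2}, {"id": 3}])
```

Because the lookup happens in prep, the session is queried once regardless of how many info dicts are filtered, which is the kind of optimization the prep hook is meant to enable.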
- class Sifter(sieves, source, key='id', params=None)[source]¶
Bases:
object
A flagging data filter, compiled from an s-expression syntax.
Sifter instances are callable, and when invoked with a session and a list of info dicts will perform filtering tests on the data to determine which items match the predicates from the source syntax.
- Parameters:
sieves (Dict[str, Type[Sieve]] | Iterable[Type[Sieve]]) -- list of classes to use in compiling the source str. Each class should be a subclass of Sieve. The name attribute of each class is used as the lookup value when compiling a sieve expression
source (str | Reader) -- Source from which to parse Sieve expressions
key (Callable[[Any], Any] | Any) -- Unique hashable identifier key for the info dicts. This is used to deduplicate or otherwise correlate the incoming information. By default, the "id" value is used.
params (Dict[str, str]) -- Map of text substitutions for quoted strings
- get_cache(cachename, key)[source]¶
Flexible storage for caching data in a sifter. Sieves can use this to record data about individual info dicts, or to cache results from arbitrary koji session calls.
This data is cleared when the reset method is invoked.
- Return type:
- get_info_cache(cachename, data)[source]¶
Cache associated with a particular info dict.
This data is cleared when the reset method is invoked.
- Return type:
- is_flagged(flagname, data)[source]¶
True if the data has been flagged with the given flagname, either via a (flag ...) sieve expression, or via set_flag.
- run(session, info_dicts)[source]¶
Clears existing flags and runs contained sieves on the given info_dicts.
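The caching behaviour described for get_cache and get_info_cache (the same dict is handed back until reset is invoked) can be sketched with a small class. SketchCaches is hypothetical and makes no claim about how the real Sifter stores its caches. Keeping the per-info caches in a separate namespace is an assumption of this sketch.

```python
class SketchCaches:
    """Hypothetical model of a sifter's cache storage."""

    def __init__(self, key="id"):
        self._key = key
        self._caches = {}

    def get_cache(self, cachename, key):
        # the same dict is returned for the same (cachename, key) pair
        return self._caches.setdefault(("cache", cachename, key), {})

    def get_info_cache(self, cachename, data):
        # correlate by the designated key of the info dict
        ident = data[self._key]
        return self._caches.setdefault(("info", cachename, ident), {})

    def reset(self):
        self._caches.clear()


caches = SketchCaches()
first = caches.get_cache("tags", 5)
first["latest"] = "f40-build"
second = caches.get_cache("tags", 5)

per_info = caches.get_info_cache("tags", {"id": 5, "name": "foo"})

caches.reset()
after_reset = caches.get_cache("tags", 5)
```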
- class SymbolSieve(sifter, *tokens)[source]¶
Bases:
Sieve
A Sieve that requires all of its arguments to be symbols. Calls ensure_all_symbol on tokens.
- class VariadicSieve(sifter, *exprs)[source]¶
Bases:
Sieve
Utility class which automatically applies an outer (or ...) when presented with more than one argument.
This allows for example (name foo bar baz) to automatically become (or (name foo) (name bar) (name baz)) while the name sieve only needs to be written to check for a single value.
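The rewrite that VariadicSieve performs can be sketched on a toy representation where an expression is a tuple of its name followed by its arguments. The expand_variadic function and the tuple encoding are hypothetical; the real class works on compiled Sieve instances rather than tuples.

```python
def expand_variadic(name, *args):
    """Wrap multiple arguments in an outer "or" of single-argument forms."""
    if len(args) <= 1:
        # zero or one argument: nothing to expand
        return (name, *args)
    return ("or", *[(name, arg) for arg in args])


single = expand_variadic("name", "foo")
multiple = expand_variadic("name", "foo", "bar", "baz")
```

With this shape, the single-argument form is the only one a sieve author has to implement. The multi-argument form falls out of the outer "or".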
- ensure_all_int_or_str(values, msg=None)[source]¶
Checks that all values are either an int, Number, str, or Symbol. Returns each as an int or str as appropriate in a new list. If any value is not an int, Number, str, or Symbol, raises a SifterError.
- ensure_all_matcher(values, msg=None)[source]¶
Checks that all of the elements in values are either a str, Symbol, Regex, or Glob instance, and returns them as a new list. If not, raises a SifterError.
- ensure_all_sieve(values, msg=None)[source]¶
Checks that all of the elements in values are Sieve instances, and returns them in a new list. If not, raises a SifterError.
- ensure_all_symbol(values, expand=True, msg=None)[source]¶
Checks that all of the elements in values are Symbols, and returns them as a new list. If not, raises a SifterError.
If expand is True then any SymbolGroup instances will be expanded to their full combination of Symbols and inlined. Otherwise, the inclusion of a SymbolGroup is an error.
- ensure_int(value, msg=None)[source]¶
Checks that value is an int or Number, and returns it as an int. If value is not an int or Number, raises a SifterError.
- ensure_int_or_str(value, msg=None)[source]¶
Checks that value is either an int, Number, str, or Symbol. Returns an int or str as appropriate. If value is not an int, Number, str, or Symbol, raises a SifterError.
- ensure_matcher(value, msg=None)[source]¶
Checks that value is either a str or a Matcher instance, and returns it. If not, raises a SifterError.
- ensure_sieve(value, msg=None)[source]¶
Checks that value is a Sieve instance, and returns it. If not, raises a SifterError.
- ensure_str(value, msg=None)[source]¶
Checks that value is either an int, str, or Symbol, and returns a str version of it. If value is not an int, str, or Symbol, raises a SifterError.
- ensure_symbol(value, msg=None)[source]¶
Checks that the value is a Symbol, and returns it. If value was not a Symbol, raises a SifterError.
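The ensure_* helpers above share one shape: validate, return a normalized value, or raise with an optional custom message. The ensure_all_* variants apply the single-value check to every element and return a new list. A standalone sketch of that shape follows. SketchSifterError, sketch_ensure_int, and sketch_ensure_all_int are hypothetical names, and accepting only built-in int values is a simplification (the library's own Number type is not modelled here).

```python
class SketchSifterError(Exception):
    """Hypothetical stand-in for SifterError."""


def sketch_ensure_int(value, msg=None):
    """Return value as an int, or raise with msg (or a default message)."""
    if isinstance(value, int):
        return int(value)
    raise SketchSifterError(msg or f"expected an int, got {value!r}")


def sketch_ensure_all_int(values, msg=None):
    """Apply the single-value check to each element, returning a new list."""
    return [sketch_ensure_int(v, msg) for v in values]


checked = sketch_ensure_all_int([1, 2, 3])

try:
    sketch_ensure_all_int([1, "two"], msg="ids must be ints")
except SketchSifterError as err:
    failure = str(err)
```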