htools package

Submodules

htools.config module

htools.config.get_credentials(from_email)

Get the user’s password for a specified email address.

Parameters

from_email (str) – The email address to get the password for.

Returns

If a password is found for the specified email address, return it as a string. Otherwise, return None.

Return type

str or None

htools.config.get_default_user()

Get user’s default email address. If one has not been set, user has the option to set it here.

Returns

A string containing the default email address. If user declines to specify one, None is returned.

Return type

str or None

htools.core module

class htools.core.BasicPipeline(*funcs)

Bases: object

Create a simple unidirectional pipeline of functions to apply in order with optional debugging output.

exception htools.core.InvalidArgumentError

Bases: Exception

htools.core.always_true(x, *args, **kwargs)

Similar to identity but returns True instead of x. I’m tempted to name this true but I fear that will cause some horrible bugs where I accidentally use this when I want to use True.

htools.core.amap(attr, *args)

More convenient syntax for quick data exploration. Get an attribute value for multiple objects. Name is short for “attrmap”.

Parameters
  • attr (str) – Name of attribute to retrieve for each object.

  • args (any) – Objects (usually of same type) to retrieve attributes for.

Returns

Result for each object.

Return type

list

Examples

>>> df1 = pd.DataFrame(np.random.randint(0, 10, (4, 5)))
>>> df2 = pd.DataFrame(np.random.randint(0, 3, (4, 5)))
>>> df3 = pd.DataFrame(np.random.randint(0, 3, (2, 3)))
>>> amap('shape', df1, df2, df3)
[(4, 5), (4, 5), (2, 3)]

>>> net = nn.Sequential(...)
>>> amap('shape', *net.parameters())
[torch.Size([5, 3]),
 torch.Size([16, 4]),
 torch.Size([16, 3]),
 torch.Size([16])]

htools.core.camel2snake(text)

Convert camel case to snake case. This assumes the input is valid camel case (if you have some weird hybrid of camel and snake case, for instance, you’d want to do some preprocessing first).

Parameters

text (str) – Camel case string, e.g. vaderSentimentScore.

Returns

text converted to snake case, e.g. vader_sentiment_score.

Return type

str
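The conversion described above can be sketched with a single regular expression. This is a minimal illustration, not the library's implementation: camel2snake_sketch is a hypothetical name, and like the documented function it assumes the input is valid camel case.

```python
import re


def camel2snake_sketch(text):
    """Hypothetical stand-in for camel2snake (assumes valid camel case)."""
    # Put an underscore between a lowercase letter or digit and the
    # capital that follows it, then lowercase the whole string.
    return re.sub(r'([a-z0-9])([A-Z])', r'\1_\2', text).lower()


print(camel2snake_sketch('vaderSentimentScore'))  # vader_sentiment_score
```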

htools.core.catch(func, *args, verbose=False)

Error handling for list comprehensions. In practice, it’s recommended to use the higher-level robust_comp() function which uses catch() under the hood.

Parameters
  • func (function) –

  • *args (any type) – Arguments to be passed to func.

  • verbose (bool) – If True, print the error message should one occur.

Returns

If the function executes successfully, its output is returned. Otherwise, return None.

Return type

any type

Examples

>>> [catch(lambda x: 1 / x, i) for i in range(3)]
[None, 1.0, 0.5]

# Note that the filtering method shown below also removes zeros, which is
# okay in this case.
>>> list(filter(None, [catch(lambda x: 1 / x, i) for i in range(3)]))
[1.0, 0.5]
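For readers curious how such a helper might work, here is a minimal sketch of the behavior documented above. catch_sketch is a hypothetical name; the real catch() may differ in which exceptions it handles and how it reports them.

```python
def catch_sketch(func, *args, verbose=False):
    """Hypothetical stand-in for catch: return None instead of raising."""
    try:
        return func(*args)
    except Exception as e:  # assumption: any ordinary exception is swallowed
        if verbose:
            print(e)
        return None


results = [catch_sketch(lambda x: 1 / x, i) for i in range(3)]
print(results)  # [None, 1.0, 0.5]
```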

htools.core.cd_root(root_subdir='notebooks')

Run at start of Jupyter notebook to enter project root.

Parameters

root_subdir (str) – Name of a subdirectory contained in the project root directory. If not found in the current working directory, this will move to the parent directory.

Examples

Sample file structure (abbreviated):

my_project/
    py/
        fetch_raw_data.py
    notebooks/
        nb01_eda.ipynb

Running cd_root() from nb01_eda.ipynb will change the working directory from notebooks/ to my_project/, which is typically the same directory we’d run scripts in py/ from. This makes converting from notebooks to scripts easier.

htools.core.dict_sum(*args)

Given two or more dictionaries with numeric values, combine them into a single dictionary. For keys that appear in multiple dictionaries, their corresponding values are added to produce the new value.

This differs from combining two dictionaries in the following manner:

{**d1, **d2}

The method shown above will combine the keys but will retain the value from d2, rather than adding the values from d1 and d2.

Parameters

*args (dicts) – 2 or more dictionaries with numeric values.

Returns

Contains all keys which appear in any of the dictionaries that are passed in. The corresponding values from each dictionary containing a given key are summed to produce the new value.

Return type

dict

Examples

>>> d1 = {'a': 1, 'b': 2, 'c': 3}
>>> d2 = {'a': 10, 'c': -20, 'd': 30}
>>> d3 = {'c': 10, 'd': 5, 'e': 0}
>>> dict_sum(d1, d2, d3)
{'a': 11, 'b': 2, 'c': -7, 'd': 35, 'e': 0}
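The contrast with dictionary unpacking can be made concrete with a small sketch. dict_sum_sketch is a hypothetical re-implementation written for illustration only; the library's own code may be structured differently.

```python
def dict_sum_sketch(*dicts):
    """Hypothetical stand-in for dict_sum: add values key by key."""
    total = {}
    for d in dicts:
        for key, value in d.items():
            total[key] = total.get(key, 0) + value
    return total


d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'a': 10, 'c': -20, 'd': 30}
d3 = {'c': 10, 'd': 5, 'e': 0}

summed = dict_sum_sketch(d1, d2, d3)
merged = {**d1, **d2}  # unpacking keeps d2's value for shared keys

print(summed)  # {'a': 11, 'b': 2, 'c': -7, 'd': 35, 'e': 0}
print(merged)  # {'a': 10, 'b': 2, 'c': -20, 'd': 30}
```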

htools.core.differences(obj1, obj2, methods=False, **kwargs)

Find the differences between two objects (generally of the same type; technically this isn’t enforced, but we do require that the objects have the same set of attribute names, so a similar effect is achieved). Actual type checking was causing problems comparing multiple Args instances, presumably because each Args object is defined when called.

This is a way to get more detail beyond whether two objects are equal or not.

Parameters
  • obj1 (any) – An object.

  • obj2 (any, usually the same type as obj1) – An object.

  • methods (bool) –

    If True, include methods in the comparison. If False, only attributes will be compared. Note that the output may not be particularly interpretable when using methods=True; for instance, when comparing two strings consisting of different characters, we get a lot of output that looks like this:

    {‘islower’: (<function str.islower()>, <function str.islower()>), ‘isupper’: (<function str.isupper()>, <function str.isupper()>),… ‘istitle’: (<function str.istitle()>, <function str.istitle()>)}

    These attributes all reflect the same difference: if obj1 is 'abc' and obj2 is 'def', then 'abc' != 'def' and 'ABC' != 'DEF' and 'Abc' != 'Def'.

    When methods=False, we ignore all of these, such that differences('a', 'b') returns {}. Therefore, it is important to carefully consider what differences you care about identifying.

  • **kwargs (bool) – Can pass args to hdir to include magics or internals.

Returns

Maps attribute name to a tuple of values, where the first is the corresponding value for obj1 and the second is the corresponding value for obj2.

Return type

dict[str, tuple]

htools.core.eprint(arr, indent=2, spacing=1)

Enumerated print. Prints an iterable with one item per line accompanied by a number specifying its index in the iterable.

Parameters
  • arr (iterable) – The object to be iterated over.

  • indent (int) – Width to assign to column of integer indices. Default is 2, meaning columns will line up as long as <100 items are being printed, which is the expected use case.

  • spacing (int) – Line spacing. Default of 1 will print each item on a new line with no blank lines in between. Spacing of 2 will double space output, and so on for larger values.

Returns

Return type

None

htools.core.fgrep(text, term, window=25, with_idx=False, reverse=False)

Search a string for a given term. If found, print it with some context. Similar to grep -C 1 term text. fgrep is short for faux grep.

Parameters
  • text (str) – Text to search.

  • term (str) – Term to look for in text.

  • window (int) – Number of characters to display before and after the matching term.

  • with_idx (bool) – If True, return index as well as string.

  • reverse (bool) – If True, reverse search direction (find last match rather than first).

Returns

The desired term and its surrounding context. If the term isn’t present, an empty string is returned. If with_idx=True, a tuple of (match index, string with text) is returned.

Return type

str or tuple[int, str]
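The windowing behavior described by the parameters above can be sketched as follows. fgrep_sketch is a hypothetical name, and details such as how the real function handles a window that runs past either end of the text are assumptions.

```python
def fgrep_sketch(text, term, window=25, with_idx=False, reverse=False):
    """Hypothetical stand-in for fgrep: return a match with context."""
    idx = text.rfind(term) if reverse else text.find(term)
    if idx == -1:
        res = ''
    else:
        # Assumption: the window is clipped at the start of the text.
        res = text[max(idx - window, 0): idx + len(term) + window]
    return (idx, res) if with_idx else res


sentence = 'the quick brown fox'
print(fgrep_sketch(sentence, 'brown', window=3))                 # ck brown fo
print(fgrep_sketch(sentence, 'brown', window=3, with_idx=True))  # (10, 'ck brown fo')
```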

htools.core.flatten(nested)

Flatten a nested sequence where the sub-items can be sequences or primitives. This differs slightly from itertools chain methods because those require all sub-items to be sequences. Here, items can be primitives, sequences, nested sequences, or any combination of these. Any iterable items aside from strings will be completely un-nested, so use with caution (e.g. a torch Dataset would be unpacked into separate items for each index). This also returns a list rather than a generator.

Parameters

nested (sequence (list, tuple, set)) – Sequence where some or all of the items are also sequences.

Returns

Flattened version of nested.

Return type

list
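A recursive sketch shows the kind of un-nesting described above, including the special treatment of strings. flatten_sketch is a hypothetical name and this is only one plausible way to implement the documented behavior.

```python
from collections.abc import Iterable


def flatten_sketch(nested):
    """Hypothetical stand-in for flatten: fully un-nest non-string iterables."""
    flat = []
    for element in nested:
        # Strings are iterable but are kept whole, as the description states.
        if isinstance(element, Iterable) and not isinstance(element, (str, bytes)):
            flat.extend(flatten_sketch(element))
        else:
            flat.append(element)
    return flat


print(flatten_sketch([1, [2, (3, 4)], 'ab', [[5]]]))  # [1, 2, 3, 4, 'ab', 5]
```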

htools.core.func_name(func)

Usually just returns the name of a function. The difference is this is compatible with functools.partial, which otherwise makes __name__ inaccessible.

Parameters

func (callable) – Can be a function, partial, or callable class.
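The issue with functools.partial is that a partial object has no __name__ attribute of its own. The sketch below shows one way to work around that; func_name_sketch is a hypothetical name, and falling back to the class name for callable instances is an assumption rather than documented behavior.

```python
from functools import partial


def func_name_sketch(func):
    """Hypothetical stand-in for func_name."""
    if isinstance(func, partial):
        # A partial stores the wrapped callable in its .func attribute.
        return func.func.__name__
    if hasattr(func, '__name__'):
        return func.__name__
    # Assumption: for instances of callable classes, use the class name.
    return type(func).__name__


parse_binary = partial(int, base=2)
print(hasattr(parse_binary, '__name__'))  # False
print(func_name_sketch(parse_binary))     # int
```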

htools.core.has_classmethod(cls, meth_name)

Check if a class has a classmethod with a given name. Note that isinstance(cls.meth_name, classmethod) would always return False: we must use getattr_static or cls.__dict__[meth_name] to potentially return True.

Parameters
  • cls (type or obj) – This is generally intended to be a class but it should work on objects (class instances) as well.

  • meth_name (str) – The name of the potential classmethod to check for.

Returns

True if cls possesses a classmethod with the specified name.

Return type

bool
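The note about isinstance can be checked directly with the standard library. The Foo class below is a hypothetical example; the snippet only demonstrates the lookup mechanics the description refers to, not the body of has_classmethod itself.

```python
import inspect


class Foo:
    @classmethod
    def build(cls):
        return cls()

    @staticmethod
    def helper():
        return 1


# Normal attribute access triggers the descriptor protocol, so we get a
# bound method back rather than the classmethod wrapper.
via_getattr = isinstance(Foo.build, classmethod)

# Static lookup (or the class __dict__) returns the raw wrapper object.
via_static = isinstance(inspect.getattr_static(Foo, 'build'), classmethod)
via_dict = isinstance(Foo.__dict__['build'], classmethod)

print(via_getattr, via_static, via_dict)  # False True True
```

The same lookup distinguishes staticmethods: inspect.getattr_static(Foo, 'helper') is a staticmethod object, while Foo.helper is a plain function.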

htools.core.hasarg(func, arg)

Checks if a function has a given argument. Works with args and kwargs as well if you exclude the stars. See example below.

Parameters
  • func (function) –

  • arg (str) – Name of argument to look for.

Returns

Return type

bool

Example

def foo(a, b=6, *args):
    return

>>> hasarg(foo, 'b')
True
>>> hasarg(foo, 'args')
True
>>> hasarg(foo, 'c')
False

htools.core.hashable(x)

Check if an object is hashable. Hashable objects will usually be immutable though this is not guaranteed.

Parameters

x (object) – The item to check for hashability.

Returns

True if x is hashable (suggesting immutability), False otherwise.

Return type

bool
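A common way to test hashability is simply to try hashing the object. The sketch below uses the hypothetical name hashable_sketch and also illustrates why hashability only suggests immutability: a tuple is immutable yet unhashable if it contains a list, while instances of an ordinary user-defined class are hashable and mutable.

```python
def hashable_sketch(x):
    """Hypothetical stand-in for hashable: try to hash and report success."""
    try:
        hash(x)
        return True
    except TypeError:
        return False


class Point:
    """Mutable user-defined class; instances are hashable by default."""

    def __init__(self, x):
        self.x = x


print(hashable_sketch('abc'))      # True
print(hashable_sketch([1, 2]))     # False
print(hashable_sketch((1, [2])))   # False
print(hashable_sketch(Point(1)))   # True
```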

htools.core.hasstatic(cls, meth_name)

Check if a class possesses a staticmethod of a given name. Similar to hasattr. Note that isinstance(cls.meth_name, staticmethod) would always return False: we must use getattr_static or cls.__dict__[meth_name] to potentially return True.

Parameters
  • cls (Type or any) – A class or an instance (seems to work on both, though more extensive testing may be needed for more complex scenarios).

  • meth_name (str) – Name of method to check. If the class/instance does not contain any attribute with this name, function returns False.

Returns

True if cls has a staticmethod with name meth_name.

Return type

bool

htools.core.hdir(obj, magics=False, internals=False)

Print object methods and attributes, by default excluding magic methods.

Parameters
  • obj (any type) – The object to print methods and attributes for.

  • magics (bool) – Specifies whether to include magic methods (e.g. __name__, __hash__). Default False.

  • internals (bool) – Specifies whether to include internal methods (e.g. _dfs, _name). Default False.

Returns

Keys are method/attribute names, values are strings specifying whether the corresponding key is a ‘method’ or an ‘attr’.

Return type

dict

htools.core.hsplit(text, sep, group=True, attach=True)

Flexible string splitting that retains the delimiter, unlike the built-in str.split() method.

Parameters
  • text (str) – The input text to be split.

  • sep (str) – The delimiter to be split on.

  • group (bool) – Specifies whether to group consecutive delimiters together (True), or to separate them (False).

  • attach (bool) – Specifies whether to attach the delimiter to the string that precedes it (True), or to detach it so it appears in the output list as its own item (False).

Returns

Return type

list[str]

Examples

>>> text = "Score -- Giants win 6-5"
>>> sep = '-'

# Case 0.1: Delimiters are grouped together and attached to the preceding word.
>>> hsplit(text, sep, group=True, attach=True)
['Score --', ' Giants win 6-', '5']

# Case 0.2: Delimiters are grouped together but are detached from the
# preceding word, instead appearing as their own item in the output list.
>>> hsplit(text, sep, group=True, attach=False)
['Score ', '--', ' Giants win 6', '-', '5']

# Case 1.1: Delimiters are retained and attached to the preceding string. If
# the delimiter occurs multiple times consecutively, only the first occurrence
# is attached, and the rest appear as individual items in the output list.
>>> hsplit(text, sep, group=False, attach=True)
['Score -', '-', ' Giants win 6-', '5']

# Case 1.2: Delimiters are retained but are detached from the preceding
# string. Each instance appears as its own item in the output list.
>>> hsplit(text, sep, group=False, attach=False)
['Score ', '-', '-', ' Giants win 6', '-', '5']

htools.core.identity(x)

Returns the input argument. Sometimes it is convenient to have this if we sometimes apply a function to an item: rather than defining a None variable, sometimes setting it to a function, then checking if it’s None every time we’re about to call it, we can set the default as identity and safely call it without checking.

Parameters

x (any) –

Returns

Unchanged input.

Return type

any
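The default-callable pattern described above might look like the sketch below. The clean function and its transform parameter are hypothetical names introduced for illustration, and identity is redefined locally so the snippet is self-contained.

```python
def identity(x):
    """Return the input unchanged."""
    return x


def clean(values, transform=identity):
    """Hypothetical consumer: apply an optional transform to each value."""
    # No `if transform is not None` check is needed before calling.
    return [transform(v) for v in values]


print(clean(['a', 'b']))             # ['a', 'b']
print(clean(['a', 'b'], str.upper))  # ['A', 'B']
```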

htools.core.ifnone(arg, backup)

Shortcut to provide a backup value if an argument is None. Commonly used for numpy arrays since their truthiness is ambiguous.

Parameters
  • arg (any) – We will check if this is None.

  • backup (any) – This will be returned if arg is None.

Returns

Either arg or backup will be returned.

Return type

any

htools.core.is_builtin(x, drop_callables=True)

Check if an object is a Python built-in object.

Parameters
  • x (object) –

  • drop_callables (bool) – If True, return False for callables (basically functions, methods, or classes). These typically will return True otherwise since they are of class type or builtin_function_or_method.

Returns

True if x is a built-in object, False otherwise.

Return type

bool

htools.core.is_classmethod(meth)

Companion to has_classmethod that checks a method itself rather than a class and a method name. It does use has_classmethod under the hood.

htools.core.isstatic(meth)

Companion to hasstatic that checks a method itself rather than a class and method name. It does use hasstatic under the hood.

htools.core.item(it, random=True, try_values=True)

Get an item from an iterable (e.g. dict, set, torch DataLoader). This is a quick way to access an item for iterables that don’t support indexing, or do support indexing but require us to know a key.

Parameters
  • it (Iterable) – Container that we want to access a value from.

  • random (bool) – If True, pick a random value from it. Otherwise just return the first value.

  • try_values (bool) – If True, will check if it has a values attribute and will operate on that if it does. We often want to see a random value from a dict rather than a key. If we want both a key and value, we could set try_values=False and pass in d.items().

Returns

An item from the iterable.

Return type

any

htools.core.kwargs_fallback(self, *args, assign=False, **kwargs)

Use inside a method that accepts **kwargs. Sometimes we want to use an instance variable for some computation but want to give the user the option to pass in a new value to the method (often ML hyperparameters) to be used instead. This function makes that a little more convenient.

Parameters
  • self (object) – The class instance. In most cases users will literally pass self in.

  • args (str) – One or more names of variables to use this procedure on.

  • assign (bool) – If True, any user-provided kwargs will be used to update attributes of the instance. If False (the default), they will be used in computation but won’t change the state of the instance.

  • kwargs (any) – Just forward along the kwargs passed to the method.

Returns

If more than one arg is specified, a list of values is returned. For just one arg, a single value will be returned.

Return type

list or single object

Examples

class Foo:
    def __init__(self, a, b=3, c=('a', 'b', 'c')):
        self.a, self.b, self.c = a, b, c

    def walk(self, d, **kwargs):
        a, c = kwargs_fallback(self, 'a', 'c', **kwargs)
        print(self.a, self.b, self.c)
        print(a, c, end='\n\n')

        b, c = kwargs_fallback(self, 'b', 'c', assign=True, **kwargs)
        print(self.a, self.b, self.c)
        print(b, c)

# Notice the first kwargs_fallback call doesn’t change attributes of f
# but the second does. In the first block of print statements, the variable
# b does not exist yet because we didn’t include it in *args.
>>> f = Foo(1)
>>> f.walk(d=0, b=10, c=100)
1 3 ('a', 'b', 'c')
1 100

1 10 100
10 100

htools.core.listlike(x)

Checks if an object is a list/tuple/set/array etc. Strings and mappings (e.g. dicts) are not considered list-like.

htools.core.lmap(fn, *args)

Basically a wrapper for map that returns a list rather than a generator. This is such a common pattern that I think it deserves its own function (think of it as a concise alternative to a list comprehension). One slight difference is that we use *args instead of passing in an iterable. This adds a slight convenience for the intended use case (fast prototyping). See the Examples for more on this.

Parameters

args (any) –

Returns

Return type

list

Examples

Consider these three equivalent syntax options:

lmap(fn, x, y)
[fn(obj) for obj in (x, y)]
list(map(fn, (x, y)))

When quickly iterating, option 1 saves a bit of typing. The extra parentheses that options 2 and 3 require to put x and y in a temporary data structure can get messy as we add more complex logic.

htools.core.load(path, verbose=True)

Wrapper to load text files or pickled (optionally zipped) or json data.

Parameters
  • path (str) – File to load. File type will be inferred from extension. Must be one of ‘.txt’, ‘.json’, ‘.pkl’, or ‘.zip’.

  • verbose (bool, optional) – If True, will print message stating where object was loaded from.

Returns

The Python object that was pickled to the specified file.

Return type

object

htools.core.max_key(d, fn=<function identity>)

Find the maximum value in a dictionary and return the associated key. If we want to compare values using something other than their numeric values, we can specify a function. For example, with a dict mapping strings to strings, fn=len would return the key with the longest value.

Parameters
  • d (dict) – Values to select from.

  • fn (callable) – Takes 1 argument (a single value from d.values()) and returns a number. This will be used to sort the items.

Returns

A key from dict d.
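The selection rule can be sketched with the built-in max and a key function. max_key_sketch is a hypothetical name; tie-breaking and empty-dict handling in the real function are not documented here and are not modeled.

```python
def max_key_sketch(d, fn=lambda value: value):
    """Hypothetical stand-in for max_key: key whose fn(value) is largest."""
    return max(d, key=lambda k: fn(d[k]))


names = {'a': 'xxx', 'b': 'z'}
print(max_key_sketch(names))          # b  ('z' sorts after 'xxx')
print(max_key_sketch(names, fn=len))  # a  (longest value)
```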

htools.core.method_of(meth)

Retrieve the class a method belongs to. This will NOT work on attributes. Also, this won’t help if your goal is to retrieve an instance: this returns the type of the instance. Not thoroughly tested but it seems to work regardless of whether you pass in meth from an instance or a class (the output is the same in both cases).

Parameters

meth (MethodType) – The method to retrieve the class of.

Returns

The class which defines the method in question.

Return type

type

Examples

class Foo:
    def my_method(self, x):
        return x*2

f = Foo()
assert method_of(Foo.my_method) == method_of(f.my_method) == Foo

htools.core.ngrams(word, n=3, step=1, drop_last=False)

To get non-overlapping sequences, pass in same value for step as n.

htools.core.parallelize(func, items, total=None, chunksize=1000, processes=None)

Apply a function to a sequence of items in parallel. A progress bar is included.

Parameters
  • func (function) – This will be applied to each item in items.

  • items (Iterable) – Sequence of items to apply func to.

  • total (int or None) – This defaults to the length of items. In the case that items is a generator, this lets us pass in the length explicitly. This lets tqdm know how quickly to advance our progress bar.

  • chunksize (int) – Positive int that determines the size of chunks submitted to the process pool as separate tasks. Multiprocessing’s default is 1 but larger values should speed things up, especially with long sequences.

  • processes (int or None) – Optionally set number of processes to run in parallel.

Returns

Return type

list

htools.core.pipe(x, *funcs, verbose=False, attr='')

Convenience function to apply many functions in order to some object. This lets us replace messy notation where it’s hard to keep parentheses straight:

list(parse_processed_text(tokenize_rows(porter_stem(strip_html_tags(
    text)))))

with:

pipe(text, strip_html_tags, porter_stem, tokenize_rows,
     parse_processed_text, list)

or if we have a list of functions:

pipe(x, *funcs)

Parameters
  • x (any) – Object to apply functions to.

  • *funcs (function(s)) – Functions in the order you want to apply them. Use functools.partial to specify other arguments.

  • verbose (bool) – If True, print x (or an attribute of x) after each step.

  • attr (str) – If specified and verbose is True, will print this attribute of x after each function is applied.

Returns

Output of the last func in *funcs.
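The chaining behavior can be sketched in a few lines. pipe_sketch is a hypothetical name, and the exact format of the verbose output in the real function is an assumption.

```python
from functools import partial


def pipe_sketch(x, *funcs, verbose=False, attr=''):
    """Hypothetical stand-in for pipe: apply funcs to x from left to right."""
    for func in funcs:
        x = func(x)
        if verbose:
            # Assumption: print the attribute if one was named, else x itself.
            print(getattr(x, attr) if attr else x)
    return x


tokens = pipe_sketch('  Hello World  ', str.strip, str.lower, str.split)
print(tokens)  # ['hello', 'world']

# functools.partial supplies extra arguments to a step.
fields = pipe_sketch('a,b', partial(str.split, sep=','))
print(fields)  # ['a', 'b']
```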

htools.core.print_object_sizes(space, limit=None, exclude_underscore=True)

Print the object names and sizes of the currently defined objects.

Parameters
  • space (dict) – locals(), globals(), or vars()

  • limit (int or None) – Optionally limit the number of objects displayed (default None for no limit).

  • exclude_underscore (bool) – Determine whether to exclude objects whose names start with an underscore (default True).

htools.core.quickmail(subject, message, to_email, from_email=None, img_path=None, img_name=None, verbose=True)

Send an email.

Parameters
  • from_email (str) – Gmail address being used to send email.

  • to_email (str) – Recipient’s email.

  • subject (str) – Subject line of email.

  • message (str) – Body of email.

Returns

Return type

None

htools.core.rmvars(*args)

Wrapper to quickly free up memory by deleting global variables. Htools 3.0 does not provide a way to do this for local variables.

Parameters

args (str) – One or more variable names to delete. Do not pass in the variable itself.

Returns

Return type

None

htools.core.safe_map(func, seq)

This addresses the issue of error handling in map() or list comprehension operations by simply skipping any items that throw an error. Note that values of None will be removed from the resulting list.

Parameters
  • func (function) – Function to apply to each item in seq.

  • seq (generator, iterator) – The sequence to iterate over. This could also be a generator, list, set, etc.

Returns

Return type

list

Examples

# Notice that instead of throwing an error when dividing by zero, that
# entry was simply dropped.
>>> safe_map(lambda x: x/(x-2), range(4))
[-0.0, -1.0, 3.0]
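A minimal sketch of the skip-on-error behavior follows. safe_map_sketch is a hypothetical name; the real function is documented elsewhere in this module as building on catch(), so its internals likely differ.

```python
def safe_map_sketch(func, seq):
    """Hypothetical stand-in for safe_map: drop failures and None results."""
    results = []
    for element in seq:
        try:
            value = func(element)
        except Exception:  # assumption: any ordinary exception is skipped
            continue
        if value is not None:
            results.append(value)
    return results


print(safe_map_sketch(lambda x: x / (x - 2), range(4)))  # [-0.0, -1.0, 3.0]
```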

htools.core.save(obj, path, mode_pre='w', verbose=True)

Wrapper to save data as text, pickle (optionally zipped), or json.

Parameters
  • obj (any) – Object to save. This will be pickled/jsonified/zipped inside the function - do not convert it before-hand.

  • path (str) – File name to save object to. Should end with .txt, .pkl, .zip, or .json depending on desired output format. If .zip is used, object will be zipped and then pickled. (.sh extension is also allowed and will be treated identically to .txt.)

  • mode_pre (str) – Determines whether to write or append text. One of (‘w’, ‘a’).

  • verbose (bool) – If True, print a message confirming that the data was pickled, along with its path.

Returns

Return type

None

htools.core.select(items, keep=(), drop=())

Select a subset of a data structure. When used on a mapping (e.g. dict), you can specify a list of keys to include or exclude. When used on a sequence like a list or tuple, specify indices instead of keys.

Parameters
  • items (abc.Sequence or abc.Mapping) – The dictionary to select items from.

  • keep (Iterable[str]) – Sequence of keys to keep.

  • drop (Iterable[str]) – Sequence of keys to drop. You should specify either keep or drop, not both.

Returns

Dictionary containing only the specified keys (when passing in keep), or all keys except the specified ones (when passing in drop).

Return type

dict
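The keep/drop semantics can be sketched for both mappings and sequences. select_sketch is a hypothetical name; raising ValueError when both or neither argument is given, and returning the same sequence type as the input, are assumptions not confirmed by the documentation above.

```python
from collections.abc import Mapping


def select_sketch(items, keep=(), drop=()):
    """Hypothetical stand-in for select."""
    if bool(keep) == bool(drop):
        # Assumption: exactly one of keep/drop must be provided.
        raise ValueError('Specify exactly one of keep or drop.')
    if isinstance(items, Mapping):
        if keep:
            return {k: items[k] for k in keep}
        return {k: v for k, v in items.items() if k not in drop}
    # Sequences: keep and drop hold indices instead of keys.
    indices = keep if keep else [i for i in range(len(items)) if i not in drop]
    return type(items)(items[i] for i in indices)


d = {'a': 1, 'b': 2, 'c': 3}
print(select_sketch(d, keep=['a', 'c']))       # {'a': 1, 'c': 3}
print(select_sketch(d, drop=['a']))            # {'b': 2, 'c': 3}
print(select_sketch([10, 20, 30], drop=[0]))   # [20, 30]
```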

htools.core.shell(cmd)

Execute shell command. Between subprocess and os, there are ~5 different ways to do this and I always forget which I want. This is just a way for me to choose once and not have to decide again. There are rare situations where we may need a different function: subprocess.run is blocking, so if we want to launch a process and continue the script without waiting for completion, we can use subprocess.Popen.

Parameters

cmd (str) – Example: ‘ls *.csv’

Returns

returncode (int), stderr, stdout. I believe stderr and stdout are None if nothing is returned and str otherwise.

Return type

tuple

htools.core.smap(*x)

Get shape of each array/tensor in a list or tuple.

Parameters

*x (np.arrays or torch.tensors) – We use star unpacking here to create a consistent interface with amap() and lmap().

Returns

Shape of each array/tensor in input.

Return type

list

htools.core.snake2camel(text)

Convert snake case to camel case. This assumes the input is valid snake case (if you have some weird hybrid of snake and camel case, for instance, you’d want to do some preprocessing first).

Parameters

text (str) – Snake case string, e.g. vader_sentiment_score.

Returns

text converted to camel case, e.g. vaderSentimentScore.

Return type

str

htools.core.spacer(char='-', n_chars=79, newlines_before=1, newlines_after=1)

Get string to separate output when printing output for multiple items.

Parameters
  • char (str) – The character that will be printed repeatedly.

  • n_chars (int) – The number of times to repeat char. We expect that char is a single character so this will be the total line length.

  • newlines_before (int) – Number of newline characters to add before the spacer.

  • newlines_after (int) – Number of newline characters to add after the spacer.

Returns

Return type

str

htools.core.tdir(obj, **kwargs)

A variation of the built in dir function that shows the attribute names as well as their types. Methods are excluded as they can change the object’s state.

Parameters
  • obj (any type) – The object to examine.

  • kwargs (bool) – Additional arguments to be passed to hdir. Options are magics and internals. See hdir documentation for more information.

Returns

Dictionary mapping the name of the object’s attributes to the corresponding types of those attributes.

Return type

dict[str, type]

htools.core.to_camel(text)

Experimental feature: tries to convert any common format to camel case. This hasn’t been extensively tested but it seems to work with camel case (no change), snake case, upper camel case, words separated by hyphens/dashes/spaces, and combinations of the above. It may occasionally split words that should not be split, though this should be rare if names use actual English words. This might not work so well on fastai-style variable names (very short, e.g. “tfms” for “transforms”), but the intended use case is mostly for fixing column names in pandas.

Parameters

text (str) –

Returns

Input text converted to camel case.

Return type

str

htools.core.to_snake(text)

Experimental feature: tries to convert any common format to snake case. This hasn’t been extensively tested but it seems to work with snake case (no change), camel case, upper camel case, words separated by hyphens/dashes/spaces, and combinations of the above. It may occasionally split words that should not be split, though this should be rare if names use actual English words. This might not work so well on fastai-style variable names (very short, e.g. “tfms” for “transforms”), but the intended use case is mostly for fixing column names in pandas.

Parameters

text (str) –

Returns

Input text converted to snake case.

Return type

str

htools.core.tolist(x, length_like=None, length=None, error_message='x length does not match desired length.')

Helper to let a function accept a single value or a list of values for a certain parameter.

WARNING: if x is a primitive and you specify a length (either via length_like or length), the resulting list will contain multiple references to the same item. This is mostly intended for use on lists of floats or ints so I don’t think it’s a problem, but keep this in mind when considering using this on mutable objects.

Parameters
  • x (Iterable) – Usually either a list/tuple or a primitive.

  • length_like (None or object) – If provided, we check that x is the same length. If x is a primitive, we’ll make it the same length.

  • length (None or int) – Similar to length_like but lets us specify the desired length directly. length_like overrides this, though you should only provide one or the other.

  • error_message (str) – Displayed in the event that a desired length is specified and x is list-like and does not match that length. You can pass in your own error message if you want something more specific to your current use case.

Returns

Return type

list

Examples

def train(lrs):
    lrs = tolist(lrs)
    …

We can now pass in a single learning rate or multiple.

>>> train(3e-3)
>>> train([3e-4, 3e-3])
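The shared-reference warning above is easiest to see in a small sketch. tolist_sketch is a hypothetical, simplified version that omits length_like; which types count as list-like and the use of ValueError are assumptions.

```python
def tolist_sketch(x, length=None,
                  error_message='x length does not match desired length.'):
    """Hypothetical, simplified stand-in for tolist (no length_like)."""
    if isinstance(x, (list, tuple, set)):
        if length is not None and len(x) != length:
            raise ValueError(error_message)  # assumption: error type
        return list(x)
    # A single value is repeated, so every slot refers to the same object.
    return [x] * (length or 1)


print(tolist_sketch(3e-3))           # [0.003]
print(tolist_sketch(0.1, length=3))  # [0.1, 0.1, 0.1]

shared = tolist_sketch({}, length=2)
shared[0]['lr'] = 1
print(shared)  # [{'lr': 1}, {'lr': 1}] because both slots are one dict
```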

htools.core.vcounts(arr, normalize=True)

Equivalent of pandas_htools vcounts method that we can apply on lists or arrays. Basically just a wrapper around Counter but with optional normalization.

Parameters
  • arr (Iterable) – Sequence of values to count. Typically a list or numpy array.

  • normalize (bool) – If True, counts will be converted to percentages.

Returns

Maps unique items in arr to the number of times (or % of times) that they occur in arr.

Return type

dict
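Since the description calls this a wrapper around Counter with optional normalization, a sketch along those lines follows. vcounts_sketch is a hypothetical name, and expressing normalized counts as fractions between 0 and 1 is an assumption.

```python
from collections import Counter


def vcounts_sketch(arr, normalize=True):
    """Hypothetical stand-in for vcounts."""
    counts = Counter(arr)
    if not normalize:
        return dict(counts)
    total = sum(counts.values())
    # Assumption: normalized values are fractions of the total count.
    return {key: count / total for key, count in counts.items()}


letters = ['a', 'b', 'a', 'a']
print(vcounts_sketch(letters))                   # {'a': 0.75, 'b': 0.25}
print(vcounts_sketch(letters, normalize=False))  # {'a': 3, 'b': 1}
```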

htools.core.xor_none(*args, n=1)

Checks that exactly 1 (or n) of inputs is not None. Useful for validating optional function arguments (for example, ensuring the user specifies either a directory name or a list of files but not both).

Parameters
  • args (any) –

  • n (int) – The desired number of non-None elements. Usually 1 but we allow the user to specify other values.

Returns

This will raise an error if the condition is not satisfied. Do not use this as an if condition (e.g. `if xor_none(a, b): print('success')`). This would always evaluate to False because the function doesn’t explicitly return a value, so we get None.

Return type

None
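The validate-by-raising style can be sketched as follows. xor_none_sketch and load_files are hypothetical names, and the use of ValueError is an assumption (the documentation only says an error is raised).

```python
def xor_none_sketch(*args, n=1):
    """Hypothetical stand-in for xor_none: raise unless n args are not None."""
    n_set = sum(arg is not None for arg in args)
    if n_set != n:
        raise ValueError(f'Expected {n} non-None argument(s), got {n_set}.')


def load_files(dirname=None, files=None):
    """Hypothetical caller that needs a directory or a file list, not both."""
    xor_none_sketch(dirname, files)  # call it as a statement, not a condition
    return dirname if dirname is not None else files


print(load_files(files=['a.txt']))            # ['a.txt']
print(xor_none_sketch(None, 'x') is None)     # True: nothing is returned
```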

htools.magics module

htools.meta module

class htools.meta.AbstractAttrs

Bases: type

Basically the attribute equivalent of abc.abstractmethod: this allows us to define an abstract parent class that requires its children to possess certain class and/or instance attributes. This differs from abc.abstractproperty in a few ways:

1. abstractproperty ignores instance attributes. AbstractAttrs lets us specify required instance attributes and/or class attributes and distinguish between the two.

2. abstractproperty considers the requirement fulfilled by methods, properties, and class attributes. AbstractAttrs does not allow methods (including classmethods and staticmethods) to fulfill either requirement, though properties can fulfill either.

Examples

This class defines required instance attributes and class attributes, but you can also specify one or the other. If you don’t care whether an attribute is at the class or instance level, you can simply use @abc.abstractproperty.

class Parent(metaclass=AbstractAttrs,
             inst_attrs=['name', 'metric', 'strategy'],
             class_attrs=['order', 'is_val', 'strategy']):
    pass

Below, we define a child class that fulfills some but not all requirements.

class Child(Parent):
    order = 1
    metric = 'mse'

    def __init__(self, x):
        self.x = x

    @staticmethod
    def is_val(x):
        ...

    @property
    def strategy(self):
        ...

    def name(self):
        ...

More specifically:

Pass
  • Possesses class attr 'order'.

  • Possesses attribute 'strategy' (a property counts as an instance attribute but not a class attribute. This is consistent with how it can be called: inst.my_property returns a value, cls.my_property returns a property object.)

Fail
  • 'metric' is a class attribute while our interface requires it to be an instance attribute.

  • 'name' is a method but it must be an instance attribute.

  • 'is_val' is a staticmethod but it must be a class attribute.

class htools.meta.AutoInit

Bases: object

Mixin class where child class has a long list of init arguments where the parameter name and the class attribute will be the same. Note that *args are not supported in the init method because each attribute that is defined in the resulting object must have a name. A variable length list of args can still be passed in as a single argument, of course, without the use of star unpacking.

This updated version of AutoInit is slightly more user friendly than in V1 (no more passing locals() to super()) but also slower and probably requires more testing (all because of the frame hack in the init method). Note that usage differs from the AutoInit present in htools<=2.0.0, so this is a breaking change.

Examples

Without AutoInit:

class Child:
    def __init__(self, name, age, sex, hair, height, weight, grade, eyes):
        self.name = name
        self.age = age
        self.sex = sex
        self.hair = hair
        self.height = height
        self.weight = weight
        self.grade = grade
        self.eyes = eyes

    def __repr__(self):
        return (f'Child(name={self.name}, age={self.age}, sex={self.sex}, '
                f'hair={self.hair}, weight={self.weight}, '
                f'grade={self.grade}, eyes={self.eyes})')

With AutoInit:

class Child(AutoInit):
    def __init__(self, name, age, sex, hair, height, weight, grade, eyes):
        super().__init__()

Note that we could also use the following method, though this is less informative when constructing instances of the child class and does not have the built in __repr__ that comes with AutoInit:

class Child:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
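The frame-based idea mentioned above can be illustrated roughly as follows. This is only a sketch and not the htools implementation: the name AutoInitSketch is hypothetical, and it relies on the CPython-specific sys._getframe to read the child's __init__ arguments.

```python
import sys


class AutoInitSketch:
    """Hypothetical stand-in for AutoInit; not the htools implementation."""

    def __init__(self):
        # Look one frame up: the child's __init__, whose locals hold its args.
        caller_locals = dict(sys._getframe(1).f_locals)
        # Skip `self` and the implicit `__class__` cell created by super().
        self._init_names = [name for name in caller_locals
                            if name not in ('self', '__class__')]
        for name in self._init_names:
            setattr(self, name, caller_locals[name])

    def __repr__(self):
        args = ', '.join(f'{n}={getattr(self, n)!r}' for n in self._init_names)
        return f'{type(self).__name__}({args})'


class Child(AutoInitSketch):
    def __init__(self, name, age, grade=3):
        super().__init__()


child = Child('Ann', 9)
print(child)
```

Because the parent inspects the caller's frame, super().__init__() has to be called directly from the child's __init__ for the right locals to be picked up.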

class htools.meta.Callback

Bases: abc.ABC

Abstract base class for callback objects to be passed to @callbacks decorator. Children must implement on_begin and on_end methods. Both should accept the decorated function’s inputs and output as arguments.

Often, we may want to use the @debug decorator on one or both of these methods. If both methods should perform the same steps, one shortcut is to implement a single undecorated __call__ method, then have the debug-decorated on_begin and on_end methods return self(inputs, output).

abstract on_begin(func, inputs, output=None)
Parameters
  • func (function) – The function being decorated.

  • inputs (dict) – Dictionary of bound arguments passed to the function being decorated with @callbacks.

  • output (any) – Callbacks to be executed after the function call can pass the function output to the callback. The default None value will remain for callbacks that execute before the function.

abstract on_end(func, inputs, output=None)
Parameters
  • func (function) – The function being decorated.

  • inputs (dict) – Dictionary of bound arguments passed to the function being decorated with @callbacks.

  • output (any) – Callbacks to be executed after the function call can pass the function output to the callback. The default None value will remain for callbacks that execute before the function.

abstract setup(func)
Parameters

func (function) – The function being decorated.

class htools.meta.ContextDecorator

Bases: abc.ABC

Abstract class that makes it easier to define classes that can serve either as decorators or context managers. This is a viable option if the function decorator case effectively wants to execute the function inside a context manager. If you want to do something more complex, this may not be appropriate since it’s not clear what would happen in the context manager use case.

Examples

import time

class Timer(ContextDecorator):

    def __init__(self):
        # More complex decorators might need to store variables here.
        pass

    def __enter__(self):
        self.start = time.perf_counter()

    def __exit__(self, exc_type, exc_value, traceback):
        print('TIME:', time.perf_counter() - self.start)

@Timer()
def foo(a, *args):
    # do something

with Timer():
    # do something

# Both of these usage methods work!
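For comparison, the standard library's contextlib.ContextDecorator serves a similar purpose. The sketch below uses that stdlib class rather than the htools one (whose internals are not shown here) to demonstrate the same dual usage in runnable form. The function add and the elapsed attribute are illustrative assumptions.

```python
import time
from contextlib import ContextDecorator


class Timer(ContextDecorator):
    """Times a block or a function call, built on the stdlib base class."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        print('TIME:', self.elapsed)
        return False  # Do not suppress exceptions.


@Timer()
def add(a, b):
    return a + b


result = add(1, 2)  # The call runs inside the context manager.

with Timer() as t:
    total = sum(range(1000))
```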

class htools.meta.LazyChainMeta

Bases: type

Metaclass to create LazyChainable objects.

class htools.meta.LazyChainable

Bases: object

Base class that allows children to lazily chain methods, similar to a Spark RDD.

Chainable methods must be decorated with @staticmethod and @lazychain and be named with a leading underscore. A public method without the leading underscore will be created, so don’t overwrite this with another method. Chainable methods accept an instance of the same class as the first argument, process the instance in some way, then return it. A chain of commands will be stored until the exec() method is called. It can operate either in place or not.

Examples

class Sequence(LazyChainable):

    def __init__(self, numbers, counter, new=True):
        super().__init__()
        self.numbers = numbers
        self.counter = counter
        self.new = new

    @staticmethod
    @lazychain
    def _sub(instance, n):
        instance.counter -= n
        return instance

    @staticmethod
    @lazychain
    def _gt(instance, n=0):
        instance.numbers = list(filter(lambda x: x > n, instance.numbers))
        return instance

    @staticmethod
    @lazychain
    def _call(instance):
        instance.new = False
        return instance

    def __repr__(self):
        pre, suf = super().__repr__().split('(')
        argstrs = (f'{k}={repr(v)}' for k, v in vars(self).items())
        return f'{pre}({", ".join(argstrs)}, {suf}'

>>> seq = Sequence([3, -1, 5], 0)
>>> output = seq.sub(n=3).gt(0).call().exec()
>>> output

Sequence(ops=[], numbers=[3, 5], counter=-3, new=False)

>>> seq   # Unchanged because exec was not in place.

Sequence(ops=[], numbers=[3, -1, 5], counter=0, new=True)

>>> output = seq.sub(n=3).gt(-1).call().exec(inplace=True)
>>> output   # None because exec was in place.
>>> seq   # Changed

Sequence(ops=[], numbers=[3, 5], counter=-3, new=False)

exec(inplace=False)

class htools.meta.LoggerMixin

Bases: object

Mixin class that configures and returns a logger.

Examples

class Foo(LoggerMixin):

    def __init__(self, a, log_file):
        self.a = a
        self.log_file = log_file
        self.logger = self.get_logger(log_file)

    def walk(self, location):
        self.logger.info(f'walk received argument {location}')
        return f'walking to {location}'

get_logger(path=None, fmode='a', level='info', fmt='%(asctime)s [%(levelname)s]: %(message)s')
Parameters
  • path (str or None) – If provided, this will be the path the logger writes to. If left as None, logging will only be to stdout.

  • fmode (str) – Logging mode when using a log file. Default ‘a’ for ‘append’. ‘w’ will overwrite the previously logged messages. Note: this only affects what happens when we create a new logger (‘w’ will remove any existing text in the log file if it exists, while ‘a’ won’t). Calling logger.info(my_msg) twice in a row with the same logger will always result in two new lines, regardless of mode.

  • level (str) – Minimum level necessary to log messages. One of (‘debug’, ‘info’, ‘warning’, ‘error’)

  • fmt (str) – Format that will be used for logging messages. This uses the logging module’s formatting language, not standard Python string formatting.

Return type

logging.Logger

class htools.meta.MultiLogger(path, fmode='w', fmt='%(message)s')

Bases: htools.meta.LoggerMixin

Easy way to get a pre-configured logger. This can also be used to record stdout, either through the context manager provided by contextlib or the function decorator defined in this module.

It delegates to its logger and should be used as follows when explicitly called by the user:

logger = MultiLogger('train.log')
logger.info('Starting model training.')

Notice we call the info method rather than write.

write(buf)

Provided for compatibility with redirect_stdout to allow logging of stdout while still printing it to the screen. The user should never call this directly.
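To illustrate why a write method is enough for redirect_stdout, here is a minimal sketch of a file-like object that records printed text while still echoing it. TeeSketch and its attributes are hypothetical names for illustration, not part of htools.

```python
import io
import sys
from contextlib import redirect_stdout


class TeeSketch:
    """Hypothetical file-like object: records text and echoes it."""

    def __init__(self, echo):
        self.echo = echo              # Stream that still receives the text.
        self.buffer = io.StringIO()   # Stand-in for a log file.

    def write(self, buf):
        # Called by print() while stdout is redirected to this object.
        self.buffer.write(buf)
        return self.echo.write(buf)

    def flush(self):
        self.echo.flush()


# sys.stdout is read here, before the redirect, so it is the real stream.
tee = TeeSketch(echo=sys.stdout)
with redirect_stdout(tee):
    print('Starting model training.')

captured = tee.buffer.getvalue()
```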

class htools.meta.ReadOnly

Bases: object

Descriptor to make an attribute read-only. This means that once a value has been set, the user cannot change or delete it. Note that read-only attributes must first be created as class variables (see example below). To allow more flexibility, we do allow the user to manually manipulate the instance dictionary.

Examples

class Dog:
    breed = ReadOnly()

    def __init__(self, breed, age):
        # Once breed is set in the line below, it cannot be changed.
        self.breed = breed
        self.age = age

>>> d = Dog('dalmatian', 'Arnold')
>>> d.breed

'dalmatian'

>>> d.breed = 'labrador'

PermissionError: Attribute is read-only.

>>> del d.breed

PermissionError: Attribute is read-only.
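The behavior shown above can be approximated with a small data descriptor. This is a sketch under assumptions, not the htools source: ReadOnlySketch is a hypothetical name, and storing the value in the instance dictionary is one possible design that also keeps manual manipulation of that dictionary possible.

```python
class ReadOnlySketch:
    """Hypothetical descriptor: a value can be set once, then is locked."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if self.name in instance.__dict__:
            raise PermissionError('Attribute is read-only.')
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        raise PermissionError('Attribute is read-only.')


class Dog:
    breed = ReadOnlySketch()

    def __init__(self, breed, age):
        self.breed = breed  # First assignment succeeds.
        self.age = age


d = Dog('dalmatian', 3)
try:
    d.breed = 'labrador'
except PermissionError as err:
    message = str(err)
```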

class htools.meta.SaveableMixin

Bases: object

Provide object saving and loading methods. If you want to be able to pass a file name rather than a full path to save, the object can define a self.dir attribute.

classmethod load(path)

Load object from pickle file.

Parameters

path (str or Path) – Name of file where object is stored.

save(path=None, fname=None)

Pickle object with optional compression.

Parameters
  • path (str or Path) – Path to save object to.

  • fname (str or Path) – If passed in, method will use this as a filename within the object’s dir attribute.
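A rough sketch of the save/load pattern described above, under stated assumptions: SaveableSketch and Model are hypothetical names, compression is omitted, and only the instance's attribute dictionary is pickled (the real mixin may pickle the object itself).

```python
import pickle
import tempfile
from pathlib import Path


class SaveableSketch:
    """Hypothetical mixin. Pickles the attribute dict, no compression."""

    def save(self, path=None, fname=None):
        # A bare file name is resolved against the object's `dir` attribute.
        target = Path(path) if path is not None else Path(self.dir) / fname
        with open(target, 'wb') as f:
            pickle.dump(vars(self), f)
        return target

    @classmethod
    def load(cls, path):
        with open(path, 'rb') as f:
            state = pickle.load(f)
        obj = cls.__new__(cls)  # Bypass __init__; restore attributes directly.
        obj.__dict__.update(state)
        return obj


class Model(SaveableSketch):
    def __init__(self, weights, dir):
        self.weights = weights
        self.dir = dir


with tempfile.TemporaryDirectory() as tmp:
    model = Model([0.1, 0.2], dir=tmp)
    saved_path = model.save(fname='model.pkl')
    restored = Model.load(saved_path)
```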

exception htools.meta.TimeExceededError

Bases: Exception

htools.meta.add_docstring(func)

Add the docstring from another function/class to the decorated function/class.

Examples

@add_docstring(nn.Conv2d)
class ReflectionPaddedConv2d(nn.Module):
    ...

htools.meta.add_kwargs(*fns, required=True, variable=True)

When one or more functions are called inside another function, we often have the choice of accepting **kwargs in our outer function (downside: user can’t see parameter names with quick documentation tools) or explicitly typing out each parameter name and default (downsides: time consuming and error prone since it’s easy to update the inner function and forget to update the outer one). This lets us update the outer function’s signature automatically based on the inner function(s)’s signature(s). The Examples section should make this more clear.

The wrapped function must accept **kwargs, but you shouldn’t refer to kwargs explicitly inside the function. Its variables will be made available essentially as global variables. This shares a related goal with fastai’s delegates decorator but it provides a slightly different solution: delegates updates the quick documentation but the variables are still ultimately only available as kwargs. Here, they are available like regular variables.

Note: don’t actually use this for anything important, I imagine it could lead to some pretty nasty bugs. I was just determined to get something working.

Parameters
  • fns (functions) – The inner functions whose signatures you wish to use to update the signature of the decorated outer function. When multiple functions contain a parameter with the same name, priority is determined by the order of fns (earlier means higher priority).

  • required (bool) – If True, include required arguments from inner functions (that is, positional arguments or positional_or_keyword arguments with no default value). If False, exclude these (it may be preferable to explicitly include them in the wrapped function’s signature).

  • variable (bool) – If True, include *args and **kwargs from the inner functions. They will be made available as {inner_function_name}_args and {inner_function_name}_kwargs, respectively (see Examples). Otherwise, they will be excluded.

Examples

def foo(x, c, *args, a=3, e=(11, 9), b=True, f=('a', 'b', 'c'), **kwargs):
    print('in foo')
    return x * c

def baz(n, z='z', x='xbaz', c='cbaz'):
    print('in baz')
    return n + z + x + c

baz comes before foo so its x param takes priority and has a default value of ‘xbaz’. The decorated function always retains first priority, so the c param remains positional even though baz gives c a default value.

@add_kwargs(baz, foo, required=True)
def bar(c, d=16, **kwargs):
    foo_res = foo(x, c, *foo_args, a=a, e=e, b=b, f=f, **foo_kwargs)
    baz_res = baz(n, z, x, c)
    return {'c': c, 'n': n, 'd': d, 'x': x, 'z': z, 'a': a,
            'e': e, 'b': b, 'f': f}

bar ends up with the following signature:

<Signature (c, n, d=16, x='xbaz', foo_args=(), z='z', *, a=3, e=(11, 9), b=True, f=('a', 'b', 'c'), foo_kwargs={}, **kwargs)>

Notice many variables are available inside the function even though they aren’t explicitly hard-coded into our function definition. When using shift-tab in Jupyter or other quick doc tools, they will all be visible. You can see how passing in multiple functions can quickly get messy so if you insist on using this, try to keep it to 1-2 functions if possible.

htools.meta.assert_raises(error)

Context manager to assert that an error is raised. This can be nice if we don’t want to clutter up a notebook with error messages.

Parameters

error (class inheriting from Exception or BaseException) – The type of error to catch, e.g. ValueError.

Examples

# First example does not throw an error.
>>> with assert_raises(TypeError) as ar:
>>>     a = 'b' + 6

# Second example throws an error.
>>> with assert_raises(ValueError) as ar:
>>>     a = 'b' + 6

AssertionError: Wrong error raised. Expected ValueError, got TypeError(can only concatenate str (not “int”) to str)

# Third example throws an error because the code inside the context manager
# completed successfully.
>>> with assert_raises(ValueError) as ar:
>>>     a = 'b' + '6'

AssertionError: No error raised, expected ValueError.
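A context manager with this behavior can be sketched with contextlib.contextmanager. The name assert_raises_sketch and the exact message wording are assumptions made for illustration. This is not the htools source.

```python
from contextlib import contextmanager


@contextmanager
def assert_raises_sketch(error):
    """Hypothetical re-creation of the three cases described above."""
    try:
        yield
    except error:
        return  # Expected error type: swallow it silently.
    except Exception as e:
        raise AssertionError(
            f'Wrong error raised. Expected {error.__name__}, got {e!r}'
        ) from e
    else:
        raise AssertionError(f'No error raised, expected {error.__name__}.')


with assert_raises_sketch(TypeError):
    'b' + 6  # TypeError is expected, so nothing propagates.

try:
    with assert_raises_sketch(ValueError):
        'b' + 6
except AssertionError as err:
    wrong_error_message = str(err)
```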

htools.meta.auto_repr(cls)

Class decorator that provides __repr__ method automatically based on __init__ parameters. This aims to provide a simpler alternative to AutoInit that does not require access to the arguments passed to __init__. Attributes will only be included in the repr if they are in the class dict and appear in __init__ as a named parameter (with the same name).

Examples

@auto_repr
class Foo:

    def __init__(self, a, b=6, c=None, p=0.5, **kwargs):
        self.a = a
        self.b = b
        # Different name to demonstrate that cat is not included in repr.
        self.cat = c
        # Property is not stored in class dict, not included in repr.
        self.p = p

    @property
    def p(self):
        return self._p

    @p.setter
    def p(self, val):
        if val > 0:
            self._p = val
        else:
            raise ValueError('p must be positive')

>>> f = Foo(3, b='b', c='c')
>>> f

Foo(a=3, b='b')
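One way such a class decorator could work is sketched below. It is an illustration under assumptions (auto_repr_sketch is a hypothetical name and the real implementation may differ): it reads the __init__ parameter names with inspect and keeps only those stored under the same name on the instance.

```python
import inspect


def auto_repr_sketch(cls):
    """Hypothetical class decorator that builds __repr__ from __init__."""
    init_params = [name for name in inspect.signature(cls.__init__).parameters
                   if name != 'self']

    def __repr__(self):
        # Keep only parameters stored under the same name on the instance.
        pairs = [f'{name}={vars(self)[name]!r}' for name in init_params
                 if name in vars(self)]
        return f'{type(self).__name__}({", ".join(pairs)})'

    cls.__repr__ = __repr__
    return cls


@auto_repr_sketch
class Foo:
    def __init__(self, a, b=6, c=None, **kwargs):
        self.a = a
        self.b = b
        self.cat = c  # Stored under a different name, so left out.


f = Foo(3, b='b', c='c')
```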

htools.meta.block_timer()

Context manager to time a block of code. This works similarly to @timer but can be used on code outside of functions.

Examples

with block_timer() as bt:
    # Code inside the context manager will be timed.
    arr = [str(i) for i in range(25_000_000)]
    first = None
    while first != '100':
        first = arr.pop(0)

htools.meta.bound_args(func, args, kwargs, collapse_kwargs=True)

Get the bound arguments for a function (with defaults applied). This is very commonly used when building decorators that log, check, or alter how a function was called.

Parameters
  • func (function) –

  • args (tuple) – Notice this is not *args. Just pass in the tuple.

  • kwargs (dict) – Notice this is not **kwargs. Just pass in the dict.

  • collapse_kwargs (bool) – If True, collapse kwargs into the regular parameter dict. E.g. {‘a’: 1, ‘b’: True, ‘kwargs’: {‘c’: ‘c_val’, ‘d’: 0}} -> {‘a’: 1, ‘b’: True, ‘c’: ‘c_val’, ‘d’: 0}

Returns

Maps parameter name to passed value.

Return type

OrderedDict[str, any]
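The binding step can be reproduced with inspect.signature, as in the sketch below. bound_args_sketch is a hypothetical re-creation, not the htools source, and it returns a plain dict rather than an OrderedDict.

```python
import inspect


def bound_args_sketch(func, args, kwargs, collapse_kwargs=True):
    """Hypothetical re-creation using inspect.signature."""
    sig = inspect.signature(func)
    bound = sig.bind(*args, **kwargs)
    bound.apply_defaults()
    params = dict(bound.arguments)
    if collapse_kwargs:
        # Merge the **kwargs dict (whatever it is named) into the top level.
        for name, param in sig.parameters.items():
            if param.kind is inspect.Parameter.VAR_KEYWORD:
                params.update(params.pop(name))
    return params


def foo(a, b=True, **kwargs):
    return a


collapsed = bound_args_sketch(foo, (1,), {'c': 'c_val', 'd': 0})
nested = bound_args_sketch(foo, (1,), {'c': 'c_val', 'd': 0},
                           collapse_kwargs=False)
```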

class htools.meta.cached_property(func)

Bases: object

Decorator for computationally expensive methods that should only be computed once (i.e. they take zero arguments aside from self and are slow to execute). Lowercase name is used for consistency with most decorators. Heavily influenced by example in Python Cookbook by David Beazley and Brian K. Jones. Note that, as with the @property decorator, no parentheses are used when calling the decorated method.

Examples

class Vocab:

    def __init__(self, tokens):
        self.tokens = tokens

    @cached_property
    def embedding_matrix(self):
        print('Building matrix...')
        # Slow computation to build and return a matrix of word embeddings.
        return matrix

# First call is slow.
>>> v = Vocab(tokens)
>>> v.embedding_matrix
Building matrix...
[[.03, .5, .22, .01],
 [.4, .13, .06, .55],
 [.77, .14, .05, .9]]

# Second call accesses attribute without re-computing
# (notice no "Building matrix" message).
>>> v.embedding_matrix
[[.03, .5, .22, .01],
 [.4, .13, .06, .55],
 [.77, .14, .05, .9]]
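The caching behavior can be sketched with a non-data descriptor, in the spirit of the Python Cookbook recipe mentioned above. cached_property_sketch and the token_lengths example are hypothetical. Python 3.8+ also ships functools.cached_property, which offers comparable functionality.

```python
class cached_property_sketch:
    """Hypothetical non-data descriptor that caches on first access."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # Store under the same name so later lookups bypass the descriptor.
        instance.__dict__[self.func.__name__] = value
        return value


class Vocab:
    build_count = 0

    def __init__(self, tokens):
        self.tokens = tokens

    @cached_property_sketch
    def token_lengths(self):
        Vocab.build_count += 1  # Track how often the slow path runs.
        return [len(t) for t in self.tokens]


v = Vocab(['a', 'bb', 'ccc'])
first = v.token_lengths
second = v.token_lengths
```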

htools.meta.callbacks(cbs)

Decorator that attaches callbacks to a function. Callbacks should be defined as classes inheriting from abstract base class Callback that implement on_begin and on_end methods. This allows us to store states rather than just printing outputs or relying on global variables.

Parameters

cbs (list) – List of callbacks to execute before and after the decorated function.

Examples

@callbacks([PrintHyperparameters(), PlotActivationHist(),
            ActivationMeans(), PrintOutput()])
def train_one_epoch(**kwargs):
    # Train model.
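The following sketch shows how a decorator of this kind could call on_begin and on_end around a function while the callback keeps state. All names ending in Sketch, plus Recorder, are hypothetical, and the setup hook listed on the Callback class is omitted. This is not the htools implementation.

```python
import inspect
from abc import ABC, abstractmethod
from functools import wraps


class CallbackSketch(ABC):
    @abstractmethod
    def on_begin(self, func, inputs, output=None): ...

    @abstractmethod
    def on_end(self, func, inputs, output=None): ...


class Recorder(CallbackSketch):
    """Stores state across calls instead of printing."""

    def __init__(self):
        self.events = []

    def on_begin(self, func, inputs, output=None):
        self.events.append(('begin', dict(inputs), output))

    def on_end(self, func, inputs, output=None):
        self.events.append(('end', dict(inputs), output))


def callbacks_sketch(cbs):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Bind arguments to parameter names, applying defaults.
            bound = inspect.signature(func).bind(*args, **kwargs)
            bound.apply_defaults()
            inputs = dict(bound.arguments)
            for cb in cbs:
                cb.on_begin(func, inputs)
            output = func(*args, **kwargs)
            for cb in cbs:
                cb.on_end(func, inputs, output)
            return output
        return wrapper
    return decorator


recorder = Recorder()


@callbacks_sketch([recorder])
def train_one_epoch(lr, epochs=1):
    return lr * epochs


loss = train_one_epoch(0.5, epochs=2)
```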

htools.meta.chainmethod(func)

Decorator for methods in classes that want to implement eager chaining. Chainable methods should be instance methods that change 1 or more instance attributes and return None. All this decorator does is ensure these methods are called on a deep copy of the instance instead of on the instance itself so that operations don’t affect the original object. The new object is returned.

Examples

@auto_repr
class EagerChainable:

    def __init__(self, arr, b=3):
        self.arr = arr
        self.b = b

    @chainmethod
    def double(self):
        self.b *= 2

    @chainmethod
    def add(self, n):
        self.arr = [x+n for x in self.arr]

    @chainmethod
    def append(self, n):
        self.arr.append(n)

>>> ec = EagerChainable([1, 3, 5, -22], b=17)
>>> ec

EagerChainable(arr=[1, 3, 5, -22], b=17)

>>> ec2 = ec.append(99).double().add(400)
>>> ec2

EagerChainable(arr=[401, 403, 405, 378, 499], b=34)

>>> ec   # Remains unchanged.
EagerChainable(arr=[1, 3, 5, -22], b=17)
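The deep-copy mechanism described for chainmethod can be sketched in a few lines. chainmethod_sketch is a hypothetical re-creation for illustration. The actual decorator may differ in its details.

```python
from copy import deepcopy
from functools import wraps


def chainmethod_sketch(func):
    """Hypothetical decorator: run the method on a deep copy, return the copy."""
    @wraps(func)
    def wrapper(self, *args, **kwargs):
        new = deepcopy(self)
        func(new, *args, **kwargs)  # Mutates the copy; its None return is ignored.
        return new
    return wrapper


class EagerChainable:
    def __init__(self, arr, b=3):
        self.arr = arr
        self.b = b

    @chainmethod_sketch
    def double(self):
        self.b *= 2

    @chainmethod_sketch
    def append(self, n):
        self.arr.append(n)


ec = EagerChainable([1, 3], b=17)
ec2 = ec.append(99).double()
```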
htools.meta.copy_func(func)

Copy a function. Regular copy and deepcopy functionality do not work on functions the way they do on most objects. If we want to create a new function based on another without altering the old one (as in rename_params), this should be used.

Parameters

func (function) – Function to duplicate.

Returns

Copy of input func.

Return type

function

Examples

def foo(a, b=3, *args, c=5, **kwargs):
    return a, b, c, args, kwargs

foo2 = copy_func(foo)

>>> foo2.__code__ == foo.__code__
True
>>> foo2 == foo
False
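One common way to copy a function is to build a new function object around the same code object with types.FunctionType. The sketch below takes that approach. copy_func_sketch is a hypothetical name, and whether htools does exactly this is an assumption.

```python
import functools
import types


def copy_func_sketch(func):
    """Hypothetical re-creation built on types.FunctionType."""
    new = types.FunctionType(func.__code__, func.__globals__,
                             name=func.__name__,
                             argdefs=func.__defaults__,
                             closure=func.__closure__)
    # Carry over metadata such as __doc__, __dict__ and annotations.
    new = functools.update_wrapper(new, func)
    # update_wrapper sets __wrapped__; drop it so the copy stands alone.
    del new.__wrapped__
    # Copy keyword-only defaults so the two functions do not share a dict.
    new.__kwdefaults__ = (dict(func.__kwdefaults__)
                          if func.__kwdefaults__ else None)
    return new


def foo(a, b=3, *args, c=5, **kwargs):
    return a, b, c, args, kwargs


foo2 = copy_func_sketch(foo)
foo2.__kwdefaults__['c'] = 100  # Changing the copy leaves foo untouched.
```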
htools.meta.count_calls(func)

Count the number of times a function has been called. The function can access this value inside itself through the attribute ‘calls’. Note that the count is incremented before the function body runs, so during the first call func.calls is already 1 (i.e. the value identifies the current call as the n’th call, rather than counting only the calls that completed previously).
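A minimal sketch of this counting convention, assuming the counter is stored on the wrapper function. count_calls_sketch and which_call are hypothetical names.

```python
from functools import wraps


def count_calls_sketch(func):
    """Hypothetical re-creation: the counter lives on the wrapper."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1  # Incremented before func runs.
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls_sketch
def which_call():
    # The name `which_call` now refers to the wrapper, so .calls is visible.
    return which_call.calls


first = which_call()
second = which_call()
```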

htools.meta.debug(func=None, prefix='', arguments=True, out_path=None)

Decorator that prints information about a function call. Often, this will only be used temporarily when debugging. Note that a wrapped function that accepts *args will display a signature including an ‘args’ parameter even though it isn’t a named parameter, because the goal here is to explicitly show which values are being passed to which parameters. This does mean that the printed string won’t be executable code in this case, but that shouldn’t be necessary anyway since it would contain the same call that just occurred.

The decorator can be used with or without arguments.

Parameters
  • func (function) – Function being decorated.

  • prefix (str) – A short string to prepend the printed message with. Ex: ‘>>>’

  • arguments (bool) – If True, the printed message will include the function arguments. If False, it will print the function name but not its arguments.

  • out_path (str or Path or None) – If provided, a dict of arguments will be saved as a json file as specified by this path. Intermediate directories will be created if necessary. Function arguments will be made available for string formatting if you wish to use that in the file name. Example: ‘data/models/{prefix}/args.json’. The argument “prefix” will be used to save the file in the appropriate place. Note: arguments does not affect this since arguments are the only thing saved here.

Examples

Occasionally, you might pass arguments to different parameters than you intended. Throwing a debug decorator on the function helps you check that the arguments are matching up as expected. For example, the parameter names in the function below have an unexpected order, so you could easily make the following call and expect to get 8. The debug decorator helps catch that the third argument is being passed in as the x parameter.

@debug
def f(a, b, x=0, y=None, z=4, c=2):
    return a + b + c

>>> f(3, 4, 1)
CALLING f(a=3, b=4, x=1, y=None, z=4, c=2)
9

@debug(prefix='***', arguments=False)
def f(a, b, x=0, y=None, z=4, c=2):
    return a + b + c

>>> f(3, 4, 1)
*** CALLING f()
9

htools.meta.delegate(attr, iter_magics=False, skip=(), getattr_=True)

Decorator that automatically delegates attribute calls to an attribute of the class. This is a nice convenience to have when using composition. User can also choose to delegate magic methods related to iterables.

Note: I suspect this could lead to some unexpected behavior so be careful using this in production.

KNOWN ISSUES:

- Max recursion error when a class inherits from nn.Module and delegates to the actual model.
- Causes pickling issues at times. Haven’t figured out cause yet.

Parameters
  • attr (str) – Name of variable to delegate to.

  • iter_magics (bool) – If True, delegate the standard magic methods related to iterables: ‘__getitem__’, ‘__setitem__’, ‘__delitem__’, and ‘__len__’.

  • skip (Iterable[str]) – Can optionally provide a list of iter_magics to skip. This only has an effect when iter_magics is True. For example, you may want to be able to iterate over the class but not allow item deletion. In this case you should pass skip=('__delitem__',).

  • getattr_ (bool) – If True, delegate non-magic methods. This means that if you try to access an attribute or method that the object produced by the decorated class does not have, it will look for it in the delegated object.

Examples

Example 1: We can use BeautifulSoup methods like find_all directly on the Page object. Most IDEs should let us view quick documentation as well.

@delegate('soup')
class Page:

    def __init__(self, url, logfile, timeout):
        self.soup = self.fetch(url, timeout=timeout)

page = Page('http://www.coursera.org')
page.find_all('div')

Example 2: Magic methods except for __delitem__ are delegated.

@delegate('data', True, skip=('__delitem__',))
class Foo:

    def __init__(self, data, city):
        self.data = data
        self.city = city

>>> f = Foo(['a', 'b', 'c'], 'San Francisco')
>>> len(f)
3
>>> for char in f:
>>>     print(char)
a
b
c
>>> f.append(3); f.data
['a', 'b', 'c', 3]
>>> del f[0]
TypeError: 'Foo' object doesn't support item deletion
>>> f.clear(); f.data
[]
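The attribute-forwarding part of this decorator (the getattr_ behavior) can be sketched with __getattr__, which Python only calls when normal lookup fails. delegate_sketch is a hypothetical, simplified re-creation: it does not cover iter_magics or skip.

```python
def delegate_sketch(attr):
    """Hypothetical class decorator forwarding missing attributes to `attr`."""
    def decorator(cls):
        def __getattr__(self, name):
            # Only called when normal lookup fails. Guard against recursion
            # if the delegated attribute itself has not been set yet.
            if name == attr:
                raise AttributeError(name)
            return getattr(getattr(self, attr), name)

        cls.__getattr__ = __getattr__
        return cls
    return decorator


@delegate_sketch('data')
class Foo:
    def __init__(self, data, city):
        self.data = data
        self.city = city


f = Foo(['a', 'b', 'c'], 'San Francisco')
f.append(3)  # Foo has no append, so the list's method is used.
```

Magic methods such as __len__ are looked up on the type rather than the instance, which is presumably why they need the separate iter_magics handling.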
htools.meta.fallback(meth=None, *, keep=(), drop=(), save=False)

Make instance/class attributes available as default arguments for a method. Kwargs can be passed in to override one or more of them. You can also choose for kwargs to update the instance attributes if desired.

When using default values for keep/drop/save, the decorator can be used without parentheses. If you want to change one or more arguments, they must be passed in as keyword args (meth is never explicitly passed in, of course).

Parameters
  • meth (method) – The method to decorate. Unlike the other arguments, this is passed in implicitly.

  • keep (Iterable[str] or str) – Name(s) of instance attributes to include. If you specify a value here, ONLY these instance attributes will be made available as fallbacks. If you don’t pass in any value, the default is for all instance attributes to be made available. You can specify keep, drop, or neither, but not both. This covers all possible options: keep only a few, keep all BUT a few, or keep all (drop all is the default case and doesn’t require a decorator).

  • drop (Iterable[str] or str) – Name(s) of instance attributes to ignore. I.e. if you want to make all instance attributes available as fallbacks except for self.df, you could specify drop=(‘df’).

  • save (bool) – If True, kwargs that share names with instance attributes will be overwritten with their new values. E.g. if we previously had self.lr = 3e-3 and you call your decorated method with obj.mymethod(lr=1), self.lr will be set to 1.

Examples

# Ex 1. self.a, self.b, and self.c are all available as defaults

class Tree:
    def __init__(self, a, b, c=3):
        self.a = a
        self.b = b
        self.c = c

    @fallback
    def call(self, **kwargs):
        return a, b, c

# Ex 2. self.b is not available as a default. We must put b in call's
# signature or the variable won't be accessible.

class Tree:
    def __init__(self, a, b, c=3):
        self.a = a
        self.b = b
        self.c = c

    @fallback(drop=('b'))
    def call(self, b, **kwargs):
        return a, b, c

# Ex 3. self.b and self.c are available as defaults. If b or c are
# specified in kwargs, the corresponding instance attribute will be updated
# to take on the new value.

class Tree:
    def __init__(self, a, b, c=3):
        self.a = a
        self.b = b
        self.c = c

    @fallback(keep=['b', 'c'], save=True)
    def call(self, a, **kwargs):
        return a, b, c

htools.meta.function_interface(present=(), required=(), defaults=(), startswith=(), args: (True, False, None) = None, kwargs: (True, False, None) = None, like_func=None)

Decorator factory to enforce some kind of function signature interface (e.g. the first two arguments must be (‘model’, ‘x’), or the function must accept **kwargs, or the parameter ‘learning_rate’ must be present but not required because it has a default value).

Parameters
  • present (Iterable[str]) – List of parameter names that must be present in the function signature. This will not check anything about their order or if they’re required, just that they’re present.

  • required (Iterable[str]) – List of names that must be required parameters in the function (i.e. they have no default value).

  • defaults (Iterable[str]) – List of names that must be present in the function signature with default values.

  • startswith (Iterable[str]) – List of names that the function signature must start with. Order matters.

  • args (bool) – If True, require function to accept *args. If False, require that it doesn’t. If None, don’t check either way.

  • kwargs (bool) – If True, require function to accept **kwargs. If False, require that it doesn’t. If None, don’t check either way.

  • like_func (None or function) – If provided, this function’s signature will define the interface that all future decorated functions must match. Their name will obviously be different but all parameters must match (that means names, order, types, defaults, etc.).

htools.meta.handle(func=None, default=None)

Decorator that provides basic error handling. This is a rare decorator that is often most useful without the syntactic sugar: for instance, we may have a pre-existing function and want to apply it to a pandas Series while handling errors. See Examples.

Parameters
  • func (callable) – The function to decorate.

  • default (any) – This is the value that will be returned when the wrapped function throws an error.

Examples

There are a few different ways to use this function:

@handle
def func():
    # Do something

@handle(default=0)
def func():
    # Do something

def some_func(x):
    # Do something

df.name.apply(handle(some_func))
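The three usage styles above rely on a decorator that works with or without arguments. A minimal sketch of that pattern follows. handle_sketch, parse_int and to_float are hypothetical names, and catching Exception broadly is an assumption about the intended behavior.

```python
from functools import wraps


def handle_sketch(func=None, default=None):
    """Hypothetical re-creation: return `default` if func raises."""
    if func is None:
        # Called with arguments, e.g. @handle_sketch(default=0).
        return lambda f: handle_sketch(f, default=default)

    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            return default
    return wrapper


@handle_sketch(default=0)
def parse_int(text):
    return int(text)


def to_float(text):
    return float(text)


# Without the decorator syntax: wrap an existing function on the fly.
parsed = [handle_sketch(to_float)(x) for x in ['1.5', 'oops', '2']]
```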

htools.meta.handle_interrupt(func=None, cbs=(), verbose=True)

Decorator that allows us to interrupt a function with ctrl-c. We can pass in callbacks that execute on function end. Keep in mind that local variables will be lost as soon as func stops running. If func is a method, it may be appropriate to update instance variables while running, which we can access because the instance will be the first element of args (passed in as self).

Note: Kwargs are passed to callbacks as a single dict, not as **kwargs.

Parameters
  • func (function) –

  • cbs (Iterable[Callback]) – List of callbacks to execute when func completes. These will execute whether we interrupt or not.

  • verbose (bool) – If True, print a message to stdout when an interrupt occurs.

htools.meta.hasarg(func, arg)

Check if a function has a parameter with a given name. (Technically, hasparam might be a more appropriate name but hasarg lets us match the no-space convention of hasattr and getattr while maintaining readability.)

Parameters
  • func (function) –

  • arg (str) – The name of the parameter that you want to check for in func’s signature.

Returns

True if func has a parameter named arg.

Return type

bool
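A check like this can be written with inspect.signature. The sketch below is a hypothetical re-creation (hasarg_sketch and metric are illustrative names), not the htools source.

```python
import inspect


def hasarg_sketch(func, arg):
    """Hypothetical re-creation of the check via inspect.signature."""
    return arg in inspect.signature(func).parameters


def metric(y_true, y_pred, *, sample_weight=None):
    return 0.0


has_weight = hasarg_sketch(metric, 'sample_weight')
has_proba = hasarg_sketch(metric, 'y_proba')
```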

htools.meta.immutify_defaults(func)

Decorator to make a function’s default arguments effectively immutable. We accomplish this by storing the initially provided defaults and assigning them back to the function’s signature after each call. If you use a variable as a default argument, this does not mean that the variable’s value will remain unchanged - it just ensures the initially provided value will be used for each call.
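The store-and-restore approach described above can be sketched as follows. immutify_defaults_sketch is a hypothetical re-creation, and the use of deepcopy for storing the defaults is an assumption.

```python
from copy import deepcopy
from functools import wraps


def immutify_defaults_sketch(func):
    """Hypothetical re-creation: restore pristine defaults after each call."""
    original = deepcopy(func.__defaults__)
    original_kw = deepcopy(func.__kwdefaults__)

    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        finally:
            # Re-copy so the stored originals are never handed out directly.
            func.__defaults__ = deepcopy(original)
            func.__kwdefaults__ = deepcopy(original_kw)
    return wrapper


@immutify_defaults_sketch
def append_one(items=[]):
    items.append(1)
    return items


first = append_one()
second = append_one()
```

Without the decorator, the second call would reuse the mutated default list and return two elements.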

htools.meta.lazychain(func)

Decorator to register a method as chainable within a LazyChainable class.

htools.meta.log_cmd(path, mode='w', defaults=False)

Decorator that saves the calling command for a python script. This is often useful for CLIs that train ML models. It makes it easy to re-run the script at a later date with the same or similar arguments. If importing a wrapped function (or class with a wrapped method), you must include

os.environ[‘LOG_CMD’] = ‘true’

in your script if you want logging to occur (accidentally overwriting log files can be disastrous). Values ‘True’ and ‘1’ also work but True and 1 do not (os.environ requires strings). Note that these values will not persist once the script completes.

Parameters
  • path (str or Path) – Specifies file where output will be saved.

  • mode (str) – Determines whether output should overwrite old file or be appended. One of (‘a’, ‘w’). In most cases we will want append mode because we’re tracking multiple trials.

  • defaults (bool) – If True, include all arg values, even those that weren’t specified from the command line (e.g. if your CLI function accepts up to 10 args (some with default values) and you pass in 3, the command will be logged as if you explicitly passed in all 10. This can be useful if you think your default args might change over time). If False, only args that were explicitly mentioned in your command will be used.

Examples

```
# train.py
import fire
from htools.meta import log_cmd

@log_cmd('logs/training_runs.txt')
def train(lr, epochs, dropout, arch, data_version, layer_dims):
    # Train model

if __name__ == '__main__':
    fire.Fire(train)
```

$ python train.py --lr 3e-3 --epochs 50 --dropout 0.5 --arch awd_lstm --data_version 1 --layer_dims '[64, 128, 256]' --dl_kwargs '{"shuffle": False, "drop_last": True}'

After running the script with the above command, the file ‘logs/training_runs.txt’ now contains a nicely formatted version of the calling command with a separate line for each argument name/value pair.

We can also use variables that are passed to our function. All function args and kwargs will be passed to the string formatter so your variable names must match:

@log_cmd('logs/train_run_v{version_number}.{ext}')
def train(version_number, ext, epochs, arch='lstm'):
    # Train model

htools.meta.log_stdout(func=None, fname='')

Decorator that logs all stdout produced by a function.

Parameters
  • func (function) – If the decorator is used without parentheses, the function will be passed in as the first argument. You never need to explicitly specify a function.

  • fname (str) – Path to log file which will be created. If None is specified, the default is to write to ./logs/wrapped_func_name.log. If specified, this must be a keyword argument.

Examples

@log_stdout
def foo(a, b=3):
    print(a)
    a *= b
    print(a)
    return a**b

@log_stdout(fname='../data/mylog.log')
def foo(a, b=3):
    ...

htools.meta.params(func)

Get parameters in a function’s signature.

Parameters

func (function) –

Returns

Maps name (str) to Parameter.

Return type

dict

htools.meta.rename_params(func, **old2new)

Rename one or more parameters. Docstrings and default arguments are updated accordingly. This is useful when working with code that uses hasarg. For example, my Incendio library uses parameter names to pass the correct arguments to different metrics.

# TODO: looks like this updates the signature but doesn’t actually change the variable names. So you can’t call the decorated function with the new argument names.

Parameters
  • func (function) – The old function to change.

  • old2new (str) – One or more parameter names to change, each mapped to its new name. See the example below.

Returns

Same as input func but with updated parameter names.

Return type

function

Examples

def foo(a, b, *args, c=3, **kwargs):
    pass

foo_metric = rename_params(foo, a='y_true', b='y_pred')

foo_metric will work exactly like foo but its first two parameters will now be named “y_true” and “y_pred”, respectively.
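
To make the TODO above concrete, here is a minimal sketch of signature-only renaming. It is an assumption about how such a function could work, not htools' implementation, and `rename_params_sketch` is a stand-in name.

```python
import functools
import inspect


def rename_params_sketch(func, **old2new):
    """Illustrative only: rewrite the signature that func reports."""
    sig = inspect.signature(func)
    new_params = [p.replace(name=old2new.get(name, name))
                  for name, p in sig.parameters.items()]

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Introspection tools read __signature__ before inspecting the code.
    wrapper.__signature__ = sig.replace(parameters=new_params)
    return wrapper


def foo(a, b, *args, c=3, **kwargs):
    return a, b, c


foo_metric = rename_params_sketch(foo, a='y_true', b='y_pred')
print(inspect.signature(foo_metric))  # (y_true, y_pred, *args, c=3, **kwargs)
print(foo_metric(1, 2))               # (1, 2, 3)
```

In this sketch, code that inspects parameter names sees "y_true" and "y_pred", but calling `foo_metric(y_true=1, y_pred=2)` raises a TypeError because the underlying function still expects `a` and `b`.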

htools.meta.return_stdout(func)

Decorator that returns printed output from the wrapped function. This may be useful if we define a function that only prints information and returns nothing, then later decide we want to access the printed output. Rather than re-writing everything, we can slap a @return_stdout decorator on top and leave it as is. This should not be used if the decorated function already returns something else since we will only return what is printed to stdout. For that use case, consider the log_stdout function.
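
The idea can be sketched with `contextlib.redirect_stdout`. This is an illustration under the assumption that stdout is captured into an in-memory buffer; `return_stdout_sketch` is a stand-in name, not the library's code.

```python
import functools
import io
from contextlib import redirect_stdout


def return_stdout_sketch(func):
    """Illustrative stand-in: return what func prints, not what it returns."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        buffer = io.StringIO()
        with redirect_stdout(buffer):
            func(*args, **kwargs)  # any return value is discarded
        return buffer.getvalue()
    return wrapper


@return_stdout_sketch
def report(n):
    for i in range(n):
        print(f'step {i}')


text = report(2)
print(repr(text))  # 'step 0\nstep 1\n'
```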

htools.meta.temporary_globals(func, **kwargs)

Make a dict of key-value pairs temporarily available to a function in its global vars. We have to use function globals rather than globals() because the latter is evaluated when importing this function and so takes on the globals of htools/meta.py rather than of the scope where the code will ultimately be executed. Used in add_kwargs and fallback decorators (i.e. mostly for toy functionality, risky to actually use this).
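
The point about function globals can be seen in a small sketch. `call_with_temporary_globals` is a hypothetical helper, not the library's API: it updates `func.__globals__` for the duration of one call and then restores the previous state.

```python
def call_with_temporary_globals(func, extra, *args, **kwargs):
    """Hypothetical helper: expose `extra` in func's globals for one call."""
    func_globals = func.__globals__
    missing = object()
    saved = {key: func_globals.get(key, missing) for key in extra}
    func_globals.update(extra)
    try:
        return func(*args, **kwargs)
    finally:
        # Restore the previous state, removing names we introduced.
        for key, old in saved.items():
            if old is missing:
                func_globals.pop(key, None)
            else:
                func_globals[key] = old


def scaled(x):
    return x * factor  # `factor` is not defined anywhere by default


print(call_with_temporary_globals(scaled, {'factor': 10}, 4))  # 40
print('factor' in scaled.__globals__)                          # False
```

Because the function's module globals are mutated while the call runs, any other code reading those globals at the same time sees the temporary values too, which is one reason to treat this as risky.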

htools.meta.timebox(seconds, strict=True, freq=0.1, cleanup=True)

Try to execute code for a specified amount of time before throwing an error. If you don’t want the error to propagate, wrap the code in a try/except block.

Parameters
  • seconds (float) – Max number of seconds before throwing error. This will be enforced with a relatively low level of precision.

  • strict (bool) – If True, timeout will cause an error to be raised, halting execution of the entire program. If False, a warning message will be printed and the timeboxed operation will end, letting the program proceed to the next step.

  • freq (float) – How often to update progress bar (measured in seconds).

  • cleanup (bool) – If True, progress bar will disappear on function end. This is nice if we’re calling the decorated function inside a loop and don’t want hundreds of progress bars littering the notebook/terminal.

Examples

with timebox(5) as tb:
    x = computationally_expensive_code()

More permissive version:

x = step_1()
with timebox(5) as tb:
    try:
        x = slow_step_2()
    except TimeExceededError:
        pass
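
One common way to build a time limit like this is with Unix alarm signals. The sketch below assumes that approach and is not the library's implementation: it omits the progress bar and the strict option, works only on Unix in the main thread, and `timebox_sketch` and `TimeExceededSketchError` are stand-in names.

```python
import signal
import time
from contextlib import contextmanager


class TimeExceededSketchError(Exception):
    """Stand-in for the library's TimeExceededError."""


@contextmanager
def timebox_sketch(seconds):
    """Illustrative, Unix-only, main-thread-only time limit."""
    def handler(signum, frame):
        raise TimeExceededSketchError(f'Exceeded {seconds} seconds.')

    previous = signal.signal(signal.SIGALRM, handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


timed_out = False
try:
    with timebox_sketch(0.2):
        time.sleep(2)
except TimeExceededSketchError:
    timed_out = True
print(timed_out)  # True
```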

htools.meta.timebox_handler(time, frame)

htools.meta.timeboxed(time, strict=True, freq=0.1)

Decorator version of timebox. Try to execute decorated function for time seconds before throwing exception.

Parameters
  • time (float) – Max number of seconds before throwing error. This will be enforced with a relatively low level of precision.

  • strict (bool) – If True, timeout will cause an error to be raised, halting execution of the entire program. If False, a warning message will be printed and the timeboxed operation will end, letting the program proceed to the next step.

  • freq (float) – How often to update the progress bar (measured in seconds).

Examples

@timeboxed(5)
def func(x, y):
    # If function does not complete within 5 seconds, will throw error.

htools.meta.timer(func)

Provide conservative time estimate for a function to run. Behavior may not be interpretable for recursive functions.

Parameters

func (function) – The function to time.
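
A timing decorator of this kind can be sketched as follows. This is an assumption about the general approach (wall-clock time around a single call), not the library's code, and `timer_sketch` is a stand-in name.

```python
import functools
import time


def timer_sketch(func):
    """Illustrative stand-in: print the wall-clock time of each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f'[TIMER]: {func.__name__} executed in approximately '
              f'{elapsed:.4f} seconds.')
        return result
    return wrapper


@timer_sketch
def count_to(x):
    for i in range(x):
        time.sleep(0.01)
    return x


count_to(3)
```

In a sketch like this, every recursive call passes through the wrapper and prints its own line, which is one way the output for recursive functions becomes hard to interpret.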

Examples

import time

@timer
def count_to(x):
    for i in range(x):
        time.sleep(0.5)

>>> count_to(10)
[TIMER]: count_to executed in approximately 5.0365 seconds.

htools.meta.typecheck(func_=None, **types)

Decorator to enforce type checking for a function or method. There are two ways to call this: either explicitly passing argument types to the decorator, or letting it infer them using type annotations in the function that will be decorated. We allow both usage methods since older versions of Python lack type annotations, and also because I feel the annotation syntax can hurt readability.

Parameters
  • func_ (function) – The function to decorate. When using decorator with manually-specified types, this is None. Underscore is used so that func can still be used as a valid keyword argument for the wrapped function.

  • types (type) – Optional way to specify variable types. Use standard types rather than importing from the typing library, as subscripted generics are not supported (e.g. typing.List[str] will not work; typing.List will but at that point there is no benefit over the standard list).
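
For intuition, a check like this could be implemented roughly as below. This is a minimal sketch and not the library's code: `typecheck_sketch` is a stand-in name, and relying on `isinstance` is an assumption (it would explain why subscripted generics such as typing.List[str] are unsupported, since `isinstance` rejects them).

```python
import functools
import inspect


def typecheck_sketch(func_=None, **types):
    """Illustrative stand-in for typecheck (assumed behavior)."""
    def decorator(func):
        sig = inspect.signature(func)
        # Fall back to annotations (minus 'return') if no types were passed.
        expected = types or {k: v for k, v in func.__annotations__.items()
                             if k != 'return'}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            for name, value in bound.arguments.items():
                if name in expected and not isinstance(value, expected[name]):
                    raise TypeError(f'{name} must be {expected[name]}, '
                                    f'not {type(value)}.')
            return func(*args, **kwargs)
        return wrapper

    # A bare @typecheck_sketch passes the function in directly.
    return decorator(func_) if func_ is not None else decorator


@typecheck_sketch(x=float, y=(int, float), iters=int)
def process(x, y, z, iters=5):
    for _ in range(iters):
        x *= y
    return x


@typecheck_sketch
def add(a: int, b: int = 2) -> int:
    return a + b
```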

Examples

In the first example, we specify types directly in the decorator. Notice that they can be single types or tuples of types. You can choose to specify types for all arguments or just a subset.

@typecheck(x=float, y=(int, float), iters=int, verbose=bool)
def process(x, y, z, iters=5, verbose=True):
    print(f'z = {z}')
    for i in range(iters):
        if verbose: print(f'Iteration {i}...')
        x *= y
    return x

>>> process(3.1, 4.5, 0, 2.0)
TypeError: iters must be <class 'int'>, not <class 'float'>.
>>> process(3.1, 4, 'a', 1, False)
z = a
12.4

Alternatively, you can let the decorator infer types using annotations in the function that is to be decorated. The example below behaves equivalently to the explicit example shown above. Note that annotations regarding the returned value are ignored.

@typecheck
def process(x:float, y:(int, float), z, iters:int=5, verbose:bool=True):
    print(f'z = {z}')
    for i in range(iters):
        if verbose: print(f'Iteration {i}...')
        x *= y
    return x

>>> process(3.1, 4.5, 0, 2.0)
TypeError: iters must be <class 'int'>, not <class 'float'>.
>>> process(3.1, 4, 'a', 1, False)
z = a
12.4

htools.meta.validating_property(func, allow_del=False)

Factory that makes properties that perform some user-specified validation when setting values. The returned function must be used as a descriptor to create a class variable before setting the instance attribute.

Parameters
  • func (function) – Function or lambda that accepts a single parameter. This will be used when attempting to set a value for the managed attribute. It should return True if the value is acceptable, False otherwise.

  • allow_del (bool) – If True, allow the attribute to be deleted.

Returns

A property with validation when setting values. Note that this will be used as a descriptor, so it must create a class variable as shown in the example below. Also notice that the name passed to LengthyInt must match the name of the variable it is assigned to.

Return type

function
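
A factory like this can be built on the built-in `property`. The sketch below is one possible design under that assumption, not the library's code; `validating_property_sketch`, `PositiveInt` and `Account` are hypothetical names.

```python
def validating_property_sketch(func, allow_del=False):
    """Illustrative stand-in: build a factory of validated properties."""
    def prop(name):
        private = f'_{name}'  # where the validated value is stored

        def getter(self):
            return getattr(self, private)

        def setter(self, value):
            if not func(value):
                raise ValueError(
                    f'Invalid value {value!r} for argument {name}.')
            setattr(self, private, value)

        def deleter(self):
            if not allow_del:
                raise AttributeError(f'{name} cannot be deleted.')
            delattr(self, private)

        return property(getter, setter, deleter)
    return prop


PositiveInt = validating_property_sketch(
    lambda x: isinstance(x, int) and x > 0
)


class Account:
    balance = PositiveInt('balance')  # class variable acting as descriptor

    def __init__(self, balance):
        self.balance = balance  # goes through the validating setter
```

In this sketch the name passed to the factory determines the private storage attribute and the error message.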

Examples

LengthyInt = validating_property(
    lambda x: isinstance(x, int) and len(str(x)) > 4
)

class Foo:
    long = LengthyInt('long')
    def __init__(self, a, long):
        self.a = a
        self.long = long

>>> foo = Foo(3, 4)
ValueError: Invalid value 4 for argument long.

# No error on instantiation because the argument is a valid LengthyInt.
>>> foo = Foo(3, 543210)
>>> foo.long
543210

>>> foo = Foo(3, 'abc')
ValueError: Invalid value 'abc' for argument long.

htools.meta.valuecheck(func)

Decorator that checks if user-specified arguments are acceptable. Because this re-purposes annotations to specify values rather than types, this can NOT be used together with the @typecheck decorator. Keep in mind that this tests for equality, so 4 and 4.0 are considered equivalent.

Parameters

func (function) – The function to decorate. Use annotations to specify acceptable values as tuples, as shown below.
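
The equality-based check can be sketched with a membership test on each annotation. This is an illustration under that assumption, not the library's code; `valuecheck_sketch` and `resize` are hypothetical names.

```python
import functools
import inspect


def valuecheck_sketch(func):
    """Illustrative stand-in: treat annotations as tuples of allowed values."""
    sig = inspect.signature(func)
    allowed = {k: v for k, v in func.__annotations__.items() if k != 'return'}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            # `in` compares with ==, so 4.0 matches an allowed value of 4.
            if name in allowed and value not in allowed[name]:
                raise ValueError(f'Invalid argument for parameter {name}. '
                                 f'Value must be in {allowed[name]}.')
        return func(*args, **kwargs)
    return wrapper


@valuecheck_sketch
def resize(img, factor: (2, 4), mode: ('nearest', 'bilinear') = 'nearest'):
    return img, factor, mode


print(resize('img', 4.0))  # ('img', 4.0, 'nearest')
```

The same equality rule means that, in this sketch, 1 and 0 would be accepted for a parameter annotated with (True, False).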

Examples

@valuecheck
def foo(a, b:('min', 'max'), c=6, d:(True, False)=True):
    return d, c, b, a

>>> foo(3, 'min')
(True, 6, 'min', 3)
>>> foo(True, 'max', d=None)
ValueError: Invalid argument for parameter d. Value must be in
(True, False).
>>> foo('a', 'mean')
ValueError: Invalid argument for parameter b. Value must be in
('min', 'max').

htools.meta.verbose_log(path, fmode='w', fmt='%(message)s')

Decorator to log stdout to a file while also printing it to the screen. Commonly used for model training.

Parameters
  • path (str or Path) – Log file.

  • fmode (str) – One of (‘a’, ‘w’) for ‘append’ mode or ‘write’ mode. Note that ‘w’ only overwrites the existing file once when the decorated function is defined: subsequent calls to the function will not overwrite previously logged content.

  • fmt (str) – String format for logging messages. Uses formatting specific to logging module, not standard Python string formatting.

htools.meta.wrapmethods(*decorators, methods=(), internals=False)

Class wrapper that applies 1 or more decorators to every non-magic method (properties are also excluded). For example, we often want @debug to be applied to many different methods.

Parameters
  • decorators (callable) – 1 or more decorators to apply to methods within a class. By default, methods with 1 or 2 leading underscores are excluded.

  • methods (Iterable[str]) – Names of methods to wrap if you don’t want to wrap all of them. Internal methods can be wrapped but magic methods and properties cannot.

  • internals (bool) – If True, apply decorators to methods named with leading single underscores. This will be ignored if methods is specified.
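
The selection rules above can be sketched as a class decorator. This is an illustration of the described behavior rather than the library's code: `wrapmethods_sketch`, `record` and `Model` are hypothetical names, and name-mangled double-underscore methods are not treated specially here.

```python
import functools
import inspect


def wrapmethods_sketch(*decorators, methods=(), internals=False):
    """Illustrative stand-in for wrapmethods (assumed behavior)."""
    def class_decorator(cls):
        for name, attr in list(vars(cls).items()):
            if not inspect.isfunction(attr):
                continue  # skips properties, classmethods, staticmethods
            if name.startswith('__') and name.endswith('__'):
                continue  # never wrap magic methods
            if methods:
                if name not in methods:
                    continue
            elif name.startswith('_') and not internals:
                continue  # internal methods are excluded by default
            for decorator in decorators:
                attr = decorator(attr)
            setattr(cls, name, attr)
        return cls
    return class_decorator


calls = []


def record(func):
    """Toy decorator that records the name of each wrapped call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


@wrapmethods_sketch(record)
class Model:
    def __init__(self):
        self.fitted = False

    def fit(self):
        self._prepare()
        self.fitted = True

    def _prepare(self):
        pass


Model().fit()
print(calls)  # ['fit']
```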

Module contents