Basic Pipeline

vflow lets us build a pipeline with several perturbations (e.g. different data subsamples, models, and metrics) by wrapping the set of functions at each stage in a Vset. We can then run aggregate operations on a Vset (e.g. fitting all perturbations at once) and access the downstream results of every combination.
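The idea of applying every function at a stage to every upstream result can be illustrated without vflow. The following is a minimal plain-Python sketch, not vflow's implementation or API: the helper apply_all, the stage dictionaries, and the toy data are all hypothetical names introduced for illustration. Each output is keyed by the tuple of perturbation names that produced it.

```python
from statistics import mean, median


def apply_all(funcs, inputs):
    """Apply every named function to every named input.

    `inputs` maps tuples of perturbation names to values; the result
    extends each key with the name of the function that was applied.
    """
    return {
        in_key + (name,): func(value)
        for in_key, value in inputs.items()
        for name, func in funcs.items()
    }


# Toy data, made up for this sketch.
data = {("all",): [1, 2, 3, 4, 100]}

# Stage 1: two ways of subsampling the data.
subsamplers = {
    "first_half": lambda xs: xs[: len(xs) // 2],
    "drop_max": lambda xs: sorted(xs)[:-1],
}

# Stage 2: two summaries standing in for "models" or "metrics".
summaries = {"mean": mean, "median": median}

subsampled = apply_all(subsamplers, data)
results = apply_all(summaries, subsampled)

for key, value in sorted(results.items()):
    print(key, value)
```

Two subsamplers followed by two summaries give four results, one per path through the pipeline. Keeping the full path in each key is what makes it possible to trace any downstream result back to the choices that produced it.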

Our pipeline can be visualized from any stage using build_graph(vset, draw=True).

Vset outputs can be easily converted to pandas dataframes using dict_to_df(out).
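To make the shape of such a conversion concrete, here is a standard-library sketch of flattening a tuple-keyed output dictionary into one row per perturbation combination. It assumes that outputs are keyed by tuples of stage labels; dict_to_rows, the stage names, and the scores are hypothetical placeholders, and the real dict_to_df returns a pandas dataframe rather than a list of dicts.

```python
def dict_to_rows(out, stage_names):
    """Flatten {(label_1, label_2, ...): value} into a list of row dicts.

    Each row has one column per pipeline stage plus an "out" column
    holding the value computed for that combination.
    """
    rows = []
    for key, value in out.items():
        if len(key) != len(stage_names):
            raise ValueError(f"key {key!r} does not match stages {stage_names!r}")
        row = dict(zip(stage_names, key))
        row["out"] = value
        rows.append(row)
    return rows


# Placeholder scores, made up for this sketch (not real results).
out = {
    ("subsample_0", "ridge", "r2"): 0.25,
    ("subsample_0", "forest", "r2"): 0.5,
    ("subsample_1", "ridge", "r2"): 0.75,
    ("subsample_1", "forest", "r2"): 1.0,
}

rows = dict_to_rows(out, ["subsample", "model", "metric"])
for row in rows:
    print(row)
```

A tabular layout with one column per stage is convenient because downstream tools can filter or group on any stage by name.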

We can then compute aggregate statistics on specified pipeline stages using perturbation_stats(data, *group_by).
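The kind of grouped summary described here can be sketched with the standard library alone. The example below assumes one row per perturbation combination and reports count, mean, and sample standard deviation per group; group_stats, its value_col parameter, and the scores are hypothetical and do not reproduce vflow's function or its output format.

```python
from collections import defaultdict
from statistics import mean, stdev


def group_stats(rows, *group_by, value_col="out"):
    """Summarize `value_col` within each combination of `group_by` columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[col] for col in group_by)].append(row[value_col])
    return {
        key: {
            "count": len(values),
            "mean": mean(values),
            # Sample standard deviation needs at least two values.
            "std": stdev(values) if len(values) > 1 else 0.0,
        }
        for key, values in groups.items()
    }


# Placeholder scores, made up for this sketch (not real results).
rows = [
    {"subsample": "subsample_0", "model": "ridge", "out": 0.25},
    {"subsample": "subsample_0", "model": "forest", "out": 0.5},
    {"subsample": "subsample_1", "model": "ridge", "out": 0.75},
    {"subsample": "subsample_1", "model": "forest", "out": 1.0},
]

stats = group_stats(rows, "model")
for key, summary in stats.items():
    print(key, summary)
```

Grouping by the model stage averages over the subsample perturbations, so the spread within each group indicates how sensitive that model's score is to the choice of subsample.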

Feature Engineering Pipeline

This vflow pipeline predicts disease progression using the diabetes dataset (regression).