Module qcsv

Functions
 
read(fname, delimiter=',', skip_header=False)
read loads cell data, column headers and type information for each column, given a file path to a CSV-formatted file.
 
data(fname, delimiter=',', skip_header=False)
data loads cell data and column headers.
 
column_types(names, rows)
column_types infers type information from the columns in rows.
 
cast(types, names, rows)
cast converts every value in rows to the type of its column, as given in types.
 
convert_missing_cells(types, names, rows, dstr='', dint=0, dfloat=0.0)
convert_missing_cells changes the values of all NULL cells to the values specified by dstr, dint and dfloat.
 
convert_columns(names, rows, **kwargs)
convert_columns executes converter functions on specific columns, where the parameter names for kwargs are the column names, and the parameter values are functions of one parameter that return a single value.
 
convert_types(types, names, rows, fstr=None, fint=None, ffloat=None)
convert_types works just like convert_columns, but on types instead of specific columns.
 
column(types, names, rows, colname)
column returns the column with name "colname", where the column returned is a triple of the column type, the column name and a list of cells in the column.
 
columns(types, names, rows)
columns returns a list of all columns in the data set, where each column is a triple of its type, name and a list of cells in the column.
 
type_str(typ)
type_str returns a string representation of a column type.
 
cell_str(cell_contents)
cell_str is a convenience function for converting cell contents to a string when there are still NULL values.
 
print_data_table(types, names, rows)
print_data_table is a convenience function for pretty-printing the data in tabular format, including header names and type annotations.
Variables
  __package__ = 'qcsv'
Function Details

read(fname, delimiter=',', skip_header=False)

read loads cell data, column headers and type information for each column, given a file path to a CSV-formatted file.

All cells have left and right whitespace trimmed.

All rows MUST be the same length.

delimiter is the string that separates each field in a row.

If skip_header is set, then no column headers are read, and column names are set to their corresponding indices (as strings).
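
The loading rules above can be sketched with the standard csv module. This is a minimal illustration, not the qcsv implementation: load_rows is a hypothetical helper that reads from an open stream rather than a file path, and it assumes that with skip_header set the first line of the file is treated as data.

```python
import csv
import io


def load_rows(stream, delimiter=',', skip_header=False):
    """Return (names, rows) from a CSV stream, trimming every cell."""
    reader = csv.reader(stream, delimiter=delimiter)
    rows = [[cell.strip() for cell in row] for row in reader]
    if not rows:
        return [], []
    if skip_header:
        # No header row is read: name the columns "0", "1", ...
        names = [str(i) for i in range(len(rows[0]))]
    else:
        names, rows = rows[0], rows[1:]
    # Enforce the rule that all rows must be the same length.
    for row in rows:
        if len(row) != len(names):
            raise ValueError('expected %d cells, got %d' % (len(names), len(row)))
    return names, rows


sample = io.StringIO("name , age\n alice, 30\nbob ,41 \n")
names, rows = load_rows(sample)
```

Here names holds the trimmed header cells and rows holds the remaining lines as lists of trimmed strings.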

data(fname, delimiter=',', skip_header=False)

data loads cell data and column headers.

All cells have left and right whitespace trimmed.

All rows MUST be the same length.

delimiter and skip_header are described in read.

column_types(names, rows)

column_types infers type information from the columns in rows. Types are stored as either a Python type conversion function (str, int or float) or as a None value.

A column has type None if and only if all cells in the column are empty. (A cell is empty if the length of its value is zero after leading and trailing whitespace has been trimmed.)

A column has type float if and only if all cells in the column are empty, integers or floats AND at least one value is a float.

A column has type int if and only if all cells in the column are empty or integers AND at least one value is an int.

A column has type str in any other case.
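
The four rules above can be sketched as follows. This is an illustration under stated assumptions, not the qcsv source: infer_column_type and infer_types are hypothetical names, cells are assumed to be already-trimmed strings, and "integer" and "float" mean whatever Python's int() and float() accept.

```python
def is_int(text):
    """True if int() can parse the text."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def is_float(text):
    """True if float() can parse the text (this includes integers)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def infer_column_type(cells):
    """Return None, int, float or str for a list of trimmed string cells."""
    present = [c for c in cells if len(c) > 0]
    if not present:
        return None    # every cell is empty
    if all(is_int(c) for c in present):
        return int     # only empty cells and integers
    if all(is_float(c) for c in present):
        return float   # numeric, with at least one non-integer value
    return str         # anything else


def infer_types(names, rows):
    """Infer one type per column; rows is a list of equal-length lists."""
    return [infer_column_type([row[i] for row in rows])
            for i in range(len(names))]


types = infer_types(['a', 'b', 'c', 'd'],
                    [['1', '2.5', 'x', ''],
                     ['', '3', '4', '']])
```

Checking int before float matters: a column of integers also parses as floats, so the narrower type has to be tested first.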

cast(types, names, rows)

cast converts every value in rows to the type of its column, as given in types.

The only special cases are missing values and columns of type None. If a value is missing, or its column has type None (i.e., all values in the column are missing), then the value is replaced with None, which is Python's version of a NULL value.

N.B. cast is idempotent, i.e., casting rows that have already been cast with the same types leaves them unchanged: cast(x) = cast(cast(x)).
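
A minimal sketch of this casting rule, including a check of the idempotence property. cast_rows is a hypothetical helper, not the library function: it takes only types and rows, and it is an assumption that a missing value appears as an empty string before casting.

```python
def cast_rows(types, rows):
    """Cast each cell to its column type; missing cells become None."""
    out = []
    for row in rows:
        new_row = []
        for typ, cell in zip(types, row):
            if typ is None or cell is None or cell == '':
                new_row.append(None)       # missing value or all-empty column
            else:
                new_row.append(typ(cell))  # str, int or float
        out.append(new_row)
    return out


types = [int, float, str, None]
raw = [['1', '2.5', 'x', ''],
       ['', '3', '4', '']]
once = cast_rows(types, raw)
twice = cast_rows(types, once)
```

Casting a second time changes nothing because int, float and str return an equal value when given a value of their own type, and None stays None.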

convert_missing_cells(types, names, rows, dstr='', dint=0, dfloat=0.0)

convert_missing_cells changes the values of all NULL cells to the values specified by dstr, dint and dfloat. For example, all NULL cells in columns with type str will be replaced with the value given to dstr.
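
The replacement rule can be sketched like this. fill_missing is a hypothetical stand-in that omits the names argument. The text does not say what happens to cells in columns of type None, so leaving them as None here is an assumption.

```python
def fill_missing(types, rows, dstr='', dint=0, dfloat=0.0):
    """Replace None cells with a per-type default value."""
    defaults = {str: dstr, int: dint, float: dfloat}
    return [[defaults.get(typ) if cell is None else cell
             for typ, cell in zip(types, row)]
            for row in rows]


filled = fill_missing([int, float, str],
                      [[None, None, None],
                       [1, 2.0, 'a']],
                      dstr='?', dint=-1, dfloat=-1.0)
```

After this step a str, int or float column no longer contains None, so later code can treat every cell in it as a value of the column type.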

convert_columns(names, rows, **kwargs)

convert_columns executes converter functions on specific columns, where the parameter names for kwargs are the column names, and the parameter values are functions of one parameter that return a single value.

e.g., convert_columns(names, rows, colname=lambda s: s.lower()) would convert all values in the column with name 'colname' to lowercase.
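
A self-contained sketch of the same idea, for readers who want to see the mechanics. apply_column_converters is a hypothetical name, and returning new rows instead of modifying the input is an assumption of this sketch, not documented behavior.

```python
def apply_column_converters(names, rows, **converters):
    """Run converters[colname] on every cell of the named columns."""
    index = {name: i for i, name in enumerate(names)}
    out = [list(row) for row in rows]   # copy so the input is untouched
    for colname, func in converters.items():
        i = index[colname]              # KeyError if the column is unknown
        for row in out:
            row[i] = func(row[i])
    return out


converted = apply_column_converters(
    ['city', 'pop'],
    [['Boston', 5], ['NYC', 8]],
    city=lambda s: s.lower(),
)
```

Because column names are passed as keyword arguments, this style only works for columns whose names are valid Python identifiers, unless the caller unpacks a dictionary with **.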

convert_types(types, names, rows, fstr=None, fint=None, ffloat=None)

convert_types works just like convert_columns, but on types instead of specific columns. This function will likely be more useful, since sanitization functions are typically type-oriented rather than column-oriented.

However, when there are specific kinds of columns that need special sanitization, convert_columns should be used.
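
As an illustration of type-oriented conversion, the sketch below applies one function per column type. apply_type_converters is a hypothetical name without the names argument, and skipping a type whose converter is None is an assumption based on the None defaults in the signature.

```python
def apply_type_converters(types, rows, fstr=None, fint=None, ffloat=None):
    """Run fstr, fint or ffloat on each cell according to its column type."""
    converters = {str: fstr, int: fint, float: ffloat}
    out = []
    for row in rows:
        new_row = []
        for typ, cell in zip(types, row):
            func = converters.get(typ)
            # Leave the cell alone when no converter is given for its type.
            new_row.append(cell if func is None else func(cell))
        out.append(new_row)
    return out


cleaned = apply_type_converters(
    [str, int, float],
    [['ABC', -3, 1.5]],
    fstr=str.lower,
    fint=abs,
)
```

If the rows still contain None cells, each converter must accept None; filling missing cells first avoids that.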

cell_str(cell_contents)

cell_str is a convenience function for converting cell contents to a string when there are still NULL values.

N.B. If you choose to work with data while keeping NULL values, you will likely need to write more functions similar to this one.
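
A NULL-aware helper of this kind might look like the sketch below. cell_to_str and its null parameter are hypothetical, and the 'NULL' placeholder is an assumption, since the text does not say what cell_str prints for a missing value.

```python
def cell_to_str(cell, null='NULL'):
    """Return a printable string for a cell that may be None."""
    return null if cell is None else str(cell)


line = ', '.join(cell_to_str(c) for c in [1, None, 'x', 2.5])
```

Any other function that touches cells directly needs a similar None check while NULL values remain in the data.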