lipd package

Module contents

lipd.addEnsemble(D, dsn, ensemble)

Create ensemble entry and then add it to the specified LiPD dataset.

Parameters:
  • D (dict) – LiPD data
  • dsn (str) – Dataset name
  • ensemble (list) – Nested numpy array of ensemble column data.
Return dict D:

LiPD data

lipd.collapseTs(ts=None)

Collapse a time series back into LiPD record form.

Example
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)
3. New_D = lipd.collapseTs(ts)
Parameters:ts (list) – Time series
Return dict:Metadata
lipd.doi()

Update publication information using data DOIs. Updates LiPD files on disk, not in memory.

Example
1: lipd.readLipd()
2: lipd.doi()
Return none:
lipd.ensToDf(ensemble)

Create an ensemble data frame from some given nested numpy arrays

Parameters:ensemble (list) – Ensemble data
Return obj df:Pandas dataframe
lipd.excel()

Convert Excel files to LiPD files. LiPD data is returned directly from this function.

Example
1: lipd.readExcel()
2: D = lipd.excel()
Return dict _d:Metadata
lipd.extractTs(d, chron=False)

Create a time series using LiPD data (uses paleoData by default)

Example : paleoData
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)
Example : chronData
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D, chron=True)
Parameters:
  • d (dict) – Metadata
  • chron (bool) – Create a chronData time series
Return list l:

Time series

lipd.filterTs(ts, expression)

Create a new time series that only contains entries that match the given expression.

Example:
D = lipd.readLipd()
ts = lipd.extractTs(D)
new_ts = lipd.filterTs(ts, "archiveType == marine sediment")
new_ts = lipd.filterTs(ts, "paleoData_variableName == sst")
Parameters:
  • expression (str) – Expression
  • ts (list) – Time series
Return list new_ts:
 

Filtered time series that matches the expression
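The filtering idea can be sketched in plain Python without the package. This is a minimal illustration, not lipd's implementation: the helper name filter_entries is hypothetical, each time series entry is assumed to be a flat dictionary, and only the == operator is handled.

```python
def filter_entries(ts, expression):
    """Return (matching entries, their indices) for a 'key == value' expression."""
    key, sep, value = expression.partition("==")
    if not sep:
        raise ValueError("expected an expression of the form 'key == value'")
    key, value = key.strip(), value.strip()
    matches, indices = [], []
    for idx, entry in enumerate(ts):
        # Compare as strings so numeric and text fields are treated alike.
        if str(entry.get(key, "")).strip() == value:
            matches.append(entry)
            indices.append(idx)
    return matches, indices


# Toy time series: a list of flat dictionaries with made-up values.
ts = [
    {"archiveType": "marine sediment", "paleoData_variableName": "sst"},
    {"archiveType": "speleothem", "paleoData_variableName": "d18O"},
    {"archiveType": "marine sediment", "paleoData_variableName": "d18O"},
]
new_ts, idxs = filter_entries(ts, "archiveType == marine sediment")
```

Returning both the entries and their indices mirrors the split between filterTs (entries) and queryTs (indices) in this module.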

lipd.getCsv(L=None)

Get CSV from LiPD metadata

Example
c = lipd.getCsv(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters:L (dict) – One LiPD record
Return dict d:CSV data
lipd.getLipdNames(D=None)

Get a list of all LiPD names in the library

Example
names = lipd.getLipdNames(D)
Return list f_list:
 File list
lipd.getMetadata(L)

Get metadata from LiPD data in memory

Example
m = lipd.getMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters:L (dict) – One LiPD record
Return dict d:LiPD record (metadata only)
lipd.noaa(d=None)

Convert between NOAA and LiPD files

Example: LiPD to NOAA converter
1: D = lipd.readLipd()
2: lipd.noaa(D)
Example: NOAA to LiPD converter
1: lipd.readNoaa()
2: lipd.noaa()
Return none:
lipd.queryTs(ts, expression)

Find the indices of the time series entries that match the given expression.

Example:
D = lipd.readLipd()
ts = lipd.extractTs(D)
matches = lipd.queryTs(ts, "archiveType == marine sediment")
matches = lipd.queryTs(ts, "geo_meanElev <= 2000")
Parameters:
  • expression (str) – Expression
  • ts (list) – Time series
Return list _idx:
 

Indices of entries that match the criteria

lipd.readAll(usr_path='')

Read all approved file types at once. Enter a file path, directory path, or leave args blank to trigger gui.

Parameters:usr_path (str) – Path to file / directory (optional)
Return str cwd:Current working directory
lipd.readExcel(usr_path='')

Read Excel file(s). Enter a file path, directory path, or leave args blank to trigger gui.

Parameters:usr_path (str) – Path to file / directory (optional)
Return str cwd:Current working directory
lipd.readLipd(usr_path='')

Read LiPD file(s). Enter a file path, directory path, or leave args blank to trigger gui.

Parameters:usr_path (str) – Path to file / directory (optional)
Return dict _d:Metadata
lipd.readNoaa(usr_path='')

Read NOAA file(s). Enter a file path, directory path, or leave args blank to trigger gui.

Parameters:usr_path (str) – Path to file / directory (optional)
Return str cwd:Current working directory
lipd.run()

Initialize and start objects. This is called automatically when importing the package.

Return none:
lipd.showDfs(d)

Display the available data frame names in a given data frame collection

Parameters:d (dict) – Dataframe collection
Return none:
lipd.showLipds(D=None)

Display the dataset names in the given LiPD data

Example
lipd.showLipds(D)
Parameters:D (dict) – LiPD data
Return none:
lipd.showMetadata(dat)

Display the metadata of the specified LiPD record in pretty print

Example
lipd.showMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters:dat (dict) – Metadata
Return none:
lipd.tsToDf(tso)

Create Pandas DataFrames from one time series entry. Usage: first call extractTs to get a time series, then pick one entry from it and pass that entry to this function.

Parameters:tso (dict) – Time series entry
Return dict dfs:
 Pandas dataframes
lipd.viewTs(ts)

View the contents of one time series entry in a nicely formatted way

Example
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)
3. lipd.viewTs(ts[0])
Parameters:ts (dict) – One time series entry
Return none:
lipd.writeLipd(dat, usr_path='', filename='')

Write LiPD data to file(s)

Parameters:
  • dat (dict) – Metadata
  • usr_path (str) – Destination (optional)
  • filename (str) – LiPD filename, for writing one specific file (optional)
Return none:

Submodules

alternates

List of alternate and synonym keys

bag

lipd.bag.create_bag(dir_bag)

Create a Bag out of given files. :param str dir_bag: Directory that contains csv, jsonld, and changelog files. :return obj: Bag

lipd.bag.finish_bag(dir_bag)

Closing steps for creating a bag :param obj dir_bag: :return None:

lipd.bag.open_bag(dir_bag)

Open Bag at the given path :param str dir_bag: Path to Bag :return obj: Bag

lipd.bag.resolved_flag(bag)

Check DOI flag in bag.info to see if doi_resolver has been previously run :param obj bag: Bag :return bool: Flag

lipd.bag.validate_md5(bag)

Check if Bag is valid :param obj bag: Bag :return None:

blanks

List of empty and ignored keys

csvs

lipd.csvs.get_csv_from_metadata(name, metadata)

Two goals. Get all csv from metadata, and return new metadata with generated filenames to match files. :param str name: LiPD dataset name :param dict metadata: Metadata :return dict: Csv Data

lipd.csvs.merge_csv_metadata(d)

Using the given metadata dictionary, retrieve CSV data from CSV files, and insert the CSV values into their respective metadata columns. Checks for both paleoData and chronData tables. :param dict d: Metadata :return dict: Modified metadata dictionary

lipd.csvs.read_csv_from_file(filename)

Opens the target CSV file and creates a dictionary with one list for each CSV column. :param str filename: :return list of lists: column values
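As a rough sketch of the "one list per column" layout, the following uses only the standard csv module. The function name read_columns is hypothetical, and the sketch assumes a header-less file, integer column keys (as in the write_csv_to_file description), and values left as strings; the package's own reader may differ on all three points.

```python
import csv
import os
import tempfile


def read_columns(filename):
    """Read a header-less CSV file into {column_index: [values, ...]}."""
    columns = {}
    with open(filename, newline="") as f:
        for row in csv.reader(f):
            for idx, cell in enumerate(row):
                columns.setdefault(idx, []).append(cell)
    return columns


# Write a small two-column file with made-up values, then read it back.
path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows([[1950, 12.1], [1960, 12.4], [1970, 11.9]])

cols = read_columns(path)
```

Here cols maps 0 to the year column and 1 to the measurement column, each as a list of strings.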

lipd.csvs.write_csv_to_file(d)

Writes columns of data to a target CSV file. :param dict d: A dictionary containing one list for every data column. Keys: int, Values: list :return None:

dataframes

lipd.dataframes.create_dataframe(ensemble)

Create a data frame from given nested lists of ensemble data :param list ensemble: Ensemble data :return obj: Dataframe

lipd.dataframes.get_filtered_dfs(lib, expr)

Main: Get all data frames that match the given expression :return dict: Filenames and data frames (filtered)

lipd.dataframes.lipd_to_df(metadata, csvs)

Create an organized collection of data frames from LiPD data :param dict metadata: LiPD data :param dict csvs: Csv data :return dict: One data frame per table, organized in a dictionary by name

lipd.dataframes.ts_to_df(metadata)

Create a data frame from one TimeSeries object :param dict metadata: Time Series dictionary :return dict: One data frame per table, organized in a dictionary by name

directory

lipd.directory.browse_dialog_dir()

Open up a GUI browse dialog window and let the user pick a target directory. :return str: Target directory path

lipd.directory.browse_dialog_file()

Open up a GUI browse dialog window and let the user select one or more files. :return str _path: Target directory path :return list _files: List of selected files

lipd.directory.check_file_age(filename, days)

Check if the target file has an older creation date than X amount of time. i.e. One day: 60*60*24 :param str filename: Target filename :param int days: Limit in number of days :return bool: True - older than X time, False - not older than X time
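A file-age check of this kind might look like the sketch below. It is not the package's code: the name is_older_than is hypothetical, and it compares against the modification time (os.path.getmtime) as an assumption, since a true creation time is not available on every platform.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 60 * 60 * 24


def is_older_than(filename, days):
    """True if the file was last modified more than `days` days ago."""
    age_seconds = time.time() - os.path.getmtime(filename)
    return age_seconds > days * SECONDS_PER_DAY


# Create a fresh file, then backdate it by three days to exercise both cases.
fd, path = tempfile.mkstemp()
os.close(fd)
fresh = is_older_than(path, 1)

three_days_ago = time.time() - 3 * SECONDS_PER_DAY
os.utime(path, (three_days_ago, three_days_ago))
stale = is_older_than(path, 1)
os.remove(path)
```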

lipd.directory.collect_metadata_file(full_path)

Create the file metadata and add it to the appropriate section by file-type :param str full_path: :param dict existing_files: :return dict existing files:

lipd.directory.collect_metadata_files(cwd, new_files, existing_files)

Collect all files from a given path. Separate by file type, and return one list for each type If ‘files’ contains specific :param str cwd: Directory w/ target files :param list new_files: Specific new files to load :param dict existing_files: Files currently loaded, separated by type :return list: All files separated by type

lipd.directory.create_tmp_dir()

Creates tmp directory in OS temp space. :return str: Path to tmp directory

lipd.directory.dir_cleanup(dir_bag, dir_data)

Moves JSON and csv files to bag root, then deletes all the metadata bag files. We’ll be creating a new bag with the data files, so we don’t need the other text files and such. :param str dir_bag: Path to root of Bag :param str dir_data: Path to Bag /data subdirectory :return None:

lipd.directory.filename_from_path(path)

Extract the file name from a given file path. :param str path: File path :return str: File name with extension

lipd.directory.find_files()

Search for the directory containing jsonld and csv files. chdir and then quit. :return none:

lipd.directory.get_filenames_generated(d, name='', csvs='')

Get the filenames that the LiPD utilities has generated (per naming standard), as opposed to the filenames that originated in the LiPD file (that possibly don’t follow the naming standard) :param dict d: Data :param str name: LiPD dataset name to prefix :param list csvs: Filenames list to merge with :return list: Filenames

lipd.directory.get_filenames_in_lipd(path, name='')

List all the files contained in the LiPD archive. Bagit, JSON, and CSV :param str path: Directory to be listed :param str name: LiPD dataset name, if you want to prefix it to show file hierarchy :return list: Filenames found

lipd.directory.get_src_or_dst(mode, path_type)

User sets the path to a LiPD source location :param str mode: “read” or “write” mode :param str path_type: “directory” or “file” :return str path: dir path to files :return list files: files chosen

lipd.directory.get_src_or_dst_path(prompt, count)

Let the user choose a path, and store the value. :return str _path: Target directory :return str count: Counter for attempted prompts

lipd.directory.get_src_or_dst_prompt(mode)

String together the proper prompt based on the mode :param str mode: “read” or “write” :return str prompt: The prompt needed

lipd.directory.list_files(x, path='')

Lists file(s) in given path of the X type. :param str x: File extension that we are interested in. :param str path: Path, if user would like to check a specific directory outside of the CWD :return list of str: File name(s) to be worked on

lipd.directory.rm_file_if_exists(path, filename)

Remove a file if it exists. Useful when we want to write a file that already exists in that location. :param str filename: Filename :param str path: Directory :return none:

lipd.directory.rm_files_in_dir(path)

Removes all files within a directory, but does not delete the directory :param str path: Target directory :return none:

doi_main

lipd.doi_main.doi_main(files)

Main function that controls the script. Take in directory containing the .lpd file(s). Loop for each file. :return None:

lipd.doi_main.process_lpd(name, dir_tmp)

Opens up json file, invokes doi_resolver, closes file, updates changelog, cleans directory, and makes new bag. :param str name: Name of current .lpd file :param str dir_tmp: Path to tmp directory :return none:

lipd.doi_main.prompt_force()

Ask the user if they want to force update files that were previously resolved :return bool: response

doi_resolver

class lipd.doi_resolver.DOIResolver(dir_root, name, root_dict)

Bases: object

Use DOI id(s) to pull updated publication info from doi.org and overwrite file data.

Input: Original publication dictionary Output: Updated publication dictionary (success), original publication dictionary (fail)

static compare_replace(pub_dict, fetch_dict)

Take in our Original Pub, and Fetched Pub. For each Fetched entry that has data, overwrite the Original entry :param pub_dict: (dict) Original pub dictionary :param fetch_dict: (dict) Fetched pub dictionary from doi.org :return: (dict) Updated pub dictionary, with fetched data taking precedence

static compile_authors(authors)

Compiles authors “Last, First” into a single list :param list authors: Raw author data retrieved from doi.org :return list: Author objects

static compile_date(date_parts)

Compiles date only using the year :param list date_parts: List of date parts retrieved from doi.org :return str: Date string or NaN

compile_fetch(raw, doi_id)

Loop over Raw and add selected items to Fetch with proper formatting :param dict raw: JSON data from doi.org :param str doi_id: :return dict:

find_doi(curr_dict)

Recursively search the file for the DOI id. More taxing, but more flexible when dictionary structuring isn’t absolute :param dict curr_dict: Current dictionary being searched :return dict bool: Recursive - Current dictionary, False flag that DOI was not found :return str bool: Final - DOI id, True flag that DOI was found

get_data(doi_id, idx)

Resolve DOI and compile all attributes into one dictionary :param str doi_id: :param int idx: Publication index :return dict: Updated publication dictionary

illegal_doi(doi_string)

DOI string did not match the regex. Determine what the data is. :param doi_string: (str) Malformed DOI string :return: None

main()

Main function that gets file(s), creates outputs, and runs all operations. :return dict: Updated or original data for jsonld file

noaa_citation(doi_string)

Special instructions for moving noaa data to the correct fields :param doi_string: (str) NOAA url :return: None

remove_empties(pub)

ensembles

lipd.ensembles.create_ensemble(ensemble)

Add ensemble data to a LiPD object :param list ensemble: Ensemble data nested lists :return dict: Structured Ensemble data

lipd.ensembles.insert_ensemble(d, ens)

Insert the ensemble table dictionary into the LiPD metadata :param dict d: LiPD metadata :param dict ens: Ensemble data to insert :return dict:

excel

lipd.excel.cells_dn_meta(workbook, sheet, row, col, final_dict)

Traverse all cells in a column moving downward. Primarily created for the metadata sheet, but may use elsewhere. Check the cell title, and switch it to. :param obj workbook: :param str sheet: :param int row: :param int col: :param dict final_dict: :return: none

lipd.excel.cells_rt_meta(workbook, sheet, row, col)

Traverse all cells in a row. If you find new data in a cell, add it to the list. :param obj workbook: :param str sheet: :param int row: :param int col: :return list: Cell data for a specific row

lipd.excel.cells_rt_meta_pub(workbook, sheet, row, col, pub_qty)

Publication section is special. It’s possible there’s more than one publication. :param obj workbook: :param str sheet: :param int row: :param int col: :param int pub_qty: Number of distinct publication sections in this file :return list: Cell data for a specific row

lipd.excel.compile_authors(cell)

Split the string of author names into the BibJSON format. :param str cell: Data from author cell :return: (list of dicts) Author names

lipd.excel.compile_fund(workbook, sheet, row, col)

Compile funding entries. Iter both rows at the same time. Keep adding entries until both cells are empty. :param obj workbook: :param str sheet: :param int row: :param int col: :return list of dict: l

lipd.excel.compile_geo(d)

Compile top-level Geography dictionary. :param d: :return:

lipd.excel.compile_geometry(lat, lon, elev)

Take in lists of lat and lon coordinates, and determine what geometry to create :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:

lipd.excel.compile_temp(d, key, value)

Compiles temporary dictionaries for metadata. Adds a new entry to an existing dictionary. :param dict d: :param str key: :param any value: :return dict:

lipd.excel.count_chron_variables(temp_sheet)

Count the number of chron variables :param obj temp_sheet: :return int: variable count

lipd.excel.excel_main(file)

Parse data from Excel spreadsheets into LiPD files. :return list: Filenames of LiPD files created

lipd.excel.extract_short(string_in)

Extract the short name from a string that also has units. :param str string_in: :return str:

lipd.excel.extract_units(string_in)

Extract units from parentheses in a string, e.g. “elevation (meters)”. :param str string_in: :return str:
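The short-name and units split described for extract_short and extract_units can be illustrated with one regular expression. This is a sketch under assumptions: the helper name split_name_and_units is hypothetical, and it takes the first parenthesised group as the units.

```python
import re

# First "(...)" group in the string; the captured text is treated as the units.
UNITS_PATTERN = re.compile(r"\(([^)]*)\)")


def split_name_and_units(text):
    """Split 'elevation (meters)' into ('elevation', 'meters')."""
    match = UNITS_PATTERN.search(text)
    if match is None:
        return text.strip(), ""
    short = text[:match.start()].strip()
    return short, match.group(1).strip()


name, units = split_name_and_units("elevation (meters)")
```

A header without parentheses is returned unchanged with an empty units string.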

lipd.excel.geometry_linestring(lat, lon, elev)

GeoJSON Linestring. Latitude and Longitude have 2 values each. :param list lat: Latitude values :param list lon: Longitude values :return dict:

lipd.excel.geometry_point(lat, lon, elev)

GeoJSON point. Latitude and Longitude only have one value each :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:

lipd.excel.geometry_range(crd_range, elev, crd_type)

Range of coordinates. (e.g. 2 latitude coordinates, and 0 longitude coordinates) :param crd_range: Latitude or Longitude values :param elev: Elevation value :param crd_type: Coordinate type, lat or lon :return dict:

lipd.excel.get_chron_data(temp_sheet, row, total_vars)

Capture all data for a specific chron data row (for csv output) :param obj temp_sheet: :param int row: :param int total_vars: :return list: data_row

lipd.excel.get_chron_var(temp_sheet, start_row)

Capture all the vars in the chron sheet (for json-ld output) :param obj temp_sheet: :param int start_row: :return: (list of dict) column data

lipd.excel.instance_str(cell)

Match data type and return string :param any cell: :return str:

lipd.excel.logger_excel = <logging.Logger object>

VERSION: LiPD v1.2

lipd.excel.name_to_jsonld(title_in)

Convert formal titles to camelcase json_ld text that matches our context file. Keep a growing list of all titles that are being used in the json_ld context. :param str title_in: :return str:

lipd.excel.traverse_to_chron_data(temp_sheet)

Traverse down to the first row that has chron data :param obj temp_sheet: :return int: traverse_row

lipd.excel.traverse_to_chron_var(temp_sheet)

Traverse down to the row that has the first variable :param obj temp_sheet: :return int:

inferred_data

lipd.inferred_data.get_inferred_data_table(pc, table)

Table level: Dive down, calculate data, then return the new table with the inferred data. :param str pc: Paleo or Chron table type :param dict table: Table data :return dict table: Table with new data

io

jsons

lipd.jsons.get_csv_from_json(d)

Get CSV values when mixed into json data. Pull out the CSV data and put it into a dictionary. :param dict d: JSON with CSV values :return dict: CSV values (i.e. { CSVFilename1: { Column1: [Values], Column2: [Values] }, CSVFilename2: … })

lipd.jsons.idx_name_to_num(d)

Switch from index-by-name to index-by-number. :param dict d: Metadata :return dict: Modified metadata

lipd.jsons.idx_num_to_name(d)

Switch from index-by-number to index-by-name. :param dict d: Metadata :return dict: Modified Metadata

lipd.jsons.read_json_from_file(filename)

Import the JSON data from target file. :param str filename: Target File :return dict: JSON data

lipd.jsons.read_jsonld()

Find jsonld file in the cwd (or within 2 levels below cwd), and load it in. :return dict: Jsonld data

lipd.jsons.remove_csv_from_json(d)

Remove all CSV data ‘values’ entries from paleoData table in the JSON structure. :param dict d: JSON data - old structure :return dict: Metadata dictionary without CSV values

lipd.jsons.write_json_to_file(json_data, filename='metadata')

Write all JSON in python dictionary to a new json file. :param dict json_data: JSON data :param str filename: Target filename (defaults to ‘metadata.jsonld’) :return None:
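A write-then-read round trip with the standard json module could look roughly like this. The names write_json and read_json are hypothetical, and appending the .jsonld extension to a bare filename is an assumption based on the "defaults to ‘metadata.jsonld’" note above.

```python
import json
import os
import tempfile


def write_json(data, filename="metadata"):
    """Write `data` to '<filename>.jsonld' and return the path written."""
    path = filename if filename.endswith(".jsonld") else filename + ".jsonld"
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, sort_keys=True)
    return path


def read_json(path):
    """Load JSON data from `path` into a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Made-up record written to a temporary directory and read back.
record = {"dataSetName": "Example.Dataset.2013", "lipdVersion": 1.3}
target = os.path.join(tempfile.mkdtemp(), "metadata")
written = write_json(record, target)
round_trip = read_json(written)
```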

loggers

lipd.loggers.create_benchmark(name, log_file, level=20)

Creates a logger for function benchmark times :param str name: Name of the logger :param str log_file: Filename :return obj: Logger

lipd.loggers.create_logger(name)

Creates a logger with the below attributes. :param str name: Name of the logger :return obj: Logger

lipd.loggers.log_benchmark(fn, start, end)

Log a given function and how long the function takes in seconds :param str fn: Function name :param float start: Function start time :param float end: Function end time :return none:

lipd.loggers.update_changelog()

Create or update the changelog txt file. Prompt for update description. :return None:

lpd_noaa

class lipd.lpd_noaa.LPD_NOAA(dir_root, name, lipd_dict)

Bases: object

Creates a NOAA object that contains all the functions needed to write out a LiPD file as a NOAA text file. Supports LiPD Version: v1.2 NOAA txt template: v3.0

Return none:Writes NOAA text to file in local storage
get_master()

Get the master json that has been modified :return dict: self.lipd_data

get_wdc_paleo_url()

When a NOAA file is created, it creates a URL link to where the dataset will be hosted in NOAA’s archive. Retrieve and add this link to the original LiPD file, so we can trace the dataset to NOAA. :return str:

main()

Load in the template file, and run through the parser :return none:

misc

lipd.misc.cast_float(x)

Attempt to cleanup string or convert to number value. :param any x: :return float:

lipd.misc.cast_int(x)

Cast unknown type into integer :param any x: :return int:

lipd.misc.cast_values_csvs(d, idx, x)

Attempt to cast string to float. If error, keep as a string. :param dict d: Data :param int idx: Index number :param str x: Data :return any:
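The "cast if possible, otherwise keep the original" pattern behind the casting helpers can be sketched as follows. The function name to_float is hypothetical and the sketch ignores the d and idx arguments, so treat it as an illustration of the pattern only.

```python
def to_float(x):
    """Return x as a float when possible; otherwise return it unchanged."""
    try:
        return float(x)
    except (TypeError, ValueError):
        # Not numeric (for example "n/a" or None): keep the original value.
        return x


values = [to_float(v) for v in ["12.5", " 7 ", "n/a", None]]
```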

lipd.misc.check_dsn(name, _json)

Get a dataSetName. If one is not provided, then insert the filename as the dataSetName. :param str name: Filename w/o extension :param dict _json: Metadata :return dict _json: Metadata

lipd.misc.clean_doi(doi_string)

Use regex to extract all DOI ids from string (i.e. 10.1029/2005pa001215) :param str doi_string: Raw DOI string value from input file. Often not properly formatted. :return list: DOI ids. May contain 0, 1, or multiple ids.
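A DOI extraction of this kind might be sketched with re.findall. The pattern below is an assumption, not the regex used by the package, and the second DOI in the sample string is made up for illustration.

```python
import re

# Assumed pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# that runs until whitespace or a list separator.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s,;]+")


def find_dois(raw):
    """Return every DOI-like id found in `raw` (possibly an empty list)."""
    return DOI_PATTERN.findall(raw)


ids = find_dois(
    "doi:10.1029/2005pa001215; http://dx.doi.org/10.1016/j.example.2012.01.001"
)
```

Returning a list in every case matches the "0, 1, or multiple ids" contract stated above.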

lipd.misc.fix_coordinate_decimal(d)

Coordinate decimal degrees calculated by an excel formula are often too long as a repeating decimal. Round them down to 5 decimals :param dict d: Metadata :return dict d: Metadata

lipd.misc.generate_timestamp(fmt=None)

Generate a timestamp to mark when this file was last modified. :param str fmt: Special format instructions :return str: YYYY-MM-DD format, or specified format

lipd.misc.generate_tsid(size=8)

Generate a TSid string. Use the “PYT” prefix for traceability, followed by 8 generated characters, e.g. PYT9AG234GS. :return:
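An id in this shape could be generated as sketched below. The name make_tsid is hypothetical, and the alphabet (uppercase letters plus digits) is an assumption inferred from the sample id above.

```python
import random
import string


def make_tsid(size=8):
    """Return 'PYT' followed by `size` random uppercase letters or digits."""
    alphabet = string.ascii_uppercase + string.digits
    return "PYT" + "".join(random.choices(alphabet, k=size))


tsid = make_tsid()
```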

lipd.misc.get_appended_name(name, columns)

Append numbers to a name until it no longer conflicts with the other names in a column. Necessary to avoid overwriting columns and losing data. Loop a preset amount of times to avoid an infinite loop. There shouldn’t ever be more than two or three identical variable names in a table. :param str name: Variable name in question :param dict columns: Columns listed by variable name :return str: Appended variable name
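The bounded "append a number until the name is free" loop can be sketched like this. The helper name append_until_unique, the "-N" suffix format, and the limit of 20 attempts are all assumptions made for illustration.

```python
def append_until_unique(name, columns, max_tries=20):
    """Return `name`, or `name-1`, `name-2`, ... until it is not in `columns`."""
    if name not in columns:
        return name
    # Bounded loop: give up rather than risk spinning forever.
    for n in range(1, max_tries + 1):
        candidate = "{}-{}".format(name, n)
        if candidate not in columns:
            return candidate
    raise ValueError("no free name found for {!r}".format(name))


# Made-up table columns, keyed by variable name.
columns = {"depth": [], "d18O": [], "d18O-1": []}
new_name = append_until_unique("d18O", columns)
```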

lipd.misc.get_authors_as_str(x)

Take author or investigator data, and convert it to a concatenated string of names. Author data structure has a few variations, so account for all. :param any x: Author data :return str: Author string

lipd.misc.get_dsn(d)

Get the dataset name from a record :param dict d: Metadata :return str: Dataset name

lipd.misc.get_ensemble_counts(d)

Determine if this is a 1 or 2 column ensemble. Then determine how many columns and rows it has. :param d: :return:

lipd.misc.get_missing_value_key(d)

Get the Missing Value entry from a table of data. If none is found, try the columns. If still none found, prompt user. :param dict d: Table of data :return str: Missing Value

lipd.misc.get_table_key(key, d, fallback='')

Try to get a table name from a data table :param str key: Key to try first :param dict d: Data table :param str fallback: (optional) If we don’t find a table name, use this as a generic name fallback. :return str: Data table name

lipd.misc.get_variable_name_col(d)

Get the variable name from a table or column :param dict d: Metadata :return str:

lipd.misc.is_ensemble(d)

Check if a table of data is an ensemble table. Is the first values index a list? ensemble. Int/float? not ensemble. :param dict d: Table data :return bool: Ensemble or not ensemble

lipd.misc.load_fn_matches_ext(file_path, file_type)

Check that the file extension matches the target extension given. :param str file_path: Path to be checked :param str file_type: Target extension :return bool:

lipd.misc.match_arr_lengths(l)

Check that all the array lengths match so that a DataFrame can be created successfully. :param list l: Nested arrays :return bool: Valid or invalid

lipd.misc.match_operators(inp, relate, cut)

Compare two items. Match a string operator to an operator function :param str inp: Comparison item :param str relate: Comparison operator :param any cut: Comparison item :return bool: Comparison truth
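Matching an operator string to an operator function is commonly done with a lookup table over the standard operator module. The sketch below assumes both items have already been cast to comparable types; the name compare and the set of supported operators are hypothetical.

```python
import operator

# Operator strings mapped to the functions that implement them.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def compare(inp, relate, cut):
    """Apply the operator named by `relate` to `inp` and `cut`."""
    try:
        fn = OPERATORS[relate]
    except KeyError:
        raise ValueError("unsupported operator: {!r}".format(relate))
    return fn(inp, cut)


below_limit = compare(1500.0, "<=", 2000)
```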

lipd.misc.mv_files(src, dst)

Move all files from one directory to another :param str src: Source directory :param str dst: Destination directory :return none:

lipd.misc.normalize_name(s)

Remove foreign accents and characters to normalize the string. Prevents encoding errors. :param str s: :return str:
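One common way to strip accents is Unicode decomposition followed by an ASCII encode that ignores what is left over. This is a sketch of that technique, not necessarily what the package does; strip_accents is a hypothetical name.

```python
import unicodedata


def strip_accents(s):
    """Decompose accented characters and drop anything outside ASCII."""
    decomposed = unicodedata.normalize("NFKD", s)
    return decomposed.encode("ascii", "ignore").decode("ascii")


clean = strip_accents("Côte-Nord")
```

Note that characters with no ASCII decomposition are dropped entirely by this approach rather than transliterated.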

lipd.misc.path_type(path, target)

Determine if given path is file, directory, or other. Compare with target to see if it’s the type we wanted. :param str path: Path :param str target: Target type wanted :return bool:

lipd.misc.prompt_protocol()

Prompt user if they would like to save pickle file as a dictionary or an object. :return str: Answer

lipd.misc.put_tsids(x)

Recursively add TSids to any columns that do not have them. Look for “columns” keys, then loop over each column and add a generated TSid. :param any x: Recursive, so could be any data type. :return any x: Recursive, so could be any data type.

lipd.misc.rm_empty_doi(d)

If an “identifier” dictionary has no doi ID, then it has no use. Delete it. :param dict d: JSON Metadata :return dict: JSON Metadata

lipd.misc.rm_empty_fields(x)

Recursively go through any number of nested data types and remove all empty entries. :param any x: Dictionary, List, or String of data :return any: Returns the same data type as the original, but without empties.
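A recursive clean-up of nested dictionaries and lists might be sketched as follows. What counts as "empty" here (empty string, None, empty list, empty dict) is an assumption, and remove_empties is a hypothetical name; zero and False are deliberately kept.

```python
EMPTY_VALUES = ("", None, [], {})


def remove_empties(x):
    """Recursively drop empty strings, None, and empty containers."""
    if isinstance(x, dict):
        cleaned = {k: remove_empties(v) for k, v in x.items()}
        return {k: v for k, v in cleaned.items() if v not in EMPTY_VALUES}
    if isinstance(x, list):
        cleaned = [remove_empties(v) for v in x]
        return [v for v in cleaned if v not in EMPTY_VALUES]
    return x


# Made-up record with empties at several nesting levels.
record = {
    "pub": [{"doi": "", "title": "A"}, {}],
    "geo": {"notes": None},
    "values": [1, 0, ""],
}
cleaned = remove_empties(record)
```

Children are cleaned before their parent is tested, so a container that becomes empty after cleaning (such as "geo" above) is removed as well.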

lipd.misc.rm_files(path, extension)

Remove all files in the given directory with the given extension :param str path: Directory :param str extension: File type to remove :return none:

lipd.misc.rm_keys_from_dict(d, keys)

Given a dictionary and a key list, remove any data in the dictionary with the given keys. :param dict d: Data :param list keys: List of key data to remove :return dict d: Data (with keys + data removed)

lipd.misc.rm_missing_values_table(d)

Loop for each table column and remove the missingValue key & data :param dict d: Table data :return dict d: Table data

lipd.misc.rm_values_fields(x)

(recursive) Remove all “values” fields from the metadata :param any x: Any data type :return dict: metadata without “values”

lipd.misc.split_path_and_file(s)

Given a full path to a file, split and return a path and filename :param str s: Full path :return str str: Path, filename

lipd.misc.unwrap_arrays(l)

Unwrap nested lists to be one “flat” list of lists. Mainly for prepping ensemble data for DataFrame() creation :param list l: Nested lists :return list: Flattened lists

noaa

lipd.noaa.lpd_to_noaa(obj)

Convert a LiPD format to NOAA format :param obj obj: LiPD object :return obj: LiPD object (modified)

lipd.noaa.noaa_prompt()

Convert between NOAA and LiPD file formats. :return:

lipd.noaa.noaa_to_lpd(files)

Convert NOAA format to LiPD format :param dict files: Files metadata :return None:

noaa_lpd

class lipd.noaa_lpd.NOAA_LPD(dir_root, dir_tmp, name)

Bases: object

main()

Convert a NOAA text file into a LiPD file. CSV files will be created if chronology or data sections are available. :return dict: Metadata Dictionary

regexes

timeseries

lipd.timeseries.collapse(l)

LiPD Version 1.3 Main function to initiate time series to LiPD conversion :param list l: Time series :return dict _master: LiPD data, sorted by dataset name

lipd.timeseries.extract(d, chron)

LiPD Version 1.3 Main function to initiate LiPD to TSOs conversion. :param dict d: Metadata for one LiPD file :param bool chron: Paleo mode (default) or Chron mode :return list _ts: Time series

lipd.timeseries.get_matches(expr_lst, ts)

Get a list of TimeSeries objects that match the given expression. :param list expr_lst: Expression :param list ts: TimeSeries :return list new_ts: Matched time series objects :return list idxs: Indices of matched objects

lipd.timeseries.mode_ts(ec, ts=None, b=None)

Get string for the mode :param bool b: Chron boolean (for extract) :param str ec: extract or collapse :param list ts: Time series (for collapse) :return str phrase: Phrase

lipd.timeseries.translate_expression(expression)

Check if the expression is valid, then turn it into an expression that can be used for filtering. :return list of lists: One or more matches. Each list has 3 strings.

validator_api

lipd.validator_api.create_detailed_results(data)
lipd.validator_api.display_results(data, detailed=False)

Display the results from the validator in a brief or detailed way. :param dict data: Results, sorted by dataset name :param bool detailed: Detailed results on or off :return none:

lipd.validator_api.get_validator_format(data_json, data_csv, filenames)

Format the LiPD data in the layout that the Lipd.net validator accepts. Example of one _file metadata entry; _file_list will contain one or more of these:

_file = {
    "type": "bagit/json/csv",
    "filenameFull": /path/to/filename.txt,
    "filenameShort": filename.txt,
    "data": "",
    "pretty": ""
}

Parameters:
  • data_json (dict) – Metadata
  • data_csv (dict) – CSV data
  • filenames (list) – All files found in LiPD archive
Return list:

Validator-formatted data

lipd.validator_api.get_validator_results(data)

Send LiPD data to the Lipd.net validator and get the results back. :param data: :return:

versions

lipd.versions.get_lipd_version(d)

Check what version of LiPD this file is using. If none is found, assume it’s using version 1.0 :param dict d: Metadata :return float:

lipd.versions.update_lipd_v1_1(d)

Update LiPD v1.0 to v1.1
  • chronData entry is a list that allows multiple tables
  • paleoData entry is a list that allows multiple tables
  • chronData now allows measurement, model, summary, modelTable, ensemble, calibratedAges tables
  • Added ‘lipdVersion’ key

Parameters:d (dict) – Metadata v1.0
Return dict d:Metadata v1.1
lipd.versions.update_lipd_v1_2(d)

Update LiPD v1.1 to v1.2
  • Added NOAA compatible keys: maxYear, minYear, originalDataURL, WDCPaleoURL, etc.
  • ‘calibratedAges’ key is now ‘distribution’
  • paleoData structure mirrors chronData. Allows measurement, model, summary, modelTable, ensemble, distribution tables
Parameters:d (dict) – Metadata v1.1
Return dict d:Metadata v1.2
lipd.versions.update_lipd_v1_3(d)

Update LiPD v1.2 to v1.3
  • Added ‘createdBy’ key
  • Top-level folder inside LiPD archives is named “bag” (no longer <datasetname>)
  • .jsonld file is now generically named ‘metadata.jsonld’ (no longer <datasetname>.lpd)
  • All “paleo” and “chron” prefixes are removed from “paleoMeasurementTable”, “paleoModel”, etc.
  • Merged isotopeInterpretation and climateInterpretation into an “interpretation” block
  • ensemble table entry is a list that allows multiple tables
  • summary table entry is a list that allows multiple tables
Parameters:d (dict) – Metadata v1.2
Return dict d:Metadata v1.3

lipd.versions.update_lipd_v1_3_names(d)

Update the key names and merge interpretation data :param dict d: Metadata :return dict d: Metadata

lipd.versions.update_lipd_v1_3_structure(d)

Update the structure for summary and ensemble tables :param dict d: Metadata :return dict d: Metadata

lipd.versions.update_lipd_version(d)

Metadata is indexed by number at this step.

Use the current version number to determine where to start updating from. Use “chain versioning” to make it modular. If a file is a few versions behind, convert to EACH version until reaching current. If a file is one version behind, it will only convert once to the newest. :param dict d: Metadata :return dict d: Metadata
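The "chain versioning" idea can be sketched as an ordered list of one-step updaters applied from the record's current version onward. Everything here is a stand-in: make_step builds placeholder updaters that only stamp the version (the real updaters restructure the metadata), the "applied" key exists only to show which steps ran, and a missing lipdVersion is treated as 1.0 as described for get_lipd_version.

```python
def make_step(target):
    """Build a placeholder updater that only stamps the new version."""
    def step(d):
        d = dict(d)  # copy so the caller's record is left untouched
        d["lipdVersion"] = target
        d["applied"] = d.get("applied", []) + [target]
        return d
    return step


# (version a record must be at, updater that lifts it by one version)
CHAIN = [(1.0, make_step(1.1)), (1.1, make_step(1.2)), (1.2, make_step(1.3))]


def update_to_latest(d):
    """Apply every updater from the record's current version onward."""
    version = float(d.get("lipdVersion", 1.0))
    for starts_at, step in CHAIN:
        if version <= starts_at:
            d = step(d)
            version = d["lipdVersion"]
    return d


old = update_to_latest({"dataSetName": "Example"})
recent = update_to_latest({"dataSetName": "Example", "lipdVersion": 1.2})
```

A record several versions behind passes through each step in turn, while a record one version behind is converted once; adding a future version means appending one entry to CHAIN.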

zips

lipd.zips.unzipper(filename, dir_tmp)

Unzip .lpd file contents to tmp directory. :param str filename: filename.lpd :param str dir_tmp: Tmp folder to extract contents to :return None:

lipd.zips.zipper(root_dir='', name='', path_name_ext='')

Zips up directory back to the original location :param str root_dir: Root directory of the archive :param str name: <datasetname>.lpd :param str path_name_ext: /path/to/filename.lpd
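A zip and unzip round trip with the standard zipfile module could look like the sketch below. The function names zip_dir and unzip_to are hypothetical and the signatures are simplified relative to zipper and unzipper; the "bag" folder and metadata.jsonld file follow the v1.3 layout described in the versions section.

```python
import os
import tempfile
import zipfile


def zip_dir(root_dir, path_name_ext):
    """Zip every file under `root_dir` into the archive at `path_name_ext`."""
    with zipfile.ZipFile(path_name_ext, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, _, files in os.walk(root_dir):
            for name in files:
                full = os.path.join(folder, name)
                # Store paths relative to the root so the archive is portable.
                zf.write(full, os.path.relpath(full, root_dir))


def unzip_to(filename, dir_tmp):
    """Extract the archive `filename` into `dir_tmp`."""
    with zipfile.ZipFile(filename) as zf:
        zf.extractall(dir_tmp)


# Round trip: build bag/metadata.jsonld, zip it to example.lpd, unzip elsewhere.
src = tempfile.mkdtemp()
os.makedirs(os.path.join(src, "bag"))
with open(os.path.join(src, "bag", "metadata.jsonld"), "w") as f:
    f.write("{}")

archive = os.path.join(tempfile.mkdtemp(), "example.lpd")
zip_dir(src, archive)
dst = tempfile.mkdtemp()
unzip_to(archive, dst)
```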