lipd package¶
Module contents¶
-
lipd.
addEnsemble
(D, dsn, ensemble)¶ Create ensemble entry and then add it to the specified LiPD dataset.
Parameters: - D (dict) – LiPD data
- dsn (str) – Dataset name
- ensemble (list) – Nested numpy array of ensemble column data.
Return dict D: LiPD data
-
lipd.
collapseTs
(ts=None)¶ Collapse a time series back into LiPD record form.
Example
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)
3. New_D = lipd.collapseTs(ts)
Parameters: ts (list) – Time series
Return dict: Metadata
-
lipd.
doi
()¶ Update publication information using data DOIs. Updates LiPD files on disk, not in memory.
Example
1: lipd.readLipd()
2: lipd.doi()
Return none:
-
lipd.
ensToDf
(ensemble)¶ Create an ensemble data frame from some given nested numpy arrays
Parameters: ensemble (list) – Ensemble data
Return obj df: Pandas dataframe
-
lipd.
excel
()¶ Convert Excel files to LiPD files. LiPD data is returned directly from this function.
Example
1: lipd.readExcel()
2: D = lipd.excel()
Return dict _d: Metadata
-
lipd.
extractTs
(d, chron=False)¶ Create a time series using LiPD data (uses paleoData by default)
Example: paleoData
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)
Example: chronData
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D, chron=True)
Parameters: - d (dict) – Metadata
- chron (bool) – Create a chronData time series
Return list l: Time series
-
lipd.
filterTs
(ts, expression)¶ Create a new time series that only contains entries that match the given expression.
Example:
D = lipd.readLipd()
ts = lipd.extractTs(D)
new_ts = lipd.filterTs(ts, "archiveType == marine sediment")
new_ts = lipd.filterTs(ts, "paleoData_variableName == sst")
Parameters: - expression (str) – Expression
- ts (list) – Time series
Return list new_ts: Filtered time series that matches the expression
-
lipd.
getCsv
(L=None)¶ Get CSV from LiPD metadata
Example
c = lipd.getCsv(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters: L (dict) – One LiPD record
Return dict d: CSV data
-
lipd.
getLipdNames
(D=None)¶ Get a list of all LiPD names in the library
Example
names = lipd.getLipdNames(D)
Return list f_list: File list
-
lipd.
getMetadata
(L)¶ Get metadata from LiPD data in memory
Example
m = lipd.getMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters: L (dict) – One LiPD record
Return dict d: LiPD record (metadata only)
-
lipd.
noaa
(d=None)¶ Convert between NOAA and LiPD files
Example: LiPD to NOAA converter
1: D = lipd.readLipd()
2: lipd.noaa(D)
Example: NOAA to LiPD converter
1: lipd.readNoaa()
2: lipd.noaa()
Return none:
-
lipd.
queryTs
(ts, expression)¶ Find the indices of the time series entries that match the given expression.
Example:
D = lipd.readLipd()
ts = lipd.extractTs(D)
matches = lipd.queryTs(ts, "archiveType == marine sediment")
matches = lipd.queryTs(ts, "geo_meanElev <= 2000")
Parameters: - expression (str) – Expression
- ts (list) – Time series
Return list _idx: Indices of entries that match the criteria
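The expression matching described for queryTs and filterTs can be illustrated with a small standalone sketch. It is not the library's implementation: the helper names (`parse_expression`, `query_entries`), the supported operator set, and the rule that numeric-looking values are compared as floats are all assumptions made for illustration.

```python
import operator
import re

# Hypothetical operator table; the real library may support a different set.
OPS = {
    "==": operator.eq, "!=": operator.ne,
    "<=": operator.le, ">=": operator.ge,
    "<": operator.lt, ">": operator.gt,
}

# "<key> <operator> <value>", e.g. "geo_meanElev <= 2000"
EXPR = re.compile(r"^\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*(.+?)\s*$")


def parse_expression(expression):
    """Split an expression into its three parts: key, operator, value."""
    match = EXPR.match(expression)
    if match is None:
        raise ValueError("Unrecognized expression: %r" % expression)
    return match.group(1), match.group(2), match.group(3)


def query_entries(ts, expression):
    """Return the indices of the entries in ts that satisfy the expression."""
    key, op, raw = parse_expression(expression)
    try:
        target = float(raw)      # numeric comparison when the value looks numeric
        numeric = True
    except ValueError:
        target = raw             # otherwise compare as text
        numeric = False

    matches = []
    for idx, entry in enumerate(ts):
        if key not in entry:
            continue
        value = entry[key]
        if numeric:
            try:
                value = float(value)
            except (TypeError, ValueError):
                continue         # entry value is not numeric, so it cannot match
        else:
            value = str(value)
        if OPS[op](value, target):
            matches.append(idx)
    return matches
```

A filtered time series then follows from the indices, for example `[ts[i] for i in query_entries(ts, "archiveType == marine sediment")]`.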
-
lipd.
readAll
(usr_path='')¶ Read all approved file types at once. Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional)
Return str cwd: Current working directory
-
lipd.
readExcel
(usr_path='')¶ Read Excel file(s). Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional)
Return str cwd: Current working directory
-
lipd.
readLipd
(usr_path='')¶ Read LiPD file(s). Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional)
Return dict _d: Metadata
-
lipd.
readNoaa
(usr_path='')¶ Read NOAA file(s). Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional)
Return str cwd: Current working directory
-
lipd.
run
()¶ Initialize and start objects. This is called automatically when importing the package.
Return none:
-
lipd.
showDfs
(d)¶ Display the available data frame names in a given data frame collection
Parameters: d (dict) – Dataframe collection
Return none:
-
lipd.
showLipds
(D=None)¶ Display the dataset names of a given LiPD data
Example
lipd.showLipds(D)
Parameters: D (dict) – LiPD data
Return none:
-
lipd.
showMetadata
(dat)¶ Display the metadata of the specified LiPD record in pretty print
Example
lipd.showMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters: dat (dict) – Metadata
Return none:
-
lipd.
tsToDf
(tso)¶ Create Pandas DataFrames from one time series entry. Usage: first call extractTs to get a time series, then pick one entry from it and pass it to this function.
Parameters: tso (dict) – Time series entry
Return dict dfs: Pandas dataframes
-
lipd.
viewTs
(ts)¶ View the contents of one time series entry in a nicely formatted way
Example
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)
3. lipd.viewTs(ts[0])
Parameters: ts (dict) – One time series entry
Return none:
-
lipd.
writeLipd
(dat, usr_path='', filename='')¶ Write LiPD data to file(s)
Parameters: - dat (dict) – Metadata
- usr_path (str) – Destination (optional)
- filename (str) – LiPD filename, for writing one specific file (optional)
Return none:
Submodules¶
alternates¶
List of alternate and synonym keys
bag¶
-
lipd.bag.
create_bag
(dir_bag)¶ Create a Bag out of given files. :param str dir_bag: Directory that contains csv, jsonld, and changelog files. :return obj: Bag
-
lipd.bag.
finish_bag
(dir_bag)¶ Closing steps for creating a bag :param obj dir_bag: :return None:
-
lipd.bag.
open_bag
(dir_bag)¶ Open Bag at the given path :param str dir_bag: Path to Bag :return obj: Bag
-
lipd.bag.
resolved_flag
(bag)¶ Check DOI flag in bag.info to see if doi_resolver has been previously run :param obj bag: Bag :return bool: Flag
-
lipd.bag.
validate_md5
(bag)¶ Check if Bag is valid :param obj bag: Bag :return None:
blanks¶
List of empty and ignored keys
csvs¶
-
lipd.csvs.
get_csv_from_metadata
(name, metadata)¶ Two goals. Get all csv from metadata, and return new metadata with generated filenames to match files. :param str name: LiPD dataset name :param dict metadata: Metadata :return dict: Csv Data
-
lipd.csvs.
merge_csv_metadata
(d)¶ Using the given metadata dictionary, retrieve CSV data from CSV files, and insert the CSV values into their respective metadata columns. Checks for both paleoData and chronData tables. :param dict d: Metadata :return dict: Modified metadata dictionary
-
lipd.csvs.
read_csv_from_file
(filename)¶ Opens the target CSV file and creates a dictionary with one list for each CSV column. :param str filename: :return list of lists: column values
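As a rough illustration of the "one list per CSV column" layout, the sketch below reads a CSV file into a dictionary keyed by column position. The function name `read_columns` and the integer keys are assumptions (the integer keys mirror the description of write_csv_to_file); the library's own return structure may differ.

```python
import csv


def read_columns(filename):
    """Read a CSV file into {column_index: [values, ...]} (values kept as strings)."""
    columns = {}
    with open(filename, newline="") as handle:
        for row in csv.reader(handle):
            for idx, cell in enumerate(row):
                # Create the column list on first sight, then append row by row.
                columns.setdefault(idx, []).append(cell)
    return columns
```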
-
lipd.csvs.
write_csv_to_file
(d)¶ Writes columns of data to a target CSV file. :param dict d: A dictionary containing one list for every data column. Keys: int, Values: list :return None:
dataframes¶
-
lipd.dataframes.
create_dataframe
(ensemble)¶ Create a data frame from given nested lists of ensemble data :param list ensemble: Ensemble data :return obj: Dataframe
-
lipd.dataframes.
get_filtered_dfs
(lib, expr)¶ Main: Get all data frames that match the given expression :return dict: Filenames and data frames (filtered)
-
lipd.dataframes.
lipd_to_df
(metadata, csvs)¶ Create an organized collection of data frames from LiPD data :param dict metadata: LiPD data :param dict csvs: Csv data :return dict: One data frame per table, organized in a dictionary by name
-
lipd.dataframes.
ts_to_df
(metadata)¶ Create a data frame from one TimeSeries object :param dict metadata: Time Series dictionary :return dict: One data frame per table, organized in a dictionary by name
directory¶
-
lipd.directory.
browse_dialog_dir
()¶ Open up a GUI browse dialog window and let the user pick a target directory. :return str: Target directory path
-
lipd.directory.
browse_dialog_file
()¶ Open up a GUI browse dialog window and let the user select one or more files :return str _path: Target directory path :return list _files: List of selected files
-
lipd.directory.
check_file_age
(filename, days)¶ Check if the target file has an older creation date than X amount of time. i.e. One day: 60*60*24 :param str filename: Target filename :param int days: Limit in number of days :return bool: True - older than X time, False - not older than X time
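A minimal sketch of this kind of age check is shown below. It is an illustration only: the name `is_older_than` and the optional `now` argument are hypothetical, and it uses the file's modification time because a true creation time is not available portably through the standard library.

```python
import os
import time

SECONDS_PER_DAY = 60 * 60 * 24


def is_older_than(filename, days, now=None):
    """Return True if the file was last modified more than `days` days ago."""
    now = time.time() if now is None else now
    age_seconds = now - os.path.getmtime(filename)
    return age_seconds > days * SECONDS_PER_DAY
```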
-
lipd.directory.
collect_metadata_file
(full_path)¶ Create the file metadata and add it to the appropriate section by file-type :param str full_path: :param dict existing_files: :return dict existing files:
-
lipd.directory.
collect_metadata_files
(cwd, new_files, existing_files)¶ Collect all files from a given path. Separate by file type, and return one list for each type If ‘files’ contains specific :param str cwd: Directory w/ target files :param list new_files: Specific new files to load :param dict existing_files: Files currently loaded, separated by type :return list: All files separated by type
-
lipd.directory.
create_tmp_dir
()¶ Creates tmp directory in OS temp space. :return str: Path to tmp directory
-
lipd.directory.
dir_cleanup
(dir_bag, dir_data)¶ Moves JSON and csv files to bag root, then deletes all the metadata bag files. We’ll be creating a new bag with the data files, so we don’t need the other text files and such. :param str dir_bag: Path to root of Bag :param str dir_data: Path to Bag /data subdirectory :return None:
-
lipd.directory.
filename_from_path
(path)¶ Extract the file name from a given file path. :param str path: File path :return str: File name with extension
-
lipd.directory.
find_files
()¶ Search for the directory containing jsonld and csv files. chdir and then quit. :return none:
-
lipd.directory.
get_filenames_generated
(d, name='', csvs='')¶ Get the filenames that the LiPD utilities have generated (per naming standard), as opposed to the filenames that originated in the LiPD file (that possibly don’t follow the naming standard) :param dict d: Data :param str name: LiPD dataset name to prefix :param list csvs: Filenames list to merge with :return list: Filenames
-
lipd.directory.
get_filenames_in_lipd
(path, name='')¶ List all the files contained in the LiPD archive. Bagit, JSON, and CSV :param str path: Directory to be listed :param str name: LiPD dataset name, if you want to prefix it to show file hierarchy :return list: Filenames found
-
lipd.directory.
get_src_or_dst
(mode, path_type)¶ User sets the path to a LiPD source location :param str mode: “read” or “write” mode :param str path_type: “directory” or “file” :return str path: dir path to files :return list files: files chosen
-
lipd.directory.
get_src_or_dst_path
(prompt, count)¶ Let the user choose a path, and store the value. :return str _path: Target directory :return str count: Counter for attempted prompts
-
lipd.directory.
get_src_or_dst_prompt
(mode)¶ String together the proper prompt based on the mode :param str mode: “read” or “write” :return str prompt: The prompt needed
-
lipd.directory.
list_files
(x, path='')¶ Lists file(s) in given path of the X type. :param str x: File extension that we are interested in. :param str path: Path, if user would like to check a specific directory outside of the CWD :return list of str: File name(s) to be worked on
-
lipd.directory.
rm_file_if_exists
(path, filename)¶ Remove a file if it exists. Useful for when we want to write a file, but it already exists in that location. :param str filename: Filename :param str path: Directory :return none:
-
lipd.directory.
rm_files_in_dir
(path)¶ Removes all files within a directory, but does not delete the directory :param str path: Target directory :return none:
doi_main¶
-
lipd.doi_main.
doi_main
(files)¶ Main function that controls the script. Take in directory containing the .lpd file(s). Loop for each file. :return None:
-
lipd.doi_main.
process_lpd
(name, dir_tmp)¶ Opens up json file, invokes doi_resolver, closes file, updates changelog, cleans directory, and makes new bag. :param str name: Name of current .lpd file :param str dir_tmp: Path to tmp directory :return none:
-
lipd.doi_main.
prompt_force
()¶ Ask the user if they want to force update files that were previously resolved :return bool: response
doi_resolver¶
-
class
lipd.doi_resolver.
DOIResolver
(dir_root, name, root_dict)¶ Bases:
object
Use DOI id(s) to pull updated publication info from doi.org and overwrite file data.
Input: Original publication dictionary Output: Updated publication dictionary (success), original publication dictionary (fail)
-
static
compare_replace
(pub_dict, fetch_dict)¶ Take in our Original Pub, and Fetched Pub. For each Fetched entry that has data, overwrite the Original entry :param pub_dict: (dict) Original pub dictionary :param fetch_dict: (dict) Fetched pub dictionary from doi.org :return: (dict) Updated pub dictionary, with fetched data taking precedence
Compiles authors “Last, First” into a single list :param list authors: Raw author data retrieved from doi.org :return list: Author objects
-
static
compile_date
(date_parts)¶ Compiles date only using the year :param list date_parts: List of date parts retrieved from doi.org :return str: Date string or NaN
-
compile_fetch
(raw, doi_id)¶ Loop over Raw and add selected items to Fetch with proper formatting :param dict raw: JSON data from doi.org :param str doi_id: :return dict:
-
find_doi
(curr_dict)¶ Recursively search the file for the DOI id. More taxing, but more flexible when dictionary structuring isn’t absolute :param dict curr_dict: Current dictionary being searched :return dict bool: Recursive - Current dictionary, False flag that DOI was not found :return str bool: Final - DOI id, True flag that DOI was found
-
get_data
(doi_id, idx)¶ Resolve DOI and compile all attributes into one dictionary :param str doi_id: :param int idx: Publication index :return dict: Updated publication dictionary
-
illegal_doi
(doi_string)¶ DOI string did not match the regex. Determine what the data is. :param doi_string: (str) Malformed DOI string :return: None
-
main
()¶ Main function that gets file(s), creates outputs, and runs all operations. :return dict: Updated or original data for jsonld file
-
noaa_citation
(doi_string)¶ Special instructions for moving noaa data to the correct fields :param doi_string: (str) NOAA url :return: None
-
remove_empties
(pub)¶
ensembles¶
-
lipd.ensembles.
create_ensemble
(ensemble)¶ Add ensemble data to a LiPD object :param list ensemble: Ensemble data nested lists :return dict: Structured Ensemble data
-
lipd.ensembles.
insert_ensemble
(d, ens)¶ Insert the ensemble table dictionary into the LiPD metadata :param dict d: LiPD metadata :param dict ens: Ensemble data to insert :return dict:
excel¶
-
lipd.excel.
cells_dn_meta
(workbook, sheet, row, col, final_dict)¶ Traverse all cells in a column moving downward. Primarily created for the metadata sheet, but may use elsewhere. Check the cell title, and switch it to. :param obj workbook: :param str sheet: :param int row: :param int col: :param dict final_dict: :return: none
-
lipd.excel.
cells_rt_meta
(workbook, sheet, row, col)¶ Traverse all cells in a row. If you find new data in a cell, add it to the list. :param obj workbook: :param str sheet: :param int row: :param int col: :return list: Cell data for a specific row
-
lipd.excel.
cells_rt_meta_pub
(workbook, sheet, row, col, pub_qty)¶ Publication section is special. It’s possible there’s more than one publication. :param obj workbook: :param str sheet: :param int row: :param int col: :param int pub_qty: Number of distinct publication sections in this file :return list: Cell data for a specific row
Split the string of author names into the BibJSON format. :param str cell: Data from author cell :return: (list of dicts) Author names
-
lipd.excel.
compile_fund
(workbook, sheet, row, col)¶ Compile funding entries. Iter both rows at the same time. Keep adding entries until both cells are empty. :param obj workbook: :param str sheet: :param int row: :param int col: :return list of dict: l
-
lipd.excel.
compile_geo
(d)¶ Compile top-level Geography dictionary. :param d: :return:
-
lipd.excel.
compile_geometry
(lat, lon, elev)¶ Take in lists of lat and lon coordinates, and determine what geometry to create :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:
-
lipd.excel.
compile_temp
(d, key, value)¶ Compiles temporary dictionaries for metadata. Adds a new entry to an existing dictionary. :param dict d: :param str key: :param any value: :return dict:
-
lipd.excel.
count_chron_variables
(temp_sheet)¶ Count the number of chron variables :param obj temp_sheet: :return int: variable count
-
lipd.excel.
excel_main
(file)¶ Parse data from Excel spreadsheets into LiPD files. :return list: Filenames of LiPD files created
-
lipd.excel.
extract_short
(string_in)¶ Extract the short name from a string that also has units. :param str string_in: :return str:
-
lipd.excel.
extract_units
(string_in)¶ Extract units from parenthesis in a string. i.e. “elevation (meters)” :param str string_in: :return str:
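The split between a short name and its parenthesized units, as in "elevation (meters)", can be sketched with one regular expression. The helper name `split_name_and_units` and the choice to return an empty string when no units are present are assumptions for illustration, not the behavior of extract_short or extract_units.

```python
import re

# First parenthesized group without nested parentheses, e.g. "(meters)"
_UNITS = re.compile(r"\(([^()]*)\)")


def split_name_and_units(text):
    """Return (short_name, units); units is "" when no parentheses are found."""
    match = _UNITS.search(text)
    if match is None:
        return text.strip(), ""
    short_name = text[:match.start()].strip()
    return short_name, match.group(1).strip()
```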
-
lipd.excel.
geometry_linestring
(lat, lon, elev)¶ GeoJSON Linestring. Latitude and Longitude have 2 values each. :param list lat: Latitude values :param list lon: Longitude values :return dict:
-
lipd.excel.
geometry_point
(lat, lon, elev)¶ GeoJSON point. Latitude and Longitude only have one value each :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:
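For orientation, a GeoJSON Point stores its position as longitude first, then latitude, with elevation as an optional third element. The sketch below builds such a geometry from single-value latitude and longitude lists; the function name `point_geometry` is hypothetical, and the exact dictionary that geometry_point returns may carry additional keys.

```python
def point_geometry(lat, lon, elev=None):
    """Build a GeoJSON Point from one-element lat/lon lists and an optional elevation."""
    coordinates = [lon[0], lat[0]]   # GeoJSON order: longitude, latitude
    if elev is not None:
        coordinates.append(elev)     # optional third position element
    return {"type": "Point", "coordinates": coordinates}
```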
-
lipd.excel.
geometry_range
(crd_range, elev, crd_type)¶ Range of coordinates. (e.g. 2 latitude coordinates, and 0 longitude coordinates) :param crd_range: Latitude or Longitude values :param elev: Elevation value :param crd_type: Coordinate type, lat or lon :return dict:
-
lipd.excel.
get_chron_data
(temp_sheet, row, total_vars)¶ Capture all data for a specific chron data row (for csv output) :param obj temp_sheet: :param int row: :param int total_vars: :return list: data_row
-
lipd.excel.
get_chron_var
(temp_sheet, start_row)¶ Capture all the vars in the chron sheet (for json-ld output) :param obj temp_sheet: :param int start_row: :return: (list of dict) column data
-
lipd.excel.
instance_str
(cell)¶ Match data type and return string :param any cell: :return str:
-
lipd.excel.
logger_excel
= <logging.Logger object>¶ VERSION: LiPD v1.2
-
lipd.excel.
name_to_jsonld
(title_in)¶ Convert formal titles to camelcase json_ld text that matches our context file Keep a growing list of all titles that are being used in the json_ld context :param str title_in: :return str:
-
lipd.excel.
traverse_to_chron_data
(temp_sheet)¶ Traverse down to the first row that has chron data :param obj temp_sheet: :return int: traverse_row
-
lipd.excel.
traverse_to_chron_var
(temp_sheet)¶ Traverse down to the row that has the first variable :param obj temp_sheet: :return int:
inferred_data¶
-
lipd.inferred_data.
get_inferred_data_table
(pc, table)¶ Table level: Dive down, calculate data, then return the new table with the inferred data. :param str pc: Paleo or Chron table type :param dict table: Table data :return dict table: Table with new data
io¶
jsons¶
-
lipd.jsons.
get_csv_from_json
(d)¶ Get CSV values when mixed into json data. Pull out the CSV data and put it into a dictionary. :param dict d: JSON with CSV values :return dict: CSV values. (i.e. { CSVFilename1: { Column1: [Values], Column2: [Values] }, CSVFilename2: … }
-
lipd.jsons.
idx_name_to_num
(d)¶ Switch from index-by-name to index-by-number. :param dict d: Metadata :return dict: Modified metadata
-
lipd.jsons.
idx_num_to_name
(d)¶ Switch from index-by-number to index-by-name. :param dict d: Metadata :return dict: Modified Metadata
-
lipd.jsons.
read_json_from_file
(filename)¶ Import the JSON data from target file. :param str filename: Target File :return dict: JSON data
-
lipd.jsons.
read_jsonld
()¶ Find jsonld file in the cwd (or within 2 levels below cwd), and load it in. :return dict: Jsonld data
-
lipd.jsons.
remove_csv_from_json
(d)¶ Remove all CSV data ‘values’ entries from paleoData table in the JSON structure. :param dict d: JSON data - old structure :return dict: Metadata dictionary without CSV values
-
lipd.jsons.
write_json_to_file
(json_data, filename='metadata')¶ Write all JSON in python dictionary to a new json file. :param dict json_data: JSON data :param str filename: Target filename (defaults to ‘metadata.jsonld’) :return None:
loggers¶
-
lipd.loggers.
create_benchmark
(name, log_file, level=20)¶ Creates a logger for function benchmark times :param str name: Name of the logger :param str log_file: Filename :return obj: Logger
-
lipd.loggers.
create_logger
(name)¶ Creates a logger with the below attributes. :param str name: Name of the logger :return obj: Logger
-
lipd.loggers.
log_benchmark
(fn, start, end)¶ Log a given function and how long the function takes in seconds :param str fn: Function name :param float start: Function start time :param float end: Function end time :return none:
-
lipd.loggers.
update_changelog
()¶ Create or update the changelog txt file. Prompt for update description. :return None:
lpd_noaa¶
-
class
lipd.lpd_noaa.
LPD_NOAA
(dir_root, name, lipd_dict)¶ Bases:
object
Creates a NOAA object that contains all the functions needed to write out a LiPD file as a NOAA text file. Supports LiPD Version: v1.2 NOAA txt template: v3.0
Return none: Writes NOAA text to file in local storage
-
get_master
()¶ Get the master json that has been modified :return dict: self.lipd_data
-
get_wdc_paleo_url
()¶ When a NOAA file is created, it creates a URL link to where the dataset will be hosted in NOAA’s archive Retrieve and add this link to the original LiPD file, so we can trace the dataset to NOAA. :return str:
-
main
()¶ Load in the template file, and run through the parser :return none:
misc¶
-
lipd.misc.
cast_float
(x)¶ Attempt to cleanup string or convert to number value. :param any x: :return float:
-
lipd.misc.
cast_int
(x)¶ Cast unknown type into integer :param any x: :return int:
-
lipd.misc.
cast_values_csvs
(d, idx, x)¶ Attempt to cast string to float. If error, keep as a string. :param dict d: Data :param int idx: Index number :param str x: Data :return any:
-
lipd.misc.
check_dsn
(name, _json)¶ Get a dataSetName. If one is not provided, then insert the filename as the dataSetName. :param str name: Filename w/o extension :param dict _json: Metadata :return dict _json: Metadata
-
lipd.misc.
clean_doi
(doi_string)¶ Use regex to extract all DOI ids from string (i.e. 10.1029/2005pa001215) :param str doi_string: Raw DOI string value from input file. Often not properly formatted. :return list: DOI ids. May contain 0, 1, or multiple ids.
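Extracting zero, one, or several DOI ids from loosely formatted text can be sketched as follows. The pattern and the name `find_dois` are assumptions chosen for illustration; they are not the regular expression the library uses, and unusual DOIs may not be captured.

```python
import re

# Assumed pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# that runs until whitespace or a common delimiter.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"'<>,;]+")


def find_dois(text):
    """Return every DOI-like id found in text, in order of appearance."""
    # Drop a trailing period, which usually ends the sentence rather than the DOI.
    return [doi.rstrip(".") for doi in DOI_PATTERN.findall(text)]
```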
-
lipd.misc.
fix_coordinate_decimal
(d)¶ Coordinate decimal degrees calculated by an excel formula are often too long as a repeating decimal. Round them down to 5 decimals :param dict d: Metadata :return dict d: Metadata
-
lipd.misc.
generate_timestamp
(fmt=None)¶ Generate a timestamp to mark when this file was last modified. :param str fmt: Special format instructions :return str: YYYY-MM-DD format, or specified format
-
lipd.misc.
generate_tsid
(size=8)¶ Generate a TSid string. Use the “PYT” prefix for traceability, and 8 trailing generated characters ex: PYT9AG234GS :return:
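A prefix-plus-random-characters id of this shape can be produced in a few lines. This is a sketch under assumptions: the name `make_tsid` is hypothetical, and the character set (uppercase letters and digits) is inferred from the example id rather than taken from the library.

```python
import random
import string


def make_tsid(size=8, prefix="PYT"):
    """Return the prefix followed by `size` random uppercase letters or digits."""
    alphabet = string.ascii_uppercase + string.digits
    return prefix + "".join(random.choices(alphabet, k=size))
```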
-
lipd.misc.
get_appended_name
(name, columns)¶ Append numbers to a name until it no longer conflicts with the other names in a column. Necessary to avoid overwriting columns and losing data. Loop a preset amount of times to avoid an infinite loop. There shouldn’t ever be more than two or three identical variable names in a table. :param str name: Variable name in question :param dict columns: Columns listed by variable name :return str: Appended variable name
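The bounded "append a number until the name is free" loop can be sketched like this. The function name `append_until_unique`, the `-N` suffix format, and the retry limit of 100 are assumptions for illustration; the library may format the appended names differently.

```python
def append_until_unique(name, columns, max_tries=100):
    """Return name, or name with a numeric suffix, so it does not clash with columns."""
    if name not in columns:
        return name
    # Bounded loop: guards against looping forever on unexpected input.
    for n in range(1, max_tries + 1):
        candidate = "%s-%d" % (name, n)
        if candidate not in columns:
            return candidate
    raise ValueError("No free name found for %r after %d tries" % (name, max_tries))
```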
Take author or investigator data, and convert it to a concatenated string of names. Author data structure has a few variations, so account for all. :param any x: Author data :return str: Author string
-
lipd.misc.
get_dsn
(d)¶ Get the dataset name from a record :param dict d: Metadata :return str: Dataset name
-
lipd.misc.
get_ensemble_counts
(d)¶ Determine if this is a 1 or 2 column ensemble. Then determine how many columns and rows it has. :param d: :return:
-
lipd.misc.
get_missing_value_key
(d)¶ Get the Missing Value entry from a table of data. If none is found, try the columns. If still none found, prompt user. :param dict d: Table of data :return str: Missing Value
-
lipd.misc.
get_table_key
(key, d, fallback='')¶ Try to get a table name from a data table :param str key: Key to try first :param dict d: Data table :param str fallback: (optional) If we don’t find a table name, use this as a generic name fallback. :return str: Data table name
-
lipd.misc.
get_variable_name_col
(d)¶ Get the variable name from a table or column :param dict d: Metadata :return str:
-
lipd.misc.
is_ensemble
(d)¶ Check if a table of data is an ensemble table. Is the first values index a list? ensemble. Int/float? not ensemble. :param dict d: Table data :return bool: Ensemble or not ensemble
-
lipd.misc.
load_fn_matches_ext
(file_path, file_type)¶ Check that the file extension matches the target extension given. :param str file_path: Path to be checked :param str file_type: Target extension :return bool:
-
lipd.misc.
match_arr_lengths
(l)¶ Check that all the array lengths match so that a DataFrame can be created successfully. :param list l: Nested arrays :return bool: Valid or invalid
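The length check is small enough to show in full as a sketch; the name `lengths_match` is hypothetical, and treating an empty input as valid is an assumption.

```python
def lengths_match(arrays):
    """Return True when every nested array has the same length."""
    # A set of lengths collapses to one element when all lengths agree.
    return len({len(a) for a in arrays}) <= 1
```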
-
lipd.misc.
match_operators
(inp, relate, cut)¶ Compare two items. Match a string operator to an operator function :param str inp: Comparison item :param str relate: Comparison operator :param any cut: Comparison item :return bool: Comparison truth
-
lipd.misc.
mv_files
(src, dst)¶ Move all files from one directory to another :param str src: Source directory :param str dst: Destination directory :return none:
-
lipd.misc.
normalize_name
(s)¶ Remove foreign accents and characters to normalize the string. Prevents encoding errors. :param str s: :return str:
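One common way to strip accents is to decompose each character and drop anything outside ASCII. The sketch below uses that approach with the standard `unicodedata` module; the name `strip_accents` is hypothetical and the library may normalize names by a different method. Characters with no ASCII decomposition are dropped entirely.

```python
import unicodedata


def strip_accents(s):
    """Return s with accented letters reduced to their ASCII base letters."""
    # NFKD splits "í" into "i" plus a combining accent; the accent is then dropped.
    decomposed = unicodedata.normalize("NFKD", s)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```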
-
lipd.misc.
path_type
(path, target)¶ Determine if given path is file, directory, or other. Compare with target to see if it’s the type we wanted. :param str path: Path :param str target: Target type wanted :return bool:
-
lipd.misc.
prompt_protocol
()¶ Prompt user if they would like to save pickle file as a dictionary or an object. :return str: Answer
-
lipd.misc.
put_tsids
(x)¶ Recursively add in TSids into any columns that do not have them. Look for “columns” keys, and then start looping and adding generated TSids to each column :param any x: Recursive, so could be any data type. :return any x: Recursive, so could be any data type.
-
lipd.misc.
rm_empty_doi
(d)¶ If an “identifier” dictionary has no doi ID, then it has no use. Delete it. :param dict d: JSON Metadata :return dict: JSON Metadata
-
lipd.misc.
rm_empty_fields
(x)¶ Go through N number of nested data types and remove all empty entries. Recursion :param any x: Dictionary, List, or String of data :return any: Returns the same data type as the original, but without empties.
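The recursive cleanup can be sketched as below. What counts as "empty" here (None, empty string, empty list, empty dict) is an assumption, as are the helper names; zero and False are deliberately kept because they can be real data values.

```python
def _is_empty(value):
    """Treat None and zero-length strings, lists, and dicts as empty."""
    return value is None or (isinstance(value, (str, list, dict)) and len(value) == 0)


def remove_empties(x):
    """Return a copy of x with empty entries removed at every nesting level."""
    if isinstance(x, dict):
        cleaned = {key: remove_empties(value) for key, value in x.items()}
        return {key: value for key, value in cleaned.items() if not _is_empty(value)}
    if isinstance(x, list):
        cleaned = [remove_empties(value) for value in x]
        return [value for value in cleaned if not _is_empty(value)]
    return x  # scalars and strings are returned unchanged
```

Children are cleaned before the emptiness test, so a container that holds only empty entries is itself removed.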
-
lipd.misc.
rm_files
(path, extension)¶ Remove all files in the given directory with the given extension :param str path: Directory :param str extension: File type to remove :return none:
-
lipd.misc.
rm_keys_from_dict
(d, keys)¶ Given a dictionary and a key list, remove any data in the dictionary with the given keys. :param dict d: Data :param list keys: List of key data to remove :return dict d: Data (with keys + data removed)
-
lipd.misc.
rm_missing_values_table
(d)¶ Loop for each table column and remove the missingValue key & data :param dict d: Table data :return dict d: Table data
-
lipd.misc.
rm_values_fields
(x)¶ (recursive) Remove all “values” fields from the metadata :param any x: Any data type :return dict: metadata without “values”
-
lipd.misc.
split_path_and_file
(s)¶ Given a full path to a file, split and return a path and filename :param str s: Full path :return str str: Path, filename
-
lipd.misc.
unwrap_arrays
(l)¶ Unwrap nested lists to be one “flat” list of lists. Mainly for prepping ensemble data for DataFrame() creation :param list l: Nested lists :return list: Flattened lists
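One reading of "flat list of lists" is that every innermost list becomes one row, however deeply it was wrapped. The sketch below implements that reading; the name `flatten_to_rows` is hypothetical and the library's unwrap_arrays may handle edge cases differently.

```python
def flatten_to_rows(nested):
    """Unwrap nested lists so the result is a single list of innermost lists."""
    rows = []
    for item in nested:
        is_wrapper = (
            isinstance(item, list)
            and len(item) > 0
            and all(isinstance(child, list) for child in item)
        )
        if is_wrapper:
            rows.extend(flatten_to_rows(item))   # unwrap one more level
        else:
            rows.append(item)                    # already a row of values
    return rows
```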
noaa¶
-
lipd.noaa.
lpd_to_noaa
(obj)¶ Convert a LiPD format to NOAA format :param obj obj: LiPD object :return obj: LiPD object (modified)
-
lipd.noaa.
noaa_prompt
()¶ Convert between NOAA and LiPD file formats. :return:
-
lipd.noaa.
noaa_to_lpd
(files)¶ Convert NOAA format to LiPD format :param dict files: Files metadata :return None:
noaa_lpd¶
regexes¶
timeseries¶
-
lipd.timeseries.
collapse
(l)¶ LiPD Version 1.3 Main function to initiate time series to LiPD conversion :param list l: Time series :return dict _master: LiPD data, sorted by dataset name
-
lipd.timeseries.
extract
(d, chron)¶ LiPD Version 1.3 Main function to initiate LiPD to TSOs conversion. :param dict d: Metadata for one LiPD file :param bool chron: Paleo mode (default) or Chron mode :return list _ts: Time series
-
lipd.timeseries.
get_matches
(expr_lst, ts)¶ Get a list of TimeSeries objects that match the given expression. :param list expr_lst: Expression :param list ts: TimeSeries :return list new_ts: Matched time series objects :return list idxs: Indices of matched objects
-
lipd.timeseries.
mode_ts
(ec, ts=None, b=None)¶ Get string for the mode :param bool b: Chron boolean (for extract) :param str ec: extract or collapse :param list ts: Time series (for collapse) :return str phrase: Phrase
-
lipd.timeseries.
translate_expression
(expression)¶ Check if the expression is valid, then turn it into an expression that can be used for filtering. :return list of lists: One or more matches. Each list has 3 strings.
validator_api¶
-
lipd.validator_api.
create_detailed_results
(data)¶
-
lipd.validator_api.
display_results
(data, detailed=False)¶ Display the results from the validator in a brief or detailed way. :param dict data: Results, sorted by dataset name :param bool detailed: Detailed results on or off :return none:
-
lipd.validator_api.
get_validator_format
(data_json, data_csv, filenames)¶ Format the LiPD data in the layout that the Lipd.net validator accepts. Example of one _file metadata entry; _file_list will contain one or more such entries:
_file = {
    "type": "bagit/json/csv",
    "filenameFull": "/path/to/filename.txt",
    "filenameShort": "filename.txt",
    "data": "",
    "pretty": ""
}
Parameters: - data_json (dict) – Metadata
- data_csv (dict) – CSV data
- filenames (list) – All files found in LiPD archive
Return list: Validator-formatted data
-
lipd.validator_api.
get_validator_results
(data)¶ Send LiPD data to the Lipd.net validator and get the results back. :param data: :return:
versions¶
-
lipd.versions.
get_lipd_version
(d)¶ Check what version of LiPD this file is using. If none is found, assume it’s using version 1.0 :param dict d: Metadata :return float:
-
lipd.versions.
update_lipd_v1_1
(d)¶ Update LiPD v1.0 to v1.1
- chronData entry is a list that allows multiple tables
- paleoData entry is a list that allows multiple tables
- chronData now allows measurement, model, summary, modelTable, ensemble, calibratedAges tables
- Added ‘lipdVersion’ key
Parameters: d (dict) – Metadata v1.0
Return dict d: Metadata v1.1
-
lipd.versions.
update_lipd_v1_2
(d)¶ Update LiPD v1.1 to v1.2
- Added NOAA compatible keys : maxYear, minYear, originalDataURL, WDCPaleoURL, etc
- ‘calibratedAges’ key is now ‘distribution’
- paleoData structure mirrors chronData. Allows measurement, model, summary, modelTable, ensemble, distribution tables
Parameters: d (dict) – Metadata v1.1
Return dict d: Metadata v1.2
-
lipd.versions.
update_lipd_v1_3
(d)¶ Update LiPD v1.2 to v1.3
- Added ‘createdBy’ key
- Top-level folder inside LiPD archives is named “bag”. (No longer <datasetname>)
- .jsonld file is now generically named ‘metadata.jsonld’ (No longer <datasetname>.lpd )
- All “paleo” and “chron” prefixes are removed from “paleoMeasurementTable”, “paleoModel”, etc.
- Merge isotopeInterpretation and climateInterpretation into “interpretation” block
- ensemble table entry is a list that allows multiple tables
- summary table entry is a list that allows multiple tables
:param dict d: Metadata v1.2 :return dict d: Metadata v1.3
-
lipd.versions.
update_lipd_v1_3_names
(d)¶ Update the key names and merge interpretation data :param dict d: Metadata :return dict d: Metadata
-
lipd.versions.
update_lipd_v1_3_structure
(d)¶ Update the structure for summary and ensemble tables :param dict d: Metadata :return dict d: Metadata
-
lipd.versions.
update_lipd_version
(d)¶ Metadata is indexed by number at this step.
Use the current version number to determine where to start updating from. Use “chain versioning” to make it modular. If a file is a few versions behind, convert to EACH version until reaching current. If a file is one version behind, it will only convert once to the newest. :param dict d: Metadata :return dict d: Metadata
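The "chain versioning" idea can be sketched as a table of single-step updaters applied one after another until the current version is reached. Everything below is illustrative: the step functions are placeholders that only stamp the new version number (the real updaters also restructure the metadata), and the names `STEPS`, `CURRENT_VERSION`, and `update_to_current` are hypothetical. Only the `lipdVersion` key and the fallback to version 1.0 come from the descriptions above.

```python
def _stamp(new_version):
    """Build a placeholder step that only records the new version number."""
    def step(d):
        updated = dict(d)                     # leave the caller's dict untouched
        updated["lipdVersion"] = new_version
        return updated
    return step


# Each entry upgrades exactly one version to the next one in the chain.
STEPS = {"1.0": _stamp("1.1"), "1.1": _stamp("1.2"), "1.2": _stamp("1.3")}
CURRENT_VERSION = "1.3"


def update_to_current(d):
    """Apply single-version steps in sequence until d is at CURRENT_VERSION."""
    version = str(d.get("lipdVersion", "1.0"))   # no version found: assume 1.0
    while version != CURRENT_VERSION:
        if version not in STEPS:
            raise ValueError("Unknown LiPD version: %s" % version)
        d = STEPS[version](d)
        version = str(d["lipdVersion"])
    return d
```

With this layout, supporting a later version means adding one new step to the table; files that are several versions behind pass through every intermediate step automatically.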
zips¶
-
lipd.zips.
unzipper
(filename, dir_tmp)¶ Unzip .lpd file contents to tmp directory. :param str filename: filename.lpd :param str dir_tmp: Tmp folder to extract contents to :return None:
-
lipd.zips.
zipper
(root_dir='', name='', path_name_ext='')¶ Zips up directory back to the original location :param str root_dir: Root directory of the archive :param str name: <datasetname>.lpd :param str path_name_ext: /path/to/filename.lpd
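Since a .lpd file is unzipped and zipped like an ordinary archive, the round trip can be sketched with the standard `zipfile` module. The function names `zip_dir` and `unzip_to` are hypothetical, and the sketch assumes the output archive is written outside the directory being zipped; it does not reproduce the bagging steps the library performs around these calls.

```python
import os
import zipfile


def zip_dir(root_dir, path_name_ext):
    """Write every file under root_dir into the archive at path_name_ext."""
    with zipfile.ZipFile(path_name_ext, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _dirs, files in os.walk(root_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                # Store paths relative to root_dir so the archive is relocatable.
                archive.write(full_path, os.path.relpath(full_path, root_dir))


def unzip_to(filename, dir_tmp):
    """Extract the archive at filename into the directory dir_tmp."""
    with zipfile.ZipFile(filename) as archive:
        archive.extractall(dir_tmp)
```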