lynguine.access

The access module provides functionality for accessing data from various sources, including local files, web resources, and databases.

IO Module

lynguine.access.io.multiline_str_representer(dumper, data)[source]
lynguine.access.io.str_type()[source]
lynguine.access.io.bool_type()[source]
lynguine.access.io.int_type()[source]
lynguine.access.io.float_type()[source]
lynguine.access.io.extract_dtypes(details)[source]

Extract dtypes from the details dictionary.

Parameters:

details (dict) – The details of the file to be read.

Returns:

The dtypes.

Return type:

dict

lynguine.access.io.extract_sheet(details, gsheet=True)[source]

Extract the sheet name from details.

Parameters:
  • details (dict) – The details of the file to be read.

  • gsheet (bool) – Whether to use gspread_pandas.

lynguine.access.io.read_json(details)[source]

Read data from a json file.

Parameters:

details (dict) – The details of the file to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame

lynguine.access.io.write_json(df, details)[source]

Write data to a json file.

Parameters:
  • df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The data to be written.

  • details (dict) – The details of the file to be written.

lynguine.access.io.read_yaml(details)[source]

Read data from a yaml file.

Parameters:

details (dict) – The details of the file to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame

lynguine.access.io.read_markdown(details)[source]

Read data from a markdown file.

Parameters:

details (dict) – The details of the file to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame

lynguine.access.io.write_markdown(df, details)[source]

Write data to a markdown file.

Parameters:
  • df (pandas.DataFrame) – The data to be written.

  • details (dict) – The details of the file to be written.

lynguine.access.io.write_yaml(df, details)[source]

Write data to a yaml file.

Parameters:
  • df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The data to be written.

  • details (dict) – The details of the file to be written.

lynguine.access.io.read_bibtex(details)[source]

Read data from a bibtex file.

Parameters:

details (dict) – The details of the file to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame

lynguine.access.io.write_bibtex(df, details)[source]

Write data to a bibtex file.

Parameters:
  • df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The data to be written.

  • details (dict) – The details of the file to be written.

lynguine.access.io.read_directory(details, filereader=None, filereader_args={}, default_glob='*', source=None)[source]

Read data from a directory of files.

Parameters:
  • details (dict) – The details of the directory to be read.

  • filereader (function) – The function to be used to read the file.

  • filereader_args (dict) – The arguments to be passed to the filereader.

  • default_glob (str) – The default glob to be used if none is specified.

  • source (dict) – The source information for the data.

Raises:

ValueError – if the same filename is specified multiple times.
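The duplicate-filename check described above can be illustrated with a small standalone sketch. It does not reproduce lynguine's implementation: the helper name `check_unique_filenames` is hypothetical, and comparing base names (rather than full paths) is an assumption made for illustration.

```python
from collections import Counter
from pathlib import Path


def check_unique_filenames(paths):
    """Raise ValueError if any file name occurs more than once in paths.

    Hypothetical helper: comparing base names is an assumption, not
    something the lynguine documentation specifies.
    """
    counts = Counter(Path(p).name for p in paths)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"Filename(s) specified multiple times: {duplicates}")
    return list(paths)
```

Failing early with the offending names listed makes it easier to see which glob patterns overlap.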

lynguine.access.io.read_list(filelist)[source]

Read from a list of files.

Parameters:

filelist (list) – The list of files to be read.

Returns:

The data read from the files.

Return type:

pandas.DataFrame

lynguine.access.io.read_files(filelist, store_fields=None, filereader=None, filereader_args=None)[source]

Read files from a given list.

Parameters:
  • filelist (list) – The list of files to be read.

  • store_fields (dict) – The fields to be stored in the data.

  • filereader (function) – The function to be used to read the file.

  • filereader_args (dict) – The arguments to be passed to the filereader.

Returns:

The data read from the files.

Return type:

pandas.DataFrame

lynguine.access.io.write_directory(df, details, filewriter=None, filewriter_args={})[source]

Write scoring data to a directory of files.

Parameters:
  • df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The data to be written.

  • details (dict) – The details of the file to be written.

  • filewriter (function) – The function to be used to write the file.

  • filewriter_args (dict) – The arguments to be passed to the filewriter.

Raises:

ValueError – if the same filename is specified multiple times.

lynguine.access.io.read_json_file(filename)[source]

Read a json file and return a python dictionary.

Parameters:

filename (str) – The filename of the json file.

Returns:

The data read from the file.

Return type:

dict

lynguine.access.io.write_json_file(data, filename)[source]

Write a json file from a python dictionary.

lynguine.access.io.default_file_reader(typ)[source]

Return the default file reader for a given type.

Parameters:

typ (str) – The type of file to be read.

Returns:

The default file reader.

Return type:

function

Raises:

ValueError – if the type is not recognised.
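A lookup of this kind is commonly built as a table from type name to function. The sketch below shows that pattern under stated assumptions: `READERS`, `pick_reader`, and the two reader functions are hypothetical names, and the type strings lynguine actually recognises are not listed in this reference.

```python
import json


def read_json_text(text):
    """Parse JSON text into Python objects."""
    return json.loads(text)


def read_plain_text(text):
    """Return the text unchanged."""
    return text


# Hypothetical registry mapping a type name to a reader function.
READERS = {"json": read_json_text, "plain": read_plain_text}


def pick_reader(typ):
    """Return the reader registered for typ, or raise ValueError."""
    try:
        return READERS[typ]
    except KeyError:
        raise ValueError(f"Unrecognised file type: {typ!r}") from None
```

With this layout, supporting a new type means adding one entry to the table rather than another branch in a conditional.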

lynguine.access.io.default_file_writer(typ)[source]

Return the default file writer for a given type.

Parameters:

typ (str) – The type of file to be written.

Returns:

The default file writer.

Return type:

function

Raises:

ValueError – if the type is not recognised.

lynguine.access.io.read_file(filename)[source]

Attempt to read the file given the extension.

lynguine.access.io.read_yaml_file(filename)[source]

Read a yaml file and return a python dictionary.

lynguine.access.io.read_bibtex_file(filename)[source]

Read a bibtex file and return a python dictionary.

Parameters:

filename (str) – The filename of the bibtex file.

Returns:

The data read from the file.

Return type:

dict

lynguine.access.io.yaml_prep(data)[source]

Prepare any fields for writing in yaml.

Parameters:

data (dict) – The data to be prepared.

Returns:

The prepared data.

Return type:

dict

lynguine.access.io.write_bibtex_file(data, filename)[source]

Write a bibtex file from a python dictionary.

Parameters:
  • data (dict) – The data to be written.

  • filename (str) – The filename of the bibtex file.

lynguine.access.io.write_yaml_file(data, filename)[source]

Write a yaml file from a python dictionary.

Parameters:
  • data (dict) – The data to be written.

  • filename (str) – The filename of the yaml file.

lynguine.access.io.read_yaml_meta_file(filename)[source]

Read meta information associated with a file as a yaml and return a python dictionary if it exists.

Parameters:

filename (str) – The filename of the file.

Returns:

The meta information.

Return type:

dict

lynguine.access.io.write_yaml_meta_file(data, filename)[source]

Write meta information associated with a file to a yaml.

Parameters:
  • data (dict) – The data to be written.

  • filename (str) – The filename of the file.

lynguine.access.io.read_markdown_file(filename, include_content=True)[source]

Read a markdown file and return a python dictionary.

Parameters:
  • filename (str) – The filename of the markdown file.

  • include_content (bool) – Whether to include the content in the data.

Returns:

The data read from the file.

Return type:

dict

lynguine.access.io.read_docx_file(filename, include_content=True)[source]

Read information from a docx file.

Parameters:
  • filename (str) – The filename of the docx file.

  • include_content (bool) – Whether to include the content in the data.

Returns:

The data read from the file.

Return type:

dict

lynguine.access.io.read_talk_file(filename, include_content=True)[source]

Read a markdown talk file.

Parameters:
  • filename (str) – The filename of the talk file.

  • include_content (bool) – Whether to include the content in the data.

Returns:

The data read from the file.

lynguine.access.io.read_talk_include_file(filename, include_content=True)[source]

Read a markdown talk include file.

Parameters:
  • filename (str) – The filename of the talk include file.

  • include_content (bool) – Whether to include the content in the data.

Returns:

The data read from the file.

lynguine.access.io.write_url_file(data, filename, content, include_content=True)[source]

Write a url to a file.

Parameters:
  • data (dict) – The data to be written.

  • filename (str) – The filename of the url file.

  • content (str) – The content of the url file.

  • include_content (bool) – Whether to include the content in the data.

lynguine.access.io.write_markdown_file(data, filename, content=None, include_content=True)[source]

Write a markdown file from a python dictionary.

Parameters:
  • data (dict) – The data to be written.

  • filename (str) – The filename of the markdown file.

  • content (str) – The content of the markdown file.

  • include_content (bool) – Whether to include the content in the data.

lynguine.access.io.create_document_content(**kwargs)[source]

Create a document content from the arguments.

Parameters:
  • content (str) – The content of the document.

  • filename (str) – The filename of the document.

  • directory (str) – The directory of the document.

Returns:

The data, filename and content of the document.

Return type:

tuple

lynguine.access.io.create_letter(**kwargs)[source]

Create a markdown letter.

Parameters:
  • content (str) – The content of the letter.

  • filename (str) – The filename of the letter.

  • directory (str) – The directory of the letter.

Returns:

The data, filename and content of the letter.

Return type:

tuple

lynguine.access.io.write_letter_file(data, filename, content, include_content=True)[source]

Write a letter file from a python dictionary.

Parameters:
  • data (dict) – The data to be written.

  • filename (str) – The filename of the letter.

  • content (str) – The content of the letter.

  • include_content (bool) – Whether to include the content in the data.

Write a url to prepopulate a Google form

lynguine.access.io.write_docx_file(data, filename, content, include_content=True)[source]

Write a docx file from a python dictionary.

Parameters:
  • data (dict) – The data to be written.

  • filename (str) – The filename of the docx file.

  • content (str) – The content of the docx file.

  • include_content (bool) – Whether to include the content in the data.

lynguine.access.io.write_tex_file(data, filename, content, include_content=True)[source]

Write a tex file from a python dictionary.

Parameters:
  • data (dict) – The data to be written.

  • filename (str) – The filename of the tex file.

  • content (str) – The content of the tex file.

  • include_content (bool) – Whether to include the content in the data.

lynguine.access.io.read_csv(details)[source]

Read data from a csv file.

Parameters:

details (dict) – The details of the file to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame

lynguine.access.io.read_excel(details)[source]

Read data from an excel spreadsheet.

Parameters:

details (dict) – The details of the file to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame

lynguine.access.io.read_fake(details)[source]

Read data from an artificially generated source.

Parameters:

details (dict) – The details of the data to be read.

Returns:

The data read from the source.

Return type:

pandas.DataFrame

lynguine.access.io.read_local(details)[source]

Read data directly from details file.

Parameters:

details (dict) – The details of the data to be read.

Returns:

The data read from the settings file.

Return type:

pandas.DataFrame

Raises:

ValueError – If the ‘details’ is not a dictionary or is missing required keys.

lynguine.access.io.read_gsheet(details)[source]

Read data from a Google sheet.

Parameters:

details (dict) – The details of the file to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame

lynguine.access.io.write_excel(df, details)[source]

Write data to an excel spreadsheet.

Parameters:
  • df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The data to be written.

  • details (dict) – The details of the file to be written.

lynguine.access.io.write_csv(df, details)[source]

Write data to a csv file.

Parameters:
  • df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The data to be written.

  • details (dict) – The details of the file to be written.

lynguine.access.io.write_gsheet(df, details)[source]

Write data to a Google sheet.

Parameters:
  • df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The data to be written.

  • details (dict) – The details of the file to be written.

lynguine.access.io.gdrf_(default_glob, filereader, name='', docstr='')[source]

Function generator for different directory readers.

Parameters:
  • default_glob (str) – The default glob to be used for the directory reader.

  • filereader (function) – The function to be used to read the files.

  • name (str) – The name of the function to be created.

  • docstr (str) – The docstring for the function to be created.

Returns:

The function to be created.

Return type:

function
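The function-generator idea can be sketched as a closure that captures the glob and the file reader, then receives its own name and docstring. This is an illustration only: `make_directory_reader` is a hypothetical name, and the generated function here takes a directory path, whereas the directory readers documented in this module take a `details` dictionary.

```python
import glob
import os


def make_directory_reader(default_glob, filereader, name="", docstr=""):
    """Build a function that reads every file in a directory matching a glob."""

    def directory_reader(directory, pattern=None):
        # Fall back to the glob captured when the reader was generated.
        pattern = pattern or default_glob
        paths = sorted(glob.glob(os.path.join(directory, pattern)))
        return [filereader(path) for path in paths]

    # Give the generated function its own name and docstring.
    if name:
        directory_reader.__name__ = name
    directory_reader.__doc__ = docstr
    return directory_reader


def read_text(path):
    """Read a file as UTF-8 text (hypothetical file reader for the example)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


read_text_directory = make_directory_reader(
    "*.txt",
    read_text,
    name="read_text_directory",
    docstr="Read a directory of text files.",
)
```

Generating the functions this way keeps one copy of the directory-walking logic while still exposing a separately named and documented reader per file type.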

lynguine.access.io.update_store_fields(details)[source]

Add default store field values.

Parameters:

details (dict) – The details to update with.

Returns:

The updated details.

Return type:

dict

lynguine.access.io.gdwf_(filewriter, name='', docstr='')[source]

Function generator for different directory writers.

Parameters:
  • filewriter (function) – The function to be used to write the files.

  • name (str) – The name of the function to be created.

  • docstr (str) – The docstring for the function to be created.

Returns:

The function to be created.

Return type:

function

lynguine.access.io.populate_directory_readers(readers)[source]

Automatically create functions for reading directories.

Parameters:

readers (list) – The readers to be created.

lynguine.access.io.populate_directory_writers(writers)[source]

Automatically create functions for writing directories.

Parameters:

writers (list) – The writers to be created.

lynguine.access.io.finalize_data(df, interface)[source]

Finalize the data frame by augmenting with any columns.

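Parameters:
  • df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The data frame to be finalized.

  • interface – The interface object.

Returns: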

The finalized data frame.

Return type:

pandas.DataFrame or lynguine.data.CustomDataFrame

lynguine.access.io.read_hstack(details)[source]

Read data from a horizontal stack of data sources.

Parameters:

details (dict) – The details of the data to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame

lynguine.access.io.read_stack(details)[source]

Read data from a horizontal stack of data series, where each source is a single-row DataFrame. Returns a single-row DataFrame combining all sources.

Parameters:

details (dict) – The details of the data series to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame

lynguine.access.io.read_vstack(details)[source]

Read data from a vertical stack of data sources.

Parameters:

details (dict) – The details of the data to be read.

Returns:

The data read from the file.

Return type:

pandas.DataFrame
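The difference between horizontal and vertical stacking can be shown with plain dictionaries standing in for data frames. This is a conceptual sketch, not lynguine's code: the functions `hstack` and `vstack` below are hypothetical, each source is assumed to map an index value to a row, and how lynguine aligns or joins indices is not described in this reference.

```python
def hstack(sources):
    """Combine sources column-wise: rows sharing an index value are merged."""
    combined = {}
    for source in sources:
        for index, row in source.items():
            combined.setdefault(index, {}).update(row)
    return combined


def vstack(sources):
    """Combine sources row-wise: rows are appended one after another."""
    rows = []
    for source in sources:
        rows.extend(source.items())
    return rows


# Two sources with the same index but different columns (horizontal case).
names = {"alice": {"name": "Alice"}, "bob": {"name": "Bob"}}
ages = {"alice": {"age": 30}, "bob": {"age": 25}}
wide = hstack([names, ages])

# Two sources with the same columns but different rows (vertical case).
tall = vstack([{"alice": {"name": "Alice"}}, {"bob": {"name": "Bob"}}])
```

Horizontal stacking widens each row with more columns, while vertical stacking lengthens the table with more rows.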

lynguine.access.io.read_series(details)[source]

Read in the series data from the details given in configuration. A series type is a data frame where the indices aren’t unique. If read in as a series, each entry in each column of the data frame is converted to a list whose length is the number of rows that share that index value.

Parameters:

details (dict) – The details of the series data to be read.

Returns:

The data read in.

Return type:

pandas.DataFrame
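The conversion of repeated index values into lists can be sketched without pandas. The function `group_rows_by_index` is a hypothetical helper written for this illustration, with rows represented as dictionaries; it is not the routine lynguine uses.

```python
def group_rows_by_index(rows, index_key):
    """Collapse rows sharing an index value into one row of lists."""
    grouped = {}
    for row in rows:
        entry = grouped.setdefault(row[index_key], {})
        for column, value in row.items():
            if column != index_key:
                # Each column accumulates one list element per repeated row.
                entry.setdefault(column, []).append(value)
    return grouped


rows = [
    {"student": "ann", "score": 7},
    {"student": "ann", "score": 9},
    {"student": "bob", "score": 5},
]
series = group_rows_by_index(rows, "student")
# series == {"ann": {"score": [7, 9]}, "bob": {"score": [5]}}
```

Here "ann" appears twice in the index, so her `score` entry becomes a two-element list, while "bob" gets a one-element list.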

lynguine.access.io.read_data(details)[source]

Read in the data from the details given in configuration.

Parameters:

details (dict) – The details of the data to be read.

Returns:

The data read in.

Return type:

pandas.DataFrame

lynguine.access.io.read_auto(details)[source]

Read in the data from the details given in configuration. Use the file extension to determine the type of data to read.

Parameters:

details (dict) – The details of the data to be read.

Returns:

The data read in.

Return type:

pandas.DataFrame

lynguine.access.io.convert_data(read_details, write_details)[source]

Convert a data set from one form to another.

Parameters:
  • read_details (dict) – The details of the data to be read.

  • write_details (dict) – The details of the data to be written.
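Conversion amounts to a read step followed by a write step in a different format. The sketch below shows that shape with the standard library only, converting JSON records to CSV text. The function name `convert_json_to_csv` is hypothetical, and the keys that `read_details` and `write_details` expect are not covered here.

```python
import csv
import io
import json


def convert_json_to_csv(json_text):
    """Read a JSON list of records and write it back out as CSV text."""
    # Read step: parse the source format into records.
    records = json.loads(json_text)
    fieldnames = list(records[0]) if records else []
    # Write step: serialise the same records in the target format.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```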

lynguine.access.io.data_exists(details)[source]

Check if a particular data structure exists or needs to be created.

Parameters:

details (dict) – The details of the data to be checked.

Returns:

Whether the data exists or not.

Return type:

bool

lynguine.access.io.load_or_create_df(details, index)[source]

Load in a data frame or create it if it doesn’t exist yet.

Parameters:
  • details (dict) – The details of the data to be loaded or created.

  • index (pandas.Index) – The index to be used if the data frame needs to be created.

Returns:

The data frame.
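The load-or-create pattern can be sketched as follows, with a JSON file standing in for the stored data and a dictionary standing in for the data frame. The helper `load_or_create_records` is hypothetical, and creating one empty record per index entry is an assumption about what a newly created frame would contain.

```python
import json
import os


def load_or_create_records(filename, index):
    """Load records from a JSON file, or build empty ones from index."""
    if os.path.exists(filename):
        with open(filename, encoding="utf-8") as handle:
            return json.load(handle)
    # Nothing stored yet: create one empty record per index entry.
    return {key: {} for key in index}
```

The index is only consulted when nothing can be loaded, which mirrors the description of `index` as the index to use if the data frame needs to be created.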

lynguine.access.io.globals_data(details, index=None)[source]

Load in the globals data to a data frame.

Parameters:

details (dict) – The details of the data to be loaded.

lynguine.access.io.cache(details, index=None)[source]

Load in the cache data to a data frame.

Parameters:

details (dict) – The details of the data to be loaded.

lynguine.access.io.scores(details, index=None)[source]

Load in the score data to data frames.

Parameters:

details (dict) – The details of the data to be loaded.

lynguine.access.io.series(details, index=None)[source]

Load in a series to a data frame.

Parameters:

details (dict) – The details of the data to be loaded.

lynguine.access.io.write_data(df, details)[source]

Write the data using the details given in configuration.

Parameters:
  • df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The data to be written.

  • details (dict) – The details of the data to be written.

lynguine.access.io.read_bibtex_directory(details)

Read a directory of bibtex files.

lynguine.access.io.read_docx_directory(details)

Read a directory of word files.

lynguine.access.io.read_json_directory(details)

Read a directory of json files.

lynguine.access.io.read_markdown_directory(details)

Read a directory of markdown files.

lynguine.access.io.read_meta_directory(details)

Read a directory of yaml meta files.

lynguine.access.io.read_plain_directory(details)

Read a directory of files.

lynguine.access.io.read_yaml_directory(details)

Read a directory of yaml files.

lynguine.access.io.write_json_directory(df, details)

Write a directory of json files.

lynguine.access.io.write_markdown_directory(df, details)

Write a directory of markdown files.

lynguine.access.io.write_meta_directory(df, details)

Write a directory of yaml meta files.

lynguine.access.io.write_yaml_directory(df, details)

Write a directory of yaml files.

Download Module

class lynguine.access.download.FileDownloader(interface, data_resources, data_name)[source]

Bases: object

A class for downloading data files from a url.

Initialize the FileDownloader class.

Parameters:
  • data_resources – The data resources dictionary.

  • data_name – The name of the data to download.

property interface

Return the interface object.

Returns:

The interface object.

property data_name

Return the name of the data to download.

Returns:

The name of the data to download.

property data_resources

Return the data resources dictionary.

Returns:

The data resources dictionary.

download_data(prompt=<function prompt_stdin>)[source]

Check with the user that they are happy with terms and conditions for the data, then download it.

Parameters:

prompt – A function that takes a string and returns a boolean.

Returns:

None

Raises:

ValueError – if the data is not found.
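Because `prompt` is any function that takes a string and returns a boolean, the consent step can be replaced with a non-interactive callback, for example in tests. The sketch below illustrates that callback pattern in isolation. `download_with_consent`, `accept_all`, and `decline_all` are hypothetical names, and what lynguine does when the user declines is not stated in this reference, so returning False is an assumption.

```python
def download_with_consent(terms, fetch, prompt):
    """Call fetch() only if prompt(terms) returns True."""
    if not prompt(terms):
        # Assumption for this sketch: a declined prompt skips the download.
        return False
    fetch()
    return True


def accept_all(message):
    """Non-interactive prompt that always agrees."""
    return True


def decline_all(message):
    """Non-interactive prompt that always refuses."""
    return False
```

Passing the prompt in as an argument keeps the download logic independent of standard input.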

class lynguine.access.download.GitDownloader(interface, data_resources, data_name, git_url)[source]

Bases: FileDownloader

Initialize the FileDownloader class.

Parameters:
  • data_resources – The data resources dictionary.

  • data_name – The name of the data to download.

property data_name

Return the name of the data to download.

Returns:

The name of the data to download.

property data_resources

Return the data resources dictionary.

Returns:

The data resources dictionary.

download_data(prompt=<function prompt_stdin>)

Check with the user that they are happy with terms and conditions for the data, then download it.

Parameters:

prompt – A function that takes a string and returns a boolean.

Returns:

None

Raises:

ValueError – if the data is not found.

property interface

Return the interface object.

Returns:

The interface object.