lynguine.util
The util module provides various utility functions for working with data, files, and text.
DataFrame Utilities
- lynguine.util.dataframe.convert_datetime_to_str(df)[source]
Convert datetime columns to strings in isoformat for ease of writing.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The DataFrame to convert.
- Returns:
The converted DataFrame.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
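As a rough illustration of the behaviour described above, the following pandas-only sketch converts datetime columns to ISO-format strings. The helper name datetime_columns_to_str is hypothetical, and this is not lynguine's own implementation.

```python
import pandas as pd

def datetime_columns_to_str(df):
    """Hypothetical stand-in: return a copy with datetime columns as ISO strings."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_datetime64_any_dtype(out[col]):
            # Timestamp.isoformat() gives e.g. '2024-01-02T00:00:00'; missing values become None.
            out[col] = out[col].apply(lambda t: t.isoformat() if pd.notna(t) else None)
    return out

frame = pd.DataFrame({"when": pd.to_datetime(["2024-01-02", "2024-03-04"]), "n": [1, 2]})
converted = datetime_columns_to_str(frame)
print(converted["when"].tolist())
```

Non-datetime columns such as n are left untouched in this sketch.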
- lynguine.util.dataframe.reorder_dataframe(df, order)[source]
Reorder the columns of the given data frame so that the columns listed in order come first, in that order, followed by any remaining columns in alphabetical order.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The DataFrame to reorder.
order (list) – The columns to place first, in the desired order.
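The reordering rule can be sketched with plain pandas as follows. The helper name reorder_columns_sketch is hypothetical, and skipping requested columns that are absent from the frame is an assumption rather than documented behaviour.

```python
import pandas as pd

def reorder_columns_sketch(df, order):
    """Hypothetical stand-in: requested columns first, the rest alphabetically."""
    # Assumption: columns named in `order` but missing from the frame are ignored.
    first = [col for col in order if col in df.columns]
    rest = sorted(col for col in df.columns if col not in first)
    return df[first + rest]

frame = pd.DataFrame({"b": [1], "a": [2], "d": [3], "c": [4]})
reordered = reorder_columns_sketch(frame, ["d", "b"])
print(list(reordered.columns))  # ['d', 'b', 'a', 'c']
```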
- lynguine.util.dataframe.convert_datetime(df, columns)[source]
Preprocessor to set datetime type on columns.
- lynguine.util.dataframe.convert_int(df, columns)[source]
Preprocessor to set integer type on columns.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be converted.
columns (list) – The columns to be converted.
- Returns:
The converted dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.convert_string(df, columns)[source]
Preprocessor to set string type on columns.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be converted.
columns (list) – The columns to be converted.
- Returns:
The converted dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.convert_year_iso(df, column='year', month=1, day=1)[source]
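Preprocessor to convert a year column to ISO-format dates, using the given month and day.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be converted.
column (str) – The column containing the year to be converted.
month (int) – The month to use in the converted date.
day (int) – The day to use in the converted date.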
- Returns:
The converted dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
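Judging by its name and signature, convert_year_iso combines a year with a fixed month and day. Under that assumption, the per-cell conversion might look like the sketch below. The helper year_to_iso is hypothetical and uses only the standard library.

```python
import datetime

def year_to_iso(year, month=1, day=1):
    """Hypothetical helper: build an ISO date string from a year plus a fixed month and day."""
    return datetime.date(int(year), month, day).isoformat()

iso_default = year_to_iso(2021)
iso_custom = year_to_iso("1999", month=12, day=31)
print(iso_default, iso_custom)  # 2021-01-01 1999-12-31
```

A helper like this could be applied to a year column with the column's apply method.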
- lynguine.util.dataframe.addmonth(df, source='date')[source]
Add month column based on source date field.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be augmented.
source (str) – The source column to be used.
- Returns:
The augmented dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- Raises:
KeyError – If the source column is not in the dataframe
TypeError – If the source column is not of type datetime.date
ValueError – If the source column is not a valid date
- lynguine.util.dataframe.addyear(df, source='date')[source]
Add year column based on source date field.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be augmented.
source (str) – The source column to be used.
- Returns:
The augmented dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.augmentmonth(df, destination='month', source='date')[source]
Augment the month column based on source date field.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be augmented.
destination (str) – The destination column to be used.
source (str) – The source column to be used.
- Returns:
The augmented dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.augmentyear(df, destination='year', source='date')[source]
Augment the year column based on source date field.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be augmented.
destination (str) – The destination column to be used.
source (str) – The source column to be used.
- Returns:
The augmented dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
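One plausible reading of "augment" is that missing entries in the destination column are filled from the source date. The sketch below illustrates that reading with plain pandas. The helper name augment_year_sketch and the fill-only-missing behaviour are assumptions, not confirmed lynguine behaviour.

```python
import datetime
import pandas as pd

def augment_year_sketch(df, destination="year", source="date"):
    """Hypothetical stand-in: fill missing destination values from the source date's year."""
    out = df.copy()
    derived = out[source].apply(lambda d: d.year)
    if destination in out.columns:
        # Assumption: existing values are kept and only gaps are filled.
        out[destination] = out[destination].fillna(derived)
    else:
        out[destination] = derived
    return out

frame = pd.DataFrame({
    "date": [datetime.date(2019, 5, 1), datetime.date(2021, 7, 4)],
    "year": [2019, None],
})
augmented = augment_year_sketch(frame)
created = augment_year_sketch(frame[["date"]])
print([int(y) for y in augmented["year"]])
```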
- lynguine.util.dataframe.augmentcurrency(df, source='amount', sf=0)[source]
Preprocessor to format the source amount column as a currency string.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be augmented.
source (str) – The source column to be used.
sf (int) – The precision used when formatting the amount.
- Returns:
The converted dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.fillna(df, column, value)[source]
Fill missing values in a column with a given value.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be converted.
column (str) – The column to be converted.
value (str) – The value to be used to fill missing values.
- Returns:
The converted dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.ascending(df, by)[source]
Sort dataframe in ascending order.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be sorted.
by (str) – The column to sort by.
- Returns:
The sorted dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.descending(df, by)[source]
Sort dataframe in descending order.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be sorted.
by (str) – The column to sort by.
- Returns:
The sorted dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.recent(df, column='year', since_year=2000)[source]
Filter on whether item is recent, as given by since_year.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be filtered.
column (str) – The column to be filtered on.
since_year (int) – The year since which items are considered recent.
- Returns:
The filtered dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.current(df, start='start', end='end', current=None, today=None)[source]
Filter on whether the row is current as given by start and end dates. If current is given then it is used instead of the range check. If today is given then it is used instead of the current date.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be filtered.
start (str) – The start date of the entry.
end (str) – The end date of the entry.
current (str) – Column of true/false current entries.
today (datetime.date) – The date to be used as today.
- Returns:
The filtered dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
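The precedence described above (an explicit current column overrides the date-range check, and today overrides the system date) can be sketched with plain pandas. The helper name filter_current_sketch is hypothetical, and returning the filtered rows rather than a boolean mask is an assumption.

```python
import datetime
import pandas as pd

def filter_current_sketch(df, start="start", end="end", current=None, today=None):
    """Hypothetical stand-in for the range check described above."""
    if current is not None:
        # An explicit true/false column takes precedence over the date range.
        return df[df[current].astype(bool)]
    if today is None:
        today = datetime.date.today()
    now = pd.Timestamp(today)
    return df[(df[start] <= now) & (df[end] >= now)]

frame = pd.DataFrame({
    "name": ["a", "b", "c"],
    "start": pd.to_datetime(["2020-01-01", "2022-06-01", "2018-01-01"]),
    "end": pd.to_datetime(["2021-01-01", "2024-06-01", "2030-01-01"]),
    "is_current": [True, False, True],
})
by_range = filter_current_sketch(frame, today=datetime.date(2023, 1, 1))
by_flag = filter_current_sketch(frame, current="is_current")
print(by_range["name"].tolist(), by_flag["name"].tolist())
```

Passing today explicitly keeps the result reproducible, which is useful in tests.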
- lynguine.util.dataframe.former(df, end='end')[source]
Filter on whether item is former.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be filtered.
end (str) – The end date of the entry.
- Returns:
The filtered dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.onbool(df, column='current', invert=False)[source]
Filter on whether column is positive (or negative if inverted).
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be filtered.
column (str) – The column to be filtered on.
invert (bool) – Whether to invert the filter.
- Returns:
The filtered dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.columnis(df, column, value)[source]
Filter on whether a given column is equal to a given value.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be filtered.
column (str) – The column to be filtered on.
value – The value to be used to filter.
- Returns:
The filtered dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
- lynguine.util.dataframe.columncontains(df, column, value)[source]
Filter on whether column contains a given value.
- Parameters:
df (pandas.DataFrame or lynguine.data.CustomDataFrame) – The dataframe to be filtered.
column (str) – The column to be filtered on.
value – The value to be used to filter.
- Returns:
The filtered dataframe.
- Return type:
pandas.DataFrame or lynguine.data.CustomDataFrame
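The three column filters above (onbool, columnis, columncontains) can be approximated with boolean masks in plain pandas. The function names below are hypothetical, and treating "contains" as Python's in operator on each cell is an assumption.

```python
import pandas as pd

def onbool_mask(df, column="current", invert=False):
    """Rows where the column is truthy, or falsy when invert is True."""
    mask = df[column].astype(bool)
    return ~mask if invert else mask

def columnis_mask(df, column, value):
    """Rows where the column equals the given value."""
    return df[column] == value

def columncontains_mask(df, column, value):
    """Rows where each cell (a list or string) contains the value."""
    return df[column].apply(lambda cell: value in cell)

frame = pd.DataFrame({
    "name": ["ada", "bob", "cy"],
    "current": [True, False, True],
    "role": ["staff", "student", "staff"],
    "tags": [["ml", "stats"], ["bio"], ["ml"]],
})
active = frame[onbool_mask(frame)]["name"].tolist()
inactive = frame[onbool_mask(frame, invert=True)]["name"].tolist()
staff = frame[columnis_mask(frame, "role", "staff")]["name"].tolist()
ml_people = frame[columncontains_mask(frame, "tags", "ml")]["name"].tolist()
print(active, inactive, staff, ml_people)
```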
File Utilities
- lynguine.util.files.get_git_version(filename, full_path, git_path)[source]
Get the latest Git version (commit hash) of a file.
- lynguine.util.files.read_txt_file(filename, dir_name='.', comment_char='#')[source]
Read in a text file ignoring lines that start with a comment character.
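A minimal standard-library sketch of reading a file while skipping comment lines is shown here. The helper name read_uncommented_lines is hypothetical, and returning a list of lines with trailing newlines removed is an assumption about the result format.

```python
import os
import tempfile

def read_uncommented_lines(filename, dir_name=".", comment_char="#"):
    """Hypothetical stand-in: return the lines that do not start with comment_char."""
    path = os.path.join(dir_name, filename)
    with open(path, "r", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if not line.startswith(comment_char)]

# Demonstrate on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "notes.txt"), "w", encoding="utf-8") as handle:
        handle.write("# a comment\nfirst\n# another\nsecond\n")
    kept_lines = read_uncommented_lines("notes.txt", dir_name=tmp)

print(kept_lines)  # ['first', 'second']
```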
YAML Utilities
- exception lynguine.util.yaml.FileFormatError(ind, msg=None, field=None)[source]
Bases:
Exception
Exception raised for errors in the file format.
- add_note()
Exception.add_note(note) – add a note to the exception
- args
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- lynguine.util.yaml.update_from_file(dictionary, filename)[source]
Update a given dictionary with the fields from a specified file.
- lynguine.util.yaml.header_field(field, fields, user_file=['_config.yml'])[source]
Return one field from yaml header fields.
Liquid Template Utilities
- lynguine.util.liquid.load_template_env(ext='.md', template_dir=None)[source]
Load in the templates to be used for lists.
- lynguine.util.liquid.relative_url(string)[source]
Filter to convert a string to a relative URL for a Jupyter notebook under Liquid.
Miscellaneous Utilities
- lynguine.util.misc.iskeyword()
x.__contains__(y) <==> y in x.
- lynguine.util.misc.log = <lynguine.log.Logger object>
Utility functions for helping, e.g. to create the relevant yaml files quickly.
- lynguine.util.misc.reorder_dictionary(dictionary, order, sort_remaining=True)[source]
Reorder a dictionary according to a given order.
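The reordering can be sketched in plain Python, relying on the fact that dictionaries preserve insertion order. The helper name reorder_dict_sketch is hypothetical, and ignoring keys in order that are absent from the dictionary is an assumption.

```python
def reorder_dict_sketch(dictionary, order, sort_remaining=True):
    """Hypothetical stand-in: keys listed in `order` first, then the remaining keys."""
    # Assumption: keys in `order` that are not in the dictionary are skipped.
    ordered = {key: dictionary[key] for key in order if key in dictionary}
    remaining = [key for key in dictionary if key not in ordered]
    if sort_remaining:
        remaining = sorted(remaining)
    ordered.update((key, dictionary[key]) for key in remaining)
    return ordered

record = {"z": 1, "title": 2, "a": 3, "date": 4}
sorted_rest = reorder_dict_sketch(record, ["title", "date"])
original_rest = reorder_dict_sketch(record, ["title", "date"], sort_remaining=False)
print(list(sorted_rest), list(original_rest))
```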
- lynguine.util.misc.extract_full_filename(details)[source]
Return the filename from the details of directory and filename.
- lynguine.util.misc.extract_root_directory(directory, environs=['HOME', 'USERPROFILE', 'TEMP', 'TMPDIR', 'TMP'])[source]
Extract a root directory and a subdirectory from a given directory string.
- lynguine.util.misc.extract_abs_filename(details)[source]
Return the absolute filename by adding the current directory if it’s not present.
- lynguine.util.misc.isna(entry)[source]
Check if an entry is missing.
- Parameters:
entry – The entry to be checked.
- Returns:
True if the entry is missing, False otherwise.
- Return type:
bool
- lynguine.util.misc.to_valid_var(variable)[source]
Convert a given input (scalar or string) to a valid Python variable name. Replaces invalid characters with underscores and ensures the name is not a Python keyword.
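The conversion described above can be sketched with the standard library's re and keyword modules. The helper name to_valid_name is hypothetical, and the handling of leading digits and keywords (an underscore prefix and suffix respectively) is an assumption rather than lynguine's documented behaviour.

```python
import keyword
import re

def to_valid_name(variable):
    """Hypothetical stand-in: turn a scalar or string into a valid Python identifier."""
    # Replace every character that is not a letter, digit, or underscore.
    name = re.sub(r"\W", "_", str(variable))
    # Identifiers cannot be empty or start with a digit.
    if not name or name[0].isdigit():
        name = "_" + name
    # Avoid clashing with reserved words such as 'class'.
    if keyword.iskeyword(name):
        name = name + "_"
    return name

examples = [to_valid_name(value) for value in ("my-var", 2024, "class", "2nd place")]
print(examples)  # ['my_var', '_2024', 'class_', '_2nd_place']
```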
- lynguine.util.misc.to_camel_case(text)[source]
Remove non alpha-numeric characters and convert to camel case.
- Parameters:
text (str) – The text to be converted.
- Returns:
The text converted to camel case.
- Return type:
str
- Raises:
ValueError – If the text is empty.
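A minimal sketch of this kind of conversion follows. The helper name camel_case_sketch is hypothetical, and the choice of lower camel case (first word lowercased) is an assumption, since the description does not say which variant is produced.

```python
import re

def camel_case_sketch(text):
    """Hypothetical stand-in: drop non-alphanumeric characters and join words in camel case."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not words:
        raise ValueError("text must contain at least one alphanumeric character")
    # Assumption: the first word is lowercased and later words are capitalised.
    return words[0].lower() + "".join(word.capitalize() for word in words[1:])

first = camel_case_sketch("hello world-example")
second = camel_case_sketch("Data_frame utils!")
print(first, second)  # helloWorldExample dataFrameUtils
```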
- lynguine.util.misc.sub_path_environment(path, environs=['HOME', 'USERPROFILE', 'TEMP', 'TMPDIR', 'TMP', 'BASE'])[source]
Replace a path with values from environment variables.
- lynguine.util.misc.get_path_env(environs=['HOME', 'USERPROFILE', 'TEMP', 'TMPDIR', 'TMP', 'BASE'])[source]
Return the current path with environment variables.
- lynguine.util.misc.get_url_file(url, directory=None, filename=None, ext=None)[source]
Download a file from a url and save it to disk.
Text Utilities
TeX Utilities
- lynguine.util.tex.extract_bib_files(text)[source]
Extract all the bib files listed in the file lines.
- lynguine.util.tex.substitute_inputs(filename, directories=None)[source]
Take the base file and substitute in any input and include files.
- lynguine.util.tex.input_file_name(filename, extension='.tex')[source]
Return the filename with the extension if it exists.
- lynguine.util.tex.process_file(filename, extension='.tex')[source]
Process a file and return the lines.
- lynguine.util.tex.extract_diagrams(lines, type='all')[source]
Extract all the diagrams listed in the file.
- lynguine.util.tex.extract_citations(lines)[source]
Extract all the citations listed in the file lines.
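Citation extraction of this kind is typically done with a regular expression over the file lines. The sketch below is one such approach, not lynguine's implementation. The helper name find_citation_keys and the set of \cite variants it recognises are assumptions.

```python
import re

# Matches \cite, \citep, \citet*, etc., with optional [..] arguments, capturing the key list.
CITE_PATTERN = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")

def find_citation_keys(lines):
    """Hypothetical stand-in: return unique citation keys in order of first appearance."""
    keys = []
    for line in lines:
        for group in CITE_PATTERN.findall(line):
            for key in group.split(","):
                key = key.strip()
                if key and key not in keys:
                    keys.append(key)
    return keys

sample = [
    r"As shown by \cite{knuth84} and \citep[p.~3]{lamport94, knuth84},",
    r"the result holds \citet*{turing36}.",
]
citation_keys = find_citation_keys(sample)
print(citation_keys)  # ['knuth84', 'lamport94', 'turing36']
```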
- lynguine.util.tex.make_bib_file(citations_list, bib_files)[source]
Create a new bibliography file for a given list of citations.
- lynguine.util.tex.get_bib_strings(string_list, bib_files)[source]
Create a new bibliography file for a given list of bibtex strings.
Talk Utilities
- lynguine.util.talk.talk_field(field, filename, user_file=['_config.yml'])[source]
Return one field from a talk.
- lynguine.util.talk.extract_all(filename, user_file=['_config.yml'])[source]
List the different files the talk file creates.
- lynguine.util.talk.extract_inputs(filename, snippets_path='..')[source]
Extract input and include files from a talk
Fake Data Generation
- lynguine.util.fake.prefix(name)[source]
Checks if name contains a prefix. If so, returns the prefix and the name without the prefix. Otherwise returns None and the name.
- Returns:
A tuple containing the prefix and the name without the prefix.
- Return type:
tuple
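The return convention described above (prefix and remainder, or None and the unchanged name) can be illustrated as follows. The prefix list and the helper name split_prefix are hypothetical. The prefixes lynguine actually recognises are not stated here.

```python
# Hypothetical prefix list, for illustration only.
HYPOTHETICAL_PREFIXES = ("van der", "van", "von", "de", "del")

def split_prefix(name):
    """Return (prefix, rest) if name starts with a known prefix, else (None, name)."""
    lowered = name.lower()
    # Try longer prefixes first so that "van der" wins over "van".
    for candidate in sorted(HYPOTHETICAL_PREFIXES, key=len, reverse=True):
        if lowered.startswith(candidate + " "):
            return name[:len(candidate)], name[len(candidate) + 1:]
    return None, name

results = [split_prefix(n) for n in ("van der Waals", "Von Neumann", "Turing", "Delgado")]
print(results)
```

Requiring a space after the prefix keeps surnames such as Delgado from being split.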
- lynguine.util.fake.suffix(name)[source]
Returns a random suffix.
- Returns:
A random suffix.
- Return type:
- lynguine.util.fake.author_editor()[source]
Returns a random author or editor name.
- Returns:
A random author or editor.
- Return type:
- lynguine.util.fake.random_entry_type()[source]
Returns a random entry type.
- Returns:
A random entry type.
- Return type:
- lynguine.util.fake.bibliography_entry()[source]
Returns a random bibliography entry.
- Returns:
A random bibliography entry.
- Return type:
- lynguine.util.fake.to_bibtex_author(entry, translate_unicode=True, author_type='author')[source]
Convert a citeproc author/editor bibliography entry to bibtex format.
Citeproc separates authors into family, given, and prefix fields. BibTeX combines them into a single author field. This function converts the citeproc format to the BibTeX format using liquid syntax.
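The mapping from separate name parts to a single author field can be sketched in plain Python, in place of the liquid templating the function itself uses. The helper name bibtex_author_field is hypothetical, and the entry layout (a list of dictionaries with family, given, and prefix keys) is assumed to follow the usual citeproc convention.

```python
def bibtex_author_field(entry, author_type="author"):
    """Hypothetical stand-in: join citeproc-style names into one BibTeX author string."""
    names = []
    for person in entry.get(author_type, []):
        family = person.get("family", "")
        if person.get("prefix"):
            # BibTeX places the particle before the family name: "van Beethoven, Ludwig".
            family = f"{person['prefix']} {family}"
        given = person.get("given")
        names.append(f"{family}, {given}" if given else family)
    # BibTeX separates multiple authors with the word "and".
    return " and ".join(names)

entry = {"author": [
    {"family": "Beethoven", "given": "Ludwig", "prefix": "van"},
    {"family": "Lovelace", "given": "Ada"},
]}
author_field = bibtex_author_field(entry)
print(author_field)  # van Beethoven, Ludwig and Lovelace, Ada
```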
- lynguine.util.fake.to_bibtex(entry, translate_unicode=True)[source]
Convert a citeproc bibliography entry to bibtex format.
HTML Utilities
- lynguine.util.html.get_reference(key_name)[source]
Gets a reference from the web.
No longer implemented, as the web page no longer exists.
- Parameters:
key_name (string) – the key name of the reference
- Returns:
the reference
- Return type:
string