cardinal_pythonlib.rnc_text


Original code copyright (C) 2009-2022 Rudolf Cardinal (rudolf@pobox.com).

This file is part of cardinal_pythonlib.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


Low-quality functions relating to textfile results storage/analysis.

cardinal_pythonlib.rnc_text.csv_to_list_of_dicts(lines: List[str], csvheader: str, quotechar: str = '"') -> List[Dict[str, str]]

Extracts data from a list of CSV lines (starting with a defined header line) embedded in a longer text block but ending with a blank line.

Parameters:
  • lines – CSV lines

  • csvheader – CSV header line

  • quotechar – quotechar parameter passed to csv.reader()

Returns:

list of dictionaries mapping fieldnames (from the header) to values
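
The behaviour described above can be illustrated with a minimal standard-library sketch. This is not the library's implementation: the name sketch_csv_to_dicts is hypothetical, and matching the header by exact string equality and returning an empty list when no header is found are both assumptions.

```python
import csv
from typing import Dict, List


def sketch_csv_to_dicts(lines: List[str], csvheader: str,
                        quotechar: str = '"') -> List[Dict[str, str]]:
    """Hypothetical sketch: rows between a header line and a blank line."""
    try:
        start = lines.index(csvheader)  # assumption: exact header match
    except ValueError:
        return []  # assumption: no header means no data
    block = []
    for line in lines[start + 1:]:
        if not line.strip():  # a blank line terminates the CSV block
            break
        block.append(line)
    # Parse the header itself to obtain the field names.
    fieldnames = next(csv.reader([csvheader], quotechar=quotechar))
    return [dict(zip(fieldnames, row))
            for row in csv.reader(block, quotechar=quotechar)]


rows = sketch_csv_to_dicts(
    ["irrelevant line", "a,b", "1,2", "3,4", "", "other irrelevant line"],
    "a,b",
)
print(rows)  # [{'a': '1', 'b': '2'}, {'a': '3', 'b': '4'}]
```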

cardinal_pythonlib.rnc_text.csv_to_list_of_fields(lines: List[str], csvheader: str, quotechar: str = '"') -> List[List[str]]

Extracts data from a list of CSV lines (starting with a defined header line) embedded in a longer text block but ending with a blank line.

Used for processing e.g. MonkeyCantab rescue text output.

Parameters:
  • lines – CSV lines

  • csvheader – CSV header line

  • quotechar – quotechar parameter passed to csv.reader()

Returns:

list (by row) of lists (by value); see example

Test code:

import logging
from cardinal_pythonlib.rnc_text import *
logging.basicConfig(level=logging.DEBUG)

myheader = "field1,field2,field3"
mycsvlines = [
    "irrelevant line",
    myheader,  # header: START
    "row1value1,row1value2,row1value3",
    "row2value1,row2value2,row2value3",
    "",  # terminating blank line: END
    "other irrelevant line",
]
csv_to_list_of_fields(mycsvlines, myheader)
# [['row1value1', 'row1value2', 'row1value3'], ['row2value1', 'row2value2', 'row2value3']]

cardinal_pythonlib.rnc_text.dictlist_convert_to_bool(dict_list: Iterable[Dict], key: str) -> None

Process an iterable of dictionaries. For each dictionary d, convert (in place) d[key] to a bool. If that fails, convert it to None.

cardinal_pythonlib.rnc_text.dictlist_convert_to_datetime(dict_list: Iterable[Dict], key: str, datetime_format_string: str) -> None

Process an iterable of dictionaries. For each dictionary d, convert (in place) d[key] to a datetime.datetime form, using datetime_format_string as the format parameter to datetime.datetime.strptime().
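
A rough stand-alone sketch of this in-place conversion follows. The name sketch_convert_to_datetime is hypothetical, and because the text above does not say how unparseable values are handled, this sketch simply lets strptime() raise; the library may behave differently.

```python
import datetime
from typing import Dict, Iterable


def sketch_convert_to_datetime(dict_list: Iterable[Dict], key: str,
                               datetime_format_string: str) -> None:
    """Hypothetical sketch: parse d[key] in place with strptime()."""
    for d in dict_list:
        # Assumption: errors are not caught here; bad input raises ValueError.
        d[key] = datetime.datetime.strptime(d[key], datetime_format_string)


records = [{"when": "2020-01-31 13:45"}, {"when": "2021-12-01 00:00"}]
sketch_convert_to_datetime(records, "when", "%Y-%m-%d %H:%M")
print(records[0]["when"])  # 2020-01-31 13:45:00
```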

cardinal_pythonlib.rnc_text.dictlist_convert_to_float(dict_list: Iterable[Dict], key: str) -> None

Process an iterable of dictionaries. For each dictionary d, convert (in place) d[key] to a float. If that fails, convert it to None.

cardinal_pythonlib.rnc_text.dictlist_convert_to_int(dict_list: Iterable[Dict], key: str) -> None

Process an iterable of dictionaries. For each dictionary d, convert (in place) d[key] to an integer. If that fails, convert it to None.
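
The float and integer converters above share one pattern: convert in place, and fall back to None on failure. The sketch below shows that pattern with a hypothetical helper, sketch_convert_in_place. Treating TypeError and ValueError as "failure" is an assumption; the library may catch a different set of exceptions.

```python
from typing import Any, Callable, Dict, Iterable


def sketch_convert_in_place(dict_list: Iterable[Dict], key: str,
                            converter: Callable[[Any], Any]) -> None:
    """Hypothetical sketch: d[key] = converter(d[key]), or None on failure."""
    for d in dict_list:
        try:
            d[key] = converter(d[key])
        except (TypeError, ValueError):  # assumption: these count as failure
            d[key] = None


records = [{"n": "3"}, {"n": "oops"}, {"n": "4.5"}]
sketch_convert_in_place(records, "n", int)
print(records)  # [{'n': 3}, {'n': None}, {'n': None}]
```

Passing float instead of int would convert "4.5" to 4.5 rather than None.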

cardinal_pythonlib.rnc_text.dictlist_convert_to_string(dict_list: Iterable[Dict], key: str) -> None

Process an iterable of dictionaries. For each dictionary d, convert (in place) d[key] to a string form, str(d[key]). If the result is a blank string, convert it to None.
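
A minimal sketch of the string conversion, using the hypothetical name sketch_convert_to_string. Here "blank" is taken to mean the empty string only; whether whitespace-only strings also count is not stated above, so that is an assumption.

```python
from typing import Dict, Iterable


def sketch_convert_to_string(dict_list: Iterable[Dict], key: str) -> None:
    """Hypothetical sketch: d[key] = str(d[key]); empty results become None."""
    for d in dict_list:
        text = str(d[key])
        d[key] = text if text != "" else None  # assumption: only "" is blank


records = [{"v": 12}, {"v": ""}]
sketch_convert_to_string(records, "v")
print(records)  # [{'v': '12'}, {'v': None}]
```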

cardinal_pythonlib.rnc_text.dictlist_replace(dict_list: Iterable[Dict], key: str, value: Any) -> None

Process an iterable of dictionaries. For each dictionary d, change (in place) d[key] to value.

cardinal_pythonlib.rnc_text.dictlist_wipe_key(dict_list: Iterable[Dict], key: str) -> None

Process an iterable of dictionaries. For each dictionary d, delete d[key] if it exists.
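
The replace and wipe operations above amount to a plain assignment and a tolerant deletion. A sketch with hypothetical function names (not the library's code):

```python
from typing import Any, Dict, Iterable


def sketch_replace(dict_list: Iterable[Dict], key: str, value: Any) -> None:
    """Hypothetical sketch: set d[key] = value in every dictionary."""
    for d in dict_list:
        d[key] = value


def sketch_wipe_key(dict_list: Iterable[Dict], key: str) -> None:
    """Hypothetical sketch: delete d[key] where present, ignore otherwise."""
    for d in dict_list:
        d.pop(key, None)  # pop with a default never raises KeyError


records = [{"a": 1, "b": 2}, {"a": 3}]
sketch_replace(records, "a", 0)
sketch_wipe_key(records, "b")
print(records)  # [{'a': 0}, {'a': 0}]
```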

cardinal_pythonlib.rnc_text.find_line_beginning(strings: Sequence[str], linestart: str | None) -> int

Finds the index of the line in strings that begins with linestart, or -1 if none is found.

If linestart is None, match an empty line.

cardinal_pythonlib.rnc_text.find_line_containing(strings: Sequence[str], contents: str) -> int

Finds the index of the line in strings that contains contents, or -1 if none is found.
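
The two line-finding functions above can be sketched as simple linear scans. The names below are hypothetical, and two details are assumptions: that the first matching line wins, and that a whitespace-only line counts as empty when linestart is None.

```python
from typing import Optional, Sequence


def sketch_find_line_beginning(strings: Sequence[str],
                               linestart: Optional[str]) -> int:
    """Hypothetical sketch: index of the first line starting with linestart."""
    for i, line in enumerate(strings):
        if linestart is None:
            if not line.strip():  # assumption: whitespace-only is "empty"
                return i
        elif line.startswith(linestart):
            return i
    return -1


def sketch_find_line_containing(strings: Sequence[str], contents: str) -> int:
    """Hypothetical sketch: index of the first line containing contents."""
    for i, line in enumerate(strings):
        if contents in line:
            return i
    return -1


lines = ["alpha: 1", "", "beta: 2"]
print(sketch_find_line_beginning(lines, "beta"))   # 2
print(sketch_find_line_beginning(lines, None))     # 1
print(sketch_find_line_containing(lines, "a: 1"))  # 0
```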

cardinal_pythonlib.rnc_text.get_bool(strings: Sequence[str], prefix: str, ignoreleadingcolon: bool = False, precedingline: str = '') -> bool | None

Fetches a boolean parameter via get_string().

cardinal_pythonlib.rnc_text.get_bool_raw(s: str) -> bool | None

Maps "Y", "y" to True and "N", "n" to False.
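
As a sketch, this mapping can be written as a dictionary lookup. The name sketch_bool_raw is hypothetical, and returning None for any other input is an assumption suggested by the bool | None return type rather than stated above.

```python
from typing import Optional

_YES_NO = {"Y": True, "y": True, "N": False, "n": False}


def sketch_bool_raw(s: Optional[str]) -> Optional[bool]:
    """Hypothetical sketch: Y/y -> True, N/n -> False, otherwise None."""
    return _YES_NO.get(s)  # assumption: unrecognised input gives None


print(sketch_bool_raw("Y"), sketch_bool_raw("n"), sketch_bool_raw("maybe"))
# True False None
```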

cardinal_pythonlib.rnc_text.get_bool_relative(strings: Sequence[str], prefix1: str, delta: int, prefix2: str, ignoreleadingcolon: bool = False) -> bool | None

Fetches a boolean parameter via get_string_relative().

cardinal_pythonlib.rnc_text.get_datetime(strings: Sequence[str], prefix: str, datetime_format_string: str, ignoreleadingcolon: bool = False, precedingline: str = '') -> datetime | None

Fetches a datetime.datetime parameter via get_string().

cardinal_pythonlib.rnc_text.get_float(strings: Sequence[str], prefix: str, ignoreleadingcolon: bool = False, precedingline: str = '') -> float | None

Fetches a float parameter via get_string().

cardinal_pythonlib.rnc_text.get_float_raw(s: str) -> float | None

Converts its input to a float.

Parameters:

s – string

Returns:

float(s), or None if s is None

Raises:

ValueError – if it’s a bad string

cardinal_pythonlib.rnc_text.get_float_relative(strings: Sequence[str], prefix1: str, delta: int, prefix2: str, ignoreleadingcolon: bool = False) -> float | None

Fetches a float parameter via get_string_relative().

cardinal_pythonlib.rnc_text.get_int(strings: Sequence[str], prefix: str, ignoreleadingcolon: bool = False, precedingline: str = '') -> int | None

Fetches an integer parameter via get_string().

cardinal_pythonlib.rnc_text.get_int_raw(s: str) -> int | None

Converts its input to an int.

Parameters:

s – string

Returns:

int(s), or None if s is None

Raises:

ValueError – if it’s a bad string
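
The contract above (None passes through, a bad string raises ValueError) can be sketched in a few lines. The name sketch_int_raw is hypothetical; the same shape with float() in place of int() would match the description of get_float_raw().

```python
from typing import Optional


def sketch_int_raw(s: Optional[str]) -> Optional[int]:
    """Hypothetical sketch: int(s), or None if s is None."""
    if s is None:
        return None
    return int(s)  # raises ValueError for a string such as "4x"


print(sketch_int_raw("42"))  # 42
print(sketch_int_raw(None))  # None
```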

cardinal_pythonlib.rnc_text.get_int_relative(strings: Sequence[str], prefix1: str, delta: int, prefix2: str, ignoreleadingcolon: bool = False) -> int | None

Fetches an int parameter via get_string_relative().

cardinal_pythonlib.rnc_text.get_lines_from_to(strings: List[str], firstlinestart: str, list_of_lastline_starts: Iterable[str | None]) -> List[str]

Takes a list of strings. Returns a list of strings FROM firstlinestart (inclusive) TO the first of list_of_lastline_starts (exclusive).

To search to the end of the list, use list_of_lastline_starts = [].

To search to a blank line, use list_of_lastline_starts = [None]
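
One way to read the description above is sketched here with a hypothetical function, sketch_lines_from_to. Several points are assumptions: the slice ends at the earliest later line matching any of the end markers, a whitespace-only line counts as blank, and an empty list is returned when the first line is not found.

```python
from typing import Iterable, List, Optional


def _line_matches(line: str, start: Optional[str]) -> bool:
    """None matches a blank line; otherwise test the line's prefix."""
    if start is None:
        return not line.strip()
    return line.startswith(start)


def sketch_lines_from_to(strings: List[str], firstlinestart: str,
                         list_of_lastline_starts: Iterable[Optional[str]]
                         ) -> List[str]:
    """Hypothetical sketch: lines from a start marker up to an end marker."""
    markers = list(list_of_lastline_starts)
    first = next((i for i, s in enumerate(strings)
                  if s.startswith(firstlinestart)), None)
    if first is None:
        return []  # assumption: nothing found means nothing returned
    for j in range(first + 1, len(strings)):
        if any(_line_matches(strings[j], m) for m in markers):
            return strings[first:j]  # end marker is exclusive
    return strings[first:]  # no end marker hit: run to the end of the list


text = ["junk", "Summary", "a=1", "b=2", "", "Details", "c=3"]
print(sketch_lines_from_to(text, "Summary", [None]))
# ['Summary', 'a=1', 'b=2']
```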

cardinal_pythonlib.rnc_text.get_string(strings: Sequence[str], prefix: str, ignoreleadingcolon: bool = False, precedingline: str = '') -> str | None

Find a string as per get_what_follows().

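Parameters:
  • strings – see get_what_follows()

  • prefix – see get_what_follows()

  • ignoreleadingcolon – restrict the result to the part after its first colon?

  • precedingline – see get_what_follows()

Returns: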

the line fragment

cardinal_pythonlib.rnc_text.get_string_relative(strings: Sequence[str], prefix1: str, delta: int, prefix2: str, ignoreleadingcolon: bool = False, stripwhitespace: bool = True) -> str | None

Finds a line (string) in strings beginning with prefix1. Moves delta lines (strings) further. Returns the end of the line that begins with prefix2, if found.

Parameters:
  • strings – as above

  • prefix1 – as above

  • delta – as above

  • prefix2 – as above

  • ignoreleadingcolon – restrict the result to the part after its first colon?

  • stripwhitespace – strip whitespace from the start/end of the result?

Returns:

the line fragment
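
The relative lookup described above can be sketched as follows. The name sketch_string_relative is hypothetical, and the sketch makes several assumptions: only the first line matching prefix1 is considered, None is returned when the target line is missing or does not begin with prefix2, and a result without a colon is left unchanged when ignoreleadingcolon is set.

```python
from typing import Optional, Sequence


def sketch_string_relative(strings: Sequence[str], prefix1: str, delta: int,
                           prefix2: str, ignoreleadingcolon: bool = False,
                           stripwhitespace: bool = True) -> Optional[str]:
    """Hypothetical sketch: find prefix1, step delta lines, read after prefix2."""
    for i, line in enumerate(strings):
        if not line.startswith(prefix1):
            continue
        j = i + delta
        if 0 <= j < len(strings) and strings[j].startswith(prefix2):
            rest = strings[j][len(prefix2):]
            if ignoreleadingcolon and ":" in rest:
                rest = rest.split(":", 1)[1]  # keep what follows the colon
            return rest.strip() if stripwhitespace else rest
        return None  # assumption: only the first prefix1 match is tried
    return None


lines = ["Subject", "ignored", "Name: Fred", "Age: 7"]
print(sketch_string_relative(lines, "Subject", 2, "Name",
                             ignoreleadingcolon=True))  # Fred
```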

cardinal_pythonlib.rnc_text.get_what_follows(strings: Sequence[str], prefix: str, onlyatstart: bool = True, stripwhitespace: bool = True, precedingline: str = '') -> str

Find a string in strings that begins with prefix; return the part that’s after prefix. Optionally, require that the preceding string (line) is precedingline.

Parameters:
  • strings – strings to analyse

  • prefix – prefix to find

  • onlyatstart – only accept the prefix if it is right at the start of a line

  • stripwhitespace – remove whitespace from the result

  • precedingline – if truthy, require that the preceding line be as specified here

Returns:

the line fragment
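
A stand-alone sketch of this search over a list of lines, under a hypothetical name. Two behaviours are assumptions because the text above does not state them: the preceding line is compared by exact equality, and an empty string is returned when nothing matches (chosen to fit the str return type).

```python
from typing import Sequence


def sketch_what_follows(strings: Sequence[str], prefix: str,
                        onlyatstart: bool = True,
                        stripwhitespace: bool = True,
                        precedingline: str = "") -> str:
    """Hypothetical sketch: text after prefix in the first qualifying line."""
    for i, line in enumerate(strings):
        pos = line.find(prefix)
        if pos == -1 or (onlyatstart and pos != 0):
            continue
        # If a preceding line is required, the line before must equal it.
        if precedingline and (i == 0 or strings[i - 1] != precedingline):
            continue
        rest = line[pos + len(prefix):]
        return rest.strip() if stripwhitespace else rest
    return ""  # assumption: empty string when no line qualifies


lines = ["Block A", "Value: 1", "Block B", "Value: 2"]
print(sketch_what_follows(lines, "Value:"))                           # 1
print(sketch_what_follows(lines, "Value:", precedingline="Block B"))  # 2
```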

cardinal_pythonlib.rnc_text.get_what_follows_raw(s: str, prefix: str, onlyatstart: bool = True, stripwhitespace: bool = True) -> Tuple[bool, str]

Find the part of s that is after prefix.

Parameters:
  • s – string to analyse

  • prefix – prefix to find

  • onlyatstart – only accept the prefix if it is right at the start of s

  • stripwhitespace – remove whitespace from the result

Returns:

(found, result)

Return type:

tuple
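
For a single string, the (found, result) contract above might look like the sketch below. The name sketch_what_follows_raw is hypothetical, and returning an empty string as the result when the prefix is not found is an assumption.

```python
from typing import Tuple


def sketch_what_follows_raw(s: str, prefix: str, onlyatstart: bool = True,
                            stripwhitespace: bool = True) -> Tuple[bool, str]:
    """Hypothetical sketch: return (found, text after prefix)."""
    pos = s.find(prefix)
    if pos == -1 or (onlyatstart and pos != 0):
        return False, ""  # assumption: empty result when not found
    result = s[pos + len(prefix):]
    if stripwhitespace:
        result = result.strip()
    return True, result


print(sketch_what_follows_raw("Rate: 5 Hz", "Rate:"))  # (True, '5 Hz')
print(sketch_what_follows_raw("x Rate: 5", "Rate:"))   # (False, '')
print(sketch_what_follows_raw("x Rate: 5", "Rate:", onlyatstart=False))
# (True, '5')
```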

cardinal_pythonlib.rnc_text.is_empty_string(s: str) -> bool

Is the string empty (ignoring whitespace)?

cardinal_pythonlib.rnc_text.output_csv(filehandle: TextIO, values: Iterable[str]) -> None

Write a line of CSV. POOR; does not escape things properly. DEPRECATED.

Parameters:
  • filehandle – file to write to

  • values – values
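
Since this function is documented as deprecated because it does not escape values, the standard library's csv.writer is the usual way to write a correctly quoted line. The short sketch below uses an in-memory buffer in place of a real file; the sample values are made up for illustration.

```python
import csv
import io

buffer = io.StringIO()  # stands in for an open text file
writer = csv.writer(buffer)  # default dialect quotes fields when needed
writer.writerow(["plain", "has,comma", 'has "quote"'])
line = buffer.getvalue()
print(line, end="")  # plain,"has,comma","has ""quote"""
```

When writing to a real file, the csv module documentation recommends opening it with newline="" so that line endings are not translated twice.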

cardinal_pythonlib.rnc_text.produce_csv_output(filehandle: TextIO, fields: Sequence[str], values: Iterable[str]) -> None

Produce CSV output, without using csv.writer, so the log can be used for lots of things.

  • … eh? What was I talking about?

  • POOR; DEPRECATED.

Parameters:
  • filehandle – file to write to

  • fields – field names

  • values – values