cardinal_pythonlib.chebi


Original code copyright (C) 2009-2022 Rudolf Cardinal (rudolf@pobox.com).

This file is part of cardinal_pythonlib.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


Functions to assist with the ChEBI database.

ChEBI is the Chemical Entities of Biological Interest database from EMBL-EBI (European Molecular Biology Laboratory / European Bioinformatics Institute).

See https://www.ebi.ac.uk/chebi/

Examples:

cardinalpythonlib_chebi test

cardinalpythonlib_chebi search citalopram
cardinalpythonlib_chebi search citalopram --exact_search
cardinalpythonlib_chebi search zopiclone
cardinalpythonlib_chebi search zopiclone --exact_search
cardinalpythonlib_chebi search zopiclone --exact_match
cardinalpythonlib_chebi search salicylic --inexact_search

cardinalpythonlib_chebi describe citalopram simvastatin --exact_match

cardinalpythonlib_chebi ancestors citalopram simvastatin

Then try this syntax:

cardinalpythonlib_chebi categorize \
    --entities entities.txt \
    --entity_synonyms entity_synonyms.txt \
    --categories categories.txt \
    --category_synonyms category_synonyms.txt \
    --manual_categories manual_categories.txt \
    --results results.csv

using files like these:

# entities.txt
# Things to classify.

agomelatine
aspirin
citalopram
simvastatin

# entity_synonyms.txt
# Renaming of entities prior to lookup.
# Find these via "cardinalpythonlib_chebi search ..." or Google with "CHEBI".

aspirin, acetylsalicylic acid

# categories.txt
# Categories to detect, in order of priority (high to low).

serotonin reuptake inhibitor
antidepressant

antilipemic drug

non-steroidal anti-inflammatory drug

# category_synonyms.txt
# Categories that are equivalent but ChEBI doesn't know.

glucagon-like peptide-1 receptor agonist, hypoglycemic agent

# manual_categories.txt
# Categorizations that ChEBI doesn't know.

agomelatine, antidepressant

class cardinal_pythonlib.chebi.CaseInsensitiveDict

Case-insensitive dictionary for strings; see https://stackoverflow.com/questions/2082152/case-insensitive-dictionary
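
As an illustration of the idea only (not the library's implementation), a case-insensitive string dictionary can be sketched by lower-casing keys on every access. The class name LowercaseKeyDict is hypothetical and appears nowhere in cardinal_pythonlib.

```python
class LowercaseKeyDict(dict):
    """Hypothetical stand-in: a dict that lower-cases string keys on access."""

    @staticmethod
    def _k(key):
        # Only strings are normalized; other key types pass through unchanged.
        return key.lower() if isinstance(key, str) else key

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(self._k(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._k(key))

    def __contains__(self, key):
        return super().__contains__(self._k(key))

    def get(self, key, default=None):
        return super().get(self._k(key), default)

    def update(self, *args, **kwargs):
        # Route every item through __setitem__ so that keys are normalized.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = LowercaseKeyDict({"Aspirin": "acetylsalicylic acid"})
print(d["ASPIRIN"], "aspirin" in d)
```

Keys are stored in lower case in this sketch, so the original capitalization of a key is not preserved.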

class cardinal_pythonlib.chebi.HashableChebiEntity(chebi_id)

Hashable version of libchebipy.ChebiEntity.
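
A hashable wrapper is useful when entities must go into a set (for example, a set of already-visited ancestors). The sketch below shows the general pattern of hashing and comparing by ID. PlainEntity and HashableEntity are hypothetical stand-ins; whether libchebipy.ChebiEntity behaves like PlainEntity, and whether the library hashes by ID, are assumptions.

```python
class PlainEntity:
    """Hypothetical stand-in for an entity class that compares by identity."""

    def __init__(self, chebi_id: str) -> None:
        self._id = chebi_id

    def get_id(self) -> str:
        return self._id


class HashableEntity(PlainEntity):
    """Hypothetical subclass that compares and hashes by ID."""

    def __eq__(self, other: object) -> bool:
        return isinstance(other, PlainEntity) and self.get_id() == other.get_id()

    def __hash__(self) -> int:
        return hash(self.get_id())


# Two plain objects with the same ID remain distinct set members ...
plain = {PlainEntity("CHEBI:15903"), PlainEntity("CHEBI:15903")}
# ... whereas the ID-hashed versions collapse to a single member.
hashed = {HashableEntity("CHEBI:15903"), HashableEntity("CHEBI:15903")}
print(len(plain), len(hashed))
```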

cardinal_pythonlib.chebi.brief_description(entity: ChebiEntity) → str

Parameters:

entity – a ChebiEntity

Returns:

name and ID

Return type:

str

cardinal_pythonlib.chebi.categorize_from_file(entity_filename: str, category_filename: str, results_filename: str, entity_synonyms_filename: str | None = None, category_synonyms_filename: str | None = None, manual_categories_filename: str | None = None, relationships: List[str] | None = None, output_dialect: str = 'excel', headers: bool = True) → None

Categorizes entities.

Parameters:
  • entity_filename – input filename, one entity per line

  • category_filename – filename containing permissible categories, one per line (earlier preferable to later)

  • results_filename – output filename for CSV results

  • entity_synonyms_filename – Name of CSV file (with optional # comments) containing synonyms in the format entity_from, entity_to.

  • category_synonyms_filename – Name of CSV file (with optional # comments) containing synonyms in the format category_from, category_to.

  • manual_categories_filename – Name of CSV file (with optional # comments) containing manual categorizations in the format entity, category.

  • relationships – list of valid relationship types defining ancestry, e.g. “has_role”

  • output_dialect – CSV output dialect

  • headers – add CSV headers?

cardinal_pythonlib.chebi.describe_entity(entity: ChebiEntity) → None

Test function to describe a ChEBI entity.

Parameters:

entity – a ChebiEntity

cardinal_pythonlib.chebi.gen_ancestor_info(entity: ChebiEntity, relationships: List[str] | None = None, max_generations: int | None = None, starting_generation_: int = 0, seen_: Set[HashableChebiEntity] | None = None) → Generator[Tuple[HashableChebiEntity, str, int], None, None]

Retrieves all ancestors (“outgoing” links).

Parameters:
  • entity – starting entity

  • relationships – list of valid relationship types, e.g. “has_role”

  • max_generations – maximum number of generations to pursue, or None for unlimited

  • starting_generation_ – for internal use only, for recursion

  • seen_ – for internal use only, for recursion

Returns:

a generator of tuples (entity, relationship, n_generations_above_start)

Return type:

Generator
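
The recursion described by the parameters above (a relationship filter, a generation limit, and a shared "seen" set) can be sketched on a toy graph. Everything here is hypothetical: TOY_OUTGOING is invented data rather than real ChEBI content, toy_ancestor_info is not the library function, and treating relationships=None as "accept any relationship" is an assumption of this sketch.

```python
from typing import Dict, Generator, List, Optional, Set, Tuple

# Hypothetical toy graph: entity name -> list of (relationship, parent name).
TOY_OUTGOING: Dict[str, List[Tuple[str, str]]] = {
    "citalopram": [
        ("has_role", "serotonin reuptake inhibitor"),
        ("is_a", "nitrile"),
    ],
    "serotonin reuptake inhibitor": [("is_a", "antidepressant")],
    "antidepressant": [("is_a", "psychotropic drug")],
    "nitrile": [],
    "psychotropic drug": [],
}


def toy_ancestor_info(
    entity: str,
    relationships: Optional[List[str]] = None,
    max_generations: Optional[int] = None,
    starting_generation_: int = 0,
    seen_: Optional[Set[str]] = None,
) -> Generator[Tuple[str, str, int], None, None]:
    """Yield (ancestor, relationship, n_generations_above_start) tuples."""
    if max_generations is not None and starting_generation_ >= max_generations:
        return  # generation limit reached
    seen_ = seen_ if seen_ is not None else set()
    generation = starting_generation_ + 1
    for relationship, parent in TOY_OUTGOING.get(entity, []):
        if relationships is not None and relationship not in relationships:
            continue  # relationship type not wanted
        if parent in seen_:
            continue  # already reported via another path
        seen_.add(parent)
        yield parent, relationship, generation
        # Depth-first: report this parent's own ancestors before its siblings.
        yield from toy_ancestor_info(
            parent, relationships, max_generations, generation, seen_
        )


for ancestor, relationship, n in toy_ancestor_info("citalopram"):
    print(n, relationship, ancestor)
```

Sharing one seen_ set across the recursive calls means each ancestor is reported once, even when it is reachable by several paths.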

cardinal_pythonlib.chebi.gen_ancestors(entity: ChebiEntity, relationships: List[str] | None = None, max_generations: int | None = None) → Generator[HashableChebiEntity, None, None]

Generates ancestors as per gen_ancestor_info(), without relationship or generation info.

cardinal_pythonlib.chebi.get_category(entity_name: str, categories: Sequence[str], entity_synonyms: CaseInsensitiveDict | None = None, category_synonyms: CaseInsensitiveDict | None = None, manual_categories: CaseInsensitiveDict | None = None, relationships: List[str] | None = None) → str | None

Parameters:
  • entity_name – name of entity to categorize

  • categories – permissible categories (earlier preferable to later)

  • entity_synonyms – map to rename entities

  • category_synonyms – mapping of categories to other (preferred) categories

  • manual_categories – manual overrides mapping entity to category

  • relationships – list of valid relationship types defining ancestry, e.g. “has_role”

Returns:

chosen category, or None if none found
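
A minimal sketch of priority-ordered categorization, under stated assumptions: choose_category and ancestors_by_entity are hypothetical (the latter stands in for a ChEBI ancestor lookup), all dictionary keys are assumed to be lower case, and the order in which manual overrides and entity synonyms are applied is a guess rather than documented behaviour.

```python
from typing import Dict, Optional, Sequence, Set


def choose_category(
    entity_name: str,
    categories: Sequence[str],
    ancestors_by_entity: Dict[str, Set[str]],
    entity_synonyms: Optional[Dict[str, str]] = None,
    category_synonyms: Optional[Dict[str, str]] = None,
    manual_categories: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Return the first permissible category matching the entity, or None."""
    entity_synonyms = entity_synonyms or {}
    category_synonyms = category_synonyms or {}
    manual_categories = manual_categories or {}

    name = entity_name.lower()
    if name in manual_categories:
        return manual_categories[name]  # manual override wins (assumption)
    name = entity_synonyms.get(name, name)  # rename before lookup

    # Map each ancestor name to its preferred category name, if one is given.
    ancestors = {
        category_synonyms.get(a.lower(), a)
        for a in ancestors_by_entity.get(name, set())
    }
    for category in categories:  # earlier categories take priority
        if category in ancestors:
            return category
    return None


CATEGORIES = [
    "serotonin reuptake inhibitor",
    "antidepressant",
    "non-steroidal anti-inflammatory drug",
    "hypoglycemic agent",
]
# Invented ancestor data for illustration; not retrieved from ChEBI.
ANCESTORS = {
    "citalopram": {"antidepressant", "serotonin reuptake inhibitor"},
    "acetylsalicylic acid": {"non-steroidal anti-inflammatory drug"},
    "example_glp1_drug": {"glucagon-like peptide-1 receptor agonist"},
}

print(choose_category("citalopram", CATEGORIES, ANCESTORS))
```

Because the loop runs over categories rather than over ancestors, the earliest category in the list wins when an entity matches several.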

cardinal_pythonlib.chebi.get_chebi_id_number(entity: ChebiEntity) → int

Returns the CHEBI ID number as an integer.

Parameters:

entity – a libchebipy.ChebiEntity

cardinal_pythonlib.chebi.get_chebi_id_number_str(entity: ChebiEntity) → str

Returns the CHEBI ID number as a string.

Parameters:

entity – a libchebipy.ChebiEntity

cardinal_pythonlib.chebi.get_entity(chebi_id: int | str) → ChebiEntity

Fetch a ChEBI entity by its ID.

Parameters:

chebi_id – integer ChEBI ID like 15903, or string ID like '15903', or string ID like 'CHEBI:15903'.
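
The three accepted ID forms can be reduced to one canonical string before lookup. The helper below is a hypothetical sketch of that normalization (normalize_chebi_id is not a library function) and does not contact ChEBI.

```python
from typing import Union


def normalize_chebi_id(chebi_id: Union[int, str]) -> str:
    """Hypothetical helper: return an ID in the form 'CHEBI:<number>'."""
    text = str(chebi_id).strip()
    prefix = "CHEBI:"
    if text.upper().startswith(prefix):
        text = text[len(prefix):]
    if not text.isdigit():
        raise ValueError(f"Not a recognizable ChEBI ID: {chebi_id!r}")
    return prefix + text


for raw in (15903, "15903", "CHEBI:15903"):
    print(repr(raw), "->", normalize_chebi_id(raw))
```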

cardinal_pythonlib.chebi.main() → None

Command-line entry point.

cardinal_pythonlib.chebi.read_dict(filename: str) → CaseInsensitiveDict

Reads a file that may contain comments but is otherwise in the format:

a1, b1
a2, b2
...

Parameters:

filename – filename to read

Returns:

a dictionary mapping the first column (converted to lower case) to the second (case left intact).

Return type:

CaseInsensitiveDict
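
The file format above can be parsed with the standard csv module once comment and blank lines are dropped. This is a minimal sketch rather than the library's code: read_pairs is a hypothetical name, it takes an iterable of lines instead of a filename, and it returns a plain dict with lower-cased keys.

```python
import csv
from typing import Dict, Iterable


def read_pairs(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'a, b' rows, skipping blank lines and lines starting with '#'."""
    data_lines = [
        line for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    ]
    mapping: Dict[str, str] = {}
    # skipinitialspace drops the space that follows each comma.
    for row in csv.reader(data_lines, skipinitialspace=True):
        if len(row) != 2:
            raise ValueError(f"Expected two columns, got: {row!r}")
        key, value = row
        mapping[key.strip().lower()] = value.strip()  # value keeps its case
    return mapping


SAMPLE = """\
# entity_synonyms.txt
# Renaming of entities prior to lookup.

Aspirin, acetylsalicylic acid
"""
print(read_pairs(SAMPLE.splitlines()))
```

To read from disk, an open text file can be passed directly, since a file object iterates over its lines.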

cardinal_pythonlib.chebi.report_ancestors(entity: ChebiEntity, relationships: List[str] | None = None, max_generations: int | None = None) → None

Fetches and reports on ancestors of an entity, e.g. via “is_a” relationships. See gen_ancestor_info().

cardinal_pythonlib.chebi.report_ancestors_multiple(entity_names: List[str], relationships: List[str] | None = None, max_generations: int | None = None) → None

Looks up each named entity, then fetches and reports on its ancestors, e.g. via “is_a” relationships. See gen_ancestor_info().

cardinal_pythonlib.chebi.search_and_describe(search_term: int | str, exact_search: bool = False, exact_match: bool = False) → None

Search for a ChEBI term and describe it to the log.

Parameters:
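  • search_term – string or integer to search for

  • exact_search – exact search? As for search_entities().

  • exact_match – require the result's name to match the search term exactly? As for search_entities().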

cardinal_pythonlib.chebi.search_and_describe_multiple(search_terms: List[int | str], exact_search: bool = False, exact_match: bool = False) → None

Search for ChEBI terms; describe matching entries to the log.

Parameters:
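  • search_terms – strings or integers to search for

  • exact_search – exact search? As for search_entities().

  • exact_match – require each result's name to match its search term exactly? As for search_entities().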

cardinal_pythonlib.chebi.search_and_list(search_term: int | str, exact_search: bool = False, exact_match: bool = False) → None

Search for a ChEBI term; print matching entries to the log.

Parameters:
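  • search_term – string or integer to search for

  • exact_search – exact search? As for search_entities().

  • exact_match – require the result's name to match the search term exactly? As for search_entities().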

cardinal_pythonlib.chebi.search_and_list_multiple(search_terms: List[int | str], exact_search: bool = False, exact_match: bool = False) → None

Search for ChEBI terms; print matching entries to the log.

Parameters:
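  • search_terms – strings or integers to search for

  • exact_search – exact search? As for search_entities().

  • exact_match – require each result's name to match its search term exactly? As for search_entities().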

cardinal_pythonlib.chebi.search_entities(search_term: int | str, exact_search: bool = False, exact_match: bool = False) → List[ChebiEntity]

Search for ChEBI entities.

Case-insensitive.

Parameters:
  • search_term – String or integer to search for.

  • exact_search – The exact parameter to libchebipy.search().

  • exact_match – Ensure that the name of the result exactly matches the search term. Example: an exact search for “zopiclone” gives both “zopiclone (CHEBI:32315)” and “(5R)-zopiclone (CHEBI:53762)”; this option filters to the first.
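
The exact_match filtering step can be illustrated without contacting ChEBI. In this sketch, Hit and keep_exact_name_matches are hypothetical names, the two hits are the zopiclone results quoted above, and the comparison is assumed to be case-insensitive in line with the search itself.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical stand-in for a search result."""
    name: str
    chebi_id: str


def keep_exact_name_matches(search_term: str, hits: List[Hit]) -> List[Hit]:
    """Keep only hits whose name equals the search term, ignoring case."""
    wanted = search_term.strip().lower()
    return [hit for hit in hits if hit.name.strip().lower() == wanted]


hits = [
    Hit("zopiclone", "CHEBI:32315"),
    Hit("(5R)-zopiclone", "CHEBI:53762"),
]
for hit in keep_exact_name_matches("Zopiclone", hits):
    print(f"{hit.name} ({hit.chebi_id})")
```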

cardinal_pythonlib.chebi.testfunc1() → None

Test ChEBI interface.

cardinal_pythonlib.chebi.translate(term: str, mapping: CaseInsensitiveDict) → Tuple[str, bool]

Translates a term through a dictionary. If the term (once converted to lower case) is in the dictionary (see read_dict()), the mapped term is returned; otherwise the original search term is returned.

Parameters:
  • term – term to look up

  • mapping – the mapping dictionary

Returns:

the resulting term (str) and whether it was renamed (bool)

Return type:

tuple
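
A minimal sketch of the lookup described above, using a plain dict with lower-case keys in place of CaseInsensitiveDict. The name translate_term is hypothetical, and setting the flag to True whenever the term is found in the mapping is an assumption about what "renamed" means.

```python
from typing import Dict, Tuple


def translate_term(term: str, mapping: Dict[str, str]) -> Tuple[str, bool]:
    """Return (mapped term, True) if the term is in the mapping,
    else (original term, False)."""
    key = term.lower()
    if key in mapping:
        return mapping[key], True
    return term, False


synonyms = {"aspirin": "acetylsalicylic acid"}
print(translate_term("Aspirin", synonyms))
print(translate_term("Citalopram", synonyms))
```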