cardinal_pythonlib.chebi
Original code copyright (C) 2009-2022 Rudolf Cardinal (rudolf@pobox.com).
This file is part of cardinal_pythonlib.
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Functions to assist with the ChEBI database.
ChEBI is the Chemical Entities of Biological Interest database from EMBL-EBI (European Molecular Biology Laboratory / European Bioinformatics Institute).
See https://www.ebi.ac.uk/chebi/
Examples:
cardinalpythonlib_chebi test
cardinalpythonlib_chebi search citalopram
cardinalpythonlib_chebi search citalopram --exact_search
cardinalpythonlib_chebi search zopiclone
cardinalpythonlib_chebi search zopiclone --exact_search
cardinalpythonlib_chebi search zopiclone --exact_match
cardinalpythonlib_chebi search salicylic --inexact_search
cardinalpythonlib_chebi describe citalopram simvastatin --exact_match
cardinalpythonlib_chebi ancestors citalopram simvastatin
Then try this syntax:
cardinalpythonlib_chebi categorize --entities entities.txt --entity_synonyms entity_synonyms.txt --categories categories.txt --category_synonyms category_synonyms.txt --manual_categories manual_categories.txt --results results.csv
using files like these:
# entities.txt
# Things to classify.
agomelatine
aspirin
citalopram
simvastatin
# entity_synonyms.txt
# Renaming of entities prior to lookup.
# Find these via "cardinalpythonlib_chebi search ..." or Google with "CHEBI".
aspirin, acetylsalicylic acid
# categories.txt
# Categories to detect, in order of priority (high to low).
serotonin reuptake inhibitor
antidepressant
antilipemic drug
non-steroidal anti-inflammatory drug
# category_synonyms.txt
# Categories that are equivalent but ChEBI doesn't know.
glucagon-like peptide-1 receptor agonist, hypoglycemic agent
# manual_categories.txt
# Categorizations that ChEBI doesn't know.
agomelatine, antidepressant
- class cardinal_pythonlib.chebi.CaseInsensitiveDict
Case-insensitive dictionary for strings; see https://stackoverflow.com/questions/2082152/case-insensitive-dictionary
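The idea behind a case-insensitive dictionary can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name LowercaseKeyDict is hypothetical, and it only folds keys for the operations shown.

```python
class LowercaseKeyDict(dict):
    """Hypothetical sketch: a dict that lower-cases string keys.

    Only item assignment, item lookup, ``in`` and ``get`` are covered;
    the constructor and ``update`` are deliberately left untouched.
    """

    @staticmethod
    def _fold(key):
        # Non-string keys are passed through unchanged.
        return key.lower() if isinstance(key, str) else key

    def __setitem__(self, key, value):
        super().__setitem__(self._fold(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._fold(key))

    def __contains__(self, key):
        return super().__contains__(self._fold(key))

    def get(self, key, default=None):
        return super().get(self._fold(key), default)


synonyms = LowercaseKeyDict()
synonyms["Aspirin"] = "acetylsalicylic acid"
print(synonyms["ASPIRIN"])
```

Keys are stored in lower case, so lookups succeed regardless of the case used by the caller.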
- class cardinal_pythonlib.chebi.HashableChebiEntity(chebi_id)
Hashable version of libchebipy.ChebiEntity.
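Hashability matters because ancestor traversal needs to keep entities in sets. The sketch below shows one plausible way to make an entity hashable, by comparing and hashing on its ID. It is an assumption about the approach, not the library's code: StandInEntity and HashableStandInEntity are hypothetical classes used so that the example runs without libchebipy or network access.

```python
class StandInEntity:
    """Hypothetical stand-in for an entity object that exposes its ID."""

    def __init__(self, chebi_id: str) -> None:
        self._chebi_id = chebi_id

    def get_id(self) -> str:
        return self._chebi_id


class HashableStandInEntity(StandInEntity):
    """Adds equality and hashing based on the ID (assumed approach)."""

    def __eq__(self, other: object) -> bool:
        return (
            isinstance(other, StandInEntity)
            and self.get_id() == other.get_id()
        )

    def __hash__(self) -> int:
        return hash(self.get_id())


first = HashableStandInEntity("CHEBI:15903")
duplicate = HashableStandInEntity("CHEBI:15903")
different = HashableStandInEntity("CHEBI:32315")

# Two objects with the same ID collapse to one set member.
unique_entities = {first, duplicate, different}
print(len(unique_entities))
```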
- cardinal_pythonlib.chebi.brief_description(entity: ChebiEntity) → str
- Parameters:
entity – a ChebiEntity
- Returns:
name and ID
- Return type:
str
- cardinal_pythonlib.chebi.categorize_from_file(entity_filename: str, category_filename: str, results_filename: str, entity_synonyms_filename: str | None = None, category_synonyms_filename: str | None = None, manual_categories_filename: str | None = None, relationships: List[str] | None = None, output_dialect: str = 'excel', headers: bool = True) → None
Categorizes entities.
- Parameters:
entity_filename – input filename, one entity per line
category_filename – filename containing permissible categories, one per line (earlier preferable to later)
results_filename – output filename for CSV results
entity_synonyms_filename – name of CSV file (with optional # comments) containing synonyms in the format entity_from, entity_to
category_synonyms_filename – name of CSV file (with optional # comments) containing synonyms in the format category_from, category_to
manual_categories_filename – name of CSV file (with optional # comments) containing manual categorizations in the format entity, category
relationships – list of valid relationship types defining ancestry, e.g. “has_role”
output_dialect – CSV output dialect
headers – add CSV headers?
- cardinal_pythonlib.chebi.describe_entity(entity: ChebiEntity) → None
Test function to describe a ChEBI entity.
- Parameters:
entity – a ChebiEntity
- cardinal_pythonlib.chebi.gen_ancestor_info(entity: ChebiEntity, relationships: List[str] | None = None, max_generations: int | None = None, starting_generation_: int = 0, seen_: Set[HashableChebiEntity] | None = None) → Generator[Tuple[HashableChebiEntity, str, int], None, None]
Retrieves all ancestors (“outgoing” links).
- Parameters:
- Returns:
tuples of entity, relationship, n_generations_above_start
- Return type:
generator
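The traversal described here (follow outgoing links, filter by relationship type, stop at a generation limit, skip entities already seen) can be sketched on a toy graph. Everything in this example is hypothetical: walk_ancestors, TOY_LINKS and the node names are invented for illustration, and counting direct parents as generation 1 is an assumption rather than documented behaviour.

```python
from typing import Dict, Generator, List, Optional, Set, Tuple

# Invented toy ontology: child -> list of (relationship, parent).
TOY_LINKS: Dict[str, List[Tuple[str, str]]] = {
    "drug_x": [("has_role", "role_a"), ("is_a", "class_b")],
    "role_a": [("is_a", "role_root")],
    "class_b": [("is_a", "class_root"), ("has_role", "role_a")],
}


def walk_ancestors(
    name: str,
    links: Dict[str, List[Tuple[str, str]]],
    relationships: Optional[List[str]] = None,
    max_generations: Optional[int] = None,
    generation: int = 0,
    seen: Optional[Set[str]] = None,
) -> Generator[Tuple[str, str, int], None, None]:
    """Yield (ancestor, relationship, generations_above_start), depth first."""
    if seen is None:
        seen = set()
    if max_generations is not None and generation >= max_generations:
        return
    for relationship, parent in links.get(name, []):
        if relationships is not None and relationship not in relationships:
            continue  # relationship type not requested
        if parent in seen:
            continue  # already reported via another path
        seen.add(parent)
        yield parent, relationship, generation + 1
        yield from walk_ancestors(
            parent, links, relationships, max_generations,
            generation + 1, seen,
        )


all_ancestors = list(walk_ancestors("drug_x", TOY_LINKS))
is_a_only = list(walk_ancestors("drug_x", TOY_LINKS, relationships=["is_a"]))
parents_only = list(walk_ancestors("drug_x", TOY_LINKS, max_generations=1))
```

The shared seen set is what prevents role_a from being reported twice when it is reachable through both drug_x and class_b.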
- cardinal_pythonlib.chebi.gen_ancestors(entity: ChebiEntity, relationships: List[str] | None = None, max_generations: int | None = None) → Generator[HashableChebiEntity, None, None]
Generates ancestors as per gen_ancestor_info(), without relationship or generation info.
- cardinal_pythonlib.chebi.get_category(entity_name: str, categories: Sequence[str], entity_synonyms: CaseInsensitiveDict | None = None, category_synonyms: CaseInsensitiveDict | None = None, manual_categories: CaseInsensitiveDict | None = None, relationships: List[str] | None = None) → str | None
- Parameters:
entity_name – name of entity to categorize
categories – permissible categories (earlier preferable to later)
entity_synonyms – map to rename entities
category_synonyms – mapping of categories to other (preferred) categories
manual_categories – manual overrides mapping entity to category
relationships – list of valid relationship types defining ancestry, e.g. “has_role”
- Returns:
chosen category, or None if none found
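The way these parameters interact can be sketched in plain Python. This is a guess at the logic under stated assumptions, not the library's implementation: pick_category and toy_lookup are hypothetical, the order of steps (rename, then manual override, then ancestor lookup) is assumed, and the ancestor lists are invented rather than real ChEBI output.

```python
from typing import Callable, Dict, Iterable, Optional, Sequence


def pick_category(
    entity_name: str,
    categories: Sequence[str],
    ancestor_names: Callable[[str], Iterable[str]],
    entity_synonyms: Optional[Dict[str, str]] = None,
    category_synonyms: Optional[Dict[str, str]] = None,
    manual_categories: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Return the highest-priority category matching an entity, or None."""
    entity_synonyms = entity_synonyms or {}
    category_synonyms = category_synonyms or {}
    manual_categories = manual_categories or {}

    # 1. Rename the entity before lookup (assumed to happen first).
    name = entity_synonyms.get(entity_name.lower(), entity_name)

    # 2. A manual categorization wins outright (assumed precedence).
    manual = manual_categories.get(name.lower())
    if manual is not None:
        return manual

    # 3. Collect ancestor names, plus any preferred category synonyms.
    found = set()
    for ancestor in ancestor_names(name):
        key = ancestor.lower()
        found.add(key)
        if key in category_synonyms:
            found.add(category_synonyms[key].lower())

    # 4. Earlier categories are preferred to later ones.
    for category in categories:
        if category.lower() in found:
            return category
    return None


CATEGORIES = [
    "serotonin reuptake inhibitor",
    "antidepressant",
    "antilipemic drug",
    "non-steroidal anti-inflammatory drug",
]
# Invented ancestor lists standing in for a real ChEBI lookup.
TOY_ANCESTORS = {
    "citalopram": ["antidepressant", "serotonin reuptake inhibitor"],
    "acetylsalicylic acid": ["non-steroidal anti-inflammatory drug"],
}


def toy_lookup(name: str) -> Iterable[str]:
    return TOY_ANCESTORS.get(name.lower(), [])


citalopram_category = pick_category("citalopram", CATEGORIES, toy_lookup)
aspirin_category = pick_category(
    "aspirin", CATEGORIES, toy_lookup,
    entity_synonyms={"aspirin": "acetylsalicylic acid"},
)
agomelatine_category = pick_category(
    "agomelatine", CATEGORIES, toy_lookup,
    manual_categories={"agomelatine": "antidepressant"},
)
unknown_category = pick_category("simvastatin", CATEGORIES, toy_lookup)
```

Because the category list is ordered by priority, citalopram resolves to the first matching entry even though the toy lookup lists a lower-priority category first.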
- cardinal_pythonlib.chebi.get_chebi_id_number(entity: ChebiEntity) → int
Returns the CHEBI ID number as an integer.
- Parameters:
entity – a libchebipy.ChebiEntity
- cardinal_pythonlib.chebi.get_chebi_id_number_str(entity: ChebiEntity) → str
Returns the CHEBI ID number as a string.
- Parameters:
entity – a libchebipy.ChebiEntity
- cardinal_pythonlib.chebi.get_entity(chebi_id: int | str) → ChebiEntity
Fetch a ChEBI entity by its ID.
- Parameters:
chebi_id – integer ChEBI ID like 15903, or string ID like '15903', or string ID like 'CHEBI:15903'.
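The three accepted ID forms can be reduced to one canonical string before any lookup. The helper below is a hypothetical sketch (normalize_chebi_id is not part of the library) and the choice of the 'CHEBI:' prefixed form as canonical is an assumption.

```python
from typing import Union


def normalize_chebi_id(chebi_id: Union[int, str]) -> str:
    """Convert 15903, '15903' or 'CHEBI:15903' to 'CHEBI:15903'."""
    text = str(chebi_id).strip()
    prefix = "CHEBI:"
    if text.upper().startswith(prefix):
        text = text[len(prefix):]
    if not text.isdigit():
        raise ValueError(f"Not a ChEBI ID: {chebi_id!r}")
    return f"{prefix}{int(text)}"


normalized = [
    normalize_chebi_id(15903),
    normalize_chebi_id("15903"),
    normalize_chebi_id("CHEBI:15903"),
]
print(normalized)
```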
- cardinal_pythonlib.chebi.read_dict(filename: str) → CaseInsensitiveDict
Reads a file that may have comments but is otherwise in the format
a1, b1
a2, b2
...
- Parameters:
filename – filename to read
- Returns:
mapping the first column (converted to lower case) to the second (case left intact).
- Return type:
dict
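A parser for this file format might look like the sketch below. It is an illustration under assumptions, not the library's code: parse_pairs is a hypothetical name, it works on lines of text rather than a filename so that it runs without touching the disk, it treats everything after a # as a comment, and it silently skips lines that lack a comma. The sample text adapts the entity_synonyms.txt example, with the key capitalised to show the lower-casing.

```python
from typing import Dict, Iterable

SAMPLE = """\
# entity_synonyms.txt
# Renaming of entities prior to lookup.
Aspirin, acetylsalicylic acid
"""


def parse_pairs(lines: Iterable[str]) -> Dict[str, str]:
    """Map the first column (lower-cased) to the second (case intact)."""
    mapping: Dict[str, str] = {}
    for raw in lines:
        # Drop comments (assumed: anything after '#') and whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        key, separator, value = line.partition(",")
        if not separator:
            continue  # assumed: ignore lines that are not "a, b" pairs
        mapping[key.strip().lower()] = value.strip()
    return mapping


pairs = parse_pairs(SAMPLE.splitlines())
print(pairs)
```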
- cardinal_pythonlib.chebi.report_ancestors(entity: ChebiEntity, relationships: List[str] | None = None, max_generations: int | None = None) → None
Fetches and reports on ancestors of an entity, e.g. via “is_a” relationships. See gen_ancestor_info().
- cardinal_pythonlib.chebi.report_ancestors_multiple(entity_names: List[str], relationships: List[str] | None = None, max_generations: int | None = None) → None
Looks up entities by name, then fetches and reports on the ancestors of each, e.g. via “is_a” relationships. See gen_ancestor_info().
- cardinal_pythonlib.chebi.search_and_describe(search_term: int | str, exact_search: bool = False, exact_match: bool = False) → None
Search for a ChEBI term and describe it to the log.
- cardinal_pythonlib.chebi.search_and_describe_multiple(search_terms: List[int | str], exact_search: bool = False, exact_match: bool = False) → None
Search for ChEBI terms; describe matching entries to the log.
- cardinal_pythonlib.chebi.search_and_list(search_term: int | str, exact_search: bool = False, exact_match: bool = False) → None
Search for a ChEBI term; print matching entries to the log.
- cardinal_pythonlib.chebi.search_and_list_multiple(search_terms: List[int | str], exact_search: bool = False, exact_match: bool = False) → None
Search for ChEBI terms; print matching entries to the log.
- cardinal_pythonlib.chebi.search_entities(search_term: int | str, exact_search: bool = False, exact_match: bool = False) → List[ChebiEntity]
Search for ChEBI entities.
Case-insensitive.
- Parameters:
search_term – String or integer to search for.
exact_search – The exact parameter to libchebipy.search().
exact_match – Ensure that the name of the result exactly matches the search term. Example: an exact search for “zopiclone” gives both “zopiclone (CHEBI:32315)” and “(5R)-zopiclone (CHEBI:53762)”; this option filters to the first.
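The difference between an exact search and an exact match can be shown with the zopiclone example above. The sketch filters a list of (name, ID) pairs that stands in for search results; filter_exact_match is a hypothetical helper, and the case-insensitive comparison is an assumption based on the note that searching is case-insensitive.

```python
from typing import List, Tuple

# Stand-in for the results of an exact search for "zopiclone",
# using the names and IDs quoted in the parameter description.
SEARCH_RESULTS: List[Tuple[str, str]] = [
    ("zopiclone", "CHEBI:32315"),
    ("(5R)-zopiclone", "CHEBI:53762"),
]


def filter_exact_match(
    results: List[Tuple[str, str]], search_term: str
) -> List[Tuple[str, str]]:
    """Keep only results whose name equals the search term, ignoring case."""
    wanted = search_term.lower()
    return [(name, cid) for name, cid in results if name.lower() == wanted]


matches = filter_exact_match(SEARCH_RESULTS, "Zopiclone")
print(matches)
```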
- cardinal_pythonlib.chebi.translate(term: str, mapping: CaseInsensitiveDict) → Tuple[str, bool]
Translates a term through a dictionary. If the term (once converted to lower case) is in the dictionary (see read_dict()), the mapped term is returned; otherwise the original search term is returned.
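The lookup rule can be sketched with an ordinary dict whose keys are already lower case. The function translate_term is hypothetical, and the meaning given to the second tuple element (whether a mapping was applied) is an assumption: the description above does not say what the bool represents.

```python
from typing import Dict, Tuple


def translate_term(term: str, mapping: Dict[str, str]) -> Tuple[str, bool]:
    """Return (possibly translated term, flag).

    The flag is assumed to mean "a mapping was applied"; this is a
    guess, not documented behaviour.
    """
    key = term.lower()
    if key in mapping:
        return mapping[key], True
    return term, False


SYNONYMS = {"aspirin": "acetylsalicylic acid"}

renamed = translate_term("Aspirin", SYNONYMS)
unchanged = translate_term("citalopram", SYNONYMS)
```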