cardinal_pythonlib.sqlalchemy.merge_db
Original code copyright (C) 2009-2022 Rudolf Cardinal (rudolf@pobox.com).
This file is part of cardinal_pythonlib.
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Function “merge_db” to merge two databases via SQLAlchemy.
Notes:
Note in passing: there is no common base class for SQLAlchemy ORM instances
(it’s not DeclarativeMeta). For example, in CamCOPS:
> Phq9.__bases__
(<class 'camcops_server.cc_modules.cc_task.TaskHasPatientMixin'>,
<class 'camcops_server.cc_modules.cc_task.Task'>,
<class 'sqlalchemy.ext.declarative.api.Base'>)
… and that last Base isn’t a permanent class, just a newly named
thing; see sqlalchemy.ext.declarative.api.declarative_base().
Again, with the CamCOPS classes:
> issubclass(Phq9, Base)
True
> issubclass(Base, DeclarativeMeta)
False
> Base.__bases__
(<class 'object'>,)
So the best type hints we have are:
class: Type
instance: object
- class cardinal_pythonlib.sqlalchemy.merge_db.TableDependency(parent_table_id: TableIdentity | None = None, child_table_id: TableIdentity | None = None, parent_table: Table | None = None, child_table: Table | None = None, parent_tablename: str | None = None, child_tablename: str | None = None, metadata: MetaData | None = None)[source]
Stores a table dependency for use in functions such as
sqlalchemy.schema.sort_tables(), which requires a tuple of twoTableobjects, in the order(parent, child), wherechilddepends onparent(e.g. a field likechild.parent_idrefers toparent.id).The parent and child tables can be specified by name,
Tableobject, or ourTableIdentitydescriptor class.- property child_table: Table
Returns the child table as a
Table.
- property child_tablename: str
Returns the child table’s string name.
- property parent_table: Table
Returns the parent table as a
Table.
- property parent_tablename: str
Returns the parent table’s string name.
- class cardinal_pythonlib.sqlalchemy.merge_db.TableDependencyClassification(table: Table, children: List[Table] | None = None, parents: List[Table] | None = None)[source]
Class to describe/classify a table in terms of its dependencies.
- Parameters:
- property child_names: List[str]
Returns the names of this table’s children.
- property circular_description: str
Description of the circular chain.
- property description: str
Short description.
- property is_child: bool
Is this table a child?
- property is_parent: bool
Is this table a parent?
- property parent_names: List[str]
Returns the names of this table’s parents.
- set_circular(circular: bool, chain: List[Table] | None = None) None[source]
Mark this table as circular (or not).
- property standalone: bool
Is this table standalone (neither a child nor a parent)?
- property tablename: str
Returns the table’s name.
- class cardinal_pythonlib.sqlalchemy.merge_db.TranslationContext(oldobj: object, newobj: object, objmap: Dict[object, object], table: Table, tablename: str, src_session: Session, dst_session: Session, src_engine: Engine, dst_engine: Engine, src_table_names: List[str], missing_src_columns: List[str] | None = None, info: Dict[str, Any] | None = None)[source]
Information-passing object for user callbacks from
merge_db().- Parameters:
oldobj¶ – The old SQLAlchemy ORM object from the source session.
newobj¶ –
The framework’s go at building a new SQLAlchemy ORM object, which will be inserted into the destination session.
The sequence is:
newobjis createda
TranslationContextis created, referring tonewobjThe
translate_fnparameter tomerge_db()will be called with theTranslationContextas its parameterthe user-suppled
translate_fn()function can, at this point, modify thenewobjattributeif the user function sets the
newobjattribute toNone, this object will be skipped
If the
TranslationContext’snewobjmember is notNone, the new object is inserted into the destination session.
objmap¶ – A dictionary mapping old to new objects, for objects in tables other than standalone tables.
table¶ – SQLAlchemy
Tableobject from the metadata. (Not necessarily bound to any session, but will reflect the structure of the destination, not necessarily the source, since the merge operation assumes that the metadata describes the destination.)tablename¶ – Table name that corresponds to
table.src_session¶ – The SQLAlchemy
Sessionobject for the source.dst_session¶ – The SQLAlchemy
Sessionobject for the destination.src_engine¶ – The SQLAlchemy
Engineobject for the source.dst_engine¶ – The SQLAlchemy
Engineobject for the destination.missing_src_columns¶ – Names of columns known to be present in the destination but absent from the source.
info¶ – Extra dictionary for additional user-specified information.
It is possible that
oldobjandnewobjare the SAME OBJECT.
- cardinal_pythonlib.sqlalchemy.merge_db.classify_tables_by_dependency_type(metadata: MetaData, extra_dependencies: List[TableDependency] | None = None, skip_dependencies: List[TableDependency] | None = None, all_dependencies: List[TableDependency] | None = None, even_use_alter: bool = False, sort: bool = True) List[TableDependencyClassification][source]
Inspects a metadata object (optionally adding other specified dependencies) and returns a list of objects describing their dependencies.
- Parameters:
metadata¶ – the
MetaDatato inspectextra_dependencies¶ – Additional dependencies. (Not used if you specify all_dependencies.)
skip_dependencies¶ – Additional table dependencies to IGNORE. (Not used if you specify all_dependencies.)
all_dependencies¶ – If you have precalculated all dependencies, you can pass that in here, to save redoing the work.
even_use_alter¶ – Even include relationships with
use_alterset. See SQLAlchemy documentation. (Not used if you specify all_dependencies.)sort¶ – sort the results by table name?
- Returns:
list of
TableDependencyClassificationobjects, one for each table
- cardinal_pythonlib.sqlalchemy.merge_db.get_all_dependencies(metadata: MetaData, extra_dependencies: List[TableDependency] | None = None, skip_dependencies: List[TableDependency] | None = None, sort: bool = False, even_use_alter: bool = False, debug: bool = False) List[TableDependency][source]
Describes how the tables found in the metadata depend on each other. (If table B contains a foreign key to table A, for example, then B depends on A.)
- Parameters:
metadata¶ – The metadata to inspect.
extra_dependencies¶ – Additional table dependencies to specify manually.
skip_dependencies¶ – Additional table dependencies to IGNORE.
sort¶ – Sort into alphabetical order of (parent, child) table names?
even_use_alter¶ – Even include relationships with
use_alterset. See SQLAlchemy documentation.debug¶ – Show debugging information.
- Returns:
a list of
TableDependencyobjects
- cardinal_pythonlib.sqlalchemy.merge_db.merge_db(base_class: Type, src_engine: Engine, dst_session: Session, allow_missing_src_tables: bool = True, allow_missing_src_columns: bool = True, translate_fn: Callable[[TranslationContext], None] | None = None, skip_tables: List[TableIdentity] | None = None, only_tables: List[TableIdentity] | None = None, tables_to_keep_pks_for: List[TableIdentity] | None = None, extra_table_dependencies: List[TableDependency] | None = None, skip_table_dependencies: List[TableDependency] | None = None, dummy_run: bool = False, info_only: bool = False, report_every: int = 1000, flush_per_table: bool = True, flush_per_record: bool = False, commit_with_flush: bool = False, commit_at_end: bool = True, prevent_eager_load: bool = True, trcon_info: Dict[str, Any] | None = None, even_use_alter_relationships: bool = False, debug_table_structure: bool = False, debug_table_dependencies: bool = False, debug_copy_sqla_object: bool = False, debug_rewrite_relationships: bool = False, use_sqlalchemy_order: bool = True) None[source]
Copies an entire database as far as it is described by
metadataandbase_class, from SQLAlchemy ORM sessionsrc_sessiontodst_session, and in the process:creates new primary keys at the destination, or raises an error if it doesn’t know how (typically something like:
Field 'name' doesn't have a default value)maintains relationships, or raises an error if it doesn’t know how
Basic method:
Examines the metadata for the SQLAlchemy ORM base class you provide.
Assumes that the tables exist (in the destination).
For each table/ORM class found in the metadata:
Queries (via the ORM) from the source.
For each ORM instance retrieved:
Writes information to the destination SQLAlchemy session.
If that ORM object has relationships, process them too.
If a table is missing in the source, then that’s OK if and only if
allow_missing_src_tablesis set. (Similarly with columns andallow_missing_src_columns; we ask the ORM to perform a partial load, of a subset of attributes only.)- Parameters:
base_class¶ – your ORM base class, e.g. from
Base = declarative_base()src_engine¶ – SQLALchemy
Enginefor the source databasedst_session¶ – SQLAlchemy
Sessionfor the destination databaseallow_missing_src_tables¶ – proceed if tables are missing from the source (allowing you to import from older, incomplete databases)
allow_missing_src_columns¶ – proceed if columns are missing from the source (allowing you to import from older, incomplete databases)
translate_fn¶ –
optional function called with each instance, so you can modify instances in the pipeline. Signature:
def my_translate_fn(trcon: TranslationContext) -> None: # We can modify trcon.newobj, or replace it (including # setting trcon.newobj = None to omit this object). pass
skip_tables¶ – tables to skip (specified as a list of
TableIdentity)only_tables¶ – tables to restrict the processor to (specified as a list of
TableIdentity)tables_to_keep_pks_for¶ – tables for which PKs are guaranteed to be safe to insert into the destination database, without modification (specified as a list of
TableIdentity)extra_table_dependencies¶ – optional list of
TableDependencyobjects (q.v.) to includeskip_table_dependencies¶ – optional list of
TableDependencyobjects (q.v.) to IGNORE; unusualdummy_run¶ – don’t alter the destination database
info_only¶ – show info, then stop
report_every¶ – provide a progress report every n records
flush_per_table¶ – flush the session after every table (reasonable)
flush_per_record¶ – flush the session after every instance (AVOID this if tables may refer to themselves)
commit_with_flush¶ –
COMMITwith each flush?commit_at_end¶ –
COMMITwhen finished?prevent_eager_load¶ – disable any eager loading (use lazy loading instead)
trcon_info¶ – additional dictionary passed to
TranslationContext.info(seeTranslationContext)even_use_alter_relationships¶ – Even include relationships with
use_alterset. See SQLAlchemy documentation.debug_table_structure¶ – Debug table structure? Can be long-winded.
debug_table_dependencies¶ – Debug calculating table dependencies?
debug_copy_sqla_object¶ – Debug copying objects?
debug_rewrite_relationships¶ – Debug rewriting ORM relationships?
use_sqlalchemy_order¶ – If true, use the table order suggested by SQLAlchemy. If false, calculate our own.
- cardinal_pythonlib.sqlalchemy.merge_db.suggest_table_order(classified_tables: List[TableDependencyClassification]) List[Table][source]
Suggest an order to process tables in, according to precalculated dependencies.
- Parameters:
classified_tables¶ – The tables, with dependency information.
- Returns:
A list of the tables, sorted into a sensible order.