cardinal_pythonlib.sqlalchemy.merge_db
Original code copyright (C) 2009-2022 Rudolf Cardinal (rudolf@pobox.com).
This file is part of cardinal_pythonlib.
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Function “merge_db” to merge two databases via SQLAlchemy.
Notes:
Note in passing: there is no common base class for SQLAlchemy ORM instances
(it’s not DeclarativeMeta
). For example, in CamCOPS:
> Phq9.__bases__
(<class 'camcops_server.cc_modules.cc_task.TaskHasPatientMixin'>,
<class 'camcops_server.cc_modules.cc_task.Task'>,
<class 'sqlalchemy.ext.declarative.api.Base'>)
… and that last Base
isn’t a permanent class, just a newly named
thing; see sqlalchemy.ext.declarative.api.declarative_base()
.
Again, with the CamCOPS classes:
> issubclass(Phq9, Base)
True
> issubclass(Base, DeclarativeMeta)
False
> Base.__bases__
(<class 'object'>,)
So the best type hints we have are:
class: Type
instance: object
- class cardinal_pythonlib.sqlalchemy.merge_db.TableDependency(parent_table_id: TableIdentity | None = None, child_table_id: TableIdentity | None = None, parent_table: Table | None = None, child_table: Table | None = None, parent_tablename: str | None = None, child_tablename: str | None = None, metadata: MetaData | None = None)[source]
Stores a table dependency for use in functions such as
sqlalchemy.schema.sort_tables()
, which requires a tuple of twoTable
objects, in the order(parent, child)
, wherechild
depends onparent
(e.g. a field likechild.parent_id
refers toparent.id
).The parent and child tables can be specified by name,
Table
object, or ourTableIdentity
descriptor class.- property child_table: Table
Returns the child table as a
Table
.
- property child_tablename: str
Returns the child table’s string name.
- property parent_table: Table
Returns the parent table as a
Table
.
- property parent_tablename: str
Returns the parent table’s string name.
- class cardinal_pythonlib.sqlalchemy.merge_db.TableDependencyClassification(table: Table, children: List[Table] | None = None, parents: List[Table] | None = None)[source]
Class to describe/classify a table in terms of its dependencies.
- Parameters:
- property child_names: List[str]
Returns the names of this table’s children.
- property circular_description: str
Description of the circular chain.
- property description: str
Short description.
- property is_child: bool
Is this table a child?
- property is_parent: bool
Is this table a parent?
- property parent_names: List[str]
Returns the names of this table’s parents.
- set_circular(circular: bool, chain: List[Table] | None = None) None [source]
Mark this table as circular (or not).
- property standalone: bool
Is this table standalone (neither a child nor a parent)?
- property tablename: str
Returns the table’s name.
- class cardinal_pythonlib.sqlalchemy.merge_db.TranslationContext(oldobj: object, newobj: object, objmap: Dict[object, object], table: Table, tablename: str, src_session: Session, dst_session: Session, src_engine: Engine, dst_engine: Engine, src_table_names: List[str], missing_src_columns: List[str] | None = None, info: Dict[str, Any] | None = None)[source]
Information-passing object for user callbacks from
merge_db()
.- Parameters:
oldobj¶ – The old SQLAlchemy ORM object from the source session.
newobj¶ –
The framework’s go at building a new SQLAlchemy ORM object, which will be inserted into the destination session.
The sequence is:
newobj
is createda
TranslationContext
is created, referring tonewobj
The
translate_fn
parameter tomerge_db()
will be called with theTranslationContext
as its parameterthe user-suppled
translate_fn()
function can, at this point, modify thenewobj
attributeif the user function sets the
newobj
attribute toNone
, this object will be skipped
If the
TranslationContext
’snewobj
member is notNone
, the new object is inserted into the destination session.
objmap¶ – A dictionary mapping old to new objects, for objects in tables other than standalone tables.
table¶ – SQLAlchemy
Table
object from the metadata. (Not necessarily bound to any session, but will reflect the structure of the destination, not necessarily the source, since the merge operation assumes that the metadata describes the destination.)tablename¶ – Table name that corresponds to
table
.src_session¶ – The SQLAlchemy
Session
object for the source.dst_session¶ – The SQLAlchemy
Session
object for the destination.src_engine¶ – The SQLAlchemy
Engine
object for the source.dst_engine¶ – The SQLAlchemy
Engine
object for the destination.missing_src_columns¶ – Names of columns known to be present in the destination but absent from the source.
info¶ – Extra dictionary for additional user-specified information.
It is possible that
oldobj
andnewobj
are the SAME OBJECT.
- cardinal_pythonlib.sqlalchemy.merge_db.classify_tables_by_dependency_type(metadata: MetaData, extra_dependencies: List[TableDependency] | None = None, sort: bool = True) List[TableDependencyClassification] [source]
Inspects a metadata object (optionally adding other specified dependencies) and returns a list of objects describing their dependencies.
- Parameters:
- Returns:
list of
TableDependencyClassification
objects, one for each table
- cardinal_pythonlib.sqlalchemy.merge_db.get_all_dependencies(metadata: MetaData, extra_dependencies: List[TableDependency] | None = None, sort: bool = True) List[TableDependency] [source]
Describes how the tables found in the metadata depend on each other. (If table B contains a foreign key to table A, for example, then B depends on A.)
- Parameters:
- Returns:
a list of
TableDependency
objects
See
sort_tables_and_constraints()
for method.
- cardinal_pythonlib.sqlalchemy.merge_db.merge_db(base_class: Type, src_engine: Engine, dst_session: Session, allow_missing_src_tables: bool = True, allow_missing_src_columns: bool = True, translate_fn: Callable[[TranslationContext], None] | None = None, skip_tables: List[TableIdentity] | None = None, only_tables: List[TableIdentity] | None = None, tables_to_keep_pks_for: List[TableIdentity] | None = None, extra_table_dependencies: List[TableDependency] | None = None, dummy_run: bool = False, info_only: bool = False, report_every: int = 1000, flush_per_table: bool = True, flush_per_record: bool = False, commit_with_flush: bool = False, commit_at_end: bool = True, prevent_eager_load: bool = True, trcon_info: Dict[str, Any] | None = None) None [source]
Copies an entire database as far as it is described by
metadata
andbase_class
, from SQLAlchemy ORM sessionsrc_session
todst_session
, and in the process:creates new primary keys at the destination, or raises an error if it doesn’t know how (typically something like:
Field 'name' doesn't have a default value
)maintains relationships, or raises an error if it doesn’t know how
Basic method:
Examines the metadata for the SQLAlchemy ORM base class you provide.
Assumes that the tables exist (in the destination).
For each table/ORM class found in the metadata:
Queries (via the ORM) from the source.
For each ORM instance retrieved:
Writes information to the destination SQLAlchemy session.
If that ORM object has relationships, process them too.
If a table is missing in the source, then that’s OK if and only if
allow_missing_src_tables
is set. (Similarly with columns andallow_missing_src_columns
; we ask the ORM to perform a partial load, of a subset of attributes only.)- Parameters:
base_class¶ – your ORM base class, e.g. from
Base = declarative_base()
src_engine¶ – SQLALchemy
Engine
for the source databasedst_session¶ – SQLAlchemy
Session
for the destination databaseallow_missing_src_tables¶ – proceed if tables are missing from the source (allowing you to import from older, incomplete databases)
allow_missing_src_columns¶ – proceed if columns are missing from the source (allowing you to import from older, incomplete databases)
translate_fn¶ –
optional function called with each instance, so you can modify instances in the pipeline. Signature:
def my_translate_fn(trcon: TranslationContext) -> None: # We can modify trcon.newobj, or replace it (including # setting trcon.newobj = None to omit this object). pass
skip_tables¶ – tables to skip (specified as a list of
TableIdentity
)only_tables¶ – tables to restrict the processor to (specified as a list of
TableIdentity
)tables_to_keep_pks_for¶ – tables for which PKs are guaranteed to be safe to insert into the destination database, without modification (specified as a list of
TableIdentity
)extra_table_dependencies¶ – optional list of
TableDependency
objects (q.v.)dummy_run¶ – don’t alter the destination database
info_only¶ – show info, then stop
report_every¶ – provide a progress report every n records
flush_per_table¶ – flush the session after every table (reasonable)
flush_per_record¶ – flush the session after every instance (AVOID this if tables may refer to themselves)
commit_with_flush¶ –
COMMIT
with each flush?commit_at_end¶ –
COMMIT
when finished?prevent_eager_load¶ – disable any eager loading (use lazy loading instead)
trcon_info¶ – additional dictionary passed to
TranslationContext.info
(seeTranslationContext
)