Cfm 29 transmogrify plugin #347

Status: Open. Wants to merge 15 commits into base: develop.
6 changes: 4 additions & 2 deletions app/composer.json
@@ -47,7 +47,8 @@
"PipelineToolkit\\": "availableplugins/PipelineToolkit/src/",
"SqlConnector\\": "availableplugins/SqlConnector/src/",
"SshKeyAuthenticator\\": "plugins/SshKeyAuthenticator/src/",
- "CoreJob\\": "plugins/CoreJob/src/"
+ "CoreJob\\": "plugins/CoreJob/src/",
+ "Transmogrify\\": "plugins/Transmogrify/src/"
}
},
"autoload-dev": {
@@ -66,7 +67,8 @@
"PipelineToolkit\\Test\\": "availableplugins/PipelineToolkit/tests/",
"SqlConnector\\Test\\": "availableplugins/SqlConnector/tests/",
"SshKeyAuthenticator\\Test\\": "plugins/SshKeyAuthenticator/tests/",
- "CoreJob\\Test\\": "plugins/CoreJob/tests/"
+ "CoreJob\\Test\\": "plugins/CoreJob/tests/",
+ "Transmogrify\\Test\\": "plugins/Transmogrify/tests/"
}
},
"scripts": {
222 changes: 222 additions & 0 deletions app/plugins/Transmogrify/README.md
@@ -0,0 +1,222 @@
# Transmogrify (COmanage Registry Plugin)

Transmogrify is a command‑line migration tool bundled as a CakePHP plugin for COmanage Registry PE. It copies and transforms data from a legacy/source Registry schema (cm_… tables) into the PE target schema using a configurable mapping (config/schema/tables.json) plus code hooks for special cases.

It is designed to be:
- Safe by default: target tables that already contain data are skipped and not overwritten.
- Transparent: shows a progress bar with in‑line warnings and errors.
- Configurable: table selection, custom tables.json path, and optional helper modes (info, schema, health checks, etc.).


## Prerequisites
- Configure database connections in local/config/database.php
  - Source DB (used by Transmogrify) and target DB must both be reachable. Transmogrify initializes two Doctrine DBAL connections internally.
- The default tables mapping file is at:
  - app/plugins/Transmogrify/config/schema/tables.json
- Actions to take before migrating:
  - Finalize any pending Jobs. Jobs in Queued or In Progress state will be skipped.
  - Restore extended type defaults.
  - Run the health checks.

## Command
Invoke from your app root:

- bin/cake transmogrify [options]

Cake’s standard flags also apply (e.g., --verbose, --quiet, -h).


## Options
These options come directly from TransmogrifyCommand::buildOptionParser.

- --tables-config PATH
  - Path to the transmogrify tables JSON config (defaults to the plugin’s tables.json).

- --dump-tables-config
  - Print the effective tables configuration (after schema extension) and exit.

- --table NAME (repeatable)
  - Migrate only the specified table(s). Repeat --table to select multiple.
  - If omitted, Transmogrify processes all known tables in the configured order.

- --list-tables
  - List available target tables from the transmogrify config and exit.

- --info
  - Print source/target database configuration and exit.

- --info-json
  - Output info in JSON (use with --info).

- --info-ping
  - Ping connections and include connectivity + server version details (use with --info or --info-schema).

- --info-schema
  - Print schema information and whether the database is empty (defaults to inspecting the target DB).

- --info-schema-role ROLE
  - When used with --info-schema, select which database to inspect: source or target (default: target).

- --login-identifier-copy
  - Enable helper logic to copy/set up login identifiers during migration (see your deployment’s identifier policy). Use together with --login-identifier-type to choose the identifier type.

- --login-identifier-type TYPE
  - Identifier type value to use for login identifiers when --login-identifier-copy is set.

- --orgidentities-health
  - Run Org Identities health check (eligibility/exclusion breakdown based on non-historical links and person existence) and print a transmogrification readiness report, then exit.

- --groups-health
  - Run Groups health check (AR‑Group‑9: invalid Standard group names) and print a transmogrification readiness report, then exit.

- --groups-colon-replacement STRING
  - Optional: replace ":" with STRING in Standard group names during migration (opt‑in). Use with care; the name "CO" remains invalid and will not be auto‑renamed.


## Typical usage

- Migrate everything using the default mapping
  - bin/cake transmogrify

- Preview environment information
  - bin/cake transmogrify --info
  - bin/cake transmogrify --info --info-json
  - bin/cake transmogrify --info --info-ping

- Inspect schema state (target by default, or choose source)
  - bin/cake transmogrify --info-schema
  - bin/cake transmogrify --info-schema --info-schema-role source

- List the tables Transmogrify knows how to process
  - bin/cake transmogrify --list-tables

- Dump the effective tables configuration
  - bin/cake transmogrify --dump-tables-config

- Migrate a subset of tables (in safe order)
  - bin/cake transmogrify --table types --table people --table person_roles

- Use a custom tables.json mapping
  - bin/cake transmogrify --tables-config /path/to/your/tables.json

- Migrate with login identifier help
  - bin/cake transmogrify --table identifiers --login-identifier-copy --login-identifier-type eppn

Hints:
- Combine Cake’s verbosity flags for extra diagnostics: add --verbose to see per‑row details emitted by some hooks; add --quiet to minimize output (progress still shows).
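
When the command is driven from a script, the repeatable --table option is the part most easily gotten wrong. The following is a minimal Python sketch (not part of the plugin, which is PHP) of assembling the argument list; the function name `build_transmogrify_argv` is hypothetical, and only flags documented above are used.

```python
from typing import List, Optional


def build_transmogrify_argv(tables: Optional[List[str]] = None,
                            tables_config: Optional[str] = None) -> List[str]:
    """Assemble an argv list for `bin/cake transmogrify` (hypothetical helper)."""
    argv = ["bin/cake", "transmogrify"]
    if tables_config:
        # Custom mapping file; when omitted, the plugin's own tables.json applies.
        argv += ["--tables-config", tables_config]
    for name in tables or []:
        # --table is repeatable: one flag per table, in the order given.
        argv += ["--table", name]
    return argv


if __name__ == "__main__":
    # Prints the same subset invocation shown in the list above.
    print(" ".join(build_transmogrify_argv(["types", "people", "person_roles"])))
```

The resulting list can be handed to `subprocess.run(argv, check=True)` from the app root, so that a non-zero exit code raises an exception.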


## Behavior notes
- Target table already has data? Transmogrify will skip that table and warn (no overwrite).
- Primary key sequences are aligned automatically to preserve/accept explicit IDs where possible.
- Ordering and foreign keys:
  - The tool emits rows in a dependency‑friendly order and may retry deferred rows to satisfy FK dependencies.
  - You’ll see warnings for rows skipped due to unresolved foreign keys or missing type mappings.
- Progress UI:
  - A single‑line progress bar updates in place.
  - Warnings and errors appear under the bar as they happen.
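
The deferred‑row behavior can be pictured with a small Python sketch. This is an illustration of the general retry idea only, under the assumption of a single self‑referencing `parent_id` foreign key; it is not the plugin's actual algorithm, and the names `insert_with_retries` and `max_passes` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[int]]


def insert_with_retries(rows: List[Row], max_passes: int = 10) -> Tuple[List[int], List[int]]:
    """Insert rows whose parent already exists; defer and retry the rest.

    Returns (ids in insertion order, ids skipped as unresolved).
    """
    inserted = set()
    order: List[int] = []
    pending = list(rows)
    for _ in range(max_passes):
        deferred = []
        for row in pending:
            parent = row.get("parent_id")
            if parent is None or parent in inserted:
                inserted.add(row["id"])
                order.append(row["id"])
            else:
                deferred.append(row)  # parent not present yet: try again later
        made_progress = len(deferred) < len(pending)
        pending = deferred
        if not pending or not made_progress:
            break  # either done, or the remaining rows can never be resolved
    return order, [row["id"] for row in pending]


if __name__ == "__main__":
    rows = [
        {"id": 2, "parent_id": 1},   # child listed before its parent
        {"id": 1, "parent_id": None},
        {"id": 3, "parent_id": 99},  # parent does not exist anywhere
    ]
    print(insert_with_retries(rows))
```

Rows left in the second list correspond to the "skipped due to unresolved foreign keys" warnings described above.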


## Mapping and hooks
- Mapping is defined in config/schema/tables.json.
- You can specify per‑table field maps, boolean normalization, caches, and pre/post hooks.
- Some tables use custom SELECTs to ensure parent‑before‑child ordering (e.g., recursive chains).
- Type lookups depend on the ‘types’ cache; migrate ‘types’ early or include it in the same run if you migrate data that requires type IDs.


## Exit codes
- 0: Success
- Non‑zero: Error (check the emitted [ERROR] lines)


## Getting help
- bin/cake transmogrify -h
- Consult this plugin’s code for advanced behavior:
  - src/Command/TransmogrifyCommand.php
  - config/schema/tables.json
  - src/Lib/Traits/* (type mapping, caching, row transformations, hooks)

## Org Identities → External Identities: Readiness and Migration Behavior

This plugin migrates Organizational Identities to the new model introduced in COmanage Registry v5:

- OrgIdentity was split into ExternalIdentity and ExternalIdentityRole.
- ExternalIdentity directly relates to Person (a Person may have multiple External Identities, but each External Identity belongs to a single Person).
- The CoOrgIdentityLink crosswalk was eliminated and Organizational Identity Pooling was dropped.
- affiliation was replaced by affiliation_type_id.
- o was renamed to organization.
- ou was renamed to department.
- External Identities no longer have Primary Names.
- External Identities do not carry the login flag on Identifiers. As of v5.1.0, the Identifier Mapper Pipeline Plugin can be used to set this flag.

Eligibility (reasoning) is based on “non‑historical” links in cm_co_org_identity_links (co_org_identity_link_id IS NULL):
- A) No non‑historical link: excluded (x)
- B) Has non‑historical link(s) but all co_person_id are NULL: excluded (x)
- C) Has at least one non‑historical link with a non‑NULL co_person_id: included (✓)
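
The three cases can be restated as a small decision function. This is a Python sketch for illustration only (the plugin is PHP and evaluates this against the database); each link is assumed to be a dict carrying the two columns named above, and the function name `classify_org_identity` is hypothetical.

```python
from typing import Dict, List, Optional

Link = Dict[str, Optional[int]]


def classify_org_identity(links: List[Link]) -> str:
    """Return 'A', 'B', or 'C' for one Org Identity's rows in cm_co_org_identity_links."""
    # A link is non-historical when co_org_identity_link_id IS NULL.
    current = [l for l in links if l.get("co_org_identity_link_id") is None]
    if not current:
        return "A"  # no non-historical link: excluded
    if all(l.get("co_person_id") is None for l in current):
        return "B"  # non-historical link(s), but no person attached: excluded
    return "C"      # at least one non-historical link to a person: included


def is_included(reason: str) -> bool:
    return reason == "C"


if __name__ == "__main__":
    sample = [{"co_org_identity_link_id": None, "co_person_id": 42}]
    reason = classify_org_identity(sample)
    print(reason, is_included(reason))
```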

Included (✓) Org Identities are migrated as:
- One ExternalIdentity linked directly to the Person (person_id from the link)
- One ExternalIdentityRole for role‑like attributes with v5 field changes applied:
  - affiliation → affiliation_type_id (type‑mapped)
  - o → organization
  - ou → department
- External Identities do not have Primary Names; identifier “login” flags are not set by this migration (use the Identifier Mapper Pipeline if needed)
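
As a rough picture of the split, the Python sketch below turns one Org Identity into the two v5 records using only the field changes listed above. It is a simplification under stated assumptions: real rows carry many more columns, `affiliation_type_ids` is a hypothetical lookup standing in for the types cache, and `split_org_identity` is not a function in the plugin.

```python
from typing import Dict, Optional, Tuple

Record = Dict[str, Optional[object]]


def split_org_identity(org: Record, person_id: int,
                       affiliation_type_ids: Dict[str, int]) -> Tuple[Record, Record]:
    """Return (external_identity, external_identity_role) for one Org Identity."""
    # The External Identity points straight at the Person taken from the link.
    external_identity: Record = {"person_id": person_id}
    role: Record = {
        # affiliation -> affiliation_type_id (None when no type mapping exists)
        "affiliation_type_id": affiliation_type_ids.get(org.get("affiliation")),
        "organization": org.get("o"),   # o -> organization
        "department": org.get("ou"),    # ou -> department
    }
    return external_identity, role


if __name__ == "__main__":
    org = {"affiliation": "member", "o": "Example University", "ou": "Physics"}
    print(split_org_identity(org, person_id=5, affiliation_type_ids={"member": 12}))
```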

### Recommended preflight: Org Identities Health command

Run this before migrating to verify which Org Identities will be included vs excluded, using the same non‑historical link reasoning:

```bash
bin/cake transmogrify --orgidentities-health
```


You’ll see a fixed‑width table with Reason, Included/Excluded counts, and an Indicator (✓ included, x excluded). Reasons are:
- A) No non‑historical link (excluded)
- B) Has non‑historical link(s) but all co_person_id are NULL (excluded)
- C) Has at least one non‑historical link with a non‑NULL co_person_id (included)

Totals summarize overall readiness. Use this report to address data conditions (e.g., missing person links) so that important Org Identities are eligible for migration.

### Recommended preflight: Groups Health command (Standard naming rule)

Transmogrify enforces a naming rule for Standard groups: a Standard group is considered invalid if its name contains a colon (:) or equals “CO” (case‑insensitive, trimmed). Invalid Standard groups will not be migrated by default and require admin action (e.g., rename) before proceeding.
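
The rule is simple enough to check against an exported list of groups ahead of time. The Python sketch below restates it and tallies the results; it is illustrative only and not the plugin's code. The type code "S" for Standard groups is the one used in this README, while the other type code in the sample and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

STANDARD = "S"  # Standard group type


def invalid_reason(name: str, group_type: str) -> Optional[str]:
    """Return why a group name is invalid, or None when it is valid."""
    if group_type != STANDARD:
        return None  # the rule applies to Standard groups only
    if ":" in name:
        return "contains ':'"
    if name.strip().lower() == "co":
        return "equals 'CO'"
    return None


def tally(groups: Iterable[Tuple[str, str]]) -> Counter:
    """Count (name, type) pairs per reason; valid groups are counted under 'valid'."""
    counts: Counter = Counter()
    for name, group_type in groups:
        counts[invalid_reason(name, group_type) or "valid"] += 1
    return counts


if __name__ == "__main__":
    sample = [("CO:admins", "S"), (" co ", "S"), ("Researchers", "S"), ("a:b", "M")]
    print(tally(sample))
```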

Run this health check to see how many groups are affected and where action is needed:

```bash
bin/cake transmogrify --groups-health
```

You’ll see a fixed‑width table with Reason, Included/Excluded counts, and an Indicator (✓ valid/eligible, x invalid/action required). Reasons are:
- Invalid: Standard group name contains “:” (excluded)
  - Standard groups (type S) whose name includes a colon are considered invalid by default and require renaming before migration.

- Invalid: Standard group name equals “CO” (excluded)
  - Standard groups (type S) named exactly “CO” (case‑insensitive, trimmed) are considered invalid and require renaming.

- Valid: Does not violate the naming rule (included)
  - All other groups: either not Standard type, or Standard whose name does not contain “:” and is not exactly “CO”.

Totals summarize overall readiness:
- Invalid (total): total number of invalid Standard groups (sum of the invalid reasons).
- Valid (total): total number of groups eligible to migrate without renaming.
- Total Groups: grand total of groups evaluated (valid + invalid).

Use this report to identify Standard groups that must be renamed to comply with the rule. After remediation, re‑run the health check and verify that Invalid (total) is 0 and all groups you intend to migrate appear under Valid.

Optional remediation helper (opt‑in): colon replacement
- By default, Transmogrify does not change group names and will error on invalid Standard names.
- You can opt in to replace “:” in Standard group names with a safer character or string during migration. The special name “CO” remains invalid and is not auto‑renamed.

Example: to replace ":" with "-", use the shorthand flag below, because passing a lone "-" as an option value is problematic on the command line:

```bash
bin/cake transmogrify --groups-colon-replacement-dash
```

For any other replacement character or string, use the full option and pass the value:

```bash
bin/cake transmogrify --groups-colon-replacement '@'
```
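
To preview what the opt‑in replacement would do to existing names before running the migration, something like the following Python sketch can be applied to an exported list of groups. It only mirrors the behavior described above (Standard groups only, "CO" left untouched) and is not the plugin's implementation; `remediate_name` and `is_still_invalid` are hypothetical names.

```python
from typing import Optional

STANDARD = "S"  # Standard group type


def remediate_name(name: str, group_type: str, replacement: Optional[str] = None) -> str:
    """Replace ':' in Standard group names when a replacement string was opted into."""
    if group_type == STANDARD and replacement is not None:
        return name.replace(":", replacement)
    return name  # default: names are never changed


def is_still_invalid(name: str, group_type: str) -> bool:
    """Re-apply the naming rule after remediation."""
    if group_type != STANDARD:
        return False
    return ":" in name or name.strip().lower() == "co"


if __name__ == "__main__":
    for original in ("dept:physics", "CO"):
        renamed = remediate_name(original, STANDARD, "-")
        print(original, "->", renamed, "| still invalid:", is_still_invalid(renamed, STANDARD))
```

A group named "CO" still fails the check after replacement, which is why it has to be renamed by hand.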
7 changes: 7 additions & 0 deletions app/plugins/Transmogrify/config/plugin.json
@@ -0,0 +1,7 @@
{
  "name": "Transmogrify",
  "version": "1.0.0",
  "description": "Data migration command to transmogrify data from a source database into the target schema.",
  "types": {},
  "schema": null
}