TAP Identity Match Protocol
v1.0.0 March 2021
Copyright © 2021 University Corporation for Advanced Internet Development, Inc.
About Identity Match
Identity Match refers to the task of determining if a Person or Subject presented by a System of Record ("SoR") is already known to the Identity Management System ("IdMS"). The goal of an Identity Match request is to obtain an identifier that uniquely identifies the subject. This identifier is referred to as a Reference Identifier.
There are several characteristics of an Identity Match implementation.
Coordinated vs Independent Attributes
In a coordinated implementation, all Systems of Record agree to a single set of "golden" attributes, and the Identity Match component is authoritative for these attributes. When an SoR presents attributes for matching, the Identity Match component matches against the single, golden set of attributes.
In an independent implementation, each System of Record is authoritative for all of its own attributes. When an SoR presents attributes for matching, the Identity Match component may match against a canonical representation of these attributes as managed by the IdMS, but it may also match against each SoR’s representation of these attributes.
Synchronous vs Asynchronous Match Resolution
When a partial/potential match occurs, the client may be able to resolve the match synchronously by presenting information to the data entry personnel and submitting a forced resolution request.
If the client is not able to do so, then the match is resolved asynchronously, usually by a notification going to a match administrator, who can then view the potential match and resolve it.
When matching is performed via batch operations, match resolution must be asynchronous.
Implementation at Registry vs Standalone
Identity Match can be performed at the Person Registry or as a standalone service.
Match Before Registry vs Match At Registry
Identity Match can be performed before a record is added to the Person Registry, or at the Person Registry. In either case, the Identity Match might be performed by the Person Registry, or by a standalone service.
Algorithm
The matching algorithm is generally out of scope for the Identity Match API.
About the Identity Match API
This API is used to obtain a unique reference to a Person known to the IdMS based on data known to an SoR. The Person may or may not be known to the IdMS at the time of the query, but generally the Person will be new to the SoR.
This API is exposed by the Identity Match component. The Registry component may also expose this API, either because it implements Identity Match natively or because it brokers requests to the Identity Match component.
All services described in this document are mandatory unless otherwise stated.
Important
Identifiers used by the Match API may be returned in either string or number notation. This includes the match Reference Identifier, the match request identifier, and any local identifiers.
Resources
The Identity Match API operates with the primary resource being People. The goal of a client is to obtain a Reference Identifier for a person.
That said, there is nothing about the API that requires the subjects of match requests to be actual people. The same concepts can be applied to other types of subjects that may need to be matched.
Although not required, it is strongly recommended that HTTPS be used for all transactions.
Attributes
Two types of attributes are used by the Identity Match protocol: Core Schema Attributes (which are largely used to transfer information about the subject to be matched) and Match API Attributes (which are defined by the protocol).
Match API Attributes
The Identity Match API defines several attributes to manage match requests and responses.
confidence
A representation of the Match Engine’s confidence of a result, from 0 (lowest confidence) through 100 (maximum confidence).
Type | integer
explanation
A human readable explanation generated by the Match Engine as to why a specific result was generated, intended to provide guidance to an administrator trying to resolve a potential match.
Type | string
matchRequest
A transaction identifier assigned by the Match Engine to represent a specific match request action.
Type | string
referenceId
The Reference Identifier is the unique identifier assigned by the Match Engine to each individual subject identified by the match algorithm. It is equivalent to the TAP Core Schema identifier attribute with type reference, but this simplified representation is suitable in certain contexts where a structured representation does not make sense.
Plural | referenceIds
Type | string
sorId
The SoR ID is the unique identifier used within the SoR for the candidate subject. It is equivalent to the TAP Core Schema identifier attribute with type sor or sor-label, but this simplified representation is suitable in certain contexts where a structured representation does not make sense.
Plural | sorIds
Type | string
sorLabel
The SoR Label is a short string identifying the calling SoR, such as sis or hrms, and will typically be assigned by the Match Engine. (The label will most likely be conveyed out of band to the SoR.)
Plural | sorLabels
Type | string
Note
Reference Identifiers, SoR Labels, and SoR IDs should only be constructed from RFC 3986 Unreserved Characters. If other characters must be used, they must be percent-encoded when used in URLs.
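The percent-encoding rule above can be sketched in Python as follows. This is a minimal illustration, not part of the protocol: the helper name people_path is hypothetical, and the /v1/people/ prefix is taken from the request examples later in this document.

```python
from urllib.parse import quote


def people_path(sor_label: str, sor_id: str) -> str:
    """Return /v1/people/<sorLabel>/<sorId> with each segment percent-encoded."""
    # safe="" leaves only RFC 3986 unreserved characters (letters, digits,
    # "-", ".", "_", "~") unescaped, so "/" and spaces are encoded as well.
    return "/v1/people/" + quote(sor_label, safe="") + "/" + quote(sor_id, safe="")


# Identifiers built only from unreserved characters pass through unchanged.
print(people_path("sis", "971194843"))
# Any other character is percent-encoded before it reaches the URL.
print(people_path("sis", "A330/200 x"))
```

Encoding each path segment separately keeps a "/" inside an SoR ID from being read as a path separator.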
Warning
If the SoR ID is sensitive (for example, if it is based on the Social Security Number), placing it in the URL may not be ideal. In such cases, use of a salted hash or an alternate identifier is recommended.
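One possible way to derive such a stand-in identifier is sketched below. The protocol does not prescribe a hashing scheme; this sketch assumes HMAC-SHA256 keyed with a secret salt (a keyed hash rather than a plain salted hash), and the function name and salt value are hypothetical.

```python
import hashlib
import hmac


def opaque_sor_id(raw_sor_id: str, secret_salt: bytes) -> str:
    """Derive a URL-safe stand-in for a sensitive SoR ID (illustrative only)."""
    # HMAC-SHA256 keyed with the salt; the hex digest contains only [0-9a-f],
    # which are all RFC 3986 unreserved characters.
    digest = hmac.new(secret_salt, raw_sor_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# The salt below is a placeholder; a real deployment would keep it secret.
hashed = opaque_sor_id("123-45-6789", b"example-salt-not-for-production")
print(len(hashed))
```

The same raw ID and salt always produce the same value, so the SoR can keep using the derived identifier as its SoR ID across requests.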
Core Schema Attributes
The Identity Match API uses the Core Schema JSON Representation (CS-JSON) as the basis for transportation of attributes. By default, all attributes support multiple values in transport, and therefore the Complex Attribute With Multiple Values or Simple Attribute With Multiple Values representations are used unless otherwise noted. The exceptions are:
- dateOfBirth
- gender
- primaryAffiliation
- primaryCampus
- pronouns
- test
These exceptions are represented using the Simple Attribute format.
Any attributes defined in the Core Schema are permitted in Identity Match requests. However, the following list is likely to represent the most common attributes:
- identifier/sor (sorId, provided in the URL)
- address
- affiliation
- dateOfBirth
- emailAddress
- gender
- identifier/national (may be hashed)
- identifier/network
- identifier/enterprise
- name
- telephoneNumber
The specific attributes in use are determined by the implementation.
Reference Identifier Request
The objective of a Reference Identifier Request is to obtain a unique reference identifier from the Identity Match component in order to canonically identify the person presented by the SoR. Reference Identifiers are conveyed as JSON strings.
Important
The Reference Identifier returned to an SoR may match an identifier previously known to the SoR. For example, if an employee returns after an absence of several years, the IdMS may have the original Reference Identifier whereas the HRMS may have purged its copy.
Standard Request
The Standard Request involves sending a bundle of System of Record attributes to an endpoint constructed using the SoR Label and SoR ID. Several responses are possible, depending on the configuration of the Match Engine. These responses are described below.
Request Method | PUT
Request Endpoint | /v1/people/<sorLabel>/<sorId>
Request Body |
Response Codes |
Response Body |
PUT /v1/people/sis/971194843
{
  "sorAttributes": {
    "names":[
      { "type":"official", "given":"Pat", "family":"Lee" }
    ],
    "dateOfBirth":"1983-03-18",
    "identifiers":[
      { "type":"national", "identifier":"3B902AE12DF55196" }
    ],
    "telephoneNumbers":[
      { "type":"mobile", "number":"8185551234" }
    ]
  }
}
Search Only Request
A search only request is similar to a standard request, except that a new identity will never be created as a result of the request. Note that a search only request does not imply that the requesting SoR intends to add a role record for an identity, and so is not an instruction for the match engine to change any state. That is, this request is read-only from the perspective of the match server.
The search only request uses POST rather than PUT, but is otherwise identical to the standard request. GET is not used, in order to avoid embedding sensitive information in the URL.
Request Method | POST
Request Endpoint | /v1/people/<sorLabel>/<sorId>
Request Body |
Response Codes |
Response Body |
POST /v1/people/sis/971194843
{
  "sorAttributes": {
    "names":[
      { "type":"official", "given":"Pat", "family":"Lee" }
    ],
    "dateOfBirth":"1983-03-18",
    "identifiers":[
      { "type":"national", "identifier":"3B902AE12DF55196" }
    ],
    "telephoneNumbers":[
      { "type":"mobile", "number":"8185551234" }
    ]
  }
}
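Because the Standard and Search Only requests differ only in the HTTP method, a client can build both from the same inputs. The following Python sketch shows one way to do that; the function name build_match_request is hypothetical, and the sketch only assembles the request parts without sending them.

```python
import json


def build_match_request(sor_label, sor_id, sor_attributes, search_only=False):
    """Return (method, path, body) for a Standard or Search Only Request."""
    # POST never creates a new identity; PUT may.
    method = "POST" if search_only else "PUT"
    # Assumes sor_label and sor_id are already URL-safe (unreserved characters).
    path = f"/v1/people/{sor_label}/{sor_id}"
    body = json.dumps({"sorAttributes": sor_attributes})
    return method, path, body


attributes = {
    "names": [{"type": "official", "given": "Pat", "family": "Lee"}],
    "dateOfBirth": "1983-03-18",
}
method, path, body = build_match_request("sis", "971194843", attributes, search_only=True)
print(method, path)
```

Multi-valued attributes such as names are passed as lists, while dateOfBirth uses the simple single-value form, in line with the Core Schema Attributes section.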
Response: Unique Match Found
If the request subject matches an existing identity, the server responds with an HTTP 200 OK response status code and a Unique Match Found Response object.
Implementations may optionally return additional identifiers in a successful response. Coordinated implementations may further return golden record attributes in a successful response.
200 OK
{ "referenceId": "M225127891" }

200 OK
{
  "referenceId":"M225127891",
  "identifiers":[
    { "type":"network", "identifier":"pl388" },
    { "type":"enterprise", "identifier":"905003148" }
  ]
}

200 OK
{
  "referenceId":"M225127891",
  "identifiers":[
    { "type":"network", "identifier":"pl388" },
    { "type":"enterprise", "identifier":"905003148" }
  ],
  "golden":{
    "names":[
      { "type":"official", "given":"Patricia", "family":"Lee" }
    ],
    "dateOfBirth":"1983-03-18"
  }
}
Response: New Identifier Assigned
If the request subject does not match any existing identities, the server creates a new identity and responds with an HTTP 201 Created response status code. The response body also uses the same Reference Identifier Response object as for the Unique Match Found response.
From the client perspective, the Unique Match Found and New Identifier Assigned responses are functionally the same, except that New Identifier Assigned cannot be returned for a Search Only Request.
201 Created { "referenceId": "M225127891" }
Response: No Match Found
If the request subject does not match any existing identities and the response is for a Search Only Request, the server responds with an HTTP 404 Not Found status code. There is no message body.
404 Not Found
Response: Potential Match Found
If the Identity Match engine cannot canonically determine either that the request corresponds to a single existing record, or that it does not match any existing record, a Potential Match situation results. If the client is capable of interactively selecting from potential matches, the server responds with an HTTP 300 Multiple Choices response status code, and a response body as described below.
This response may include as few as one candidate. Implementation of this response is optional if the Externally Handled response is implemented. (See also the Forced Reconciliation Request, below.)
The response may include a confidence value indicating the Match Engine’s certainty in a given candidate. Although optional, it is strongly recommended to implement this attribute to provide assistance to the administrator trying to resolve the potential match.
Attributes may be returned to facilitate the selection of a candidate. The response may include attributes that the Match Engine does not use to perform matching. The specific set of attributes returned may vary by local implementation.
To indicate a new person, the Match Engine may create a provisional record returned in the Potential Matches response, with a suitable referenceId assigned. Alternatively, the referenceId may be set to new. Either way, the original data submitted is provided as a candidate with confidence omitted.
The Identity Match engine may elect to assign a Match Request ID to the transaction that generated the Potential Matches response. The Match Request ID is optional. If assigned, it is the same identifier as used to retrieve Pending Matches, and may be used in that context. If assigned, the Match Request ID must be provided when making a Forced Reconciliation Request to resolve the Potential Matches response.
300 Multiple Choices
{
  "matchRequest":"1009",
  "candidates": [
    {
      "confidence":"85",
      "referenceId":"M219488003",
      "explanation":"Family name exact match, given name initial match",
      "attributes": [
        {
          "sor":"HRMS",
          "record":{
            "identifiers":[
              { "type":"sor", "identifier":"089010023" },
              { "type":"network", "identifier":"pl292" }
            ],
            "names":[ { "type":"official", "given":"Patricia", "family":"Lee" } ],
            "ou":"Biomedical Informatics"
          }
        },
        {
          "sor":"Alumni",
          "record":{
            "identifiers":[
              { "type":"sor", "identifier":"A330-200" },
              { "type":"network", "identifier":"pl292" }
            ],
            "names":[ { "type":"official", "given":"Patricia", "family":"Lee" } ],
            "ou":"Class of 1997"
          }
        }
      ],
      "identifiers":[
        { "type":"network", "identifier":"pl292" },
        { "type":"enterprise", "identifier":"905008772" }
      ]
    },
    {
      "confidence":"71",
      "referenceId":"M523441767",
      "attributes": [
        {
          "sor":"guest",
          "record": {
            "identifiers":[
              { "type":"sor", "identifier":"pl388" },
              { "type":"network", "identifier":"pl388" }
            ],
            "names":[ { "type":"official", "given":"Patricia", "family":"Lee" } ],
            "telephoneNumbers":[ { "type":"mobile", "number":"8185551234" } ]
          }
        }
      ],
      "identifiers":[
        { "type":"network", "identifier":"pl388" },
        { "type":"enterprise", "identifier":"905003148" }
      ]
    },
    {
      "referenceId":"new",
      "attributes": [
        {
          "sor":"SIS",
          "identifiers":[ { "type":"sor", "identifier":"971194843" } ],
          "names":[ { "type":"official", "given":"Pat", "family":"Lee" } ],
          "telephoneNumbers":[ { "type":"mobile", "number":"8185551234" } ]
        }
      ]
    }
  ]
}

300 Multiple Choices
{
  "matchRequest":"1009",
  "candidates": [
    {
      "confidence":"85",
      "referenceId":"M219488003",
      "golden":{
        "names":[ { "type":"official", "given":"Patricia", "family":"Lee" } ],
        "dateOfBirth":"1983-03-18"
      },
      "attributes": [ ... ]
    }
  ]
}

300 Multiple Choices
{
  "matchRequest":"1009",
  "candidates": [
    {
      "confidence":"85",
      "referenceId":"M219488003",
      "explanation":"Family name exact match, given name initial match",
      "attributes": [ ... ]
    }
  ]
}
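A client that presents candidates to data entry personnel will likely want them ordered by confidence. The following Python sketch shows one way to do this; the function name rank_candidates is hypothetical, and the sketch assumes confidence may arrive as either a JSON number or a numeric string, as in the examples above.

```python
def rank_candidates(response_body: dict) -> list:
    """Order candidates by descending confidence; those without one go last."""

    def sort_key(candidate):
        confidence = candidate.get("confidence")
        # The "new" (or provisional) candidate omits confidence, so sort it last.
        if confidence is None:
            return (1, 0)
        # int() accepts both 85 and "85".
        return (0, -int(confidence))

    return sorted(response_body.get("candidates", []), key=sort_key)


body = {
    "matchRequest": "1009",
    "candidates": [
        {"referenceId": "new"},
        {"confidence": "71", "referenceId": "M523441767"},
        {"confidence": "85", "referenceId": "M219488003"},
    ],
}
for candidate in rank_candidates(body):
    print(candidate["referenceId"], candidate.get("confidence"))
```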
Response: Potential Match Found (Externally Handled)
Some configurations may not support an interactive transaction over the API to resolve a conflict. In this scenario, the Identity Match engine will respond with an HTTP 202 Accepted response code. An optional Potential Match (External) Response JSON object may be returned.
It is expected that a request will be forwarded to a Reconciliation Manager, an administrator whose responsibilities include reviewing potential matches (probably via email with a URL for resolution). The engine is responsible for maintaining enough state to be able to handle a manual reconciliation at a later (but not too much later) time.
This response may not be returned for a Search Only request.
See also Console Support Operations, below.
202 Accepted { "matchRequest":"1009" }
Response: Error With Request
If the Match Engine encounters an error due to the client request, such as failing to submit required attributes or submitting invalid values for attributes, the server responds with an HTTP 400 Bad Request response code. An optional Error Response JSON object may be returned.
400 Bad Request { "error":"Required field foo not provided" }
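The responses to a Reference Identifier Request described above can be dispatched on the HTTP status code. The Python sketch below is one possible client-side mapping; the enum and function names are hypothetical, and status codes not described in this section are simply rejected.

```python
from enum import Enum


class MatchOutcome(Enum):
    UNIQUE_MATCH = 200        # Unique Match Found
    NEW_IDENTIFIER = 201      # New Identifier Assigned
    EXTERNALLY_HANDLED = 202  # Potential Match Found (Externally Handled)
    POTENTIAL_MATCH = 300     # Potential Match Found
    ERROR_WITH_REQUEST = 400  # Error With Request
    NO_MATCH = 404            # No Match Found (Search Only Request)


def handle_response(status_code, body=None):
    """Return (outcome, reference_id); reference_id is None unless resolved."""
    outcome = MatchOutcome(status_code)  # raises ValueError for other codes
    reference_id = None
    if outcome in (MatchOutcome.UNIQUE_MATCH, MatchOutcome.NEW_IDENTIFIER):
        # Identifiers may arrive in string or number notation; normalize to str.
        reference_id = str(body["referenceId"])
    return outcome, reference_id


print(handle_response(200, {"referenceId": "M225127891"}))
print(handle_response(404))
```

Treating 200 and 201 identically reflects the statement that the two responses are functionally the same from the client perspective.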
Forced Reconciliation Request
After a Potential Matches Found 300 Multiple Choices response, the client must submit a modified request with an Identity Match Reference Identifier to link to an existing record, or an indication that a new person should be created. This request is substantially similar to the Standard Request, but with additional attributes to indicate the resolution.
The client must include the referenceId associated with the desired target record, or the value new to indicate that a new person has been identified (and a new Reference Identifier should be assigned).
If the Potential Match Response included a Match Request Identifier, the client must provide this identifier in the Forced Reconciliation Request.
This request should result in a Unique Match Found 200 OK or New Identifier Assigned 201 Created response. Only candidates listed in the 300 response should be included in the Forced Reconciliation Request; however, it is up to an individual implementation whether to accept an Identity Match Reference Identifier provided in a given Forced Reconciliation Request.
The Identity Match Engine may indicate that a forced reconciliation request is being attempted using out-of-date information by returning an HTTP 409 Conflict response code, or that the Match Request Identifier is invalid by returning an HTTP 404 Not Found response code.
Request Method | PUT
Request Endpoint | /v1/people/<sorLabel>/<sorId>
Request Body |
Response Codes |
Response Body |
PUT /v1/people/sis/971194843
{
  "matchRequest":"1009",
  "sorAttributes": { ... },
  "referenceId":"M523441767"
}
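The rules for the Forced Reconciliation Request body can be sketched in Python as follows. The function name is hypothetical; the sketch only assembles the JSON body and includes matchRequest only when the Potential Match response supplied one.

```python
import json


def build_forced_reconciliation(sor_attributes, reference_id, match_request=None):
    """Return the JSON body for a Forced Reconciliation Request."""
    # reference_id is either a candidate's referenceId or the literal "new".
    body = {"sorAttributes": sor_attributes, "referenceId": reference_id}
    # The Match Request ID is optional in the 300 response, but must be
    # echoed back if it was assigned.
    if match_request is not None:
        body["matchRequest"] = match_request
    return json.dumps(body)


attributes = {"names": [{"type": "official", "given": "Pat", "family": "Lee"}]}
print(build_forced_reconciliation(attributes, "M523441767", match_request="1009"))
print(build_forced_reconciliation(attributes, "new"))
```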
Update Match Attributes
The Identity Match engine must be kept up to date if attributes used for matching are updated. For example, if a person’s name changes, the SoR must update the name attributes in case the person subsequently shows up via a different SoR. This can also be handled out of band, e.g., by syncing the match database with a person registry database, or by the two databases being the same.
A complete set of attributes is provided to replace the existing set. The Identity Match engine may elect to keep replaced attributes for fuzzy matching against historical records, or for other purposes.
For a coordinated implementation, the Reference Identifier must be included, as the Identity Match engine need not maintain a full set of per-SoR attributes. For an independent implementation, the Reference Identifier is optional.
Operation | Out of Band Sync | Coordinated | Independent
Update Match Request | Do Not Implement | Required, including Reference Identifier | Required (Reference Identifier optional)
Request Inventory of Requests | Optional | Required | Required
Request Current Values | Optional | Required, including Reference Identifier | Required (Reference Identifier optional)
Delete Current Values | Do Not Implement | Do Not Implement | Required
Update Match Attributes Request
This request is syntactically the same as a Standard Request; what distinguishes the two is the state on the server. If a Reference Identifier is already associated with the SoR and SoR ID, the request is treated as an Update Match Attributes Request, that is, a request to update the System of Record attributes for use in future match requests. The match engine may simply update its internal state and reply appropriately. Requesting a rematch of an already matched record is not supported via this mechanism.
If the requested SoR and SoR ID pair represents a pending match request (i.e., there is no Reference Identifier associated with the request), the Match Engine should treat the request as a Standard Match Request, but using the updated attributes. In other words, the SoR is submitting updated attributes to fix a prior request that resulted in a pending match.
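The server-side distinction described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed implementation: the dictionary known_requests, which maps an (SoR Label, SoR ID) pair to its Reference Identifier or to None while a match is pending, and the function name are both hypothetical. A PUT that carries a referenceId (a Forced Reconciliation Request) is not covered here.

```python
def classify_put(known_requests: dict, sor_label: str, sor_id: str) -> str:
    """Decide how a PUT without a referenceId should be treated."""
    key = (sor_label, sor_id)
    if key not in known_requests:
        # Never seen before: run a normal match.
        return "standard-request"
    if known_requests[key] is not None:
        # Already matched: replace stored attributes, do not rematch.
        return "update-match-attributes"
    # Known but still pending: rerun the match with the updated attributes.
    return "standard-request-with-updated-attributes"


state = {
    ("sis", "971194843"): "M523441767",  # previously matched
    ("hrms", "089010023"): None,         # pending manual resolution
}
print(classify_put(state, "sis", "971194843"))
print(classify_put(state, "hrms", "089010023"))
print(classify_put(state, "sis", "980121418"))
```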
Note
If a System of Record updates its internal SoR ID for a given Reference Identifier (e.g., due to a merge or split of the SoR’s record), the SoR should send a Forced Reconciliation Request with the already known Reference Identifier followed by a Delete Current Values request.
Request Method | PUT
Request Endpoint | /v1/people/<sorLabel>/<sorId>
Request Body |
Response Codes |
Response Body | None
PUT /v1/people/sis/971194843
{
  "sorAttributes": {
    "names":[
      { "type":"official", "given":"Patricia", "family":"Lee" }
    ],
    "dateOfBirth":"1983-03-18",
    "identifiers":[
      { "type":"national", "identifier":"3B902AE12DF55196" }
    ],
    "telephoneNumbers":[
      { "type":"mobile", "number":"8185551234" }
    ]
  }
}
200 OK
Request Inventory of Requests
Systems of Record may request an inventory of all submitted requests, as indexed by the System of Record Identifier.
Request Method | GET
Request Endpoint | /v1/people/<sorLabel>
Request Body | none
Response Codes |
Response Body |
GET /v1/people/sis
200 OK
{
  "sorids": [
    "971194843",
    "980121418",
    "937762041",
    "915233810"
  ]
}
Request Current Values
The current attributes for a given SoR ID as known to the Match Engine may be retrieved by the client. Note that if a match has previously been made, the referenceId will be included in the response. If the request is pending manual resolution, the referenceId attribute will be omitted. The Match Engine may elect to return former values for attributes, so long as they are annotated with appropriate metadata (such as revision).
In a coordinated implementation, this operation should return the attributes as provided by the specified SoR, not the golden attributes. To obtain the golden attributes, use a Reference Identifier Obtain SOR Records request. If the SOR attributes are not tracked, then the sorAttributes entry should be omitted.
Request Method | GET
Request Endpoint | /v1/people/<sorLabel>/<sorId>
Request Body | none
Response Codes |
Response Body |
GET /v1/people/sis/971194843
200 OK
{
  "meta": {
    "requestTime":"2013-06-08T11:23:37Z",
    "resolutionTime":"2013-06-10T11:23:37Z",
    "referenceId":"M523441767"
  },
  "sorAttributes": {
    "names":[
      { "type":"official", "given":"Patricia", "family":"Lee" }
    ],
    "dateOfBirth":"1983-03-18",
    "identifiers":[
      { "type":"national", "identifier":"3B902AE12DF55196" }
    ],
    "telephoneNumbers":[
      { "type":"mobile", "number":"8185551234" }
    ]
  }
}
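A client can tell a resolved record from a pending one by checking whether referenceId is present. The Python sketch below assumes, following the example above, that referenceId is carried inside the meta object; the function name is hypothetical.

```python
import json


def resolution_status(current_values_json: str):
    """Return ("matched", referenceId) or ("pending", None)."""
    document = json.loads(current_values_json)
    # referenceId is omitted while the request awaits manual resolution.
    reference_id = document.get("meta", {}).get("referenceId")
    if reference_id is None:
        return ("pending", None)
    # Identifiers may be in string or number notation; normalize to str.
    return ("matched", str(reference_id))


resolved = '{"meta": {"requestTime": "2013-06-08T11:23:37Z", "referenceId": "M523441767"}}'
pending = '{"meta": {"requestTime": "2013-06-08T11:23:37Z"}}'
print(resolution_status(resolved))
print(resolution_status(pending))
```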
Delete Current Values
The client may request that the SoR attributes for a subject be deleted. Typically, this should only be done when a record was added erroneously, as the Match Engine will perform better if it has access to historical records.
Request Method | DELETE
Request Endpoint | /v1/people/<sorLabel>/<sorId>
Request Body | none
Response Codes |
Response Body | none
DELETE /v1/people/sis/971194843
200 OK
Pending Matches
In order to support an Identity Console or independent front end, certain additional resources are needed. These operations are mandatory, in order to support decoupling of services (in this case the Identity Match engine from the Identity Console). However, where a given implementation implements interactive, stateless resolution (i.e., it returns 300 Multiple Choices and not 202 Accepted for potential matches), the concept of "pending matches" may not apply, and these operations may effectively be no-ops.
Note that resolving a pending reconciliation request is handled via the Forced Reconciliation Request, described above.
Request Pending Matches
The client may request an inventory of all pending matches requiring administrator review.
The Match Engine should indicate no available Pending Matches via 200 OK and an empty set of matchRequests.
Request Method | GET
Request Endpoint | /v1/matchRequests?status=pending
Request Body | none
Response Codes |
Response Body |
GET /v1/matchRequests?status=pending
200 OK
{
  "matchRequests": {
    "1009": {
      "attributes": {
        "sor":"SIS",
        "identifiers":[ { "type":"sor", "identifier":"971194843" } ],
        "names":[ { "type":"official", "given":"Pat", "family":"Lee" } ],
        "telephoneNumbers":[ { "type":"mobile", "number":"8185551234" } ]
      },
      "requestTime":"2013-06-08T11:23:37Z"
    },
    "1014": {
      "attributes": {
        "sor":"HRMS",
        "sorId":"914890374",
        "identifiers":[
          { "type":"sor", "identifier":"089010023" },
          { "type":"network", "identifier":"p5478" }
        ],
        "names":[ { "type":"official", "given":"Richard", "family":"Hess" } ],
        "ou":"Biology"
      },
      "requestTime":"2013-06-08T11:23:37Z"
    }
  }
}
Request Resolved Matches
Similarly, the client may request an inventory of all resolved matches.
Request Method |
|
Request Endpoint |
|
Request Body |
none |
Response Codes |
|
Response Body |
|
Request Pending Match
It may be desirable for the response to include all known SOR data, if the match engine has access to it. This would facilitate the reconciliation administrator in determining how to resolve a pending match. Such functionality is optional, and could also be provided by having the Identity Console query the Identity Registry.
If the match request is still pending, the server responds with a 300 Multiple Choices. If the match request has been resolved, the server responds with a 200 OK.
Request Method |
|
Request Endpoint | /v1/matchRequests/<matchRequest>
Request Body |
none |
Response Codes |
|
Response Body |
|
Reference Identifiers
It may sometimes be necessary to manually relink records, perhaps due to bad data presented at the initial match request. Directly manipulating reference identifier resources can be used to accomplish this.
The Identity Match engine may wish to restrict access to these requests to specific administrative clients.
Obtain SOR Records
All SOR records attached to a specific reference identifier may be retrieved.
A coordinated implementation that does not track individual SOR records must return 501 Not Implemented.
Request Method |
|
Request Endpoint | /v1/matchRequests?referenceId=<referenceId>
Request Body |
none |
Response Codes |
|
Response Body |
|
Join Reference Identifiers
Under certain circumstances, the Identity Match engine may not have enough information to link a new record with an existing reference identifier, resulting in an individual having two (or more) reference identifiers. The reference identifiers can later be joined together using this request. The active reference identifier is provided in the resource URL and the deprecated reference identifiers in the request body.
If supported by the engine, the identifiers to maintain as primary may also be specified.
In a coordinated implementation, the match engine may also accept a set of attributes to keep in the golden record.
Request Method | PUT
Request Endpoint | /v1/referenceIds/<referenceId>
Request Body |
Response Codes |
Response Body | none
PUT /v1/referenceIds/M523441767
{
  "referenceIds":[ "M787800232", "M350100023" ],
  "identifiers":[
    { "type":"network", "identifier":"pl388" },
    { "type":"enterprise", "identifier":"905003148" }
  ]
}
Reassign Reference Identifier
If the Match Engine incorrectly assigned a reference identifier to a record, it is possible for an administrator to specify a replacement reference identifier for that record. This is very similar to a Forced Reconciliation Request, except that sorAttributes are not specified.
The reference identifier can be set to new to indicate that a new identifier should be assigned.
Request Method | PUT
Request Endpoint | /v1/people/<sorLabel>/<sorId>
Request Body |
Response Codes |
Response Body |
PUT /v1/people/sis/971194843
{ "referenceId":"M523441767" }
References
- RFC 3986 Uniform Resource Identifier (URI): Generic Syntax
Changelog
v1.0.0
- Initial release.
- Changes from the ID Match API Strawman:
  - Removed GET option from Search Only Request.
  - Potential Match Found candidate attributes moved to record.
  - Response format for Current Values Request updated to include metadata.
  - Clarified Request Pending Match response.
  - Removed options for returning identifiers and golden attributes in Obtain SOR Records response.