# Metadata Transforms

XSLT transformations of SAML metadata

## Installation

The scripts in this repository depend on a [Bash Library](https://github.internet2.edu/InCommon/bash-library) of basic scripts. Download and install the latter before continuing.

Download the source, change directory to the source directory, and install the source into `/tmp` as follows:

```Shell
$ export BIN_DIR=/tmp/bin
$ export LIB_DIR=/tmp/lib
$ ./install.sh $BIN_DIR $LIB_DIR
```

or install into your home directory:

```Shell
$ export BIN_DIR=$HOME/bin
$ export LIB_DIR=$HOME/lib
$ ./install.sh $BIN_DIR $LIB_DIR
```

An installation directory will be created if one doesn't already exist. In any case, the following files will be installed:

```Shell
$ ls -1 $BIN_DIR
http_xsltproc.sh
process_export_aggregate.sh
process_main_aggregate.sh
$ ls -1 $LIB_DIR
list_all_IdP_DisplayNames_csv.xsl
list_all_IdPs_csv.xsl
list_all_RandS_IdPs_csv.xsl
list_all_RandS_SPs_csv.xsl
list_all_SPs_csv.xsl
security_contacts_legacy_list_csv.xsl
security_contacts_summary_json.xsl
security_contacts_summary_local_json.xsl
```
## Overview

The Bash script `http_xsltproc.sh` is a wrapper around the `xsltproc` command-line tool. Unlike `xsltproc`, the `http_xsltproc.sh` script fetches the target XML document from an HTTP server using HTTP Conditional GET [RFC 7232]. If the server responds with 200, the script caches the resource and returns the response body. If the server responds with 304, the script returns the cached resource instead. See the script's built-in help for details:

```Shell
$ $BIN_DIR/http_xsltproc.sh -h
```
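
The 200/304 handling described above can be sketched roughly as follows. This is not the actual implementation of `http_xsltproc.sh`. The function names `fetch_status` and `select_body` are hypothetical, and the sketch assumes that `curl` is available and that freshness is checked with `If-Modified-Since` (via `curl --time-cond`); the real script may use a different mechanism, such as entity tags.

```shell
#!/bin/bash
# Hypothetical sketch of the 200/304 logic; not taken from http_xsltproc.sh.

# fetch_status URL CACHE_FILE BODY_FILE
# Issues a (conditional) GET with curl and prints the HTTP status code.
# If CACHE_FILE exists, its modification time is sent as If-Modified-Since.
fetch_status () {
    local url=$1 cache_file=$2 body_file=$3
    if [ -f "$cache_file" ]; then
        curl --silent --output "$body_file" --write-out '%{http_code}' \
             --time-cond "$cache_file" "$url"
    else
        curl --silent --output "$body_file" --write-out '%{http_code}' "$url"
    fi
}

# select_body STATUS CACHE_FILE BODY_FILE
# Prints the document to stdout: the fresh body on 200 (after caching it),
# the cached copy on 304. Returns 2 on any other status.
select_body () {
    local status=$1 cache_file=$2 body_file=$3
    case $status in
        200) cp "$body_file" "$cache_file" && cat "$cache_file" ;;
        304) cat "$cache_file" ;;
        *)   echo "ERROR: unexpected HTTP status: $status" >&2; return 2 ;;
    esac
}

# Possible usage (assumed file locations):
#   status=$(fetch_status "$xml_location" "$CACHE_DIR/md.xml" /tmp/body.xml)
#   select_body "$status" "$CACHE_DIR/md.xml" /tmp/body.xml | xsltproc "$xsl_file" -
```
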
The `http_xsltproc.sh` script requires two environment variables: `CACHE_DIR` is the absolute path to the cache directory (which may or may not exist), and `LIB_DIR` specifies a directory containing various helper scripts.

For example, reuse the `LIB_DIR` exported in the previous section and specify the cache directory as follows:

```Shell
$ export CACHE_DIR=/tmp/cache
```
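
A job that calls the script could verify both variables up front. The `check_env` function below is a hypothetical helper, not part of this repository. It only checks the two properties stated above: that `CACHE_DIR` is an absolute path and that `LIB_DIR` is an existing directory.

```shell
#!/bin/bash
# check_env: hypothetical preflight check (not part of this repository).
# Returns 0 if CACHE_DIR and LIB_DIR look usable, 1 otherwise.
check_env () {
    # CACHE_DIR must be set and start with a slash (absolute path);
    # the directory itself is not required to exist yet
    case ${CACHE_DIR:-} in
        /*) ;;
        *)  echo "ERROR: CACHE_DIR must be set to an absolute path" >&2
            return 1 ;;
    esac
    # LIB_DIR must be set and refer to an existing directory
    if [ -z "${LIB_DIR:-}" ] || [ ! -d "$LIB_DIR" ]; then
        echo "ERROR: LIB_DIR must be set to an existing directory" >&2
        return 1
    fi
    return 0
}
```
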
The following examples show how to use the script to create some cron jobs on incommon.org.

### Example #1

The goal is to transform InCommon metadata into the following CSV file:

* https://incommon.org/federation/metadata/all_IdP_DisplayNames.csv

The above resource is used to construct a [List of IdP Display Names](https://spaces.internet2.edu/x/2IDmBQ) in the spaces wiki.

Suppose there is an automated process that transforms the main InCommon metadata aggregate into the CSV file at the above URL. Specifically, let's suppose the following process runs every hour on incommon.org:

```Shell
# determine the metadata location
xml_location=http://md.incommon.org/InCommon/InCommon-metadata.xml

# create the resource
xsl_file=$LIB_DIR/list_all_IdP_DisplayNames_csv.xsl
resource_file=/tmp/all_IdP_DisplayNames.csv
"$BIN_DIR/http_xsltproc.sh" -F -o "$resource_file" "$xsl_file" "$xml_location"
exit_code=$?
[ "$exit_code" -eq 1 ] && exit 0  # short-circuit if 304 response
if [ "$exit_code" -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit "$exit_code"
fi

# move the resource to the web directory
resource_dir=/home/htdocs/www.incommonfederation.org/federation/metadata/
/bin/mv "$resource_file" "$resource_dir"
```

Observe that the command `http_xsltproc.sh -F` forces a fresh SAML metadata file. If the server responds with `304 Not Modified`, the script exits with status 1 and the process terminates without updating the resource file.
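
The exit-status convention used in the process above (0 on success, 1 when there is nothing new to process, greater than 1 on error) can be factored into a small helper. This is only a sketch: the function name `stop_unless_ok` is hypothetical and is not part of this repository.

```shell
#!/bin/bash
# stop_unless_ok EXIT_CODE
# Hypothetical helper: returns 0 if processing should continue,
# otherwise terminates the calling script.
stop_unless_ok () {
    local exit_code=$1
    [ "$exit_code" -eq 0 ] && return 0
    [ "$exit_code" -eq 1 ] && exit 0  # quiet stop (e.g., 304 response)
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit "$exit_code"
}

# Possible usage inside a job script:
#   "$BIN_DIR/http_xsltproc.sh" -F -o "$resource_file" "$xsl_file" "$xml_location"
#   stop_unless_ok $?
```

Because the helper calls `exit`, it ends the whole job when called directly, which mirrors the inline checks in the example.
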
### Example #2

The goal is to transform InCommon metadata into the following pair of CSV files:

* https://incommon.org/federation/metadata/all_RandS_IdPs.csv
* https://incommon.org/federation/metadata/all_RandS_SPs.csv

The above resources are used to construct the [List of Research and Scholarship Entities](https://spaces.internet2.edu/x/ZoUABg) in the spaces wiki.

Suppose there is an automated process that transforms the main InCommon metadata aggregate into the CSV files at the above URLs. Specifically, let's suppose the following process runs every hour on incommon.org:

```Shell
# determine the metadata location
xml_location=http://md.incommon.org/InCommon/InCommon-metadata.xml

# create the first resource
xsl_file=$LIB_DIR/list_all_RandS_IdPs_csv.xsl
resource1_file=/tmp/all_RandS_IdPs.csv
"$BIN_DIR/http_xsltproc.sh" -F -o "$resource1_file" "$xsl_file" "$xml_location"
exit_code=$?
[ "$exit_code" -eq 1 ] && exit 0  # short-circuit if 304 response
if [ "$exit_code" -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit "$exit_code"
fi

# create the second resource
xsl_file=$LIB_DIR/list_all_RandS_SPs_csv.xsl
resource2_file=/tmp/all_RandS_SPs.csv
"$BIN_DIR/http_xsltproc.sh" -C -o "$resource2_file" "$xsl_file" "$xml_location"
exit_code=$?
[ "$exit_code" -eq 1 ] && exit 0  # short-circuit if not cached
if [ "$exit_code" -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit "$exit_code"
fi

# move the resources to the web directory
resource_dir=/home/htdocs/www.incommonfederation.org/federation/metadata/
/bin/mv "$resource1_file" "$resource2_file" "$resource_dir"
```

Observe the commands `http_xsltproc.sh -F` and `http_xsltproc.sh -C`. The former forces a fresh SAML metadata file as in the previous example; the latter goes directly to the cache. If the file is not in the cache (which is highly unlikely, since the first command has just fetched it), the process terminates without updating any resource files.
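
The "fetch once with `-F`, then reuse the cache with `-C`" pattern generalizes to any number of stylesheets. The sketch below is an assumption-laden illustration, not code from this repository: the function `transform_all`, the `HTTP_XSLTPROC` override variable, and the output naming rule (stylesheet name with `_csv.xsl` replaced by `.csv`) are all hypothetical.

```shell
#!/bin/bash
# transform_all XML_LOCATION OUT_DIR XSL_FILE...
# Hypothetical helper: applies the first stylesheet with -F (fresh fetch)
# and every later one with -C (cache only). Stops at the first non-zero
# exit status and returns it to the caller.
transform_all () {
    local xml_location=$1 out_dir=$2 opt=-F xsl_file out_file status
    shift 2
    for xsl_file in "$@"; do
        # hypothetical naming rule: foo_csv.xsl -> OUT_DIR/foo.csv
        out_file=$out_dir/$(basename "$xsl_file" _csv.xsl).csv
        # HTTP_XSLTPROC is an assumed override, handy for dry runs
        "${HTTP_XSLTPROC:-$BIN_DIR/http_xsltproc.sh}" "$opt" \
            -o "$out_file" "$xsl_file" "$xml_location"
        status=$?
        [ "$status" -ne 0 ] && return "$status"
        opt=-C  # the document is now cached; skip the network
    done
    return 0
}
```

The caller would still interpret the returned status as in the examples: 1 means stop quietly, anything greater than 1 is an error, and the generated files are moved to the web directory only when the status is 0.
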
### Example #3

The goal is to transform InCommon metadata into the following pair of CSV files:

* https://incommon.org/federation/metadata/all_exported_IdPs.csv
* https://incommon.org/federation/metadata/all_exported_SPs.csv

The above resources are used to construct the [List of Exported Entities](https://spaces.internet2.edu/x/DYD4BQ) in the spaces wiki.

Suppose there is an automated process that transforms the InCommon export aggregate into the CSV files at the above URLs. Specifically, let's suppose the following process runs every hour on incommon.org:

```Shell
# determine the metadata location
xml_location=http://md.incommon.org/InCommon/InCommon-metadata-export.xml

# create the first resource
xsl_file=$LIB_DIR/list_all_IdPs_csv.xsl
resource1_file=/tmp/all_exported_IdPs.csv
"$BIN_DIR/http_xsltproc.sh" -F -o "$resource1_file" "$xsl_file" "$xml_location"
exit_code=$?
[ "$exit_code" -eq 1 ] && exit 0  # short-circuit if 304 response
if [ "$exit_code" -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit "$exit_code"
fi

# create the second resource
xsl_file=$LIB_DIR/list_all_SPs_csv.xsl
resource2_file=/tmp/all_exported_SPs.csv
"$BIN_DIR/http_xsltproc.sh" -C -o "$resource2_file" "$xsl_file" "$xml_location"
exit_code=$?
[ "$exit_code" -eq 1 ] && exit 0  # short-circuit if not cached
if [ "$exit_code" -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit "$exit_code"
fi

# move the resources to the web directory
resource_dir=/home/htdocs/www.incommonfederation.org/federation/metadata/
/bin/mv "$resource1_file" "$resource2_file" "$resource_dir"
```

The commands `http_xsltproc.sh -F` and `http_xsltproc.sh -C` behave exactly as described in the previous example.
## Compatibility

The executable scripts are compatible with GNU/Linux and Mac OS. The library files are written in XSLT 1.0.

## Dependencies

* [Bash Library](https://github.internet2.edu/InCommon/bash-library) |