# Metadata Transforms
XSLT transformations of SAML metadata
## Installation
The scripts in this repository depend on a [Bash Library](https://github.internet2.edu/InCommon/bash-library) of basic scripts. Download and install the latter before continuing.
Download the source, change directory to the source directory, and install the source into `/tmp` as follows:
```Shell
$ export BIN_DIR=/tmp/bin
$ export LIB_DIR=/tmp/lib
$ ./install.sh $BIN_DIR $LIB_DIR
```
or install into your home directory:
```Shell
$ export BIN_DIR=$HOME/bin
$ export LIB_DIR=$HOME/lib
$ ./install.sh $BIN_DIR $LIB_DIR
```
An installation directory will be created if one doesn't already exist. In any case, the following files will be installed:
```Shell
$ ls -1 $BIN_DIR
http_xsltproc.sh
process_export_aggregate.sh
process_main_aggregate.sh
$ ls -1 $LIB_DIR
list_all_IdP_DisplayNames_csv.xsl
list_all_IdPs_csv.xsl
list_all_RandS_IdPs_csv.xsl
list_all_RandS_SPs_csv.xsl
list_all_SPs_csv.xsl
security_contacts_legacy_list_csv.xsl
security_contacts_summary_json.xsl
security_contacts_summary_local_json.xsl
```
## Overview
Bash script `http_xsltproc.sh` is a wrapper around the `xsltproc` command-line tool. Unlike `xsltproc`, the `http_xsltproc.sh` script fetches the target XML document from an HTTP server using HTTP Conditional GET [RFC 7232]. If the server responds with `200 OK`, the script caches the resource and returns the response body. If the server responds with `304 Not Modified`, the script returns the cached copy of the resource instead. See the inline help for details:
```Shell
$ $BIN_DIR/http_xsltproc.sh -h
```
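The caching behavior can be pictured with a small pure-shell sketch. This is an illustration of the conditional-GET logic described above, not the actual implementation; `handle_response` and the cache path are made up for this example:

```Shell
# Illustration only: how a conditional-GET cache behaves.
# A 200 response refreshes the cache; a 304 response reuses it.
CACHE_DIR=${CACHE_DIR:-/tmp/cache}
mkdir -p "$CACHE_DIR"
cache_file=$CACHE_DIR/metadata.xml

handle_response() {
    # $1 = HTTP status code, $2 = response body (empty on a 304)
    case $1 in
        200) printf '%s' "$2" > "$cache_file"; cat "$cache_file" ;;
        304) cat "$cache_file" ;;
        *)   echo "unexpected status: $1" >&2; return 2 ;;
    esac
}
```

Either way, the caller sees the current representation of the resource; only a 200 response costs a full download.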
The `http_xsltproc.sh` script requires two environment variables: `CACHE_DIR`, the absolute path to the cache directory (which need not exist in advance), and `LIB_DIR`, the directory containing the XSLT stylesheets and other helper files.
For example, let's use the library installed in the previous section and specify the cache as follows:
```Shell
$ export CACHE_DIR=/tmp/cache
```
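Since the script fails without these variables, a cron script might guard against a missing environment up front. A hypothetical fail-fast helper (not part of the installed library):

```Shell
# Hypothetical guard: verify that each named environment variable is set
# and non-empty before doing any real work.
require_env() {
    for name in "$@"; do
        eval "val=\${$name:-}"
        if [ -z "$val" ]; then
            echo "ERROR: environment variable $name is not set" >&2
            return 1
        fi
    done
}

# Typical use at the top of a cron script:
# require_env BIN_DIR LIB_DIR CACHE_DIR || exit 1
```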
The following examples show how to use the script to create some cron jobs on incommon.org.
### Example #1
The goal is to transform InCommon metadata into the following CSV file:
* https://incommon.org/federation/metadata/all_IdP_DisplayNames.csv
The above resource is used to construct a [List of IdP Display Names](https://spaces.internet2.edu/x/2IDmBQ) in the spaces wiki.
Suppose there is an automated process that transforms the main InCommon metadata aggregate into the CSV file at the above URL. Specifically, let's suppose the following process runs every hour on incommon.org:
```Shell
# determine the metadata location
xml_location=http://md.incommon.org/InCommon/InCommon-metadata.xml

# create the resource
xsl_file=$LIB_DIR/list_all_IdP_DisplayNames_csv.xsl
resource_file=/tmp/all_IdP_DisplayNames.csv
"$BIN_DIR/http_xsltproc.sh" -F -o "$resource_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0  # short-circuit if 304 response
if [ $exit_code -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit $exit_code
fi

# move the resource to the web directory
resource_dir=/home/htdocs/www.incommonfederation.org/federation/metadata/
/bin/mv "$resource_file" "$resource_dir"
```
Observe that the command `http_xsltproc.sh -F` forces a fresh fetch of the SAML metadata file. If the server responds with `304 Not Modified`, the process terminates without updating the resource file.
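The exit-code handling above (1 means a 304 short-circuit, anything greater is an error) recurs in every example, so it could be factored into a small helper. A sketch, not part of the installed library:

```Shell
# Sketch of a helper that centralizes the exit-code convention used in
# this README: 1 = 304 Not Modified (exit quietly), >1 = hard error.
check_exit() {
    code=$1
    if [ "$code" -eq 1 ]; then
        exit 0   # nothing changed upstream; nothing to do
    elif [ "$code" -gt 1 ]; then
        echo "ERROR: http_xsltproc.sh failed with status code: $code" >&2
        exit "$code"
    fi
}

# Usage in a cron script:
# "$BIN_DIR/http_xsltproc.sh" -F -o "$resource_file" "$xsl_file" "$xml_location"
# check_exit $?
```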
### Example #2
The goal is to transform InCommon metadata into the following pair of CSV files:
* https://incommon.org/federation/metadata/all_RandS_IdPs.csv
* https://incommon.org/federation/metadata/all_RandS_SPs.csv
The above resources are used to construct the [List of Research and Scholarship Entities](https://spaces.internet2.edu/x/ZoUABg) in the spaces wiki.
Suppose there is an automated process that transforms the main InCommon metadata aggregate into the CSV files at the above URLs. Specifically, let's suppose the following process runs every hour on incommon.org:
```Shell
# determine the metadata location
xml_location=http://md.incommon.org/InCommon/InCommon-metadata.xml

# create the first resource
xsl_file=$LIB_DIR/list_all_RandS_IdPs_csv.xsl
resource1_file=/tmp/all_RandS_IdPs.csv
"$BIN_DIR/http_xsltproc.sh" -F -o "$resource1_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0  # short-circuit if 304 response
if [ $exit_code -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit $exit_code
fi

# create the second resource
xsl_file=$LIB_DIR/list_all_RandS_SPs_csv.xsl
resource2_file=/tmp/all_RandS_SPs.csv
"$BIN_DIR/http_xsltproc.sh" -C -o "$resource2_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0  # short-circuit if not cached
if [ $exit_code -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit $exit_code
fi

# move the resources to the web directory
resource_dir=/home/htdocs/www.incommonfederation.org/federation/metadata/
/bin/mv "$resource1_file" "$resource2_file" "$resource_dir"
```
Observe the commands `http_xsltproc.sh -F` and `http_xsltproc.sh -C`. The former forces a fresh fetch of the SAML metadata file, as in the previous example; the latter goes directly to the cache. If the file is not in the cache (which is highly unlikely, since the preceding `-F` call has just populated it), the process terminates without updating any resource files.
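The point of `-C` can be illustrated in pure shell: because `-F` refreshes the cache and `-C` only reads it, both transforms see the same metadata snapshot even if the remote document changes between the two calls. (`fetch_fresh` and `fetch_cached` below are stand-ins for this sketch, not the real tool.)

```Shell
# Simulation: -F refreshes the cache from the "server"; -C reads the cache
# without touching the network. Both calls therefore see one snapshot.
cache=$(mktemp)
remote_content='<EntitiesDescriptor version="1"/>'

fetch_fresh() {   # like -F: contact the server, refresh the cache
    printf '%s\n' "$remote_content" > "$cache"
    cat "$cache"
}

fetch_cached() {  # like -C: use the cached copy, no HTTP request
    cat "$cache"
}
```

Running two stylesheets against one snapshot also halves the load on the metadata server.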
### Example #3
The goal is to transform InCommon metadata into the following pair of CSV files:
* https://incommon.org/federation/metadata/all_exported_IdPs.csv
* https://incommon.org/federation/metadata/all_exported_SPs.csv
The above resources are used to construct the [List of Exported Entities](https://spaces.internet2.edu/x/DYD4BQ) in the spaces wiki.
Suppose there is an automated process that transforms the InCommon export aggregate into the CSV files at the above URLs. Specifically, let's suppose the following process runs every hour on incommon.org:
```Shell
# determine the metadata location
xml_location=http://md.incommon.org/InCommon/InCommon-metadata-export.xml

# create the first resource
xsl_file=$LIB_DIR/list_all_IdPs_csv.xsl
resource1_file=/tmp/all_exported_IdPs.csv
"$BIN_DIR/http_xsltproc.sh" -F -o "$resource1_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0  # short-circuit if 304 response
if [ $exit_code -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit $exit_code
fi

# create the second resource
xsl_file=$LIB_DIR/list_all_SPs_csv.xsl
resource2_file=/tmp/all_exported_SPs.csv
"$BIN_DIR/http_xsltproc.sh" -C -o "$resource2_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0  # short-circuit if not cached
if [ $exit_code -gt 1 ]; then
    echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
    exit $exit_code
fi

# move the resources to the web directory
resource_dir=/home/htdocs/www.incommonfederation.org/federation/metadata/
/bin/mv "$resource1_file" "$resource2_file" "$resource_dir"
```
The commands `http_xsltproc.sh -F` and `http_xsltproc.sh -C` behave exactly as described in the previous example.
## Compatibility
The executable scripts are compatible with GNU/Linux and macOS. The library files are written in XSLT 1.0.
## Dependencies
* [Bash Library](https://github.internet2.edu/InCommon/bash-library)