Commit 99368f9, committed by Tom Scavo on Oct 30, 2016 (merge of parents 6198803 and 907deba). README.md changed: 123 additions, 40 deletions.

XSLT transformations of SAML metadata

## Installation

The scripts in this repository depend on a [Bash Library](https://github.internet2.edu/InCommon/bash-library) of basic scripts. Download and install the latter before continuing.

Download the source, change directory to the source directory, and install the executables and library files as follows:

```Shell
$ export BIN_DIR=$HOME/bin
$ export LIB_DIR=$HOME/lib
$ ./install.sh $BIN_DIR $LIB_DIR
```

An installation directory will be created if one doesn't already exist. In any case, the following files will be installed:

```Shell
$ ls -1 $BIN_DIR
http_xsltproc.sh

$ ls -1 $LIB_DIR
list_all_IdP_DisplayNames_csv.xsl
list_all_IdPs_csv.xsl
list_all_RandS_IdPs_csv.xsl
list_all_RandS_SPs_csv.xsl
list_all_SPs_csv.xsl
```

## Overview

Bash script ``http_xsltproc.sh`` is a wrapper around the ``xsltproc`` command-line tool. Unlike ``xsltproc``, the ``http_xsltproc.sh`` script fetches the target XML document from an HTTP server using HTTP Conditional GET [RFC 7232]. If the server responds with 200 OK, the script caches the resource and returns the response body; if the server responds with 304 Not Modified, the script returns the cached resource instead. See the inline help for details:

```Shell
$ $BIN_DIR/http_xsltproc.sh -h
```
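Conceptually, the conditional GET logic described above resembles the following sketch using ``curl``. This is an illustrative, hypothetical helper, not the actual internals of ``http_xsltproc.sh``; the cache file layout and the ``http_conditional_get`` name are assumptions.

```Shell
# Hypothetical sketch of HTTP Conditional GET caching (not the script's internals).
# Fetch a URL, revalidating against a cached ETag; print the (possibly cached) body.
http_conditional_get() {
  local url=$1 cache_file=$2
  local etag_file="$cache_file.etag" status etag

  mkdir -p "$(dirname "$cache_file")"
  if [ -s "$etag_file" ]; then
    etag=$(cat "$etag_file")
    status=$(curl -s -D "$cache_file.hdr" -o "$cache_file.tmp" \
      -w '%{http_code}' -H "If-None-Match: $etag" "$url")
  else
    status=$(curl -s -D "$cache_file.hdr" -o "$cache_file.tmp" \
      -w '%{http_code}' "$url")
  fi

  case "$status" in
    200)  # fresh resource: cache the body and its ETag
          mv "$cache_file.tmp" "$cache_file"
          sed -n 's/^[Ee][Tt]ag: *//p' "$cache_file.hdr" | tr -d '\r' > "$etag_file"
          ;;
    304)  # not modified: fall back to the cached copy
          rm -f "$cache_file.tmp"
          ;;
    *)    rm -f "$cache_file.tmp" "$cache_file.hdr"; return 2 ;;
  esac
  rm -f "$cache_file.hdr"
  cat "$cache_file"
}
```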

The ``http_xsltproc.sh`` script requires two environment variables: ``CACHE_DIR`` is the absolute path to the cache directory (which need not exist yet), and ``LIB_DIR`` is the directory containing the XSLT library files installed in the previous section.

For example, let's use the library installed in the previous section and specify the cache as follows:

```Shell
$ export CACHE_DIR=/tmp/cache
```

The following examples show how to use the script to create some cron jobs on incommon.org.
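An hourly cron job driving such a process might look like the following crontab entry. The script name ``update_resources.sh`` and the paths are hypothetical; the actual jobs on incommon.org may be configured differently.

```Shell
# Hypothetical crontab entry: run the transform at the top of every hour.
# update_resources.sh would contain a pipeline like the examples in this README.
0 * * * * BIN_DIR=$HOME/bin LIB_DIR=$HOME/lib CACHE_DIR=/tmp/cache $HOME/bin/update_resources.sh
```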

### Example #1

The goal is to transform InCommon metadata into the following CSV file:

* https://incommon.org/federation/metadata/all_IdP_DisplayNames.csv

The above resource is used to construct a [List of IdP Display Names](https://spaces.internet2.edu/x/2IDmBQ) in the spaces wiki.

Suppose there is an automated process that transforms the main InCommon metadata aggregate into the CSV file at the above URL. Specifically, let's suppose the following process runs every hour on incommon.org:

```Shell
# determine the metadata location
xml_location=http://md.incommon.org/InCommon/InCommon-metadata.xml

# create the resource
xsl_file=$LIB_DIR/list_all_IdP_DisplayNames_csv.xsl
resource_file=/tmp/all_IdP_DisplayNames.csv
$BIN_DIR/http_xsltproc.sh -F -o "$resource_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0 # short-circuit if 304 response
if [ $exit_code -gt 1 ]; then
echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
exit $exit_code
fi

# move the resource to the web directory
resource_dir=/home/htdocs/www.incommonfederation.org/federation/metadata/
mv $resource_file $resource_dir
exit 0
```

Observe that the command ``http_xsltproc.sh -F`` forces a fresh SAML metadata file. If the server responds with ``304 Not Modified``, the process terminates without updating the resource file.
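The exit-code convention used above (0 for a fresh resource, 1 for 304 Not Modified, greater than 1 for an error) could be factored into a small helper. The ``handle_exit_code`` function below is a sketch under that assumption, not part of this repository:

```Shell
# Hypothetical helper encoding the exit-code convention of http_xsltproc.sh:
#   0 = fresh resource written, 1 = 304 Not Modified, >1 = error
handle_exit_code() {
  local exit_code=$1
  if [ "$exit_code" -eq 1 ]; then
    echo "not modified"        # caller should exit 0 without updating the resource
  elif [ "$exit_code" -gt 1 ]; then
    echo "error" >&2           # caller should propagate the failure
    return "$exit_code"
  else
    echo "fresh"               # caller should publish the new resource
  fi
}
```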

### Example #2

The goal is to transform InCommon metadata into the following pair of CSV files:

* https://incommon.org/federation/metadata/all_RandS_IdPs.csv
* https://incommon.org/federation/metadata/all_RandS_SPs.csv

The above resources are used to construct the [List of Research and Scholarship Entities](https://spaces.internet2.edu/x/ZoUABg) in the spaces wiki.

Suppose there is an automated process that transforms the main InCommon metadata aggregate into the CSV files at the above URLs. Specifically, let's suppose the following process runs every hour on incommon.org:

```Shell
# determine the metadata location
xml_location=http://md.incommon.org/InCommon/InCommon-metadata.xml

# create the first resource
xsl_file=$LIB_DIR/list_all_RandS_IdPs_csv.xsl
resource1_file=/tmp/all_RandS_IdPs.csv
$BIN_DIR/http_xsltproc.sh -F -o "$resource1_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0 # short-circuit if 304 response
if [ $exit_code -gt 1 ]; then
echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
exit $exit_code
fi

# create the second resource
xsl_file=$LIB_DIR/list_all_RandS_SPs_csv.xsl
resource2_file=/tmp/all_RandS_SPs.csv
$BIN_DIR/http_xsltproc.sh -C -o "$resource2_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0 # short-circuit if not cached
if [ $exit_code -gt 1 ]; then
echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
exit $exit_code
fi

# move the resources to the web directory
resource_dir=/home/htdocs/www.incommonfederation.org/federation/metadata/
mv $resource1_file $resource2_file $resource_dir
```

Observe the commands ``http_xsltproc.sh -F`` and ``http_xsltproc.sh -C``. The former forces a fresh SAML metadata file, as in the previous example; the latter goes directly to the cache. If the file is not in the cache (which is highly unlikely, since the preceding ``-F`` invocation just cached it), the process terminates without updating any resource files.
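A cache-only lookup like the one ``-C`` performs can be sketched as follows. The ``lookup_cache`` function is hypothetical and illustrative, not the script's actual implementation:

```Shell
# Hypothetical sketch of a cache-only (-C) lookup: print the cached
# resource if present, otherwise signal the caller to short-circuit.
lookup_cache() {
  local cache_file=$1
  if [ -f "$cache_file" ]; then
    cat "$cache_file"
    return 0
  fi
  return 1   # not cached; caller exits without updating any resources
}
```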

### Example #3

The goal is to transform InCommon metadata into the following pair of CSV files:

* https://incommon.org/federation/metadata/all_exported_IdPs.csv
* https://incommon.org/federation/metadata/all_exported_SPs.csv

The above resources are used to construct the [List of Exported Entities](https://spaces.internet2.edu/x/DYD4BQ) in the spaces wiki.

Suppose there is an automated process that transforms the InCommon export aggregate into the CSV files at the above URLs. Specifically, let's suppose the following process runs every hour on incommon.org:

```Shell
# determine the metadata location
xml_location=http://md.incommon.org/InCommon/InCommon-metadata-export.xml

# create the first resource
xsl_file=$LIB_DIR/list_all_IdPs_csv.xsl
resource1_file=/tmp/all_exported_IdPs.csv
$BIN_DIR/http_xsltproc.sh -F -o "$resource1_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0 # short-circuit if 304 response
if [ $exit_code -gt 1 ]; then
echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
exit $exit_code
fi

# create the second resource
xsl_file=$LIB_DIR/list_all_SPs_csv.xsl
resource2_file=/tmp/all_exported_SPs.csv
$BIN_DIR/http_xsltproc.sh -C -o "$resource2_file" "$xsl_file" "$xml_location"
exit_code=$?
[ $exit_code -eq 1 ] && exit 0 # short-circuit if not cached
if [ $exit_code -gt 1 ]; then
echo "ERROR: http_xsltproc.sh failed with status code: $exit_code" >&2
exit $exit_code
fi

# move the resources to the web directory
resource_dir=/home/htdocs/www.incommonfederation.org/federation/metadata/
mv $resource1_file $resource2_file $resource_dir
```

The commands ``http_xsltproc.sh -F`` and ``http_xsltproc.sh -C`` behave exactly as described in the previous example.

## Compatibility

The executable scripts are compatible with GNU/Linux and Mac OS. The library files are written in XSLT 1.0.