diff --git a/bin/probe_saml_idp.sh b/bin/probe_saml_idp.sh index f92df3a..3167a61 100755 --- a/bin/probe_saml_idp.sh +++ b/bin/probe_saml_idp.sh @@ -1,7 +1,7 @@ #!/bin/bash ####################################################################### -# Copyright 2016 Internet2 +# Copyright 2016--2017 Internet2 # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -16,7 +16,7 @@ # limitations under the License. ####################################################################### -script_version="0.3" +script_version="0.9" user_agent_string="SAML IdP Probe ${script_version}" ####################################################################### @@ -28,25 +28,59 @@ display_help () { ${user_agent_string} Given a single identifier, assumed to be an IdP entityID, probe - all browser-facing SSO endpoints in IdP metadata. + the browser-facing SSO endpoints in IdP metadata. - Usage: ${0##*/} [-hvq] [-a] [-t CONNECT_TIME [-m MAX_TIME]] [-r MAX_REDIRS] ID + Usage: ${0##*/} [-hvV] [-b BINDING_URI] [-f MD_PATH] [-t CONNECT_TIME [-m MAX_TIME]] [-r MAX_REDIRS] ID + + The script takes one (and only one) identifier on the command + line. If the identifier is missing, the script immediately + terminates. + + By default, the script probes all browser-facing SSO endpoints + in IdP metadata. This includes any SSO endpoint with one of the + following bindings: + + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST-SimpleSign + urn:mace:shibboleth:1.0:profiles:AuthnRequest + + The script probes at most one endpoint of each type (since only + one SSO endpoint per binding is allowed in metadata). 
Options: -h Display this message -v Write verbose messages to stdout - -q Run quietly (i.e., write no messages to stdout) + -V Trace the HTTP exchange(s) and write the trace output to stdout + -b Probe the endpoint with the given binding only + -f Use the metadata at the given file path -t Allowed time (in secs) to connect to the host -m Maximum time (in secs) of a complete probe -r Maximum number of HTTP redirects followed - -a Probe all SAML endpoints Option -h is mutually exclusive of all other options. Options - -q and -v are mutually exclusive of each other. Options -u and -f - are mutually exclusive of each other as well. - - The argument of the -t option is the TCP connect time, that is, - the maximum time (in secs) allotted to obtain a TCP connection. + -v and -V are mutually exclusive of each other. + + Option -v causes verbose messages to be written to stdout. + Option -V traces each HTTP exchange and writes the resulting + trace output to stdout. Since trace output is extensive, + options -V and -b are often used together. + + Use the -b option to drill down on a single endpoint with a + specific binding. The option argument is one of the binding + URIs listed above. + + The argument of the -f option is the path to a SAML metadata + file, from which the script obtains IdP metadata as needed. + If the -f option is omitted, the script requests IdP metadata + just-in-time from a metadata query server as described below + in the configuration section. + + The argument of the -t option is the maximum TCP connect time, + that is, the maximum time (in secs) allowed to obtain a TCP + connection. If omitted, the maximum TCP connect time defaults + to $CONNECT_TIMEOUT_DEFAULT secs. + Note that the TCP connect time includes the time it takes to do a DNS name lookup. Since the latter is unconstrained, it may consume all available TCP connect time. 
Thus the TCP connect @@ -56,40 +90,76 @@ display_help () { The argument of the -m option is the maximum total time (in secs) allotted to each probe. A reasonable value is a few seconds beyond the TCP connect time. Any value less than the TCP connect - time causes the script to immediately fail. + time causes the script to immediately fail. If this option is + omitted, the script computes a reasonable value based on the + value of the connect time option. + + For best results, the script follows redirects. The argument of + the -r option is the maximum number of redirects followed. The + default value of this option is ${MAX_REDIRS_DEFAULT}. + + ENVIRONMENT + + The required environment variable LIB_DIR specifies a directory + containing at least the following library files, which act as + helper scripts for ${0##*/}: + + $LIB_FILENAMES + + If any of the library files are missing, the script immediately + fails. - By default, the script probes the SAML2 HTTP-Redirect and HTTP-POST - endpoints in IdP metadata. Use option -a to probe all SAML browser- - facing SSO endpoints in metadata, including the SAML2 HTTP-POST-SimpleSign - endpoint and any SAML1 endpoint that might be present. The script - probes at most one endpoint of each type. + The script requires a temporary directory for logging and so + forth. An optional environment variable called TMPDIR is used + for this purpose. If TMPDIR is not found, the script creates a + temporary directory on the fly. For example, the current + invocation of the script (the one you issued a moment ago) + chose the following temporary directory: - CONFIG + $TMPDIR - The script reads a file of config parameters. The script loads the - config file from the following file location: + Be sure to check the temporary directory for log files and + other verbose output. In particular, a detailed trace of each + endpoint probe is recorded there, including the HTTP response. 
- $config_file_default + CONFIGURATION + + The script reads a file of config parameters. The config file is + loaded from the following file location: + + $config_file As a result of reading the config file, the following config parameters are initialized: - MDQ_BASE_URL - SAML2_SP_ENTITY_ID - SAML2_SP_ACS_URL - SAML2_SP_ACS_BINDING - SAML1_SP_ENTITY_ID - SAML1_SP_ACS_URL - SAML1_SP_ACS_BINDING + MDQ_BASE_URL=$MDQ_BASE_URL + CONNECT_TIMEOUT_DEFAULT=$CONNECT_TIMEOUT_DEFAULT + MAX_REDIRS_DEFAULT=$MAX_REDIRS_DEFAULT + SAML2_SP_ENTITY_ID=$SAML2_SP_ENTITY_ID + SAML2_SP_ACS_URL=$SAML2_SP_ACS_URL + SAML2_SP_ACS_BINDING=$SAML2_SP_ACS_BINDING + SAML1_SP_ENTITY_ID=$SAML1_SP_ENTITY_ID + SAML1_SP_ACS_URL=$SAML1_SP_ACS_URL + SAML1_SP_ACS_BINDING=$SAML1_SP_ACS_BINDING - The MDQ_BASE_URL is the base URL of a Metadata Query Server + The MDQ_BASE_URL is the base URL of a metadata query server (i.e., a server that conforms to the Metadata Query Protocol). The base URL is used to construct an MDQ request URL, which the - script uses to request entity metadata just-in-time. + script uses to request entity metadata just-in-time. To use a + metadata file in the file system, specify the -f option on the + command line. + + The CONNECT_TIMEOUT_DEFAULT is the default maximum TCP connect + time. To override the default on the fly, specify the -t option + on the command line. + + The MAX_REDIRS_DEFAULT is the default maximum number of redirects + followed. To override the default on the fly, specify the -r option + on the command line. The three SAML2_SP parameters define a SAML2 SP, that is, an SP with one or more SAML2 browser-facing endpoints in metadata. The - SAML AuthnRequest transmitted to the IdP contains the values of + SAML2 AuthnRequest transmitted to the IdP contains the values of these parameters. Note: An IdP reacts differently to requests from different SPs. Changing the values of these parameters may produce different results. 
@@ -99,6 +169,81 @@ display_help () { given SP may support both SAML2 and SAML1, in which case the SAML1_SP_ENTITY_ID config parameter may be identical to the SAML2_SP_ENTITY_ID parameter.) + + Typically the SAML2_SP and SAML1_SP config parameters correspond + to actual SPs that the IdP trusts, that is, for which the IdP has + consumed metadata. This is not strictly necessary, however. These + config parameters may be intentionally bogus in order to test the + resulting IdP response. + + STANDARD OUTPUT + + By default, the script outputs one line of output for each IdP + endpoint probed. (Use the -v option to obtain verbose output + or the -V option to trace the HTTP exchanges.) Each line of + output has the following space-delimited fields: + + 1. code: a curl error code + 2. output: a curl output string + 3. location: the location of an IdP endpoint in metadata + 4. binding: the binding of an IdP endpoint in metadata + 5. entityID: the entityID of the IdP + 6. registrarID: the registrar ID + + See the curl man page (http://linux.die.net/man/1/curl) for a + brief description of possible error codes. + + The curl output string has the following format: + + redirects:9;response:999;dns:9.999;tcp:9.999;ssl:9.999;total:9.999 + + The redirects in the output string are the number of HTTP redirects + followed by this script. The response is the ultimate HTTP response + code (after redirects). If the HTTP exchange does not complete, the + HTTP response will be 000 by convention. 
The remaining four values + in the output string are times (in secs) computed by curl: + + dns is the elapsed time up to and including the DNS lookup + (curl time_namelookup variable) + tcp is the elapsed time up to and including the TCP connection + (curl time_connect variable) + ssl is the elapsed time up to and including the SSL exchange + (curl time_appconnect variable) (only curl 7.19.0 and later) + total is the total elapsed time of the probe + (curl time_total variable) + + See the curl man page (curl --write-out option) for detailed + explanations of these timings. + + The location field is the value of the Location XML attribute of a + browser-facing SSO endpoint in metadata. + + The binding field is the value of the Binding XML attribute of a + browser-facing SSO endpoint in metadata. + + The entityID is the name of the IdP. An entityID is an arbitrary URI, + as given by the entityID XML attribute on the <md:EntityDescriptor> + element in SAML metadata. + + The registrarID is the name of the registrar that registered the IdP + metadata in the first place. By convention, a registrarID is an + arbitrary URI, as given by the registrationAuthority XML attribute + on the <mdrpi:RegistrationInfo> extension element in SAML metadata. + Since the latter element is optional in metadata, this field may be + NULL in the output. + + EXAMPLES + + \$ \$BIN_DIR/${0##*/} -h + \$ id=https://idp.incommonfederation.org/idp/shibboleth + \$ \$BIN_DIR/${0##*/} \$id + \$ \$BIN_DIR/${0##*/} -t ${CONNECT_TIMEOUT_DEFAULT} -r ${MAX_REDIRS_DEFAULT} \$id + \$ uri=urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST + \$ \$BIN_DIR/${0##*/} -V -b \$uri \$id + + Note that the second and third examples above behave identically. + The last example probes a specific endpoint in IdP metadata, traces + the HTTP exchanges, and writes the (extensive) trace output to stdout.
HELP_MSG } @@ -142,53 +287,96 @@ for lib_filename in $LIB_FILENAMES; do fi done -# basic curl defaults -connect_timeout_default=2; max_redirs_default=7 +# If env var TMPDIR exists, use it; otherwise +# create a new TMPDIR and use that instead. +if [ -z "$TMPDIR" ] || [ ! -d "$TMPDIR" ]; then + # create temporary directory + TMPDIR="$( make_temp_file -d )" + if [ ! -d "$TMPDIR" ] ; then + printf "ERROR: $script_name unable to create temporary dir\n" >&2 + exit 2 + fi +fi + +# use TMPDIR directory (remove trailing slash) +tmp_dir="${TMPDIR%%/}/probe_saml_idp_$$" + +# every run of this script gets its own subdir +if [ -d "$tmp_dir" ]; then + echo "ERROR: $script_name: directory already exists: $tmp_dir" >&2 + exit 2 +fi + +# create temporary subdirectory +/bin/mkdir "$tmp_dir" +status_code=$? +if [ $status_code -ne 0 ]; then + echo "ERROR: $script_name failed to create tmp dir ($status_code) $tmp_dir" >&2 + exit 2 +fi -# default binding URIs -binding_uris_default="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" +log_file="$tmp_dir/main_log.txt" + +# delayed logging +printf "$script_name using source lib directory: %s\n" "$LIB_DIR" >> "$log_file" +for lib_filename in $LIB_FILENAMES; do + lib_file="$LIB_DIR/$lib_filename" + printf "$script_name sourced lib file: %s\n" "$lib_file" >> "$log_file" +done # default config file -config_file_default="${script_bin}/.config_saml_idp_probe.sh" +config_file="${script_bin}/.config_saml_idp_probe.sh" + +# load config file +load_config -v "$config_file" >> "$log_file" +status_code=$? +if [ $status_code -ne 0 ]; then + echo "ERROR: $script_name failed to load $config_file" >&2 + exit 2 +fi + +# validate config parameters +validate_config -v >> "$log_file" +status_code=$? 
+if [ $status_code -ne 0 ]; then + echo "ERROR: $script_name failed to verify $config_file" >&2 + exit 2 +fi ####################################################################### # Process command-line options and arguments ####################################################################### -help_mode=false; quiet_mode=false; verbose_mode=false -local_opts=; curl_opts= +help_mode=false; verbose_mode=false; trace_mode=false; md_file_mode=false connect_timeout=; max_time=; max_redirs= -binding_uris="$binding_uris_default" -while getopts ":hqvt:m:r:a" opt; do +while getopts ":hvVt:m:r:f:b:" opt; do case $opt in h) help_mode=true ;; - q) - quiet_mode=true - verbose_mode=false - #local_opts="$local_opts -$opt" - ;; v) - quiet_mode=false verbose_mode=true - local_opts="$local_opts -$opt" + trace_mode=false + ;; + V) + verbose_mode=false + trace_mode=true ;; t) connect_timeout="$OPTARG" - curl_opts="$curl_opts -t $OPTARG" ;; m) max_time="$OPTARG" - curl_opts="$curl_opts -m $OPTARG" ;; r) max_redirs="$OPTARG" - curl_opts="$curl_opts -r $OPTARG" ;; - a) - binding_uris="$binding_uris urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST-SimpleSign" - binding_uris="$binding_uris urn:mace:shibboleth:1.0:profiles:AuthnRequest" + f) + md_file_mode=true + md_path="$OPTARG" + ;; + b) + binding_uri="$OPTARG" ;; \?) 
echo "ERROR: $script_name: Unrecognized option: -$OPTARG" >&2 @@ -206,54 +394,62 @@ if $help_mode; then exit 0 fi -# redirect stdout and stderr to the bit bucket -if $quiet_mode; then - exec 1>/dev/null - exec 2>/dev/null -fi +printf "See: %s\n" "$tmp_dir" -# check consistency of timeout options +# check mutual consistency of timeout options if [ -n "$max_time" -a -z "$connect_timeout" ]; then echo "ERROR: $script_name: the -m option requires the presence of the -t option" >&2 exit 2 fi -# set default connect timeout if necessary -if [ -z "$connect_timeout" ]; then - connect_timeout=$connect_timeout_default - curl_opts="$curl_opts -t $connect_timeout" -else - if [ "$connect_timeout" -le 0 ] ; then - echo "ERROR: $script_name: connect timeout ($connect_timeout) must be a positive integer" >&2 - exit 2 - fi +# use default connect timeout if necessary +[ -z "$connect_timeout" ] && connect_timeout=$CONNECT_TIMEOUT_DEFAULT + +# check reasonableness of connect timeout +if [ "$connect_timeout" -le 0 ] ; then + echo "ERROR: $script_name: connect timeout ($connect_timeout) must be a positive integer" >&2 + exit 2 fi # compute max time if necessary -if [ -z "$max_time" ]; then - max_time=$(( connect_timeout + 2 )) - curl_opts="$curl_opts -m $max_time" -else - if [ "$max_time" -le "$connect_timeout" ]; then - echo "ERROR: $script_name: max time ($max_time) must be greater than the connect timeout ($connect_timeout)" >&2 - exit 2 - fi +[ -z "$max_time" ] && max_time=$(( connect_timeout + 2)) + +# check reasonableness of max time +if [ "$max_time" -le "$connect_timeout" ]; then + echo "ERROR: $script_name: max time ($max_time) must be greater than the connect timeout ($connect_timeout)" >&2 + exit 2 fi -# check maximum number of redirects -if [ -z "$max_redirs" ]; then - max_redirs=$max_redirs_default - curl_opts="$curl_opts -r $max_redirs" +# use default max number of redirects if necessary +[ -z "$max_redirs" ] && max_redirs=$MAX_REDIRS_DEFAULT + +# check reasonableness of 
max number of redirects +if [ "$max_redirs" -le 0 ] ; then + echo "ERROR: $script_name: max number of redirects ($max_redirs) must be a positive integer" >&2 + exit 2 fi -if $verbose_mode; then - printf "$script_name using connect timeout: %d secs\n" $connect_timeout - printf "$script_name using max time: %d secs\n" $max_time - printf "$script_name using max redirects: %d\n" $max_redirs +# check the metadata file +if $md_file_mode; then + if [ -z "$md_path" ]; then + echo "ERROR: $script_name: option -f requires an argument" >&2 + exit 2 + fi + if [ ! -f "$md_path" ]; then + echo "ERROR: $script_name: file does not exist: $md_path" >&2 + exit 2 + fi fi -config_file="$config_file_default" -$verbose_mode && echo "$script_name using config file $config_file" +if [ -n "$binding_uri" ] && \ + [ "$binding_uri" != "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect" ] && \ + [ "$binding_uri" != "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" ] && \ + [ "$binding_uri" != "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST-SimpleSign" ] && \ + [ "$binding_uri" != "urn:mace:shibboleth:1.0:profiles:AuthnRequest" ]; then + + echo "ERROR: $script_name: endpoint binding not supported: $binding_uri" >&2 + exit 2 +fi # determine the entityID shift $(( OPTIND - 1 )) @@ -266,155 +462,164 @@ if [ -z "$1" ] ; then exit 2 fi entityID="$1" -$verbose_mode && echo "$script_name using entityID $entityID" + +if $verbose_mode; then + printf "$script_name using connect timeout: %d secs\n" "$connect_timeout" + printf "$script_name using max time: %d secs\n" "$max_time" + printf "$script_name using max redirects: %d\n" "$max_redirs" + + $md_file_mode && printf "$script_name using metadata file: %s\n" "$md_path" + [ -n "$binding_uri" ] && printf "$script_name using binding: %s\n" "$binding_uri" + + echo "$script_name using entityID $entityID" +fi ##################################################################### # Initialization ##################################################################### -if
$verbose_mode; then - printf "$script_name using source lib directory: %s\n" "$LIB_DIR" - for lib_filename in $LIB_FILENAMES; do - lib_file="$LIB_DIR/$lib_filename" - printf "$script_name sourcing lib file: %s\n" "$lib_file" - done -fi - -# determine temporary directory -if [ -n "$TMP_DIR" ] && [ -d "$TMP_DIR" ]; then - $verbose_mode && printf "$script_name using existing temporary dir: %s\n" "$TMP_DIR" - # use existing temporary directory (remove trailing slash) - tmp_dir="${TMP_DIR%%/}/probe_saml_idp_$$" -elif [ -n "$TMPDIR" ] && [ -d "$TMPDIR" ]; then - $verbose_mode && printf "$script_name using system temporary dir: %s\n" "$TMPDIR" - # use system temporary directory (remove trailing slash) - tmp_dir="${TMPDIR%%/}/probe_saml_idp_$$" +# list of binding URIs (SSO endpoints to be probed) +if [ -n "$binding_uri" ]; then + binding_uris="$binding_uri" else - # create temporary directory - new_dir="$( make_temp_file -d )" - if [ ! -d "$new_dir" ] ; then - printf "ERROR: $script_name unable to create temporary dir\n" >&2 - exit 2 - fi - $verbose_mode && printf "$script_name using new temporary dir: %s\n" "$new_dir" - # use new temporary directory (remove trailing slash) - tmp_dir="${new_dir%%/}/probe_saml_idp_$$" + binding_uris="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST-SimpleSign + urn:mace:shibboleth:1.0:profiles:AuthnRequest" fi +printf "$script_name using binding(s): %s\n" "$binding_uris" >> "$log_file" -# every run of this script gets its own subdir -if [ -d "$tmp_dir" ]; then - echo "ERROR: $script_name: directory already exists: $tmp_dir" >&2 - exit 2 -fi - -# create temporary directory if necessary -$verbose_mode && printf "$script_name creating temporary subdir: %s\n" "$tmp_dir" -/bin/mkdir "$tmp_dir" -status_code=$? 
-if [ $status_code -ne 0 ]; then - echo "ERROR: $script_name failed to create tmp dir ($status_code) $tmp_dir" >&2 - exit 2 -fi - -# load config file -$verbose_mode && echo "$script_name loading config file $config_file" -load_config $local_opts "$config_file" -status_code=$? -if [ $status_code -ne 0 ]; then - echo "ERROR: $script_name failed to load $config_file" >&2 - exit 2 -fi - -# validate config parameters -if [ -z "$MDQ_BASE_URL" ]; then - echo "ERROR: $script_name requires config param MDQ_BASE_URL" >&2 - exit 2 -fi -if [ -z "$SAML2_SP_ENTITY_ID" ]; then - echo "ERROR: $script_name requires config param SAML2_SP_ENTITY_ID" >&2 - exit 2 -fi -if [ -z "$SAML2_SP_ACS_URL" ]; then - echo "ERROR: $script_name requires config param SAML2_SP_ACS_URL" >&2 - exit 2 -fi -if [ -z "$SAML2_SP_ACS_BINDING" ]; then - echo "ERROR: $script_name requires config param SAML2_SP_ACS_BINDING" >&2 - exit 2 -fi -if [ -z "$SAML1_SP_ENTITY_ID" ]; then - echo "ERROR: $script_name requires config param SAML1_SP_ENTITY_ID" >&2 - exit 2 -fi -if [ -z "$SAML1_SP_ACS_URL" ]; then - echo "ERROR: $script_name requires config param SAML1_SP_ACS_URL" >&2 - exit 2 -fi -if [ -z "$SAML1_SP_ACS_BINDING" ]; then - echo "ERROR: $script_name requires config param SAML1_SP_ACS_BINDING" >&2 - exit 2 -fi +# for downstream functions with the same options +curl_opts="-t $connect_timeout" +curl_opts="$curl_opts -m $max_time" +curl_opts="$curl_opts -r $max_redirs" +printf "$script_name using curl options: %s\n" "$curl_opts" >> "$log_file" ##################################################################### -# Main processing +# Helper functions ##################################################################### -# get entity metadata -entityDescriptor=$( getEntityFromServer -T "$tmp_dir" -u "$MDQ_BASE_URL" $entityID ) -exit_status=$? 
-if [ "$exit_status" -ne 0 ]; then - echo "ERROR: $script_name: unable to obtain metadata for entityID: $entityID" >&2 - exit 3 -fi +# depends on: +# md_tools.sh +# http_tools.sh +# extract_entity.xsl +# +get_entity_descriptor () { -# short-circuit if this is not an IdP -if ! echo "$entityDescriptor" | $_GREP -Fq 'IDPSSODescriptor '; then - echo "ERROR: $script_name: entity is not an IdP: $entityID" >&2 - exit 3 -fi + local status_code -# extract the registrar ID from the entity descriptor -registrarID=$( echo "$entityDescriptor" \ - | $_GREP -F -m 1 ' registrationAuthority=' \ - | $_SED -e 's/^.* registrationAuthority="\([^"]*\)".*$/\1/' -) + # get entity metadata for this entityID + if $md_file_mode; then + entityDescriptor=$( getEntityFromFile -f "$md_path" $entityID ) + else + entityDescriptor=$( getEntityFromServer -T "$tmp_dir" -u "$MDQ_BASE_URL" $entityID ) + fi + status_code=$? + if [ "$status_code" -ne 0 ]; then + echo "ERROR: $FUNCNAME: unable to obtain metadata for entity: $entityID" >&2 + [ "$status_code" -gt 1 ] && return 3 + return 1 + fi -# compute all SSO endpoints -endpoints=$( echo "$entityDescriptor" \ - | $_GREP -E '<(md:)?SingleSignOnService ' -) + return 0 +} -# iterate over a subset of browser-facing SSO endpoints -has_no_saml_http_endpoints=true -for binding_uri in $binding_uris; do +# depends on: +# md_tools.sh +# entity_endpoints_txt.xsl +# entity_idp_names_txt.xsl +# +parse_entity_descriptor () { - # compute the SAML2 SSO endpoint - endpoint=$( echo "$endpoints" \ - | $_GREP -F -m 1 ' Binding="'$binding_uri'"' - ) - if [ -z "$endpoint" ]; then - $verbose_mode && printf "$script_name: no endpoint with Binding=\"%s\"\n" "$binding" - continue + local status_code + local names + + # short-circuit if this entity is not an IdP + if ! 
echo "$entityDescriptor" | $_GREP -Eq '<(md:)?IDPSSODescriptor '; then + echo "WARNING: $FUNCNAME: entity is not an IdP: $entityID" >&2 + return 1 fi - has_no_saml_http_endpoints=false - # compute the endpoint location and binding - location=$( echo "$endpoint" \ - | $_SED -e 's/^.* Location="\([^"]*\)".*$/\1/' + # list all the IdP SSO endpoints in the entity descriptor + endpoints=$( echo "$entityDescriptor" \ + | listEndpoints \ + | filterEndpoints -r IDPSSODescriptor -t SingleSignOnService ) - binding=$( echo "$endpoint" \ - | $_SED -e 's/^.* Binding="\([^"]*\)".*$/\1/' + status_code=$? + if [ "$status_code" -ne 0 ]; then + echo "ERROR: $FUNCNAME: unable to obtain IdP SSO endpoints for entity: $entityID" >&2 + return 3 + fi + + # every IdP MUST have at least one SSO endpoint + if [ -z "$endpoints" ]; then + echo "ERROR: $FUNCNAME: entity has no IdP SSO endpoints: $entityID" >&2 + return 4 + fi + + # extract the IdP names (for logging purposes) + names=$( echo "$entityDescriptor" | extractIdPNames ) + status_code=$? 
+ if [ "$status_code" -ne 0 ]; then + echo "ERROR: $FUNCNAME: unable to obtain IdP names for entity: $entityID" >&2 + return 5 + fi + + # IdP mdui:DisplayName + displayName=$( echo "$names" | $_CUT -f2 ) + [ -z "$displayName" ] && displayName=NULL + + # md:OrganizationName is best for metadata registered by InCommon + # (admittedly, should be using md:OrganizationDisplayName instead) + orgName=$( echo "$names" | $_CUT -f3 ) + [ -z "$orgName" ] && orgName=NULL + + # mdrpi:RegistrationInfo/@registrationAuthority + registrarID=$( echo "$names" | $_CUT -f5 ) + [ -z "$registrarID" ] && registrarID=NULL + + return 0 +} + +# depends on: +# md_tools.sh +# +get_sso_endpoint () { + + # compute the endpoint location for this binding + location=$( echo "$endpoints" \ + | filterEndpoints -b $binding \ + | listEndpointLocations \ + | /usr/bin/head -n 1 ) - $verbose_mode && printf "$script_name probing endpoint with Location=\"%s\" and Binding=\"%s\"\n" "$location" "$binding" + + # if there is no endpoint location, skip this binding + if [ -z "$location" ]; then + echo "WARNING: $FUNCNAME: entity has no SSO endpoint that supports the ${binding##*:} binding: $entityID" >&2 + return 1 + fi + + return 0 +} + +# depends on: +# saml_tools.sh +# http_tools.sh +# +probe_sso_endpoint () { + + local tmp_subdir + local output + local status_code # create temporary subdirectory if necessary - tmp_subdir="$tmp_dir/${binding_uri##*:}" + tmp_subdir="$tmp_dir/${binding##*:}" if [ ! -d "$tmp_subdir" ]; then /bin/mkdir "$tmp_subdir" - exit_status=$? - if [ $exit_status -ne 0 ]; then - echo "ERROR: $script_name failed to create tmp dir ($exit_status) $tmp_subdir" >&2 - exit 3 + status_code=$? + if [ $status_code -ne 0 ]; then + echo "ERROR: $FUNCNAME failed to create tmp dir ($status_code) $tmp_subdir" >&2 + return 3 fi fi @@ -425,18 +630,87 @@ for binding_uri in $binding_uris; do -T "$tmp_subdir" \ $location $binding SingleSignOnService ) - exit_status=$? 
- if [ "$exit_status" -ne 0 ]; then - echo "ERROR: $script_name: probe_saml2_idp_endpoint failed ($exit_status)" >&2 - exit 3 + status_code=$? + if [ "$status_code" -ne 0 ]; then + echo "ERROR: $FUNCNAME: endpoint probe failed ($status_code): $location" >&2 + return 4 fi + # sanity check: the location and binding in the output + # must match the location and binding in metadata + + $trace_mode && /bin/cat "$tmp_subdir/curl_trace.txt" + echo "$output $entityID $registrarID" + return +} + +##################################################################### +# Main processing +##################################################################### + +# Given a single entityID, and a variable list of browser-facing +# HTTP protocol bindings, perform the following sequence of steps: +# +# 1. Get entity metadata +# 2. Parse entity metadata +# 3. For each binding: +# a. Get the corresponding SSO endpoint from metadata +# b. If the location exists, probe the SSO endpoint +# +# The loop iterates an indeterminate number of times depending +# on the list of bindings (which is influenced by command-line +# options) and the actual endpoints in metadata. +# +# if status_code > 1, a fatal error occurred + +# get entity metadata +get_entity_descriptor +status_code=$? +if [ "$status_code" -ne 0 ]; then + [ "$status_code" -gt 1 ] && exit "$status_code" + exit 1 +fi + +# parse entity metadata +parse_entity_descriptor +status_code=$? +if [ "$status_code" -ne 0 ]; then + [ "$status_code" -gt 1 ] && exit "$status_code" + exit 1 +fi + +printf "$script_name using IdP endpoints:\n%s\n" "$endpoints" >> "$log_file" +printf "$script_name using names: %s %s %s\n" "$displayName" "$orgName" "$registrarID" >> "$log_file" + +# iterate over a subset of SSO endpoints +has_no_saml_http_endpoints=true +for binding in $binding_uris; do + + # get the corresponding SSO endpoint from metadata + get_sso_endpoint + status_code=$?
+ if [ "$status_code" -ne 0 ]; then + [ "$status_code" -gt 1 ] && exit "$status_code" + continue + fi + has_no_saml_http_endpoints=false + + printf "$script_name probing endpoint with Location=\"%s\" and Binding=\"%s\"\n" "$location" "$binding" >> "$log_file" + + # probe the SSO endpoint + probe_sso_endpoint + status_code=$? + if [ "$status_code" -ne 0 ]; then + [ "$status_code" -gt 1 ] && exit "$status_code" + continue + fi + done if $has_no_saml_http_endpoints; then - echo "WARNING: $script_name: no SAML HTTP endpoints to probe" >&2 + echo "WARNING: $script_name: no SAML browser-facing endpoints to probe" >&2 fi exit 0 diff --git a/bin/probe_saml_idps.sh b/bin/probe_saml_idps.sh index a75fe25..0eb12fd 100755 --- a/bin/probe_saml_idps.sh +++ b/bin/probe_saml_idps.sh @@ -1,7 +1,7 @@ #!/bin/bash ####################################################################### -# Copyright 2015--2016 Internet2 +# Copyright 2015--2017 Internet2 # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -16,7 +16,7 @@ # limitations under the License. ####################################################################### -script_version="0.6" +script_version="0.9" user_agent_string="SAML IdP Probe ${script_version}" ####################################################################### @@ -27,39 +27,71 @@ display_help () { /bin/cat <<- HELP_MSG ${user_agent_string} - Given a list of identifiers, assumed to be IdP entityIDs, determine - which of those identifiers correspond to live SAML IdP deployments. - This script probes SAML2 IdP deployments only. + Given a list of identifiers, assumed to be IdP entityIDs, probe + one or more browser-facing SSO endpoints in the metadata of each + IdP. - Usage: ${0##*/} [-hvq] [-d OUT_DIR] [-t CONNECT_TIME [-m MAX_TIME]] [-r MAX_REDIRS] [ID ...] + Usage: ${0##*/} [-hvq] [-123] [-f MD_PATH] [-d OUT_DIR] [-t CONNECT_TIME [-m MAX_TIME]] [-r MAX_REDIRS] [ID ...] 
The script optionally takes a sequence of identifiers on the command line. If none are given, the script takes its input from stdin. The script iterates over all input identifiers. For each identifier, - if the corresponding entity is a SAML2 IdP, the script probes the - HTTP-Redirect endpoint, the HTTP-POST endpoint, and the - HTTP-POST-SimpleSign endpoint at that IdP. In other words, this - script probes up to three (3) browser-facing SSO endpoints at the - IdP. + up to four browser-facing SSO endpoints are probed depending on the + command-line options. By default, the script probes just one endpoint + per IdP: + + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect + + To probe a different set of endpoints per IdP, use an appropriate + command-line option. Options: -h Display this message -v Write verbose messages to stdout -q Run quietly (i.e., write no messages to stdout) + -1 Probe SAML1 endpoints only + -2 Probe SAML2 endpoints only + -3 Probe both SAML1 and SAML2 endpoints + -f Use the metadata at the given file path + -d Path to an output directory -t Time (in secs) to connect to the host -m Maximum time (in secs) of a complete probe -r Maximum number of HTTP redirects followed - -d Path to an output directory Option -h is mutually exclusive of all other options. Options - -q and -v are mutually exclusive of each other. Options -u and -f - are mutually exclusive of each other as well. + -q and -v are mutually exclusive of each other. Options -1, + -2, and -3 are also mutually exclusive of each other. + + By default, the script probes SSO endpoints that support the + SAML2 HTTP-Redirect binding only. 
Use the -2 option to probe + all endpoints with one of the following SAML2 bindings: + + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST-SimpleSign - The argument of the -t option is the TCP connect time, that is, - the maximum time (in secs) allotted to the TCP connection. Note - that the TCP connect time includes the time it takes to do a - DNS name lookup. Since the latter is unconstrained, it may + Alternatively, use the -1 option to probe all endpoints with + a SAML1 binding: + + urn:mace:shibboleth:1.0:profiles:AuthnRequest + + To probe both SAML1 and SAML2 endpoints (up to four endpoints + per IdP), use the -3 option. + + The argument of the -f option is the path to a SAML metadata + file, from which the script obtains IdP metadata as needed. + If the -f option is omitted, the script requests IdP metadata + just-in-time from a metadata query server as described below + in the configuration section. + + The argument of the -t option is the maximum TCP connect time, + that is, the maximum time (in secs) allotted to obtain a TCP + connection. If omitted, the maximum TCP connect time allowed + defaults to $CONNECT_TIMEOUT_DEFAULT secs. + + Note that the TCP connect time includes the time it takes to do + a DNS name lookup. Since the latter is unconstrained, it may consume all available TCP connect time. Thus the TCP connect time should be kept small (on the order of a few seconds) since larger values will slow this script considerably. @@ -67,68 +99,115 @@ display_help () { The argument of the -m option is the maximum total time (in secs) allotted to each probe. A reasonable value is a few seconds beyond the TCP connect time. Any value less than the TCP connect - time causes the script to immediately fail. + time causes the script to immediately fail. 
If this option is + omitted, the script computes a reasonable value based on the + value of the connect time option. + + For best results, the script follows redirects. The argument of + the -r option is the maximum number of redirects followed. The + default value of this option is ${MAX_REDIRS_DEFAULT}. + + ENVIRONMENT + + The required environment variable LIB_DIR specifies a directory + containing at least the following library files, which act as + helper scripts for ${0##*/}: + + $LIB_FILENAMES - CONFIG + If any of the library files are missing, the script immediately + fails. - The script reads a file of config parameters. The script loads the - config file from the following file location: + The script requires a temporary directory for logging and so + forth. An optional environment variable called TMPDIR is used + for this purpose. If TMPDIR is not found, the script creates a + temporary directory on the fly. For example, this invocation + of the script (the one you issued a moment ago) chose the + following temporary directory: - $config_file_default + $TMPDIR + + Be sure to check the temporary directory for log files and + other verbose output. In particular, a detailed trace of each + endpoint probe is recorded there, including the HTTP response. + + CONFIGURATION + + The script reads a file of config parameters. 
The config file is + loaded from the following file location: + + $config_file As a result of reading the config file, the following config parameters are initialized: - MDQ_BASE_URL - SAML2_SP_ENTITY_ID - SAML2_SP_ACS_URL - SAML2_SP_ACS_BINDING - SAML1_SP_ENTITY_ID - SAML1_SP_ACS_URL - SAML1_SP_ACS_BINDING + MDQ_BASE_URL=$MDQ_BASE_URL + CONNECT_TIMEOUT_DEFAULT=$CONNECT_TIMEOUT_DEFAULT + MAX_REDIRS_DEFAULT=$MAX_REDIRS_DEFAULT + SAML2_SP_ENTITY_ID=$SAML2_SP_ENTITY_ID + SAML2_SP_ACS_URL=$SAML2_SP_ACS_URL + SAML2_SP_ACS_BINDING=$SAML2_SP_ACS_BINDING + SAML1_SP_ENTITY_ID=$SAML1_SP_ENTITY_ID + SAML1_SP_ACS_URL=$SAML1_SP_ACS_URL + SAML1_SP_ACS_BINDING=$SAML1_SP_ACS_BINDING - The MDQ_BASE_URL is the base URL of a Metadata Query Server + The MDQ_BASE_URL is the base URL of a metadata query server (i.e., a server that conforms to the Metadata Query Protocol). The base URL is used to construct an MDQ request URL, which the - script uses to request entity metadata just-in-time. + script uses to request entity metadata just-in-time. To use a + metadata file in the file system, specify the -f option on the + command line. + + The CONNECT_TIMEOUT_DEFAULT is the default maximum TCP connect + time. To override the default on the fly, specify the -t option + on the command line. + + The MAX_REDIRS_DEFAULT is the default maximum number of redirects + followed. To override the default on the fly, specify the -r option + on the command line. The three SAML2_SP parameters define a SAML2 SP, that is, an SP with one or more SAML2 browser-facing endpoints in metadata. The - SAML AuthnRequest transmitted to the IdP contains the values of + SAML2 AuthnRequest transmitted to the IdP contains the values of these parameters. Note: An IdP reacts differently to requests from different SPs. Changing the values of these parameters may - produce different probe results. + produce different results. 
Similarly, the three SAML1_SP parameters define a SAML1 SP, that is, an SP with a SAML1 browser-facing endpoint in metadata. (Any given SP may support both SAML2 and SAML1, in which case the - SAML1_SP_ENTITY_ID parameter may be identical to the - SAML2_SP_ENTITY_ID parameter.) The script probes SAML1 endpoints - if the -a option is given on the command line. Omit that option - to probe SAML2 endpoints only. + SAML1_SP_ENTITY_ID config parameter may be identical to the + SAML2_SP_ENTITY_ID parameter.) - STDOUT + Typically the SAML2_SP and SAML1_SP config parameters correspond + to actual SPs that the IdP trusts, that is, for which the IdP has + consumed metadata. This is not strictly necessary, however. These + config parameters may be intentionally bogus in order to test the + resulting IdP response. + + STANDARD OUTPUT By default, the script outputs an abbreviated log to stdout (but this may be suppressed by use of the -q option). A line of standard output has the following space-delimited fields: - 1) code: a curl exit code - 2) output: a curl output string - 3) location: the URL of the probed SAML2 HTTP-Redirect endpoint - 4) liveness: IdP liveness indicator + 1. code: a curl exit code + 2. output: a curl output string + 3. location: the location of an IdP endpoint in metadata + 4. entityID: the entityID of the IdP See the curl man page (http://linux.die.net/man/1/curl) for a brief description of possible exit codes. - The output string has the following format: + The curl output string has the following format: - response:999;dns:9.999;tcp:9.999;ssl:9.999;total:9.999 + redirects:9;response:999;dns:9.999;tcp:9.999;ssl:9.999;total:9.999 - The response in the output string is the HTTP response code of the - probed web server. If the probe does not complete, the HTTP response - will be 000. 
The remaining four values in the output string are times - (in secs) computed by curl: + The redirects in the output string are the number of HTTP redirects + followed by this script. The response is the ultimate HTTP response + code. If the HTTP exchange does not complete, the HTTP response will + be 000 by convention. The remaining four values in the output string + are times (in secs) computed by curl: dns is the elapsed time up to and including the DNS lookup (curl time_namelookup variable) @@ -142,17 +221,12 @@ display_help () { See the curl man page (curl --write-out option) for detailed explanations of these timings. - The location is the actual URL probed by this script. It is - an HTTP-Redirect endpoint location in metadata. + The location field is the value of the Location XML attribute of a + browser-facing SSO endpoint in metadata. - The IdP liveness indicator takes on one of four values: DEAD, - UNRESPONSIVE, INDETERMINATE, or SUCCESS. By definition, a probe - succeeds (SUCCESS) if its exit code is 0. For our purposes, a - probe completely fails (DEAD) if its exit code is either 6 or 7. - (Exit code 6 indicates a DNS lookup failure while code 7 means - the host is unreachable on the network.) A probe that times out - (exit code 28) is labeled as UNRESPONSIVE. All other exit codes - are regarded as INDETERMINATE. + The entityID is the name of the IdP. An entityID is an arbitrary URI, + as given by the entityID XML attribute on the md:EntityDescriptor + element in SAML metadata. FILES @@ -160,22 +234,36 @@ display_help () { -d option is specified on the command line. The output files are written to the given OUT_DIR. - ${NO_SAML2_HTTP_ENDPOINT_FILENAME} + $IDP_ENDPOINTS_MISSING_LOG_FILENAME - A list of IdPs that do not expose a suitable SAML2 HTTP endpoint - location in metadata. A suitable endpoint supports one of the - following SAML2 HTTP bindings: HTTP-Redirect, HTTP-POST, or - HTTP-POST-SimpleSign. 
An IdP that supports SAML1 only will - necessarily appear on this list, and will therefore not be probed. + A list of IdPs that do not expose certain endpoint locations in + metadata. By default, all IdPs that lack a SAML2 HTTP-Redirect + endpoint will be listed in this file. Depending on the specified + command-line options, other bindings will appear in this file. + For example, if the -1 option is specified on the command line, + all IdPs that lack a SAML1 SSO endpoint will be listed instead. A line in the output file has the following space-delimited fields: - 1) entityID: the entityID of the IdP - 2) registrarID: the registrar ID + 1. binding: a SAML binding URI + 2. entityID: the entityID of the IdP + 3. registrarID: the registrar ID + The binding is one of the following binding URIs: + + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST-SimpleSign + urn:mace:shibboleth:1.0:profiles:AuthnRequest + The entityID is the name of the IdP. An entityID is an arbitrary URI, as given by the entityID XML attribute on the md:EntityDescriptor element in SAML metadata. + + A particular entityID may appear more than once in this file. For + instance, if the -2 option is specified on the command line, any + given IdP may be listed multiple times since the script attempts + to probe multiple SAML2 endpoints in that case. The registrarID is the name of the registrar that registered the IdP metadata in the first place. By convention, a registrar ID is an @@ -185,31 +273,90 @@ display_help () { the log file (which is why it is always the last field on any output line). - ${IDP_LOG_FILENAME} + $IDP_ENDPOINTS_LOG_FILENAME - A log of each probe. Each line records the result of the probe of - a single SAML IdP. A line in the log file has the following - space-delimited fields: + A log of each probed endpoint. Each line records the result of the + probe of a single IdP endpoint. 
A line in the log file has the + following space-delimited fields: + + 1. code: a curl exit code + 2. output: a curl output string + 3. location: the location of an IdP endpoint in metadata + 4. binding: the binding of an IdP endpoint in metadata + 5. entityID: the entityID of the IdP + 6. registrarID: the registrar ID + + The code, output, and location fields are the same as those printed + to stdout. + + The binding field is the value of the Binding XML attribute of a + browser-facing SSO endpoint in metadata. - 1) code: a curl exit code - 2) output: a curl output string - 3) location: a SAML2 HTTP-Redirect endpoint location - 4) entityID: the entityID of the Shibboleth IdP - 5) registrarID: the registrar ID + The entityID and the registrarID are the same as in the previous + file. - The code, output, and statusURL fields are the same as those printed - to stdout. + $IDP_ENDPOINTS_RESPONSIVE_FILENAME + + A list of responsive endpoints suitable for post-processing. + By definition, an IdP endpoint is responsive if the curl error + code is 0 and the HTTP response code is either 200 or 401. All + other endpoints are categorized as non-responsive. + + A line in the file has the following tab-delimited fields: + + 1. code: a curl error code + 2. response: an HTTP response code + 3. location: the location of an IdP endpoint in metadata + 4. binding: the binding of an IdP endpoint in metadata + 5. entityID: the entityID of the IdP + 6. displayName: the IdP display name + 7. orgName: the organization name + 8. registrarID: the registrar ID + + The code, location, binding, entityID, and registrarID fields are + the same as in the previous file. + + The response field - The location, entityID, and registrarID fields are the same as in the - previous output file. + The displayName field is the display name of the IdP. It is the + value of the mdui:DisplayName element in metadata. + + The orgName field is the name of the organization responsible + for the IdP. 
It is the value of the md:OrganizationName element + in metadata. - Examples: ${0##*/} -h - ${0##*/} \$id - ${0##*/} -t ${connect_timeout_default} \$id - cat \$id_file | ${0##*/} -v -t 4 - ${0##*/} -q -f /path/to/md_file.xml \$id1 \$id2 \$id3 + $IDP_ENDPOINTS_NOT_RESPONSIVE_FILENAME + + A list of non-responsive endpoints suitable for post-processing. + See the definition of responsive endpoint above. Any given + endpoint appears in exactly one of these two files. + + A line in the file has the following tab-delimited fields: + + 1. code: a curl error code + 2. response: an HTTP response code + 3. location: the location of an IdP endpoint in metadata + 4. binding: the binding of an IdP endpoint in metadata + 5. entityID: the entityID of the IdP + 6. displayName: the display name of the IdP + 7. orgName: the name of the organization responsible for the IdP + 8. registrarID: the registrar ID + + All fields are exactly the same as in the previous file. + + EXAMPLES + + \$ \$BIN_DIR/${0##*/} -h + \$ \$BIN_DIR/${0##*/} \$id1 \$id2 \$id3 + \$ \$BIN_DIR/${0##*/} -t ${CONNECT_TIMEOUT_DEFAULT} -r ${MAX_REDIRS_DEFAULT} \$id1 \$id2 \$id3 + \$ \$BIN_DIR/${0##*/} -v -3 \$id1 \$id2 \$id3 + \$ cat \$id_file | \$BIN_DIR/${0##*/} -f /path/to/md_file.xml -d /path/to/out_dir/ Note that the second and third examples above behave identically. + The fourth example probes all SSO endpoints, both SAML1 and SAML2, + for each IdP. The fifth example takes a list of identifiers on + standard input, retrieves metadata from a file, and writes its + output to both stdout and files in the given output directory. HELP_MSG } @@ -253,19 +400,67 @@ for lib_filename in $LIB_FILENAMES; do fi done -# basic curl defaults -connect_timeout_default=2; max_redirs_default=7 +# If env var TMPDIR exists, use it; otherwise +# create a new TMPDIR and use that instead. +if [ -z "$TMPDIR" ] || [ ! -d "$TMPDIR" ]; then + # create temporary directory + TMPDIR="$( make_temp_file -d )" + if [ ! 
-d "$TMPDIR" ] ; then + printf "ERROR: $script_name unable to create temporary dir\n" >&2 + exit 2 + fi +fi -# default binding URIs -binding_uris_default="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" +# use TMPDIR directory (remove trailing slash) +tmp_dir="${TMPDIR%%/}/probe_saml_idps_$$" + +# every run of this script gets its own subdir +if [ -d "$tmp_dir" ]; then + echo "ERROR: $script_name: directory already exists: $tmp_dir" >&2 + exit 2 +fi + +# create temporary subdirectory +/bin/mkdir "$tmp_dir" +status_code=$? +if [ $status_code -ne 0 ]; then + echo "ERROR: $script_name failed to create tmp dir ($status_code) $tmp_dir" >&2 + exit 2 +fi + +log_file="$tmp_dir/main_log.txt" + +# delayed logging +printf "$script_name using source lib directory: %s\n" "$LIB_DIR" >> "$log_file" +for lib_filename in $LIB_FILENAMES; do + lib_file="$LIB_DIR/$lib_filename" + printf "$script_name sourced lib file: %s\n" "$lib_file" >> "$log_file" +done # default config file -config_file_default="${script_bin}/.config_saml_idp_probe.sh" +config_file="${script_bin}/.config_saml_idp_probe.sh" + +# load config file +load_config -v "$config_file" >> "$log_file" +status_code=$? +if [ $status_code -ne 0 ]; then + echo "ERROR: $script_name failed to load $config_file" >&2 + exit 2 +fi + +# validate config parameters +validate_config -v >> "$log_file" +status_code=$? 
+if [ $status_code -ne 0 ]; then + echo "ERROR: $script_name failed to verify $config_file" >&2 + exit 2 +fi # output filenames -NO_SAML2_HTTP_ENDPOINT_FILENAME="idps-no-saml2-http-redirect-endpoint.txt" -IDP_LOG_FILENAME="idps-saml-log.txt" -IDP_NAMES_FILENAME="idp-names.txt" +IDP_ENDPOINTS_MISSING_LOG_FILENAME="idp-endpoints-missing-log.txt" +IDP_ENDPOINTS_LOG_FILENAME="idp-endpoints-log.txt" +IDP_ENDPOINTS_RESPONSIVE_FILENAME="idp-endpoints-responsive.txt" +IDP_ENDPOINTS_NOT_RESPONSIVE_FILENAME="idp-endpoints-non-responsive.txt" ERROR_LOG_FILENAME="error-log.txt" COMPATIBILITY_SCRIPT_FILENAME="compatibility.sh" @@ -274,39 +469,48 @@ COMPATIBILITY_SCRIPT_FILENAME="compatibility.sh" ####################################################################### help_mode=false; quiet_mode=false; verbose_mode=false -local_opts=; curl_opts= +local_opts=; md_file_mode=false +saml1_mode=false; saml2_mode=false connect_timeout=; max_time=; max_redirs= -binding_uris="$binding_uris_default" -while getopts ":hqvat:m:r:d:" opt; do +while getopts ":hvq123t:m:r:f:d:" opt; do case $opt in h) help_mode=true ;; + v) + quiet_mode=false + verbose_mode=true + local_opts="$local_opts -$opt" + ;; q) quiet_mode=true verbose_mode=false #local_opts="$local_opts -$opt" ;; - v) - quiet_mode=false - verbose_mode=true - local_opts="$local_opts -$opt" + 1) + saml1_mode=true + saml2_mode=false + ;; + 2) + saml1_mode=false + saml2_mode=true ;; - a) - binding_uris="$binding_uris urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST-SimpleSign" - binding_uris="$binding_uris urn:mace:shibboleth:1.0:profiles:AuthnRequest" + 3) + saml1_mode=true + saml2_mode=true ;; t) connect_timeout="$OPTARG" - curl_opts="$curl_opts -t $OPTARG" ;; m) max_time="$OPTARG" - curl_opts="$curl_opts -m $OPTARG" ;; r) max_redirs="$OPTARG" - curl_opts="$curl_opts -r $OPTARG" + ;; + f) + md_file_mode=true + md_path="$OPTARG" ;; d) OUT_DIR="$OPTARG" @@ -350,102 +554,69 @@ else exit 2 fi fi + # redirect stderr to a file + #$quiet_mode && 
exec 2>"$ERROR_LOG_FILE" fi -# check consistency of timeout options +# check mutual consistency of timeout options if [ -n "$max_time" -a -z "$connect_timeout" ]; then echo "ERROR: $script_name: the -m option requires the presence of the -t option" >&2 exit 2 fi -# set default connect timeout if necessary -if [ -z "$connect_timeout" ]; then - connect_timeout=$connect_timeout_default - curl_opts="$curl_opts -t $connect_timeout" -else - if [ "$connect_timeout" -le 0 ] ; then - echo "ERROR: $script_name: connect timeout ($connect_timeout) must be a positive integer" >&2 - exit 2 - fi +# use default connect timeout if necessary +[ -z "$connect_timeout" ] && connect_timeout=$CONNECT_TIMEOUT_DEFAULT + +# check reasonableness of connect timeout +if [ "$connect_timeout" -le 0 ] ; then + echo "ERROR: $script_name: connect timeout ($connect_timeout) must be a positive integer" >&2 + exit 2 fi # compute max time if necessary -if [ -z "$max_time" ]; then - max_time=$(( connect_timeout + 2 )) - curl_opts="$curl_opts -m $max_time" -else - if [ "$max_time" -le "$connect_timeout" ]; then - echo "ERROR: $script_name: max time ($max_time) must be greater than the connect timeout ($connect_timeout)" >&2 - exit 2 - fi +[ -z "$max_time" ] && max_time=$(( connect_timeout + 2)) + +# check reasonableness of max time +if [ "$max_time" -le "$connect_timeout" ]; then + echo "ERROR: $script_name: max time ($max_time) must be greater than the connect timeout ($connect_timeout)" >&2 + exit 2 fi -# check maximum number of redirects -if [ -z "$max_redirs" ]; then - max_redirs=$max_redirs_default - curl_opts="$curl_opts -r $max_redirs" +# use default max number of redirects if necessary +[ -z "$max_redirs" ] && max_redirs=$MAX_REDIRS_DEFAULT + +# check reasonableness of max number of redirects +if [ "$max_redirs" -le 0 ] ; then + echo "ERROR: $script_name: max number of redirects ($max_redirs) must be a positive integer" >&2 + exit 2 +fi + +# check the metadata file +if $md_file_mode; then + if [ 
-z "$md_path" ]; then + echo "ERROR: $script_name: option -f requires an argument" >&2 + exit 2 + fi + if [ ! -f "$md_path" ]; then + echo "ERROR: $script_name: file does not exist: $md_path" >&2 + exit 2 + fi fi if $verbose_mode; then printf "$script_name using connect timeout: %d secs\n" $connect_timeout printf "$script_name using max time: %d secs\n" $max_time printf "$script_name using max redirects: %d\n" $max_redirs + + $md_file_mode && printf "$script_name using metadata file: %s\n" $md_path fi -config_file="$config_file_default" -$verbose_mode && echo "$script_name using config file $config_file" - ##################################################################### # Initialization ##################################################################### -# report bootstrap operations -if $verbose_mode; then - printf "$script_name using source lib directory: %s\n" "$LIB_DIR" - for lib_filename in $LIB_FILENAMES; do - lib_file="$LIB_DIR/$lib_filename" - printf "$script_name sourcing lib file: %s\n" "$lib_file" - done -fi - -# determine temporary directory -if [ -n "$TMP_DIR" ] && [ -d "$TMP_DIR" ]; then - $verbose_mode && printf "$script_name using existing temporary dir: %s\n" "$TMP_DIR" - # use existing temporary directory (remove trailing slash) - tmp_dir="${TMP_DIR%%/}/probe_saml_idps_$$" -elif [ -n "$TMPDIR" ] && [ -d "$TMPDIR" ]; then - $verbose_mode && printf "$script_name using system temporary dir: %s\n" "$TMPDIR" - # use system temporary directory (remove trailing slash) - tmp_dir="${TMPDIR%%/}/probe_saml_idps_$$" -else - # create temporary directory - new_dir="$( make_temp_file -d )" - if [ ! 
-d "$new_dir" ] ; then - printf "ERROR: $script_name unable to create temporary dir\n" >&2 - exit 2 - fi - $verbose_mode && printf "$script_name using new temporary dir: %s\n" "$new_dir" - # use new temporary directory (remove trailing slash) - tmp_dir="${new_dir%%/}/probe_saml_idps_$$" -fi - -# every run of this script gets its own subdir -if [ -d "$tmp_dir" ]; then - echo "ERROR: $script_name: directory already exists: $tmp_dir" >&2 - exit 2 -fi - -# create temporary directory if necessary -$verbose_mode && printf "$script_name creating temporary subdir: %s\n" "$tmp_dir" -/bin/mkdir "$tmp_dir" -status_code=$? -if [ $status_code -ne 0 ]; then - echo "ERROR: $script_name failed to create tmp dir ($status_code) $tmp_dir" >&2 - exit 2 -fi - # temp file TODO: store each response separately? -HTTP_RESPONSE_FILE="${tmp_dir}/http_response.txt" +#HTTP_RESPONSE_FILE="${tmp_dir}/http_response.txt" # read the input into a temporary file IN_FILE="${tmp_dir}/tmp_entityids_in_$$.txt" @@ -463,64 +634,57 @@ else fi $verbose_mode && printf "$script_name processing temp input file: %s\n" "$IN_FILE" -# load config file -$verbose_mode && echo "$script_name loading config file $config_file" -load_config $local_opts "$config_file" -status_code=$? 
-if [ $status_code -ne 0 ]; then - echo "ERROR: $script_name failed to load $config_file" >&2 - exit 2 +# list of binding URIs (SSO endpoints to be probed) +binding_uris="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect" +if $saml2_mode && $saml1_mode; then + binding_uris="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST-SimpleSign + urn:mace:shibboleth:1.0:profiles:AuthnRequest" +elif $saml2_mode; then + binding_uris="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST + urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST-SimpleSign" +elif $saml1_mode; then + binding_uris="urn:mace:shibboleth:1.0:profiles:AuthnRequest" fi +printf "$script_name using binding(s): %s\n" "$binding_uris" >> "$log_file" -# validate config parameters -if [ -z "$MDQ_BASE_URL" ]; then - echo "ERROR: $script_name requires config param MDQ_BASE_URL" >&2 - exit 2 -fi -if [ -z "$SAML2_SP_ENTITY_ID" ]; then - echo "ERROR: $script_name requires config param SAML2_SP_ENTITY_ID" >&2 - exit 2 -fi -if [ -z "$SAML2_SP_ACS_URL" ]; then - echo "ERROR: $script_name requires config param SAML2_SP_ACS_URL" >&2 - exit 2 -fi -if [ -z "$SAML2_SP_ACS_BINDING" ]; then - echo "ERROR: $script_name requires config param SAML2_SP_ACS_BINDING" >&2 - exit 2 -fi -if [ -z "$SAML1_SP_ENTITY_ID" ]; then - echo "ERROR: $script_name requires config param SAML1_SP_ENTITY_ID" >&2 - exit 2 -fi -if [ -z "$SAML1_SP_ACS_URL" ]; then - echo "ERROR: $script_name requires config param SAML1_SP_ACS_URL" >&2 - exit 2 -fi -if [ -z "$SAML1_SP_ACS_BINDING" ]; then - echo "ERROR: $script_name requires config param SAML1_SP_ACS_BINDING" >&2 - exit 2 -fi +# for downstream functions with the same options +curl_opts="-t $connect_timeout" +curl_opts="$curl_opts -m $max_time" +curl_opts="$curl_opts -r $max_redirs" +printf "$script_name using curl options: %s\n" "$curl_opts" >> "$log_file" 
##################################################################### -# Helper functions +# Output functions ##################################################################### init_out_files () { $DO_NOT_PRINT_FILES && return # output files - NO_SAML2_HTTP_ENDPOINT_FILE="$OUT_DIR/$NO_SAML2_HTTP_ENDPOINT_FILENAME" - IDP_LOG_FILE="$OUT_DIR/$IDP_LOG_FILENAME" - IDP_NAMES_FILE="$OUT_DIR/$IDP_NAMES_FILENAME" + IDP_ENDPOINTS_MISSING_LOG_FILE="$OUT_DIR/$IDP_ENDPOINTS_MISSING_LOG_FILENAME" + IDP_ENDPOINTS_LOG_FILE="$OUT_DIR/$IDP_ENDPOINTS_LOG_FILENAME" + IDP_ENDPOINTS_RESPONSIVE_FILE="$OUT_DIR/$IDP_ENDPOINTS_RESPONSIVE_FILENAME" + IDP_ENDPOINTS_NOT_RESPONSIVE_FILE="$OUT_DIR/$IDP_ENDPOINTS_NOT_RESPONSIVE_FILENAME" ERROR_LOG_FILE="$OUT_DIR/$ERROR_LOG_FILENAME" COMPATIBILITY_SCRIPT_FILE="$OUT_DIR/$COMPATIBILITY_SCRIPT_FILENAME" # clean up from last time if necessary - /bin/rm -f "$NO_SAML2_HTTP_ENDPOINT_FILE" - /bin/rm -f "$IDP_LOG_FILE" - /bin/rm -f "$IDP_NAMES_FILE" - /bin/rm -f "$ERROR_LOG_FILE" + /bin/rm -f \ + "$IDP_ENDPOINTS_MISSING_LOG_FILE" \ + "$IDP_ENDPOINTS_LOG_FILE" \ + "$IDP_ENDPOINTS_RESPONSIVE_FILE" \ + "$IDP_ENDPOINTS_NOT_RESPONSIVE_FILE" \ + "$ERROR_LOG_FILE" + + # an empty file is better than no file + $_TOUCH \ + "$IDP_ENDPOINTS_MISSING_LOG_FILE" \ + "$IDP_ENDPOINTS_LOG_FILE" \ + "$IDP_ENDPOINTS_RESPONSIVE_FILE" \ + "$IDP_ENDPOINTS_NOT_RESPONSIVE_FILE" # redirect stderr to a file if $quiet_mode; then @@ -531,37 +695,43 @@ init_out_files () { # output cross-script compatibility $verbose_mode && printf "$script_name writing compatibility file: %s\n" "$COMPATIBILITY_SCRIPT_FILE" /bin/cat <<- COMPATIBILITY_SCRIPT > $COMPATIBILITY_SCRIPT_FILE - # exactly one of the following two global vars will be nonempty - MD_PATH=$md_path - MDQ_BASE_URL=$mdq_base_url + # config environment + MDQ_BASE_URL=$MDQ_BASE_URL + CONNECT_TIMEOUT_DEFAULT=$CONNECT_TIMEOUT_DEFAULT + MAX_REDIRS_DEFAULT=$MAX_REDIRS_DEFAULT + SAML2_SP_ENTITY_ID=$SAML2_SP_ENTITY_ID + 
SAML2_SP_ACS_URL=$SAML2_SP_ACS_URL + SAML2_SP_ACS_BINDING=$SAML2_SP_ACS_BINDING + SAML1_SP_ENTITY_ID=$SAML1_SP_ENTITY_ID + SAML1_SP_ACS_URL=$SAML1_SP_ACS_URL + SAML1_SP_ACS_BINDING=$SAML1_SP_ACS_BINDING + # command-line parameters + md_path=$md_path + connect_timeout=$connect_timeout + max_time=$max_time + max_redirs=$max_redirs # output files - NO_SAML2_HTTP_ENDPOINT_FILE=${NO_SAML2_HTTP_ENDPOINT_FILE} - IDP_LOG_FILE=${IDP_LOG_FILE} - IDP_NAMES_FILE=$IDP_NAMES_FILE + IDP_ENDPOINTS_MISSING_LOG_FILE=$IDP_ENDPOINTS_MISSING_LOG_FILE + IDP_ENDPOINTS_LOG_FILE=$IDP_ENDPOINTS_LOG_FILE + IDP_ENDPOINTS_RESPONSIVE_FILE=$IDP_ENDPOINTS_RESPONSIVE_FILE + IDP_ENDPOINTS_NOT_RESPONSIVE_FILE=$IDP_ENDPOINTS_NOT_RESPONSIVE_FILE ERROR_LOG_FILE=$ERROR_LOG_FILE # temporary output directory tmp_dir="$tmp_dir" COMPATIBILITY_SCRIPT } -print_idp_names_logfile () { +print_idp_endpoint_missing_logfile () { $DO_NOT_PRINT_FILES && return - local names=$1 - - printf "%s\n" "$names" >> "$IDP_NAMES_FILE" -} - -print_no_saml2_http_endpoint_logfile () { - $DO_NOT_PRINT_FILES && return + local binding=$1 + local entityID=$2 + local registrarID=$3 - local entityID=$1 - local registrarID=$2 - - printf "%s %s\n" "$entityID" "$registrarID" >> "$NO_SAML2_HTTP_ENDPOINT_FILE" + printf "%s %s %s\n" "$binding" "$entityID" "$registrarID" >> "$IDP_ENDPOINTS_MISSING_LOG_FILE" } -print_logfile () { +print_idp_endpoint_logfile () { $DO_NOT_PRINT_FILES && return local curl_status_code=$1 @@ -571,120 +741,292 @@ print_logfile () { local entityID=$5 local registrarID=$6 - printf "%s %s %s " "$curl_status_code" "$curl_output" "$location" >> "$IDP_LOG_FILE" - printf "%s %s %s\n" "$binding" "$entityID" "$registrarID" >> "$IDP_LOG_FILE" + printf "%s %s %s " "$curl_status_code" "$curl_output" "$location" >> "$IDP_ENDPOINTS_LOG_FILE" + printf "%s %s %s\n" "$binding" "$entityID" "$registrarID" >> "$IDP_ENDPOINTS_LOG_FILE" +} + +print_idp_endpoint_list () { + $DO_NOT_PRINT_FILES && return + + # command-line arguments + local 
errorCode + local responseCode + local location + local binding + local entityID + local displayName + local orgName + local registrarID + + # command-line options + local out_file + + # process command-line options (if any) + local OPTARG + local OPTIND + local opt + while getopts ":f:" opt; do + case $opt in + f) + out_file="$OPTARG" + ;; + \?) + echo "ERROR: $FUNCNAME: Unrecognized option: -$OPTARG" >&2 + return 2 + ;; + :) + echo "ERROR: $FUNCNAME: Option -$OPTARG requires an argument" >&2 + return 2 + ;; + esac + done + + shift $(( OPTIND - 1 )) + + errorCode=$1 + responseCode=$2 + location=$3 + binding=$4 + entityID=$5 + displayName="$6" + orgName="$7" + registrarID=$8 + + printf "%s\t%s\t%s\t%s\t" "$errorCode" "$responseCode" "$location" $binding >> "$out_file" + printf "%s\t%s\t%s\t%s\n" $entityID "$displayName" "$orgName" $registrarID >> "$out_file" } ##################################################################### -# Main processing +# Helper functions ##################################################################### -init_out_files +# depends on: +# md_tools.sh +# http_tools.sh +# extract_entity.xsl +# +get_entity_descriptor () { -if $verbose_mode; then - num_entityIDs=$( /bin/cat $IN_FILE | /usr/bin/wc -l ) - printf "$script_name processing %d entityIDs\n" $num_entityIDs -fi + local status_code -# iterate over all entityIDs in the input file -/bin/cat $IN_FILE | while read entityID; do - - # get the entity descriptor for this entityID - entityDescriptor=$( getEntityFromServer -T "$tmp_dir" -u "$MDQ_BASE_URL" $entityID ) - return_code=$? 
- if [ "$return_code" -ne 0 ]; then - echo "ERROR: $script_name: unable to obtain metadata for entityID: $entityID" >&2 - [ "$return_code" -gt 1 ] && exit 1 - continue + # get entity metadata for this entityID + if $md_file_mode; then + entityDescriptor=$( getEntityFromFile -f "$md_path" $entityID ) + else + entityDescriptor=$( getEntityFromServer -T "$tmp_dir" -u "$MDQ_BASE_URL" $entityID ) fi - - # short-circuit the while-loop if this is not an IdP - if ! echo "$entityDescriptor" | $_GREP -Fq 'IDPSSODescriptor '; then - #print_no_idp_role_logfile "$entityID" "$registrarID" - echo "WARNING: $script_name: entity is not an IdP: $entityID" >&2 - continue + status_code=$? + if [ "$status_code" -ne 0 ]; then + echo "ERROR: $FUNCNAME: unable to obtain metadata for entity: $entityID" >&2 + [ "$status_code" -gt 1 ] && return 3 + return 1 fi - # extract the registrar ID from the entity descriptor - registrarID=$( echo "$entityDescriptor" \ - | $_GREP -F -m 1 ' registrationAuthority=' \ - | $_SED -e 's/^.* registrationAuthority="\([^"]*\)".*$/\1/' - ) + return 0 +} + +# depends on: +# md_tools.sh +# entity_endpoints_txt.xsl +# entity_idp_names_txt.xsl +# +parse_entity_descriptor () { + + local status_code + local names - # if there is no registrar ID, work around it and continue processing - if [ -z "$registrarID" ]; then - registrarID=NULL + # short-circuit if this entity is not an IdP + if ! echo "$entityDescriptor" | $_GREP -Eq '<(md:)?IDPSSODescriptor '; then + echo "WARNING: $FUNCNAME: entity is not an IdP: $entityID" >&2 + return 1 fi - # compute all SSO endpoints + # list all the IdP SSO endpoints in the entity descriptor endpoints=$( echo "$entityDescriptor" \ - | $_GREP -E '<(md:)?SingleSignOnService ' + | listEndpoints \ + | filterEndpoints -r IDPSSODescriptor -t SingleSignOnService ) + status_code=$? 
+ if [ "$status_code" -ne 0 ]; then + echo "ERROR: $FUNCNAME: unable to obtain IdP SSO endpoints for entity: $entityID" >&2 + return 3 + fi - # iterate over the SAML2 browser-facing SSO endpoints - has_no_saml_http_endpoints=true - for binding_uri in $binding_uris; do - - # compute the SAML2 SSO endpoint - endpoint=$( echo "$endpoints" \ - | $_GREP -F -m 1 ' Binding="'$binding_uri'"' - ) - if [ -z "$endpoint" ]; then - $verbose_mode && printf "$script_name: no endpoint with Binding=\"%s\"\n" "$binding" - continue - fi - has_no_saml_http_endpoints=false - - # compute the endpoint location and binding - location=$( echo "$endpoint" \ - | $_SED -e 's/^.* Location="\([^"]*\)".*$/\1/' - ) - binding=$( echo "$endpoint" \ - | $_SED -e 's/^.* Binding="\([^"]*\)".*$/\1/' - ) - $verbose_mode && printf "$script_name probing endpoint with Location=\"%s\" and Binding=\"%s\"\n" "$location" "$binding" - - # probe the endpoint - output=$( probe_saml_idp_endpoint $curl_opts \ - -T "$tmp_dir" \ - $location $binding SingleSignOnService - ) - exit_status=$? - if [ "$exit_status" -ne 0 ]; then - echo "ERROR: $script_name: probe_saml_idp_endpoint failed ($exit_status)" >&2 - exit 3 - fi + # every IdP MUST have at least one SSO endpoint + if [ -z "$endpoints" ]; then + echo "ERROR: $FUNCNAME: entity has no IdP SSO endpoints: $entityID" >&2 + return 4 + fi + + # extract the IdP names (for logging purposes) + names=$( echo "$entityDescriptor" | extractIdPNames ) + status_code=$? 
+    if [ "$status_code" -ne 0 ]; then
+        echo "ERROR: $FUNCNAME: unable to obtain IdP names for entity: $entityID" >&2
+        return 5
+    fi
-        # parse the results
-        curl_error_code=$( echo "$output" | $_CUT -f1 -d" " )
-        http_response_code=$( echo "$output" | $_CUT -f2 -d" " | $_SED -e 's/^.*;response:\([^;]*\).*$/\1/' )
-        location=$( echo "$output" | $_CUT -f3 -d" " )
+    # IdP mdui:DisplayName
+    displayName=$( echo "$names" | $_CUT -f2 )
+    [ -z "$displayName" ] && displayName=NULL
-        # interactive output
-        #printf "%d %d %s %s %s\n" "$curl_error_code" "$http_response_code" "$location" "$entityID" "$registrarID"
-        printf "%s %s %s %s\n" $( echo "$output" | $_CUT -f1-3 -d" " ) $entityID
+    # md:OrganizationName is best for metadata registered by InCommon
+    # (admittedly, should be using md:OrganizationDisplayName instead)
+    orgName=$( echo "$names" | $_CUT -f3 )
+    [ -z "$orgName" ] && orgName=NULL
-        # file output
-        print_logfile $output $entityID $registrarID
-
-    done
+    # mdrpi:RegistrationInfo/@registrationAuthority
+    registrarID=$( echo "$names" | $_CUT -f5 )
+    [ -z "$registrarID" ] && registrarID=NULL
-    if $has_no_saml_http_endpoints; then
-        print_no_saml2_http_endpoint_logfile "$entityID" "$registrarID"
-        continue
+    return 0
+}
+
+# depends on:
+#   md_tools.sh
+#
+get_sso_endpoint () {
+
+    # compute the endpoint location for this binding
+    location=$( echo "$endpoints" \
+        | filterEndpoints -b $binding \
+        | listEndpointLocations \
+        | /usr/bin/head -n 1
+    )
+
+    # if there is no endpoint location, skip this binding
+    if [ -z "$location" ]; then
+        echo "WARNING: $FUNCNAME: entity has no SAML ${binding##*:} endpoint: $entityID" >&2
+        # log missing endpoint
+        #print_idp_endpoint_list -f "$IDP_ENDPOINTS_MISSING_LOG_FILE" \
+        #    "" "" MISSING $binding $entityID "$displayName" "$orgName" $registrarID
+        print_idp_endpoint_missing_logfile $binding $entityID $registrarID
+        return 1
     fi
+
+    return 0
+}
-    # extract the IdP names and print them to a file
-    names=$( echo "$entityDescriptor" \
-        | /usr/bin/xsltproc $LIB_DIR/extract_IdP_names.xsl -
+# depends on:
+#   saml_tools.sh
+#   http_tools.sh
+#
+probe_sso_endpoint () {
+
+    local output
+    local status_code
+    local curl_error_code
+    local curl_output
+    local http_response_code
+
+    # probe the endpoint
+    output=$( probe_saml_idp_endpoint $curl_opts \
+        -T "$tmp_dir" \
+        $location $binding SingleSignOnService
     )
     status_code=$?
     if [ "$status_code" -ne 0 ]; then
-        echo "ERROR: $script_name: unable to extract IdP names for entityID: $entityID" >&2
+        echo "ERROR: $FUNCNAME: endpoint probe failed ($status_code): $location" >&2
+        return 3
+    fi
+
+    # sanity check: the location and binding in the output
+    # must match the location and binding in metadata
+
+    # log output
+    print_idp_endpoint_logfile $output $entityID $registrarID
+
+    curl_error_code=$( echo "$output" | $_CUT -f1 -d" " )
+    curl_output=$( echo "$output" | $_CUT -f2 -d" " )
+    http_response_code=$( echo "$curl_output" | $_SED -e 's/^.*;response:\([^;]*\).*$/\1/' )
+
+    # log endpoints for post-processing
+    if echo "$output" | $_GREP -Eq ';response:(200|401);'; then
+
+        print_idp_endpoint_list -f "$IDP_ENDPOINTS_RESPONSIVE_FILE" \
+            $curl_error_code $http_response_code $location $binding \
+            $entityID "$displayName" "$orgName" $registrarID
+    else
+
+        print_idp_endpoint_list -f "$IDP_ENDPOINTS_NOT_RESPONSIVE_FILE" \
+            $curl_error_code $http_response_code $location $binding \
+            $entityID "$displayName" "$orgName" $registrarID
+    fi
+
+    # interactive output
+    #printf "%d %d %s %s %s\n" "$curl_error_code" "$http_response_code" "$location" "$entityID" "$registrarID"
+    printf "%s %s %s %s\n" $( echo "$output" | $_CUT -f1-3 -d" " ) $entityID
+
+    return
+}
+
+#####################################################################
+# Main processing
+#####################################################################
+
+init_out_files
+
+if $verbose_mode; then
+    num_entityIDs=$( /bin/cat $IN_FILE | /usr/bin/wc -l )
+    printf "$script_name processing %d entityIDs\n" $num_entityIDs
+fi
+
+# Given a list of entityIDs, and a variable list of browser-facing
+# HTTP protocol bindings, iterate over each entityID as follows:
+#
+#   1. Get entity metadata
+#   2. Parse entity metadata
+#   3. For each binding:
+#      a. Get the corresponding SSO endpoint from metadata
+#      b. If the location exists, probe the SSO endpoint
+#
+# The inner loop iterates an indeterminate number of times
+# depending on the list of bindings (which is influenced by
+# various command-line options) and the actual endpoints in
+# metadata.
+#
+/bin/cat $IN_FILE | while read entityID; do
+
+    # if status_code > 1, a fatal error occurred
+
+    # get entity metadata
+    get_entity_descriptor
+    status_code=$?
+    if [ "$status_code" -ne 0 ]; then
+        [ "$status_code" -gt 1 ] && exit "$status_code"
+        continue
+    fi
+
+    # parse entity metadata
+    parse_entity_descriptor
+    status_code=$?
+    if [ "$status_code" -ne 0 ]; then
+        [ "$status_code" -gt 1 ] && exit "$status_code"
         continue
     fi
-    print_idp_names_logfile "$names"
+
+    # iterate over the SAML2 browser-facing SSO endpoints
+    for binding in $binding_uris; do
+
+        # get the corresponding SSO endpoint from metadata
+        get_sso_endpoint
+        status_code=$?
+        if [ "$status_code" -ne 0 ]; then
+            [ "$status_code" -gt 1 ] && exit "$status_code"
+            continue
+        fi
+
+        # probe the SSO endpoint
+        probe_sso_endpoint
+        status_code=$?
+        if [ "$status_code" -ne 0 ]; then
+            [ "$status_code" -gt 1 ] && exit "$status_code"
+            continue
+        fi
+
+    done
 done
-exit 0
+# status code inherits from while loop child process
+exit