Reloading SAML: IdP Discovery

This post is somewhat different from the other posts in this series. Most of the concepts discussed here are not specific to SAML and can be used with other protocols as well. Likewise, some of the real-world examples used to describe general concepts are not related to SAML; I used them because they are best suited to illustrate those concepts.

As is the tradition throughout this series, let’s first try to understand why we need IdP discovery. Assume you have the following deployment, where there is only one IdP in the system.

In the above deployment, all the SPs know the location and configuration details of the IdP from metadata (SAML IdP metadata) or by some other means. Whenever a user tries to log in to an SP, the SP simply redirects the user to the IdP along with a SAML request. In this case, all users are redirected to a single IdP.

Now let’s look at the following deployment with three IdPs.

In this example, all customers of a business are managed by an IdP called public.idp.com, employees of the business are managed by an IdP called internal.idp.com, and partners of the business are managed by partner.idp.com.

In the above case, whenever a user tries to log in, the SP has to determine the corresponding IdP among the three so that the user is redirected to the correct one; otherwise the user cannot log in to the application. For example, when a customer tries to log in, the customer needs to be redirected to the public.idp.com IdP, while a partner needs to be redirected to the partner.idp.com IdP, and so on.

At this point, you have an idea of what IdP discovery is and why it is needed in practice. From the example above, one could easily conclude that IdP discovery is the responsibility of the SP, but that is not always correct; there are other places where IdP discovery can take place. Since I’m not aware of any standard name, I call these IdP discovery topologies. In the next section we will look at each of these options.

IdP Discovery Topologies

In the first topology, the IdP discovery process takes place at the SP level: the SP either makes the IdP selection on its own or involves the user in the selection.

In the second topology, the SP delegates the IdP discovery task to a third-party discovery service, which makes the IdP selection on its own or with help from the user.

In the third topology, the SP simply forwards the SAML request to a proxy service. The proxy service has the knowledge to make the IdP selection, and the SP is not aware of the selected IdP. However, this pattern is very hard to put into practice because of a number of practical limitations.

The above classification is based on the topology of the components, that is, the point at which IdP discovery is performed. We can come up with another classification based on who is responsible for, or has the best knowledge to determine, the correct IdP from the list of available IdPs.

IdP discovery options based on the responsible actor

In the first option, the user is the responsible actor: the individual who is trying to log in is best placed to choose the correct IdP once the SP presents a list of IdPs.

In practice, an individual can have multiple accounts across multiple IdPs or a single account in one IdP; this varies from one use case to another.

Example — 1

(Source: https://app.franceconnect.gouv.fr. This example only illustrates the concept; it does not mean the application uses SAML or SAML IdP discovery.)

In FranceConnect, the e-ID scheme used in France, a user needs to pick his/her identity provider first, and then the system redirects the user to that IdP. In this case, the user makes the IdP selection.

Example -2

(The above example only illustrates the concept; it does not mean the application uses SAML or SAML IdP discovery.)

In this example, the SP lets users log in via social networks. In contrast to the previous example, here it is valid for someone to have accounts with several social networks, so one can log in to the SP via Facebook today and via Twitter tomorrow.

This use case is more complex than the previous one, because one individual can be associated with multiple social networks, but the social networks themselves have no way to make this association. In other words, there is no relationship between my Facebook identity and my Twitter identity, so it is the SP’s responsibility to associate these accounts with a single identity at the SP level. Without that support, one ends up with two identities when logging in to the SP via Facebook and via Twitter.

In the second option, the SP is the responsible actor: it determines the correct IdP itself, either by evaluating the available context details of the user or by asking the user some questions. Depending on the behaviour of the user, this approach can be broken into two.

Passive user: In this case, the SP determines the correct IdP by evaluating the context details available to it at the time the user accesses the SP. In most cases the IdP discovery process is transparent to the user. The following are some examples of context details that help with IdP selection.

  1. Request URL: Inputs for IdP selection, such as the tenant domain of the user, can be encoded in the request URL. For example, the “helloworld” part of the URL https://helloworld.my.salesforce.com/ is used by Salesforce to identify the tenant of the user and also to determine the correct IdP for the user.
  2. Using a shared cookie: In deployments where a common domain exists between the IdPs and the SP, the SP can use a common cookie to determine the correct IdP.
  3. Using the previous session: The SP can identify the IdP from an existing user session left by a previous login, or from cookie data created during that session.
  4. IP address/ranges: In deployments where a mapping between IP ranges and IdPs exists, the SP can determine the correct IdP by evaluating the IP address of the incoming request.
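
To make the passive options more concrete, here is a minimal Python sketch of an SP that first tries the request URL and then the client IP range. Everything in it is an assumption made for illustration: the tenant table, the IP ranges, the "/saml" identifiers and the function names are hypothetical, and a real SP would load such mappings from its own configuration.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical mappings, for illustration only.
TENANT_IDPS = {
    "helloworld": "https://idp.helloworld.example/saml",
}
NETWORK_IDPS = [
    (ipaddress.ip_network("10.0.0.0/8"), "https://internal.idp.com/saml"),
    (ipaddress.ip_network("203.0.113.0/24"), "https://partner.idp.com/saml"),
]
DEFAULT_IDP = "https://public.idp.com/saml"


def idp_from_url(request_url: str) -> Optional[str]:
    """Treat the left-most host label of the request URL as the tenant key."""
    host = urlsplit(request_url).hostname or ""
    tenant = host.split(".")[0].lower()
    return TENANT_IDPS.get(tenant)


def idp_from_ip(client_ip: str) -> Optional[str]:
    """Return the IdP mapped to the first network range containing the client."""
    address = ipaddress.ip_address(client_ip)
    for network, idp in NETWORK_IDPS:
        if address in network:
            return idp
    return None


def discover_idp(request_url: str, client_ip: str) -> str:
    """Try the URL first, then the IP range, then fall back to a default IdP."""
    return idp_from_url(request_url) or idp_from_ip(client_ip) or DEFAULT_IDP
```

The order of the checks is a design choice, not a rule; a deployment could just as well give the IP range priority over the URL.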

Active user: In this case, the user actively participates in the IdP discovery process by providing inputs to the SP. The identity-first login used by Google is one of the best examples: Google first asks for the username only, then evaluates it and redirects the user to the correct IdP. Another approach is to present a list of options, such as a list of departments or countries, so that the SP can determine the correct IdP from the user’s selection.

IdP Discovery based on common domain cookie (CDC)

In this approach, the SP identifies the correct IdP using a browser cookie that was previously set by the IdP. According to the HTTP specification, cookies are bound to a specific domain, which means that in order to share a cookie between the SP and the IdP there must be a common domain between them; otherwise this approach won’t work. In the above diagram, example.com is common to both hr.example.com and idp.example.com, hence any cookie set for example.com is visible to both.

Typical IdP discovery based on a common domain cookie (CDC) consists of the following steps.

  1. The IdP includes a component that can write a cookie in the common domain; we can call this component the CDC Writer. After a successful user login, the IdP writes the CDC with a unique IdP identifier that the SP can understand.
  2. The SP includes a component that can read the common cookie set by the IdP; we can call this component the CDC Reader. When the user comes back to log in to the same SP, or to another SP that fulfils the common domain requirement, the SP checks whether the CDC exists. If it does, the SP identifies the correct IdP from the IdP identifier encoded in the cookie.

In practice, you can use any custom cookie with a custom convention, or you can use the SAML2 Identity Provider Discovery Profile defined in section 4.3 of the SAML Profiles specification. Later in this post we will discuss the SAML2 IdP discovery profile in detail.

Because of the hard requirement for a common domain between the SP and the IdP, this approach is not very popular in the industry.

IdP Discovery based on the user identifier

If you have been a Gmail user for some time, you have probably noticed the following change made to the login experience a few years back.

Previously, you had to provide both username and password at once. Now Gmail asks for your identity first, and then, depending on the identifier, redirects you to the correct IdP for authentication. For example, if you use yourname@gmail.com, Google recognises that this identity is associated with its own IdP and proceeds with authentication by asking for the password. If you use yourname@yourcompany.com as the identifier, it identifies yourcompany.com as the domain and redirects you to the IdP that belongs to the yourcompany.com organization.

The above pattern is known as identity-first login. An SP can use the same approach to determine the correct IdP for a user who is trying to log in: it first asks for the user’s identifier and then determines the correct IdP from it.
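
As a rough illustration of identity-first login, the following Python sketch picks an IdP from the domain part of the identifier. The domain table, the "local" marker for the SP’s own login and the IdP URL are hypothetical values chosen for this example; they do not describe how Google implements it.

```python
from typing import Optional

# Hypothetical domain-to-IdP table, for illustration only.
LOCAL_LOGIN = "local"
DOMAIN_IDPS = {
    "gmail.com": LOCAL_LOGIN,
    "yourcompany.com": "https://idp.yourcompany.com/saml",
}


def idp_for_identifier(identifier: str) -> Optional[str]:
    """Return the IdP for an email-style identifier, or None if unknown."""
    local_part, separator, domain = identifier.strip().rpartition("@")
    if not separator or not local_part or not domain:
        return None  # not an email-style identifier
    return DOMAIN_IDPS.get(domain.lower())
```

When the function returns None, the SP could fall back to another option, such as showing a list of IdPs to the user.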

IdP Discovery based on user inputs

(Source: https://eunode.qa.sveidas.se. The above example only illustrates the concept; it does not mean the application uses SAML or SAML IdP discovery.)

The above example is from the Swedish eIDAS demo portal, which facilitates cross-border digital authentication for European citizens. If we consider this portal an SP, it presents a list of countries so that the user can pick the country where his/her e-ID is established.

Typically a citizen has an e-ID in only a single country, so once he/she selects a specific country (an IdP), there is a high chance that he/she will pick the same country again. In such cases the SP can store the user’s selection locally and provide a much better user experience the next time he/she returns.

In the discovery service topology, when a user tries to log in, the SP simply redirects the user to a separately deployed discovery service along with the required information, and the discovery service is responsible for selecting the correct IdP. Once the discovery service has made its selection, it sends the user back to the SP with the discovery result so that the SP can redirect the user to the correct IdP.

The following diagram illustrates one such implementation based on a cookie. Unlike the CDC-based approach discussed previously, here we don’t need a common domain between the SP and the IdP.

This approach consists of the following procedure.

  1. After successful user authentication, the user is redirected to the separately deployed discovery service with the IdP identification details.
  2. The discovery service sets a cookie with the IdP identification details.
  3. When a user tries to log in, the SP redirects the user to the discovery service.
  4. The discovery service looks for a previously created cookie; if one is found, it reads the IdP identification details from the cookie.
  5. The discovery service sends the user back to the SP with the identified IdP details so that the SP can redirect the user to the correct IdP for authentication.

When it comes to implementation, you can come up with your own custom cookie and custom discovery service, or you can follow the SAML2 Identity Provider Discovery Service Protocol and Profile specification, which provides standard semantics for this approach. We will discuss this specification at the end of this post.

In the proxy topology, the SP simply sends the authentication request (SAML request) to a proxy service, and the proxy service is responsible for figuring out the correct IdP and forwarding the request. The old WAYF (Where Are You From) model used in Shibboleth is a good example of this approach, but it is no longer a popular option and we are not going to discuss it further.

Section 4.3 of the Profiles for SAML 2.0 specification defines a profile called “Identity Provider Discovery”, which an SP can use to identify the correct IdP for a user based on a common cookie shared between the IdP and the SP.

In a previous section of this post (IdP Discovery based on common domain cookie) we already discussed the generic architecture of this common cookie-based discovery approach. The following diagram is repeated here as a reminder of that architecture.

The Identity Provider Discovery profile standardizes this approach as a SAML specification. The most important points are discussed below.

  • The objective of this profile is to help an SP discover the correct IdP that a particular user is associated with.
  • This discovery profile can be used with web browser-based SSO; in other words, it assumes the presence of a web browser as the user agent.
  • In this profile, a common domain between the SP and the IdP is mandatory, because without one there is no way to share a cookie between the SP and the IdP.
  • The cookie used in this profile is called the Common Domain Cookie (CDC), and it should follow the semantics below.
  • The IdP should be able to write this cookie in the common domain after successful user authentication. If the cookie is already present, the IdP should append its unique identifier to the IdP list within the cookie.
  • The SP should be able to read and process this cookie to identify the correct IdP associated with the user.
  • The CDC name must be “_saml_idp”.
  • The CDC can hold one or more base-64 encoded URI values separated by a single space character; each URI should uniquely represent an IdP.
  • The CDC’s path prefix should be “/”.
  • The CDC’s domain should be set to the common domain.
  • The CDC must be marked as secure.
  • The CDC can be session-only or persistent.
  • This profile does not define a standard way for the IdP to write the CDC or for the SP to read it; usually both can be achieved by redirecting within the same domain.
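
The cookie semantics above can be sketched in Python as follows. This is only an illustration under a few assumptions: the function names are hypothetical, the final URL-encoding of the cookie value reflects my reading of section 4.3 of the SAML Profiles specification, and moving an already-listed IdP to the end of the list is a choice made for this sketch.

```python
import base64
from http.cookies import SimpleCookie
from typing import List
from urllib.parse import quote, unquote

CDC_NAME = "_saml_idp"


def decode_cdc(cookie_value: str) -> List[str]:
    """CDC Reader: return the IdP URIs stored in the cookie, oldest first."""
    if not cookie_value:
        return []
    items = unquote(cookie_value).split(" ")
    return [base64.b64decode(item).decode("utf-8") for item in items if item]


def encode_cdc(idp_uris: List[str]) -> str:
    """Base-64 encode each URI, join with single spaces, then URL-encode."""
    encoded = [base64.b64encode(uri.encode("utf-8")).decode("ascii") for uri in idp_uris]
    return quote(" ".join(encoded), safe="")


def append_idp(cookie_value: str, idp_uri: str) -> str:
    """CDC Writer: put idp_uri at the end of the list, without duplicates."""
    uris = [uri for uri in decode_cdc(cookie_value) if uri != idp_uri]
    uris.append(idp_uri)
    return encode_cdc(uris)


def build_set_cookie(cookie_value: str, common_domain: str) -> str:
    """Build a Set-Cookie header value for a session-only, secure CDC."""
    cookie = SimpleCookie()
    cookie[CDC_NAME] = cookie_value
    cookie[CDC_NAME]["domain"] = common_domain
    cookie[CDC_NAME]["path"] = "/"
    cookie[CDC_NAME]["secure"] = True
    return cookie[CDC_NAME].OutputString()
```

An SP that reads the cookie would typically treat the last entry returned by decode_cdc as the most recently used IdP, although how it uses the list is up to the SP.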

However, the requirement for a common domain between the SP and the IdP has reduced the usability of this profile. The SAML2 Identity Provider Discovery Service Protocol and Profile, which we will look at next, is more flexible.

Although the names and purposes are similar, the Identity Provider Discovery Service Protocol and Profile is entirely different from the SAML2 Identity Provider Discovery Profile defined in section 4.3 of the SAML2 Profiles specification; it was produced as a separate specification in 2008.

For simplicity, I will call this profile the Discovery Protocol and the previously discussed approach the CDC Discovery profile. First, let’s look at some of the differences between the two.

  • The CDC Discovery profile assumes only two roles, IdP and SP; in contrast, the Discovery Protocol assumes three roles: IdP, SP and a centralized Discovery Service.
  • In the CDC Discovery profile a common domain between the SP and the IdP is mandatory, but the Discovery Protocol has no such requirement, which makes it much more flexible.
  • The CDC Discovery profile is based on a cookie, but the Discovery Protocol does not mandate the use of a cookie; the discovery service can choose any suitable approach (a cookie can be used as well).
  • The wire protocol used in the Discovery Protocol depends on browser redirects.

As with the previous profile, the main objective of this profile is to help an SP discover the correct IdP that a particular user is associated with. It can be used with web browser-based SSO; in other words, it also assumes the presence of a web browser as the user agent.

The following diagram describes the interactions between the players of this profile.

Step 1: Request to the discovery service from the SP

  1. The SP redirects the user to the discovery service with an HTTP GET request.
  2. The parameters required by the discovery service are attached to this GET request as URL-encoded query string parameters. These are the possible parameters:
  • entityID (mandatory): the unique identifier of the SP.
  • return: a URL, possibly with a query string of its own; once the discovery process is completed, the discovery service redirects the user to this location.
  • returnIDParam: the name of the parameter used to return the unique identifier of the selected IdP to the SP. If this parameter is omitted, it defaults to “entityID”.
  • isPassive: indicates whether the discovery service is allowed to interact with the user; the default value is false.
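
The request in Step 1 can be sketched in Python like this. The parameter names come from the list above; the function name, the discovery service URL and the SP identifiers in the usage example are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def build_discovery_request(
    discovery_service_url: str,
    sp_entity_id: str,
    return_url: Optional[str] = None,
    return_id_param: Optional[str] = None,
    is_passive: bool = False,
) -> str:
    """Build the URL the SP redirects the browser to (HTTP GET)."""
    params = {"entityID": sp_entity_id}  # the only mandatory parameter
    if return_url:
        params["return"] = return_url
    if return_id_param:
        params["returnIDParam"] = return_id_param
    if is_passive:
        params["isPassive"] = "true"  # omitted otherwise, since false is the default
    # urlencode takes care of URL-encoding each value.
    return discovery_service_url + "?" + urlencode(params)


# Usage with hypothetical endpoints:
url = build_discovery_request(
    "https://ds.example.org/discovery",
    "https://sp.example.com/shibboleth",
    return_url="https://sp.example.com/login?target=abc",
    is_passive=True,
)
```

The sketch assumes the discovery service URL has no query string of its own; if it had one, the separator would need to be "&" instead of "?".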

Step 2: The discovery service determines the correct IdP

In this step the discovery service determines the correct IdP for the user. Depending on the value of the isPassive parameter, the discovery service may or may not interact directly with the user.

If the discovery service uses a single cookie to determine the correct IdP, the cookie name must be “_saml_idp”.

Step 3: The discovery service sends the user back to the SP with the discovery result

During this step, the discovery service sends the user back to the SP via the URL specified in the return parameter, with the URI of the selected IdP set as the value of the entityID query parameter. Note that if returnIDParam is present, the discovery service uses that value instead of entityID as the query parameter name.

NOTE: To improve security, especially against phishing attacks, it is encouraged to use SAML metadata to specify the return URL instead of sending it via the browser (in the return parameter). If both the metadata and the return parameter are present, the discovery service should use the URL specified in the metadata.
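
A small Python sketch of Step 3 is shown below: one function for the discovery service to build the redirect back to the SP, and one for the SP to read the result. The function names and URLs are hypothetical, and leaving the parameter out when no IdP could be selected is an assumption of this sketch rather than something stated above.

```python
from typing import Optional
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit


def build_discovery_response(
    return_url: str,
    idp_entity_id: Optional[str],
    return_id_param: str = "entityID",
) -> str:
    """Discovery service side: add the selected IdP to the return URL."""
    parts = urlsplit(return_url)
    # Keep whatever query string the SP already put on its return URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    if idp_entity_id is not None:
        query.append((return_id_param, idp_entity_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_discovery_response(url: str, return_id_param: str = "entityID") -> Optional[str]:
    """SP side: extract the selected IdP, or None if none was returned."""
    values = parse_qs(urlsplit(url).query).get(return_id_param)
    return values[0] if values else None
```

The SP has to read the result with the same parameter name it asked for in returnIDParam, which is why both functions take it as an argument.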

It is possible to use the <idpdisc:DiscoveryResponse> element under the <md:SPSSODescriptor> element to specify the return URL.
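
As a sketch of how a discovery service could prefer the metadata over the return parameter, the following Python code reads the <idpdisc:DiscoveryResponse> locations from SP metadata. The sample metadata, the function names and the URLs are hypothetical, and placing the element inside <md:Extensions> reflects my understanding of how it is usually published, so check it against the specification before relying on it.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "idpdisc": "urn:oasis:names:tc:SAML:profiles:SSO:idp-discovery-protocol",
}

# Hypothetical SP metadata, trimmed to the parts used here.
SAMPLE_METADATA = """
<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    xmlns:idpdisc="urn:oasis:names:tc:SAML:profiles:SSO:idp-discovery-protocol"
    entityID="https://sp.example.com/shibboleth">
  <md:SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:Extensions>
      <idpdisc:DiscoveryResponse
          Binding="urn:oasis:names:tc:SAML:profiles:SSO:idp-discovery-protocol"
          Location="https://sp.example.com/login" index="1"/>
    </md:Extensions>
  </md:SPSSODescriptor>
</md:EntityDescriptor>
"""


def registered_return_urls(metadata_xml: str) -> List[str]:
    """Collect the Location of every DiscoveryResponse under SPSSODescriptor."""
    root = ET.fromstring(metadata_xml.strip())
    elements = root.findall(".//md:SPSSODescriptor//idpdisc:DiscoveryResponse", NS)
    return [e.get("Location") for e in elements if e.get("Location")]


def choose_return_url(metadata_xml: str, requested: Optional[str]) -> Optional[str]:
    """Prefer the URL from metadata; fall back to the 'return' parameter."""
    registered = registered_return_urls(metadata_xml)
    if registered:
        return registered[0]
    return requested
```

With this choice a forged return parameter is simply ignored whenever the SP has registered a location in its metadata.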

I hope this post helps you understand the concept of IdP discovery and the technical approaches available to achieve it, and gives you a clear idea of the IdP discovery profiles defined in the SAML2 specifications.

Integration and Identity Architect & PMC Member @ The Apache Software Foundation, was a Director @ WSO2