Federation Mapping

Introduction

With federated authentication, a trusted external IdP (Identity Provider) authenticates an entity and provides attributes associated with that entity, such as full name, organization, location, group membership, and roles, to the local authorization system. The remote identity attributes usually do not correlate directly with the local identity attributes, so they must be mapped or transformed to be consistent with them.

Problems with existing contributed OpenStack federation mapping

A federated mapping implementation based on static rules was contributed to OpenStack. However, that mapping system lacks basic features required in real-world deployments. Examples of the deficiencies are:

Examples of real-world tasks the current mapping cannot handle

I've worked with RADIUS for years. RADIUS is often configured to operate in a federated mode where the attributes supplied to the RADIUS server have to be manipulated to match local conventions and policies. Below are some of the most common issues I've seen admins have to tackle.

Mapping requirements

The kinds of data transformations required by real-world exchanges with foreign identity systems demand the ability to make comparisons, perform tests, assign values to variables, and call basic transformation and regular expression functions.

Simple lookups on source values which are then copied over to destination values are not sufficient.

Ideal Federation Mapping

Federation mapping is simply transforming one set of attributes into another set of attributes. JSON notation provides a rich yet simple way to express a wide range of attributes and values.

Federation mapping should be based on a simple input/output filter model. The mapper receives a JSON document containing the validated assertions from an external IdP. The mapper examines the contents of the JSON assertion and returns a JSON document of values mapped to the local authorization environment or an error indicator if the mapping cannot be performed.
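
The filter model described above can be sketched as a single function. This is only an illustration in Python, not the prototype's code: the function name map_assertion, the attribute names (email, groups, user_name, roles), the "ops" group, and the shape of the error document are all assumptions made for the example.

```python
import json


def map_assertion(assertion_json: str) -> str:
    """Map a JSON assertion from an external IdP to local values.

    Returns a JSON document holding either the mapped values or an
    error indicator. All attribute names here are hypothetical.
    """
    try:
        assertion = json.loads(assertion_json)
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid assertion: {exc}"})

    if not isinstance(assertion, dict):
        return json.dumps({"error": "assertion must be a JSON object"})

    email = assertion.get("email")
    if not isinstance(email, str) or "@" not in email:
        return json.dumps({"error": "assertion has no usable email attribute"})

    groups = assertion.get("groups", [])
    if not isinstance(groups, list):
        groups = []

    mapped = {
        # Local user name is the part of the address before the "@".
        "user_name": email.split("@", 1)[0].lower(),
        # Hypothetical policy: members of "ops" become local admins.
        "roles": ["admin"] if "ops" in groups else ["member"],
    }
    return json.dumps(mapped)
```

The caller only ever sees JSON in and JSON out, so the same contract could be met by a rule engine or by a script.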

Suggested solutions

The two most viable ways to address the current deficiencies are:

Embedded Scripting Language

Transforming one JSON document into another JSON document is trivial for modern scripting languages. Rather than defining the mapping via a set of rules, a script is executed whose input is a JSON document of assertion values and whose output is a JSON document of mapped values. The script has full access to all the features of a modern programming language, so there are virtually no limitations on what it can do, and the implementation can be clear and straightforward (easy to read and understand). A script will usually be much simpler than an enumeration of complex static rules, and scripts are easier to debug than static rules. Most administrators are able to program in a scripting language. Learning how to use a rule-based mapping system is often just as difficult as learning script fundamentals, but unlike script languages, rule systems are not general purpose. There is more value in learning a script language than a one-off rule system.

Script-based transformations could be implemented in one of these ways:

  1. Embed the script interpreter into the running process.
  2. Fork a script interpreter for each script evaluation.
  3. Run a separate service implemented in the scripting language and pass the JSON documents via inter-process communication.

Option 1 (embedded) offers performance and ease of implementation. Many modern script interpreters can easily be embedded into programs, with varying degrees of support for data exchange. ECMAScript (i.e. JavaScript), Python and Perl are obvious candidates [1].

However, the embedded option has one potential problem which deserves serious consideration. For security reasons (and for general robustness) you do not want the script to be able to access internal data or to execute operations in the context of a privileged process. Running scripts provided by general users should always be prohibited, lest a security breach occur. In this context, however, the mapping scripts are provided by trusted administrators with high levels of privilege. We should be able to assume these scripts are safe and not nefarious. The same administrators usually have enough permission to do real damage regardless of the script loading issue, so ruling out embedded scripts provided by administrators on security grounds does not have merit. Note that options 2 (fork) and 3 (service) do not expose the primary process to the same potential security breaches.

Option 2 (forking) is too heavyweight and severely hampers throughput.

Option 3 (service) raises deployment issues, for instance the need for a mechanism that starts and stops the service, monitors its health, and locks down both the communication and the resources so that nothing is leaked and nothing can be executed which shouldn't be. Some deployments are also concerned with the number of processes and services which are run.

Option 1 (embedded) seems the most viable.
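
As a rough illustration of option 1 for the case where both the host program and the script language are Python, the sketch below evaluates an administrator-supplied script inside the running process. The convention that the script reads a variable named assertion and assigns a variable named mapping, the function name run_embedded, and the attribute names are assumptions made for this example. Python's exec() provides no sandboxing, which is exactly the trust trade-off discussed above.

```python
import json

# Hypothetical administrator-supplied mapping script. It reads the
# `assertion` dict and must assign a dict to `mapping`.
ADMIN_SCRIPT = """
user, _, realm = assertion["user_name"].partition("@")
mapping = {
    "user_name": user,
    "domain": "staff" if realm == "example.com" else "guests",
}
"""


def run_embedded(script_source: str, assertion_json: str) -> str:
    """Evaluate a mapping script inside the running process (option 1)."""
    try:
        namespace = {"assertion": json.loads(assertion_json)}
        # exec() gives no sandboxing: the script runs with the full
        # privileges of this process, so it must come from a trusted admin.
        exec(compile(script_source, "<mapping-script>", "exec"), namespace)
        mapping = namespace["mapping"]
        if not isinstance(mapping, dict):
            raise TypeError("script did not assign a dict to 'mapping'")
        return json.dumps(mapping)
    except Exception as exc:
        # Any failure is reported as an error document, not an exception.
        return json.dumps({"error": f"mapping script failed: {exc!r}"})
```

Calling run_embedded(ADMIN_SCRIPT, '{"user_name": "jane@example.com"}') yields a JSON document with user_name "jane" and domain "staff"; a script that never assigns mapping yields an error document instead.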

Enhanced Rule Mapping

We've established that real-world attribute mapping requires the ability to invoke basic transformations, perform tests and then execute conditional logic, and build compound values from discrete values. Such operations require the use of intermediate variables.
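
To make these requirements concrete, here is a small Python sketch of one such transformation. It is not taken from the prototype; the attribute names, the "example.com" home realm, and the naming policy are hypothetical. It uses a regular expression, intermediate variables, a test with conditional logic, and a compound value built from discrete values.

```python
import re

# Hypothetical input convention: "user@realm", as is common for
# RADIUS user names.
USER_REALM = re.compile(r"^(?P<user>[^@\s]+)@(?P<realm>[A-Za-z0-9.-]+)$")


def transform(assertion: dict) -> dict:
    """Map remote attributes to local ones (all names are illustrative)."""
    match = USER_REALM.match(assertion.get("user_name", ""))
    if match is None:
        raise ValueError("user_name is not of the form user@realm")

    # Intermediate variables produced by basic transformations.
    user = match.group("user").lower()
    realm = match.group("realm").lower()
    given = assertion.get("given_name", "").strip()
    surname = assertion.get("surname", "").strip()

    # Compound value built from discrete values.
    full_name = " ".join(part for part in (given, surname) if part)

    # Test followed by conditional logic.
    is_local = realm == "example.com"
    return {
        "user_name": user if is_local else f"{user}_{realm}",
        "full_name": full_name or user,
        "project": "internal" if is_local else "partners",
    }
```

A simple source-to-destination lookup cannot express any of these steps, which is why a rule system needs at least variables, tests, and a few functions.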

All of these requirements come for free in a scripting language. But what if we don't want to use a script to perform a transformation and prefer the rule-based approach? Our goals in this case are:

  • Provide the minimal set of functions that cover the maximum number of real world use cases.
  • Keep it simple! Simplicity aids users and makes the implementation easier to produce.
  • Do not design a language. Any design that wanders off towards being a language is better replaced by an actual embedded language which is already fully implemented, debugged and familiar to users.
  • Make everything self-consistent, with no special-case exceptions. Consistency aids both users and implementors.
  • Work as in the ideal case: accept a JSON assertion and emit a JSON mapping.

Prototype Rule Mapping and Request For Comments

A prototype implementation which tries to address the above goals has been completed. It is currently implemented in Python and may be re-implemented in Java for another project. It is documented in Enhanced Federation Rule Mapping.

The approach needs to be evaluated and reviewed. Then a judgment call needs to be made: is a static rule system (even one with the advanced functionality of the prototype) still a better alternative than writing a script and having it executed by an interpreter?


[1] This federation mapping is being considered for other implementations besides OpenStack (which is Python based). For example, it is under consideration for a Java-based project. The best supported embedded interpreters in Java today are ECMAScript (JavaScript) and Python (via Jython). Python also has support for loading an ECMAScript interpreter. Alternatively, the script language could be Python, in which case the script would simply be evaluated in the context of the running process (however, there is no effective sandboxing when you eval a Python script inside Python).