Module dogtag
Overview of interacting with CMS:
CMS stands for "Certificate Management System". It has been released under a
variety of names; the open source version is called "dogtag".
CMS consists of a number of servlets which in rough terms can be thought of as
RPC commands. A servlet is invoked by making an HTTP request to a specific URL
and passing URL arguments. Normally CMS responds with an HTTP response consisting
of HTML to be rendered by a web browser. This HTTP HTML response has both
Javascript SCRIPT components and HTML rendering code. One of the Javascript
SCRIPT blocks holds the data for the result. The rest of the response is derived
from templates associated with the servlet which may be customized. The
templates pull the result data from Javascript variables.
One way to get the result data is to parse the HTML looking for the Javascript
variable initializations. Simple string searches are not a robust method. First
of all, one must be sure the string is only found in a Javascript SCRIPT block
and not somewhere else in the HTML document. Some of the Javascript variable
initializations are rather complex (e.g. lists of structures). It would be hard
to correctly parse such complex and diverse Javascript. Existing Javascript
parsers are not generally available. Finally, it's important to know the
character encoding for strings. There is a somewhat complex set of precedence
rules for determining the current character encoding from the HTTP header,
meta http-equiv tags, MIME Content-Type and charset attributes on HTML
elements. All of this means trying to read the result data from a CMS HTML
response is difficult to do robustly.
However, CMS also supports returning the result data as an XML document
(distinct from an XHTML document, which would be essentially the same as
described above). There is a wide variety of tools to robustly parse
XML. Because XML is so well defined, things like escapes, character encodings,
etc. are automatically handled by the tools.
Thus we never try to parse Javascript; instead we always ask CMS to return us an
XML document by passing the URL argument xml="true". The body of the HTTP
response is then an XML document rather than HTML with embedded Javascript.
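As a rough illustration of the request side, the sketch below builds such a URL in Python 3. The host, port, servlet path and the serialNumber argument are hypothetical placeholders chosen for the example; only the xml="true" argument comes from the text above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_cms_url(host, port, servlet_path, **servlet_args):
    """Build a servlet URL that asks CMS for an XML response.

    host, port and servlet_path are illustrative; real deployments
    will use their own values.
    """
    args = dict(servlet_args)
    args['xml'] = 'true'  # ask for XML instead of HTML with Javascript
    return 'https://%s:%d%s?%s' % (host, port, servlet_path, urlencode(args))


# Hypothetical servlet path and argument name, for illustration only.
url = build_cms_url('ca.example.com', 8443, '/ca/example/servlet',
                    serialNumber='0x1a')

# Decode the query string again to check what was sent.
query = parse_qs(urlsplit(url).query)
```

Using urlencode() rather than string concatenation keeps the escaping of argument values in one place.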
To parse the XML documents we use the Python lxml package, which is a Python
binding around the libxml2 implementation. libxml2 is a very fast,
standards-compliant, full-featured XML implementation. libxml2 is the XML
library of choice for many projects. One of the features in lxml and libxml2
that is particularly valuable to us is the XPath implementation. We make heavy
use of XPath to find data in the XML documents we're parsing.
Parse Results vs. IPA command results:
CMS results can be parsed from either HTML or XML. CMS unfortunately is not
consistent with how it names items or how it utilizes data types. IPA has strict
rules about data types. Also IPA would like to see a more consistent view of CMS
data. Therefore we split the task of parsing CMS results out from the IPA
command code. The parse functions normalize the result data by using a
consistent set of names and data types. The IPA command only deals with the
normalized parse results. This also allows us to use different parsers if need
be (e.g. if we had to parse Javascript for some reason). The parse functions
attempt to parse as much information from the CMS result as is possible. They
put the parse result into a dict whose normalized key/value pairs are easy to
access. An IPA command does not need to return all the parsed results; it can
pick and choose what it wants to return in the IPA command result from the parse
result. It can also rest assured the values in the parse result will be the
correct data type. Thus the general sequence of steps for an IPA command talking
to CMS is:
- Receive IPA arguments from IPA command
- Formulate URL with arguments for CMS
- Make request to CMS server
- Extract XML document from HTTP response body returned by CMS
- Parse XML document using matching parse routine which returns response dict
- Extract relevant items from parse result and insert into command result
- Return command result
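The last four steps might look roughly like the following sketch. It uses the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra packages, and the XML layout, element names and function names are invented for illustration; they are not the actual CMS response format or the module's API.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML body; the real CMS layout may differ.
SAMPLE_BODY = """<result><fixed>
  <requestStatus>2</requestStatus>
  <serialNumber>1f</serialNumber>
  <authority>Certificate Manager</authority>
</fixed></result>"""


def parse_sample_xml(doc):
    """Parse routine: normalize names and data types into a dict."""
    response = {}
    node = doc.find('fixed/requestStatus')
    if node is not None:
        response['request_status'] = int(node.text)
    node = doc.find('fixed/serialNumber')
    if node is not None:
        # Serial numbers arrive as hexadecimal text without a prefix.
        response['serial_number'] = int(node.text, 16)
    node = doc.find('fixed/authority')
    if node is not None:
        response['authority'] = node.text
    return response


def example_command(http_body):
    """Command: pick only the items it wants from the parse result."""
    doc = ET.fromstring(http_body)
    parse_result = parse_sample_xml(doc)
    cmd_result = {}
    if 'serial_number' in parse_result:
        # Big integers are returned to the caller as decimal strings.
        cmd_result['serial_number'] = str(parse_result['serial_number'])
    return cmd_result


result = example_command(SAMPLE_BODY)
```

The command never touches the XML itself; it only reads normalized keys from the parse result.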
Serial Numbers:
Serial numbers are integral values of any magnitude because they are based on
ASN.1 integers. CMS uses the Java BigInteger to represent these. Fortunately
Python also has support for big integers via the Python long() object. Any
BigIntegers we receive from CMS as a string can be parsed into a Python long
without loss of information.
However Python has a neat trick. It normally represents integers via the int
object which internally uses the native C long type. If you create an int
object by passing the int constructor a string it will check the magnitude of
the value. If it would fit in a C long then it returns you an int
object. However if the value is too big for a C long type then it returns you
a Python long object instead. This is a very nice property because it's much
more efficient to use C long types when possible (e.g. Python int), but when
necessary you'll get a Python long() object to handle large magnitude
values. Python also nicely handles type promotion transparently between int
and long objects. For example if you multiply two int objects you may get back
a long object if necessary. In general Python int and long objects may be
freely mixed without the programmer needing to be aware of which type of
integral object is being operated on.
This leads to the following rule: always parse a string representing an
integral value using the int() constructor even if it might have large
magnitude, because Python will return either an int or a long automatically. By
the same token, don't test for the type of an object being int exclusively,
because it could be either an int or a long object.
Internally we should always be using int or long objects to hold integral
values. This is because we should be able to compare them correctly, be free
from concerns about having to know the radix of the string, perform
arithmetic operations, and convert to string representation (with correct
radix) when necessary. In other words, internally we should never handle
integral values as strings.
However, the XMLRPC transport cannot properly handle a Python long object. The
XMLRPC encoder upon seeing a Python long will test to see if the value fits
within the range of a 32-bit integer; if so it passes the integer parameter,
otherwise it raises an OverflowError exception. Some XMLRPC implementations do
permit 64-bit integers (e.g. the i8 extension) and the Python XMLRPC module
could allow long values within the 64-bit range to be passed if it were patched.
However this only moves the problem, it does not solve passing big integers
through XMLRPC. Thus we must always pass big integers as strings through the
XMLRPC interface. But upon receiving that value from XMLRPC we should convert it
back into an int or long object. Recall also that Python will automatically
perform a conversion to string if you output the int or long object in a string
context.
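The encoder behaviour described above can be observed with the following sketch. It is written for Python 3, where the module is named xmlrpc.client and the single int type already covers arbitrary magnitudes; the sample value is made up.

```python
import xmlrpc.client

small = 2**31 - 1                      # largest value the encoder accepts
big = int('ffffffffffffffffffff', 16)  # made-up 80-bit serial number

# A value inside the 32-bit range is marshalled as an <int> element.
small_ok = '<int>2147483647</int>' in xmlrpc.client.dumps((small,))

# A larger value is rejected by the encoder.
try:
    xmlrpc.client.dumps((big,))
    overflowed = False
except OverflowError:
    overflowed = True

# Passing the value as a decimal string works for any magnitude.
payload = xmlrpc.client.dumps((str(big),))
(received,), _method_name = xmlrpc.client.loads(payload)
restored = int(received)  # convert back to an integer on receipt
```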
Radix Issues:
CMS uses the following conventions: Serial numbers are always returned as
hexadecimal strings without a radix prefix. When CMS takes a serial number as
input it accepts the value in either decimal or hexadecimal utilizing the radix
prefix (e.g. 0x) to determine how to parse the value.
IPA has adopted the convention that all integral values in the user interface
will use base 10 decimal radix.
Basic rules on handling these values:
Reading a serial number from CMS requires conversion from hexadecimal into a
Python int or long object. Use the int constructor with an explicit radix:
>>> serial_number = int(serial_number, 16)
Big integers passed to XMLRPC must be decimal unicode strings:
>>> unicode(serial_number)
Big integers received from XMLRPC must be converted back to int or long
objects from the decimal string representation:
>>> serial_number = int(serial_number)
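Put together, the three rules form a round trip. The helper names below are hypothetical and the code is a Python 3 sketch (str takes the place of unicode); sending a serial number back to CMS with a 0x prefix is one of the two input forms the text says CMS accepts.

```python
def serial_from_cms(text):
    """CMS returns hexadecimal text without a radix prefix."""
    return int(text, 16)


def serial_to_cms(serial_number):
    """Send hexadecimal with the 0x prefix so CMS knows the radix."""
    return '0x%x' % serial_number


def serial_to_xmlrpc(serial_number):
    """Big integers cross XMLRPC as decimal strings."""
    return str(serial_number)


def serial_from_xmlrpc(text):
    """Convert the decimal string back into an integer."""
    return int(text)


serial = serial_from_cms('ffffffffffffffffffff')   # made-up value
wire = serial_to_xmlrpc(serial)
round_trip = serial_from_xmlrpc(wire)
```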
Xpath pattern matching on node names:
There are many excellent tutorials on how to use xpath to find items in an XML
document, so there is no need to repeat this information here. However,
most xpath tutorials make the assumption the node names you're searching for are
fixed. For example:
doc.xpath('//book/chapter[*]/section[2]')
Selects the second section of every chapter of the book. In this example the
node names 'book', 'chapter', 'section' are fixed. But what if the XML document
embedded the chapter number in the node name, for example 'chapter1',
'chapter2', etc.? (If you're thinking this would be incredibly lame, you're
right, but sadly people do things like this). Thus in this case you can't use
the node name 'chapter' in the xpath location step because it's not fixed and
hence won't match 'chapter1', 'chapter2', etc. The solution to this seems
obvious, use some type of pattern matching on the node name. Unfortunately this
advanced use of xpath is seldom discussed in tutorials and it's not obvious how
to do it. Here are some hints.
Use the built-in xpath string functions. Most of the examples illustrate the
string function being passed the text contents of the node via '.' or
string(.). However we don't want to pass the contents of the node, instead we
want to pass the node name. To do this use the name() function. One way we could
solve the chapter problem above is by using a predicate which says if the node
name begins with 'chapter' it's a match. Here is how you can do that.
>>> doc.xpath("//book/*[starts-with(name(), 'chapter')]/section[2]")
The built-in starts-with() returns true if its first argument starts with its
second argument. Thus the example above says if the node name of the second
location step begins with 'chapter', consider it a match and the search
proceeds to the next location step, which in this example is any node named
'section'.
But what if we would like to utilize the power of regular expressions to perform
the test against the node name? In this case we can use the EXSLT regular
expression extension. EXSLT extensions are accessed by using XML
namespaces. The conventional namespace prefix for the regular expression
extension is 're:'. In lxml we need to pass a set of namespaces to the XPath
object constructor in order to allow it to bind to those namespaces during its
evaluation. Then we just use the EXSLT regular expression match() function on
the node name. Here is how this is done:
>>> from lxml import etree
>>> regexpNS = "http://exslt.org/regular-expressions"
>>> find = etree.XPath(r"//book/*[re:match(name(), '^chapter\d+$')]/section[2]",
...                    namespaces={'re': regexpNS})
>>> find(doc)
What is happening here is that etree.XPath() has returned us an evaluator
function which we bind to the name 'find'. We've passed it a set of namespaces
as a dict via the 'namespaces' keyword parameter of etree.XPath(). The predicate
for the second location step uses the 're:' namespace to find the function name
'match'. The re:match() function takes a string to search as its first argument
and a regular expression pattern as its second argument. In this example the
string to search is the node name of the location step because we called the
built-in name() function of XPath. The regular expression pattern we've passed
says it's a match if the string begins with 'chapter', is followed by one or
more digits, and nothing else follows.
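For comparison only, the same node-name test can be approximated without lxml. This sketch swaps in the standard library's xml.etree.ElementTree and re modules, so it is not the EXSLT technique described above but a plain Python filter over element tags; the sample document is invented.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample with the chapter number embedded in the node name.
SAMPLE = """<book>
  <chapter1><section>a</section><section>b</section></chapter1>
  <chapter2><section>c</section><section>d</section></chapter2>
  <appendix><section>e</section><section>f</section></appendix>
</book>"""

CHAPTER_RE = re.compile(r'^chapter\d+$')


def second_sections(book):
    """Return the second <section> of each child named chapterN."""
    found = []
    for child in book:
        if CHAPTER_RE.match(child.tag):
            sections = child.findall('section')
            if len(sections) >= 2:
                found.append(sections[1])
    return found


book = ET.fromstring(SAMPLE)
texts = [section.text for section in second_sections(book)]
```

The XPath version keeps the whole query in one expression, which is why the module prefers it.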
ra
Request Authority backend plugin.
parse_error_template_xml(doc)
CMS currently returns errors via XML as either a "template" document
(generated by CMSServlet.outputXML()) or a "response" document (generated by
CMSServlet.outputError()).
parse_error_response_xml(doc)
CMS currently returns errors via XML as either a "template" document
(generated by CMSServlet.outputXML()) or a "response" document (generated by
CMSServlet.outputError()).
CMS_SUCCESS = 0
CMS_FAILURE = 1
CMS_AUTH_FAILURE = 2
CMS_STATUS_UNAUTHORIZED = 1
CMS_STATUS_SUCCESS = 2
CMS_STATUS_PENDING = 3
CMS_STATUS_SVC_PENDING = 4
CMS_STATUS_REJECTED = 5
CMS_STATUS_ERROR = 6
CMS_STATUS_EXCEPTION = 7
cms_request_status_to_string(request_status)
- Parameters:
request_status - The integral request status value
- Returns:
- String name of request status
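One plausible way to implement such a lookup is a dict keyed by the status constants. The function name, the returned strings and the fallback for unknown values in this sketch are assumptions; the source does not show what the real function returns.

```python
CMS_STATUS_UNAUTHORIZED = 1
CMS_STATUS_SUCCESS = 2
CMS_STATUS_PENDING = 3
CMS_STATUS_SVC_PENDING = 4
CMS_STATUS_REJECTED = 5
CMS_STATUS_ERROR = 6
CMS_STATUS_EXCEPTION = 7

# Assumed string names; the real module may use different text.
_REQUEST_STATUS_NAMES = {
    CMS_STATUS_UNAUTHORIZED: 'UNAUTHORIZED',
    CMS_STATUS_SUCCESS: 'SUCCESS',
    CMS_STATUS_PENDING: 'PENDING',
    CMS_STATUS_SVC_PENDING: 'SVC_PENDING',
    CMS_STATUS_REJECTED: 'REJECTED',
    CMS_STATUS_ERROR: 'ERROR',
    CMS_STATUS_EXCEPTION: 'EXCEPTION',
}


def request_status_name(request_status):
    """Map an integral request status to a readable name."""
    return _REQUEST_STATUS_NAMES.get(request_status,
                                     'unknown(%d)' % request_status)


name_ok = request_status_name(CMS_STATUS_SUCCESS)
name_unknown = request_status_name(42)
```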
|
- Parameters:
error_code - The integral error code value
- Returns:
- String name of the error code
|
parse_and_set_boolean_xml(node, response, response_name)
Read the value out of an XML text node and interpret it as a boolean value.
The text values are stripped of whitespace and converted to lower case
prior to interpretation.
If the value is recognized, the response dict is updated using the
response_name as the key and the value is set to the bool value of either
True or False depending on the interpretation of the text value. If the text
value is not recognized, a ValueError exception is thrown.
Text values which result in True:
Text values which result in False:
- Parameters:
node - xml node object containing value to parse for boolean result
response - response dict to set boolean result in
response_name - name of the response value to set
- Raises:
ValueError - if the text value is not recognized
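A minimal sketch of the behaviour described above follows. The function name and the two sets of recognized text values are assumptions made for the example; the actual lists of accepted values are not shown here. xml.etree.ElementTree stands in for lxml.

```python
import xml.etree.ElementTree as ET

# Assumed spellings; the values the real function accepts are not
# listed in this document.
TRUE_TEXT = {'true', 'yes', 'on'}
FALSE_TEXT = {'false', 'no', 'off'}


def set_boolean_from_node(node, response, response_name):
    """Store the node's text in response[response_name] as a bool."""
    value = (node.text or '').strip().lower()
    if value in TRUE_TEXT:
        response[response_name] = True
    elif value in FALSE_TEXT:
        response[response_name] = False
    else:
        raise ValueError('unrecognized boolean text: %r' % node.text)


response = {}
set_boolean_from_node(ET.fromstring('<emailCert> TRUE </emailCert>'),
                      response, 'email_cert')
set_boolean_from_node(ET.fromstring('<noCertImport>no</noCertImport>'),
                      response, 'no_cert_import')
```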
|
Returns the error code when the servlet replied with
CMSServlet.outputError()
The possible error code values are:
- CMS_SUCCESS = 0
- CMS_FAILURE = 1
- CMS_AUTH_FAILURE = 2
However, profileSubmit sometimes also returns these values:
- EXCEPTION = 1
- DEFERRED = 2
- REJECTED = 3
- Parameters:
doc - The root node of the xml document to parse
- Returns:
- error code as an integer or None if not found
|
Returns the request status from a CMS operation. May be one of:
- CMS_STATUS_UNAUTHORIZED = 1
- CMS_STATUS_SUCCESS = 2
- CMS_STATUS_PENDING = 3
- CMS_STATUS_SVC_PENDING = 4
- CMS_STATUS_REJECTED = 5
- CMS_STATUS_ERROR = 6
- CMS_STATUS_EXCEPTION = 7
CMS will often fail to return requestStatus when the status is
SUCCESS. Therefore if we fail to find a requestStatus field we default the
result to CMS_STATUS_SUCCESS.
- Parameters:
doc - The root node of the xml document to parse
- Returns:
- request status as an integer
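The defaulting rule can be sketched as below. The function name and the location of the requestStatus element are hypothetical, and xml.etree.ElementTree is used here instead of lxml.

```python
import xml.etree.ElementTree as ET

CMS_STATUS_SUCCESS = 2


def request_status_or_success(doc):
    """Return requestStatus as an int, defaulting to SUCCESS."""
    # Assumed: requestStatus may appear anywhere below the root node.
    node = doc.find('.//requestStatus')
    if node is None:
        # CMS often omits requestStatus when the request succeeded.
        return CMS_STATUS_SUCCESS
    return int(node.text)


with_status = ET.fromstring(
    '<result><fixed><requestStatus>6</requestStatus></fixed></result>')
without_status = ET.fromstring('<result><fixed/></result>')

explicit = request_status_or_success(with_status)
defaulted = request_status_or_success(without_status)
```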
|
CMS currently returns errors via XML as either a "template" document
(generated by CMSServlet.outputXML()) or a "response" document (generated by
CMSServlet.outputError()).
This routine is used to parse a "template" style error or exception
document.
This routine should be used when the CMS requestStatus is ERROR or
EXCEPTION. It is capable of parsing both. A CMS ERROR occurs when a known
anticipated error condition occurs (e.g. asking for an item which does not
exist). A CMS EXCEPTION occurs when an exception is thrown in the CMS server
and it's not caught and converted into an ERROR. Think of EXCEPTIONs as the
"catch all" error situation.
ERRORs and EXCEPTIONs both have error message strings associated with
them. For an ERROR it's errorDetails, for an EXCEPTION it's
unexpectedError. In addition an EXCEPTION may include an array of additional
error strings in its errorDescription field.
After parsing the results are returned in a result dict. The following
table illustrates the mapping from the CMS data item to what may be found in
the result dict. If a CMS data item is absent it will also be absent in the
result dict.
cms name         | cms type | result name        | result type
requestStatus    | int      | request_status     | int
errorDetails     | string   | error_string       | unicode
unexpectedError  | string   | error_string       | unicode
errorDescription | [string] | error_descriptions | [unicode]
authority        | string   | authority          | unicode
- Parameters:
doc - The root node of the xml document to parse
- Returns:
- result dict
|
CMS currently returns errors via XML as either a "template" document
(generated by CMSServlet.outputXML()) or a "response" document (generated by
CMSServlet.outputError()).
This routine is used to parse a "response" style error document.
cms name  | cms type | result name  | result type
Status    | int      | error_code   | int
Error     | string   | error_string | unicode
RequestID | string   | request_id   | string
- Parameters:
doc - The root node of the xml document to parse
- Returns:
- result dict
|
CMS returns an error code and an array of request records.
This function returns a response dict with the following format:
{'error_code' : int, 'requests' : [{}]}
The mapping of fields and data types is illustrated in the following table.
If the error_code is not SUCCESS then the response dict will have the
contents described in parse_error_response_xml.
cms name             | cms type   | result name              | result type
Status               | int        | error_code               | int
Requests[].Id        | string     | requests[].request_id    | unicode
Requests[].SubjectDN | string     | requests[].subject       | unicode
Requests[].serialno  | BigInteger | requests[].serial_number | int|long
Requests[].b64       | string     | requests[].certificate   | unicode
Requests[].pkcs7     | string     |                          |
- Parameters:
doc - The root node of the xml document to parse
- Returns:
- result dict
- Raises:
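The mapping in the table above could be driven by a small field map, as in this sketch. The XML layout (a Requests element holding Request children), the function name and the sample values are assumptions for illustration, and xml.etree.ElementTree stands in for lxml.

```python
import xml.etree.ElementTree as ET

# Invented sample; the real CMS document layout may differ.
SAMPLE = """<XMLResponse>
  <Status>0</Status>
  <Requests>
    <Request>
      <Id>7</Id>
      <SubjectDN>CN=host.example.com</SubjectDN>
      <serialno>1a</serialno>
    </Request>
    <Request>
      <Id>8</Id>
      <serialno>ffffffffffffffffffff</serialno>
    </Request>
  </Requests>
</XMLResponse>"""

# cms name -> (result name, converter)
FIELD_MAP = {
    'Id': ('request_id', str),
    'SubjectDN': ('subject', str),
    'serialno': ('serial_number', lambda text: int(text, 16)),
    'b64': ('certificate', str),
}


def parse_requests(doc):
    """Build {'error_code': int, 'requests': [{...}]} from doc."""
    response = {'requests': []}
    status = doc.find('Status')
    if status is not None:
        response['error_code'] = int(status.text)
    for request in doc.findall('Requests/Request'):
        record = {}
        for cms_name, (result_name, convert) in FIELD_MAP.items():
            node = request.find(cms_name)
            if node is not None:  # absent items stay absent
                record[result_name] = convert(node.text)
        response['requests'].append(record)
    return response


parsed = parse_requests(ET.fromstring(SAMPLE))
```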
|
After parsing the results are returned in a result dict. The following
table illustrates the mapping from the CMS data item to what may be found in
the result dict. If a CMS data item is absent it will also be absent in the
result dict.
If the requestStatus is not SUCCESS then the response dict will have the
contents described in parse_error_template_xml.
cms name                  | cms type        | result name         | result type
authority                 | string          | authority           | unicode
requestId                 | string          | request_id          | string
staus                     | string          | cert_request_status | unicode
createdOn                 | long, timestamp | created_on          | datetime.datetime
updatedOn                 | long, timestamp | updated_on          | datetime.datetime
requestNotes              | string          | request_notes       | unicode
pkcs7ChainBase64          | string          | pkcs7_chain         | unicode
cmcFullEnrollmentResponse | string          | full_response       | unicode
records[].serialNumber    | BigInteger      | serial_numbers      | [int|long]
- Parameters:
doc - The root node of the xml document to parse
- Returns:
- result dict
- Raises:
|
After parsing the results are returned in a result dict. The following
table illustrates the mapping from the CMS data item to what may be found in
the result dict. If a CMS data item is absent it will also be absent in the
result dict.
If the requestStatus is not SUCCESS then the response dict will have the
contents described in parse_error_template_xml.
cms name         | cms type | result name       | result type
emailCert        | Boolean  | email_cert        | bool
noCertImport     | Boolean  | no_cert_import    | bool
revocationReason | int      | revocation_reason | int
certPrettyPrint  | string   | cert_pretty       | unicode
authorityid      | string   | authority         | unicode
certFingerprint  | string   | fingerprint       | unicode
certChainBase64  | string   | certificate       | unicode
serialNumber     | string   | serial_number     | int|long
pkcs7ChainBase64 | string   | pkcs7_chain       | unicode
- Parameters:
doc - The root node of the xml document to parse
- Returns:
- result dict
- Raises:
|
After parsing the results are returned in a result dict. The following
table illustrates the mapping from the CMS data item to what may be found in
the result dict. If a CMS data item is absent it will also be absent in the
result dict.
If the requestStatus is not SUCCESS then the response dict will have the
contents described in parse_error_template_xml.
cms name               | cms type   | result name             | result type
dirEnabled             | string     | dir_enabled             | bool
certsUpdated           | int        | certs_updated           | int
certsToUpdate          | int        | certs_to_update         | int
error                  | string     | error_string            | unicode
revoked                | string     | revoked                 | unicode
totalRecordCount       | int        | total_record_count      | int
updateCRL              | string     | update_crl              | bool
updateCRLSuccess       | string     | update_crl_success      | bool
updateCRLError         | string     | update_crl_error        | unicode
publishCRLSuccess      | string     | publish_crl_success     | bool
publishCRLError        | string     | publish_crl_error       | unicode
crlUpdateStatus        | string     | crl_update_status       | bool
crlUpdateError         | string     | crl_update_error        | unicode
crlPublishStatus       | string     | crl_publish_status      | bool
crlPublishError        | string     | crl_publish_error       | unicode
records[].serialNumber | BigInteger | records[].serial_number | int|long
records[].error        | string     | records[].error_string  | unicode
- Parameters:
doc - The root node of the xml document to parse
- Returns:
- result dict
- Raises:
|
After parsing the results are returned in a result dict. The following
table illustrates the mapping from the CMS data item to what may be found in
the result dict. If a CMS data item is absent it will also be absent in the
result dict.
If the requestStatus is not SUCCESS then the response dict will have the
contents described in parse_error_template_xml.
cms name          | cms type   | result name         | result type
dirEnabled        | string     | dir_enabled         | bool
dirUpdated        | string     | dir_updated         | bool
error             | string     | error_string        | unicode
unrevoked         | string     | unrevoked           | unicode
updateCRL         | string     | update_crl          | bool
updateCRLSuccess  | string     | update_crl_success  | bool
updateCRLError    | string     | update_crl_error    | unicode
publishCRLSuccess | string     | publish_crl_success | bool
publishCRLError   | string     | publish_crl_error   | unicode
crlUpdateStatus   | string     | crl_update_status   | bool
crlUpdateError    | string     | crl_update_error    | unicode
crlPublishStatus  | string     | crl_publish_status  | bool
crlPublishError   | string     | crl_publish_error   | unicode
serialNumber      | BigInteger | serial_number       | int|long
- Parameters:
doc - The root node of the xml document to parse
- Returns:
- result dict
- Raises:
|