Robert Norris <email@example.com>
XMPP requires the use of the SASL DIGEST-MD5 mechanism in order to authenticate clients. The RFC itself is difficult to follow in places, however, the actual functionality the clients are required to implement in order to successfully authenticate to a DIGEST-MD5 aware server are minimal.
It is the goal of this document to show the basics of SASL and DIGEST-MD5. Hopefully this will be enough to get SASL newcomers up and running.
Note that this document is not intended to be any kind of authorative documentation on XMPP, SASL or DIGEST-MD5. If you're in doubt, read the relevant specs. This document assumes that you'll never want to use the channel integrity and encryption features that DIGEST-MD5 provides.
Clients need a few utility functions available:
After the XML stream has been set up, the server will send a list of stream features that it supports. In amongst this will be a list of SASL mechanisms that are supported:
<stream:features> <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <mechanism>DIGEST-MD5</mechanism> <mechanism>PLAIN</mechanism> <mechanism>KERBEROS_V4</mechanism> </mechanisms> </stream:features>
If DIGEST-MD5 is not here, then the rest of this document will be fairly useless. However, its reasonable to assume that the majority of servers out there will have DIGEST-MD5 available, since XMPP requires servers to implement it, and it has no known security holes.
To initiate the authentication exchange, the client sends a SASL authentication request, selecting DIGEST-MD5 as the desired mechanism:
<auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='DIGEST-MD5'/>
The server will respond by sending a challenge, something like this:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> cmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIi xxb3A9ImF1dGgiLGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNz </challenge>
The contents of this challenge is encoded using Base64, and might look like this when decoded:
The challenge can then be parsed out into a number of comma-seperated key-value pairs (though not all require values, as indicated below). Brief descriptions are as follows:
If present (one or more), a list of authentication realms that the client may attempt to authenticate into. If not present, the client should ask the user to specify a realm.
An opaque string that is generated by the server. The client must send this back to the server as part of its response, which helps to prevent replay attacks. If the server doesn't send a nonce, the client should abort the exchange.
A list of "quality of protection" values supported by the server. One of the returned values should be "auth", to indicate the server supports authentication. Other values ("auth-int", "auth-conf") may be returned, but are outside the scope of this document. If the value "auth" is not present, the client should abort the exchange.
If present, specifies that the server supports UTF-8 encoding for the username and password. If not present, the username and password must be encoded with ISO 8859-1. This is only needed for backward compatibility with HTTP Digest, and its unlikely that a XMPP server will ever not send it. If there's more than one, the client should abort the exchange.
Required for backward compatibility with HTTP Digest (which can use other algorithms besides MD5). It can be ignored, but if the server doesn't send it, or sends it more than once, the client should abort the exchange.
The client is required to respond with the appropriate credentials (Base64 encoded, of course). An example response might be:
<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> dXNlcm5hbWU9InJvYiIscmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik 9BNk1HOXRFUUdtMmhoIixjbm9uY2U9Ik9BNk1IWGg2VnFUclJrIixuYz0w MDAwMDAwMSxxb3A9YXV0aCxkaWdlc3QtdXJpPSJ4bXBwL2NhdGFjbHlzbS 5jeCIscmVzcG9uc2U9ZDM4OGRhZDkwZDRiYmQ3NjBhMTUyMzIxZjIxNDNh ZjcsY2hhcnNldD11dGYtOCxhdXRoemlkPSJyb2JAY2F0YWNseXNtLmN4L2 15UmVzb3VyY2Ui </response>
The decoded form of this is:
The meaning of the values described here are as follows:
The user's name in the specified realm, encoded according to the value of the "charset" directive sent by the server. Must be present once, or the authentication fails.
The authentication realm that this user's account is in. This is required if the server specified realms in the first challenge, and should be set to one of those realms. If this is missing, it will be set to the empty string.
The string specified by the server in the nonce option during the first challenge. If missing or specified more than once, authentication fails.
An opaque string that is generated by the client. The server will send this back to the client as part of future challenges, which helps to prevent replay attacks. If missing or specified more than once, authentication fails.
This is the hexadecimal count of the number of responses (including this one) that the client has sent with the nonce value in this request. This is used to help the server detect replays during subsequent authentication, and is not used here. Set it to "00000001".
Indicates the type of service. This should be set to "xmpp".
The DNS hostname or IP address for the service requested (though the DNS hostname is preferred). For XMPP, this should be set to the hostname of the remote server.
The full name of the service that the client is trying to connect to, formed from the serv-type and host options. For example, the XMPP service on "jabber.org" would have a digest-uri value of "xmpp/jabber.org".
A string of 32 hex digits (with the alphabetic characters lower-cased) that proves the user knows the password (see below for details of how to compute this). If missing or specified more than once, authentication fails.
If present, specifies that the client has used UTF-8 encoding for the username and password. If not present, the username and password must be encoded in ISO 8859-1.
The "authorization ID", encoded in UTF-8. This should be the full JID (including resource) of the session that the client wishes to start once authentication is complete.
This is where the magic happens. The value of the response directive is computed as follows:
The resultant string Z should be sent to the server as the value of the "response" directive.
The server will now authenticate the client based on the presented credentials. If the authentication failed, the server will return:
If it was successful, the server will send a final challenge:
<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA== </challenge>
This challenge, when decoded, will contain a single directive "rspauth". This directive is not useful for the purposes of this document, and may be safely ignored.
The client should respond with an empty response:
At this point, the server should inform the client of successful authentication, like so:
The connection is now authenticated. The client must now request an IM session by sending the following:
<iq type='set' id='sess_1'> <session xmlns='urn:ietf:params:xml:ns:xmpp-session'/> </iq>
Assuming that your username/realm has allowed to start a session as the authzid that was specified during the SASL handshake, the session start request will succedd:
<iq type='result id='sess_1'/>
The session then continues as normal - get roster, send presence, etc.