Wednesday, August 27, 2008

(Don't) Follow that SchemaLocation!

A common assumption is that if an XML message contains a SchemaLocation directive, a Web Service (or an XML Gateway) should follow the URL given in the SchemaLocation attribute, pull down the Schema from that URL, and use it to validate the XML message.

Here is an example of a SchemaLocation directive in an XML message in SOAPbox:



Contrary to the common assumption, dereferencing the SchemaLocation on an untrusted message is a bad, bad idea. Think of the situation where an attacker can point the XML parser to a bogus schema, a schema designed to bog down a parser, or a script which serves up an endless stream of bytes.
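To make the risk concrete, here is a minimal Python sketch showing that the SchemaLocation hint is just text supplied by whoever sent the message. The message, the `urn:example:orders` namespace, the attacker URL and the `schema_location_hints` helper are all made up for illustration. The sketch only reads the hint; it never fetches anything.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical message: the namespace and the URL are invented for this example.
MESSAGE = """<order xmlns="urn:example:orders"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="urn:example:orders http://attacker.example/evil.xsd">
  <item>widget</item>
</order>"""


def schema_location_hints(xml_text):
    """Return (namespace, url) pairs from the root element's xsi:schemaLocation."""
    root = ET.fromstring(xml_text)
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    # The attribute value is a whitespace-separated list of namespace/URL pairs.
    return list(zip(tokens[0::2], tokens[1::2]))


hints = schema_location_hints(MESSAGE)
for namespace, url in hints:
    # The URL is whatever the sender chose to write; nothing vouches for it.
    print("namespace:", namespace, "-> sender-supplied URL:", url)
```

A validator that dereferences that URL is letting the sender decide which server it talks to and what it downloads.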

For this reason, Vordel's XML Gateway contains a Schema Cache (highlighted in the screenshot below). This is a trusted store of Schemas. The Schemas can come from a repository, from WSDLs which have been imported, or both. The key point is that SchemaLocation directives are not naively trusted.
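The general idea of a trusted schema store can be sketched in a few lines of Python. This is not how Vordel's gateway is implemented; it is only an illustration of the principle, and the `TRUSTED_SCHEMAS` table, the file path, and the `resolve_schema` and `UntrustedSchemaError` names are all hypothetical. The schema is chosen by the namespace of the message's root element, looked up in a locally configured table, and the SchemaLocation URL in the message is never consulted.

```python
import xml.etree.ElementTree as ET

# Hypothetical trusted store: namespace -> schema file installed by an administrator.
TRUSTED_SCHEMAS = {
    "urn:example:orders": "/etc/gateway/schemas/orders.xsd",
}


class UntrustedSchemaError(Exception):
    """Raised when a message uses a namespace with no trusted schema."""


def resolve_schema(xml_text):
    """Return the trusted schema path for a message, ignoring xsi:schemaLocation."""
    root = ET.fromstring(xml_text)
    # ElementTree writes qualified tags as "{namespace}localname".
    tag = root.tag
    namespace = tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
    try:
        return TRUSTED_SCHEMAS[namespace]
    except KeyError:
        raise UntrustedSchemaError("no trusted schema for %r" % namespace) from None


message = """<order xmlns="urn:example:orders"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="urn:example:orders http://attacker.example/evil.xsd"/>"""

schema_path = resolve_schema(message)
print(schema_path)
```

Messages in a namespace that is not in the table are rejected outright rather than validated against whatever the sender points to. In a real deployment the parser itself would also need hardening against hostile input, which this sketch does not attempt.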



Sending bogus SchemaLocation directives is just one technique employed when doing a vulnerability assessment of a Web Service. I described others in my presentation on Web Services vulnerability assessment at RSA 2008 back in April.