Wednesday, July 8, 2009

Downsides of Schemas

Prompted by Dave Wichers, Gunnar Peterson has a good piece today about the use of hardened schemas.

Schemas are indeed useful, but here are some downsides:

- The availability of a schema helps in a plaintext-guessing attack against encrypted data, since an attacker knows what the structure of the unencrypted data is, names of element and attributes, and maybe default values also.

- Applications which are coded to validate all incoming XML can be diverted to a malicious Schema using the SchemaLocation attribute. The malicious Schema could include very compex checks which would choke a parser. This behavior can be turned off in some platforms, for example here is how it's done in .NET: http://msdn.microsoft.com/en-us/library/ms763691(VS.85).aspx . I should note that Gunnar's code is not vulnerable to this attack since he specifies the Schema in the "schemaFile" variable. But many applications are. It is a neat way to turn a security measure against itself.

- As Gunnar says, Schema validation is only as good as the Schema itself. Most Schemas are just about the data-types ("this is a string, this is a string, this is also a string", etc). That is not useful for security purposes.

- Schemas define what should be in an XML document, but are not useful for defining what should not be in an XML document. That is where threat scanning for attack signatures (e.g. SQL Injection) comes in.

- Schema validation does not apply to RPC-encoded SOAP messages (partly because type information is included in each element that appears within the SOAP message). Unfortunately, RPC-encoded SOAP still exists in the wild. However, here at Vordel we do the seemingly impossible in our XML Gateway: allowing RPC-encoded SOAP messages to be validated against Schemas. If you need to validate RPC-encoded SOAP messages, check out the Vordel XML Gateway.

- And the biggie: Performance. Without an XML Acceleration system such as VXA in place, Schema validation can add significant latency to message throughput. In fact, that is one of the big reasons why Schema Validation is often skipped in Java and .NET apps (a bad idea!).

I'm not discouraging Schema usage, just saying that some caveats have to be kept in mind.