Normalising XML in SQLServer for hashing

FUNDVB

New Member
I asked this question which led me to do the following.
  • Create an XML representation of a C# object structure in order to pass it to SQLServer.
  • Create a stored procedure which hashes the XML and then shreds the XML into related tables and also stores the hash in the root table for quick lookup.
This means that I can pass complex object data to SQLServer and do a lookup on the hash rather than trying to shred and match the XML against the tables (which I can also do but is slow(er))...However - One of the nice things about XML is that you can format it eg indent it etc - and also - that attribute order is not important. BUT when you hash something formatting and indentation IS important. So what I do in C# is...
  • Normalise the XML by putting all attributes in alphabetical order
  • use .ToString(DisableFormatting) to remove extra formatting spaces
This works fine, but when I'm testing it's easier to have formatted XML so I can see more easily what I'm passing to the stored proc.It would be nice if SQLServer could be trusted to preserve attribute order but it can't...\[quote\] The order of attributes in an XML instance is not preserved. When you query the XML instance stored in the xml type column, the order of attributes in the resulting XML may be different from the original XML instance.\[/quote\]This means that I can't use SQLServer's XML datatype to normalise the data.What bothers me is that at some point someone will be using my procs and think "oh great, XML, attribute order doesn't matter, formatting doesn't matter, the data represented is the same" However, when I hash this isn't going to be the case.Anyone got any solutions to this problem? I don't really want to write an XML parser in T-SQL!! Or use an XML parser someone else has written in order to normalise it. Why can't the SQLServer XML datatype just preserve attribute order? I suppose I can "trust" my app to always pass XML in the same format/order thus resulting in the same hash for the same object. But I'm uncomfortable with the idea that the stored proc has to also "trust" the app to do this. I would like to somehow be able to check the normalisation of the XML, it will obviously be more robust like that.
 
Back
Top