Wednesday, June 6, 2018

OpenXML Queries with Namespaces.

OpenXML is a convenient way to extract data in XML inside SQL Server however there are a number of gotchas, particularly if your file contains namespaces.

Namespaces attempt to envelope certain field names so duplicate element names are distinct within their own namespace. Why anyone would use the same element name for two different elements in an XML document is beyond me but they do exists and you must handle them in OpenXML otherwise the function won’t work.

This is what my XML document looked like:


The first thing that has to happen is that all the namespaces need to be defined to the sp_xml_preparedocument procedure call. That includes the root and child namespace declarations.

The namespace definitions are passed to the sp_xml_preparedocument procedure as the third parameter. It’s not important to use the same element name (Original:message, Definition:root) because you are just defining the namespaces to OpenXML.

It IS important to define a prefix for the namespace for ALL the namespaces in your document. In my case I had a namespace in the [message] element that did not have an alias so I gave that an alias of :hm. I also had a namespace defined in the [fxRates] element. I defined that also and gave it an alias of :mr. You don’t need to alias the schemaLocation. (@XML is a variable of type XML containing the xml document).


Once the aliases are defined you can then reference the elements in the file USING the namespace alias. That is you MUST use the namespace to define the elements. You don’t use the namespace alias for attributes.

For the header, that looks like:

SELECT DomainName,
       msgVersion,
       SubdomainName,
       Identifier,
       IdentifierVersion,
       SourceName,
       MessageId,
  FROM OPENXML(@hDoc, '//hm:message/hm:header/hm:meta', 1)
  WITH (
          DomainName varchar(100) 'hm:domain/hm:name',
          msgVersion varchar(100) 'hm:domain/hm:msgVersion',
          SubdomainName varchar(100) 'hm:domain/hm:subdomain/hm:name',
          Identifier varchar(200) 'hm:identifier/hm:id',
          IdentifierVersion varchar(20) 'hm:identifier/hm:version',
          SourceName varchar(100) 'hm:source',
          MessageId varchar(100) 'hm:messageId',
       )

To get the market rates out, there is an additional gotcha. You must OpenXML using the appropriate alias for the name space AND go down to the lowest level repeating element, to retrieve the rows at that level.

I initially tried:

-- CLOSE BUT NO CIGAR
SELECT PrimaryCurrency,
       SecondaryCurrency,
       RateDate,
       ExchangeRate,
       DaysAhead,
       TenorPeriod,
       Period
  -- notice the use of hm: and mr: aliases in this statement
FROM OPENXML(@hDoc, '//hm:message/hm:payload/mr:fxRates', 2)  WITH ( PrimaryCurrency varchar(6) ' mr:fxRate/mr:currencies/mr:primaryCurrency', SecondaryCurrency varchar(6) 'mr:fxRate/mr:currencies/mr:secondaryCurrency', RateDate date 'mr:fxRate/mr:date', ExchangeRate float '/mr:fxRate/mr:exchangeRate/mr:rate', DaysAhead int '/mr:fxRate/mr:exchangeRate/mr:daysAhead', TenorPeriod varchar(3) '/mr:fxRate/mr:exchangeRate/mr:tenorPeriod', Period float '/mr:fxRate/mr:exchangeRate/mr:period' )


But while fxRates is the root of the repeating element, the actual repeating element in the section is [exchangeRate]. The correct statement is below. Because you need to drill down to the lowest level you must go back up to get common elements.

-- CORRECT STATEMENTSELECT PrimaryCurrency,
       SecondaryCurrency,
       RateDate,
       ExchangeRate,
       DaysAhead,
       TenorPeriod,
       Period
  -- notice the use of hm: and mr: aliases in this statement AND the level
  FROM OPENXML(@hDoc, '//hm:message/hm:payload/mr:fxRates/mr:fxRate/mr:exchangeRate', 2)  
  WITH (
          -- go back up to get the currencies
          PrimaryCurrency varchar(6) '../mr:currencies/mr:primaryCurrency',   
          SecondaryCurrency varchar(6) '../mr:currencies/mr:secondaryCurrency',
          -- go back up to get the rate date
          RateDate date '../mr:date',             
          ExchangeRate float 'mr:rate',
          DaysAhead int 'mr:daysAhead',
          TenorPeriod varchar(3) 'mr:tenorPeriod',
          Period float 'mr:period'
       )

In brief:

  1. You must define all namespaces with aliases in the sp_xml_preparedocument statement
  2. You must use the correct alias on all elements in the open xml reference.


And don't forget to close the handle!
exec sp_xml_removedocument @hDoc;

No comments:

Post a Comment