Mapping Between XML Nodes and Data Packet Fields

From RAD Studio

XML provides a text-based way to store or describe structured data. Datasets provide another way to store and describe structured data. Therefore, to convert an XML document into a dataset, you must identify the correspondences between the nodes in an XML document and the fields in a dataset.

Consider, for example, an XML document that represents a set of e-mail messages. It might look like the following (containing a single message):

<?xml version="1.0" standalone="yes" ?>
<email>
   <head>
      <from>
         <name>Dave Boss</name>
         <address>[email protected]</address>
      </from>
      <to>
         <name>Joe Engineer</name>
         <address>[email protected]</address>
      </to>
      <cc>
         <name>Robin Smith</name>
         <address>[email protected]</address>
      </cc>
      <cc>
         <name>Leonard Devon</name>
         <address>[email protected]</address>
      </cc>
   </head>
   <body>
      <subject>XML components</subject>
      <content>
        Joe,
        Attached is the specification for the XML component support in Delphi.
        This looks like a good solution to our business-to-business application!
        Also attached, please find the project schedule. Do you think it's reasonable?
           Dave.
      </content>
      <attachment attachfile="XMLSpec.txt"/>
      <attachment attachfile="Schedule.txt"/>
   </body>
</email>

One natural mapping between this document and a dataset would map each e-mail message to a single record. The record would have fields for the sender's name and address. Because an e-mail message can have multiple recipients, the recipient (<to>) would map to a nested dataset. Similarly, the cc list would map to a nested dataset. The subject line would map to a string field, while the message itself (<content>) would probably map to a memo field. The names of attachment files would map to a nested dataset because one message can have several attachments. Thus, the e-mail above would map to a dataset something like the following:

SenderName  SenderAddress     To         CC         Subject         Content  Attach
Dave Boss   [email protected]  (DataSet)  (DataSet)  XML components  (MEMO)   (DataSet)

where the nested dataset in the "To" field is

Name          Address
Joe Engineer  [email protected]

the nested dataset in the "CC" field is

Name           Address
Robin Smith    [email protected]
Leonard Devon  [email protected]

and the nested dataset in the "Attach" field is

Attachfile
XMLSpec.txt
Schedule.txt

Defining such a mapping involves identifying those nodes of the XML document that can be repeated and mapping them to nested datasets. Tagged elements that have values and appear only once (such as <content>...</content>) map to fields whose data type reflects the type of data that can appear as the value. Attributes of a tag (such as the attachfile attribute of the attachment tag) also map to fields.
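The rules above can be sketched outside of RAD Studio. The following Python example is only an illustration of the mapping, not how the XML Mapper or any Delphi component implements it. It uses a trimmed copy of the e-mail document, and the example.com addresses, the function names people and email_to_record, and the dictionary layout are all assumptions made for this sketch. Repeatable nodes become lists of rows (standing in for nested datasets), single-valued elements become plain fields, and the attachfile attribute becomes a field of its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the e-mail document. The example.com addresses are
# placeholders invented for this sketch.
SAMPLE = """<email>
   <head>
      <from><name>Dave Boss</name><address>dave@example.com</address></from>
      <to><name>Joe Engineer</name><address>joe@example.com</address></to>
      <cc><name>Robin Smith</name><address>robin@example.com</address></cc>
      <cc><name>Leonard Devon</name><address>leonard@example.com</address></cc>
   </head>
   <body>
      <subject>XML components</subject>
      <content>Joe, attached is the specification.</content>
      <attachment attachfile="XMLSpec.txt"/>
      <attachment attachfile="Schedule.txt"/>
   </body>
</email>"""


def people(parent, tag):
    """Map each repeatable <tag> node to one row of a nested dataset."""
    return [
        {"Name": node.findtext("name"), "Address": node.findtext("address")}
        for node in parent.findall(tag)
    ]


def email_to_record(email):
    """Map one <email> element to a record; nested datasets are lists of rows."""
    head, body = email.find("head"), email.find("body")
    return {
        # Single-valued elements map to plain fields.
        "SenderName": head.findtext("from/name"),
        "SenderAddress": head.findtext("from/address"),
        # Repeatable elements map to nested datasets.
        "To": people(head, "to"),
        "CC": people(head, "cc"),
        "Subject": body.findtext("subject"),
        "Content": body.findtext("content").strip(),
        # Attributes also map to fields.
        "Attach": [
            {"Attachfile": node.get("attachfile")}
            for node in body.findall("attachment")
        ],
    }


record = email_to_record(ET.fromstring(SAMPLE))
print(record["SenderName"], len(record["CC"]), record["Attach"])
```

Note that <head> and <body> are only navigated through; neither produces a field in the resulting record.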

Note that not all tags in the XML document appear in the corresponding dataset. For example, the <head>...</head> element has no corresponding field in the resulting dataset. Typically, only elements that have values, elements that can be repeated, or the attributes of a tag map to the fields (including nested dataset fields) of a dataset. The exception to this rule is when a parent node in the XML document maps to a field whose value is built up from the values of the child nodes. For example, an XML document might contain a set of tags such as

<FullName>
   <Title> Mr. </Title>
   <FirstName> John </FirstName>
   <LastName> Smith </LastName>
</FullName>

which could map to a single dataset field with the value

Mr. John Smith
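
As a rough illustration of building one field value from child nodes, the Python sketch below trims the text of each child and joins the pieces with single spaces. The function name composite_value and the choice of a space as the separator are assumptions made for this example; an actual transformation may combine the values differently.

```python
import xml.etree.ElementTree as ET

FULL_NAME = """<FullName>
   <Title> Mr. </Title>
   <FirstName> John </FirstName>
   <LastName> Smith </LastName>
</FullName>"""


def composite_value(parent):
    """Join the trimmed text of each child node, in document order."""
    parts = [(child.text or "").strip() for child in parent]
    # Skip children that are empty so no double spaces appear.
    return " ".join(part for part in parts if part)


full_name = composite_value(ET.fromstring(FULL_NAME))
print(full_name)  # Mr. John Smith
```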
