Thursday, November 09, 2006
Delta Specification
Introduction
Delta is an XML-based language for describing changes to XML documents. A delta document (doc) contains a reference to a target XML doc, and a sequence of elements that describe “add” and “remove” operations to be applied to the target doc.
Delta makes it possible to describe changes without modifying the underlying doc. This allows a group of people to exchange changes efficiently, without exchanging the doc itself. Additionally, delta makes it possible to compare sets of independent changes, and to merge delta operations as a way of combining several people’s work.
Increasingly, people use XML to format data that is exchanged between programs. Historically, though, changes have been transmitted as updated versions of XML docs, which places the burden on the receiving program to work out what changed by comparing versions. This approach loses information (namely, the intermediate change steps), and it is inefficient when the docs become large.
Delta keeps change information out of the original doc, and organizes the changes in a sequence that corresponds to the order in which they are made. As such, a delta doc represents the recipe for how a set of changes is to be made, and thus, is distinct from the doc itself.
Figure 1 illustrates the relationships among the delta docs, the docs undergoing changes, and the software components required to process these docs. The delta doc has a dependency upon a separate XML doc that is being changed. Furthermore, the delta processor operates upon the delta doc in order to apply changes to the start doc, and produce an output (end) doc.
This specification describes the rules for creating valid delta docs, as well as the rules that delta processors must follow in applying change operations to a target doc.
Language Elements
<Delta> Element
The <delta> element is the root of a delta doc. It contains the information a delta processor needs to make a set of changes to a referenced “target” XML doc. In most cases, the target doc represents the “start” state, before a set of operations has been applied, but it is also possible to reference a doc in its “end” state, after a set of change operations has been applied. In the latter case, a delta processor would process the operations in reverse order to re-derive the “start” state of the doc.
Delta docs can optionally provide a date value by way of an <updated> element, in RFC 3339 format, to indicate when the delta doc was modified. This information is useful in determining the chronological order of multiple delta docs that reference the same target doc.
A delta doc must have a <start> or <end> element that specifies the URI to the target doc. This is followed by an <operations> element, which specifies the set of individual operations.
The following simple example shows a delta doc that modifies an Atom feed: a new <entry> element is added after the first existing entry, and the <updated> element is removed from the <feed> element and replaced with a new one. The original feed doc follows the delta doc.
<?xml version="1.0" encoding="utf-8"?>
<delta xmlns="http://www.delta.org/2006/Delta" version="0.1"
       xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:atom="http://www.w3.org/2005/Atom">
  <updated>2006-03-31T11:42:55-05:00</updated>
  <start>http://www.somewhere.com/atom1.xml</start>
  <operations>
    <add id="1">
      <date>2006-03-31T11:42:51-05:00</date>
      <path directive="after">//atom:feed/atom:entry[1]</path>
      <value>
        <atom:entry>
          <atom:id>tag:intertwingly.net,2004:2180</atom:id>
          <atom:link rel="alternate" />
          <atom:title>Bridge Crossing Puzzle</atom:title>
          <atom:summary>My daughter was given a puzzle.</atom:summary>
          <atom:content type="xhtml">
            <xhtml:div>
              <xhtml:p>My daughter was given a puzzle.</xhtml:p>
            </xhtml:div>
          </atom:content>
          <atom:updated>2006-03-31T11:42:54-05:00</atom:updated>
        </atom:entry>
      </value>
    </add>
    <remove id="2">
      <date>2006-03-31T11:42:53-05:00</date>
      <path>//atom:feed/atom:updated</path>
    </remove>
    <add id="3">
      <date>2006-03-31T11:42:54-05:00</date>
      <path directive="after">//atom:feed/*[3]</path>
      <value>
        <atom:updated>2006-03-31T11:42:54-05:00</atom:updated>
      </value>
    </add>
  </operations>
</delta>
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en-US'>
  <title>Example Feed</title>
  <subtitle>Insert witty or insightful remark here</subtitle>
  <link href='http://example.org/' />
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href='http://example.org/2003/12/13/atom03' />
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
</feed>
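For reference, applying the three operations in id order to the original feed should, on one reading of the processing rules in this specification, yield a doc along the following lines. This is a hand-derived sketch, not the output of any particular delta processor; the whitespace, the use of a default namespace instead of the atom: prefix, and the explanatory comments are assumptions made for readability.

```xml
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-US">
  <title>Example Feed</title>
  <subtitle>Insert witty or insightful remark here</subtitle>
  <link href="http://example.org/" />
  <!-- Operation 2 removed the old <updated>; operation 3 added this one
       after the third child of <feed>, which is <link> at that point -->
  <updated>2006-03-31T11:42:54-05:00</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03" />
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
  <!-- Operation 1 added this entry after the first existing entry -->
  <entry>
    <id>tag:intertwingly.net,2004:2180</id>
    <link rel="alternate" />
    <title>Bridge Crossing Puzzle</title>
    <summary>My daughter was given a puzzle.</summary>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>My daughter was given a puzzle.</p>
      </div>
    </content>
    <updated>2006-03-31T11:42:54-05:00</updated>
  </entry>
</feed>
```

Note that operation 3 only resolves to <link> because operation 2 has already removed the original <updated> element; evaluated against the untouched feed, the same path would still select <link>, but a path such as //atom:feed/*[4] would select a different element before and after the removal.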
<Operations> Element
The <operations> element contains an ordered list of “add” and “remove” operations. Each operation is represented by an <add> or <remove> element. Each of these elements must have an “id” attribute whose value is unique among its set of siblings. The id is used to determine the relative order in which the operations are to be applied. Optionally, one can include a child <date> element on each operation to specify an absolute point in time for the operation. This can be useful when a delta processor compares two or more separate delta documents, where the id alone does not provide enough information to determine the relative order in which the operations from separate delta docs occur.
“Add” and “remove” operations also contain a <path> element with an XPath value that determines where an operation should be performed. Delta processors should evaluate the XPath, and use the resulting “found” element(s) and/or attribute(s) as the context in which to perform the operation.
Before applying an operation, a delta processor must apply all of the operations that precede it in the delta doc. This requirement ensures that the doc is in the state that the operation’s XPath expects. Where a delta doc references an “end” doc instead, the operations from the last one back to the specific operation must be undone in reverse order, to return the doc to the valid state for that operation.
<Add> Element
The XPath on an “add” operation must reference one or more “target” elements or attributes in the target doc. For “add” operations, the <path> element may also have a “directive” attribute that indicates where, relative to the target element/attribute, the value should be added. The values for the “directive” attribute are: “before”, “after”, and “child”, with “child” being the assumed value in the absence of a directive attribute.
<add id="1">
  <path directive="child">//xcal:iCalendar/xcal:vcalendar/xcal:vevent</path>
  <value>
    <ibmcal:summary>new summary</ibmcal:summary>
  </value>
</add>
Example 1: Add a <summary> element as a child of an <xcal:vevent> element.
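The “before” and “after” directives add the value as a sibling of the target rather than as a child. As a minimal sketch, assuming the same xcal and ibmcal prefixes as Example 1 (bound here to placeholder namespace URIs that are not taken from this specification) and a hypothetical <ibmcal:note> element, the following operation would insert the new element immediately before each matched <xcal:vevent>:

```xml
<!-- The xcal and ibmcal namespace URIs below are placeholders for illustration -->
<add id="4" xmlns="http://www.delta.org/2006/Delta"
     xmlns:xcal="urn:example:xcal" xmlns:ibmcal="urn:example:ibmcal">
  <!-- "before" makes the value a preceding sibling of the target element -->
  <path directive="before">//xcal:iCalendar/xcal:vcalendar/xcal:vevent</path>
  <value>
    <ibmcal:note>inserted ahead of the event</ibmcal:note>
  </value>
</add>
```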
A path can also reference multiple elements, in which case the “add” operation is applied to each referenced element.
<path>//xcal:iCalendar/xcal:vcalendar/xcal:vevent/xcal:attendee[position() = 1 or position() = 3 or position() = 5]</path>
Example 2: An XPath that resolves to multiple elements.
An “add” operation can also have a <value> that consists of a sequence of child elements. In this case, the sequence is added in the same manner as a single element. This approach makes it possible to combine a set of “add” operations where multiple children are being added to a single parent element.
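As an illustration, the following sketch combines what would otherwise be two separate “add” operations into one. The <ibmcal:location> element and the namespace URIs bound to the xcal and ibmcal prefixes are assumptions made for this example; they are not defined by this specification. It also assumes that a processor adds the children in the order in which they appear in the <value>.

```xml
<!-- The xcal and ibmcal namespace URIs below are placeholders for illustration -->
<add id="1" xmlns="http://www.delta.org/2006/Delta"
     xmlns:xcal="urn:example:xcal" xmlns:ibmcal="urn:example:ibmcal">
  <path directive="child">//xcal:iCalendar/xcal:vcalendar/xcal:vevent</path>
  <value>
    <!-- Both elements become children of each matched <xcal:vevent> -->
    <ibmcal:summary>new summary</ibmcal:summary>
    <ibmcal:location>new location</ibmcal:location>
  </value>
</add>
```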
Adding attributes poses a slightly different challenge. The XPath for an attribute-based “add” operation should reference the element(s) to which the attribute(s) should be added. Since an attribute is represented by a name and a value, delta uses an <attribute> element to encapsulate this information as a child element of the <value> element. A delta processor needs to unpack the attribute information from the <value> and create/add a new attribute to the XPath-referenced elements:
<add id="2">
  <path>//xcal:iCalendar/xcal:vcalendar/xcal:vevent</path>
  <value>
    <attribute name="ibmcal:draft" value="true" />
  </value>
</add>
Example 3: Add an @draft attribute to the <xcal:vevent> element.
To add an attribute, one can also specify an XPath to an existing attribute on an element, and then use the “directive” attribute on the <path> to indicate whether the attribute should be added “before” or “after” the referenced attribute.
<add id="3">
  <path directive="after">//xcal:iCalendar/xcal:vcalendar/@version</path>
  <value>
    <attribute name="xcal:prodid" value="-//handcal//NONSGML 1.0//EN" />
  </value>
</add>
Example 4: Add a @prodid attribute after the @version attribute on the <xcal:vcalendar> element.
<Remove> Element
“Remove” operations are easier to specify than “add” operations because the XPath simply references the element(s) and/or attribute(s) to remove. No “directive” is needed on the <path> element. As with “add” operations, one or more values can be removed in a single “remove” operation. Optionally, the removed values can be placed in the <value> element. However, where a delta doc references an “end” state doc, the removed value must be saved in the <value> element of the “remove” operation, so that a delta processor can undo the operation and put the removed value back into the target doc.
<remove id="2">
  <path>//xcal:iCalendar/xcal:vcalendar/xcal:vevent/xcal:attendee[@role="REQ-PARTICIPANT"]
    [. = "tuser1@dominoportal.com" or . = "tuser2@dominoportal.com"]</path>
</remove>
Example 5: Remove two <xcal:attendee> elements where there’s a match on both the @role attribute, and the text value of the element.
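To make the “end” state case concrete, here is a minimal sketch of a delta doc that references an end doc and therefore carries the removed value. The URI in <end> is hypothetical, and the example assumes the removed element is the <subtitle> of the Atom feed shown earlier. How a processor decides exactly where among its siblings to restore the value is not spelled out in this specification, so that part is left open here.

```xml
<?xml version="1.0" encoding="utf-8"?>
<delta xmlns="http://www.delta.org/2006/Delta" version="0.1"
       xmlns:atom="http://www.w3.org/2005/Atom">
  <!-- Hypothetical URI of the feed as it looks after the removal -->
  <end>http://www.somewhere.com/atom1-end.xml</end>
  <operations>
    <remove id="1">
      <path>//atom:feed/atom:subtitle</path>
      <!-- Saved copy of the removed element, so the removal can be undone -->
      <value>
        <atom:subtitle>Insert witty or insightful remark here</atom:subtitle>
      </value>
    </remove>
  </operations>
</delta>
```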
In the case of removing an attribute, the entire attribute is removed, not just the value.
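For instance, a “remove” operation aimed at an attribute might look like the sketch below. It assumes the ibmcal:draft attribute added in Example 3, and it binds the xcal and ibmcal prefixes to placeholder namespace URIs that are not taken from this specification.

```xml
<!-- The xcal and ibmcal namespace URIs below are placeholders for illustration -->
<remove id="4" xmlns="http://www.delta.org/2006/Delta"
        xmlns:xcal="urn:example:xcal" xmlns:ibmcal="urn:example:ibmcal">
  <!-- The path selects the attribute node itself, so its name and value both go -->
  <path>//xcal:iCalendar/xcal:vcalendar/xcal:vevent/@ibmcal:draft</path>
</remove>
```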
Delta Schema
This section contains the XSD representation of the delta schema:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
           xmlns:delta="http://www.delta.org/2006/Delta" attributeFormDefault="unqualified"
           ecore:package="com.ibm.delta" elementFormDefault="qualified"
           targetNamespace="http://www.delta.org/2006/Delta">
  <xs:annotation>
    <xs:documentation>This version of the Delta schema is based on version 0.1 of the
      format specifications, found here
      http://www.deltaweb.org/developers/delta-format-spec.html.</xs:documentation>
  </xs:annotation>
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/03/xml.xsd" />
  <xs:annotation>
    <xs:documentation>A Delta document may have one root element:
      delta</xs:documentation>
  </xs:annotation>
  <xs:element name="delta" type="delta:deltaType" />
  <xs:complexType ecore:name="Delta" name="deltaType">
    <xs:choice maxOccurs="unbounded" minOccurs="2">
      <xs:element maxOccurs="1" minOccurs="0" name="updated" type="delta:dateTimeType" />
      <xs:group maxOccurs="1" minOccurs="1" ref="delta:baseGroup" />
      <xs:element maxOccurs="1" minOccurs="1" name="operations"
                  type="delta:operationsType">
        <xs:unique name="operationId">
          <!-- Select the add and remove children, so that each id is unique among siblings -->
          <xs:selector xpath="delta:add|delta:remove" />
          <xs:field xpath="@id" />
        </xs:unique>
      </xs:element>
      <xs:any maxOccurs="unbounded" minOccurs="0" namespace="##other"
              processContents="lax" />
    </xs:choice>
    <xs:attributeGroup ref="delta:commonAttributes" />
  </xs:complexType>
  <xs:group name="baseGroup">
    <xs:choice>
      <xs:element maxOccurs="1" minOccurs="0" name="start" type="delta:uriType" />
      <xs:element maxOccurs="1" minOccurs="0" name="end" type="delta:uriType" />
    </xs:choice>
  </xs:group>
  <xs:complexType ecore:name="URI" name="uriType">
    <xs:simpleContent>
      <xs:extension base="xs:anyURI">
        <xs:attributeGroup ref="delta:commonAttributes" />
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:complexType ecore:name="Operations" name="operationsType">
    <xs:choice maxOccurs="unbounded" minOccurs="1">
      <xs:element maxOccurs="unbounded" minOccurs="0" name="add"
                  type="delta:operationType" />
      <xs:element maxOccurs="unbounded" minOccurs="0" name="remove"
                  type="delta:operationType" />
    </xs:choice>
    <xs:attributeGroup ref="delta:commonAttributes" />
  </xs:complexType>
  <xs:complexType ecore:name="Operation" name="operationType">
    <xs:choice maxOccurs="unbounded" minOccurs="3">
      <xs:element maxOccurs="1" minOccurs="1" name="date" type="delta:dateTimeType" />
      <xs:element maxOccurs="1" minOccurs="1" name="path" type="delta:xPathType" />
      <xs:element maxOccurs="1" minOccurs="1" name="value" type="delta:contentType" />
      <xs:any namespace="##other" processContents="lax" />
    </xs:choice>
    <xs:attribute name="id" type="xs:positiveInteger" />
    <xs:attribute name="family" type="xs:string" />
    <xs:attributeGroup ref="delta:commonAttributes" />
  </xs:complexType>
  <xs:complexType ecore:name="DateTime" name="dateTimeType">
    <xs:simpleContent>
      <xs:extension base="delta:iso8601dateTime">
        <xs:attributeGroup ref="delta:commonAttributes" />
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:simpleType name="iso8601dateTime">
    <xs:union memberTypes="xs:dateTime xs:date xs:gYearMonth xs:gYear" />
  </xs:simpleType>
  <xs:complexType name="xPathType">
    <xs:annotation>
      <xs:documentation>A subset of XPath expressions for use in
        selectors</xs:documentation>
      <xs:documentation>A utility type, not for public use</xs:documentation>
    </xs:annotation>
    <xs:simpleContent>
      <xs:extension base="delta:xPathTypeSimple">
        <xs:attribute name="directive">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="before" />
              <xs:enumeration value="after" />
              <xs:enumeration value="child" />
            </xs:restriction>
          </xs:simpleType>
        </xs:attribute>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:simpleType name="xPathTypeSimple">
    <xs:annotation>
      <xs:documentation>A subset of XPath expressions for use in
        selectors</xs:documentation>
      <xs:documentation>A utility type, not for public use</xs:documentation>
    </xs:annotation>
    <xs:restriction base="xs:token">
      <xs:annotation>
        <xs:documentation>The following pattern is intended to allow XPath expressions
          per the following EBNF: Selector ::= Path ( '|' Path )* Path ::= ('.//')? Step (
          '/' Step )* Step ::= '.' | NameTest NameTest ::= QName | '*' | NCName ':' '*'
          child:: is also allowed</xs:documentation>
      </xs:annotation>
      <xs:pattern
        value="(\.//)?(((child::)?((\i\c*:)?(\i\c*|\*)))|\.)(/(((child::)?((\i\c*:)?(\i\c*|\*)))|\.))*(\|(\.//)?(((child::)?((\i\c*:)?(\i\c*|\*)))|\.)(/(((child::)?((\i\c*:)?(\i\c*|\*)))|\.))*)*" />
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType ecore:name="Content" mixed="true" name="contentType">
    <xs:sequence>
      <xs:any maxOccurs="unbounded" minOccurs="0" namespace="##other"
              processContents="lax" />
    </xs:sequence>
    <xs:attribute name="src" type="xs:anyURI" use="optional" />
    <xs:attributeGroup ref="delta:commonAttributes" />
  </xs:complexType>
  <xs:attributeGroup name="commonAttributes">
    <xs:attribute ref="xml:base" />
    <xs:attribute ref="xml:lang" />
    <xs:anyAttribute namespace="##other" processContents="lax" />
  </xs:attributeGroup>
</xs:schema>