Independent Submission E. Wilde Request for Comments: 7351 UC Berkeley Category: Informational August 2014 ISSN: 2070-1721
A Media Type for XML Patch Operations
Abstract
The XML patch document format defines an XML document structure for expressing a sequence of patch operations to be applied to an XML document. The XML patch document format builds on the foundations defined in RFC 5261. This specification also provides the media type registration "application/xml-patch+xml", to allow the use of XML patch documents in, for example, HTTP conversations.
Status of This Memo
This document is not an Internet Standards Track specification; it is published for informational purposes.
This is a contribution to the RFC Series, independently of any other RFC stream. The RFC Editor has chosen to publish this document at its discretion and makes no statement about its value for implementation or deployment. Documents approved for publication by the RFC Editor are not a candidate for any level of Internet Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7351.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
The Extensible Markup Language (XML) [RFC7303] is a common format for the exchange and storage of structured data. HTTP PATCH [RFC5789] extends HTTP [RFC7231] with a method to perform partial modifications to resources. HTTP PATCH requires that patch documents be sent along with the request, and it is therefore useful for there to be standardized patch document formats (identified by media types) for popular media types.
The XML patch media type "application/xml-patch+xml" is an XML document structure for expressing a sequence of operations to apply to a target XML document, suitable for use with the HTTP PATCH method. Servers can freely choose which patch formats they want to accept, and "application/xml-patch+xml" could be a simple default format that can be used unless a server decides to use a different (maybe more sophisticated) patch format for XML.
The format for patch documents is based on the XML patch framework defined in RFC 5261 [RFC5261]. While RFC 5261 does define a concrete syntax as well as the media type "application/patch-ops-error+xml" for error documents, it only defines XML Schema (XSD) [W3C.REC-xmlschema-1-20041028] types for patch operations. The concrete document format and the media type for patch operations are defined in an XSD defined in this specification.
This specification relies on RFC 5261 but also requires that errata reported to date are taken into account. The main reason for the errata is the problematic ways in which RFC 5261 relies on XML Path Language (XPath) as the expression language for selecting the location of a patch, while at the same time XPath's data model does
Wilde Informational [Page 2]
RFC 7351 XML Patch August 2014
not contain sufficient information to determine whether such a selector indeed can be used for a patch operation or should result in an error. Specifically, the problem occurs with namespaces, where XPath does not expose namespace declaration attributes, while the patch model needs them to determine whether or not a namespace patch is allowed. Appendix A contains more information about the general problem and errata reports.
The XML patch document format is based on a simple schema that uses a "patch" element as the document element and allows an arbitrary sequence of "add", "remove", and "replace" elements as the children of the document element. These children follow the semantics defined in RFC 5261, which means that each element is treated as an individual patch operation, and the result of each patch operation is a patched XML document that is the target XML document for the next patch operation.
The following simple example patch document contains a single patch operation. This operation adds a new attribute called "new-attribute" to the document element of the target XML document. An XML patch document always uses a "patch" element in the "urn:ietf:rfc:7351" namespace as the document element that contains zero or more patch operation elements, which are also in the "urn:ietf:rfc:7351" namespace.
The following more complex example patch document uses the example from RFC 5261, Section A.18 (but changing the example namespaces to example.com URIs); it uses the same "patch" element and XML namespace as shown in the simpler example. It shows the general structure of an XML patch document with multiple operations, as well as an example of each operation.
Wilde Informational [Page 3]
RFC 7351 XML Patch August 2014
<p:patch xmlns="http://example.com/ns1" xmlns:y="http://example.com/ns2" xmlns:p="urn:ietf:rfc:7351"> <p:add sel="doc/elem[@a='foo']"> <!-- This is a new child --> <child id="ert4773"> <y:node/> </child> </p:add> <p:replace sel="doc/note/text()">Patched doc</p:replace> <p:remove sel="*/elem[@a='bar']/y:child" ws="both"/> <p:add sel="*/elem[@a='bar']" type="@b">new attr</p:add> </p:patch>
As this example demonstrates, both the document element "patch" and the patch operation elements are in the same XML namespace. This is the result of RFC 5261 only defining types for the patch operation elements, which then can be reused in schemas to define concrete patch elements.
RFC 5261 defines XSD [W3C.REC-xmlschema-1-20041028] for the patch operation types. The following schema for the XML patch media type is based on the types defined in RFC 5261, which are imported as "rfc5261.xsd" in the following schema. The schema defines a "patch" document element, and then allows an unlimited (and possibly empty) sequence of the "add", "remove", and "replace" operation elements, which are directly based on the respective types from the schema defined in RFC 5261.
Since the semantics of the XML patch operations are defined by RFC 5261, please refer to the numerous examples in that specification for more XML patch document examples. All the examples in RFC 5261 can be taken as examples for the XML patch media type, when looking at them with two minor changes in mind.
The two differences are that XML patch documents always use the "patch" element as the document element and that both the "patch" element and the individual operation elements in XML patch documents have to be in the XML namespace with the URI "urn:ietf:rfc:7351".
For example, consider the patch example in RFC 5261, Appendix A.1, "Adding an Element". In this example, the patch is applied to the following XML document:
<?xml version="1.0" encoding="UTF-8"?> <doc> <note>This is a sample document</note> </doc>
The patch example is based on the following patch document (with the element and namespace changes described above):
<?xml version="1.0" encoding="UTF-8"?> <p:patch xmlns:p="urn:ietf:rfc:7351"> <p:add sel="doc"><foo id="ert4773">This is a new child</foo></p:add> </p:patch>
Applying the patch results in the following XML document:
<?xml version="1.0" encoding="UTF-8"?> <doc> <note>This is a sample document</note> <foo id="ert4773">This is a new child</foo></doc>
Applications that use this media type: Applications that manipulate XML documents.
Additional information:
Magic number(s): N/A
File extension(s): XML documents often use ".xml" as the file extension, and this media type does not propose a specific extension other than this generic one.
Macintosh file type code(s): TEXT
Person & email address to contact for further information: Erik Wilde <dret@berkeley.edu>
The security considerations from RFC 5261 [RFC5261] apply to the application/xml-patch+xml media type.
In addition, parsing XML may entail including information from external sources through XML's mechanism of external entities. Implementations, therefore, should be aware of the fact that standard parsers may resolve external entities and thus include external information as a result of applying patch operations to an XML document.
[RFC3470] Hollenbeck, S., Rose, M., and L. Masinter, "Guidelines for the Use of Extensible Markup Language (XML) within IETF Protocols", BCP 70, RFC 3470, January 2003.
[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.
[RFC5261] Urpalainen, J., "An Extensible Markup Language (XML) Patch Operations Framework Utilizing XML Path Language (XPath) Selectors", RFC 5261, September 2008.
[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type Specifications and Registration Procedures", BCP 13, RFC 6838, January 2013.
[RFC7303] Thompson, H. and C. Lilley, "XML Media Types", RFC 7303, July 2014.
[RFC5789] Dusseault, L. and J. Snell, "PATCH Method for HTTP", RFC 5789, March 2010.
Wilde Informational [Page 7]
RFC 7351 XML Patch August 2014
[RFC7231] Fielding, R. and J. Reschke, "Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content", RFC 7231, June 2014.
[W3C.REC-DOM-Level-3-Core-20040407] Robie, J., Wood, L., Champion, M., Hegaret, P., Nicol, G., Le Hors, A., and S. Byrne, "Document Object Model (DOM) Level 3 Core Specification", World Wide Web Consortium Recommendation REC-DOM-Level-3-Core-20040407, April 2004, <http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407>.
[W3C.REC-xml-20081126] Sperberg-McQueen, C., Yergeau, F., Paoli, J., Maler, E., and T. Bray, "Extensible Markup Language (XML) 1.0 (Fifth Edition)", World Wide Web Consortium Recommendation REC- xml-20081126, November 2008, <http://www.w3.org/TR/2008/REC-xml-20081126>.
[W3C.REC-xml-names-20091208] Hollander, D., Layman, A., Bray, T., Tobin, R., and H. Thompson, "Namespaces in XML 1.0 (Third Edition)", World Wide Web Consortium Recommendation REC-xml-names-20091208, December 2009, <http://www.w3.org/TR/2009/REC-xml-names-20091208>.
[W3C.REC-xmlschema-1-20041028] Thompson, H., Beech, D., Maloney, M., and N. Mendelsohn, "XML Schema Part 1: Structures Second Edition", World Wide Web Consortium Recommendation REC-xmlschema-1-20041028, October 2004, <http://www.w3.org/TR/2004/REC-xmlschema-1-20041028>.
[W3C.REC-xpath-19991116] DeRose, S. and J. Clark, "XML Path Language (XPath) Version 1.0", World Wide Web Consortium Recommendation REC-xpath-19991116, November 1999, <http://www.w3.org/TR/1999/REC-xpath-19991116>.
[W3C.REC-xpath20-20101214] Boag, S., Berglund, A., Kay, M., Simeon, J., Robie, J., Chamberlin, D., and M. Fernandez, "XML Path Language (XPath) 2.0 (Second Edition)", World Wide Web Consortium Recommendation REC-xpath20-20101214, December 2010, <http://www.w3.org/TR/2010/REC-xpath20-20101214>.
This section is informative. It describes some issues that might be interesting for implementers, but it might also be interesting for users of XML patch that want to understand some of the differences between standard XPath 1.0 processing and the processing model of selectors in RFC 5261.
Specifically, the issues described in the following two sections have been identified as technical issues with RFC 5261 and have been filed as errata. Implementers interested in using XML patch are encouraged to take those errata into account when implementing XML patch documents. The issue about "Matching Namespaces" described in Appendix A.1 has been filed as RFC Errata ID 3477 [Err3477]. The issue about "Patching Namespaces" described in Appendix A.2 has been filed as RFC Errata ID 3478 [Err3478].
RFC 5261 defines standard rules for matching prefixed names in expressions: any prefixes are interpreted according to the namespace bindings of the diff document (the document that the expression is applied against). This means that each prefixed name can be interpreted in the context of the diff document.
For unprefixed names in expressions, the rules depart from XPath 1.0 [W3C.REC-xpath-19991116]. XPath 1.0 defines that unprefixed names in expressions match namespace-less names (i.e., there is no "default namespace" for names used in XPath 1.0 expressions). RFC 5261 requires, however, that unprefixed names in expressions must use the default namespace of the diff document (if there is one). This means that it is not possible to simply take a selector from a patch document and evaluate it in the context of the diff document according to the rules of XPath 1.0 because this would interpret unprefixed names incorrectly. As a consequence, it is not possible to simply take an XPath 1.0 processor and evaluate XML patch selectors in the context of the diff document.
As an extension of XPath 1.0's simple model, XPath 2.0 [W3C.REC-xpath20-20101214] specifies different processing rules for unprefixed names: they are matched against the URI of the "default element/type namespace", which is defined as part of an expression's static context. In some XPath 2.0 applications, this can be set; XSL Transformations (XSLT) 2.0, for example, has the ability to define an "xpath-default-namespace", which then will be used to match unprefixed names in expressions. Thus, by using an XPath 2.0 implementation that allows one to set this URI, and setting it to the default namespace of the diff document (or leaving it undefined if
Wilde Informational [Page 9]
RFC 7351 XML Patch August 2014
there is no such default namespace), it is possible to use an out-of- the-box XPath 2.0 implementation for evaluating XML patch selectors.
Please keep in mind, however, that evaluating selectors is only one part of applying patches. When it comes to applying the actual patch operation, neither XPath 1.0 nor XPath 2.0 are sufficient because they do not preserve some of the information from the XML syntax (specifically namespace declarations) that is required to correctly apply patch operations. The following section describes this issue in more detail.
Please note that [RFC5261], Section 4.2.2 on namespace matching explains XPath 2.0's rules incorrectly. For this reason, RFC Errata ID 3477 is available for Section 4.2.2 of RFC 5261.
One of the issues when patching namespaces based on XPath is that XPath exposes namespaces differently than the XML 1.0 [W3C.REC-xml-20081126] syntax for XML namespaces [W3C.REC-xml-names-20091208]. In the XML syntax, a namespace is declared with an attribute using the reserved name or prefix "xmlns", and this results in this namespace being available recursively through the document tree. In XPath, the namespace declaration is not exposed as an attribute (i.e., the attribute, although syntactically an XML attribute, is not accessible in XPath), but the resulting namespace nodes are exposed recursively through the tree.
RFC 5261 uses the terms "namespace declaration" and "namespace" almost interchangeably, but it is important to keep in mind that the namespace declaration is an XML syntax construct that is unavailable in XPath, while the namespace itself is a logical construct that is not visible in the XML syntax, but a result of a namespace declaration. The intent of RFC 5261 is to patch namespaces as if namespace declarations were patched; thus, it only allows patching namespace nodes on the element nodes where the namespace has been declared.
Patching namespaces in XML patch is supposed to "emulate" the effect of actually changing the namespace declaration (which is why a namespace can only be patched at the element where it has been declared). Therefore, when patching a namespace, even though XPath's "namespace" axis is used, implementations have to make sure that not only the single selected namespace node is being patched but that all namespaces nodes resulting from the namespace declaration of this namespace are also patched accordingly.
Wilde Informational [Page 10]
RFC 7351 XML Patch August 2014
This means that an implementation might have to descend into the tree, matching all namespace nodes with the selected prefix/URI pair recursively, until it encounters leaf elements or namespace declarations with the same prefix it is patching. Determining this requires access to the diff document beyond XPath, because, in XPath itself, namespace declarations are not represented; thus, such a recursive algorithm wouldn't know when to stop. Consider the following document:
<x xmlns:a="tag:42"> <y xmlns:a="tag:42"/> </x>
If this document is patched with a selector of /x/namespace::a, then only the namespace node on element x should be patched, even though the namespace node on element y has the same prefix/URI combination as the one on element x. However, determining that the repeated namespace declaration was present at all on element y is impossible when using XPath alone, which means that implementations must have an alternative way to determine the difference between the document above, and this one:
<x xmlns:a="tag:42"> <y/> </x>
In this second example, patching with a selector of /x/namespace::a should indeed change the namespace nodes on elements x and y, because they both have been derived from the same namespace declaration.
The conclusion of these considerations is that for implementing XML patch, access closer to the XML syntax (specifically access to namespace declarations) is necessary. As a result, implementations attempting to exclusively use the XPath model for implementing XML patch will fail to correctly address certain edge cases (such as the one shown above).
Note that XPath's specific limitations do not mean that it is impossible to use XML technologies other than XPath. The Document Object Model (DOM) [W3C.REC-DOM-Level-3-Core-20040407], for example, does expose namespace declaration attributes as regular attributes in the document tree; thus, they could be used to differentiate between the two variants shown above.
Please note that RFC 5261, Section 4.4.3 (on replacing namespaces) mixes the terms "namespace declaration" and "namespace". For this reason, RFC Errata ID 3478 is available for Section 4.4.3 of RFC 5261.
RFC 5261 [RFC5261] does not contain an ABNF grammar for the allowed subset of XPath expressions but includes an XSD-based grammar in its type definition for operation types. In order to make implementation easier, this appendix contains an ABNF grammar that has been derived from the XSD expressions in RFC 5261. In the following grammar, "xpath" is the definition for the allowed XPath expressions for remove and replace operations, and "xpath-add" is the definition for the allowed XPath expressions for add operations. The names of all grammar productions are the ones used in the XSD-based grammar of RFC 5261.
Please note that the "ncname" production listed above does not fully capture the constraints of the original XSD-based definition, where it is defined as "\i\c*". DIGIT and DQUOTE are defined by the ABNF specification [RFC5234].