Wordpress/tests/phpunit/includes/normalize-xml.xsl
Pascal Birchler b58973554d Sitemaps: Add XML sitemaps functionality to WordPress.
While web crawlers are able to discover pages from links within the site and from other sites, XML sitemaps supplement this approach by allowing crawlers to quickly and comprehensively identify all URLs included in the sitemap and learn other signals about those URLs using the associated metadata.

See https://make.wordpress.org/core/2020/06/10/merge-announcement-extensible-core-sitemaps/ for more details.

This feature exposes the sitemap index via `/wp-sitemap.xml` and exposes a variety of new filters and hooks for developers to modify the behavior. Users can disable sitemaps completely by turning off search engine visibility in WordPress admin.

This change also introduces a new `esc_xml()` function to escape strings for output in XML, as well as XML support to `wp_kses_normalize_entities()`.

Props Adrian McShane, afragen, adamsilverstein, casiepa, flixos90, garrett-eclipse, joemcgill, kburgoine, kraftbj, milana_cap, pacifika, pbiron, pfefferle, Ruxandra Gradina, swissspidy, szepeviktor, tangrufus, tweetythierry.
Fixes #50117.
See #3670. See #19998.


git-svn-id: https://develop.svn.wordpress.org/trunk@48072 602fd350-edb4-49c9-b593-d223f7449a82
2020-06-17 15:22:49 +00:00


<?xml version='1.0' encoding='UTF-8' ?>
<!--
Normalize an XML document to make it easier to check whether two documents will
be seen as "equal" by an XML processor.
The normalization is similar, in spirit, to {@link https://www.w3.org/TR/xml-c14n11/ Canonical XML},
but without some aspects of C14N that make the kinds of assertions we need difficult.
For example, the following XML documents will be interpreted the same by an XML processor,
even though a string comparison of them would show differences:
<root xmlns='urn:example'>
	<ns0:child xmlns:ns0='urn:another-example'>this is a test</ns0:child>
</root>
<ns0:root xmlns:ns0='urn:example'>
	<child xmlns='urn:another-example'>this is a test</child>
</ns0:root>
-->
<xsl:transform
	xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
	version='1.0'
>
<!--
Output UTF-8 XML, with no indentation and all CDATA sections replaced with their character content.
-->
<xsl:output
	method='xml'
	indent='no'
	cdata-section-elements=''
	encoding='UTF-8' />
<!--
Strip insignificant white space.
-->
<xsl:strip-space elements='*' />
<!--
Normalize elements by not relying on the prefix used in the input document
and ordering attributes first by namespace-uri and then by local-name.
-->
<xsl:template match='*' priority='10'>
	<xsl:element name='{local-name()}' namespace='{namespace-uri()}'>
		<xsl:apply-templates select='@*'>
			<xsl:sort select='namespace-uri()' />
			<xsl:sort select='local-name()' />
		</xsl:apply-templates>
		<xsl:apply-templates select='node()' />
	</xsl:element>
</xsl:template>
<!--
Normalize attributes by not relying on the prefix used in the input document.
-->
<xsl:template match='@*'>
	<xsl:attribute name='{local-name()}' namespace='{namespace-uri()}'>
		<xsl:value-of select='.' />
	</xsl:attribute>
</xsl:template>
<!--
Strip comments.
-->
<xsl:template match='comment()' priority='10' />
<!--
Pass all other nodes through unchanged.
-->
<xsl:template match='node()'>
	<xsl:copy>
		<xsl:apply-templates select='node()' />
	</xsl:copy>
</xsl:template>
</xsl:transform>
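
The claim in the stylesheet's opening comment, that the two example documents differ as strings yet are equal to an XML processor, can be checked without running XSLT. What follows is a minimal Python sketch using only the standard library's `xml.etree.ElementTree`. It is not part of the WordPress test suite, and the helper names `split_name`, `keep`, and `normalize` are hypothetical. It mimics the stylesheet's steps (ignore prefixes, sort attributes by namespace URI and then local name, drop comments and whitespace-only text) rather than applying the stylesheet itself.

```python
import xml.etree.ElementTree as ET


def split_name(name):
    """Split an ElementTree name of the form '{uri}local' into (uri, local)."""
    if name.startswith("{"):
        uri, local = name[1:].split("}", 1)
        return uri, local
    return "", name


def keep(text):
    """Drop whitespace-only text (like xsl:strip-space); keep other text as-is."""
    return text if text and text.strip() else ""


def normalize(elem):
    """Build a nested tuple that ignores prefixes and attribute order.

    Assumption: the tree came from ET.fromstring(), whose default parser
    already discards comments. Unlike the stylesheet, it also discards
    processing instructions.
    """
    # Sorting (uri, local) pairs orders by namespace URI, then local name.
    attrs = tuple(sorted((split_name(k), v) for k, v in elem.attrib.items()))
    children = tuple((normalize(child), keep(child.tail)) for child in elem)
    return (split_name(elem.tag), attrs, keep(elem.text), children)


doc_a = (
    "<root xmlns='urn:example'>\n"
    "  <ns0:child xmlns:ns0='urn:another-example'>this is a test</ns0:child>\n"
    "</root>"
)
doc_b = (
    "<ns0:root xmlns:ns0='urn:example'>\n"
    "  <child xmlns='urn:another-example'>this is a test</child>\n"
    "</ns0:root>"
)

print(doc_a == doc_b)  # False: the raw strings differ
print(normalize(ET.fromstring(doc_a)) == normalize(ET.fromstring(doc_b)))  # True
```

Comparing nested tuples sidesteps serialization entirely. The stylesheet instead produces a normalized document, which lets a test compare two XML strings directly.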