Class XmlCanonicalizer

java.lang.Object
cloud.opencode.base.xml.canonical.XmlCanonicalizer

public final class XmlCanonicalizer extends Object
XML Canonicalizer - Produces canonical XML output (C14N).

This utility class canonicalizes XML documents so that the serialized output is consistent regardless of how the input was formatted. Canonicalization sorts attributes, normalizes whitespace, and optionally removes comments.

Features:

  • Alphabetical attribute reordering
  • Namespace declaration normalization
  • UTF-8 encoding enforcement
  • XML declaration removal
  • Whitespace normalization between elements
  • Optional comment removal
  • Consistent output across multiple calls
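The attribute-reordering feature can be illustrated with a minimal sketch built only on the JDK's DOM API. It is not the library's implementation: `AttributeSortSketch` and `sortAttributes` are hypothetical names, the sketch looks only at the root element's attributes (child content is discarded), and namespace handling and parser hardening are left out for brevity.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of XmlCanonicalizer.
final class AttributeSortSketch {

    /** Re-emits the root element's start tag with its attributes sorted by name. */
    static String sortAttributes(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();

            // A TreeMap keeps its keys in natural String order, so iterating it
            // yields the attributes sorted by name.
            Map<String, String> sorted = new TreeMap<>();
            NamedNodeMap attrs = root.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                sorted.put(attrs.item(i).getNodeName(), attrs.item(i).getNodeValue());
            }

            StringBuilder out = new StringBuilder("<").append(root.getTagName());
            sorted.forEach((name, value) ->
                    out.append(' ').append(name).append("=\"").append(escape(value)).append('"'));
            // Child content is intentionally ignored in this sketch.
            return out.append("/>").toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Escapes the characters that are unsafe inside a double-quoted attribute value. */
    private static String escape(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }
}
```

Calling `AttributeSortSketch.sortAttributes("<root b='2' a='1'/>")` returns `<root a="1" b="2"/>`: the attributes are reordered and the quoting is made uniform.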

Usage Examples:

// Canonicalize an XML string (comments are preserved by default)
String xml = "<root b='2' a='1'/>";
String canonical = XmlCanonicalizer.canonicalize(xml);
// Result: <root a="1" b="2"/>

// Canonicalize and remove comments
String withoutComments = XmlCanonicalizer.canonicalize(xml, true);

// Canonicalize an XmlDocument
XmlDocument doc = XmlDocument.parse(xml);
String fromDocument = XmlCanonicalizer.canonicalize(doc);

Performance:

  • Time complexity: O(n log n), where n is the total number of nodes and attributes; attribute sorting dominates
  • Space complexity: O(n) for the DOM tree

Security:

  • Thread-safe: Yes (stateless utility class)
  • Null-safe: No (null inputs throw NullPointerException)
  • Secure parsing with XXE protection
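The page does not say how the parser is secured. As a rough sketch of one common way to get XXE protection with the JDK's built-in parser, the factory can be configured to reject DOCTYPE declarations outright. `HardenedParserSketch`, `newHardenedBuilder`, and `accepts` are hypothetical names, and the settings XmlCanonicalizer actually applies may differ.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of XmlCanonicalizer.
final class HardenedParserSketch {

    /** Builds a DOM parser that refuses DOCTYPE declarations, which blocks XXE payloads. */
    static DocumentBuilder newHardenedBuilder() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Without a DOCTYPE, no external or internal entities can be declared.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.setXIncludeAware(false);
            factory.setExpandEntityReferences(false);
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("XML parser cannot be hardened", e);
        }
    }

    /** Returns true if the input parses under the hardened settings, false if it is rejected. */
    static boolean accepts(String xml) {
        try {
            newHardenedBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

With these settings an ordinary document such as `<root a='1'/>` still parses, while any input that carries a `<!DOCTYPE ...>` declaration is rejected before an entity can be resolved.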
Since:
JDK 25, opencode-base-xml V1.0.3
Author:
Leon Soo www.LeonSoo.com
  • Method Details

    • canonicalize

      public static String canonicalize(String xml)
      Canonicalizes the given XML string.

      Comments are preserved by default.

      Parameters:
      xml - the XML string to canonicalize; must not be null
      Returns:
      the canonical XML string
      Throws:
      OpenXmlException - if parsing or transformation fails
    • canonicalize

      public static String canonicalize(XmlDocument doc)
      Canonicalizes the given XmlDocument.

      Comments are preserved by default.

      Parameters:
      doc - the XML document to canonicalize; must not be null
      Returns:
      the canonical XML string
      Throws:
      OpenXmlException - if transformation fails
    • canonicalize

      public static String canonicalize(String xml, boolean removeComments)
      Canonicalizes the given XML string with optional comment removal.
      Parameters:
      xml - the XML string to canonicalize; must not be null
      removeComments - true to remove comments from the output, false to preserve them
      Returns:
      the canonical XML string
      Throws:
      OpenXmlException - if parsing or transformation fails
    • canonicalize

      public static String canonicalize(XmlDocument doc, boolean removeComments)
      Canonicalizes the given XmlDocument with optional comment removal.
      Parameters:
      doc - the XML document to canonicalize; must not be null
      removeComments - true to remove comments from the output, false to preserve them
      Returns:
      the canonical XML string
      Throws:
      OpenXmlException - if transformation fails
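To make the effect of the `removeComments` flag concrete, the sketch below strips comment nodes from a DOM tree before serializing it with the JDK's identity transformer. It assumes a DOM-based approach, as the performance notes suggest. `CommentRemovalSketch`, `stripComments`, and `serialize` are hypothetical names, and the sketch performs none of the other canonicalization steps.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of XmlCanonicalizer.
final class CommentRemovalSketch {

    /** Recursively detaches every comment node below the given node. */
    static void stripComments(Node node) {
        // Walk backwards so removals do not shift the siblings still to be visited.
        for (int i = node.getChildNodes().getLength() - 1; i >= 0; i--) {
            Node child = node.getChildNodes().item(i);
            if (child.getNodeType() == Node.COMMENT_NODE) {
                node.removeChild(child);
            } else {
                stripComments(child);
            }
        }
    }

    /** Parses the input, optionally strips comments, and serializes without an XML declaration. */
    static String serialize(String xml, boolean removeComments) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            if (removeComments) {
                stripComments(doc);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not process XML", e);
        }
    }
}
```

Starting the walk at the `Document` node rather than the root element means comments that sit before or after the root element are removed as well.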