Class CsvDiff

java.lang.Object
cloud.opencode.base.csv.diff.CsvDiff

public final class CsvDiff extends Object
CSV Diff - Computes differences between two CSV documents.

Provides two comparison strategies: row-by-row positional comparison and key-based matching for more meaningful diffing of tabular data.

Features:

  • Positional row-by-row diff
  • Key-based diff using a key column
  • Detects ADDED, REMOVED, and MODIFIED rows
  • Preserves row order in results

Usage Examples:

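// Positional comparison: rows are matched by index
List<CsvChange> changes = CsvDiff.diff(original, modified);
// Key-based comparison: rows are matched on the "id" column
List<CsvChange> keyedChanges = CsvDiff.diffByKey(original, modified, "id");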

for (CsvChange change : changes) {
    switch (change.type()) {
        case ADDED    -> System.out.println("Added: " + change.newRow());
        case REMOVED  -> System.out.println("Removed: " + change.oldRow());
        case MODIFIED -> System.out.println("Modified row " + change.rowIndex());
    }
}

Security:

  • Thread-safe: Yes (stateless utility)
  • Null-safe: Yes (validates inputs)
Since:
JDK 25, opencode-base-csv V1.0.3
Author:
Leon Soo www.LeonSoo.com
Method Details

    • diff

      public static List<CsvChange> diff(CsvDocument original, CsvDocument modified)
      Computes differences between two CSV documents using positional row comparison.

      Compares rows at the same index position. Rows present only in the modified document are reported as ADDED, rows missing from the modified document are reported as REMOVED, and rows that differ at the same position are reported as MODIFIED.

      Parameters:
      original - the original document
      modified - the modified document
      Returns:
      list of changes (empty if the documents are identical)
      Throws:
      NullPointerException - if either argument is null
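
The positional strategy described above can be illustrated with a minimal sketch. It does not use the library's types: `PositionalDiffSketch`, its `Change` record, and the `List<List<String>>` row representation are hypothetical stand-ins, and the order in which changes are emitted is an assumption of this sketch rather than documented behavior of `CsvDiff.diff`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not the library's CsvDiff or CsvChange.
final class PositionalDiffSketch {

    enum Type { ADDED, REMOVED, MODIFIED }

    // oldRow is null for ADDED changes, newRow is null for REMOVED changes.
    record Change(Type type, int rowIndex, List<String> oldRow, List<String> newRow) {}

    static List<Change> diff(List<List<String>> original, List<List<String>> modified) {
        List<Change> changes = new ArrayList<>();
        int common = Math.min(original.size(), modified.size());

        // Rows present in both documents: compare cell lists at the same index.
        for (int i = 0; i < common; i++) {
            if (!original.get(i).equals(modified.get(i))) {
                changes.add(new Change(Type.MODIFIED, i, original.get(i), modified.get(i)));
            }
        }
        // Trailing rows that exist only in the original document.
        for (int i = common; i < original.size(); i++) {
            changes.add(new Change(Type.REMOVED, i, original.get(i), null));
        }
        // Trailing rows that exist only in the modified document.
        for (int i = common; i < modified.size(); i++) {
            changes.add(new Change(Type.ADDED, i, null, modified.get(i)));
        }
        return changes;
    }

    public static void main(String[] args) {
        List<List<String>> original = List.of(List.of("1", "Ann"), List.of("2", "Bob"));
        List<List<String>> modified =
                List.of(List.of("1", "Ann"), List.of("2", "Rob"), List.of("3", "Cy"));
        diff(original, modified).forEach(System.out::println);
    }
}
```

Because rows are matched purely by index, inserting one row near the top of a document shifts every later row and makes each of them look MODIFIED. That is the case the key-based strategy is meant to handle.
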
    • diffByKey

      public static List<CsvChange> diffByKey(CsvDocument original, CsvDocument modified, String keyColumn)
      Computes differences using a key column for row matching.

      Builds an index from the specified key column and matches rows by their key values. This is useful for comparing datasets where rows may be reordered but have a unique identifier.

      Parameters:
      original - the original document
      modified - the modified document
      keyColumn - the header name of the key column
      Returns:
      list of changes (empty if the documents are identical)
      Throws:
      NullPointerException - if any argument is null
      OpenCsvException - if the key column is not found
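
The index-then-match approach described for diffByKey can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `KeyDiffSketch`, its `Change` record, and the plain-list row representation are hypothetical, both documents are assumed to share one header row, `IllegalArgumentException` stands in for `OpenCsvException`, and the handling of duplicate keys and the ordering of results are choices made for this sketch only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for illustration; not the library's CsvDiff or CsvChange.
final class KeyDiffSketch {

    enum Type { ADDED, REMOVED, MODIFIED }

    // oldRow is null for ADDED changes, newRow is null for REMOVED changes.
    record Change(Type type, String key, List<String> oldRow, List<String> newRow) {}

    // Builds key -> row, keeping first-seen order. Assumption: on duplicate keys the later row wins.
    static Map<String, List<String>> indexByKey(List<List<String>> rows, int keyIndex) {
        Map<String, List<String>> index = new LinkedHashMap<>();
        for (List<String> row : rows) {
            index.put(row.get(keyIndex), row);
        }
        return index;
    }

    static List<Change> diffByKey(List<String> headers, List<List<String>> original,
                                  List<List<String>> modified, String keyColumn) {
        Objects.requireNonNull(headers, "headers");
        Objects.requireNonNull(original, "original");
        Objects.requireNonNull(modified, "modified");
        Objects.requireNonNull(keyColumn, "keyColumn");

        int keyIndex = headers.indexOf(keyColumn);
        if (keyIndex < 0) {
            // Stand-in for the OpenCsvException documented above.
            throw new IllegalArgumentException("Key column not found: " + keyColumn);
        }

        Map<String, List<String>> oldByKey = indexByKey(original, keyIndex);
        Map<String, List<String>> newByKey = indexByKey(modified, keyIndex);
        List<Change> changes = new ArrayList<>();

        // Keys from the original: either missing from the modified document or changed.
        for (Map.Entry<String, List<String>> e : oldByKey.entrySet()) {
            List<String> newRow = newByKey.get(e.getKey());
            if (newRow == null) {
                changes.add(new Change(Type.REMOVED, e.getKey(), e.getValue(), null));
            } else if (!e.getValue().equals(newRow)) {
                changes.add(new Change(Type.MODIFIED, e.getKey(), e.getValue(), newRow));
            }
        }
        // Keys that appear only in the modified document.
        for (Map.Entry<String, List<String>> e : newByKey.entrySet()) {
            if (!oldByKey.containsKey(e.getKey())) {
                changes.add(new Change(Type.ADDED, e.getKey(), null, e.getValue()));
            }
        }
        return changes;
    }
}
```

With this matching, two documents that contain the same rows in a different order produce an empty change list, whereas a positional comparison would report most of those rows as MODIFIED.
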