Class CsvQuery
Provides a SQL-like query API for filtering, sorting, projecting, and grouping CSV data. Each intermediate method returns a new CsvQuery instance and leaves the original unchanged, following the immutable builder pattern. Terminal operations execute the query pipeline lazily and return the results.
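The immutable, lazily executed chain described above can be illustrated with a minimal sketch. It is not the CsvQuery source: the class `ImmutableQuerySketch` is hypothetical, rows are modeled as plain `List<String>` values instead of the library's row type, and only `where`, `limit`, and `execute` are imitated.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in for illustration only; this is not the CsvQuery source.
final class ImmutableQuerySketch {
    private final List<List<String>> rows;         // source rows, never modified
    private final Predicate<List<String>> filter;  // accumulated where() predicates
    private final long limit;                      // maximum number of rows to return

    private ImmutableQuerySketch(List<List<String>> rows, Predicate<List<String>> filter, long limit) {
        this.rows = rows;
        this.filter = filter;
        this.limit = limit;
    }

    static ImmutableQuerySketch from(List<List<String>> rows) {
        return new ImmutableQuerySketch(List.copyOf(rows), row -> true, Long.MAX_VALUE);
    }

    // Intermediate operation: returns a new instance and leaves this one untouched.
    ImmutableQuerySketch where(Predicate<List<String>> predicate) {
        return new ImmutableQuerySketch(rows, filter.and(predicate), limit);
    }

    // Intermediate operation: only records the limit, no rows are read yet.
    ImmutableQuerySketch limit(int maxRows) {
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must not be negative");
        }
        return new ImmutableQuerySketch(rows, filter, maxRows);
    }

    // Terminal operation: the pipeline runs only when this is called.
    List<List<String>> execute() {
        return rows.stream().filter(filter).limit(limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("alice", "30"), List.of("bob", "22"), List.of("carol", "41"));
        ImmutableQuerySketch base = from(rows);
        ImmutableQuerySketch older = base.where(r -> Integer.parseInt(r.get(1)) > 25);
        System.out.println(base.execute().size());   // 3: base is unaffected by where()
        System.out.println(older.execute().size());  // 2
    }
}
```

Because every field is final and each intermediate call builds a fresh object, instances of such a class can be shared between threads without synchronization, which is the reasoning behind the thread-safety claim in this kind of design.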
Features:
- Immutable query chain
- Lazy execution on terminal operations
- Column projection (select)
- Row filtering (where)
- Sorting with custom comparators
- Pagination (limit/offset)
- Deduplication (distinct)
- Grouping and counting
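Usage Examples:
// Keep rows whose first field is non-empty, project two columns,
// sort by age (natural string ordering, ascending), and take the first 10 rows.
CsvDocument result = CsvQuery.from(doc)
    .where(row -> !"".equals(row.get(0)))
    .select("name", "age")
    .orderBy("age", true)
    .limit(10)
    .execute();

// Count rows whose second field parses to an integer greater than 25.
long count = CsvQuery.from(doc)
    .where(row -> Integer.parseInt(row.get(1)) > 25)
    .count();

// Group rows by the value of the "role" column.
Map<String, CsvDocument> groups = CsvQuery.from(doc)
    .groupBy("role");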
Security:
- Thread-safe: Yes (immutable)
- Null-safe: Null parameters throw NullPointerException
- Since:
- JDK 25, opencode-base-csv V1.0.3
- Author:
- Leon Soo www.LeonSoo.com
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| | column(name) | Extracts the values of a single column from matching rows |
| long | count() | Counts the number of matching rows |
| | countBy(column) | Counts matching rows grouped by a column value |
| CsvQuery | distinct() | Removes duplicate rows (comparing all field values) |
| CsvQuery | distinct(columns) | Removes duplicate rows based on specific columns |
| CsvDocument | execute() | Executes the query and returns the result as a CsvDocument |
| static CsvQuery | from(CsvDocument doc) | Creates a query from a CsvDocument |
| Map<String, CsvDocument> | groupBy(column) | Groups matching rows by a column value |
| CsvQuery | limit(int maxRows) | Limits the number of result rows |
| CsvQuery | offset(int skipRows) | Skips the first N rows of the result |
| CsvQuery | orderBy(column, ascending) | Sorts rows by a column using natural string ordering |
| CsvQuery | orderBy(String column, Comparator<String> comparator) | Sorts rows by a column using a custom comparator |
| CsvQuery | select(columns) | Selects specific columns by header name |
| CsvQuery | where(predicate) | Filters rows by a predicate |
-
Method Details
-
from
Creates a query from a CsvDocument.
- Parameters:
doc - the source document
- Returns:
- a new CsvQuery instance
- Throws:
NullPointerException - if doc is null
-
select
Selects specific columns by header name. Only the specified columns are included in the result document. Column names are validated when the query executes, not when select is called.
- Parameters:
columns - the column names to select
- Returns:
- a new CsvQuery with the select applied
- Throws:
NullPointerException - if columns is null
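Projection by header name with validation deferred to execution time might look like the following sketch. The helper `project` and the class `SelectSketch` are hypothetical, the output column order is assumed to follow the order of the requested names, and `IllegalStateException` stands in for the library's OpenCsvException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the CsvQuery API.
final class SelectSketch {
    // Projects every row onto the named columns; unknown names fail here, at execution time.
    static List<List<String>> project(List<String> headers, List<List<String>> rows, String... columns) {
        int[] indexes = new int[columns.length];
        for (int i = 0; i < columns.length; i++) {
            indexes[i] = headers.indexOf(columns[i]);
            if (indexes[i] < 0) {
                // Stand-in for the exception type the real library documents.
                throw new IllegalStateException("Unknown column: " + columns[i]);
            }
        }
        List<List<String>> result = new ArrayList<>();
        for (List<String> row : rows) {
            List<String> projected = new ArrayList<>();
            for (int index : indexes) {
                projected.add(row.get(index));
            }
            result.add(projected);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> headers = List.of("name", "age", "role");
        List<List<String>> rows = List.of(List.of("alice", "30", "admin"));
        System.out.println(project(headers, rows, "role", "name")); // [[admin, alice]]
    }
}
```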
-
where
Filters rows by a predicate. Multiple where calls are combined with logical AND.
- Parameters:
predicate - the filter predicate
- Returns:
- a new CsvQuery with the filter added
- Throws:
NullPointerException - if predicate is null
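Combining several predicates with logical AND can be sketched with the standard `Predicate.and` method. The helper `allOf` and the class `WhereSketch` are hypothetical, and rows are modeled as `List<String>`; how CsvQuery stores its predicates internally is not stated in this page.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the CsvQuery API.
final class WhereSketch {
    // Folds a list of predicates into one that passes only if all of them pass.
    static <T> Predicate<T> allOf(List<Predicate<T>> predicates) {
        Predicate<T> combined = value -> true;
        for (Predicate<T> predicate : predicates) {
            combined = combined.and(predicate);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("alice", "30"), List.of("", "40"), List.of("bob", "22"));
        Predicate<List<String>> hasName = row -> !row.get(0).isEmpty();
        Predicate<List<String>> olderThan25 = row -> Integer.parseInt(row.get(1)) > 25;
        List<List<String>> matches = rows.stream()
                .filter(allOf(List.of(hasName, olderThan25)))
                .collect(Collectors.toList());
        System.out.println(matches); // [[alice, 30]]
    }
}
```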
-
orderBy
Sorts rows by a column using natural string ordering. Null or missing column values are sorted last when ascending and first when descending.
- Parameters:
column - the column name to sort by
ascending - true for ascending, false for descending
- Returns:
- a new CsvQuery with sorting applied
- Throws:
NullPointerException - if column is null
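The null placement described above matches what the standard `Comparator.nullsLast` gives when it is reversed for descending order. The sketch below is an assumption about how such ordering can be produced, not the library's code; `OrderBySketch` is a hypothetical name. It also shows that natural string ordering compares numeric text character by character.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of the CsvQuery API.
final class OrderBySketch {
    // Sorts a copy of the values: nulls last when ascending, first when descending.
    static List<String> sort(List<String> values, boolean ascending) {
        Comparator<String> comparator = Comparator.nullsLast(Comparator.<String>naturalOrder());
        if (!ascending) {
            comparator = comparator.reversed(); // reversing also moves nulls to the front
        }
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(comparator);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> ages = Arrays.asList("9", null, "10", "25");
        System.out.println(sort(ages, true));  // [10, 25, 9, null]
        System.out.println(sort(ages, false)); // [null, 9, 25, 10]
    }
}
```

Note that "10" sorts before "9" here because strings are compared lexicographically; numeric columns need a custom comparator.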
-
orderBy
Sorts rows by a column using a custom comparator.
- Parameters:
column - the column name to sort by
comparator - the comparator for column values
- Returns:
- a new CsvQuery with sorting applied
- Throws:
NullPointerException - if column or comparator is null
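A typical reason to pass a custom comparator is numeric sorting of a text column. The sketch below models rows as `List<String>` and addresses the column by index, both of which are simplifications; `NumericOrderSketch` and `sortByColumn` are hypothetical names, and the comparator shown does not handle null or non-numeric values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of the CsvQuery API.
final class NumericOrderSketch {
    // Sorts a copy of the rows by one column, delegating value comparison to the caller.
    static List<List<String>> sortByColumn(List<List<String>> rows, int columnIndex,
                                           Comparator<String> comparator) {
        List<List<String>> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((List<String> row) -> row.get(columnIndex), comparator));
        return sorted;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("a", "9"), List.of("b", "10"), List.of("c", "25"));
        // Compares column values as integers instead of as strings.
        Comparator<String> numeric = Comparator.comparingInt((String s) -> Integer.parseInt(s));
        System.out.println(sortByColumn(rows, 1, Comparator.<String>naturalOrder())); // [[b, 10], [c, 25], [a, 9]]
        System.out.println(sortByColumn(rows, 1, numeric));                           // [[a, 9], [b, 10], [c, 25]]
    }
}
```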
-
limit
Limits the number of result rows.
- Parameters:
maxRows - the maximum number of rows
- Returns:
- a new CsvQuery with the limit applied
- Throws:
IllegalArgumentException - if maxRows is negative
-
offset
Skips the first N rows of the result.
- Parameters:
skipRows - the number of rows to skip
- Returns:
- a new CsvQuery with the offset applied
- Throws:
IllegalArgumentException - if skipRows is negative
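Pagination with limit and offset can be sketched with the standard `Stream.skip` and `Stream.limit` calls. This page does not state the order in which CsvQuery applies the two, so the sketch assumes the SQL convention of skipping first and limiting second; `PageSketch` and `page` are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the CsvQuery API.
final class PageSketch {
    // Returns at most maxRows rows after skipping the first skipRows rows.
    static <T> List<T> page(List<T> rows, int skipRows, int maxRows) {
        if (skipRows < 0) {
            throw new IllegalArgumentException("skipRows must not be negative");
        }
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must not be negative");
        }
        return rows.stream().skip(skipRows).limit(maxRows).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rows = List.of("r1", "r2", "r3", "r4", "r5");
        System.out.println(page(rows, 2, 2)); // [r3, r4]
        System.out.println(page(rows, 4, 2)); // [r5]
    }
}
```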
-
distinct
Removes duplicate rows (comparing all field values). Rows are considered duplicates if all their field values are equal, regardless of row number.
- Returns:
- a new CsvQuery with distinct applied
-
distinct
Removes duplicate rows based on specific columns. Rows are considered duplicates if the values of the specified columns are all equal. The first occurrence in each group of duplicates is kept.
- Parameters:
columns - the column names to check for duplicates
- Returns:
- a new CsvQuery with distinct applied
- Throws:
NullPointerException - if columns is null
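Keeping the first occurrence per key can be sketched with a set of already seen key tuples. The code below is an assumption about one way to get that behavior, with columns addressed by index and rows modeled as `List<String>`; `DistinctSketch` and `distinctBy` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the CsvQuery API.
final class DistinctSketch {
    // Keeps the first row seen for each combination of values in the key columns.
    static List<List<String>> distinctBy(List<List<String>> rows, int... keyIndexes) {
        Set<List<String>> seen = new HashSet<>();
        List<List<String>> result = new ArrayList<>();
        for (List<String> row : rows) {
            List<String> key = new ArrayList<>();
            for (int index : keyIndexes) {
                key.add(row.get(index));
            }
            if (seen.add(key)) { // add() returns false when the key was already present
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("alice", "admin"), List.of("bob", "user"), List.of("carol", "admin"));
        System.out.println(distinctBy(rows, 1)); // [[alice, admin], [bob, user]]
    }
}
```

Passing every column index as a key gives the all-fields variant of distinct.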
-
execute
Executes the query and returns the result as a CsvDocument.
- Returns:
- the result document
- Throws:
OpenCsvException - if a referenced column does not exist
-
count
public long count()
Counts the number of matching rows. Equivalent to execute().rowCount().
- Returns:
- the count of matching rows
- Throws:
OpenCsvException - if a referenced column does not exist
-
column
Extracts the values of a single column from matching rows.
- Parameters:
name - the column name
- Returns:
- list of column values
- Throws:
NullPointerException - if name is null
OpenCsvException - if the column does not exist
-
groupBy
Groups matching rows by a column value. Returns a LinkedHashMap that preserves insertion order (the order in which each value first occurs). Each group is a CsvDocument with the same headers as the source.
- Parameters:
column - the column name to group by
- Returns:
- map of column value to grouped document
- Throws:
NullPointerException - if column is null
OpenCsvException - if the column does not exist
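The first-occurrence ordering described above follows directly from filling a `LinkedHashMap` in row order. The sketch below shows that idea with rows modeled as `List<String>` and each group as a list of rows instead of a CsvDocument; `GroupBySketch` and its `groupBy` helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the CsvQuery API.
final class GroupBySketch {
    // Groups rows by one column; keys appear in the order they are first seen.
    static Map<String, List<List<String>>> groupBy(List<List<String>> rows, int columnIndex) {
        Map<String, List<List<String>>> groups = new LinkedHashMap<>();
        for (List<String> row : rows) {
            groups.computeIfAbsent(row.get(columnIndex), key -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("alice", "user"), List.of("bob", "admin"), List.of("carol", "user"));
        System.out.println(groupBy(rows, 1).keySet());           // [user, admin]
        System.out.println(groupBy(rows, 1).get("user").size()); // 2
    }
}
```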
-
countBy
Counts matching rows grouped by a column value.
- Parameters:
column - the column name to count by
- Returns:
- map of column value to count
- Throws:
NullPointerException - if column is null
OpenCsvException - if the column does not exist
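Counting per column value can be sketched with `Map.merge`. The return type `Map<String, Long>` and the first-occurrence key order are assumptions made for this example, since this page only says the result maps column value to count; `CountBySketch` is a hypothetical name and rows are modeled as `List<String>`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the CsvQuery API.
final class CountBySketch {
    // Counts rows per distinct value of one column.
    static Map<String, Long> countBy(List<List<String>> rows, int columnIndex) {
        Map<String, Long> counts = new LinkedHashMap<>(); // ordering is an assumption here
        for (List<String> row : rows) {
            counts.merge(row.get(columnIndex), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("alice", "user"), List.of("bob", "admin"), List.of("carol", "user"));
        System.out.println(countBy(rows, 1)); // {user=2, admin=1}
    }
}
```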
-