Class Summary

| Class | Description |
|---|---|
| AggregatedCountsCreator | Sequentially scans all Web1T-format files to compute the aggregated counts. |
| FileMap | Represents an index showing which ngrams (indexed by their first two characters) are found in which files on disk. |
| FileSearch | Implements the actual (binary) search for an ngram in a Web1T-format file. |
| FrequencyDistribution<T> | Represents a frequency distribution as an efficient map (inspired by nltk.probability.FreqDist). |
| IndexCreator | Creates an index showing which ngrams (indexed by their first two characters) are found in which files on disk. |
| JWeb1TAggregator | Creates some aggregated counts that are not directly available in the data. |
| JWeb1TIndexer | Provides a method to create the indexes needed to access the Web1T corpus. |
| JWeb1TSearcher | Search-on-disk based implementation of the Searcher interface for accessing data in Web1T format. |
| JWeb1TSearcherInMemory | Memory-based implementation of the Searcher interface for accessing data in Web1T format. |
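
The two ideas behind FileMap/IndexCreator and FileSearch (an index keyed by the first two characters of an ngram, and a binary search over sorted ngram lines) can be illustrated with a minimal sketch. This is not the library's implementation: the class `Web1TSketch` and its methods `indexKey` and `lookup` are hypothetical names chosen for illustration, the search runs over an in-memory list rather than a file on disk, and it assumes each line has the form `ngram<TAB>count` and that lines are sorted in `String.compareTo` order.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the documented package.
class Web1TSketch {

    /**
     * Index key for an ngram: its first two characters
     * (or the whole ngram if it is shorter than that).
     */
    static String indexKey(String ngram) {
        return ngram.length() <= 2 ? ngram : ngram.substring(0, 2);
    }

    /**
     * Binary search for an ngram in lines of the assumed form "ngram<TAB>count".
     * The lines must be sorted by ngram in String.compareTo order.
     * Returns the count, or -1 if the ngram is not present.
     */
    static long lookup(List<String> sortedLines, String ngram) {
        int lo = 0;
        int hi = sortedLines.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;            // overflow-safe midpoint
            String line = sortedLines.get(mid);
            int tab = line.lastIndexOf('\t');     // count follows the last tab
            String key = line.substring(0, tab);
            int cmp = key.compareTo(ngram);
            if (cmp == 0) {
                return Long.parseLong(line.substring(tab + 1).trim());
            } else if (cmp < 0) {
                lo = mid + 1;                     // target sorts after this line
            } else {
                hi = mid - 1;                     // target sorts before this line
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up sample data, already sorted by ngram.
        List<String> lines = Arrays.asList("the cat\t10", "the dog\t20", "the end\t5");
        System.out.println(indexKey("the dog"));
        System.out.println(lookup(lines, "the dog"));
        System.out.println(lookup(lines, "the fox"));
    }
}
```

In this sketch the index key only narrows down which file to search; the binary search then needs a number of comparisons logarithmic in the number of lines, which is why it depends on the input being sorted.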