Class ParallelMapper<I,O>
Type Parameters:
I - the class of the input objects.
O - the class of the output objects.
ParallelMapper is a mapper for processing multiple items where the processing of each input item takes a long
time but uses few resources (e.g. mapping a collection of URLs to the content they reference). For other cases,
Java's parallel streams are a better solution.
ParallelMapper takes as input a Collection of inputs of class I and a mapping function, a
Function<I, O> that maps an object of class I to an object of class O. When map() is called, it applies the
mapping function to every item of the Collection. The successful mappings are available through the method getSuccesses() and the failures through the method getExceptions().
The class also provides a getCompleted() method for the mappings that completed, whether
successfully or unsuccessfully. For more details about completion, see Future.isDone().
Example:
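List<URI> inputs = someUris();
// dereference(uri) stands for any slow but lightweight operation, e.g. fetching the content behind a URI
ParallelMapper<URI, String> mapper = new ParallelMapper<>(inputs, uri -> dereference(uri));
mapper.map(); // applies the function to every input; declared to throw InterruptedException
List<String> results = mapper.getSuccesses();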
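The split between successes and failures described above can be pictured with plain java.util.concurrent classes. The following is a minimal sketch under stated assumptions, not the ParallelMapper implementation: the class OutcomeSplitSketch, its Outcome holder, and the mapAll method are hypothetical names introduced only for illustration. It shows how a Future that is done either yields a value or wraps the exception thrown by the mapping function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class OutcomeSplitSketch {

    /** Hypothetical holder for the two result lists; not part of ParallelMapper. */
    public static final class Outcome<O> {
        public final List<O> successes = new ArrayList<>();
        public final List<Throwable> exceptions = new ArrayList<>();
    }

    /** Applies fn to every input on its own thread and separates results from failures. */
    public static <I, O> Outcome<O> mapAll(List<I> inputs, Function<I, O> fn)
            throws InterruptedException {
        Outcome<O> outcome = new Outcome<>();
        if (inputs.isEmpty()) {
            return outcome; // a fixed thread pool cannot be created with zero threads
        }
        ExecutorService pool = Executors.newFixedThreadPool(inputs.size());
        try {
            List<Callable<O>> tasks = new ArrayList<>();
            for (I input : inputs) {
                tasks.add(() -> fn.apply(input));
            }
            // invokeAll returns once every task is done (Future.isDone() is true),
            // whether the task returned normally or threw.
            for (Future<O> future : pool.invokeAll(tasks)) {
                try {
                    outcome.successes.add(future.get());
                } catch (ExecutionException e) {
                    // The exception thrown by fn is wrapped; unwrap it for the caller.
                    outcome.exceptions.add(e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
        return outcome;
    }

    public static void main(String[] args) throws InterruptedException {
        Outcome<Integer> outcome = mapAll(List.of("1", "2", "x"), Integer::parseInt);
        System.out.println(outcome.successes);         // prints [1, 2]
        System.out.println(outcome.exceptions.size()); // prints 1
    }
}
```

In this sketch a failed item never aborts the others: every Future is inspected, so one unparsable input leaves the remaining results intact.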
ParallelMapper also accepts another parameter: batchSize. This indicates the size of each batch that is
going to be processed in parallel. The batches are executed sequentially, one after the other, and the
items inside each batch are processed in parallel.
For example, if the input has 10,000 elements and the batch size is 1,000, all 1,000 elements of the first batch
are processed in parallel, then the second batch is executed, and so on. ParallelMapper
creates one execution thread per item in a batch and is therefore not recommended when the
mapping function is trivial, since the cost of creating the threads would then outweigh the work itself.
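The batching behaviour described above can be sketched with standard java.util.concurrent classes. This is an illustration under stated assumptions, not the ParallelMapper source: the class BatchedMapSketch and the method mapInBatches are hypothetical names, and unlike ParallelMapper this sketch simply propagates the first failure as an ExecutionException instead of collecting failures separately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class BatchedMapSketch {

    /** Maps inputs batch by batch; items within one batch run in parallel. */
    public static <I, O> List<O> mapInBatches(List<I> inputs, Function<I, O> fn, int batchSize)
            throws InterruptedException, ExecutionException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<O> results = new ArrayList<>();
        for (int start = 0; start < inputs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, inputs.size());
            List<I> batch = inputs.subList(start, end);
            // One thread per item in the batch, as the description states.
            ExecutorService pool = Executors.newFixedThreadPool(batch.size());
            try {
                List<Callable<O>> tasks = new ArrayList<>();
                for (I item : batch) {
                    tasks.add(() -> fn.apply(item));
                }
                // invokeAll returns only when the whole batch is done,
                // so the next batch starts strictly after this one.
                for (Future<O> future : pool.invokeAll(tasks)) {
                    results.add(future.get());
                }
            } finally {
                pool.shutdown();
            }
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        // Five inputs with batchSize 2 run as three sequential batches: 2 + 2 + 1 items.
        List<Integer> doubled = mapInBatches(List.of(1, 2, 3, 4, 5), x -> x * 2, 2);
        System.out.println(doubled); // prints [2, 4, 6, 8, 10]
    }
}
```

The sketch also makes the cost model visible: each batch allocates as many threads as it has items, which pays off only when each call to the mapping function spends most of its time waiting.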
Field Summary
Constructor Summary
ParallelMapper(Collection<I> inputs, Function<I, O> function)
ParallelMapper(Collection<I> inputs, Function<I, O> mappingFunction, int batchSize)
Method Summary
Field Details
DEFAULT_BATCH_SIZE
public static final int DEFAULT_BATCH_SIZE
Constructor Details
ParallelMapper
ParallelMapper
ParallelMapper
ParallelMapper
Method Details
map
Throws: InterruptedException
getSuccesses
getExceptions