Class ParallelMapper<I,O>
- Type Parameters:
I - the class of the input objects.
O - the class of the output objects.
ParallelMapper is a mapper for processing multiple items where the processing of each
input item takes a long time but uses few resources (e.g. mapping a collection of URLs to the content
that they reference). For other cases, Java's parallel streams are a better solution.
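For the "other cases" mentioned above, such as cheap CPU-bound mappings, a parallel stream might look like the following sketch. The class name ParallelStreamSketch and the squaring function are illustrative placeholders and are not part of ParallelMapper.

```java
import java.util.List;
import java.util.stream.Collectors;

class ParallelStreamSketch {

    // Hypothetical cheap, CPU-bound mapping; it stands in for any trivial function.
    static long square(int n) {
        return (long) n * n;
    }

    // Maps every input using a parallel stream; collecting into a list
    // keeps the encounter order of the source list.
    static List<Long> squareAll(List<Integer> inputs) {
        return inputs.parallelStream()
                .map(ParallelStreamSketch::square)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(squareAll(List.of(1, 2, 3, 4)));
    }
}
```

Parallel streams share a pool sized to the number of processors, which suits short computations but not tasks that spend most of their time waiting on I/O.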
ParallelMapper takes as input a Collection of inputs of class I and a mapping
Function that maps an object of class I to an object of class O. When map()
is called, it applies the mapping function to all the items of the Collection. The successful mappings
are available through the method getSuccesses() and the failures through
the method getExceptions().
The class also provides a getCompleted() method for the mappings that
completed, either successfully or unsuccessfully. For more details, see Future.isDone().
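The meaning of "completed" can be illustrated with the standard java.util.concurrent API alone: Future.isDone() returns true whether a task terminated normally or threw an exception. The sketch below does not use ParallelMapper; the class and method names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class IsDoneSketch {

    // Runs the task on a worker thread, waits for it to terminate,
    // and reports what Future.isDone() returns afterwards.
    static boolean doneAfterCompletion(Callable<String> task) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = executor.submit(task);
            try {
                future.get(); // blocks until the task terminates
            } catch (ExecutionException e) {
                // The task threw an exception; the future still counts as done.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
            }
            return future.isDone();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // A task that succeeds and a task that fails.
        System.out.println(doneAfterCompletion(() -> "ok"));
        System.out.println(doneAfterCompletion(() -> {
            throw new IllegalStateException("failed");
        }));
    }
}
```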
Example:
List<URI> inputs = someUris();
ParallelMapper<URI, String> mapper = new ParallelMapper<>(inputs, uri -> dereference(uri));
mapper.map();
List<String> results = mapper.getSuccesses();
ParallelMapper also accepts another parameter: batchSize. This indicates the size of
each batch that is going to be processed in parallel. The batches are executed sequentially, one
after the other, and the items inside each batch are processed in parallel.
For example, if the input has 10,000 elements and the batch size is 1,000, all 1,000 elements
of the first batch are processed in parallel, then the second batch is
executed, and so on. ParallelMapper creates one execution thread per item in a batch, and
it is therefore not recommended when the mapping function is trivial.
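The batching behaviour described above could be sketched roughly as follows. This is not the actual ParallelMapper implementation; BatchSketch and mapInBatches are hypothetical names, and the sketch assumes that failed mappings are simply skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class BatchSketch {

    // Processes the inputs batch by batch; inside a batch every item gets its own thread.
    static <I, O> List<O> mapInBatches(List<I> inputs, Function<I, O> function, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<O> results = new ArrayList<>();
        for (int start = 0; start < inputs.size(); start += batchSize) {
            List<I> batch = inputs.subList(start, Math.min(start + batchSize, inputs.size()));
            // One thread per item in the batch (assumption taken from the description above).
            ExecutorService executor = Executors.newFixedThreadPool(batch.size());
            try {
                List<Future<O>> futures = new ArrayList<>();
                for (I item : batch) {
                    Callable<O> task = () -> function.apply(item);
                    futures.add(executor.submit(task));
                }
                // Waiting on every future finishes this batch before the next one starts.
                for (Future<O> future : futures) {
                    try {
                        results.add(future.get());
                    } catch (ExecutionException e) {
                        // Failed mapping: skipped in this sketch.
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return results;
                    }
                }
            } finally {
                executor.shutdown();
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Five items with a batch size of two give batches of 2, 2 and 1 items.
        System.out.println(mapInBatches(List.of(1, 2, 3, 4, 5), n -> n * 10, 2));
    }
}
```

Creating a thread per item is what makes a large batch costly, which is why a trivial mapping function gains little from this approach.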
-
Field Summary
Fields -
Constructor Summary
Constructors
ParallelMapper(Collection<I> inputs, Function<I, O> function)
ParallelMapper(Collection<I> inputs, Function<I, O> mappingFunction, int batchSize)
-
Method Summary
-
Field Details
-
DEFAULT_BATCH_SIZE
public static final int DEFAULT_BATCH_SIZE
-
-
Constructor Details
-
ParallelMapper
-
ParallelMapper
-
ParallelMapper
-
ParallelMapper
-
-
Method Details
-
map
- Throws:
InterruptedException
-
getSuccesses
-
getExceptions
-