Class ParallelMapper<I,O>

java.lang.Object
nva.commons.core.parallel.ParallelMapper<I,O>
Type Parameters:
I - the class of the input objects.
O - the class of the output objects.

public class ParallelMapper<I,O> extends Object
ParallelMapper is a mapper for processing multiple items where processing each input item takes a long time but uses few resources (e.g. mapping a collection of URLs to the content they reference). For other cases, Java's parallel streams are a better solution.
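
For comparison, a minimal sketch of the parallel-stream alternative for cheap, CPU-bound mappings. The class and method names here are hypothetical and chosen only for illustration; only the standard java.util.stream API is used.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of the parallel-stream alternative.
class ParallelStreamSketch {

    // Squares every number using the common fork-join pool.
    // For an ordered source such as a List, the collected result
    // keeps the encounter order of the input.
    static List<Integer> squares(List<Integer> numbers) {
        return numbers.parallelStream()
                .map(n -> n * n)
                .collect(Collectors.toList());
    }
}
```

Parallel streams share a pool sized to the available processors, which suits short computations but not tasks that spend most of their time waiting on I/O.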

ParallelMapper takes as input a Collection of inputs of class I and a mapping Function that maps an object of class I to an object of class O. When map() is called, it applies the mapping function to every item in the Collection. The successful mappings are available through the method getSuccesses() and the failures through the method getExceptions().

The class also provides a getCompleted() method for the mappings that completed, whether successfully or unsuccessfully. For more details, see Future.isDone().
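
The following sketch shows what "completed" means for a standard java.util.concurrent.Future: isDone() returns true both when the task returned a value and when it threw. It uses FutureTask directly and does not reflect how ParallelMapper is implemented; the class and method names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical illustration of Future.isDone() semantics.
class IsDoneSketch {

    // Runs the work on the calling thread and reports how it ended.
    static String describe(Callable<String> work) {
        FutureTask<String> task = new FutureTask<>(work);
        task.run(); // executes synchronously; exceptions are captured in the task
        String status = "done=" + task.isDone();
        try {
            return status + ", value=" + task.get();
        } catch (ExecutionException e) {
            // get() wraps the exception thrown by the task
            return status + ", failed=" + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return status + ", interrupted";
        }
    }
}
```

Calling describe(() -> "ok") yields "done=true, value=ok", while a task that throws IllegalStateException("boom") yields "done=true, failed=boom".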

Example:

      List<URI> inputs = someUris();
      ParallelMapper<URI, String> mapper = new ParallelMapper<>(inputs, uri -> dereference(uri));
      mapper.map();
      List<String> results = mapper.getSuccesses();
 

ParallelMapper also accepts another parameter: batchSize. It sets the size of each batch that is processed in parallel. Batches are executed sequentially, one after the other, and the items within a batch are processed in parallel.

For example, if the input has 10,000 elements and the batch size is 1,000, all 1,000 elements of the first batch are processed in parallel, then the second batch is executed, and so on. ParallelMapper creates one execution thread per item in a batch and is therefore not recommended when the mapping function is trivial.
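
The batching behaviour described above can be sketched with the standard java.util.concurrent API. This is an illustration under stated assumptions, not the ParallelMapper source: the class BatchSketch and the method mapInBatches are hypothetical, and failed mappings are simply skipped here rather than recorded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration; not the ParallelMapper implementation.
class BatchSketch {

    static <I, O> List<O> mapInBatches(List<I> inputs, Function<I, O> function, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<O> successes = new ArrayList<>();
        // Batches run one after the other.
        for (int start = 0; start < inputs.size(); start += batchSize) {
            List<I> batch = inputs.subList(start, Math.min(start + batchSize, inputs.size()));
            // One thread per item in the batch, so items run in parallel.
            ExecutorService executor = Executors.newFixedThreadPool(batch.size());
            try {
                List<Callable<O>> tasks = new ArrayList<>();
                for (I item : batch) {
                    tasks.add(() -> function.apply(item));
                }
                // invokeAll blocks until every task in the batch has completed.
                for (Future<O> future : executor.invokeAll(tasks)) {
                    try {
                        successes.add(future.get());
                    } catch (ExecutionException e) {
                        // Failed mapping: skipped in this sketch.
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while mapping", e);
            } finally {
                executor.shutdown();
            }
        }
        return successes;
    }
}
```

Because each batch allocates as many threads as it has items, the thread setup cost dominates when the mapping function is trivial, which is why a large batch size only pays off for slow, I/O-bound mappings.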