Interface StateStore

All Known Implementing Classes:
AbstractStateStore, LocalStateStore, MemoryStateStore, RawStateStore

public interface StateStore
The StateStore keeps track of the extraction/processing state of a data application (extractor, data pipeline, contextualization pipeline, etc.). It is designed to store watermarks that enable incremental load patterns. At the beginning of a run, the data application typically calls load(), which reads the states from the persistent store (either a local JSON file or a table in CDF RAW). During and/or at the end of the run, it calls commit(), which saves the current states back to the persistent store.
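The load/commit cycle described above can be sketched roughly as follows. This is a minimal illustration and not the library's implementation: SketchStore and IncrementalRun are hypothetical names, only high watermarks are modelled, and a plain Map stands in for the JSON file or CDF RAW table.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical stand-in for a StateStore implementation (high watermarks only). */
class SketchStore {
    // Stands in for the persistent store (local JSON file or CDF RAW table).
    private final Map<String, Long> persisted;
    // In-memory working copy of the high watermarks.
    private final Map<String, Long> high = new HashMap<>();

    SketchStore(Map<String, Long> persisted) { this.persisted = persisted; }

    void load() { high.clear(); high.putAll(persisted); }
    void commit() { persisted.clear(); persisted.putAll(high); }
    void setHigh(String key, long value) { high.put(key, value); }

    OptionalLong getHigh(String key) {
        Long v = high.get(key);
        return v == null ? OptionalLong.empty() : OptionalLong.of(v);
    }
}

class IncrementalRun {
    /** Processes only timestamps newer than the stored high watermark; returns the count processed. */
    static int run(SketchStore store, String key, List<Long> timestamps) {
        store.load();                                        // beginning of the run
        long watermark = store.getHigh(key).orElse(Long.MIN_VALUE);
        long newest = watermark;
        int processed = 0;
        for (long ts : timestamps) {
            if (ts > watermark) {                            // not covered by an earlier run
                processed++;                                 // real processing would happen here
                newest = Math.max(newest, ts);
            }
        }
        if (processed > 0) {
            store.setHigh(key, newest);
        }
        store.commit();                                      // end of the run
        return processed;
    }
}
```

Running IncrementalRun.run twice against the same backing map processes each timestamp only once, which is the incremental load pattern the watermarks are meant to support.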
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    commit()
    Commit the current states to the persistent store.
    void
    deleteState(String key)
    Delete the state(s) for a given id.
    void
    expandHigh(String key, long value)
    Like setHigh(String, long), but only sets the state if the proposed state is higher than the current state.
    void
    expandLow(String key, long value)
    Like setLow(String, long), but only sets the state if the proposed state is lower than the current state.
    OptionalLong
    getHigh(String key)
    Get the high watermark state for a single id.
    OptionalLong
    getLow(String key)
    Get the low watermark state for a single id.
    Optional<com.google.protobuf.Struct>
    getState(String key)
    Get the set of states for a single id.
    boolean
    isOutsideState(String key, long value)
    Check if a state is outside the stored state interval.
    Set<String>
    keySet()
    Returns a read-only set of all state keys.
    void
    load()
    Load the states from the persistent store.
    void
    setHigh(String key, long value)
    Set/update the high watermark for a single id.
    void
    setLow(String key, long value)
    Set/update the low watermark for a single id.
    boolean
    start()
    Start a background thread to perform a commit every maxUploadInterval.
    boolean
    stop()
    Stops the background thread if it is running and persists any remaining state changes by calling commit() one last time after the thread has shut down.
  • Method Details

    • load

      void load() throws Exception
      Load the states from the persistent store.
      Throws:
      Exception
    • commit

      void commit() throws Exception
      Commit the current states to the persistent store. This overwrites any previously persisted states.
      Throws:
      Exception
    • setHigh

      void setHigh(String key, long value)
      Set/update the high watermark for a single id.
      Parameters:
      key - The id to store the state of.
      value - The value of the high watermark.
    • expandHigh

      void expandHigh(String key, long value)
      Like setHigh(String, long), but only sets the state if the proposed state is higher than the current state.
      Parameters:
      key - The id to store the state of.
      value - The value of the high watermark.
    • getHigh

      OptionalLong getHigh(String key)
      Get the high watermark state for a single id.
      Parameters:
      key - The id to get the state of.
      Returns:
      The value of the high watermark, or an empty OptionalLong if the state does not exist.
    • setLow

      void setLow(String key, long value)
      Set/update the low watermark for a single id.
      Parameters:
      key - The id to store the state of.
      value - The value of the low watermark.
    • expandLow

      void expandLow(String key, long value)
      Like setLow(String, long), but only sets the state if the proposed state is lower than the current state.
      Parameters:
      key - The id to store the state of.
      value - The value of the low watermark.
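The difference between the set and expand variants can be illustrated with a small sketch. WatermarkSketch is a hypothetical class, not part of the library. It also assumes that an expand call on a key with no stored watermark simply stores the proposed value, which this documentation does not state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of set vs. expand semantics; not the library's implementation. */
class WatermarkSketch {
    private final Map<String, Long> high = new HashMap<>();
    private final Map<String, Long> low = new HashMap<>();

    // set*: unconditionally overwrite the stored watermark.
    void setHigh(String key, long value) { high.put(key, value); }
    void setLow(String key, long value) { low.put(key, value); }

    // expand*: only move the watermark outwards.
    // Assumption: if no watermark is stored yet, the proposed value is stored as-is.
    void expandHigh(String key, long value) { high.merge(key, value, Long::max); }
    void expandLow(String key, long value) { low.merge(key, value, Long::min); }

    // Accessors for inspection; return null when no watermark is stored.
    Long peekHigh(String key) { return high.get(key); }
    Long peekLow(String key) { return low.get(key); }
}
```

With this sketch, expandHigh("a", 100) followed by expandHigh("a", 50) leaves the high watermark at 100, whereas setHigh("a", 50) would lower it to 50.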
    • getLow

      OptionalLong getLow(String key)
      Get the low watermark state for a single id.
      Parameters:
      key - The id to get the state of.
      Returns:
      The value of the low watermark, or an empty OptionalLong if the state does not exist.
    • isOutsideState

      boolean isOutsideState(String key, long value)
      Check if a state is outside the stored state interval, that is, whether it is higher than the current high watermark or lower than the current low watermark. Use this method to determine whether a data object should be processed.
      Parameters:
      key - The id to test.
      value - The state to test.
      Returns:
      true if the state is outside the stored state interval, or if the key has not been seen before.
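The interval check described above might look roughly like the following. OutsideStateSketch is a hypothetical helper that takes the two watermarks directly instead of looking them up by key. Treating a key with only one watermark set as unbounded on the other side is an assumption, since this documentation does not specify that case.

```java
import java.util.OptionalLong;

/** Hypothetical illustration of the interval check; not the library's implementation. */
class OutsideStateSketch {
    static boolean isOutside(OptionalLong low, OptionalLong high, long value) {
        // A key with no stored watermarks is treated as previously unseen.
        if (!low.isPresent() && !high.isPresent()) {
            return true;
        }
        // Above the high watermark.
        if (high.isPresent() && value > high.getAsLong()) {
            return true;
        }
        // Below the low watermark.
        if (low.isPresent() && value < low.getAsLong()) {
            return true;
        }
        // Inside the interval, boundaries included (assumption: a missing bound never excludes).
        return false;
    }
}
```

A caller would typically process a data object only when the check returns true, and then widen the interval with expandHigh or expandLow.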
    • getState

      Optional<com.google.protobuf.Struct> getState(String key)
      Get the set of states for a single id. This will give you both the low and high watermark (if set) as a Struct. The returned Struct is a read-only view of the state values.
      Parameters:
      key - The id to get the state of.
      Returns:
      All the states in a Struct.
    • deleteState

      void deleteState(String key)
      Delete the state(s) for a given id.
      Parameters:
      key - The id to delete the state(s) of.
    • keySet

      Set<String> keySet()
      Returns a read-only set of all state keys.
      Returns:
      An immutable Set of all state keys.
    • start

      boolean start()
      Start a background thread to perform a commit every maxUploadInterval. The default interval is 20 seconds. If the background thread has already been started (for example, by an earlier call to start()), this method does nothing and returns false.
      Returns:
      true if the upload thread started successfully, false if the background thread has already been started.
    • stop

      boolean stop()
      Stops the background thread if it is running and persists any remaining state changes by calling commit() one last time after the thread has shut down.
      Returns:
      true if the upload thread stopped successfully, false if the upload thread was not started in the first place.
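The start()/stop() contract can be sketched with a ScheduledExecutorService from the standard library. PeriodicCommitter is a hypothetical helper, not the library's implementation. The Runnable stands in for the store's commit(), intervalMillis stands in for maxUploadInterval, and the sketch assumes that the final step performed by stop() is a single commit.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper illustrating the start()/stop() contract; not the library's code. */
class PeriodicCommitter {
    private final Runnable commitAction;       // stands in for StateStore.commit()
    private final long intervalMillis;         // stands in for maxUploadInterval
    private ScheduledExecutorService executor; // null while no background thread is running

    PeriodicCommitter(Runnable commitAction, long intervalMillis) {
        this.commitAction = commitAction;
        this.intervalMillis = intervalMillis;
    }

    synchronized boolean start() {
        if (executor != null) {
            return false; // already started: do nothing
        }
        executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(commitAction, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        return true;
    }

    synchronized boolean stop() {
        if (executor == null) {
            return false; // was never started
        }
        executor.shutdown(); // cancels the pending periodic commits
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        executor = null;
        commitAction.run(); // one final commit after the thread has shut down
        return true;
    }
}
```

Note that scheduleAtFixedRate suppresses all further runs if the task throws, so a real commit action would usually catch and log its own exceptions.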