Record Class OcrRegion

java.lang.Object
java.lang.Record
ai.doctruth.spi.OcrRegion
Record Components:
text - the recovered text in this region.
box - pixel bounding box on the rendered page image.
confidence - per-region confidence in [0.0, 1.0].

public record OcrRegion(String text, OcrBox box, double confidence) extends Record
One OCR-recovered text region with its pixel bounding box on the rendered page image. The box's x, y, width, and height are in pixels at the rendering DPI; downstream code that needs PDF user-space coordinates scales by that same DPI.

Why pixel coordinates and not PDF user-space points? OCR engines work on raster images; reporting the pixel box back is the only honest representation of where the engine "saw" the text. Callers that need PDF user-space coordinates convert once, with the DPI they used.
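
The one-time conversion described above can be sketched as follows. This is a minimal illustration, not part of the documented API: `OcrBox` is a local stand-in whose component names are assumed, and `PdfRect`, `PixelToPdf`, and `toPdfPoints` are hypothetical names. It also assumes the raster origin is top-left, the PDF origin is bottom-left, and the page has no rotation or crop-box offset.

```java
// Hypothetical stand-in so the sketch compiles on its own; the real
// ai.doctruth.spi.OcrBox may expose different component names.
record OcrBox(int x, int y, int width, int height) {}

// Hypothetical result type: a rectangle in PDF user-space points (1/72 inch).
record PdfRect(double x, double y, double width, double height) {}

final class PixelToPdf {
    private PixelToPdf() {}

    /** Default PDF user space has 72 units per inch. */
    static final double POINTS_PER_INCH = 72.0;

    /**
     * Scales a pixel box to points using the DPI the page was rendered at.
     * Also flips the y-axis, assuming a top-left raster origin and a
     * bottom-left PDF origin (an assumption, not stated by the source).
     */
    static PdfRect toPdfPoints(OcrBox box, double dpi, double pageHeightPoints) {
        if (!(dpi > 0)) {
            throw new IllegalArgumentException("dpi must be positive: " + dpi);
        }
        double scale = POINTS_PER_INCH / dpi;
        double width = box.width() * scale;
        double height = box.height() * scale;
        double x = box.x() * scale;
        // Bottom edge of the box, measured up from the bottom of the page.
        double y = pageHeightPoints - (box.y() * scale) - height;
        return new PdfRect(x, y, width, height);
    }

    public static void main(String[] args) {
        // A box found on a US Letter page (792 pt tall) rendered at 300 DPI.
        OcrBox box = new OcrBox(300, 150, 600, 75);
        System.out.println(toPdfPoints(box, 300.0, 792.0));
    }
}
```

The DPI passed in must be the one used to render the image the OCR engine saw; a different value silently misplaces every box.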

Invariants (constructors):

  • text non-null (empty allowed — represents a region the engine couldn't transcribe).
  • box non-null and geometrically valid.
  • confidence in [0.0, 1.0].
Since:
0.1.0
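
The invariants listed above could be enforced in a compact canonical constructor along these lines. This is a sketch using local stand-in records, not the library's source. The exception types are assumptions, and so is the reading of "geometrically valid" as non-negative origin and size; the source specifies neither.

```java
import java.util.Objects;

// Hypothetical stand-in; "geometrically valid" is assumed here to mean a
// non-negative origin and non-negative size.
record OcrBox(int x, int y, int width, int height) {
    OcrBox {
        if (x < 0 || y < 0 || width < 0 || height < 0) {
            throw new IllegalArgumentException("box must have non-negative origin and size");
        }
    }
}

// Stand-in mirroring the documented record components.
record OcrRegion(String text, OcrBox box, double confidence) {
    OcrRegion {
        // Empty text is allowed; only null is rejected.
        Objects.requireNonNull(text, "text");
        Objects.requireNonNull(box, "box");
        // Written as a negated range check so that NaN is rejected as well.
        if (!(confidence >= 0.0 && confidence <= 1.0)) {
            throw new IllegalArgumentException("confidence must be in [0.0, 1.0]: " + confidence);
        }
    }
}
```

Because the checks sit in the canonical constructor, any other constructor that delegates to it is validated the same way.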
  • Constructor Summary

    Constructors
    Constructor
    Description
    OcrRegion(String text, int x, int y, int width, int height, double confidence)
     
    OcrRegion(String text, OcrBox box, double confidence)
    Creates an instance of an OcrRegion record class.
  • Method Summary

    Modifier and Type
    Method
    Description
    OcrBox
    box()
    Returns the value of the box record component.
    double
    confidence()
    Returns the value of the confidence record component.
    final boolean
    equals(Object o)
    Indicates whether some other object is "equal to" this one.
    final int
    hashCode()
    Returns a hash code value for this object.
    int
    height()
     
    String
    text()
    Returns the value of the text record component.
    final String
    toString()
    Returns a string representation of this record class.
    int
    width()
     
    int
    x()
     
    int
    y()
     

    Methods inherited from class Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • OcrRegion

      public OcrRegion(String text, int x, int y, int width, int height, double confidence)
    • OcrRegion

      public OcrRegion(String text, OcrBox box, double confidence)
      Creates an instance of an OcrRegion record class.
      Parameters:
      text - the value for the text record component
      box - the value for the box record component
      confidence - the value for the confidence record component
  • Method Details

    • x

      public int x()
    • y

      public int y()
    • width

      public int width()
    • height

      public int height()
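
The page does not describe the six-argument constructor or the x(), y(), width(), and height() accessors. One plausible reading is that the constructor wraps the four integers in an OcrBox and the accessors forward to that box. The sketch below illustrates that reading with local stand-in records; the OcrBox component names and the delegation itself are assumptions, not confirmed by the source.

```java
// Hypothetical stand-in for ai.doctruth.spi.OcrBox; component names are assumed.
record OcrBox(int x, int y, int width, int height) {}

record OcrRegion(String text, OcrBox box, double confidence) {

    // Convenience constructor: builds the box from four ints, then delegates
    // to the canonical (text, box, confidence) constructor.
    public OcrRegion(String text, int x, int y, int width, int height, double confidence) {
        this(text, new OcrBox(x, y, width, height), confidence);
    }

    // Shortcuts that forward to the box, so callers can write region.x()
    // instead of region.box().x().
    public int x() { return box.x(); }
    public int y() { return box.y(); }
    public int width() { return box.width(); }
    public int height() { return box.height(); }
}
```

Under this reading the shortcuts are not record components, so they play no part in equals, hashCode, or toString; those still see only text, box, and confidence.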
    • toString

      public final String toString()
      Returns a string representation of this record class. The representation contains the name of the class, followed by the name and value of each of the record components.
      Specified by:
      toString in class Record
      Returns:
      a string representation of this object
    • hashCode

      public final int hashCode()
      Returns a hash code value for this object. The value is derived from the hash code of each of the record components.
      Specified by:
      hashCode in class Record
      Returns:
      a hash code value for this object
    • equals

      public final boolean equals(Object o)
      Indicates whether some other object is "equal to" this one. The objects are equal if the other object is of the same class and if all the record components are equal. Reference components are compared with Objects::equals(Object,Object); primitive components are compared with the compare method from their corresponding wrapper classes.
      Specified by:
      equals in class Record
      Parameters:
      o - the object with which to compare
      Returns:
      true if this object is the same as the o argument; false otherwise.
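
A short illustration of what this contract means for the double-valued confidence component, using local stand-in records without validation. Because doubles are compared with Double.compare, 0.0 and -0.0 count as different values. Whether the real class accepts -0.0 as a confidence is not stated in the source.

```java
// Stand-ins with the documented component list; OcrBox's components are assumed.
record OcrBox(int x, int y, int width, int height) {}
record OcrRegion(String text, OcrBox box, double confidence) {}

final class RegionEqualityDemo {
    public static void main(String[] args) {
        OcrRegion a = new OcrRegion("Total", new OcrBox(10, 20, 80, 16), 0.5);
        OcrRegion b = new OcrRegion("Total", new OcrBox(10, 20, 80, 16), 0.5);
        // Same text, an equal box, and the same confidence: equal records.
        System.out.println(a.equals(b)); // true

        OcrRegion posZero = new OcrRegion("Total", new OcrBox(10, 20, 80, 16), 0.0);
        OcrRegion negZero = new OcrRegion("Total", new OcrBox(10, 20, 80, 16), -0.0);
        // 0.0 == -0.0 is true for primitives, but Double.compare tells them apart.
        System.out.println(posZero.equals(negZero)); // false
    }
}
```

In practice, confidence values from an OCR engine are better compared against a threshold than tested for exact equality.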
    • text

      public String text()
      Returns the value of the text record component.
      Returns:
      the value of the text record component
    • box

      public OcrBox box()
      Returns the value of the box record component.
      Returns:
      the value of the box record component
    • confidence

      public double confidence()
      Returns the value of the confidence record component.
      Returns:
      the value of the confidence record component