Class InsertExperimentEvent

    • Constructor Detail

    • Method Detail

      • id

         final Optional<String> id()

        A unique identifier for the experiment event. If you don't provide one, Braintrust will generate one for you.

      • _isMerge

         final Optional<Boolean> _isMerge()

        The _is_merge field controls how the row is merged with any existing row with the same id in the DB. By default (or when set to false), the existing row is completely replaced by the new row. When set to true, the new row is deep-merged into the existing row, if one is found. If no existing row is found, the new row is inserted as is.

        For example, say there is an existing row in the DB {"id": "foo", "input": {"a": 5, "b": 10}}. If we merge a new row as {"_is_merge": true, "id": "foo", "input": {"b": 11, "c": 20}}, the resulting row will be {"id": "foo", "input": {"a": 5, "b": 11, "c": 20}}. If we instead replace the existing row by logging {"id": "foo", "input": {"b": 11, "c": 20}} without _is_merge, the resulting row will be {"id": "foo", "input": {"b": 11, "c": 20}}.
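
        The merge-versus-replace behavior described above can be sketched in plain Java. This is only an illustration of the semantics, assuming rows are modeled as java.util.Map values; the class DeepMergeSketch and its deepMerge method are hypothetical names, not part of the Braintrust SDK or its server-side implementation, and the _is_merge control field itself is left out of the maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of the Braintrust SDK.
public class DeepMergeSketch {

    /** Returns a copy of {@code existing} with {@code update} deep-merged into it. */
    @SuppressWarnings("unchecked")
    public static Map<String, Object> deepMerge(Map<String, Object> existing,
                                                Map<String, Object> update) {
        Map<String, Object> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Object> entry : update.entrySet()) {
            Object oldValue = result.get(entry.getKey());
            Object newValue = entry.getValue();
            if (oldValue instanceof Map && newValue instanceof Map) {
                // Both sides are objects: merge them field by field.
                result.put(entry.getKey(),
                        deepMerge((Map<String, Object>) oldValue, (Map<String, Object>) newValue));
            } else {
                // Scalars, lists, or mismatched types: the new value wins.
                result.put(entry.getKey(), newValue);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> existing = Map.of("id", "foo", "input", Map.of("a", 5, "b", 10));
        Map<String, Object> update = Map.of("id", "foo", "input", Map.of("b", 11, "c", 20));

        // _is_merge = true: the existing row is deep-merged with the update.
        Map<String, Object> merged = deepMerge(existing, update);
        Map<String, Object> expected =
                Map.of("id", "foo", "input", Map.of("a", 5, "b", 11, "c", 20));
        System.out.println(merged.equals(expected)); // prints true

        // _is_merge = false (the default): the update simply replaces the row.
        Map<String, Object> replaced = update;
        System.out.println(replaced);
    }
}
```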

      • _mergePaths

         final Optional<List<List<String>>> _mergePaths()

        The _merge_paths field allows controlling the depth of the merge, when _is_merge=true. _merge_paths is a list of paths, where each path is a list of field names. The deep merge will not descend below any of the specified merge paths.

        For example, say there is an existing row in the DB {"id": "foo", "input": {"a": {"b": 10}, "c": {"d": 20}}, "output": {"a": 20}}. If we merge a new row as {"_is_merge": true, "_merge_paths": [["input", "a"], ["output"]], "input": {"a": {"q": 30}, "c": {"e": 30}, "bar": "baz"}, "output": {"d": 40}}, the new row will be {"id": "foo", "input": {"a": {"q": 30}, "c": {"d": 20, "e": 30}, "bar": "baz"}, "output": {"d": 40}}. In this case, due to the merge paths, we have replaced input.a and output, but have still deep-merged input and input.c.
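
        The effect of merge paths can be sketched the same way: a deep merge that stops descending, and replaces the value instead, whenever the current path matches one of the listed paths. As before, this is a hedged illustration over java.util.Map values; MergePathsSketch and its merge method are hypothetical names and do not reflect how Braintrust implements the feature.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the Braintrust SDK.
public class MergePathsSketch {

    /**
     * Deep-merges {@code update} into a copy of {@code existing}, but replaces
     * (rather than merges) any value whose path appears in {@code mergePaths}.
     */
    @SuppressWarnings("unchecked")
    public static Map<String, Object> merge(Map<String, Object> existing,
                                            Map<String, Object> update,
                                            List<List<String>> mergePaths,
                                            List<String> prefix) {
        Map<String, Object> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Object> entry : update.entrySet()) {
            List<String> path = new ArrayList<>(prefix);
            path.add(entry.getKey());
            Object oldValue = result.get(entry.getKey());
            Object newValue = entry.getValue();
            boolean stopHere = mergePaths.contains(path);
            if (!stopHere && oldValue instanceof Map && newValue instanceof Map) {
                // Not a merge path: keep descending.
                result.put(entry.getKey(), merge((Map<String, Object>) oldValue,
                        (Map<String, Object>) newValue, mergePaths, path));
            } else {
                // Merge path reached (or a non-object value): replace wholesale.
                result.put(entry.getKey(), newValue);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> existing = Map.of(
                "id", "foo",
                "input", Map.of("a", Map.of("b", 10), "c", Map.of("d", 20)),
                "output", Map.of("a", 20));
        Map<String, Object> update = Map.of(
                "input", Map.of("a", Map.of("q", 30), "c", Map.of("e", 30), "bar", "baz"),
                "output", Map.of("d", 40));
        List<List<String>> mergePaths = List.of(List.of("input", "a"), List.of("output"));

        Map<String, Object> expected = Map.of(
                "id", "foo",
                "input", Map.of("a", Map.of("q", 30), "c", Map.of("d", 20, "e", 30), "bar", "baz"),
                "output", Map.of("d", 40));
        Map<String, Object> merged = merge(existing, update, mergePaths, List.of());
        System.out.println(merged.equals(expected)); // prints true
    }
}
```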

      • _objectDelete

         final Optional<Boolean> _objectDelete()

        Pass _object_delete=true to mark the experiment event deleted. Deleted events will not show up in subsequent fetches for this experiment.

      • _parentId

         final Optional<String> _parentId()

        DEPRECATED: The _parent_id field is deprecated and should not be used. Support for _parent_id will be dropped in a future version of Braintrust. Log span_id, root_span_id, and span_parents explicitly instead.

        Use the _parent_id field to create this row as a subspan of an existing row. Tracking hierarchical relationships is important for tracing (see the guide for full details).

        For example, say we have logged a row {"id": "abc", "input": "foo", "output": "bar", "expected": "boo", "scores": {"correctness": 0.33}}. We can create a sub-span of the parent row by logging {"_parent_id": "abc", "id": "llm_call", "input": {"prompt": "What comes after foo?"}, "output": "bar", "metrics": {"tokens": 1}}. In the webapp, only the root span row "abc" will show up in the summary view. You can view the full trace hierarchy (in this case, the "llm_call" row) by clicking on the "abc" row.

        If the row is being merged into an existing row, this field will be ignored.

      • context

         final Optional<InsertExperimentEvent.Context> context()

        Context is additional information about the code that produced the experiment event. It is essentially the textual counterpart to metrics. Use the caller_* attributes to track the location in code which produced the experiment event.

      • _expected

         final JsonValue _expected()

        The ground truth value (an arbitrary, JSON serializable object) that you'd compare to output to determine if your output value is correct or not. Braintrust currently does not compare output to expected for you, since there are so many different ways to do that correctly. Instead, these values are just used to help you navigate your experiments while digging into analyses. However, we may later use these values to re-score outputs or fine-tune your models.

      • _input

         final JsonValue _input()

        The arguments that uniquely define a test case (an arbitrary, JSON serializable object). Later on, Braintrust will use the input to know whether two test cases are the same between experiments, so they should not contain experiment-specific state. A simple rule of thumb is that if you run the same experiment twice, the input should be identical.

      • metadata

         final Optional<InsertExperimentEvent.Metadata> metadata()

        A dictionary with additional data about the test example, model outputs, or just about anything else that's relevant, that you can use to help find and analyze examples later. For example, you could log the prompt, example's id, or anything else that would be useful to slice/dice later. The values in metadata can be any JSON-serializable type, but its keys must be strings.

      • metrics

         final Optional<InsertExperimentEvent.Metrics> metrics()

        Metrics are numerical measurements tracking the execution of the code that produced the experiment event. Use "start" and "end" to track the time span over which the experiment event was produced.

      • _output

         final JsonValue _output()

        The output of your application, including post-processing (an arbitrary, JSON serializable object), that allows you to determine whether the result is correct or not. For example, in an app that generates SQL queries, the output should be the result of the SQL query generated by the model, not the query itself, because there may be multiple valid queries that answer a single question.

      • rootSpanId

         final Optional<String> rootSpanId()

        Use span_id, root_span_id, and span_parents instead of _parent_id, which is now deprecated. The span_id is a unique identifier describing the row's place in a trace, and the root_span_id is a unique identifier for the whole trace. See the guide for full details.

        For example, say we have logged a row {"id": "abc", "span_id": "span0", "root_span_id": "root_span0", "input": "foo", "output": "bar", "expected": "boo", "scores": {"correctness": 0.33}}. We can create a sub-span of the parent row by logging {"id": "llm_call", "span_id": "span1", "root_span_id": "root_span0", "span_parents": ["span0"], "input": {"prompt": "What comes after foo?"}, "output": "bar", "metrics": {"tokens": 1}}. In the webapp, only the root span row "abc" will show up in the summary view. You can view the full trace hierarchy (in this case, the "llm_call" row) by clicking on the "abc" row.

        If the row is being merged into an existing row, this field will be ignored.
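
        The parent/child relationship in the example above can be made concrete with a small sketch that models the two logged rows as java.util.Map values and looks up the children of a span through span_parents. The class SpanTreeSketch and its childIds helper are hypothetical names used only for illustration; they are not SDK calls and say nothing about how the webapp builds its trace view.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the Braintrust SDK.
public class SpanTreeSketch {

    /** Returns the ids of rows that list {@code parentSpanId} in their span_parents. */
    public static List<String> childIds(List<Map<String, Object>> rows, String parentSpanId) {
        List<String> ids = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Object parents = row.get("span_parents");
            if (parents instanceof List && ((List<?>) parents).contains(parentSpanId)) {
                ids.add((String) row.get("id"));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // The root row and its sub-span, as in the example above.
        List<Map<String, Object>> rows = List.of(
                Map.of("id", "abc", "span_id", "span0", "root_span_id", "root_span0",
                        "input", "foo", "output", "bar"),
                Map.of("id", "llm_call", "span_id", "span1", "root_span_id", "root_span0",
                        "span_parents", List.of("span0"),
                        "input", Map.of("prompt", "What comes after foo?")));

        System.out.println(childIds(rows, "span0")); // prints [llm_call]
        System.out.println(childIds(rows, "span1")); // prints []
    }
}
```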

      • scores

         final Optional<InsertExperimentEvent.Scores> scores()

        A dictionary of numeric values (between 0 and 1) to log. The scores should give you a variety of signals that help you determine how accurate the outputs are compared to what you expect and diagnose failures. For example, a summarization app might have one score that tells you how accurate the summary is, and another that measures the word similarity between the generated and ground truth summary. The word similarity score could help you determine whether the summarization was covering similar concepts or not. You can use these scores to help you sort, filter, and compare experiments.
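
        As one possible way to produce a word similarity score in the required 0 to 1 range, the sketch below uses Jaccard overlap between the word sets of the two summaries. The choice of Jaccard overlap, the class WordSimilaritySketch, and the score name "word_similarity" are all assumptions made for illustration; the text above does not prescribe a particular metric.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of the Braintrust SDK.
public class WordSimilaritySketch {

    /** Splits text on whitespace into a set of lowercase words. */
    static Set<String> words(String text) {
        Set<String> out = new HashSet<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(word);
            }
        }
        return out;
    }

    /** Jaccard overlap of the two word sets; always between 0 and 1. */
    public static double wordSimilarity(String generated, String groundTruth) {
        Set<String> a = words(generated);
        Set<String> b = words(groundTruth);
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // two empty texts are treated as identical
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Shared words: "the", "cat" (2); distinct words overall: 4.
        double similarity = wordSimilarity("the cat sat", "the cat ran");
        Map<String, Double> scores = Map.of("word_similarity", similarity);
        System.out.println(scores); // prints {word_similarity=0.5}
    }
}
```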

      • spanId

         final Optional<String> spanId()

        Use span_id, root_span_id, and span_parents instead of _parent_id, which is now deprecated. The span_id is a unique identifier describing the row's place in a trace, and the root_span_id is a unique identifier for the whole trace. See the guide for full details.

        For example, say we have logged a row {"id": "abc", "span_id": "span0", "root_span_id": "root_span0", "input": "foo", "output": "bar", "expected": "boo", "scores": {"correctness": 0.33}}. We can create a sub-span of the parent row by logging {"id": "llm_call", "span_id": "span1", "root_span_id": "root_span0", "span_parents": ["span0"], "input": {"prompt": "What comes after foo?"}, "output": "bar", "metrics": {"tokens": 1}}. In the webapp, only the root span row "abc" will show up in the summary view. You can view the full trace hierarchy (in this case, the "llm_call" row) by clicking on the "abc" row.

        If the row is being merged into an existing row, this field will be ignored.

      • spanParents

         final Optional<List<String>> spanParents()

        Use span_id, root_span_id, and span_parents instead of _parent_id, which is now deprecated. The span_id is a unique identifier describing the row's place in a trace, and the root_span_id is a unique identifier for the whole trace. See the guide for full details.

        For example, say we have logged a row {"id": "abc", "span_id": "span0", "root_span_id": "root_span0", "input": "foo", "output": "bar", "expected": "boo", "scores": {"correctness": 0.33}}. We can create a sub-span of the parent row by logging {"id": "llm_call", "span_id": "span1", "root_span_id": "root_span0", "span_parents": ["span0"], "input": {"prompt": "What comes after foo?"}, "output": "bar", "metrics": {"tokens": 1}}. In the webapp, only the root span row "abc" will show up in the summary view. You can view the full trace hierarchy (in this case, the "llm_call" row) by clicking on the "abc" row.

        If the row is being merged into an existing row, this field will be ignored.

      • _id

         final JsonField<String> _id()

        Returns the raw JSON value of id.

        Unlike id, this method doesn't throw if the JSON field has an unexpected type.

      • _tags

         final JsonField<List<String>> _tags()

        Returns the raw JSON value of tags.

        Unlike tags, this method doesn't throw if the JSON field has an unexpected type.