whylogs_container.whylabs.container.otel
Functions
- extract_score_causes: Takes the output of a score workflow and returns the metrics that caused the high scores for each ruleset that was run.
- get_current_carrier
- get_tracer
- span_value
- trace_langkit_result
Classes
- AttributePrefix: An enumeration.
- class whylogs_container.whylabs.container.otel.AttributePrefix(value)
Bases: Enum
An enumeration.
- AdditionalData = 'whylabs.secure.additional_data'
- ApiKeyId = 'whylabs.api_key-id'
- Container = 'whylabs.secure.container'
- DatasetId = 'whylabs.dataset_id'
- Metadata = 'whylabs.secure.metadata'
- Metric = 'whylabs.secure.metrics'
- MetricLatency = 'whylabs.secure.latency'
- Policy = 'whylabs.secure.policy'
- ResourceId = 'whylabs.resource_id'
- Score = 'whylabs.secure.score'
- Tags = 'whylabs.secure.tags'
- ValidationEvent = 'whylabs.secure.validation'
- Version = 'guardrails_container.version'
- WorkflowAction = 'whylabs.secure.action'
- whylogs_container.whylabs.container.otel.extract_score_causes(score_result: WorkflowResult) -> List[str]
This function will take the output of a score workflow and return the metrics that caused the high scores for each ruleset that was run. The output of the score workflow will be scores for metrics like this dataframe:
    prompt.score.misuse                                                      100
    prompt.score.misuse.prompt.topics.financial                              100
    prompt.score.misuse.prompt.topics.medical                                12
    response.score.customer_experience                                       70
    response.score.customer_experience.response.sentiment.sentiment_score    None
    response.score.customer_experience.response.toxicity.toxicity_score      21
    response.score.customer_experience.response.regex.refusal                70
Each score will be accompanied by the raw metric values as well. For example:
    prompt.score.misuse                            100
    prompt.score.misuse.prompt.topics.financial    100
    prompt.score.misuse.prompt.topics.medical      12
Here, the prompt.score.misuse score was high because the prompt.topics.financial metric was high. The prompt.topics.medical metric was low, so it did not determine the high score. At the moment, a ruleset score is simply the maximum of its component metric scores.
For the sample input above, if these rulesets were configured to fail, then this function will return:
['prompt.score.misuse.prompt.topics.financial', 'response.score.customer_experience.response.regex.refusal']
because those were the two causes of the high scores in the rulesets that were run. If there were no validation failures in the workflow, the output will be an empty list. The function does not simply return the highest scoring metric overall. For each ruleset, it returns the highest scoring metric only when that metric pushed the ruleset score high enough to fail validation.
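The selection rule described above can be illustrated with a minimal sketch. This is not the container's implementation: it works on a plain dict of score names instead of a real WorkflowResult, and the function name `sketch_extract_score_causes` and the `failed_rulesets` argument are hypothetical stand-ins for however validation failures are actually recorded.

```python
from typing import Dict, List, Optional, Set


def sketch_extract_score_causes(
    scores: Dict[str, Optional[float]],
    failed_rulesets: Set[str],
) -> List[str]:
    """For each failed ruleset, return its highest-scoring component metric.

    Assumption: component scores are keyed as "<ruleset score name>.<metric name>".
    """
    causes: List[str] = []
    for ruleset in scores:
        # Rulesets that did not fail validation contribute nothing.
        if ruleset not in failed_rulesets:
            continue
        prefix = ruleset + "."
        # Collect the component scores nested under this ruleset, skipping None.
        components = {
            name: value
            for name, value in scores.items()
            if name.startswith(prefix) and value is not None
        }
        if components:
            causes.append(max(components, key=lambda name: components[name]))
    return causes


# Sample scores copied from the example above.
scores = {
    "prompt.score.misuse": 100,
    "prompt.score.misuse.prompt.topics.financial": 100,
    "prompt.score.misuse.prompt.topics.medical": 12,
    "response.score.customer_experience": 70,
    "response.score.customer_experience.response.sentiment.sentiment_score": None,
    "response.score.customer_experience.response.toxicity.toxicity_score": 21,
    "response.score.customer_experience.response.regex.refusal": 70,
}

# Hypothetical: both rulesets were configured to fail and did fail.
failed = {"prompt.score.misuse", "response.score.customer_experience"}
print(sketch_extract_score_causes(scores, failed))
```

With both rulesets marked as failed, the sketch prints the same two metric names as the documented example. With an empty `failed_rulesets` set it returns an empty list, mirroring the no-validation-failure case.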
- whylogs_container.whylabs.container.otel.get_current_carrier() -> Dict[str, Any]
- whylogs_container.whylabs.container.otel.get_tracer(child_org: str | None) -> Tracer
- whylogs_container.whylabs.container.otel.span_value(it: Any) -> Any
- whylogs_container.whylabs.container.otel.trace_langkit_result(result: WorkflowResult, score_result: WorkflowResult | None, config: ConfigInstance, dataset_id: str, start_time_ns: int, end_time_ns: int, carrier: Dict[str, Any], api_key: WhyLabsApiKey, metric_metadata: Dict[str, Any] | None = None, additional_data: Dict[str, Any] | None = None, request_metadata: Dict[str, Any] | None = None) -> None