monitor_schema.models.analyzer.algorithms#
Collection of supported algorithms.
Attributes#
Classes#
ReferenceProfileId | A baseline based on a static reference profile.
SingleBatchBaseline | Using current batch.
TimeRangeBaseline | A static time range.
TrailingWindowBaseline | A dynamic trailing window.
NoExtrasBaseModel | No extras base model.
TimeRange | Support for a specific time range.
AlgorithmType | Specify the algorithm type.
DatasetMetric | Metrics that are applicable at the dataset level.
SimpleColumnMetric | Simple column metrics that are basically just a single number.
ComplexMetrics | Sketch-based metrics that can only be processed by certain algorithms.
AlgorithmConfig | Base algorithm config.
ExpectedValue | Expected value: one of these fields must be set.
ComparisonOperator | Operators for performing a comparison.
ListComparisonOperator | Operators for performing a comparison.
FrequentStringComparisonOperator | Operators for performing a comparison.
ComparisonConfig | Compare the target against either an expected value or the baseline.
ListComparisonConfig | Compare a target list of values against a baseline list of values.
FrequentStringComparisonConfig | Compare the target against a list of values.
ColumnListChangeConfig | Compare whether the target is equal to a value or not.
FixedThresholdsConfig | Fixed threshold analysis.
ThresholdType | Threshold Type declaring the upper and lower bound.
_ThresholdBaseConfig | Base algorithm config.
StddevConfig | Calculates upper bounds and lower bounds based on stddev from a series of numbers.
SeasonalConfig | An analyzer using stddev for a window of time range.
DriftConfig | An analyzer using stddev for a window of time range.
ExperimentalConfig | Experimental algorithm that is not standardized by the above ones yet.
DiffMode | Whether to use the absolute difference or the percentage to calculate the difference.
DiffConfig | Detecting the differences between two numerical metrics.
ConjunctionConfig | Conjunction (ANDs) composite analyzer joining multiple analyzers.
DisjunctionConfig | Disjunction (ORs) composite analyzer joining multiple analyzers.
Functions#
anyOf_to_oneOf | Turn anyOf in JSON schema to oneOf.
Module Contents#
- class monitor_schema.models.analyzer.algorithms.ReferenceProfileId[source]#
Bases:
_Baseline
A baseline based on a static reference profile.
A typical use case is to upload the profile of a “gold” dataset to WhyLabs. This can also be the training dataset for an ML model.
- type: Literal[BaselineType]#
- class monitor_schema.models.analyzer.algorithms.SingleBatchBaseline[source]#
Bases:
_SegmentBaseline
Using current batch.
This is used when you want to use one batch to monitor another batch in a different metric entity.
- type: Literal[BaselineType]#
- class monitor_schema.models.analyzer.algorithms.TimeRangeBaseline[source]#
Bases:
_SegmentBaseline
A static time range.
Instead of using a single profile or a trailing window, the user can lock in a “good” period.
- type: Literal[BaselineType]#
- class monitor_schema.models.analyzer.algorithms.TrailingWindowBaseline[source]#
Bases:
_SegmentBaseline
A dynamic trailing window.
This is useful if you don’t have a static baseline to monitor against. This is the default mode for most monitors.
- type: Literal[BaselineType]#
- exclusionRanges: List[monitor_schema.models.commons.TimeRange] | None#
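The interaction between a trailing window and its exclusion ranges can be sketched as below. This is an illustration only, not the WhyLabs implementation: the function name `trailing_window_batches`, the `size` parameter, and the half-open `[start, end)` interpretation of each range are all assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Sequence, Tuple


def trailing_window_batches(
    batch_times: Sequence[datetime],
    now: datetime,
    size: timedelta,
    exclusion_ranges: Sequence[Tuple[datetime, datetime]] = (),
) -> List[datetime]:
    """Return batch timestamps inside the trailing window, minus excluded ranges.

    Assumption: both the window and each exclusion range are half-open,
    i.e. start is inclusive and end is exclusive.
    """
    window_start = now - size
    kept = []
    for ts in batch_times:
        if not (window_start <= ts < now):
            continue  # outside the trailing window
        if any(start <= ts < end for start, end in exclusion_ranges):
            continue  # falls in an excluded period
        kept.append(ts)
    return kept


days = [datetime(2024, 1, d, tzinfo=timezone.utc) for d in range(1, 6)]
now = datetime(2024, 1, 6, tzinfo=timezone.utc)
excluded = [(days[3], days[4])]  # exclude Jan 4 only
kept = trailing_window_batches(days, now, timedelta(days=3), excluded)
print([t.day for t in kept])
```

Excluding a known-bad period this way keeps an incident from polluting the baseline that later batches are compared against.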
- class monitor_schema.models.analyzer.algorithms.NoExtrasBaseModel[source]#
Bases:
pydantic.BaseModel
No extras base model.
Inherit to prevent accidental extra fields.
- class monitor_schema.models.analyzer.algorithms.TimeRange[source]#
Bases:
NoExtrasBaseModel
Support for a specific time range.
- start: datetime.datetime#
- end: datetime.datetime#
- monitor_schema.models.analyzer.algorithms.COLUMN_NAME_TYPE#
- monitor_schema.models.analyzer.algorithms.anyOf_to_oneOf(schema: Dict[str, Any], field_name: str) None [source]#
Turn anyOf in JSON schema to oneOf.
oneOf is much stricter, and Pydantic doesn’t produce this tag. We hijack the JSON schema object to set this correctly.
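A minimal sketch of what such a rewrite could look like on a plain dict. It assumes the field's `anyOf` list sits under `schema["properties"][field_name]`; the actual function may locate the field differently, and the name `any_of_to_one_of_sketch` is hypothetical.

```python
from typing import Any, Dict


def any_of_to_one_of_sketch(schema: Dict[str, Any], field_name: str) -> None:
    """Rename the `anyOf` key of one property to `oneOf`, in place.

    Assumption: the property lives under schema["properties"][field_name].
    """
    field_schema = schema.get("properties", {}).get(field_name)
    if field_schema is None or "anyOf" not in field_schema:
        return  # nothing to rewrite
    field_schema["oneOf"] = field_schema.pop("anyOf")


schema = {
    "properties": {
        "baseline": {"anyOf": [{"type": "string"}, {"type": "integer"}]}
    }
}
any_of_to_one_of_sketch(schema, "baseline")
print(schema["properties"]["baseline"])
```

With `oneOf`, a validator requires the value to match exactly one of the listed subschemas, whereas `anyOf` accepts a match against one or more.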
- class monitor_schema.models.analyzer.algorithms.AlgorithmType[source]#
Specify the algorithm type.
- expected = 'expected'#
- column_list = 'column_list'#
- comparison = 'comparison'#
- conjunction = 'conjunction'#
- disjunction = 'disjunction'#
- list_comparison = 'list_comparison'#
- frequent_string_comparison = 'frequent_string_comparison'#
- diff = 'diff'#
- drift = 'drift'#
- stddev = 'stddev'#
- seasonal = 'seasonal'#
- fixed = 'fixed'#
- experimental = 'experimental'#
- class monitor_schema.models.analyzer.algorithms.DatasetMetric[source]#
Metrics that are applicable at the dataset level.
- profile_count = 'profile.count'#
- profile_last_ingestion_time = 'profile.last_ingestion_time'#
- profile_first_ingestion_time = 'profile.first_ingestion_time'#
- column_row_count_sum = 'column_row_count_sum'#
- shape_column_count = 'shape_column_count'#
- shape_row_count = 'shape_row_count'#
- input_count = 'input.count'#
- output_count = 'output.count'#
- classification_f1 = 'classification.f1'#
- classification_precision = 'classification.precision'#
- classification_recall = 'classification.recall'#
- classification_accuracy = 'classification.accuracy'#
- classification_fpr = 'classification.fpr'#
- classification_auroc = 'classification.auroc'#
- regression_mse = 'regression.mse'#
- regression_mae = 'regression.mae'#
- regression_rmse = 'regression.rmse'#
- class monitor_schema.models.analyzer.algorithms.SimpleColumnMetric[source]#
Simple column metrics that are basically just a single number.
- count = 'count'#
- median = 'median'#
- max = 'max'#
- min = 'min'#
- mean = 'mean'#
- stddev = 'stddev'#
- variance = 'variance'#
- unique_upper = 'unique_upper'#
- unique_upper_ratio = 'unique_upper_ratio'#
- unique_est = 'unique_est'#
- unique_est_ratio = 'unique_est_ratio'#
- unique_lower = 'unique_lower'#
- unique_lower_ratio = 'unique_lower_ratio'#
- count_bool = 'count_bool'#
- count_bool_ratio = 'count_bool_ratio'#
- count_integral = 'count_integral'#
- count_integral_ratio = 'count_integral_ratio'#
- count_fractional = 'count_fractional'#
- count_fractional_ratio = 'count_fractional_ratio'#
- count_string = 'count_string'#
- count_string_ratio = 'count_string_ratio'#
- count_null = 'count_null'#
- count_null_ratio = 'count_null_ratio'#
- inferred_data_type = 'inferred_data_type'#
- quantile_p5 = 'quantile_5'#
- quantile_p75 = 'quantile_75'#
- quantile_p25 = 'quantile_25'#
- quantile_p90 = 'quantile_90'#
- quantile_p95 = 'quantile_95'#
- quantile_p99 = 'quantile_99'#
- class monitor_schema.models.analyzer.algorithms.ComplexMetrics[source]#
Sketch-based metrics that can only be processed by certain algorithms.
- histogram = 'histogram'#
- frequent_items = 'frequent_items'#
- unique_sketch = 'unique_sketch'#
- column_list = 'column_list'#
- class monitor_schema.models.analyzer.algorithms.AlgorithmConfig[source]#
Bases:
monitor_schema.models.commons.NoExtrasBaseModel
Base algorithm config.
- params: Optional[Dict[constr(max_length=100), constr(max_length=1000)]]#
- metric: Union[DatasetMetric, SimpleColumnMetric, constr(max_length=100)]#
- class monitor_schema.models.analyzer.algorithms.ExpectedValue[source]#
Bases:
monitor_schema.models.commons.NoExtrasBaseModel
Expected value: one of these fields must be set.
- str: Optional[constr(max_length=100)]#
- class monitor_schema.models.analyzer.algorithms.ComparisonOperator[source]#
Operators for performing a comparison.
- eq = 'eq'#
- gt = 'gt'#
- lt = 'lt'#
- ge = 'ge'#
- le = 'le'#
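One way to read these operator values is as the usual Python comparisons. The sketch below assumes the comparison is evaluated as `target <op> expected`; the direction and the helper name `compare` are assumptions, not documented behaviour.

```python
import operator
from typing import Callable, Dict

# Assumed semantics: each enum value maps to the matching Python comparison.
COMPARATORS: Dict[str, Callable[[float, float], bool]] = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "ge": operator.ge,
    "le": operator.le,
}


def compare(target: float, op: str, expected: float) -> bool:
    """Evaluate `target <op> expected` for one of the operator values."""
    return COMPARATORS[op](target, expected)


print(compare(3, "gt", 2))
print(compare(2, "lt", 2))
```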
- class monitor_schema.models.analyzer.algorithms.ListComparisonOperator[source]#
Operators for performing a comparison.
- in_list = 'in'#
- not_in = 'not_in'#
- class monitor_schema.models.analyzer.algorithms.FrequentStringComparisonOperator[source]#
Operators for performing a comparison.
- eq = 'eq'#
- target_includes_all_baseline = 'target_includes_all_baseline'#
- baseline_includes_all_target = 'baseline_includes_all_target'#
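Going by their names, these three operators can be read as set relations between the frequent strings of the target and of the baseline. The sketch below encodes that reading; it is an interpretation, and `frequent_strings_match` is a hypothetical helper.

```python
from typing import Iterable


def frequent_strings_match(
    target: Iterable[str], baseline: Iterable[str], op: str
) -> bool:
    """Check a set relation between target and baseline frequent strings."""
    t, b = set(target), set(baseline)
    if op == "eq":
        return t == b
    if op == "target_includes_all_baseline":
        return b <= t  # every baseline string also appears in the target
    if op == "baseline_includes_all_target":
        return t <= b  # every target string also appears in the baseline
    raise ValueError(f"unknown operator: {op}")


print(frequent_strings_match(["a", "b", "c"], ["a", "b"], "target_includes_all_baseline"))
```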
- class monitor_schema.models.analyzer.algorithms.ComparisonConfig[source]#
Bases:
AlgorithmConfig
Compare the target against either an expected value or the baseline.
This is useful to detect data type change, for instance.
- type: Literal[AlgorithmType]#
- operator: ComparisonOperator#
- expected: ExpectedValue | None#
- class monitor_schema.models.analyzer.algorithms.ListComparisonConfig[source]#
Bases:
AlgorithmConfig
Compare a target list of values against a baseline list of values.
- type: Literal[AlgorithmType]#
- operator: ListComparisonOperator#
- expected: List[ExpectedValue] | None#
- class monitor_schema.models.analyzer.algorithms.FrequentStringComparisonConfig[source]#
Bases:
AlgorithmConfig
Compare the target against a list of values.
- type: Literal[AlgorithmType]#
- metric: Literal[ComplexMetrics]#
- operator: FrequentStringComparisonOperator#
- class monitor_schema.models.analyzer.algorithms.ColumnListChangeConfig[source]#
Bases:
AlgorithmConfig
Compare whether the target is equal to a value or not.
This is useful to detect data type change, for instance.
- type: Literal[AlgorithmType]#
- mode: Literal['ON_ADD_AND_REMOVE', 'ON_ADD', 'ON_REMOVE'] = 'ON_ADD_AND_REMOVE'#
- metric: Literal[ComplexMetrics]#
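The three `mode` values suggest a set difference between the baseline and target column lists. The sketch below assumes that `ON_ADD_AND_REMOVE` fires when columns are added or removed (either one suffices); that reading and the helper name `column_list_changed` are assumptions.

```python
from typing import Iterable


def column_list_changed(
    baseline_columns: Iterable[str],
    target_columns: Iterable[str],
    mode: str = "ON_ADD_AND_REMOVE",
) -> bool:
    """Report whether the column list changed in the way `mode` cares about."""
    baseline, target = set(baseline_columns), set(target_columns)
    added = target - baseline
    removed = baseline - target
    if mode == "ON_ADD":
        return bool(added)
    if mode == "ON_REMOVE":
        return bool(removed)
    if mode == "ON_ADD_AND_REMOVE":
        return bool(added or removed)  # assumed: either change triggers
    raise ValueError(f"unknown mode: {mode}")


print(column_list_changed(["id", "age"], ["id", "age", "zip"], mode="ON_ADD"))
```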
- class monitor_schema.models.analyzer.algorithms.FixedThresholdsConfig[source]#
Bases:
AlgorithmConfig
Fixed threshold analysis.
If the user sets neither the upper bound nor the lower bound, this algorithm becomes a no-op. WhyLabs might enforce the presence of at least one of these fields in the future.
- type: Literal[AlgorithmType]#
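The no-op behaviour can be illustrated with a small check. The parameter names `upper` and `lower` and the strict inequalities are assumptions made for this sketch; the bound fields themselves are not listed on this page.

```python
from typing import Optional


def fixed_threshold_anomaly(
    value: float,
    upper: Optional[float] = None,
    lower: Optional[float] = None,
) -> bool:
    """Flag `value` if it falls outside whichever bounds are set."""
    if upper is None and lower is None:
        return False  # neither bound set: the check is a no-op
    if upper is not None and value > upper:
        return True
    if lower is not None and value < lower:
        return True
    return False


print(fixed_threshold_anomaly(120.0, upper=100.0))
print(fixed_threshold_anomaly(120.0))
```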
- class monitor_schema.models.analyzer.algorithms.ThresholdType[source]#
Bases:
enum.Enum
Threshold Type declaring the upper and lower bound.
By default an anomaly will be generated when the target is above or below the baseline by the specified threshold.
If it is only desirable to alert when the target is above the baseline and not the other way around, specify upper for your ThresholdType.
- lower = 'lower'#
- upper = 'upper'#
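The effect of the threshold type on alerting can be sketched as follows. `ThresholdTypeSketch` is a local stand-in for the documented enum, and `is_anomaly` with its precomputed bounds is a hypothetical helper, not part of the schema.

```python
from enum import Enum
from typing import Optional


class ThresholdTypeSketch(Enum):
    """Local stand-in mirroring the documented enum values."""

    lower = "lower"
    upper = "upper"


def is_anomaly(
    target: float,
    lower_bound: float,
    upper_bound: float,
    threshold_type: Optional[ThresholdTypeSketch] = None,
) -> bool:
    """Alert in both directions by default, or in one direction only."""
    below = target < lower_bound
    above = target > upper_bound
    if threshold_type is ThresholdTypeSketch.upper:
        return above
    if threshold_type is ThresholdTypeSketch.lower:
        return below
    return above or below


print(is_anomaly(5, 10, 20))
print(is_anomaly(5, 10, 20, ThresholdTypeSketch.upper))
```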
- class monitor_schema.models.analyzer.algorithms._ThresholdBaseConfig[source]#
Bases:
AlgorithmConfig
Base algorithm config.
- thresholdType: ThresholdType | None#
- class monitor_schema.models.analyzer.algorithms.StddevConfig[source]#
Bases:
_ThresholdBaseConfig
Calculates upper bounds and lower bounds based on stddev from a series of numbers.
An analyzer using stddev for a window of time range.
This calculation will fall back to a Poisson distribution if there is only 1 value in the baseline. For 2 values, we use the sample standard deviation formula sqrt(sum((x_i - avg(x))^2) / (n - 1)).
- type: Literal[AlgorithmType]#
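A rough sketch of how bounds could be derived from a baseline series. The multiplier `factor`, its default of 3.0, and the interpretation of the Poisson fallback (variance equal to the mean, so the spread is `sqrt(mean)`) are assumptions for illustration; the page does not specify them.

```python
import math
import statistics
from typing import Sequence, Tuple


def stddev_bounds(
    baseline: Sequence[float], factor: float = 3.0
) -> Tuple[float, float]:
    """Return (lower, upper) bounds around the baseline mean."""
    if not baseline:
        raise ValueError("baseline must contain at least one value")
    center = statistics.fmean(baseline)
    if len(baseline) == 1:
        # Assumed Poisson fallback: variance equals the mean.
        spread = math.sqrt(max(center, 0.0))
    else:
        # Sample standard deviation, dividing by n - 1.
        spread = statistics.stdev(baseline)
    return center - factor * spread, center + factor * spread


lower, upper = stddev_bounds([2.0, 4.0], factor=1.0)
print(lower, upper)
print(stddev_bounds([9.0], factor=1.0))
```

A target value outside the returned interval would then be a candidate anomaly, subject to the configured threshold type.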
- class monitor_schema.models.analyzer.algorithms.SeasonalConfig[source]#
Bases:
_ThresholdBaseConfig
An analyzer using stddev for a window of time range.
This will fall back to Poisson distribution if there is only 1 value in the baseline.
This only works with TrailingWindow baseline (TODO: add backend validation)
- type: Literal[AlgorithmType]#
- algorithm: Literal['arima']#
- stddevTimeRanges: List[monitor_schema.models.commons.TimeRange] | None#
- class monitor_schema.models.analyzer.algorithms.DriftConfig[source]#
Bases:
AlgorithmConfig
An analyzer using stddev for a window of time range.
This analysis will detect whether the data drifts or not. By default, we use the Hellinger distance with a threshold of 0.7.
- type: Literal[AlgorithmType]#
- algorithm: Literal['hellinger', 'jensenshannon', 'kl_divergence', 'psi']#
- metric: Literal[ComplexMetrics, ComplexMetrics]#
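For reference, the Hellinger distance between two discrete distributions P and Q is sqrt(0.5 * sum((sqrt(p_i) - sqrt(q_i))^2)), which ranges from 0 (identical) to 1 (no overlap). The sketch below applies it to two already-normalized frequency tables; how WhyLabs derives such distributions from histogram or frequent-items sketches is not covered here, and the function name is hypothetical.

```python
import math
from typing import Mapping


def hellinger_distance(p: Mapping[str, float], q: Mapping[str, float]) -> float:
    """Hellinger distance between two discrete probability distributions.

    Both inputs are assumed to be normalized (values sum to 1).
    """
    keys = set(p) | set(q)
    total = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys
    )
    return math.sqrt(total / 2.0)


baseline = {"red": 0.5, "blue": 0.5}
target = {"red": 0.1, "blue": 0.9}
distance = hellinger_distance(baseline, target)
drifted = distance > 0.7  # default threshold quoted in the docstring
print(round(distance, 4), drifted)
```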
- class monitor_schema.models.analyzer.algorithms.ExperimentalConfig[source]#
Bases:
AlgorithmConfig
Experimental algorithm that is not standardized by the above ones yet.
- type: Literal[AlgorithmType]#
- baseline: monitor_schema.models.analyzer.baseline.TrailingWindowBaseline | monitor_schema.models.analyzer.baseline.ReferenceProfileId | monitor_schema.models.analyzer.baseline.TimeRangeBaseline | monitor_schema.models.analyzer.baseline.SingleBatchBaseline#
- stub: AlgorithmType | None#
- class monitor_schema.models.analyzer.algorithms.DiffMode[source]#
Whether to use the absolute difference or the percentage to calculate the difference.
- abs = 'abs'#
- pct = 'pct'#
- class monitor_schema.models.analyzer.algorithms.DiffConfig[source]#
Bases:
AlgorithmConfig
Detecting the differences between two numerical metrics.
- type: Literal[AlgorithmType]#
- thresholdType: ThresholdType | None#
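The two `DiffMode` values can be illustrated with a small helper. Expressing the percentage on a 0 to 100 scale and taking it relative to the baseline are assumptions of this sketch, as is the function name `metric_diff`.

```python
def metric_diff(target: float, baseline: float, mode: str = "abs") -> float:
    """Difference between two numbers, absolute or as a percentage."""
    delta = abs(target - baseline)
    if mode == "abs":
        return delta
    if mode == "pct":
        if baseline == 0:
            raise ZeroDivisionError("percentage diff is undefined for a zero baseline")
        # Assumed: percentage relative to the baseline, on a 0-100 scale.
        return delta * 100.0 / abs(baseline)
    raise ValueError(f"unknown mode: {mode}")


print(metric_diff(110.0, 100.0, "abs"))
print(metric_diff(110.0, 100.0, "pct"))
```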
- class monitor_schema.models.analyzer.algorithms.ConjunctionConfig[source]#
Bases:
monitor_schema.models.commons.NoExtrasBaseModel
Conjunction (ANDs) composite analyzer joining multiple analyzers.
- type: Literal[AlgorithmType]#
- analyzerIds: typing_extensions.Annotated[str, Field(title='AnalyzerIds', description='The corresponding analyzer IDs for the conjunction.', pattern='^[A-Za-z0-9_\\-]+$')]#
- class monitor_schema.models.analyzer.algorithms.DisjunctionConfig[source]#
Bases:
monitor_schema.models.commons.NoExtrasBaseModel
Disjunction (ORs) composite analyzer joining multiple analyzers.
- type: Literal[AlgorithmType]#
- analyzerIds: typing_extensions.Annotated[str, Field(title='AnalyzerIds', description='The corresponding analyzer IDs for the conjunction.', pattern='^[A-Za-z0-9_\\-]+$')]#
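The composite analyzers can be pictured as `all` and `any` over the results of the referenced analyzers. In the sketch below, the `results` mapping, the helper `combine`, and the treatment of `analyzerIds` as a list of strings are assumptions; only the ID pattern is taken from the field definition.

```python
import re
from typing import Mapping, Sequence

# Pattern copied from the analyzerIds field definition.
ANALYZER_ID_PATTERN = re.compile(r"^[A-Za-z0-9_\-]+$")


def combine(
    results: Mapping[str, bool], analyzer_ids: Sequence[str], kind: str
) -> bool:
    """Join per-analyzer anomaly flags with AND (conjunction) or OR (disjunction)."""
    for analyzer_id in analyzer_ids:
        if not ANALYZER_ID_PATTERN.match(analyzer_id):
            raise ValueError(f"invalid analyzer id: {analyzer_id!r}")
    flags = [results[analyzer_id] for analyzer_id in analyzer_ids]
    if kind == "conjunction":
        return all(flags)
    if kind == "disjunction":
        return any(flags)
    raise ValueError(f"unknown composite type: {kind}")


results = {"drift-analyzer": True, "missing_values": False}
ids = ["drift-analyzer", "missing_values"]
print(combine(results, ids, "conjunction"))
print(combine(results, ids, "disjunction"))
```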