monitor_schema.models.analyzer.algorithms#

Collections of support algorithms.

Module Contents#

Classes#

AlgorithmType

Specify the algorithm type.

DatasetMetric

Metrics that are applicable at the dataset level.

SimpleColumnMetric

Simple column metrics that resolve to a single number.

ComplexMetrics

Sketch-based metrics that can only be processed by certain algorithms.

AlgorithmConfig

Base algorithm config.

ExpectedValue

Expected value: one of these fields must be set.

ComparisonOperator

Operators for performing a comparison.

ListComparisonOperator

Operators for performing a list comparison.

ComparisonConfig

Compare the target against a fixed value or against a baseline's metric.

ListComparisonConfig

Compare the target against a list of values.

ColumnListChangeConfig

Detect changes (additions or removals) in the column list relative to a baseline.

FixedThresholdsConfig

Fixed threshold analysis.

_ThresholdBaseConfig

Base config for threshold-based algorithms.

StddevConfig

Calculates upper bounds and lower bounds based on stddev from a series of numbers.

SeasonalConfig

An analyzer using stddev over a window of time.

DriftConfig

An analyzer that detects whether the target's distribution drifts from the baseline.

ExperimentalConfig

Experimental algorithm that has not yet been standardized into one of the configurations above.

DiffMode

Whether to use the absolute difference or the percentage to calculate the difference.

ThresholdType

Threshold Type declaring the upper and lower bound.

DiffConfig

Detecting the differences between two numerical metrics.

class monitor_schema.models.analyzer.algorithms.AlgorithmType[source]#

Bases: str, enum.Enum

Specify the algorithm type.

expected = expected#
column_list = column_list#
comparison = comparison#
list_comparison = list_comparison#
diff = diff#
drift = drift#
stddev = stddev#
seasonal = seasonal#
fixed = fixed#
experimental = experimental#
class monitor_schema.models.analyzer.algorithms.DatasetMetric[source]#

Bases: str, enum.Enum

Metrics that are applicable at the dataset level.

profile_count = profile.count#
profile_last_ingestion_time = profile.last_ingestion_time#
profile_first_ingestion_time = profile.first_ingestion_time#
column_row_count_sum = column_row_count_sum#
shape_column_count = shape_column_count#
shape_row_count = shape_row_count#
input_count = input.count#
output_count = output.count#
classification_f1 = classification.f1#
classification_precision = classification.precision#
classification_recall = classification.recall#
classification_accuracy = classification.accuracy#
classification_auc = classification.auc#
regression_mse = regression.mse#
regression_mae = regression.mae#
regression_rmse = regression.rmse#
class monitor_schema.models.analyzer.algorithms.SimpleColumnMetric[source]#

Bases: str, enum.Enum

Simple column metrics that resolve to a single number.

count = count#
median = median#
max = max#
min = min#
mean = mean#
stddev = stddev#
variance = variance#
unique_upper = unique_upper#
unique_upper_ratio = unique_upper_ratio#
unique_est = unique_est#
unique_est_ratio = unique_est_ratio#
unique_lower = unique_lower#
unique_lower_ratio = unique_lower_ratio#
count_bool = count_bool#
count_bool_ratio = count_bool_ratio#
count_integral = count_integral#
count_integral_ratio = count_integral_ratio#
count_fractional = count_fractional#
count_fractional_ratio = count_fractional_ratio#
count_string = count_string#
count_string_ratio = count_string_ratio#
count_null = count_null#
count_null_ratio = count_null_ratio#
inferred_data_type = inferred_data_type#
quantile_p5 = quantile_5#
quantile_p75 = quantile_75#
quantile_p25 = quantile_25#
quantile_p90 = quantile_90#
quantile_p95 = quantile_95#
quantile_p99 = quantile_99#
class monitor_schema.models.analyzer.algorithms.ComplexMetrics[source]#

Bases: str, enum.Enum

Sketch-based metrics that can only be processed by certain algorithms.

histogram = histogram#
frequent_items = frequent_items#
unique_sketch = unique_sketch#
column_list = column_list#
class monitor_schema.models.analyzer.algorithms.AlgorithmConfig(**data: Any)[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Base algorithm config.

class Config[source]#

Updates JSON schema anyOf to oneOf for baseline.

static schema_extra(schema: Dict[str, Any], model: pydantic.BaseModel) None[source]#

Update specific fields here (for Union type, specifically).

schemaVersion :Optional[int]#
params :Optional[Dict[constr(max_length=100), constr(max_length=1000)]]#
metric :Union[DatasetMetric, SimpleColumnMetric, constr(max_length=100)]#
class monitor_schema.models.analyzer.algorithms.ExpectedValue(**data: Any)[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Expected value: one of these fields must be set.

str :Optional[constr(max_length=100)]#
int :Optional[int]#
float :Optional[float]#
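
A minimal sketch of constructing an ExpectedValue using only the fields listed above; exactly one of the str/int/float fields should be populated, and they are passed as keyword arguments because the names shadow Python built-ins:

    from monitor_schema.models.analyzer.algorithms import ExpectedValue

    # Expect the integer value 0 (only one of the three fields is set).
    expected_zero = ExpectedValue(int=0)

    # Expect a string value, e.g. for an inferred data type comparison.
    # "FRACTIONAL" is an illustrative value, not taken from this module.
    expected_type = ExpectedValue(str="FRACTIONAL")
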
class monitor_schema.models.analyzer.algorithms.ComparisonOperator[source]#

Bases: str, enum.Enum

Operators for performing a comparison.

eq = eq#
gt = gt#
lt = lt#
ge = ge#
le = le#
class monitor_schema.models.analyzer.algorithms.ListComparisonOperator[source]#

Bases: str, enum.Enum

Operators for performing a list comparison.

eq = in#
gt = not_in#
class monitor_schema.models.analyzer.algorithms.ComparisonConfig(**data: Any)[source]#

Bases: AlgorithmConfig

Compare the target against a fixed value or against a baseline's metric.

This is useful to detect data type change, for instance.

type :Literal[AlgorithmType]#
operator :ComparisonOperator#
expected :Optional[ExpectedValue]#
baseline :Optional[Union[monitor_schema.models.analyzer.baseline.TrailingWindowBaseline, monitor_schema.models.analyzer.baseline.ReferenceProfileId, monitor_schema.models.analyzer.baseline.TimeRangeBaseline, monitor_schema.models.analyzer.baseline.SingleBatchBaseline]]#
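
A hedged sketch of a ComparisonConfig that compares a column's inferred data type to an expected value; the AlgorithmType.comparison member for the type field and the "FRACTIONAL" string are assumptions made for illustration:

    from monitor_schema.models.analyzer.algorithms import (
        AlgorithmType,
        ComparisonConfig,
        ComparisonOperator,
        ExpectedValue,
        SimpleColumnMetric,
    )

    # Compare the column's inferred data type to an expected value.
    config = ComparisonConfig(
        type=AlgorithmType.comparison,
        metric=SimpleColumnMetric.inferred_data_type,
        operator=ComparisonOperator.eq,
        expected=ExpectedValue(str="FRACTIONAL"),  # illustrative expected type
    )
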
class monitor_schema.models.analyzer.algorithms.ListComparisonConfig(**data: Any)[source]#

Bases: AlgorithmConfig

Compare the target against a list of values.

type :Literal[AlgorithmType]#
operator :ListComparisonOperator#
expected :Optional[List[ExpectedValue]]#
baseline :Optional[Union[monitor_schema.models.analyzer.baseline.TrailingWindowBaseline, monitor_schema.models.analyzer.baseline.ReferenceProfileId, monitor_schema.models.analyzer.baseline.TimeRangeBaseline, monitor_schema.models.analyzer.baseline.SingleBatchBaseline]]#
class monitor_schema.models.analyzer.algorithms.ColumnListChangeConfig(**data: Any)[source]#

Bases: AlgorithmConfig

Detect changes (additions or removals) in the column list relative to a baseline.

This is useful to catch schema changes, such as a dropped or newly added column.

type :Literal[AlgorithmType]#
mode :Literal[ON_ADD_AND_REMOVE, ON_ADD, ON_REMOVE] = ON_ADD_AND_REMOVE#
metric :Literal[ComplexMetrics]#
exclude :Optional[List[monitor_schema.models.utils.COLUMN_NAME_TYPE]]#
baseline :Union[monitor_schema.models.analyzer.baseline.TrailingWindowBaseline, monitor_schema.models.analyzer.baseline.ReferenceProfileId, monitor_schema.models.analyzer.baseline.TimeRangeBaseline, monitor_schema.models.analyzer.baseline.SingleBatchBaseline]#
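
A sketch of a ColumnListChangeConfig that watches for removed columns against a trailing-window baseline; the AlgorithmType.column_list and ComplexMetrics.column_list members, the excluded column name, and the size parameter on TrailingWindowBaseline are assumptions based on the listings above:

    from monitor_schema.models.analyzer.algorithms import (
        AlgorithmType,
        ColumnListChangeConfig,
        ComplexMetrics,
    )
    from monitor_schema.models.analyzer.baseline import TrailingWindowBaseline

    # Alert only when columns disappear, ignoring a known-volatile column.
    config = ColumnListChangeConfig(
        type=AlgorithmType.column_list,
        mode="ON_REMOVE",
        metric=ComplexMetrics.column_list,
        exclude=["debug_flags"],  # hypothetical column name
        baseline=TrailingWindowBaseline(size=7),  # size is an assumed parameter
    )
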
class monitor_schema.models.analyzer.algorithms.FixedThresholdsConfig(**data: Any)[source]#

Bases: AlgorithmConfig

Fixed threshold analysis.

If the user sets neither the upper bound nor the lower bound, this algorithm becomes a no-op. WhyLabs might enforce the presence of at least one of these fields in the future.

type :Literal[AlgorithmType]#
upper :Optional[float]#
lower :Optional[float]#
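
A minimal sketch of a FixedThresholdsConfig; the AlgorithmType.fixed member for the type field is an assumption based on the enum above. Because both bounds are optional, at least one of upper or lower should be provided to avoid the no-op behavior noted above:

    from monitor_schema.models.analyzer.algorithms import (
        AlgorithmType,
        FixedThresholdsConfig,
        SimpleColumnMetric,
    )

    # Flag batches whose null-count ratio leaves the [0.0, 0.05] band.
    config = FixedThresholdsConfig(
        type=AlgorithmType.fixed,
        metric=SimpleColumnMetric.count_null_ratio,
        lower=0.0,
        upper=0.05,
    )
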
class monitor_schema.models.analyzer.algorithms._ThresholdBaseConfig(**data: Any)[source]#

Bases: AlgorithmConfig

Base config for threshold-based algorithms.

maxUpperThreshold :Optional[float]#
minLowerThreshold :Optional[float]#
class monitor_schema.models.analyzer.algorithms.StddevConfig(**data: Any)[source]#

Bases: _ThresholdBaseConfig

Calculates upper and lower bounds based on the stddev of a series of numbers.

An analyzer using stddev over a window of time.

This calculation falls back to a Poisson distribution if there is only one value in the baseline. For two or more values, the sample standard deviation sqrt(sum((x_i - avg(x))^2) / (n - 1)) is used.

type :Literal[AlgorithmType]#
factor :Optional[float]#
minBatchSize :Optional[int]#
baseline :Union[monitor_schema.models.analyzer.baseline.TrailingWindowBaseline, monitor_schema.models.analyzer.baseline.TimeRangeBaseline, monitor_schema.models.analyzer.baseline.ReferenceProfileId]#
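
A sketch of a StddevConfig that bounds a column's median at the baseline mean plus or minus three standard deviations; the AlgorithmType.stddev member and the size parameter on TrailingWindowBaseline are assumptions:

    from monitor_schema.models.analyzer.algorithms import (
        AlgorithmType,
        SimpleColumnMetric,
        StddevConfig,
    )
    from monitor_schema.models.analyzer.baseline import TrailingWindowBaseline

    # Upper/lower bounds are derived from the stddev of the trailing 14 batches,
    # scaled by factor=3.0; at least 7 baseline batches are required.
    config = StddevConfig(
        type=AlgorithmType.stddev,
        metric=SimpleColumnMetric.median,
        factor=3.0,
        minBatchSize=7,
        baseline=TrailingWindowBaseline(size=14),  # size is an assumed parameter
    )
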
class monitor_schema.models.analyzer.algorithms.SeasonalConfig(**data: Any)[source]#

Bases: _ThresholdBaseConfig

An analyzer using stddev over a window of time.

This will fall back to Poisson distribution if there is only 1 value in the baseline.

This only works with a TrailingWindow baseline (TODO: add backend validation).

type :Literal[AlgorithmType]#
algorithm :Literal[arima, rego, stastforecast]#
minBatchSize :Optional[int]#
alpha :Optional[float]#
baseline :monitor_schema.models.analyzer.baseline.TrailingWindowBaseline#
stddevTimeRanges :Optional[List[monitor_schema.models.commons.TimeRange]]#
stddevMaxBatchSize :Optional[int]#
stddevFactor :Optional[float]#
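
A hedged sketch of a SeasonalConfig forecasting an expected range for daily profile counts; the AlgorithmType.seasonal member, the alpha value, and the size parameter on TrailingWindowBaseline are assumptions made for illustration:

    from monitor_schema.models.analyzer.algorithms import (
        AlgorithmType,
        DatasetMetric,
        SeasonalConfig,
    )
    from monitor_schema.models.analyzer.baseline import TrailingWindowBaseline

    # Forecast an expected range for the profile count from a 30-batch
    # trailing window, requiring at least 14 baseline batches.
    config = SeasonalConfig(
        type=AlgorithmType.seasonal,
        algorithm="arima",
        metric=DatasetMetric.profile_count,
        minBatchSize=14,
        alpha=0.05,  # illustrative value
        baseline=TrailingWindowBaseline(size=30),  # size is an assumed parameter
    )
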
class monitor_schema.models.analyzer.algorithms.DriftConfig(**data: Any)[source]#

Bases: AlgorithmConfig

An analyzer that detects whether the target's distribution drifts from the baseline.

By default, Hellinger distance is used with a threshold of 0.7.

type :Literal[AlgorithmType]#
algorithm :Literal[hellinger, ks_test, kl_divergence, variation_distance]#
metric :Literal[ComplexMetrics, ComplexMetrics]#
threshold :float#
minBatchSize :Optional[int]#
baseline :Union[monitor_schema.models.analyzer.baseline.TrailingWindowBaseline, monitor_schema.models.analyzer.baseline.ReferenceProfileId, monitor_schema.models.analyzer.baseline.TimeRangeBaseline, monitor_schema.models.analyzer.baseline.SingleBatchBaseline]#
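
A sketch of a DriftConfig matching the documented default of Hellinger distance with a 0.7 threshold; the AlgorithmType.drift and ComplexMetrics.histogram members and the size parameter on TrailingWindowBaseline are assumptions:

    from monitor_schema.models.analyzer.algorithms import (
        AlgorithmType,
        ComplexMetrics,
        DriftConfig,
    )
    from monitor_schema.models.analyzer.baseline import TrailingWindowBaseline

    # Compare each batch's histogram to the trailing 7 batches; Hellinger
    # distances above 0.7 are treated as drift.
    config = DriftConfig(
        type=AlgorithmType.drift,
        algorithm="hellinger",
        metric=ComplexMetrics.histogram,
        threshold=0.7,
        baseline=TrailingWindowBaseline(size=7),  # size is an assumed parameter
    )
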
class monitor_schema.models.analyzer.algorithms.ExperimentalConfig(**data: Any)[source]#

Bases: AlgorithmConfig

Experimental algorithm that has not yet been standardized into one of the configurations above.

type :Literal[AlgorithmType]#
implementation :str#
baseline :Union[monitor_schema.models.analyzer.baseline.TrailingWindowBaseline, monitor_schema.models.analyzer.baseline.ReferenceProfileId, monitor_schema.models.analyzer.baseline.TimeRangeBaseline, monitor_schema.models.analyzer.baseline.SingleBatchBaseline]#
stub :Optional[AlgorithmType]#
class monitor_schema.models.analyzer.algorithms.DiffMode[source]#

Bases: str, enum.Enum

Whether to use the absolute difference or the percentage to calculate the difference.

abs = abs#
pct = pct#
class monitor_schema.models.analyzer.algorithms.ThresholdType[source]#

Bases: str, enum.Enum

Threshold Type declaring the upper and lower bound.

By default, an anomaly is generated when the target is above or below the baseline by the specified threshold.

If it is only desirable to alert when the target is above the baseline and not the other way around, specify upper for your ThresholdType.

lower = lower#
upper = upper#
class monitor_schema.models.analyzer.algorithms.DiffConfig(**data: Any)[source]#

Bases: AlgorithmConfig

Detecting the differences between two numerical metrics.

type :Literal[AlgorithmType]#
mode :DiffMode#
thresholdType :Optional[ThresholdType]#
threshold :float#
baseline :Union[monitor_schema.models.analyzer.baseline.TrailingWindowBaseline, monitor_schema.models.analyzer.baseline.ReferenceProfileId, monitor_schema.models.analyzer.baseline.TimeRangeBaseline, monitor_schema.models.analyzer.baseline.SingleBatchBaseline]#
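
A hedged sketch of a DiffConfig that alerts when classification F1 falls below the trailing-window baseline by more than a given percentage; the AlgorithmType.diff member, the interpretation of threshold units under pct mode, and the size parameter on TrailingWindowBaseline are assumptions:

    from monitor_schema.models.analyzer.algorithms import (
        AlgorithmType,
        DatasetMetric,
        DiffConfig,
        DiffMode,
        ThresholdType,
    )
    from monitor_schema.models.analyzer.baseline import TrailingWindowBaseline

    # Alert only on drops (lower threshold type) of more than 10% versus the
    # 7-batch trailing baseline; threshold units are assumed to be percent.
    config = DiffConfig(
        type=AlgorithmType.diff,
        mode=DiffMode.pct,
        thresholdType=ThresholdType.lower,
        threshold=10.0,
        metric=DatasetMetric.classification_f1,
        baseline=TrailingWindowBaseline(size=7),  # size is an assumed parameter
    )
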