monitor_schema.cli#

Console script for monitor_schema.

Classes#

`Analyzer`	Configuration for running an analysis.
`AnomalyFilter`	Filter the anomalies based on certain criteria. If the alerts are filtered down to 0, the monitor won't fire.
`BaselineType`	Supported baseline types.
`Cadence`	Cadence for an analyzer or monitor run.
`ColumnDataType`	Options for configuring data type for a column.
`ColumnDiscreteness`	Classifies a column as discrete or continuous.
`ColumnMatrix`	Define the matrix of columns and segments to fan out for monitoring.
`ColumnSchema`	Schema configuration for a column.
`DatasetMatrix`	Define the matrix of fields and segments to fan out for monitoring.
`DigestMode`	Config mode that indicates the monitor will send out a digest message.
`Document`	The main document that dictates how the monitor should be run. This document is managed by WhyLabs internally.
`DriftConfig`	An analyzer that detects distribution drift against a baseline.
`EntitySchema`	Schema definition of an entity.
`EveryAnomalyMode`	Config mode that indicates the monitor will send out individual messages per anomaly.
`FixedCadenceSchedule`	Support for scheduling based on a predefined cadence.
`GlobalAction`	Actions that are configured at the team/organization level.
`Granularity`	Supported granularity.
`Monitor`	Customer specified monitor configs.
`Segment`	A segment is a list of tags.
`SendEmail`	Action to send an email.
`SlackWebhook`	Action to send a Slack webhook.
`StddevConfig`	Calculates upper bounds and lower bounds based on stddev from a series of numbers.
`TargetLevel`	Which nested level we are targeting.
`TrailingWindowBaseline`	A dynamic trailing window.

Functions#

`main`(→ None)	Generates schema and example document JSON.
`_dump_json_yaml`(→ None)

Module Contents#

class monitor_schema.cli.Analyzer[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Configuration for running an analysis.

An analysis targets a metric (which may be a complex object) for one or more fields in one or more segments. The output is a list of ‘anomalies’ that may indicate issues with the data.

metadata: monitor_schema.models.commons.Metadata | None#

id: str#

displayName: str | None#

tags: Optional[List[constr(min_length=3, max_length=256, regex='[0-9a-zA-Z\\-_]')]]#

targetSize: int | None#

schedule: monitor_schema.models.commons.CronSchedule | monitor_schema.models.commons.FixedCadenceSchedule | None#

disabled: bool | None#

disableTargetRollup: bool | None#

targetMatrix: monitor_schema.models.analyzer.targets.ColumnMatrix | monitor_schema.models.analyzer.targets.DatasetMatrix | None#

dataReadinessDuration: str | None#

batchCoolDownPeriod: str | None#

backfillGracePeriodDuration: str | None#

config: monitor_schema.models.analyzer.algorithms.ConjunctionConfig | monitor_schema.models.analyzer.algorithms.DisjunctionConfig | monitor_schema.models.analyzer.algorithms.DiffConfig | monitor_schema.models.analyzer.algorithms.ComparisonConfig | monitor_schema.models.analyzer.algorithms.ListComparisonConfig | monitor_schema.models.analyzer.algorithms.FrequentStringComparisonConfig | monitor_schema.models.analyzer.algorithms.ColumnListChangeConfig | monitor_schema.models.analyzer.algorithms.FixedThresholdsConfig | monitor_schema.models.analyzer.algorithms.StddevConfig | monitor_schema.models.analyzer.algorithms.DriftConfig | monitor_schema.models.analyzer.algorithms.ExperimentalConfig | monitor_schema.models.analyzer.algorithms.SeasonalConfig#

class Config[source]#

Updates JSON schema anyOf to oneOf.

static schema_extra(schema: Dict[str, Any], model: pydantic.BaseModel) → None[source]#: Update specific fields here (for Union type, specifically).
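The `schema_extra` hook is described here only as switching `anyOf` to `oneOf` for Union-typed fields. A minimal sketch of that kind of rewrite on a plain dict is shown below. The helper name `any_of_to_one_of` and the sample schema fragment are hypothetical and are not taken from monitor_schema.

```python
from typing import Any, Dict


def any_of_to_one_of(schema: Dict[str, Any]) -> None:
    """Rename 'anyOf' to 'oneOf' in each top-level property, in place."""
    for prop in schema.get("properties", {}).values():
        if "anyOf" in prop:
            prop["oneOf"] = prop.pop("anyOf")


# Hypothetical fragment resembling what a Union-typed field could produce.
example_schema = {
    "properties": {
        "schedule": {
            "anyOf": [
                {"$ref": "#/definitions/CronSchedule"},
                {"$ref": "#/definitions/FixedCadenceSchedule"},
            ]
        },
        "id": {"type": "string"},
    }
}
any_of_to_one_of(example_schema)
```

With `oneOf`, a JSON Schema validator requires a value to match exactly one of the listed alternatives, whereas `anyOf` accepts a match against one or more.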

class monitor_schema.cli.AnomalyFilter[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Filter the anomalies based on certain criteria. If the alerts are filtered down to 0, the monitor won’t fire.

includeColumns: List[monitor_schema.models.utils.COLUMN_NAME_TYPE] | None#

excludeColumns: List[monitor_schema.models.utils.COLUMN_NAME_TYPE] | None#

minWeight: float | None#

maxWeight: float | None#

minRankByWeight: int | None#

maxRankByWeight: int | None#

minTotalWeight: float | None#

maxTotalWeight: float | None#

minAlertCount: int | None#

maxAlertCount: int | None#

includeMetrics: List[monitor_schema.models.utils.METRIC_NAME_STR] | None#
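The field semantics are not spelled out on this page. As a rough sketch only, assume `includeColumns` keeps listed columns, `excludeColumns` drops listed columns, and `minWeight` is a lower bound on a per-column weight. The anomaly dict shape and the helper `filter_anomalies` are hypothetical.

```python
from typing import Dict, List, Optional


def filter_anomalies(
    anomalies: List[Dict],
    include_columns: Optional[List[str]] = None,
    exclude_columns: Optional[List[str]] = None,
    min_weight: Optional[float] = None,
) -> List[Dict]:
    """Keep anomalies that pass every configured criterion (assumed semantics)."""
    kept = []
    for anomaly in anomalies:
        if include_columns is not None and anomaly["column"] not in include_columns:
            continue
        if exclude_columns is not None and anomaly["column"] in exclude_columns:
            continue
        if min_weight is not None and anomaly["weight"] < min_weight:
            continue
        kept.append(anomaly)
    return kept


anomalies = [
    {"column": "age", "weight": 0.9},
    {"column": "zip_code", "weight": 0.1},
]
remaining = filter_anomalies(anomalies, exclude_columns=["age"], min_weight=0.5)
monitor_fires = len(remaining) > 0  # nothing left after filtering means no notification
```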

class monitor_schema.cli.BaselineType[source]#

Bases: str, enum.Enum

Supported baseline types.

BatchTimestamp = 'BatchTimestamp'#

Reference = 'Reference'#

TrailingWindow = 'TrailingWindow'#

TimeRange = 'TimeRange'#

CurrentBatch = 'CurrentBatch'#

class monitor_schema.cli.Cadence[source]#

Bases: str, enum.Enum

Cadence for an analyzer or monitor run.

hourly = 'hourly'#

daily = 'daily'#

weekly = 'weekly'#

monthly = 'monthly'#

class monitor_schema.cli.ColumnDataType[source]#

Bases: str, enum.Enum

Options for configuring data type for a column.

integral = 'integral'#

fractional = 'fractional'#

boolean = 'bool'#

string = 'string'#

unknown = 'unknown'#

null = 'null'#
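The enums on this page derive from both `str` and `enum.Enum`, so members compare equal to their string values and serialize as plain strings. Note that the `boolean` member carries the value `'bool'`, so lookup by name and lookup by value use different spellings. The class below is a local stand-in that mirrors the listing above and is not imported from monitor_schema.

```python
import enum
import json


class ColumnDataType(str, enum.Enum):
    """Local stand-in mirroring the members listed above."""

    integral = "integral"
    fractional = "fractional"
    boolean = "bool"
    string = "string"
    unknown = "unknown"
    null = "null"


by_value = ColumnDataType("bool")    # lookup by serialized value
by_name = ColumnDataType["boolean"]  # lookup by member name
serialized = json.dumps({"dataType": ColumnDataType.boolean})
```

Calling `ColumnDataType("boolean")` raises `ValueError`, because `"boolean"` is a member name and not a value.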

class monitor_schema.cli.ColumnDiscreteness[source]#

Bases: str, enum.Enum

Classifies a column as discrete or continuous.

discrete = 'discrete'#

continuous = 'continuous'#

class monitor_schema.cli.ColumnMatrix[source]#

Bases: _BaseMatrix

Define the matrix of columns and segments to fan out for monitoring.

type: Literal[TargetLevel]#

include: List[ColumnGroups | monitor_schema.models.utils.COLUMN_NAME_TYPE] | None#

exclude: List[ColumnGroups | monitor_schema.models.utils.COLUMN_NAME_TYPE] | None#

profileId: str | None#

class monitor_schema.cli.ColumnSchema[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Schema configuration for a column.

Generated by WhyLabs initially, but users can override it.

discreteness: ColumnDiscreteness#

dataType: ColumnDataType#

classifier: str | None#

class monitor_schema.cli.DatasetMatrix[source]#

Bases: _BaseMatrix

Define the matrix of fields and segments to fan out for monitoring.

type: Literal[TargetLevel]#

class monitor_schema.cli.DigestMode[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Config mode that indicates the monitor will send out a digest message.

type: Literal['DIGEST']#

filter: AnomalyFilter | None#

creationTimeOffset: str | None#

datasetTimestampOffset: str | None#

groupBy: List[DigestModeGrouping] | None#

class monitor_schema.cli.Document[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

The main document that dictates how the monitor should be run. This document is managed by WhyLabs internally.

id: uuid.UUID | None#

schemaVersion: Literal[1]#

metadata: monitor_schema.models.commons.Metadata | None#

orgId: str#

datasetId: str#

granularity: Granularity#

allowPartialTargetBatches: bool | None#

entitySchema: monitor_schema.models.column_schema.EntitySchema | None#

weightConfig: monitor_schema.models.column_schema.EntityWeights | None#

analyzers: List[monitor_schema.models.analyzer.Analyzer]#

monitors: List[monitor_schema.models.monitor.Monitor]#
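As an illustration of the document shape, the sketch below builds a plain dict using only field names and literal values that appear on this page (`'fixed'`, `'daily'`, `'DIGEST'`, `'email'`). The ids, org and dataset names, and the email address are placeholders. The `analyzers` list is left empty because the algorithm `type` literals are not spelled out here, so the monitor's `analyzerIds` entry is a dangling placeholder. Nothing is validated against monitor_schema.

```python
import json

monitor = {
    "id": "example-monitor",              # placeholder id
    "analyzerIds": ["example-analyzer"],  # placeholder; no such analyzer is defined below
    "schedule": {"type": "fixed", "cadence": "daily"},
    "mode": {"type": "DIGEST"},
    "actions": [{"type": "email", "target": "alerts@example.com"}],
}

document = {
    "schemaVersion": 1,
    "orgId": "org-0",        # placeholder
    "datasetId": "model-1",  # placeholder
    "granularity": "daily",
    "analyzers": [],
    "monitors": [monitor],
}

document_json = json.dumps(document, indent=2)
```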

class monitor_schema.cli.DriftConfig[source]#

Bases: AlgorithmConfig

An analyzer that detects distribution drift against a baseline.

This analysis will detect whether the data drifts or not. By default, we use hellinger distance with a threshold of 0.7.

type: Literal[AlgorithmType]#

algorithm: Literal['hellinger', 'jensenshannon', 'kl_divergence', 'psi']#

metric: Literal[ComplexMetrics, ComplexMetrics]#

threshold: float#

minBatchSize: int | None#

baseline: monitor_schema.models.analyzer.baseline.TrailingWindowBaseline | monitor_schema.models.analyzer.baseline.ReferenceProfileId | monitor_schema.models.analyzer.baseline.TimeRangeBaseline | monitor_schema.models.analyzer.baseline.SingleBatchBaseline#
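To make the default concrete, the following sketch computes the Hellinger distance between two discrete distributions and compares it with the 0.7 threshold quoted above. It assumes both distributions share the same bins and that a distance above the threshold counts as drift. How monitor_schema's backend bins data and applies the threshold is not documented on this page.

```python
import math
from typing import Sequence


def hellinger(p: Sequence[float], q: Sequence[float]) -> float:
    """Hellinger distance between two discrete distributions over the same bins."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


THRESHOLD = 0.7  # default threshold quoted in the docstring above

baseline = [0.5, 0.5, 0.0]  # made-up baseline distribution
target = [0.0, 0.1, 0.9]    # made-up target distribution
distance = hellinger(baseline, target)
is_drift = distance > THRESHOLD  # assumed comparison direction
```

The distance ranges from 0 for identical distributions to 1 for distributions with no overlapping mass.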

class monitor_schema.cli.EntitySchema[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Schema definition of an entity.

metadata: monitor_schema.models.commons.Metadata | None#

columns: Dict[monitor_schema.models.utils.COLUMN_NAME_TYPE, ColumnSchema]#

class monitor_schema.cli.EveryAnomalyMode[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Config mode that indicates the monitor will send out individual messages per anomaly.

type: Literal['EVERY_ANOMALY']#

filter: AnomalyFilter | None#

class monitor_schema.cli.FixedCadenceSchedule[source]#

Bases: NoExtrasBaseModel

Support for scheduling based on a predefined cadence.

type: Literal['fixed']#

cadence: Literal[Cadence, Cadence, Cadence, Cadence]#

exclusionRanges: List[TimeRange] | None#

class monitor_schema.cli.GlobalAction[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Actions that are configured at the team/organization level.

type: Literal['global']#

target: str#

class monitor_schema.cli.Granularity[source]#

Bases: str, enum.Enum

Supported granularity.

hourly = 'hourly'#

daily = 'daily'#

weekly = 'weekly'#

monthly = 'monthly'#

class monitor_schema.cli.Monitor[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Customer specified monitor configs.

metadata: monitor_schema.models.commons.Metadata | None#

id: str#

displayName: str | None#

tags: Optional[List[constr(min_length=3, max_length=256, regex='[0-9a-zA-Z\\-_]')]]#

analyzerIds: List[constr(regex='^[A-Za-z0-9_\\-]+$')]#

schedule: monitor_schema.models.commons.FixedCadenceSchedule | monitor_schema.models.commons.CronSchedule | monitor_schema.models.commons.ImmediateSchedule#

disabled: bool | None#

severity: int | None#

mode: EveryAnomalyMode | DigestMode#

actions: List[GlobalAction | SendEmail | SlackWebhook | RawWebhook]#

class Config[source]#

Updates JSON schema anyOf to oneOf.

static schema_extra(schema: Dict[str, Any], model: pydantic.BaseModel) → None[source]#: Update specific fields here (for Union type, specifically).
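The `tags` and `analyzerIds` patterns differ in anchoring. Assuming the constraint is applied with `re.match`, as in pydantic 1.x, the unanchored `tags` pattern only checks the first character, while the `analyzerIds` pattern checks the whole string. The sketch below reproduces both checks with the standard library; the helper names are hypothetical.

```python
import re

TAG_PATTERN = re.compile(r"[0-9a-zA-Z\-_]")             # pattern shown for `tags`
ANALYZER_ID_PATTERN = re.compile(r"^[A-Za-z0-9_\-]+$")  # pattern shown for `analyzerIds`


def tag_ok(tag: str) -> bool:
    """Length bounds plus a single-character match at the start of the string."""
    return 3 <= len(tag) <= 256 and TAG_PATTERN.match(tag) is not None


def analyzer_id_ok(value: str) -> bool:
    """Every character must belong to the allowed set."""
    return ANALYZER_ID_PATTERN.match(value) is not None
```

Under these assumptions `tag_ok("team a!")` passes even though it contains a space and `!`, whereas `analyzer_id_ok("drift analyzer")` fails.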

class monitor_schema.cli.Segment[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

A segment is a list of tags.

We normalize these in the backend.

tags: List[SegmentTag]#

class monitor_schema.cli.SendEmail[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Action to send an email.

type: Literal['email']#

target: str#

class monitor_schema.cli.SlackWebhook[source]#

Bases: monitor_schema.models.commons.NoExtrasBaseModel

Action to send a Slack webhook.

type: Literal['slack']#

target: pydantic.HttpUrl#

class monitor_schema.cli.StddevConfig[source]#

Bases: _ThresholdBaseConfig

Calculates upper bounds and lower bounds based on stddev from a series of numbers.

An analyzer using stddev over a window of time.

This calculation falls back to a Poisson distribution if there is only 1 value in the baseline. For 2 or more values, we use the sample standard deviation: sqrt(sum((x_i - avg(x))^2) / (n - 1))

type: Literal[AlgorithmType]#

factor: float | None#

minBatchSize: int | None#

baseline: monitor_schema.models.analyzer.baseline.TrailingWindowBaseline | monitor_schema.models.analyzer.baseline.TimeRangeBaseline | monitor_schema.models.analyzer.baseline.ReferenceProfileId#
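A minimal sketch of the bound calculation follows, assuming the bounds are the baseline mean plus or minus `factor` times the sample standard deviation. The Poisson fallback for a single baseline value is not modeled, and the helper name and sample numbers are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def stddev_bounds(baseline: Sequence[float], factor: float) -> Tuple[float, float]:
    """Return (lower, upper) as mean -/+ factor * sample stddev."""
    if len(baseline) < 2:
        raise ValueError("sample stddev needs at least 2 values")
    mean = statistics.fmean(baseline)
    spread = factor * statistics.stdev(baseline)  # stdev divides by n - 1
    return mean - spread, mean + spread


lower, upper = stddev_bounds([10.0, 12.0, 11.0, 13.0, 9.0], factor=3.0)
target_value = 25.0  # made-up value for the batch under analysis
is_anomaly = not (lower <= target_value <= upper)
```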

class monitor_schema.cli.TargetLevel[source]#

Bases: str, enum.Enum

Which nested level we are targeting.

dataset = 'dataset'#

column = 'column'#

class monitor_schema.cli.TrailingWindowBaseline[source]#

Bases: _SegmentBaseline

A dynamic trailing window.

This is useful if you don’t have a static baseline to monitor against. This is the default mode for most monitors.

type: Literal[BaselineType]#

size: int#

offset: int | None#

exclusionRanges: List[monitor_schema.models.commons.TimeRange] | None#
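The meaning of `size` and `offset` is not documented on this page. As an illustration only, the sketch below assumes `size` is the number of batches in the window and `offset` is the number of most recent batches to skip before the window starts. The helper `trailing_window` is hypothetical, and `exclusionRanges` is not modeled.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def trailing_window(history: Sequence[T], size: int, offset: Optional[int] = None) -> List[T]:
    """Select up to `size` batches before the target, skipping `offset` batches first.

    `history` is ordered oldest to newest and excludes the target batch.
    """
    skip = offset or 0
    end = max(len(history) - skip, 0)
    start = max(end - size, 0)
    return list(history[start:end])


days = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
window = trailing_window(days, size=3)
shifted = trailing_window(days, size=3, offset=2)
```

Because the window moves with each new batch, the baseline tracks recent data without requiring a fixed reference profile.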

monitor_schema.cli.main() → None#: Generates schema and example document JSON.

monitor_schema.cli._dump_json_yaml(file_name: str, json_content: str) → None[source]#