Env Var Names
Entries
A JSON-formatted list of string model IDs. Profiles are delivered for every model ID in this list, regardless of whether any data was sent to the container for it. This is useful for telling container issues apart from legitimately not receiving data.
Disables the password auth header that the container uses to validate requests. Some integration paths make it difficult to send custom header values, and for some use cases VPC privacy may be an adequate substitute.
A string password that the container requires for each request in the X-API-Key header. This is a safeguard if you need to expose the container to the internet or other untrusted sources.
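As a minimal sketch of what a caller has to do when the password is enabled, the following builds (but does not send) a request carrying the X-API-Key header. The URL, path, and password are hypothetical placeholders, not values documented here.

```python
import urllib.request

# Hypothetical placeholders: neither the URL/path nor the password comes from this page.
CONTAINER_URL = "http://localhost:8080/logs"
PASSWORD = "example-password"


def build_request(payload: bytes) -> urllib.request.Request:
    """Build a POST request that carries the password in the X-API-Key header."""
    return urllib.request.Request(
        CONTAINER_URL,
        data=payload,
        headers={"X-API-Key": PASSWORD, "Content-Type": "application/json"},
        method="POST",
    )


# The request is only constructed here; nothing is sent over the network.
request = build_request(b"{}")
```

Note that urllib stores header names in capitalized form ("X-api-key"); HTTP header names are case-insensitive, so this is equivalent on the wire.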
Controls the data structure that stores profiles before they are uploaded. The default, WriteLayer.IN_MEMORY, uses an in-memory map.
Only applies when UPLOAD_DESTINATION is set to WriterTypes.DEBUG_FILE_SYSTEM. Controls the directory that whylogs profiles are written to, relative to the local directory.
A JSON list of keys to ignore when logging. Any column in a data message that matches one of these keys won't be logged, which avoids having to strip out data as a preprocessing step. In Kafka mode, matching keys are dropped from the consumed messages. With the REST interface, matching columns in the request's single or multiple payload are dropped.
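The effect of the ignored keys can be sketched roughly as follows. This is an illustration of the described behavior, not the container's implementation; the function name and the sample keys are made up for the example.

```python
import json


def drop_ignored_keys(message: dict, ignored_keys_json: str) -> dict:
    """Return a copy of the message without any column named in the JSON list."""
    ignored = set(json.loads(ignored_keys_json))
    return {key: value for key, value in message.items() if key not in ignored}


# Hypothetical configuration value and data message.
ignored_keys_json = '["user_id", "session_token"]'
message = {"user_id": 42, "session_token": "abc", "price": 9.99}

filtered = drop_ignored_keys(message, ignored_keys_json)
```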
How frequently the container should upload profiles. This defaults to the same cadence as the model definition: for an hourly model, profiles are uploaded on an hourly basis. If this is set to ProfileWritePeriod.HOURS then profiles are uploaded every hour.
Determines where profiles are uploaded. Must be a string value of WriterTypes. Other config values become required depending on this value.
The URL to upload profiles to. Useful for debugging, or potentially for standing up a WhyLabs-compatible endpoint that you can customize (how to do that is not yet documented). Only applies if configured for uploads to WhyLabs.
A WhyLabs API key for your account. Only applies if configured for uploads to WhyLabs.
The period to use for log rotation. This can be HOURLY or DAILY. It determines how data is grouped into profiles. If you're using WhyLabs then this should match the model's type.
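One way to picture how the rotation period groups data: each record's timestamp is truncated to the start of its hour or day, and records sharing a bucket end up in the same profile. The sketch below assumes UTC timestamps and simple truncation; the container's actual grouping logic may differ, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def bucket_start(timestamp: datetime, period: str) -> datetime:
    """Truncate a timestamp to the start of its HOURLY or DAILY bucket."""
    if period == "HOURLY":
        return timestamp.replace(minute=0, second=0, microsecond=0)
    if period == "DAILY":
        return timestamp.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported period: {period}")


ts = datetime(2021, 3, 4, 15, 42, 7, tzinfo=timezone.utc)
hourly_bucket = bucket_start(ts, "HOURLY")
daily_bucket = bucket_start(ts, "DAILY")
```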
A JSON-formatted list of host:port servers. See Kafka Bootstrap Servers. Example: ["http://localhost:9092"]
The container uses a single group ID for all of its consumers. See Kafka Group Ids.
Set this to true if you want the container to use its Kafka config. By default, none of the Kafka options have any effect.
A JSON-formatted list of topic names to consume. Each consumer subscribes to the full list.
A JSON map from Kafka topics to WhyLabs dataset IDs. Applies when UPLOAD_DESTINATION is set to WriterTypes.WHYLABS.
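Because the topic list and the topic-to-dataset map are configured separately, it can be worth checking that they agree before deploying. The sketch below is a hypothetical pre-flight check, not something the container is documented to do; the topic names and dataset IDs are placeholders.

```python
import json


def unmapped_topics(topics_json: str, mapping_json: str) -> list:
    """Return the subscribed topics that have no dataset ID in the map."""
    topics = json.loads(topics_json)
    mapping = json.loads(mapping_json)
    return [topic for topic in topics if topic not in mapping]


# Hypothetical configuration values.
kafka_topics_json = '["orders", "payments"]'
topic_map_json = '{"orders": "model-1", "payments": "model-2"}'

missing = unmapped_topics(kafka_topics_json, topic_map_json)
```

An empty result means every subscribed topic has a dataset ID to upload to.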
How to treat nested values in Kafka JSON data messages. Nested values are either included, by concatenating their keys with ".", or ignored entirely.
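The two behaviors can be illustrated with a small sketch. It only treats JSON objects (dicts) as nested values; how the container handles arrays is not stated here, so they are passed through untouched. The function name and the flag are hypothetical.

```python
def flatten(message: dict, include_nested: bool = True, prefix: str = "") -> dict:
    """Flatten nested objects by joining keys with ".", or drop them entirely."""
    flat = {}
    for key, value in message.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            if include_nested:
                # Recurse, extending the key path with a "." separator.
                flat.update(flatten(value, True, f"{full_key}."))
            # Otherwise the whole nested value is ignored.
        else:
            flat[full_key] = value
    return flat


message = {"a": 1, "b": {"c": 2, "d": {"e": 3}}}
included = flatten(message, include_nested=True)
ignored = flatten(message, include_nested=False)
```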
Number of consumer threads to start. If you dedicate 3 threads to consumers, three separate threads each run their own consumer and poll Kafka independently. To dedicate an entire container to a single consumer, set this value to 1. The ideal value depends on the use case, the Kafka cluster configuration, and the hardware hosting the container. A thread count equal to your topic partition count is reasonable. If you have more threads than partitions, the extra ones will just sit idle.
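The sizing rule of thumb comes down to simple arithmetic. The sketch below assumes a single container and that every consumer competes for the same set of partitions; the function name is hypothetical.

```python
def idle_consumer_threads(thread_count: int, partition_count: int) -> int:
    """Threads beyond the partition count have nothing to consume."""
    return max(0, thread_count - partition_count)


# Four threads against three partitions leaves one thread idle.
idle = idle_consumer_threads(thread_count=4, partition_count=3)
```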
Only applies if REQUEST_QUEUEING_ENABLED is true. The request queue can be backed by in-memory data structures or SQLite. See REQUEST_QUEUEING_ENABLED for more info.
An optimization that decouples receiving a request from handling it. Each request finishes faster from the caller's perspective because it is queued to be handled as soon as possible, rather than being handled while the caller waits. This isn't very useful with the default PROFILE_STORAGE_MODE=IN_MEMORY, since request handling is already fast. It was added to make PROFILE_STORAGE_MODE=SQLITE faster, since there is a fair bit of IO for each request. You probably don't need to change this.
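The general pattern behind request queueing can be sketched with a queue and a background worker: the caller gets an immediate acknowledgement while the actual handling happens later. This is a generic illustration using Python's standard library, not the container's implementation; all names are hypothetical.

```python
import queue
import threading

request_queue = queue.Queue()
handled = []


def handle(request: dict) -> None:
    """Stand-in for the real (potentially IO-heavy) request handling."""
    handled.append(request)


def worker() -> None:
    """Drain the queue until a None sentinel arrives."""
    while True:
        item = request_queue.get()
        if item is None:
            request_queue.task_done()
            break
        handle(item)
        request_queue.task_done()


def accept(request: dict) -> str:
    """Enqueue the request and return right away, before it is handled."""
    request_queue.put(request)
    return "accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [accept({"id": i}) for i in range(3)]

request_queue.join()      # wait until everything queued so far is handled
request_queue.put(None)   # ask the worker to stop
thread.join()
```

The trade-off is that the caller no longer learns whether handling succeeded, only that the request was queued.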
A fairly obscure/advanced tuning knob that only applies to the REST calls when REQUEST_QUEUEING_ENABLED is true. You probably don't want to set this.