Observability configuration
The Instabase observability toolset is enabled by default when Instabase is installed. After installation, you can configure your observability toolset to enable optional features or manage preferences.
The files required to enable the observability toolset are applied during the course of a standard installation or upgrade. Any additional configuration assumes the following:
-
All requirements for statistics collection, federated cluster statistics collection, and log aggregation have been met.
-
All ConfigMap, Deployment, and Service files for the release have been applied and ConfigMap templates are up-to-date.
-
You’ve verified that your observability toolset is accessible.
Configure federated cluster statistics collection
If you have Prometheus cluster monitoring enabled, you can optionally federate with the cluster statistics infrastructure to collect and display CPU and memory utilization data in Deployment Manager.
When enabling federated cluster statistics collection, the following parameters are required.
Parameter name | Parameter key | Description | Default value |
---|---|---|---|
Enable Federate Job | obsPromEnableFederation | Enables the federation scraping job that collects CPU and memory statistics. | Turned off ("false") |
Get Node Exporter Stats | obsPromFederationGetNodeExporterStats | Enables the collection of node exporter statistics as part of the federation job. | Turned off ("false") |
Scrape Target for Federate Job | obsPromFederationTarget | Defines the Prometheus server target that the federation job scrapes for CPU and memory statistics. For example, source-prometheus:9090. | |
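For orientation, the following is a minimal sketch of what a generic Prometheus federation scrape job looks like. It is not the job that Instabase generates from these parameters: the job name and the match[] selector are illustrative assumptions, and only the target value comes from the example in the table above.

```yaml
# Illustrative only: a generic Prometheus-style federation scrape job.
# The job name and match[] selector are assumptions, not Instabase defaults.
scrape_configs:
  - job_name: federate                 # hypothetical job name
    honor_labels: true                 # keep labels as exposed by the source server
    metrics_path: /federate            # standard Prometheus federation endpoint
    params:
      "match[]":
        - '{job="node-exporter"}'      # hypothetical selector for node exporter series
    static_configs:
      - targets:
          - source-prometheus:9090     # value you would enter as the scrape target
```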
To configure federated cluster statistics collection:
-
Open the Deployment Manager Configs tab (All apps > Deployment Manager > Configs).
-
From the Configs dropdown, select the config-prometheus-targets object.
Info: config-prometheus-targets shares a ConfigMap template with the config-prometheus-recording-rules, config-alertmanager-routes, and config-prometheus-alerting-rules objects. Editing any of these configs updates the others.
-
Click Edit Config.
-
Turn on the Enable Federate Job toggle.
-
Turn on the Get Node Exporter Stats toggle.
-
In the Scrape Target for Federate Job field, enter the Prometheus server target.
-
Click Save.
-
Restart statefulset-vmagent to reflect your configuration changes.
-
Open the Deployment Manager Stateful Sets tab (Deployment Manager > Infra Dashboard > Stateful Sets).
-
From the StatefulSets list, select statefulset-vmagent.
-
Click Restart.
Configure alerting
You can configure alerting to forward observability alerts to a Slack channel, a specific email address, or OpsGenie. By default, all notifications of warning or critical severity are forwarded.
Slack alerting
When configuring Slack alerting, the following parameters are required.
Parameter name | Parameter key | Description |
---|---|---|
Enable Slack Alerts | obsEnableSlackAlerting | Enables Slack alerting. |
Slack Alert URL | obsAlertSlackUrl | The URL used to connect to the Slack platform for alerting. |
Slack Alert Channel | obsEnvoyAlertSlackChannel | Defines the name of the Slack channel where alerts are posted, such as #alert-observability. Include the # symbol when defining the channel name. |
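As a rough illustration of where these values typically end up, the sketch below shows a generic Alertmanager Slack receiver. The receiver name and the webhook placeholder are assumptions, and the configuration that Instabase renders from the parameters above may be structured differently.

```yaml
# Illustrative only: a generic Alertmanager Slack receiver.
# The receiver name and webhook placeholder are assumptions.
route:
  receiver: slack-observability              # hypothetical receiver name
receivers:
  - name: slack-observability
    slack_configs:
      - api_url: "<YOUR_SLACK_ALERT_URL>"    # value entered in Slack Alert URL
        channel: "#alert-observability"      # quoted because # starts a YAML comment
        send_resolved: true                  # also notify when an alert clears
```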
To configure Slack alerting:
-
Open the Deployment Manager Configs tab (All apps > Deployment Manager > Configs).
-
From the Configs dropdown, select the config-alertmanager-routes object.
Info: config-alertmanager-routes shares a ConfigMap template with the config-prometheus-recording-rules, config-prometheus-targets, and config-prometheus-alerting-rules objects. Editing any of these configs updates the others.
-
Click Edit Config.
-
Turn on the Enable Slack Alerts toggle.
-
In Slack Alert URL, enter the URL to connect to your Slack platform.
-
In Slack Alert Channel, enter the name of the Slack channel.
-
Click Save.
-
Restart deployment-alertmanager to reflect your configuration changes.
-
Open the Deployment Manager Deployments tab (Deployment Manager > Infra Dashboard > Deployments).
-
From the deployments list, select deployment-alertmanager.
-
Click Restart.
Email alerting
When configuring email alerting, the following parameters are required.
The obsAlertEmailAppPassword parameter displays only as a key/value pair in the All Config Parameters tab.
Parameter name | Parameter key | Description |
---|---|---|
Enable Email Alerts | obsEnableEmailAlerting | Enables email alerting. |
Sender Email | obsAlertSenderEmail | Defines the sender email address for any email alerts. |
Receiver Email | obsAlertReceiverEmail | Defines the receiver email address for any email alerts. |
SMTP Server | obsAlertSmtpServer | Defines the URL of the Simple Mail Transfer Protocol (SMTP) email server used for sending alerts. |
None (key/value pair only) | obsAlertEmailAppPassword | Defines the password associated with the SMTP email server. This password is required for authentication when using the SMTP email server. |
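The sketch below shows how the same pieces of information map onto a generic Alertmanager email receiver. It is an assumption-laden illustration, not the rendered Instabase configuration: the receiver name, the port, and the use of the sender address as the SMTP username are all hypothetical.

```yaml
# Illustrative only: a generic Alertmanager email receiver.
# Receiver name, port 587, and the username choice are assumptions.
receivers:
  - name: email-observability                # hypothetical receiver name
    email_configs:
      - to: "<RECEIVER_EMAIL>"               # Receiver Email
        from: "<SENDER_EMAIL>"               # Sender Email
        smarthost: "<SMTP_SERVER>:587"       # SMTP Server, with an assumed port
        auth_username: "<SENDER_EMAIL>"      # assumed to match the sender address
        auth_password: "<SMTP_APP_PASSWORD>" # value stored in obsAlertEmailAppPassword
```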
To configure email alerting:
-
Open the Deployment Manager Configs tab (All apps > Deployment Manager > Configs).
-
From the Configs dropdown, select the config-alertmanager-routes object.
Info: config-alertmanager-routes shares a ConfigMap template with the config-prometheus-recording-rules, config-prometheus-targets, and config-prometheus-alerting-rules objects. Editing any of these configs updates the others.
-
Click Edit Config.
-
Turn on the Enable Email Alerts toggle.
-
In Sender Email, enter the email address from which alerts are sent.
-
In Receiver Email, enter the email address to which alerts are sent.
-
In SMTP Server, enter the URL of the SMTP server.
-
Select the All Config Parameters tab.
-
Locate the "obsAlertEmailAppPassword" key and enter the SMTP email server password.
-
Click Save.
-
Restart deployment-alertmanager to reflect your configuration changes.
-
Open the Deployment Manager Deployments tab (Deployment Manager > Infra Dashboard > Deployments).
-
From the deployments list, select deployment-alertmanager.
-
Click Restart.
OpsGenie alerting
When configuring OpsGenie alerting, the following parameters are required.
OpsGenie parameters display only as key/value pairs in the All Config Parameters tab of the ConfigMap template editor.
Parameter key | Description |
---|---|
obsEnableOpsGenieAlerting | Enables OpsGenie alerting. Set to "true" to enable. |
obsOpsGenieApiKey | The API key of the OpsGenie server used for OpsGenie alerting. |
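For comparison with the Slack and email options, a generic Alertmanager OpsGenie receiver can be sketched as follows. The receiver name is hypothetical, and the actual configuration rendered from the keys above may differ.

```yaml
# Illustrative only: a generic Alertmanager OpsGenie receiver.
# The receiver name is an assumption.
receivers:
  - name: opsgenie-observability             # hypothetical receiver name
    opsgenie_configs:
      - api_key: "<YOUR_OPSGENIE_API_KEY>"   # value stored in obsOpsGenieApiKey
```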
To configure OpsGenie alerting:
-
Open the Deployment Manager Configs tab (All apps > Deployment Manager > Configs).
-
From the Configs dropdown, select the config-alertmanager-routes object.
Info: config-alertmanager-routes shares a ConfigMap template with the config-prometheus-recording-rules, config-prometheus-targets, and config-prometheus-alerting-rules objects. Editing any of these configs updates the others.
-
Click Edit Config.
-
Select the All Config Parameters tab.
-
Locate the "obsEnableOpsGenieAlerting" key and set the value to "true".
-
Locate the "obsOpsGenieApiKey" key and enter the OpsGenie server API key.
-
Click Save.
-
Restart deployment-alertmanager to reflect your configuration changes.
-
Open the Deployment Manager Deployments tab (Deployment Manager > Infra Dashboard > Deployments).
-
From the deployments list, select deployment-alertmanager.
-
Click Restart.
Configure log storage
Grafana Loki aggregates, indexes, and stores log information. You can configure the storage system to which Loki persists logs.
By default, Loki uses the same file storage that was selected when installing Instabase. To store logs elsewhere, configure the log storage location.
The following log storage options are supported:
-
Amazon S3 bucket
-
Azure Blob storage container
-
Network file system (NFS) volume
Amazon S3
By default, the Loki configuration ships with an Amazon S3 template for long-term log storage. If you don’t have Amazon S3 storage enabled but would like to, revert the config-loki object to its base configuration, then follow the steps in this section. You might need to delete any patches applied to config-loki that changed the default log storage setting.
While Amazon S3 is the default Loki storage option, some additional configuration might be required if your bucket has a non-standard configuration. If you selected Amazon S3 as your storage option during installation but still encounter log storage errors, complete the configuration by patching deployment-loki-write to add the following environment variables:
- LOKI_S3_ACCESS_KEY
- LOKI_S3_SECRET_ACCESS_KEY
- LOKI_S3_REGION
- LOKI_S3_BUCKET_NAME
The patch you apply to deployment-loki-write references a Kubernetes secret called aws-access-key. You must create this secret in Kubernetes, in the same namespace as your Instabase installation. The secret must include two values, one for access-key (your AWS IAM key) and one for secret-access-key (your AWS IAM secret key).
You also need the name of the S3 bucket to use for storage and the region code for your AWS account.
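A minimal sketch of such a secret, written as a standard Kubernetes Secret manifest, might look like the following. The secret name and the two key names come from the description above; the namespace and credential placeholders are assumptions that you replace with your own values.

```yaml
# Sketch of the aws-access-key secret described above.
# Namespace and credential values are placeholders, not real defaults.
apiVersion: v1
kind: Secret
metadata:
  name: aws-access-key
  namespace: <YOUR_INSTABASE_NAMESPACE>      # must match your Instabase installation
type: Opaque
stringData:                                   # plain-text input; Kubernetes stores it base64-encoded
  access-key: <YOUR_AWS_IAM_KEY>
  secret-access-key: <YOUR_AWS_IAM_SECRET_KEY>
```

Using stringData avoids having to base64-encode the values by hand before applying the manifest.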
To configure Amazon S3 for log storage:
-
Using the following template, create a patch with your region code (<YOUR_S3_REGION_CODE>) and bucket name (<YOUR_S3_BUCKET_NAME>).
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: CONTAINER_NAME
          env:
            - name: LOKI_S3_ACCESS_KEY
              $patch: replace
              valueFrom:
                secretKeyRef:
                  name: aws-access-key
                  key: access-key
            - name: LOKI_S3_SECRET_ACCESS_KEY
              $patch: replace
              valueFrom:
                secretKeyRef:
                  name: aws-access-key
                  key: secret-access-key
            - name: LOKI_S3_REGION
              value: "<YOUR_S3_REGION_CODE>"
            - name: LOKI_S3_BUCKET_NAME
              value: "<YOUR_S3_BUCKET_NAME>"
-
Open the Deployment Manager Configs tab (All apps > Deployment Manager > Configs).
-
From the Configs dropdown, select deployment-loki-write.
-
Click Enter Patch.
-
Enter the above patch in the config editor.
-
Click Preview Changes to validate the patch.
-
Click Confirm Changes.
Azure Blob storage
You must create a storage account access key for the Azure Blob storage container.
By default, the config-loki ConfigMap is set up to support Amazon S3 storage. To support a different storage provider, you must edit and replace the entire data field of the config-loki ConfigMap. Because the data field inside of a ConfigMap is a simple string, you can’t use patches that target specific lines.
When editing the config-loki ConfigMap, there are three changes required to support Azure Blob storage.
-
Under common.storage, you must replace the s3 configuration with an azure configuration that includes your Azure Blob storage account name, account key, and container name.
For example:
data:
  loki.yaml: |
    auth_enabled: false
    analytics:
      reporting_enabled: false
    server:
      http_listen_port: 3100
      http_server_read_timeout: 60s
    common:
      replication_factor: 1
      ring:
        kvstore:
          store: memberlist
      storage:
        azure:
          account_name: <YOUR_STORAGE_ACCOUNT_NAME>
          account_key: <YOUR_STORAGE_ACCOUNT_KEY>
          container_name: <YOUR_CONTAINER_NAME>
-
Under schema_config.configs, define the object_store as azure.
-
Under storage_config.boltdb_shipper, define the shared_store as azure.
The following config-loki excerpt shows these three changes together.
This YAML file is incomplete and can’t be used as a patch.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-loki
  namespace: ${ib.namespace}
  labels:
    app: loki
data:
  loki.yaml: |-
    auth_enabled: false
    analytics:
      reporting_enabled: false
    server:
      http_listen_port: 3100
      http_server_read_timeout: 60s
    common:
      replication_factor: 1
      ring:
        kvstore:
          store: memberlist
      storage:
        azure:
          account_name: <YOUR_STORAGE_ACCOUNT_NAME>
          account_key: <YOUR_STORAGE_ACCOUNT_KEY>
          container_name: <YOUR_CONTAINER_NAME>
    ...
    schema_config:
      configs:
        - from: "2020-12-11"
          index:
            period: 24h
            prefix: index_
          object_store: azure
          schema: v11
          store: boltdb-shipper
    storage_config:
      boltdb_shipper:
        active_index_directory: /data/loki/boltdb-shipper-active
        cache_location: /data/loki/boltdb-shipper-cache
        cache_ttl: 24h
        shared_store: azure
    ...
To use Azure Blob storage for log storage:
-
Open the Deployment Manager Configs tab (All apps > Deployment Manager > Configs).
-
From the Configs dropdown, select config-loki.
-
Copy the entire contents of the config-loki ConfigMap to your clipboard.
-
Click Enter Patch.
-
Paste the ConfigMap in the config editor and make the indicated changes.
-
Click Preview Changes to validate the edits.
-
Click Confirm Changes.
Connect NFS volume
Approximately 64 GB of space is required for log storage in your NFS volume.
You can use an NFS volume for log storage by patching the config-loki object.
This patch is also included in your release bundle, in the optional_patches folder.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-loki
  labels:
    app: loki
data:
  loki.yaml: |
    auth_enabled: false
    analytics:
      reporting_enabled: false
    server:
      http_listen_port: 3100
      http_server_read_timeout: 60s
    common:
      replication_factor: 1
      ring:
        kvstore:
          store: memberlist
      storage:
        filesystem:
          chunks_directory: /data/loki/chunks
          rules_directory: /data/loki/rules
    chunk_store_config:
      max_look_back_period: 0s
    ingester:
      chunk_idle_period: 2h
      chunk_retain_period: 30s
      wal:
        enabled: false
        dir: /data/loki/wal
        flush_on_shutdown: true
      autoforget_unhealthy: true
    memberlist:
      abort_if_cluster_join_fails: false
      join_members:
        - loki-memberlist
    limits_config:
      enforce_metric_name: false
      retention_period: 336h
    schema_config:
      configs:
        - from: "2020-10-24"
          index:
            period: 24h
            prefix: index_
          object_store: filesystem
          schema: v11
          store: boltdb-shipper
    storage_config:
      boltdb_shipper:
        active_index_directory: /data/loki/boltdb-shipper-active
        cache_location: /data/loki/boltdb-shipper-cache
        cache_ttl: 24h
        shared_store: filesystem
    compactor:
      compactor_ring:
        kvstore:
          store: memberlist
      working_directory: /tmp/loki/boltdb-shipper-compactor
      shared_store: filesystem
      retention_enabled: true
    ruler:
      storage:
        type: local
        local:
          directory: /tmp/rules
      rule_path: /tmp/scratch
      alertmanager_url: http://service-alertmanager:9093
      ring:
        kvstore:
          store: memberlist
      enable_api: true
      remote_write:
        enabled: true
        client:
          url: http://service-prometheus-server:9090/api/v1/write
      wal:
        dir: /tmp
To use an NFS volume for log storage:
-
Open the Deployment Manager Configs tab (All apps > Deployment Manager > Configs).
-
From the Configs dropdown, select config-loki.
-
Click Enter Patch.
-
Enter the above patch in the config editor.
-
Click Preview Changes to validate the patch.
-
Click Confirm Changes.
Configure log retention period
By default, logs are stored for 336 hours, or 14 days. You can configure this retention period by modifying the limits_config.retention_period value in the config-loki object.
-
Open the Deployment Manager Configs tab (All apps > Deployment Manager > Configs).
-
From the Configs dropdown, select config-loki.
-
Copy the entire contents of the config-loki ConfigMap to your clipboard.
-
Click Enter Patch.
-
Paste the ConfigMap in the config editor.
-
Set the limits_config.retention_period value, in hours (for example, 336h).
-
Click Preview Changes to validate the edits.
-
Click Confirm Changes.
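As an example of the edit described in the steps above, the excerpt below raises retention from the default 14 days to 30 days. The 720h value is purely illustrative (30 days x 24 hours), and the excerpt is incomplete: it shows only the limits_config section of loki.yaml and can’t be used as a patch on its own.

```yaml
# Excerpt of loki.yaml inside the config-loki ConfigMap (incomplete).
# 720h is an illustrative value, not a recommended setting.
limits_config:
  enforce_metric_name: false
  retention_period: 720h   # 30 days x 24 hours; the default is 336h (14 days)
```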