Alarms
Overview
Alarms configuration is a global setting that allows an organization to subscribe to various Alerts and/or Notifications and forward them to supported notification channels. Notification settings can be modified in the UI under "Manage Account" -> "Alarms Settings". Each channel offers the following types of filters:
- Environment type - allows filtering based on environment type. Example: PROD / QA / DEV
- Alert Event Type - allows filtering based on type of alert. Example: Alert / Notification
- Alert Event Priority - allows filtering based on priority of alert. Example: Low / Medium / High / Critical
- Event Category - allows filtering based on category. Example: InstanceEvent / UserEvent / ServiceEvent, etc.
- Event type - allows filtering based on specific event type. Example: UnhealthyInstance / LowDiskSpace / FailedUpdate, etc.
- Payload - a JSON structure with properties of the event. Example: instance_id, subscription_id
Each organization can configure multiple channels with different event filters.
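To illustrate how several channels with different filters might each decide whether to receive an event, here is a minimal sketch. The `Event` and `Channel` classes, the field names, and the rule that an absent filter means "no restriction" are assumptions made for illustration; they are not Omnistrate's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    environment: str  # e.g. "PROD", "QA", "DEV"
    alert_type: str   # "Alert" or "Notification"
    priority: str     # "Low", "Medium", "High", "Critical"
    category: str     # e.g. "InstanceEvents"
    event_type: str   # e.g. "UnhealthyInstance"


@dataclass
class Channel:
    name: str
    # Hypothetical filter model: dimension name -> allowed values.
    # A missing or empty entry is treated as "no restriction" (an assumption).
    filters: dict[str, set[str]] = field(default_factory=dict)

    def accepts(self, event: Event) -> bool:
        # Every configured dimension must contain the event's value.
        return all(
            not allowed or getattr(event, dimension) in allowed
            for dimension, allowed in self.filters.items()
        )


def route(event: Event, channels: list[Channel]) -> list[str]:
    """Return the names of all channels whose filters accept the event."""
    return [channel.name for channel in channels if channel.accepts(event)]


channels = [
    Channel("oncall-pagerduty", {"environment": {"PROD"}, "priority": {"High", "Critical"}}),
    Channel("dev-email", {"environment": {"DEV", "QA"}}),
    Channel("audit-webhook"),  # no filters: receives everything
]
event = Event("PROD", "Alert", "Critical", "InstanceEvents", "UnhealthyInstance")
print(route(event, channels))  # ['oncall-pagerduty', 'audit-webhook']
```

In this sketch the same event can reach more than one channel, which mirrors the idea of configuring multiple channels with different event filters.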
Supported notification channels
Email
The Email channel sends each notification as an email message to the email address provided in the email channel configuration.
PagerDuty
The PagerDuty channel sends each notification as a PagerDuty alert. The recipient is identified by a PagerDuty integration key, which must be provided during configuration.
For more details on how to generate an integration key, refer to the PagerDuty documentation.
Webhook
The Webhook channel calls an HTTP-based callback with a configurable payload. By default, Omnistrate includes the following details in the request body:
{
  "eventID": "{{ $var.id }}",
  "serviceID": "{{ $var.ServiceID }}",
  "eventName": "{{ $var.Name }}",
  "eventDescription": "{{ $var.Description }}",
  "eventType": "{{ $var.Type }}",
  "payload": "{{ $var.Payload }}"
}
The Webhook channel lets you configure the HTTP method, endpoint, and payload. If the webhook call fails, Omnistrate retries a few times before giving up on notifying this channel of the event.
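On the receiving side, a webhook endpoint needs to parse this request body. The following is a minimal sketch of such a receiver using only the Python standard library, assuming the default body shown above. The function names, the port, and the handling of `payload` as a possibly JSON-encoded string are assumptions for illustration, since the exact rendering of `{{ $var.Payload }}` is not specified here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Keys present in the default request body.
EXPECTED_KEYS = ("eventID", "serviceID", "eventName",
                 "eventDescription", "eventType", "payload")


def parse_event(body: bytes) -> dict:
    """Decode a webhook body and check that the default keys are present."""
    event = json.loads(body)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(event, dict):
        raise ValueError("event body must be a JSON object")
    missing = [key for key in EXPECTED_KEYS if key not in event]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    # Assumption: the payload may arrive as a JSON-encoded string.
    # If it does not decode, keep the raw string.
    payload = event["payload"]
    if isinstance(payload, str):
        try:
            event["payload"] = json.loads(payload)
        except json.JSONDecodeError:
            pass
    return event


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = parse_event(self.rfile.read(length))
        except ValueError:
            self.send_error(400, "invalid event body")
            return
        print(event["eventType"], event["eventName"])
        # Acknowledge without a response body.
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener locally. Since failed calls are retried, a receiver like this may see the same event more than once, so deduplicating on `eventID` is a reasonable precaution.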
Channels configuration
Omnistrate generates a variety of Notifications and Alerts whenever a corresponding event occurs. Each notification channel can be configured to receive only specific types of events (based on event type, environment, priority, etc.).
To add a new channel, open the "Manage account" -> "Notifications" page in the UI. There are two configuration options available when adding a channel:
- Basic - the subscription is created based on environment type, alert event type, and event priority. This option offers less control but makes it easier to create a channel from fewer inputs. All of the basic dimensions (environment, type, and priority) have a limited set of options that are unlikely to change as we add new types of alerts. An example of a basic notification rule: "High and critical priority Alerts from prod environments".
- Advanced - the subscription is created based on environment type and a specific category and/or event type. This option offers more control, but it relies on the event categories and types listed in the next section.
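The difference between the two options can be sketched as two kinds of matching rules. This is an illustrative model only: the function names and event field names are hypothetical, and treating "category and/or event type" as a logical OR is an assumption, not documented behavior.

```python
def basic_rule(environments, alert_types, priorities):
    """Basic: match on environment type, alert event type, and priority."""
    def matches(event):
        return (event["environment"] in environments
                and event["alert_type"] in alert_types
                and event["priority"] in priorities)
    return matches


def advanced_rule(environments, categories=(), event_types=()):
    """Advanced: match on environment plus a category and/or an event type."""
    def matches(event):
        if event["environment"] not in environments:
            return False
        # Assumption: selecting either the category or the event type is enough.
        return (event["category"] in categories
                or event["event_type"] in event_types)
    return matches


# "High and critical priority Alerts from prod environments"
prod_urgent = basic_rule({"PROD"}, {"Alert"}, {"High", "Critical"})

# All user events, plus one specific instance event, from prod
prod_selected = advanced_rule({"PROD"},
                              categories={"UserEvents"},
                              event_types={"LowDiskSpace"})

# Illustrative event; the priority assigned to LowDiskSpace is made up.
event = {
    "environment": "PROD",
    "alert_type": "Alert",
    "priority": "Medium",
    "category": "InstanceEvents",
    "event_type": "LowDiskSpace",
}
print(prod_urgent(event))    # False: priority is Medium
print(prod_selected(event))  # True: event type is selected explicitly
```

The basic rule needs no knowledge of individual event names, while the advanced rule names categories and event types directly.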
Event Categories and Types
The following table lists all available event categories and their specific events:
Category | Type | Description |
---|---|---|
InstanceEvents | CertificateExpiry | Certificate is about to expire |
InstanceEvents | CertificateIssuance | New certificate has been issued |
InstanceEvents | FailedBackup | Instance backup operation failed |
InstanceEvents | FailedDelete | Instance deletion failed |
InstanceEvents | FailedDeployment | Instance deployment failed |
InstanceEvents | FailedRecovery | Instance recovery operation failed |
InstanceEvents | FailedRestart | Instance restart failed |
InstanceEvents | FailedRestore | Instance restore operation failed |
InstanceEvents | FailedStart | Instance start operation failed |
InstanceEvents | FailedStop | Instance stop operation failed |
InstanceEvents | FailedUpdate | Instance update operation failed |
InstanceEvents | HighCPUUsage | Instance CPU usage exceeds threshold |
InstanceEvents | HighMemoryUsage | Instance memory usage exceeds threshold |
InstanceEvents | LowDiskSpace | Instance disk space is running low |
InstanceEvents | RecoveryStarted | Instance recovery process initiated |
InstanceEvents | SuccessfulBackup | Instance backup completed successfully |
InstanceEvents | SuccessfulDelete | Instance deleted successfully |
InstanceEvents | SuccessfulDeployment | Instance deployed successfully |
InstanceEvents | SuccessfulRecovery | Instance recovery completed successfully |
InstanceEvents | SuccessfulRestart | Instance restarted successfully |
InstanceEvents | SuccessfulRestore | Instance restore completed successfully |
InstanceEvents | SuccessfulStart | Instance started successfully |
InstanceEvents | SuccessfulStop | Instance stopped successfully |
InstanceEvents | SuccessfulUpdate | Instance updated successfully |
InstanceEvents | UnhealthyCustomerIntegration | Customer integration is unhealthy |
InstanceEvents | UnhealthyInstance | Instance health check failed |
InstanceEvents | UnhealthyIntegration | Integration is unhealthy |
InstanceEvents | UnhealthyRegion | Region is experiencing issues |
ServiceEvents | DeploymentCellCreateCompleted | Deployment cell creation finished |
ServiceEvents | DeploymentCellCreateStarted | Deployment cell creation initiated |
ServiceEvents | DeploymentCellDeleteCompleted | Deployment cell deletion finished |
ServiceEvents | DeploymentCellDeleteStarted | Deployment cell deletion initiated |
ServiceEvents | DeploymentCellUpdateCompleted | Deployment cell update finished |
ServiceEvents | DeploymentCellUpdateStarted | Deployment cell update initiated |
ServiceEvents | EnvironmentsOutOfSync | Environments are not synchronized |
ServiceEvents | InstancesPendingUpdate | Instances have pending updates |
ServiceEvents | ScaleDown | Scale down operation initiated |
ServiceEvents | ScaleDownFailed | Scale down operation failed |
ServiceEvents | ScaleDownSuccess | Scale down completed successfully |
ServiceEvents | ScaleIn | Scale in operation initiated |
ServiceEvents | ScaleInFailed | Scale in operation failed |
ServiceEvents | ScaleInSuccess | Scale in completed successfully |
ServiceEvents | ScaleOut | Scale out operation initiated |
ServiceEvents | ScaleOutFailed | Scale out operation failed |
ServiceEvents | ScaleOutSuccess | Scale out completed successfully |
ServiceEvents | ScaleUp | Scale up operation initiated |
ServiceEvents | ScaleUpFailed | Scale up operation failed |
ServiceEvents | ScaleUpSuccess | Scale up completed successfully |
UserEvents | ApproveSubscriptionRequest | Subscription request approved |
UserEvents | UserSignUp | New user registration |
UserEvents | UserSubscription | User subscription created |
UserEvents | UserSubscriptionInvite | User invited to subscription |
UserEvents | UserSubscriptionRevoked | User subscription access revoked |
UserEvents | UserUnsubscribed | User unsubscribed from service |
IdentityProviderEvents | FailedIdentityProviderVerification | Identity provider verification failed |
SystemEvent | UpgradeScheduled | System upgrade scheduled |
SystemEvent | UpgradeMaintenanceActionRequest | Maintenance action requested for upgrade |
SystemEvent | DeploymentCellCompleted | Deployment cell operation completed |
SystemEvent | DeploymentCellStarted | Deployment cell operation started |
BillingEvent | S3MeteringExportFailed | S3 metering export operation failed |