Autoscaling

Overview

Omnistrate enables your Resources to have advanced autoscaling capabilities using custom metrics. You can configure autoscaling to dynamically adjust the number of replicas based on load, helping you optimize infrastructure costs while maintaining performance.

Omnistrate's autoscaling is designed to work for stateful systems. Beyond scaling on load, it offers the following:

Scale machines on demand based on custom metrics, not only on the average load metric. For example, when a single misbehaving node reports high load, adding more resources is not always the right fix.
Run custom code as part of the scale operation, for example to trigger a rebalance, update cluster metadata, or rotate certificates.
Automatically coordinate operations on your Resources so that complex operations do not interfere with each other.

Note

To scale down to zero, use Custom Autoscaling or Serverless capabilities.

Configuring autoscaling

You can enable autoscaling for one or more Resources by adding an autoscaling configuration under x-omnistrate-capabilities. Here is an example using the compose spec:

services:
    app:
        x-omnistrate-capabilities:
            autoscaling:
                maxReplicas: 5
                minReplicas: 1
                idleMinutesBeforeScalingDown: 2
                idleThreshold: 20
                overUtilizedMinutesBeforeScalingUp: 3
                overUtilizedThreshold: 80

Configuration parameters

maxReplicas: Maximum number of replicas to scale up to.
minReplicas: Minimum number of replicas to maintain. The value must be in the range [1, maxReplicas]. To scale down to zero, use Custom Autoscaling or Serverless.
idleMinutesBeforeScalingDown: Time in minutes to wait, while the system is idle, before scaling down.
idleThreshold: Metric value below which the system is considered idle; the system scales down after idleMinutesBeforeScalingDown.
overUtilizedMinutesBeforeScalingUp: Time in minutes to wait, while the system is over-utilized, before scaling up.
overUtilizedThreshold: Metric value above which the system is considered over-utilized; the system scales up after overUtilizedMinutesBeforeScalingUp.
scalingMetric: Optional custom metric configuration. By default, Omnistrate uses the CPU load metric; set this field to scale on your own metric instead.
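To make the interplay between the thresholds and the wait times concrete, here is a minimal Python sketch of one plausible reading of these parameters. It is not Omnistrate's implementation. It assumes one metric sample per minute, strict comparisons against the thresholds, a condition that must hold for the whole wait window, and a step of one replica per decision. The names AutoscalingConfig and desired_replicas are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class AutoscalingConfig:
    """Hypothetical mirror of the autoscaling parameters listed above."""
    max_replicas: int
    min_replicas: int
    idle_minutes_before_scaling_down: int
    idle_threshold: float
    over_utilized_minutes_before_scaling_up: int
    over_utilized_threshold: float

    def __post_init__(self) -> None:
        # minReplicas must lie in [1, maxReplicas], as stated above.
        if not 1 <= self.min_replicas <= self.max_replicas:
            raise ValueError("minReplicas must be in [1, maxReplicas]")
        # Assumption: wait windows of at least one minute.
        if min(self.idle_minutes_before_scaling_down,
               self.over_utilized_minutes_before_scaling_up) < 1:
            raise ValueError("wait times must be at least 1 minute")


def _sustained(samples: Sequence[float], minutes: int,
               predicate: Callable[[float], bool]) -> bool:
    """True if the last `minutes` samples all satisfy `predicate`."""
    if len(samples) < minutes:
        return False  # not enough history to decide yet
    return all(predicate(s) for s in samples[-minutes:])


def desired_replicas(cfg: AutoscalingConfig, current: int,
                     samples: Sequence[float]) -> int:
    """Return the replica count after one decision.

    `samples` holds one metric reading per minute, oldest first (assumption).
    """
    if _sustained(samples, cfg.over_utilized_minutes_before_scaling_up,
                  lambda s: s > cfg.over_utilized_threshold):
        return min(current + 1, cfg.max_replicas)  # never exceed maxReplicas
    if _sustained(samples, cfg.idle_minutes_before_scaling_down,
                  lambda s: s < cfg.idle_threshold):
        return max(current - 1, cfg.min_replicas)  # never drop below minReplicas
    return current
```

With the values from the compose example (maxReplicas 5, minReplicas 1, thresholds 20 and 80, wait times 2 and 3), three consecutive readings above 80 move a service from 2 replicas to 3 under these assumptions, while two readings above 80 leave it unchanged.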

Note

For custom metrics, Omnistrate reads from Prometheus endpoints, so your application must expose the required metrics there.

Configuring autoscaling with custom metrics

To scale on a custom metric, add a scalingMetric block to the autoscaling configuration under x-omnistrate-capabilities. Here is an example using the compose spec:

services:
    app:
        x-omnistrate-capabilities:
            autoscaling:
                maxReplicas: 5
                minReplicas: 1
                idleMinutesBeforeScalingDown: 2
                idleThreshold: 20
                overUtilizedMinutesBeforeScalingUp: 3
                overUtilizedThreshold: 80
                scalingMetric:
                    metricEndpoint: "http://localhost:9187/metrics"
                    metricName: "custom_metric_name"
                    metricLabelName: "application_name"
                    metricLabelValue: "custom_metric_label"
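One way to picture how the four scalingMetric fields fit together: metricEndpoint is where the metrics are read from, and metricName with the metricLabelName/metricLabelValue pair picks one series out of the response. The Python sketch below shows that selection on a Prometheus text payload. It assumes the label pair is matched exactly against a single series, which this page does not confirm, and it is not Omnistrate's code. The function select_sample is hypothetical, and the parser is simplified (it does not unescape label values).

```python
import re
from typing import Optional

# metric_name{labels} value [timestamp]
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# label_name="label value"
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def select_sample(text: str, metric_name: str,
                  label_name: str, label_value: str) -> Optional[float]:
    """Return the first sample of `metric_name` whose label matches, else None."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None or match.group(1) != metric_name:
            continue
        labels = dict(_LABEL.findall(match.group(2) or ""))
        if labels.get(label_name) == label_value:
            return float(match.group(3))
    return None


# Example payload, as an application might expose it at the metric endpoint.
payload = (
    "# TYPE custom_metric_name gauge\n"
    'custom_metric_name{application_name="other"} 7\n'
    'custom_metric_name{application_name="custom_metric_label"} 42\n'
)
value = select_sample(payload, "custom_metric_name",
                      "application_name", "custom_metric_label")
```

Under this reading, the selected value is what gets compared against idleThreshold and overUtilizedThreshold.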