Auto-scaling and Serverless

Omnistrate enables your service components to auto-scale and scale down to zero with a few clicks, which saves infrastructure costs when your customers are not using the underlying infrastructure.

In addition, Omnistrate lets you make your service components fully serverless, including any session state, so that client connections are unaffected when your infrastructure is automatically paused and resumed.

Autoscaling

Omnistrate offers advanced auto-scaling that works for stateful systems. Beyond simply autoscaling based on load, it provides the following features:

  • Scale machines on demand based on custom metrics, not just the average load metric. A common problem is a single misbehaving node with high load; in those cases, adding more resources is NOT always the solution.
  • Scale operations often need to run specific steps. Omnistrate allows you to run custom code as part of the scale operation, for example to call rebalance, update cluster metadata, or rotate certificates.
  • Omnistrate automatically coordinates different operations on your resources to prevent complex operations from interfering with each other.
  • Automatically scale the storage layer, if configured.

Here is an example configuration to enable Autoscaling:

x-omnistrate-capabilities:
  autoscaling:
    maxReplicas: 5
    minReplicas: 1
    idleMinutesBeforeScalingDown: 2
    idleThreshold: 20
    overUtilizedMinutesBeforeScalingUp: 3
    overUtilizedThreshold: 80
    scalingMetric:
      metricEndpoint: "http://localhost:9187/metrics"
      metricLabelName: "application_name"
      metricLabelValue: "custom_metric_label"
      metricName: "custom_metric_name"
  • maxReplicas: Maximum number of replicas to scale up to.
  • minReplicas: Minimum number of replicas to keep. The value must be in the range [1, maxReplicas]. To go down to zero, enable Serverless in addition to autoscaling, as described under Scale down to zero.
  • idleMinutesBeforeScalingDown: Idle time in minutes before scaling down.
  • idleThreshold: Value of the underlying metric below which we scale down after idleMinutesBeforeScalingDown minutes.
  • overUtilizedMinutesBeforeScalingUp: Over-utilized time in minutes before scaling up.
  • overUtilizedThreshold: Value of the underlying metric above which we scale up after overUtilizedMinutesBeforeScalingUp minutes.
  • scalingMetric: By default, we use the load metric, but you can configure your own metric here.
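To make the interaction between the thresholds and the time windows concrete, here is a minimal Python sketch of one way such a decision could be evaluated. It is an illustration only and not Omnistrate's actual algorithm: the names `AutoscalingConfig` and `desired_replicas`, the one-sample-per-minute input, and the step size of one replica per decision are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AutoscalingConfig:
    # Field values mirror the example configuration above.
    max_replicas: int = 5
    min_replicas: int = 1
    idle_minutes_before_scaling_down: int = 2
    idle_threshold: float = 20
    over_utilized_minutes_before_scaling_up: int = 3
    over_utilized_threshold: float = 80


def desired_replicas(current: int, samples: Sequence[float],
                     cfg: AutoscalingConfig) -> int:
    """Return the replica count after inspecting per-minute metric samples.

    `samples` is ordered oldest to newest, one value per minute.
    Hypothetical helper for illustration; not part of Omnistrate.
    """
    up_window = samples[-cfg.over_utilized_minutes_before_scaling_up:]
    down_window = samples[-cfg.idle_minutes_before_scaling_down:]

    # Scale up only if the metric stayed above the threshold for the whole window.
    if (len(up_window) == cfg.over_utilized_minutes_before_scaling_up
            and all(s > cfg.over_utilized_threshold for s in up_window)):
        return min(current + 1, cfg.max_replicas)

    # Scale down only if the metric stayed below the idle threshold for the whole window.
    if (len(down_window) == cfg.idle_minutes_before_scaling_down
            and all(s < cfg.idle_threshold for s in down_window)):
        return max(current - 1, cfg.min_replicas)

    # Otherwise keep the current size.
    return current
```

In this sketch a single spike or a single quiet minute does not change the replica count, and the result is always clamped to the range [minReplicas, maxReplicas].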

Note

For custom metrics, we rely on a Prometheus endpoint and require the application to emit the configured metric.
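As a rough illustration of what emitting such a metric can look like, the following Python sketch serves one gauge in the Prometheus text exposition format using only the standard library. The metric name, label, and port are taken from the example configuration above; `current_load`, `render_metric`, `MetricsHandler`, and `serve` are hypothetical names, and a real application would more likely use a Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metric(name: str, label_name: str, label_value: str, value: float) -> str:
    """Render one gauge sample in the Prometheus text exposition format."""
    return (
        f"# TYPE {name} gauge\n"
        f'{name}{{{label_name}="{label_value}"}} {value}\n'
    )


def current_load() -> float:
    """Placeholder for the application's own measurement (hypothetical)."""
    return 42.0


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Only the /metrics path is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metric(
            "custom_metric_name", "application_name",
            "custom_metric_label", current_load(),
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9187) -> None:
    """Serve /metrics on the port used by metricEndpoint in the example config."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` inside the application container would expose the metric at the `metricEndpoint` shown in the example, assuming nothing else is already bound to that port.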

Scale down to zero

Autoscaling requires a minimum of 1 machine. If you want to scale down to zero and automatically restart machines when load returns, enable the Serverless capability for the corresponding resource instances.

Adding serverlessConfiguration enables auto-stop and auto-wakeup for your service component in just one click. There are two serverless modes: Basic and Advanced.

Setup

You can enable Serverless in the Basic mode for one or more service components by adding x-omnistrate-capabilities. Here is an example using compose spec:

x-omnistrate-capabilities:
  serverlessConfiguration:
    enableAutoStop: true
    targetPort: 3306

Note

Serverless capability is not available under the Omnistrate hosted configuration. You will have to bring your own account or your customers' accounts (BYOA) to enable Serverless.

The default serverless mode only works for TCP-based services and uses the number of active connections to the target port as the metric to scale down to zero. For custom metrics, see Advanced Serverless.

Finally, in the above setup, any connection or session state will be lost when the service scales down to zero. To preserve it, see Advanced Serverless.

Test

Create an instance of the serverless resource. Once it's up and running, check the connectivity.

Wait for the configured period of inactivity so that the infrastructure is automatically stopped. You can confirm this by checking the state of the resource instance.

As soon as new activity resumes, the underlying resource instance should start automatically within several seconds.
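One way to script the connectivity check and get a rough feel for the wake-up time is to retry a plain TCP connection until it succeeds. The sketch below assumes a TCP endpoint such as the `targetPort` from the setup example; the function name `wait_until_reachable` and the host you pass in are hypothetical. Whether a successful TCP connect means the backend has fully resumed depends on how the endpoint fronts your service, so treat the timing as approximate.

```python
import socket
import time
from typing import Optional


def wait_until_reachable(host: str, port: int, timeout_s: float = 60.0,
                         retry_interval_s: float = 0.5) -> Optional[float]:
    """Retry a TCP connection until it succeeds or timeout_s elapses.

    Returns the seconds it took to connect, or None if the endpoint
    never became reachable within the timeout.
    """
    start = time.monotonic()
    while True:
        remaining = timeout_s - (time.monotonic() - start)
        if remaining <= 0:
            return None
        try:
            # Cap each attempt so a hanging connect cannot exceed the budget.
            with socket.create_connection((host, port), timeout=min(remaining, 5.0)):
                return time.monotonic() - start
        except OSError:
            # Refused, unreachable, or timed out: wait briefly and try again.
            time.sleep(retry_interval_s)
```

For example, calling `wait_until_reachable("<your-instance-endpoint>", 3306)` after the instance has been stopped gives an approximate time until the endpoint accepts connections again.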

Faster wakeup using warm pool

To speed up wake-up further, we also offer a warm pool capability that you can enable for faster resume times.

    minimumNodesInPool: 5

minimumNodesInPool: the static number of machines to keep in the warm pool. You can change it at any time, either programmatically through our APIs or manually through the UI.

Demo example

Here is a demo example using Postgres:

Demo Postgres Serverless SaaS

Achieving full elasticity

Scale down to zero enables your underlying software to go from 0 to 1 when in use and back to 0 when not in use. In practice, however, you may want to go from 0 to N as the load increases and back down to 0 when not in use.

To achieve that, enable auto-scaling in addition to scale down to zero. Here is an example configuration using compose spec:

x-omnistrate-capabilities:
  autoscaling:
    maxReplicas: 5
    minReplicas: 1
    idleMinutesBeforeScalingDown: 2
    idleThreshold: 20
    overUtilizedMinutesBeforeScalingUp: 3
    overUtilizedThreshold: 80
    scalingMetric:
      metricEndpoint: "http://localhost:9187/metrics"
      metricLabelName: "application_name"
      metricLabelValue: "custom_metric_label"
      metricName: "custom_metric_name"
  serverlessConfiguration:
    targetPort: 3306
    enableAutoStop: true
    minimumNodesInPool: 5
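Before deploying a combined configuration like the one above, it can help to sanity-check it locally. The Python sketch below validates a dictionary that mirrors the `x-omnistrate-capabilities` mapping. Only the [1, maxReplicas] rule for minReplicas comes from this page; the other two checks (idleThreshold lower than overUtilizedThreshold, and targetPort being present when enableAutoStop is set) are assumptions made for this example, and `validate_capabilities` is a hypothetical helper, not an Omnistrate API.

```python
from typing import Any, Dict, List


def validate_capabilities(caps: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an x-omnistrate-capabilities mapping."""
    problems: List[str] = []
    autoscaling = caps.get("autoscaling")
    serverless = caps.get("serverlessConfiguration")

    if autoscaling is not None:
        min_r = autoscaling.get("minReplicas")
        max_r = autoscaling.get("maxReplicas")
        if not (isinstance(min_r, int) and isinstance(max_r, int)):
            problems.append("minReplicas and maxReplicas must be integers")
        elif not 1 <= min_r <= max_r:
            # Documented rule: minReplicas has to be in [1, maxReplicas].
            problems.append("minReplicas must be in [1, maxReplicas]")

        # Assumption: the idle threshold should sit below the over-utilized one.
        idle = autoscaling.get("idleThreshold")
        busy = autoscaling.get("overUtilizedThreshold")
        if idle is not None and busy is not None and idle >= busy:
            problems.append("idleThreshold should be lower than overUtilizedThreshold")

    if serverless is not None:
        # Assumption: basic mode watches connections to targetPort, so expect it.
        if serverless.get("enableAutoStop") and "targetPort" not in serverless:
            problems.append("targetPort is expected when enableAutoStop is true")

    return problems
```

An empty list means none of these checks failed; it does not guarantee that the platform will accept the configuration.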

Advanced Serverless

You may need advanced serverless if you care about any of the following:

  • Resume any client state even after the infrastructure has been scaled down to zero. For example, suppose you want to scale down the infrastructure for your database application when connections are idle, but you have prepared statements or temporary tables as part of the session state. By default, on scale down to zero, you will lose all of your connection state and your client connections will fail. With advanced serverless, you can store the connection state so that, on resume, clients continue to behave the same way, making them completely oblivious to the scale down to zero.
  • Use custom metrics to decide when to scale down to zero and when to resume.
  • Take control of the proxy configuration, including its image, infrastructure, and capabilities, as well as managing the lifecycle of the proxy resource and much more.

To enable advanced serverless using compose spec:

serverlessConfiguration:
  referenceProxyKey: "proxyResourceKey"
  portsMappingProxyConfig:
    numberOfPortsPerCluster: 4
    maxNumberOfClustersPerProxyInstance: 1
  proxyStorageConfig:
    aws:
      storageType: "s3"
  • referenceProxyKey: reference to the proxy resource that will be used to listen to client traffic
  • proxyStorageConfig: state store to keep the connection state(s)
  • portsMappingProxyConfig: configurations to map proxy resource ports to database application

Info

Note that advanced Serverless mode is available only in the enterprise plan. Please reach out to us at support@omnistrate.com to learn more.

In a nutshell, if you need any sort of customization, you will likely need the advanced mode. For simple use cases, the basic Serverless mode should suffice.