Building your SaaS Product Using Kubernetes Operators¶
Getting started¶
Omnistrate supports deploying Kubernetes Operators as part of your SaaS Product topology. This enables you to automate infrastructure management and application lifecycle orchestration within Kubernetes clusters. By leveraging Operators, you can turn complex, stateful applications into managed, multi-tenant SaaS Products with minimal effort. This guide starts from a blank spec and walks through the sequence you’ll follow, then shows a worked example.
How It Works¶
When you build a SaaS Product from a Kubernetes Operator, Omnistrate automates the entire lifecycle:
- Infrastructure Provisioning: Omnistrate deploys a dedicated or shared Kubernetes cluster in the cloud and region of your choice.
- Operator Installation: The Operator itself is installed into the cluster, typically via a Helm chart dependency that you specify.
- Custom Resource (CR) Instantiation: For each tenant who subscribes to your SaaS Product, Omnistrate creates an instance of your Operator's Custom Resource (CR). The CR is configured using parameters provided by the customer and system-generated values.
- Lifecycle Management: The Operator takes over, provisioning and managing the application components as defined in the CR.
- Readiness and Endpoints: Omnistrate monitors the status of the CR to determine if the SaaS Product instance is ready and exposes the necessary endpoints to the customer.
Walkthrough: Building from an Operator¶
Use this flow when you start from scratch with an Operator-based product.
1) Prepare the Operator assets
   - Package the Operator as a Helm chart and confirm that the Custom Resource Definition (CRD) exposes a status block with the fields you will reference. Locally apply a sample Custom Resource (CR) and run `kubectl get <crd-kind> <name> -o yaml` to capture the exact status paths you will use.
   - Decide whether the Operator should be cluster-scoped (installed once via `helmChartDependencies`) or per-instance (installed as a regular `helmChart` service so it lives in the instance namespace).

2) Create your Plan skeleton
   - Define hosting and tenancy (`deployment.hostedDeployment`, `tenancyType`).
   - Choose compute classes and instance types. If your Operator image is large or unpacks many artifacts, set `rootVolumeSizeGi` on the node group (defaults: AWS 10 Gi, Azure/GCP 30 Gi); in practice, 50 Gi prevents disk-pressure failures on heavy Operator images. See the sketch after this walkthrough.

3) Add the Operator install
   - In `helmChartDependencies`, point to the Operator chart and version. If you need to override chart values (e.g., replica count, resources), use `chartValues`.

4) Model the Custom Resource
   - In `operatorCRDConfiguration.template`, render the Custom Resource your Operator manages.
   - Add `readinessConditions` that point to real status fields emitted by the Operator. If a referenced field never appears, workflows time out with "output parameter not resolved".
   - Add `outputParameters` to surface status/info back to customers (e.g., connection details or versions).
   - Add `supplementalFiles` for supporting Secrets/ConfigMaps your CR references. (See the sketch after this walkthrough for the overall shape.)

5) Place workloads correctly
   - Define pod affinity/anti-affinity in the CR (or via chart values) so Operator-managed pods land on Omnistrate-managed nodegroups with your chosen instance types and storage. See Define affinity rules.

6) Set endpoints and parameters
   - Use `apiParameters` for tenant inputs and `endpointConfiguration` to expose connection info (hosts/ports).

7) Build and publish
   - Run `omnistrate-ctl build -f spec.yaml --name '<Plan Name>' --release-as-preferred --spec-type ServicePlanSpec`.

8) Create the first instance
   - Use the Customer Portal or CLI to create an instance. Watch live status in Operations Center → Workflows.
   - The very first deployment per account/region creates the underlying cell (EKS/VPC/nodegroups) and can take longer than subsequent deploys because it bootstraps the infrastructure.
   - Avoid deleting cloud resources directly in your cloud console; create and delete through Omnistrate so metadata stays in sync.

9) Debug and iterate quickly
   - If readiness stalls, open the workflow details to see which condition/output failed (Operations Center → Workflows).
   - Use Deployment Cell Access to obtain a kubeconfig and check:
     - Operator pods are running: `kubectl -n default get pods` (Operators installed via `helmChartDependencies` default to the `default` namespace).
     - The Custom Resource exists and has the expected status fields: `kubectl -n <instance-namespace> get <crd-kind> <name> -o yaml`.
     - Pod scheduling issues (events often show affinity or disk-pressure problems).
   - Fix the spec (status paths, affinities, resources, chart values), publish a new Plan version, and upgrade the instance.
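For steps 2 and 4, the trimmed sketch below shows the overall shape of such a spec. It is a sketch only, not a working Plan: the CRD group/kind, status paths, and chart coordinates are placeholders, and the exact placement of `rootVolumeSizeGi` should be validated against the spec schema.

```yaml
# Trimmed sketch for steps 2 and 4 -- placeholders throughout, not a working Plan.
name: My Operator Plan
deployment:
  hostedDeployment:
    awsAccountId: "<AWS_ACCOUNT_ID>"
    awsBootstrapRoleAccountArn: "arn:aws:iam::<AWS_ACCOUNT_ID>:role/omnistrate-bootstrap-role"
tenancyType: CUSTOM_TENANCY
services:
  - name: MyOperatorService
    compute:
      rootVolumeSizeGi: 50            # larger root disk for heavy Operator images (placement assumed; check the schema)
      instanceTypes:
        - apiParam: instanceType
          cloudProvider: aws
    operatorCRDConfiguration:
      template: |
        apiVersion: example.com/v1    # placeholder: your Operator's CRD group/version
        kind: Widget                  # placeholder: your Operator's kind
        metadata:
          name: {{ $sys.id }}
        spec:
          replicas: 1
      readinessConditions:
        "$var._crd.status.phase": "Ready"   # must be a status field your Operator actually writes
      outputParameters:
        "Status": "$var._crd.status.phase"
helmChartDependencies:
  - chartName: my-operator            # placeholder chart coordinates
    chartVersion: 1.0.0
    chartRepoName: my-repo
    chartRepoURL: https://example.com/charts
```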
Common Pitfalls¶
- Status fields missing: If `readinessConditions` or `outputParameters` reference a status path the Operator never writes, the workflow times out. Confirm the Custom Resource's `status` block after a test reconcile.
- Operator pods unscheduled / disk pressure: Large Operator images may exhaust node root disks. Increase `rootVolumeSizeGi` as needed and use pod affinity so Operator-managed pods land on the intended nodegroups.
- Manual cloud deletions: Deleting clusters/VMs/nodegroups directly in your cloud account desynchronizes Omnistrate. Create/delete cells and instances from Omnistrate instead.
- Namespace expectations: `helmChartDependencies` installs the Operator once per cluster (in the `default` namespace). If you need a per-namespace Operator, deploy it as a standalone `helmChart` service in the same spec as a dependency on your CRD service.
Example: Building a PostgreSQL SaaS With the CNPG Operator¶
Below is a worked example that follows the walkthrough above to build a managed PostgreSQL SaaS Product using the CloudNativePG (CNPG) Operator (see the community-contributed PostgreSQL PaaS repository).
- Steps 1–3: The Operator is packaged as a Helm dependency (`helmChartDependencies`).
- Steps 4–6: The CR template, readiness conditions, outputs, endpoints, and parameters are defined under `operatorCRDConfiguration`, `endpointConfiguration`, and `apiParameters`.
- Steps 7–9: Once built and published, create an instance, watch the workflow, and use Deployment Cell Access if readiness does not resolve.
We will define our SaaS Product in a spec.yaml file. This file tells Omnistrate how to install the operator, what kind of database to create for customers, and how to expose it.
Here is the complete spec.yaml for our PostgreSQL SaaS Product. We will break down each section below.
```yaml
# yaml-language-server: $schema=https://api.omnistrate.cloud/2022-09-01-00/schema/service-spec-schema.json
name: PostgreSQL Server # Plan Name
deployment:
  hostedDeployment:
    awsAccountId: "<AWS_ACCOUNT_ID>"
    awsBootstrapRoleAccountArn: "arn:aws:iam::<AWS_ACCOUNT_ID>:role/omnistrate-bootstrap-role"
tenancyType: CUSTOM_TENANCY
features:
  INTERNAL:
    logs: {} # Omnistrate native
  CUSTOMER:
    logs: {} # Omnistrate native
services:
  - name: CNPG
    compute:
      instanceTypes:
        - apiParam: instanceType
          cloudProvider: aws
    apiParameters:
      - key: instanceType
        description: Instance Type
        name: Instance Type
        type: String
        modifiable: true
        required: false
        export: true
        defaultValue: "t3.medium"
      - key: postgresqlPassword
        description: Default DB Password
        name: Password
        type: Password
        modifiable: false
        required: true
        export: true
      - key: postgresqlUsername
        description: Username
        name: Default DB Username
        type: String
        modifiable: false
        required: false
        export: true
        defaultValue: "app"
      - key: postgresqlDatabase
        description: Default Database Name
        name: Default Database Name
        type: String
        modifiable: false
        required: false
        export: true
        defaultValue: "app"
      - key: numberOfInstances
        description: Total Number of Instances
        name: Total Number of Instances
        type: Float64
        modifiable: true
        required: false
        export: true
        defaultValue: "1"
        limits:
          min: 1
      - key: storageSize
        description: Storage size for PostgreSQL data
        name: Storage Size
        type: String
        modifiable: true
        required: false
        export: true
        defaultValue: "20Gi"
    endpointConfiguration:
      writer:
        host: "$sys.network.externalClusterEndpoint"
        ports:
          - 5432
        primary: true
        networkingType: PUBLIC
      reader:
        host: "reader-{{ $sys.network.externalClusterEndpoint }}"
        ports:
          - 5432
        primary: false
        networkingType: PUBLIC
    operatorCRDConfiguration:
      template: |
        apiVersion: postgresql.cnpg.io/v1
        kind: Cluster
        metadata:
          name: {{ $sys.id }}
        spec:
          enablePDB: true
          bootstrap:
            initdb:
              owner: {{ $var.postgresqlUsername }}
              database: {{ $var.postgresqlDatabase }}
              secret:
                name: basic-auth
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: omnistrate.com/managed-by
                        operator: In
                        values:
                          - omnistrate
                      - key: topology.kubernetes.io/region
                        operator: In
                        values:
                          - {{ $sys.deploymentCell.region }}
                      - key: node.kubernetes.io/instance-type
                        operator: In
                        values:
                          - {{ $sys.compute.node.instanceType }}
                      - key: omnistrate.com/resource
                        operator: In
                        values:
                          - {{ $sys.deployment.resourceID }}
          instances: {{ $var.numberOfInstances }}
          storage:
            resizeInUseVolumes: true
            size: {{ $var.storageSize }}
            storageClass: gp3
          managed:
            services:
              additional:
                - selectorType: ro
                  serviceTemplate:
                    metadata:
                      name: "{{ $sys.id }}-cluster-ro"
                      annotations:
                        external-dns.alpha.kubernetes.io/hostname: reader-{{ $sys.network.externalClusterEndpoint }}
                        service.beta.kubernetes.io/aws-load-balancer-type: external
                        service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
                        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
                        service.beta.kubernetes.io/aws-load-balancer-subnets: "{{ $sys.deploymentCell.publicSubnetIDs[*].id }}"
                    spec:
                      type: LoadBalancer
                  updateStrategy: patch
                - selectorType: rw
                  serviceTemplate:
                    metadata:
                      name: "{{ $sys.id }}-cluster-rw"
                      annotations:
                        external-dns.alpha.kubernetes.io/hostname: {{ $sys.network.externalClusterEndpoint }}
                        service.beta.kubernetes.io/aws-load-balancer-type: external
                        service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
                        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
                        service.beta.kubernetes.io/aws-load-balancer-subnets: "{{ $sys.deploymentCell.publicSubnetIDs[*].id }}"
                    spec:
                      type: LoadBalancer
                  updateStrategy: patch
      supplementalFiles:
        - |
          # Basic auth using parameters
          apiVersion: v1
          kind: Secret
          metadata:
            name: basic-auth
            namespace: {{ $sys.id }}
          type: kubernetes.io/basic-auth
          data:
            username: {{ $func.base64encode($var.postgresqlUsername) }}
            password: {{ $func.base64encode($var.postgresqlPassword) }}
      readinessConditions:
        "$var._crd.status.phase": "Cluster in healthy state"
        '$var._crd.status.conditions[?(@.type=="Ready")].status': "True"
      outputParameters:
        "Postgres Container Image": "$var._crd.status.image"
        "Status": "$var._crd.status.phase"
        "Topology": "$var._crd.status.topology"
helmChartDependencies:
  - chartName: cloudnative-pg
    chartVersion: 0.26.0
    chartRepoName: cnpg
    chartRepoURL: https://cloudnative-pg.github.io/charts
```
Info
For more detailed information on pricing, metering, and `billingProviders` configuration, see End-to-End Billing and Usage Metering.
Anatomy of the Plan Specification¶
Let's break down the key sections of the spec.yaml.
apiParameters¶
This section defines the inputs your customers will provide when creating a new PostgreSQL instance. These parameters are then available in the `operatorCRDConfiguration` template using the `$var` prefix (e.g., `{{ $var.postgresqlPassword }}`).
```yaml
apiParameters:
  - key: postgresqlPassword
    description: Default DB Password
    name: Password
    type: Password
    required: true
  - key: numberOfInstances
    description: Total Number of Instances
    name: Total Number of Instances
    type: Float64
    defaultValue: "1"
  - key: storageSize
    description: Storage size for PostgreSQL data
    name: Storage Size
    type: String
    defaultValue: "20Gi"
```
helmChartDependencies¶
This is where you specify the Operator's Helm chart. Omnistrate will install this chart into the Kubernetes cluster before creating any instances of your SaaS Product.
```yaml
helmChartDependencies:
  - chartName: cloudnative-pg
    chartVersion: 0.26.0
    chartRepoName: cnpg
    chartRepoURL: https://cloudnative-pg.github.io/charts
```
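Step 3 of the walkthrough notes that `chartValues` can be used to override a chart's defaults. A minimal sketch of what that could look like for this dependency is below; the value keys shown (`replicaCount`, `resources`) are illustrative assumptions, so confirm them against the chart's own values.yaml before relying on them.

```yaml
helmChartDependencies:
  - chartName: cloudnative-pg
    chartVersion: 0.26.0
    chartRepoName: cnpg
    chartRepoURL: https://cloudnative-pg.github.io/charts
    # chartValues overrides the chart's default values; the keys below are
    # illustrative -- verify them against the chart's values.yaml.
    chartValues:
      replicaCount: 1
      resources:
        limits:
          memory: 512Mi
```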
operatorCRDConfiguration¶
This is the core of the integration. It tells Omnistrate how to interact with your Operator.
- `template`: This is a Go template for the Custom Resource (CR) that the Operator will manage. Here, we define a `Cluster` resource for the CNPG Operator. Notice the use of `{{ $var.variableName }}` for customer inputs and `{{ $sys.variableName }}` for system-provided values like the instance ID or network details.
- `supplementalFiles`: This allows you to create additional Kubernetes resources alongside the main CR. In this example, we create a `Secret` to hold the database credentials provided by the user. This secret is then referenced in the `bootstrap` section of the `Cluster` CR.
- `readinessConditions`: This tells Omnistrate how to determine whether the service instance is ready. It checks the `status` field of the CR. For CNPG, we wait for the `phase` to be `Cluster in healthy state`.
- `outputParameters`: This exposes fields from the CR's `status` back to the customer. This is useful for displaying information like the running PostgreSQL version or the current cluster status in the customer portal.
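To see what these conditions and outputs match against, it helps to inspect the `status` block the Operator writes on the CR. The excerpt below is an illustrative, trimmed example (field values are samples, not guaranteed output) showing the paths referenced by the `readinessConditions` and `outputParameters` above.

```yaml
# Illustrative, trimmed status excerpt from a CNPG Cluster
# (kubectl -n <instance-namespace> get cluster <name> -o yaml)
status:
  phase: Cluster in healthy state                 # matched by "$var._crd.status.phase"
  conditions:
    - type: Ready                                 # matched by the conditions[?(@.type=="Ready")].status check
      status: "True"
  image: ghcr.io/cloudnative-pg/postgresql:16     # surfaced by the "Postgres Container Image" output (example value)
```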
endpointConfiguration¶
This section defines the connection details that will be shown to your customers. The `host` field uses system variables to construct the public DNS endpoint for the writer and reader services created by the CNPG Operator. The `portExpressions` field lets you map ports dynamically using expressions, which is useful when you need to generate random ports or map them deterministically for load-balancer configurations.
```yaml
endpointConfiguration:
  writer:
    host: "$sys.network.externalClusterEndpoint"
    ports:
      - 5432
    portExpressions:
      - "{{ $func.randomminmax(10000, 20000, 42) }}"
      - "{{ $func.randomminmax(10000, 20000, 43) }}"
    primary: true
    networkingType: PUBLIC
  reader:
    host: "reader-{{ $sys.network.externalClusterEndpoint }}"
    ports:
      - 5432
    portExpressions:
      - "{{ $func.randomminmax(10000, 20000, 42) }}" # same seed for deterministic mapping
    primary: false
    networkingType: PUBLIC
```
Info
For a complete list of available functions and system parameters, see Build Guide / System Parameters and Evaluate Expressions.
Registering the SaaS Product¶
Once you have your spec.yaml file, you can build and register your SaaS Product using the Omnistrate CLI:
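```bash
omnistrate-ctl build -f spec.yaml --name 'PostgreSQL Server' --release-as-preferred --spec-type ServicePlanSpec
```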
This command will:
- Validate your Plan specification.
- Create the SaaS Product and a "PostgreSQL Server" Plan.
- Set up a development environment for you to test.
- Provide you with a URL to a dedicated Customer Portal for your new SaaS Product.
Deploying A PostgreSQL Instance¶
After registering the SaaS Product, you can use the auto-generated Customer Portal to deploy instances of your PostgreSQL SaaS Product. Your customers will be able to:
- Sign in to the portal.
- Choose the "PostgreSQL Server" plan.
- Select a cloud provider and region.
- Configure the parameters you defined in `apiParameters` (like password and storage size).
- Click "Create" to deploy their own isolated PostgreSQL cluster.
Omnistrate and the CNPG Operator handle the rest, and the customer will see the connection endpoints once the cluster is ready.
For more details on system parameters and advanced configurations, refer to the Plan Spec guide.