Workload autoscaling

Note

Workload autoscaling is in public preview and disabled by default as of release 23.07. See the 23.07 release notes for instructions on enabling workload autoscaling during your upgrade.

Workload autoscaling lets Instabase scale data services up and down based on demand. Autoscaling adjusts service resources to match the current workload, which improves efficiency and performance. It also removes the need to size services manually and can reduce costs.

Autoscaling is performed with Kubernetes HorizontalPodAutoscalers (HPAs) based on CPU usage for conversion-service, ocr-msft-lite, ocr-msft-v3, and ocr-service.

Info

See the infrastructure requirements documentation for information on the required Kubernetes components.

Configure workload autoscaling

For deployments using standard Instabase resourcing sizing, Deployment Manager automatically determines and applies appropriate minReplicas and maxReplicas values for all autoscaled services. For deployments using custom resourcing sizing, you must set the minReplicas and maxReplicas values yourself for the HPA service that corresponds to each of the following autoscaled deployment services:

Deployment service	Corresponding HPA service
ocr-msft-v3	autoscaler-ocr-msft-v3
ocr-msft-lite	autoscaler-ocr-msft-lite
ocr-service	autoscaler-ocr-service
conversion-service	autoscaler-conversion-service

While the maxReplicas value can be set based on your preferred resourcing sizing, the minReplicas value must be calculated based on the number of celery-app-tasks pods in the deployment.

To adjust a service’s minReplicas and maxReplicas values, apply the following patch to each HPA service.

```
# target: <name of HPA service>
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
spec:
  maxReplicas: <max replicas>
  minReplicas: <min replicas>
```
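
If you maintain several of these patches, it can help to generate them from one place. The following is a minimal sketch, not part of the Instabase tooling: render_hpa_patch is a hypothetical helper that fills in the patch shown above and rejects a range where maxReplicas is lower than minReplicas. The service name and replica counts in the usage lines are illustrative assumptions.

```python
def render_hpa_patch(hpa_service: str, min_replicas: int, max_replicas: int) -> str:
    """Return the patch text for one HPA service (hypothetical helper)."""
    # Sanity checks chosen for this sketch: positive counts, and a max
    # that is not below the min.
    if min_replicas < 1:
        raise ValueError("minReplicas must be at least 1")
    if max_replicas < min_replicas:
        raise ValueError("maxReplicas must be >= minReplicas")
    return (
        f"# target: {hpa_service}\n"
        "apiVersion: autoscaling/v2beta2\n"
        "kind: HorizontalPodAutoscaler\n"
        "spec:\n"
        f"  maxReplicas: {max_replicas}\n"
        f"  minReplicas: {min_replicas}\n"
    )


if __name__ == "__main__":
    # Illustrative values only; choose counts that fit your own sizing.
    print(render_hpa_patch("autoscaler-ocr-service", 3, 8))
```

The helper only produces text. Applying the patch still follows the same process as any other patch in your deployment.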

Tip

For releases 23.07 and later, you can find sample patches for modifying HPA services in the release’s installation.zip file, in the custom-hpa-patches folder (installation > optional_patches > custom-hpa-patches). If you use these sample patches, you must still define the minReplicas and maxReplicas values.

Calculate HPA minReplicas for autoscaled services

To calculate minReplicas for an HPA service, use the following formulas, where n is the number of celery-app-tasks pods in your deployment:

Info

The ceil() function returns the smallest integer value that’s greater than or equal to the calculated number.

Deployment service	Corresponding HPA service	minReplicas formula
ocr-msft-v3	autoscaler-ocr-msft-v3	ceil(0.28 * n)
ocr-msft-lite	autoscaler-ocr-msft-lite	ceil(0.57 * n)
ocr-service	autoscaler-ocr-service	ceil(0.28 * n)
conversion-service	autoscaler-conversion-service	ceil(0.28 * n)
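
The formulas in the table can be evaluated with a few lines of Python. This is a minimal sketch under stated assumptions: the function name min_replicas and the pod count n = 10 are illustrative, and the coefficients are stored as whole percentages (28 and 57) so that the ceiling is computed with integer arithmetic. That avoids cases where a binary floating-point product lands slightly above a whole number and rounds up one replica too many.

```python
# Coefficients from the table above, as whole percentages, keyed by HPA service.
MIN_REPLICA_PERCENT = {
    "autoscaler-ocr-msft-v3": 28,
    "autoscaler-ocr-msft-lite": 57,
    "autoscaler-ocr-service": 28,
    "autoscaler-conversion-service": 28,
}


def min_replicas(hpa_service: str, n: int) -> int:
    """Return ceil(coefficient * n), where n is the celery-app-tasks pod count."""
    if n < 0:
        raise ValueError("n must be non-negative")
    percent = MIN_REPLICA_PERCENT[hpa_service]
    # Integer ceiling division: equals ceil(percent * n / 100) with no float rounding.
    return -(-percent * n // 100)


if __name__ == "__main__":
    n = 10  # hypothetical number of celery-app-tasks pods
    for name in MIN_REPLICA_PERCENT:
        print(f"{name}: minReplicas = {min_replicas(name, n)}")
```

With n = 10, the services that use 0.28 get ceil(2.8) = 3 and ocr-msft-lite gets ceil(5.7) = 6.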

Disable autoscaling

You can disable autoscaling at any time. The process depends on whether your deployment uses standard Instabase resourcing sizing or custom resourcing sizing.

To disable autoscaling in deployments using standard Instabase resourcing sizing:

1. Disable the ENABLE_AUTOSCALING environment variable:
   a. On the command line, run the following command: kubectl edit deployment/deployment-control-plane -n $IB_NS, where $IB_NS is your Instabase namespace.
   b. Locate the ENABLE_AUTOSCALING environment variable.
   c. Set the value to False.
   d. Save your changes.
2. Call the Push latest materialized configs to cluster API to redeploy the deployment with autoscaling disabled.
3. Call the Update cluster size API to reset your resourcing sizing.
4. For all autoscaled services, set the corresponding HPA service to a static range by applying a patch that:
   - Sets maxReplicas to your desired value. This can be the same maxReplicas value the resource had before autoscaling was enabled, or a new value.
   - Sets minReplicas to the same value as maxReplicas.
   For example:
```
# target: <name of HPA service>
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
spec:
  maxReplicas: <n>
  minReplicas: <n>
```

To disable autoscaling in deployments using custom resourcing sizing:

1. Disable the ENABLE_AUTOSCALING environment variable:
   a. On the command line, run the following command: kubectl edit deployment/deployment-control-plane -n $IB_NS, where $IB_NS is your Instabase namespace.
   b. Locate the ENABLE_AUTOSCALING environment variable.
   c. Set the value to False.
   d. Save your changes.
2. Call the Push latest materialized configs to cluster API to redeploy the deployment with autoscaling disabled.
3. For all autoscaled services, set the corresponding HPA service to a static range by applying a patch that:
   - Sets maxReplicas to your desired value. This can be the same maxReplicas value the resource had before autoscaling was enabled, or a new value.
   - Sets minReplicas to the same value as maxReplicas.
   For example:
```
# target: <name of HPA service>
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
spec:
  maxReplicas: <n>
  minReplicas: <n>
```
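
Because the static-range patch is needed for every autoscaled service, the patches can be produced in one pass. The following is a minimal sketch, assuming the HPA service name is the deployment service name prefixed with autoscaler-, as in the mapping table earlier on this page. The function name static_range_patches and the replica counts are hypothetical.

```python
from typing import Dict

# Autoscaled deployment services listed earlier on this page.
AUTOSCALED_SERVICES = (
    "ocr-msft-v3",
    "ocr-msft-lite",
    "ocr-service",
    "conversion-service",
)


def static_range_patches(replicas: Dict[str, int]) -> Dict[str, str]:
    """Map each HPA service name to a patch with minReplicas == maxReplicas."""
    patches = {}
    for service in AUTOSCALED_SERVICES:
        count = replicas[service]  # raises KeyError if a service was left out
        if count < 1:
            raise ValueError(f"{service}: replica count must be at least 1")
        hpa_service = f"autoscaler-{service}"
        patches[hpa_service] = (
            f"# target: {hpa_service}\n"
            "apiVersion: autoscaling/v2beta2\n"
            "kind: HorizontalPodAutoscaler\n"
            "spec:\n"
            f"  maxReplicas: {count}\n"
            f"  minReplicas: {count}\n"
        )
    return patches


if __name__ == "__main__":
    # Hypothetical fixed replica counts, one per deployment service.
    desired = {
        "ocr-msft-v3": 4,
        "ocr-msft-lite": 6,
        "ocr-service": 4,
        "conversion-service": 4,
    }
    for text in static_range_patches(desired).values():
        print(text)
```

Requiring an entry for every service makes it harder to leave one HPA with its autoscaling range still in place.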

Verify autoscaling configuration changes

You can verify that patches targeting HPAs were applied successfully on the HPAs tab of the Deployment Manager Infra Dashboard.

To confirm that an HPA’s replica count has updated successfully:

1. Open the Deployment Manager HPAs tab (All apps > Deployment Manager > Infra Dashboard > HPAs).
2. On the Horizontal Pod Autoscalers dashboard, select the updated HPA.
3. Confirm that the General Info section lists the correct Min Replicas and Max Replicas values.

To confirm that an HPA is active:

1. Open the Deployment Manager HPAs tab (All apps > Deployment Manager > Infra Dashboard > HPAs).
2. On the Horizontal Pod Autoscalers dashboard, select the updated HPA.
3. Verify that the Conditions table includes an AbleToScale condition. This condition means that CPU metrics are available and the HPA is active.
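
If you also have kubectl access to the cluster, the same fields can be read from the HPA object. The following is a minimal sketch, assuming you have saved the output of kubectl get hpa <name> -n $IB_NS -o json. The function name summarize_hpa and the sample JSON are made up for illustration, and treating a status of "True" as the passing state is this sketch's assumption.

```python
import json


def summarize_hpa(hpa: dict) -> dict:
    """Extract the replica range and AbleToScale state from an HPA object."""
    spec = hpa.get("spec", {})
    conditions = hpa.get("status", {}).get("conditions", [])
    able_to_scale = any(
        c.get("type") == "AbleToScale" and c.get("status") == "True"
        for c in conditions
    )
    return {
        "minReplicas": spec.get("minReplicas"),
        "maxReplicas": spec.get("maxReplicas"),
        "ableToScale": able_to_scale,
    }


# Made-up sample, trimmed to the fields this sketch reads.
SAMPLE_HPA_JSON = """
{
  "spec": {"minReplicas": 3, "maxReplicas": 8},
  "status": {
    "conditions": [
      {"type": "AbleToScale", "status": "True"},
      {"type": "ScalingActive", "status": "True"}
    ]
  }
}
"""

if __name__ == "__main__":
    print(summarize_hpa(json.loads(SAMPLE_HPA_JSON)))
```

The sketch reads only spec.minReplicas, spec.maxReplicas, and status.conditions, which mirror the Min Replicas, Max Replicas, and Conditions entries checked in the steps above.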