Workload autoscaling
Workload autoscaling is in public preview and disabled by default as of release 23.07. See the 23.07 release notes for instructions on enabling workload autoscaling during your upgrade.
Workload autoscaling lets Instabase autoscale data services based on demand. Autoscaling adjusts service resources to match the current workload, which improves efficiency and performance. Workload autoscaling also removes the need to manually size services and presents cost-saving opportunities.
Autoscaling is performed with Kubernetes HorizontalPodAutoscalers (HPAs) based on CPU usage for conversion-service, ocr-msft-lite, ocr-msft-v3, and ocr-service.
See the infrastructure requirements documentation for information on the required Kubernetes components.
Configure workload autoscaling
For deployments using Instabase standard resourcing sizing, Deployment Manager automatically determines and applies appropriate minReplicas and maxReplicas values for all autoscaled services. For deployments using custom resourcing sizing, however, you must set the minReplicas and maxReplicas values for the HPA services corresponding to the following autoscaled deployment services:
Deployment service | Corresponding HPA service |
---|---|
ocr-msft-v3 | autoscaler-ocr-msft-v3 |
ocr-msft-lite | autoscaler-ocr-msft-lite |
ocr-service | autoscaler-ocr-service |
conversion-service | autoscaler-conversion-service |
While the maxReplicas value can be set based on your preferred resourcing sizing, the minReplicas value must be calculated based on the number of celery-app-tasks pods in the deployment.
To adjust a service’s minReplicas and maxReplicas values, apply the following patch to each HPA service.
# target: <name of HPA service>
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
spec:
  maxReplicas: <max replicas>
  minReplicas: <min replicas>
For releases 23.07 and later, you can find sample patches for modifying HPA services in the release’s installation.zip file, in the custom-hpa-patches folder (installation > optional_patches > custom-hpa-patches). If you use these sample patches, you must still define the minReplicas and maxReplicas values.
Calculate HPA minReplicas for autoscaled services
To calculate minReplicas for an HPA service, use the following formulas, where n is the number of celery-app-tasks pods in your deployment. The ceil() function returns the smallest integer value that’s greater than or equal to the calculated number.
Deployment service | Corresponding HPA service | minReplicas formula |
---|---|---|
ocr-msft-v3 | autoscaler-ocr-msft-v3 | ceil(0.28 * n) |
ocr-msft-lite | autoscaler-ocr-msft-lite | ceil(0.57 * n) |
ocr-service | autoscaler-ocr-service | ceil(0.28 * n) |
conversion-service | autoscaler-conversion-service | ceil(0.28 * n) |
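The formulas above can be sketched as a small Python helper. This is an illustrative sketch only: the function name min_replicas and the MIN_REPLICA_PERCENT mapping are hypothetical and not part of any Instabase tooling. The coefficients are stored as whole percentages so the ceiling can be computed with integer arithmetic, because floating-point products such as 0.28 * 25 can land marginally above the exact value and round up one replica too many.

```python
# Coefficients from the minReplicas formula table, stored as whole
# percentages (0.28 -> 28) so the ceiling is computed without floats.
# The mapping and function names are hypothetical, for illustration only.
MIN_REPLICA_PERCENT = {
    "autoscaler-ocr-msft-v3": 28,
    "autoscaler-ocr-msft-lite": 57,
    "autoscaler-ocr-service": 28,
    "autoscaler-conversion-service": 28,
}


def min_replicas(hpa_service: str, n: int) -> int:
    """Return ceil(coefficient * n), where n is the number of celery-app-tasks pods."""
    if n < 0:
        raise ValueError("n must be a non-negative pod count")
    percent = MIN_REPLICA_PERCENT[hpa_service]
    # Integer ceiling division: ceil(percent * n / 100).
    return (percent * n + 99) // 100


if __name__ == "__main__":
    pods = 10  # example pod count; replace with your deployment's value
    for service in MIN_REPLICA_PERCENT:
        print(f"{service}: minReplicas = {min_replicas(service, pods)}")
```

For example, with 10 celery-app-tasks pods, min_replicas("autoscaler-ocr-msft-lite", 10) returns 6 and min_replicas("autoscaler-ocr-msft-v3", 10) returns 3.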
Disable autoscaling
You can disable autoscaling at any time. The process differs depending on whether your deployment uses custom resourcing sizing or standard Instabase resourcing sizing.
To disable autoscaling in deployments using standard Instabase resourcing sizing:
- Disable the ENABLE_AUTOSCALING environment variable:
  - On the command line, run the following command, where $IB_NS is your Instabase namespace:
    kubectl edit deployment/deployment-control-plane -n $IB_NS
  - Locate the ENABLE_AUTOSCALING environment variable.
  - Set the value to False.
  - Save your changes.
- Call the Push latest materialized configs to cluster API to redeploy the deployment with autoscaling disabled.
- Call the Update cluster size API to reset your resourcing sizing.
- For all autoscaled services, set the corresponding HPA service to a static range by applying a patch that:
  - Sets maxReplicas to your desired value. This can be the same maxReplicas value the resource had before autoscaling was enabled, or a new value.
  - Sets the minReplicas value to the same value as maxReplicas.
  For example:
    # target: <name of HPA service>
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    spec:
      maxReplicas: <n>
      minReplicas: <n>
To disable autoscaling in deployments using custom resourcing sizing:
- Disable the ENABLE_AUTOSCALING environment variable:
  - On the command line, run the following command:
    kubectl edit deployment/deployment-control-plane -n $IB_NS
  - Locate the ENABLE_AUTOSCALING environment variable.
  - Set the value to False.
  - Save your changes.
- Call the Push latest materialized configs to cluster API to redeploy the deployment with autoscaling disabled.
- For all autoscaled services, set the corresponding HPA service to a static range by applying a patch that:
  - Sets maxReplicas to your desired value. This can be the same maxReplicas value the resource had before autoscaling was enabled, or a new value.
  - Sets the minReplicas value to the same value as maxReplicas.
  For example:
    # target: <name of HPA service>
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    spec:
      maxReplicas: <n>
      minReplicas: <n>
Verify autoscaling configuration changes
You can verify that patches targeting HPAs were applied successfully using the HPAs tab of the Deployment Manager infra dashboard.
To confirm that an HPA’s replica count has updated successfully:
- Open the Deployment Manager HPAs tab (All apps > Deployment Manager > Infra Dashboard > HPAs).
- On the Horizontal Pod Autoscalers dashboard, select the updated HPA.
- Confirm that the General Info section lists the correct Min Replicas and Max Replicas values.
To confirm that an HPA is active:
- Open the Deployment Manager HPAs tab (All apps > Deployment Manager > Infra Dashboard > HPAs).
- On the Horizontal Pod Autoscalers dashboard, select the updated HPA.
- Verify that the Conditions table includes an AbleToScale condition. This condition means that CPU metrics are available and the HPA is active.
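If you have kubectl access to the cluster, the same two checks can also be sketched from the command line. The Python below is a minimal, unofficial sketch: the function names hpa_problems and fetch_hpa are hypothetical, it assumes the Kubernetes HPA object has the same name as the HPA service (for example, autoscaler-ocr-service), and it assumes an active HPA reports the AbleToScale condition with status True.

```python
import json
import subprocess


def hpa_problems(hpa: dict, expected_min: int, expected_max: int) -> list[str]:
    """Return a list of mismatches; an empty list means the HPA looks as expected."""
    problems = []
    spec = hpa.get("spec", {})
    if spec.get("minReplicas") != expected_min:
        problems.append(
            f"minReplicas is {spec.get('minReplicas')}, expected {expected_min}"
        )
    if spec.get("maxReplicas") != expected_max:
        problems.append(
            f"maxReplicas is {spec.get('maxReplicas')}, expected {expected_max}"
        )
    # Assumption: an active HPA lists AbleToScale with status "True".
    conditions = hpa.get("status", {}).get("conditions", [])
    if not any(
        c.get("type") == "AbleToScale" and c.get("status") == "True"
        for c in conditions
    ):
        problems.append("no AbleToScale condition with status True")
    return problems


def fetch_hpa(name: str, namespace: str) -> dict:
    """Read one HPA object as JSON through kubectl (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "hpa", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

A call such as hpa_problems(fetch_hpa("autoscaler-ocr-service", namespace), 3, 10), where namespace is your Instabase namespace, returns an empty list when the replica range and condition match. The Deployment Manager dashboard remains the documented way to verify changes.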