This page shows how to manually scale a Deployment horizontally, by changing its replica count. Manual scaling lets you directly control the number of running Pods for predictable load changes or cost management.
This is different from vertical scaling, which keeps the replica count the same but adjusts the amount of resources available to each Pod.
You need a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with it. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube, or you can use a Kubernetes playground.
You need an existing Deployment. If you do not have one, and you just want to practice, you can create the nginx Deployment from Run a Stateless Application Using a Deployment:
kubectl apply -f https://k8s.io/examples/application/deployment.yaml
Verify the Deployment runs two Pods:
kubectl get deployment nginx-deployment
The output is similar to:
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 2/2 2 2 10s
You can change the replica count of an existing Deployment in several ways.
kubectl scale
Use kubectl scale to set the replica count:
kubectl scale deployment/nginx-deployment --replicas=4
The output is similar to:
deployment.apps/nginx-deployment scaled
Verify that the Deployment has four Pods:
kubectl get deployment nginx-deployment
The output is similar to:
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 4/4 4 4 1m
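In a script, checking the output by eye is not practical. The following is a minimal sketch that scales a Deployment and then waits for it to settle, using kubectl rollout status. The function name scale_and_wait and the 120 second timeout are assumptions made for illustration, not part of this tutorial.

```shell
# Sketch: scale a Deployment, then wait until the change has rolled out.
# scale_and_wait is a hypothetical helper name; the timeout is arbitrary.
scale_and_wait() {
  deployment="$1"
  replicas="$2"

  # Set the desired replica count.
  kubectl scale "deployment/${deployment}" --replicas="${replicas}" || return 1

  # Block until the Deployment reports the rollout as complete,
  # or fail when the timeout expires.
  kubectl rollout status "deployment/${deployment}" --timeout=120s
}
```

Calling scale_and_wait nginx-deployment 4 would return a non-zero exit status if either command fails, so a calling script can stop early.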
kubectl apply
Instead of running an imperative command, you can update the manifest file and apply it. This approach fits well with version-controlled configuration workflows.
Save the current Deployment configuration to a local file:
kubectl get deployment nginx-deployment -o yaml > /tmp/nginx-deployment.yaml
Edit /tmp/nginx-deployment.yaml and change .spec.replicas to 4.
Before applying, compare your local changes against the cluster state:
kubectl diff -f /tmp/nginx-deployment.yaml
Apply the edited manifest:
kubectl apply -f /tmp/nginx-deployment.yaml
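The diff and apply steps can also be combined in a script. The sketch below relies on the documented exit codes of kubectl diff (0 for no differences, 1 for differences, greater than 1 for an error). The function name apply_if_changed is hypothetical.

```shell
# Sketch: apply a manifest only when kubectl diff reports a difference.
# apply_if_changed is a hypothetical helper name.
apply_if_changed() {
  manifest="$1"

  # Capture the exit code without aborting scripts that use "set -e".
  status=0
  kubectl diff -f "${manifest}" || status=$?

  case "${status}" in
    0) echo "No changes to apply." ;;
    1) kubectl apply -f "${manifest}" ;;
    *) echo "kubectl diff failed with exit code ${status}" >&2
       return "${status}" ;;
  esac
}
```

For example, apply_if_changed /tmp/nginx-deployment.yaml would print the diff and then apply the manifest only if the replica count (or anything else) differs from the cluster state.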
To reduce the number of Pods, set --replicas to a lower value:
kubectl scale deployment/nginx-deployment --replicas=2
Kubernetes gracefully terminates the excess Pods, respecting each Pod's terminationGracePeriodSeconds setting.
Verify that the Deployment has two Pods:
kubectl get pods -l app=nginx
The output is similar to:
NAME READY STATUS RESTARTS AGE
nginx-deployment-66b6c48dd5-7gl6h 1/1 Running 0 2m
nginx-deployment-66b6c48dd5-v8mkd 1/1 Running 0 2m
You can scale a Deployment to zero to temporarily suspend the workload without deleting the Deployment itself:
kubectl scale deployment/nginx-deployment --replicas=0
Verify that no Pods are running:
kubectl get deployment nginx-deployment
The output is similar to:
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 0/0 0 0 5m
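Scaling to zero discards the previous replica count, so a script that suspends a workload may want to record it first. The following sketch saves the count to a local file and reads it back later. The function names and the state file are assumptions for illustration only.

```shell
# Sketch: remember the replica count before scaling to zero, restore it later.
# suspend_deployment and resume_deployment are hypothetical helper names.
suspend_deployment() {
  deployment="$1"
  state_file="$2"

  # Save the current desired replica count to the state file.
  kubectl get deployment "${deployment}" \
    -o jsonpath='{.spec.replicas}' > "${state_file}" || return 1

  # Scale to zero; the Deployment object itself is kept.
  kubectl scale "deployment/${deployment}" --replicas=0
}

resume_deployment() {
  deployment="$1"
  state_file="$2"

  # Read back the saved replica count and scale to it.
  replicas="$(cat "${state_file}")" || return 1
  kubectl scale "deployment/${deployment}" --replicas="${replicas}"
}
```

A possible use is suspend_deployment nginx-deployment /tmp/nginx-replicas followed later by resume_deployment nginx-deployment /tmp/nginx-replicas.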
To resume the workload, set --replicas to a positive number.
Common use cases for scaling to zero include:
In addition to kubectl scale, you can change .spec.replicas with kubectl edit or kubectl patch.
kubectl edit
kubectl edit deployment nginx-deployment
Change the .spec.replicas field in the editor, then save and exit.
kubectl patch
You can update .spec.replicas with a strategic merge patch:
kubectl patch deployment nginx-deployment -p '{"spec":{"replicas":4}}'
For scripting, use a JSON patch with a prerequisite test. The following command sets the replica count to 4, but only if the current count is 2:
kubectl patch deployment nginx-deployment --type=json -p='[
{"op": "test", "path": "/spec/replicas", "value": 2},
{"op": "replace", "path": "/spec/replicas", "value": 4}
]'
The test operation causes the patch to fail if the current value does not match, which prevents unintended changes when multiple people or scripts modify the same Deployment.
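In a script, the guarded patch can be wrapped so that the caller sees a clear result when the precondition does not hold. This is a minimal sketch; the function name scale_if_current and its messages are hypothetical.

```shell
# Sketch: change the replica count only if it currently has an expected value.
# scale_if_current is a hypothetical helper name.
scale_if_current() {
  deployment="$1"
  expected="$2"
  desired="$3"

  # Build the JSON patch: the test operation must succeed before the
  # replace operation is applied.
  patch="$(printf '[{"op":"test","path":"/spec/replicas","value":%d},{"op":"replace","path":"/spec/replicas","value":%d}]' \
    "${expected}" "${desired}")"

  if kubectl patch deployment "${deployment}" --type=json -p="${patch}"; then
    echo "Scaled ${deployment} from ${expected} to ${desired} replicas."
  else
    echo "Patch failed; ${deployment} may not have had ${expected} replicas." >&2
    return 1
  fi
}
```

For example, scale_if_current nginx-deployment 2 4 mirrors the command above and returns a non-zero exit status when the patch is rejected.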
| Aspect | Manual scaling | Automatic scaling (HPA) |
|---|---|---|
| Best for | Predictable, scheduled, or one-off load changes | Variable or unpredictable demand |
| How it works | You set .spec.replicas directly | HPA adjusts replicas based on observed metrics |
| Response time | Immediate when you run the command | Reacts to metrics with a short delay |
| Metrics awareness | None — you decide the replica count | Monitors CPU, memory, or custom metrics |
| Maintenance | Requires manual intervention to adjust | Runs autonomously after configuration |
Delete the Deployment:
kubectl delete deployment nginx-deployment