AVD Autoscale and Cost Optimisation

The biggest AVD bill-shock comes from running every session host 24×7 when your users work 8×5. Autoscale scaling plans fix that by starting and stopping hosts on a schedule and on demand — often cutting compute cost by half or more without users noticing.
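The arithmetic behind that claim is easy to check. The sketch below assumes a hypothetical ten-host pool that runs in full for a 10-hour window on five working days and keeps a two-host floor the rest of the week. All figures are illustrative, not measurements.

```python
HOURS_PER_WEEK = 24 * 7  # 168

def weekly_host_hours(total_hosts, floor_hosts, on_hours_per_day, work_days):
    """Host-hours per week when every host runs during the working window
    and only the floor runs for the remaining hours."""
    busy_hours = on_hours_per_day * work_days
    idle_hours = HOURS_PER_WEEK - busy_hours
    return total_hosts * busy_hours + floor_hosts * idle_hours

always_on = 10 * HOURS_PER_WEEK            # 1680 host-hours, everything 24x7
scaled = weekly_host_hours(10, 2, 10, 5)   # 10*50 + 2*118 = 736 host-hours
saving = 1 - scaled / always_on

print(f"{saving:.0%}")  # 56%
```

Real scaling plans follow demand rather than a fixed window, so treat this as an upper-level estimate of what a schedule alone could save.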

Where AVD money actually goes

Compute — session host VM hours. By far the largest and most controllable lever.
Storage — profile and image storage; smaller, steadier.
Licensing — typically covered by eligible M365/Windows licences.
Egress/extras — usually minor.

Because compute dominates and is the easiest to flex, autoscale is the highest-leverage optimisation you can apply.

Scaling plans in plain terms

A scaling plan keeps more hosts on during peak hours and a minimal floor overnight.

A scaling plan attaches to one or more pooled host pools and divides the day into phases. In each phase AVD decides how many hosts to keep on based on a capacity threshold and a minimum percentage of hosts:

Phase	What it does
Ramp-up	Morning: bring hosts online ahead of demand so logons are fast
Peak	Business hours: keep enough capacity for full load
Ramp-down	Evening: drain and shut down hosts as users leave
Off-peak	Overnight/weekend: run a minimal floor of hosts

Two key dials

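The two settings that matter most are the capacity threshold (the percentage of used session capacity that triggers more hosts) and the minimum percentage of hosts on (your floor). The capacity threshold is set for every phase; the minimum percentage of hosts is set for ramp-up and ramp-down. Tune both to trade responsiveness against cost.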
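To see how the two dials interact, here is a deliberately simplified model. It is not Microsoft's actual algorithm; the function name and the rounding behaviour are assumptions made for illustration.

```python
import math

def hosts_needed(active_sessions, max_sessions_per_host, total_hosts,
                 capacity_threshold_pct, min_hosts_pct):
    """Simplified model: keep enough hosts on that used capacity stays at or
    below the threshold, and never drop under the minimum-hosts floor."""
    floor = math.ceil(total_hosts * min_hosts_pct / 100)
    usable_per_host = max_sessions_per_host * capacity_threshold_pct / 100
    for_load = math.ceil(active_sessions / usable_per_host)
    return min(total_hosts, max(floor, for_load))

# 45 sessions on a 10-host pool, 10 sessions per host, 20% floor
print(hosts_needed(45, 10, 10, 60, 20))  # 8  (lower threshold, more headroom)
print(hosts_needed(45, 10, 10, 90, 20))  # 5  (higher threshold, cheaper)
print(hosts_needed(0, 10, 10, 60, 20))   # 2  (the floor applies with no load)
```

A lower threshold buys headroom for logon bursts at the price of extra running hosts; a higher one does the opposite.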

Graceful drain, not a hard stop

During ramp-down, autoscale doesn’t kill active sessions. It puts hosts into drain mode (no new sessions), optionally notifies signed-in users, waits a grace period, then deallocates empty hosts. You control the forced-logoff behaviour for stragglers.
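The drain sequence can be sketched as a single pass over the pool. This is a toy model under stated assumptions: the Host class and ramp_down_step function are hypothetical names, and force_logoff=True stands for the state after the notification and grace period have already elapsed.

```python
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    sessions: int
    draining: bool = False

def ramp_down_step(hosts, hosts_to_keep, force_logoff=False):
    """Drain surplus hosts (fewest sessions first) and return the names of
    hosts that can be deallocated in this pass."""
    by_load = sorted(hosts, key=lambda h: h.sessions)
    surplus = by_load[:max(0, len(hosts) - hosts_to_keep)]
    to_deallocate = []
    for host in surplus:
        host.draining = True  # drain mode: no new sessions land here
        if host.sessions == 0 or force_logoff:
            to_deallocate.append(host.name)
    return to_deallocate

pool = [Host("vm-0", 5), Host("vm-1", 0), Host("vm-2", 1)]
print(ramp_down_step(pool, hosts_to_keep=1))  # ['vm-1']
```

Here vm-2 is drained but stays up because one user is still signed in; it becomes eligible only once that session ends or forced logoff is enabled.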

Right-sizing the hosts themselves

Autoscale decides how many hosts run; right-sizing decides how efficient each one is. Two questions to revisit with real telemetry:

Is the VM SKU correct? Watch CPU, memory and disk via Azure Monitor / AVD Insights. Persistently low utilisation means you’re paying for headroom you don’t use.
Is users-per-host tuned? Too many users per host hurts experience; too few wastes money. Adjust the max-session-limit and SKU together.
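One way to turn "persistently low utilisation" into a repeatable check is to look at the 95th percentile rather than the average. The sketch below assumes you have exported CPU and memory percentages as plain lists; the 50% ceiling is an arbitrary starting point, not a recommendation.

```python
from statistics import quantiles

def p95(samples):
    """95th percentile of a list of utilisation percentages."""
    return quantiles(samples, n=20)[18]

def looks_oversized(cpu_pct, mem_pct, ceiling=50.0):
    """True when both CPU and memory stay under the ceiling even at p95,
    which suggests a smaller SKU or more users per host may be viable."""
    return p95(cpu_pct) < ceiling and p95(mem_pct) < ceiling
```

A host is flagged only when both metrics are low, because a SKU that is idle on CPU but tight on memory is not a candidate for downsizing.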

Levers beyond autoscale

Reserved Instances / savings plans for the always-on floor of hosts you know you’ll run.
Ephemeral OS disks for stateless pooled hosts — cheaper and faster to rebuild.
Right-tier storage — Premium where profiles need it, Standard where they don’t.
Image hygiene — smaller, well-maintained images deallocate and start faster.
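Why reservations suit only the always-on floor comes down to a break-even. A reservation is billed for every hour of the week whether the host runs or not, while pay-as-you-go bills only the hours it is on. The sketch below makes that comparison; the 40% discount in the usage lines is a hypothetical figure, so substitute your own rate.

```python
def reservation_pays_off(hours_on_per_week, discount_pct):
    """Compare a week of reserved cost with a week of pay-as-you-go cost,
    both expressed in hundredths of the hourly pay-as-you-go rate."""
    reserved_cost = 168 * (100 - discount_pct)  # billed for all 168 hours
    payg_cost = hours_on_per_week * 100         # billed only while running
    return reserved_cost < payg_cost

print(reservation_pays_off(168, 40))  # True: a floor host that never stops
print(reservation_pays_off(50, 40))   # False: a host autoscale turns off
```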

Measure before and after

Capture a week of cost and utilisation before enabling a scaling plan, then again after. The comparison both proves the saving and shows whether your thresholds are too aggressive (slow logons) or too generous (wasted hosts).

A pragmatic first scaling plan

Start conservative: a modest ramp-up before business hours, a peak phase with a comfortable buffer, a ramp-down an hour after typical finish with user notifications, and an off-peak floor of one or two hosts for late workers. Watch logon times for a week, then tighten.
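Written down as data, such a plan might look like the sketch below. The key names are made up for readability and are not the Azure resource schema. The times assume a 09:00 to 17:00 working day, and every percentage is a placeholder to tune against your own logon times.

```python
from datetime import time

first_plan = {
    "ramp_up":   {"start": time(7, 0), "capacity_threshold_pct": 60,
                  "min_hosts_pct": 20},
    "peak":      {"start": time(9, 0), "capacity_threshold_pct": 60},
    "ramp_down": {"start": time(18, 0), "capacity_threshold_pct": 80,
                  "min_hosts_pct": 10, "notify_users": True,
                  "wait_minutes": 30},
    "off_peak":  {"start": time(20, 0), "capacity_threshold_pct": 90},
}

def phases_in_order(plan):
    """Sanity check: the four phases start in chronological order with no
    two phases sharing a start time."""
    order = ("ramp_up", "peak", "ramp_down", "off_peak")
    starts = [plan[phase]["start"] for phase in order]
    return starts == sorted(starts) and len(set(starts)) == len(starts)

print(phases_in_order(first_plan))  # True
```

On a ten-host pool the 10% ramp-down floor works out to one host, in line with the one-or-two-host floor suggested above.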

Autoscale is the rare optimisation that improves both the bill and the platform — fewer idle hosts, fresher hosts each morning, and a clear cost story for the business. Set it up early and revisit the thresholds with real data.
