You don't need to be a Heroku expert to set up autoscaling. Surprisingly, 63 percent of Heroku app owners do not have autoscaling enabled, even though adding it to your app takes about 15 minutes. Let's go over how to set up autoscaling on your application so you can start saving money right away.
Autoscaling is a great feature for increasing the availability of your applications. It automatically scales your dynos up or down in response to traffic, so you can run more dynos during periods of high traffic and fewer dynos during periods of low traffic, saving money while maintaining availability.
Is Heroku's Autoscaling a good fit for me?
Heroku provides dyno autoscaling, but its solution has limits. Let's take a look at those limits before we get started.
Do you have performance dynos?
If you do, Heroku will allow your application to autoscale. Unfortunately, Heroku does not offer autoscaling for standard dynos at this time. An alternative that supports autoscaling for standard dynos is discussed below.
Do all HTTP requests take the same amount of time?
If so, Heroku's autoscaling would be a good fit. Heroku's autoscaling metric is response time rather than queue time, so if your requests have a wide range of backend latency (for example, 200-500 ms), it is likely to generate a lot of false positives, costing you more dyno time.
If you have this problem, check out the alternative detailed below, which can handle autoscaling with high variance in backend latency.
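To see why response time can mislead, here is a minimal sketch in Python. The traffic numbers, the 400 ms threshold, and the `p95` helper are all made up for illustration; this is not Heroku's implementation. Requests barely wait in the queue, yet the p95 response time still crosses the threshold because a few requests are slow in the backend.

```python
def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of millisecond values."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) using integer math
    return ordered[rank - 1]

# Hypothetical traffic: 90 fast requests and 10 slow ones, all with a short queue wait.
queue_ms = [5] * 100
backend_ms = [200] * 90 + [500] * 10
response_ms = [q + b for q, b in zip(queue_ms, backend_ms)]

THRESHOLD_MS = 400  # hypothetical desired p95 response time

# Response time suggests scaling up, even though the dynos are not backed up.
print(p95(response_ms) > THRESHOLD_MS)  # True
print(p95(queue_ms))                    # 5
```

In this example, adding dynos would not make the slow requests any faster, which is the kind of false positive described above.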
Are you running in a private space?
If you answered yes, you're in luck. Heroku can autoscale apps that are hosted in a Private Space. If you don't have access to a Private Space, look into the alternative option discussed below.
Navigate to the Resources tab of your application on the Heroku Dashboard.
Click on the Enable Autoscaling button next to your web dyno details.
Select the upper and lower bounds for your application's autoscale range. A preview of the monthly cost range will appear below the dyno range. Your dyno count will always be scaled to a quantity inside the range you specify.
Next, set the desired p95 Response Time for your application. This value is the threshold used by Heroku's autoscaling engine to determine how to scale your dynos. A recommended p95 response time is listed below this field.
Then, if you would like to receive an email notification when your application is autoscaled to its maximum dyno count, check the box. This notification is sent at most once per day.
Finally, click Confirm to save your autoscale configuration. This will immediately adjust the dynos of your application to conform to the autoscale configuration.
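The monthly cost preview mentioned above is simple arithmetic on the dyno range. Here is a rough sketch in Python; the function name and the price used in the example are placeholders, not figures taken from Heroku's pricing page.

```python
def monthly_cost_range(min_dynos, max_dynos, price_per_dyno):
    """Return (lowest, highest) monthly cost for a dyno range.

    price_per_dyno is an assumed flat monthly price for one dyno.
    """
    if not 1 <= min_dynos <= max_dynos:
        raise ValueError("expected 1 <= min_dynos <= max_dynos")
    return min_dynos * price_per_dyno, max_dynos * price_per_dyno

# Example with a placeholder price of 250 per dyno per month.
low, high = monthly_cost_range(2, 5, 250)
print(low, high)  # 500 1250
```

A wide range gives the autoscaler more room to absorb spikes, but it also raises the upper end of what you might pay in a busy month.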
The autoscaling engine uses the Desired p95 Response Time set in the prior step to determine when to scale your app. The autoscaling algorithm looks back over the past hour to calculate the number of dynos required to achieve the desired response time for 95% of incoming requests.
When the engine decides the current number of dynos is wrong, it adds or removes a single dyno. No more than one autoscaling event can occur per minute.
The autoscaling engine scales down more slowly than it scales up. This guards against scaling down too far during a temporary lull in requests, which would cause high latency if demand spiked again soon after.
If your application receives no request throughput for 3 minutes, dynos will scale down at 1-minute intervals until throughput resumes.
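The stepping behavior described above can be pictured with a small Python sketch. It only models the "one dyno at a time, at most one event per minute, stay inside the range" part. How the engine computes the required dyno count from the past hour of traffic is not modeled, and the class and field names are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class StepScaler:
    min_dynos: int
    max_dynos: int
    current: int
    last_event_s: float = float("-inf")  # time of the last scaling event, in seconds

    def step(self, now_s, required):
        """Move one dyno toward `required`, at most once per 60 seconds, within bounds."""
        target = max(self.min_dynos, min(self.max_dynos, required))
        if target == self.current or now_s - self.last_event_s < 60:
            return self.current  # nothing to do, or still inside the one-minute window
        self.current += 1 if target > self.current else -1
        self.last_event_s = now_s
        return self.current

scaler = StepScaler(min_dynos=1, max_dynos=4, current=1)
print(scaler.step(0, required=3))     # 2: first event adds one dyno
print(scaler.step(30, required=3))    # 2: blocked, less than a minute since the last event
print(scaler.step(60, required=3))    # 3
print(scaler.step(120, required=10))  # 4: capped at the upper bound
```

Because each event changes the count by a single dyno, a large jump in demand takes several minutes to be fully absorbed.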
Monitoring Autoscale Events
Navigate to the Metrics tab of your Heroku application. Scaling events can be seen in the Events chart. By hovering over a scaling event you can see what initiated it. Heroku autoscaling events are labeled “Dyno Autoscaling”. If multiple autoscaling events occur in the same period, only the step where the scaling changed direction is shown.
To disable autoscaling, navigate to the Resources tab of your application on the Heroku Dashboard. Then click on the Disable Autoscaling button and select a fixed dyno count.
Dynoscale is a Heroku Autoscaling Addon that solves many of the limitations of Heroku's autoscaling engine.
It works for both standard and performance dynos, letting you decide the proper resources for your application. In addition, Dynoscale uses queue time instead of response time to scale your application. This ensures that long-running application code doesn't trigger scaling through false positives from a response time metric.
Dynoscale currently supports Ruby and Python applications. To install for a Rails/Ruby Application, follow these getting started instructions. To install for a Python Application, follow these getting started instructions.
Once you have installed the Dynoscale add-on, let's set up the scaling rules by specifying the autoscaling logic. If you have the Heroku CLI installed, run this in a terminal:
heroku addons:open dscale
If you do not have the Heroku CLI installed, navigate to the Resources tab of your application in the Heroku Dashboard and click on Dynoscale.
Next, navigate to the Web Autoscale Rules to set up the autoscaling logic. Click on the blue edit rule button.
On the upper left side of the rule you can see the upper and lower bounds for dynos. These bounds set the maximum and minimum number of dynos this rule will permit. For a relatively low-traffic application with some periods of high traffic, it's sensible to set the lower bound to 1 and the upper bound to 2.
On the lower left side of the rule you can see the Upscale and Downscale configuration. In the Upscale configuration, the first field is the request queue threshold. It determines when more dynos are needed to meet the traffic needs of your application. Once the threshold is exceeded, Dynoscale will upscale your application by the increment set in the next field. Finally, to avoid frequent scale-ups, the last field delays successive scale-ups by a buffer time.
In the Downscale configuration, the first field is also a request queue threshold. It determines when to reduce the number of dynos serving your application's traffic. Once the request queue drops below the threshold, Dynoscale will downscale your application by the increment set in the next field. Finally, to avoid frequent scaling, the last field delays successive scale-downs by a buffer time.
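To make the fields of a rule concrete, here is a minimal Python sketch of how such a rule could be evaluated. It is an illustration under assumptions, not Dynoscale's code or API: the `Rule` fields, the function name, and the example values are invented, and the buffer is modeled simply as "seconds since the last scaling action".

```python
from dataclasses import dataclass

@dataclass
class Rule:
    min_dynos: int
    max_dynos: int
    up_threshold_ms: float    # upscale when queue time rises above this
    up_step: int              # dynos added per upscale
    up_buffer_s: float        # minimum wait before an upscale
    down_threshold_ms: float  # downscale when queue time falls below this
    down_step: int            # dynos removed per downscale
    down_buffer_s: float      # minimum wait before a downscale

def next_dyno_count(rule, current, queue_ms, seconds_since_last_scale):
    """Return the dyno count the rule asks for, given the latest queue time."""
    if queue_ms > rule.up_threshold_ms and seconds_since_last_scale >= rule.up_buffer_s:
        return min(rule.max_dynos, current + rule.up_step)
    if queue_ms < rule.down_threshold_ms and seconds_since_last_scale >= rule.down_buffer_s:
        return max(rule.min_dynos, current - rule.down_step)
    return current

# Hypothetical rule for a low-traffic app: 1 to 2 dynos.
rule = Rule(1, 2, up_threshold_ms=100, up_step=1, up_buffer_s=60,
            down_threshold_ms=20, down_step=1, down_buffer_s=300)

print(next_dyno_count(rule, 1, queue_ms=150, seconds_since_last_scale=120))  # 2
print(next_dyno_count(rule, 2, queue_ms=10, seconds_since_last_scale=120))   # 2 (buffer not elapsed)
print(next_dyno_count(rule, 2, queue_ms=10, seconds_since_last_scale=600))   # 1
```

Using a longer buffer for scale-downs than for scale-ups, as in this example, keeps extra capacity around for a while after a spike.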
Once you are happy with a rule, click on the green save button in the upper right side of the rule. Alternatively, if you don't want to save changes, click on the black cancel button.
Finally, click on the toggle on the upper right side of the page. Red indicates that your rules are off, and green indicates that your rules are active.
Monitoring Autoscale Events
In addition to the Heroku Dashboard, scaling information can be found on the Web Dyno Activity page. This page displays queue time and dyno scaling charts for the past 24 hours.
To turn off autoscaling for your Heroku application, navigate to the Web Autoscale Rules page and click on the toggle in the upper right-hand side of the screen. Red means the rules are off.
This article explores the scenarios for which Heroku's autoscaling solution is a good fit for your application. In addition, it details the setup procedure for basic autoscaling configuration.
If Heroku's Autoscaling is not a good fit, this article details an alternative called Dynoscale and outlines quick, easy setup instructions.
To add Dynoscale to your application from the Heroku CLI, run:
heroku addons:create dscale