Shortly after writing about Lambda cold starts last year, I attended a re:Invent recap meetup that threw a spanner in the works. A lot was covered in a relatively short amount of time in that session, but one of the things mentioned was provisioned concurrency as a solution to the Lambda cold start problem. ‘Oh great,’ I thought, ‘my whole post has been invalidated.’ Of course, even a quick skim of the documentation shows that provisioned concurrency addresses the cold start problem when your Lambda hasn’t been used for a prolonged period, but it doesn’t overcome the issue when there are spikes in usage. ‘Hey,’ thought I, ‘I can write an Insight about that.’
What Is Provisioned Concurrency?
The basic principle here is that Amazon keeps a number of instances up and running, waiting for your requests, so that they do not start from cold. The catch is that you have to pay for them even when they are not in use. If that’s not enough explanation, there are plenty of articles that discuss the topic in more depth. One major theme is how much of a departure provisioned concurrency is from the serverless ethos; I’ll not go into that here though, as I intend to keep this post concise.
A Bit More Code to Prove the Point
I wrote some Bash scripts to set up, tear down and invoke my Lambdas using the AWS CLI, following the instructions I found in an AWS blog post. You should have no issue doing the same.
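To give a flavour of what those scripts do, here is a minimal sketch of the core call. The function name and alias are placeholder names for illustration; the ‘live’ qualifier is doing some important work here, as explained below.

    # Reserve ten warm instances for the function. 'my-function' and the
    # 'live' alias are hypothetical names, not from the original scripts.
    aws lambda put-provisioned-concurrency-config \
        --function-name my-function \
        --qualifier live \
        --provisioned-concurrent-executions 10

    # Provisioning is not instant; poll until the Status field reads READY.
    aws lambda get-provisioned-concurrency-config \
        --function-name my-function \
        --qualifier live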
The only true stumbling point is that your provisioned instances need to be set up against an alias (or a published version, but never $LATEST). I’m guessing this simplifies the process for Amazon. As before, I set up an API Gateway to run a few tests. Of course, the documentation wasn’t wrong with regard to the strengths of the technology: initial invocations of my Java Lambda functions were impressively fast. But what about the weaknesses?
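Before getting to those, for completeness, here is roughly how the alias side of the setup looked, again with placeholder names, along with the tear-down call that stops the meter running.

    # Publish a version and point an alias at it, so the config call
    # above has a valid qualifier to target.
    VERSION=$(aws lambda publish-version \
        --function-name my-function \
        --query Version --output text)

    aws lambda create-alias \
        --function-name my-function \
        --name live \
        --function-version "$VERSION"

    # Tear-down: delete the config so you stop paying for idle instances.
    aws lambda delete-provisioned-concurrency-config \
        --function-name my-function \
        --qualifier live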
What Weaknesses?
Besides paying to keep these instances warm, you are also not protected from usage spikes. To explain: say you have provisioned ten instances, but suddenly your service is hit with twenty simultaneous requests. Half of those requests spill over to on-demand instances, which start from cold and suffer the same old slow response times. It’s something to keep in mind if you are choosing this as a solution.
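If you want to see this for yourself, something along these lines reproduces the spike, reusing the placeholder names from earlier: fire more concurrent requests than you have provisioned instances and compare the durations.

    # Fire twenty concurrent invocations at a function with ten provisioned
    # instances. The overflow requests are served by on-demand instances,
    # and their REPORT lines in CloudWatch include an extra 'Init Duration'
    # entry, which is the cold start showing itself.
    for i in $(seq 1 20); do
        aws lambda invoke \
            --function-name my-function:live \
            "out-$i.json" &
    done
    wait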