Anyone in the cloud architecture or microservices business will be excited by this long-overdue evolution of the Lambda ecosystem. When Lambda was first introduced, the idea of having functions hanging out there in the cloud, just waiting to be executed based on some sort of condition, was very exciting.
As any seasoned architect will tell you, one of the secret tools of a good scalable architecture is liberal but strategic use of queues.
A queue allows you to decouple components from one another to the point where downstream execution can happen much later, making the overall platform both more resilient and far more scalable. For the sake of a quick illustration, imagine two components that are tied together. One has to process an order, and the other is responsible for emailing out the confirmation. You may split these up into two different services and, when the order has been completed, make a REST call to the email service to send out the email. But what happens if that email service is no longer there, or returns an error? Now you have to start developing retry logic and figure out what state you want to leave that order in with respect to the customer.
Instead of a strong coupling like this, the better solution is a queue between the two components: the ordering service places a message on the queue, and the email service picks it up later and sends out the email. If the email service is down, no problem; the queue can still accept events and will store them until the email service can process them.
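To make the decoupling concrete, here is a minimal sketch of the order and email example. It assumes an in-memory `queue.Queue` as a stand-in for a durable queue such as SQS, and the function names (`complete_order`, `drain_email_queue`) and the injected `send_email` callable are hypothetical, chosen only for illustration.

```python
import queue

# Stand-in for a durable queue such as SQS (assumption: in-memory for the sketch).
order_events = queue.Queue()

def complete_order(order_id, customer_email):
    """Ordering service: finish the order, publish an event, and move on."""
    order_events.put({"order_id": order_id, "email": customer_email})
    return "completed"  # the order's state never depends on the email service

def drain_email_queue(send_email):
    """Email service: process whatever has accumulated, whenever it is running."""
    sent = 0
    while True:
        try:
            event = order_events.get_nowait()
        except queue.Empty:
            return sent
        send_email(event["email"], f"Order {event['order_id']} confirmed")
        sent += 1
```

Notice that `complete_order` succeeds whether or not the email service is alive; the events simply wait on the queue until `drain_email_queue` next runs.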
When I first started dreaming of using Lambda, it was for queue processing. There were many times when the amount of business logic required to process queues would have fitted wonderfully well within the Lambda environment, all without having to worry about standing up containers or EC2 instances. However, I had to keep dreaming because this was not available when Lambda launched. It was one of the most obvious use cases for Lambda, and Amazon made us wait nearly 4 years for it.
The wait is over, and now we can start using Lambda for some serious queue-related applications. Yes, you can put some quite meaty processing behind each event on a queue, and that will make architectures much easier to manage and scale, but the real power behind this development lies in building some quite sophisticated event-routing applications.
Imagine taking a source event from a given queue and then deciding, based on its contents, which additional queues it should be placed on for parallel processing. Take the example of the order system in the sidebar: the email service is not the only service that should know about an order being complete. There could be many others in the enterprise that would benefit from that information, for example fulfillment, accounting, and warehouse, to name a few. A Lambda function could take that order from the queue and quickly decide which other queues should have a copy of that event.
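A routing function of this kind might look roughly like the sketch below. It relies on the shape of the event Lambda passes for SQS (a `Records` list whose entries carry the message text in `body`); everything else is an assumption made for illustration: the queue names, the `physical_goods` field, and the injected `send` callable, which keeps the sketch runnable without AWS credentials.

```python
import json

# Hypothetical queue names; a real system would use its own queue URLs.
ALWAYS = ["email", "accounting"]

def target_queues(order):
    """Decide which downstream queues should receive a copy of this order."""
    targets = list(ALWAYS)
    if order.get("physical_goods"):  # hypothetical field on the order message
        targets += ["fulfillment", "warehouse"]
    return targets

def make_handler(send):
    """Build a Lambda-style handler; `send(queue_name, body)` is injected."""
    def handler(event, context):
        routed = 0
        for record in event["Records"]:  # Lambda delivers SQS messages in batches
            body = record["body"]        # the raw message body, a string
            for name in target_queues(json.loads(body)):
                send(name, body)         # forward an unmodified copy
                routed += 1
        return routed
    return handler
```

In a real deployment, `send` would most likely wrap a call such as boto3's `send_message(QueueUrl=..., MessageBody=...)` on an SQS client. Injecting it also makes the routing decision easy to unit test.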
Such routing design patterns are common, and historically you’ve had to either use some non-cloud technology to implement them or grow your own. This Lambda tie-up simply reduces the amount of infrastructure required to support such a design.
While the wait is over, Amazon has not made it as clean as I would have hoped. There is a little sting in the tail and it gives a little clue as to how they are providing this service under the covers.
One of the shortcomings (though I understand why) of SQS is that it requires the client to long-poll to determine whether there are events on the queue to be processed. In other words, you had to keep making an HTTPS call asking “do you have any events for me?” and, while the call would wait for a period of time before returning, you had to do this all the time. And yes, you were charged for each call to SQS.
For large systems that have a constant stream of messages coming through, this cost is negligible, but for systems with sporadic bursts, this overhead could be costly and, above all else, inefficient. Traditional messaging systems would keep a constant connection to the messaging service, and events would be delivered down the wire instantly.
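A little arithmetic shows how the polling overhead adds up on a quiet queue. The sketch below assumes each poller issues back-to-back long polls that all time out empty, and uses a 20-second wait, which is the maximum SQS allows. The function names are hypothetical, and the price per million requests is deliberately left as a parameter because it should be taken from the current SQS pricing page.

```python
def idle_poll_requests(pollers, wait_seconds, days):
    """Requests made by pollers that long-poll an empty queue back to back."""
    polls_per_day = (24 * 60 * 60) // wait_seconds
    return pollers * polls_per_day * days

def request_cost(requests, price_per_million):
    """Cost of a number of requests at a given price per million (assumed input)."""
    return requests / 1_000_000 * price_per_million

# One poller, 20-second long polls, 30 days of an otherwise idle queue:
requests = idle_poll_requests(pollers=1, wait_seconds=20, days=30)
print(requests)  # 129600
```

Multiply that figure by the number of concurrent pollers, and then by the number of queues, to see why a sporadically used system pays proportionally more for polling than a busy one.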
To minimize this overhead, it is not uncommon to have a single consumer of a queue that does the polling and then passes out the work to a battery of internal threads. This assumes an environment where a server handling queue event processing is capable of executing multiple threads at once.
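The single-poller pattern can be sketched as follows, assuming a server-based consumer rather than Lambda. Here `receive` is a hypothetical callable standing in for one long-poll request (it returns a list of messages, empty on timeout), `process` handles a single message, and `max_polls` exists only so the sketch can terminate.

```python
from concurrent.futures import ThreadPoolExecutor

def poll_and_dispatch(receive, process, workers=4, max_polls=None):
    """One polling loop feeds a pool of worker threads."""
    results = []
    polls = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while max_polls is None or polls < max_polls:
            polls += 1
            batch = receive()  # the only place that touches the queue
            # Fan the batch out to the workers and collect results in order.
            # This waits for the batch to finish before polling again.
            results.extend(pool.map(process, batch))
    return results
```

However many worker threads are busy, only one request at a time goes to the queue, so the polling bill stays flat while throughput scales with `workers`.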
The Lambda world, however, is serverless: you are meant to forget about the underlying platform of real servers that Amazon manages to give you this illusion of serverless computing. Since Amazon charges only for function execution, as far as you are concerned you are free to ignore the underlying server.
SQS has not changed its behavior. You still have to poll to retrieve messages, and this is what Lambda is doing under the covers for you. This time, however, you have no real control over the number of consumers that will be running up that SQS bill for you, particularly if you have just come off a very large volume of events that Lambda scaled out to process in parallel.
No doubt Amazon has thought of this and will be monitoring the situation and taking the necessary steps to reduce this overhead. In an ideal world, they would figure out a way for the Lambda service to have a continuous direct connection (think websocket for queues) to the SQS service so that when an event came in, they could instantaneously hand it over to a Lambda function for execution.
Until that time, this is a huge step forward and makes the use of serverless computing even more attractive as we design the next generation of cloud solutions.
You can read more on Amazon’s official blog.