re:Invent is always an exciting time for me; AWS announces large batches of new features during it. This year is no different, and at The Alliance, part of our job is to evaluate whether or not any of these new features fit into our infrastructure.
Transit Gateway is a service that facilitates a hub-and-spoke networking model, connecting networks across AWS accounts and on-premises networks.
At The Alliance, we use multiple AWS accounts to isolate our resources. For example, our prod deployments exist in an AWS account with no other resources and tightly controlled access. This isolation is great for security, cost-reporting, and just general separation of concerns. However, some of our services depend on services from other accounts.
For most of our cases, PrivateLink fits the bill perfectly. We just stand up a network load balancer in front of the dependency (often using this template) and create an endpoint service from it. Dependent services can then use a private endpoint to connect to it. Nothing has to be exposed to the open Internet, and we can maintain fine-grained access controls.
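The two halves of that setup can be sketched as boto3 request parameters. This is a hedged sketch: the resource IDs and ARNs below are placeholders, but the parameter shapes match EC2's `create_vpc_endpoint_service_configuration` (provider side) and `create_vpc_endpoint` (consumer side).

```python
# Sketch of wiring a dependency behind PrivateLink. We only build the
# request parameters here; in practice each dict would be passed to a
# boto3 EC2 client in the appropriate account.

def endpoint_service_params(nlb_arn, acceptance_required=True):
    """Provider side: expose the NLB as an endpoint service."""
    return {
        "NetworkLoadBalancerArns": [nlb_arn],
        # Require the provider account to approve each connection,
        # preserving fine-grained control over who can reach the service.
        "AcceptanceRequired": acceptance_required,
    }

def endpoint_params(vpc_id, service_name, subnet_ids, sg_ids):
    """Consumer side: create a private interface endpoint to the service."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": service_name,
        "SubnetIds": subnet_ids,
        "SecurityGroupIds": sg_ids,
    }
```

Traffic between the private endpoint and the NLB stays on the AWS network, which is what lets us keep the dependency entirely off the open Internet.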
But what if we need to connect more than just two services? What if we want to connect two entire VPCs? Without Transit Gateway, this would be done via VPC Peering or VPN.
We have a service deployed to its own account that creates additional servers on-demand that we need to be able to access from our intranet. VPC Peering almost works for this; if we connect the service account to our intranet via VPC Peering, resources within our intranet will be able to access it. But users connected to our intranet via VPN Gateway or remote access VPN won't be able to access the service (see VPC Peering's unsupported configurations).
Enter Transit Gateway. Transit Gateway connections are transitive. This means if we use a Transit Gateway to connect the two AWS accounts, everything in our intranet, including VPN Gateway or remote access VPN users, will be able to access the service. It will truly be a first-class citizen. From there, we can use network ACLs, routing tables, and security groups to lock down traffic to and from the service to our liking.
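The difference comes down to transitivity, which can be illustrated with a toy reachability model (this is not an AWS API, just the reasoning above in code): peering connections are point-to-point and non-transitive, while every network attached to a Transit Gateway can reach every other attachment.

```python
# Toy model of the reachability rules. Network names are hypothetical.

def reachable_via_peering(peerings, src, dst):
    # VPC peering is non-transitive: reachability exists only over a
    # direct peering connection between the two networks.
    return (src, dst) in peerings or (dst, src) in peerings

def reachable_via_tgw(attachments, src, dst):
    # A Transit Gateway routes between all of its attachments, so any
    # two distinct attached networks can reach each other.
    return src in attachments and dst in attachments and src != dst
```

With a peering connection between `intranet-vpc` and `service-vpc`, traffic arriving via `vpn-clients` still can't reach the service; attach all three to a Transit Gateway and it can.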
We need the connections to and from our API to be fast and reliable. This means that we need a CDN in some form. When we first deployed our API it was served directly from an ALB. This works fine, but we really want our users to be able to connect to a nearby endpoint rather than making connections to endpoints on the other side of the country.
Unfortunately, Global Accelerator doesn't preserve client IP addresses when routing traffic to endpoints. IP addresses are valuable to us for auditing, so Global Accelerator is out for the API. It might have a place elsewhere though, such as for AV ingestion.
This isn't actually a re:Invent announcement, but it's a pretty important one. Now that CloudFront supports WebSockets (which we will be using), using it for our API is possible. Connections made through CloudFront have the client's IP address made available via the X-Forwarded-For HTTP header, making it more viable than Global Accelerator for us. As of today, our production API is now using CloudFront.
As mentioned in this article, we'll be using Snowballs on game-day for archiving and compute power. Now we have more options for our Snowballs. The increased compute power isn't that interesting as our in-stadium compute needs are pretty modest. However, the option to add a GPU to the Snowball is a more interesting one since we will be processing video in the stadium. It's getting a bit too late to make drastic hardware changes for season one, but there might be a place for Snowball GPUs in season two.
Our API uses DynamoDB as its primary data store. With a bit of cleverness, we could already keep things fault-tolerant and safe for concurrent access without built-in support for transactions; after all, Amazon has long offered a client-side library that provides transactions for DynamoDB. But built-in transaction support lets us make things much simpler and more robust overall.
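As a hedged illustration of what that buys us, here's a request builder for DynamoDB's `TransactWriteItems` API. The table and attribute names are made up for the example, but the payload shape is the real one; the resulting dict would be passed to boto3 as `client.transact_write_items(**request)`, and either both writes commit or neither does.

```python
# Hypothetical example: atomically move a record between two "teams"
# while keeping a counter item consistent. A ConditionExpression guards
# against a concurrent move clobbering the first update.

def transfer_request(table, player_id, from_team, to_team):
    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": table,
                    "Key": {"PlayerId": {"S": player_id}},
                    # Fails the whole transaction if someone else
                    # already moved this record.
                    "ConditionExpression": "Team = :from",
                    "UpdateExpression": "SET Team = :to",
                    "ExpressionAttributeValues": {
                        ":from": {"S": from_team},
                        ":to": {"S": to_team},
                    },
                }
            },
            {
                "Update": {
                    "TableName": table,
                    "Key": {"PlayerId": {"S": "COUNT#" + to_team}},
                    "UpdateExpression": "ADD RosterSize :one",
                    "ExpressionAttributeValues": {":one": {"N": "1"}},
                }
            },
        ]
    }
```

Before native transactions, keeping those two items consistent meant client-side coordination; now the service enforces it.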
Prior to this announcement, we were relying on scheduled actions and auto-scaling to manage our DynamoDB capacity. The new on-demand capacity allows us to get rid of all of that. In our simulations, it performed just as well as or better than auto-scaling for large bursts of unexpected traffic, so we switched all of our environments over to it almost immediately.
On game days, when we can count on large traffic spikes, we may switch our tables over to maximum provisioned capacity just to be safe, but on-demand capacity is serving us well.
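Switching between the two modes is a single `UpdateTable` call. A sketch of the parameters (the capacity numbers here are placeholders, not our real game-day settings):

```python
# Parameters for dynamodb.update_table toggling billing modes.

def billing_params(table, on_demand, read_cap=None, write_cap=None):
    if on_demand:
        # On-demand: no capacity planning needed at all.
        return {"TableName": table, "BillingMode": "PAY_PER_REQUEST"}
    # Provisioned: pin capacity explicitly, e.g. for a known game-day spike.
    return {
        "TableName": table,
        "BillingMode": "PROVISIONED",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_cap,
            "WriteCapacityUnits": write_cap,
        },
    }
```

The dict would be passed to boto3 as `client.update_table(**params)`; flipping back to on-demand afterwards is the same call with `on_demand=True`.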
We've been tinkering with making our API serverless for a while. Our only big barrier to that has been the need for WebSocket support on the same domain. Until now, there was no simple way to direct requests for a particular path to a Lambda function and requests for another path to an EC2 service. But with Lambda ALB targets, we can easily route paths to serverless API implementations. We'll almost definitely be making use of this for season one.
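A Lambda function behind an ALB target group receives the request as an event and returns an HTTP-shaped response. This is a minimal sketch (the `/status` path is hypothetical), but the event fields and response shape are the ones ALB uses for Lambda targets:

```python
import json

# Minimal handler for a Lambda function registered as an ALB target.
# ALB passes httpMethod, path, headers, and body in the event; the
# response must carry statusCode, headers, body, and isBase64Encoded.

def handler(event, context):
    path = event.get("path", "/")
    if path == "/status":
        status, body = 200, {"ok": True}
    else:
        status, body = 404, {"error": "not found"}
    return {
        "statusCode": status,
        "statusDescription": "200 OK" if status == 200 else "404 Not Found",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
        "isBase64Encoded": False,
    }
```

An ALB listener rule can then send `/status*` to this function while every other path continues to a target group of EC2 instances, which is exactly the same-domain split we needed.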
AWS Client VPN
AWS made several presentations about an OpenVPN-based remote access VPN solution.
This has been on my wishlist for a while, but mysteriously no release or even preview announcement has been made. Once it's released, though, we're likely to use it internally.