I think it's about 5 years since I started using AWS "for realz". We were gearing up to migrate a bunch of stuff from on-prem to cloud and we planned to use Azure - we were a Windows/.net shop after all. But. There were some issues. So just for fun, we took an app that we were having serious difficulties running in Azure, and refactored it to run on AWS Lambda using Kinesis Firehose and S3 (it was a kind of event ingestion app), and it was rock solid from day one (as far as I know, it still hasn't missed a beat). As we learnt about the mature client sdks in all languages we neeeded, complete documentation and robust (and actually understandable) vpc networking, it just clicked. We did a 180, scrapped Azure and went full steam ahead on AWS. And we mostly never looked back.
We've been discussing testing in production for a long time. Especially for our "main" api we knew we needed to do something in order to improve the way we develop and deploy. This api relies on production (or production-like) data, and is also the primary contact point for a host of different clients (both apps, set-top boxes and web apps).
We've been using Traefik as the Ingress Controller of choice ever since we started using Kubernetes. We also use Traefik for our non-containerized apps, where Consul is used as the "source of truth" for routing configuration.
If your kops-managed Kubernetes cluster is in shambles, this blog post is for you.
Weird how time flies - we've actually been running Kubernetes in production for over 2 years already. Until now, we've used an Ansible-based provisioning approach written by yours truly (<https://github.com/trondhindenes/Kubelini>). This has served us well enough, but it has also required some maintenance effort so we were looking to simplify and standardize. This post is about that.