
A Safari of Kubernetes and its Natural Habitat

Have you ever wondered what life in the data center really looks like, after the sun goes down and the people leave?

Come along on a journey with Open Source Advocate Noah Abrahams, as we visit some of the inhabitants of the Cloud Native savanna on this trip through the ecosystems of Kubernetes and its natural predators.

During the session, you will learn the warning cries of an ever-alert Prometheus and watch as the cluster is hunted by a gaggle of red teamers, while they all try to drink from the same data lake. This talk is a whimsical introduction to the daily life of Kubernetes and common production deployments, while you listen to some very mediocre impressions of famous naturalists.

EPISODE 1: Hunger

It is evening in the data center. Most of the IT staff has already made a hasty retreat. They are hopeful that on this evening, they will enjoy some peace… and that no clarion will sound. So far, this evening is a quiet one. Back in the fields, here we can see… a wild Kubernetes. This Kubernetes cluster is blissfully unaware. It is currently grazing, among other instances… looking for some delicious YAML to consume. This YAML provides the Kubernetes with all the nutrition it needs to perform its daily activities, and it guides the herd as it carries its workloads. But the cluster shares its environment with many cloud native projects… and others.

A pride of developers arrives, hungry for idle resources that it can use for its applications.

While the cluster remains unaware, the developers are quite active… They function on a different schedule. Though the developer starts functionally blind, looking for purchase in this vibrant and robust ecosystem, it is wily and creative. Eventually it adapts to its environment, but it may find the footing unstable, and the terrain fraught with peril. This pride is currently searching for a way to sate its hunger, and it knows that when others are not using the cluster, many additional resources may become available to them and their applications. These developers, who previously fought for easier access to the systems, bring along some YAML of their own… Some of them carry it along in Helm charts, only ever wanting to change one or two variables. But, for others, it is the ease of deployment they care about, and for them it is carried forth by this river named ArgoCD. Argo maintains the health of this valley and ensures all the YAML is nutritious and up to date.

With Argo’s help, they are able to consume many more resources than if they were to sow their YAML by hand. But, with a ravenous appetite, the developer says… “I want it all. I want to test quickly, so I want as many resources as you can give me.” Through use of a quick one-liner in bash, the developer makes copies of their application… tens, sometimes hundreds. They then copy and paste within the Argo configuration, and with a git push they are off and running.
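For those keeping field notes, here is a minimal sketch of the kind of Argo CD Application manifest that gets duplicated this way. All names, URLs, and numbers below are illustrative, not taken from the talk.

```yaml
# Hypothetical Argo CD Application, copy-pasted tens of times with only
# the name and namespace changed, then committed with a single git push.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app-copy-37          # bump the number, paste, repeat
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/team/my-app.git
    targetRevision: main
    path: deploy/
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app-copy-37
  syncPolicy:
    automated: {}               # Argo keeps each copy healthy and up to date
```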

But what’s this? With the influx of all these new development, QA, and pre-production deployments… a problem arises. These nodes are full and can consume no more. But, what happens next… is truly extraordinary. To protect itself from the developer’s resource consumption, the Kubernetes relies on one of its greatest strengths… it scales. 

In doing so, it is able to bring many more resources into the herd, and respond to the developers’ insatiable appetite. But while scaling provides more resources… it is not magic. It cannot help with everything. In this case, because of constraints upon the cluster’s design specifications, there are no more internal IP addresses available for nodes. The allocated pool has simply run dry. The overconsumption from the developers’ hunger reaches a boiling point, and suddenly… new nodes and new deployments begin to fail. As the coming chaos unfurls… Alertmanager begins to take notice of these problems. The group of Prometheuses, known as a gang, sends out a warning cry to alert whoever is listening. They shout about node failures, about there being insufficient CPU and memory to fulfill all the requests, and even about indecipherable errors from the upstream cloud provider. With the Prometheuses’ cries crossing a threshold… The clarion has rung.
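As a field note: warning cries like these are typically encoded as alerting rules. A minimal sketch of what such rules might look like with the prometheus-operator; the names, thresholds, and labels here are made up for illustration.

```yaml
# Illustrative alerting rules of the sort that would fire during this
# incident; every name and threshold here is hypothetical.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: herd-warning-cries
  namespace: monitoring
spec:
  groups:
    - name: node-health
      rules:
        - alert: NodeNotReady
          expr: kube_node_status_condition{condition="Ready",status="true"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Node {{ $labels.node }} has not been Ready for 5 minutes."
        - alert: PodsPendingTooLong
          expr: sum(kube_pod_status_phase{phase="Pending"}) > 50
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Many pods are Pending; the cluster may be out of capacity."
```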

The IT staff is spurred into a defensive formation, huddling around and protecting the cluster. The pride of developers, of course, will collectively proclaim no knowledge of what has transpired this evening, though the git logs will tell us otherwise. Meanwhile, some of the nodes may have succumbed… but through sheer numbers, enough make it through to ensure the survival of the applications as a whole.

The next day, after meeting upon meeting, quotas are set and Open Policy Agent, or OPA, is installed. The approval process that had previously been loosened is reinstated, so the developers can no longer check in changes to the Argo configuration without oversight, or at least a thumbs up. They have paid for their indiscretions, and this particular problem will not arise again. The developers will have to be even more crafty… to sate their hunger tomorrow.
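A minimal sketch of the sort of per-namespace quota that might be set after such an incident; the namespace and the numbers are illustrative, not from the talk.

```yaml
# Hypothetical ResourceQuota capping a development namespace, so no
# single pride can graze the whole cluster bare. Values are made up.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-team-quota
  namespace: dev-team
spec:
  hard:
    pods: "50"              # no more hundreds of copies per namespace
    requests.cpu: "20"
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi
```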

EPISODE 2: Birds of Prey

The previous cluster managed to keep damage to a minimum, because it was protected. Between Dex and Gangway handing authentication back and forth, the Prometheus Alertmanagers’ constant eyes, and later OPA enforcing its rules, controls stood guard to keep watch over the herd. This, however, is Nigel. Nigel is a brand new on-prem cluster. The first of many, just released into production, on a Monday evening, for an adoption test.

The original design team, however, had foolishly decided while building the non-production clusters that, since the access keys were only given to GitLab and the administrators, they did not need to worry about additional authentication and login patterns until production became a concern. This, as will surprise no one, was not corrected before being propagated to this newly launched production cluster.

An unprotected cluster, however… attracts other visitors with more nefarious intent.

Birds such as these will take advantage of any easy meal that they can catch. As soon as the cluster was turned loose into the wild, this gaggle of red teamers, acting from a remote location… was swift and efficient. They have found that the kubeconfig and admin keys are embedded in the image for the custom GitLab runner, and that it points to an API endpoint accessible from anywhere within the corporate network. These are design decisions that should never have made it out of testing, so their attack is merciless. As with all birds of prey, the carrion, the detritus… the potential for disease to spread and infect others in the biome… has now been identified and consumed.

If this had been a real attacker… by morning, there would be no secure nodes left in this cluster… no pod would be safe. If the internal corporate network were compromised, or the actors otherwise had access on the inside… The entire cluster would be lost. Unfortunately, with no elastic cloud capacity behind it, it would have no room with which to scale, so by the time they are done mining bitcoins or whatever they’re doing, this cluster would have simply run out of resources. The herd would be picked clean.

What’s more… once compromised, the flock can also be expected to drink deeply from any and all storage buckets they find behind this cluster. They would simply drink this data lake completely dry.

Luckily for Nigel, the Red (Offensive Security) team makes a formal report, and the Blue (Defensive Security) team has now responded. They had very little knowledge of the danger they were in, but have now been made aware of the terrible design decisions that were made long before these clusters even thought about being put into production. Once made aware of the threat, they are a force to be reckoned with. They will implement strict authentication and authorization, tied to the corporate LDAP servers. They will bring in another avian challenger, Falco, to mind the threats and help watch over the herd. Significant rules will be created and processes will be changed… but first… they need to tend to their GitLab runners.

This problem was created upstream, and they have enshrined it themselves in their Harbor registry. If they do not fix the danger that was checked into their image… it will only be a matter of time before a feast like this presents itself again. Instead, keys will now be encrypted and then stored as secrets for the runner to decrypt at runtime, and new network segmentation will be put in place. Quite a lot of work, for Day 2 of production.
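A minimal sketch of that pattern, under the assumption of a standard Kubernetes Secret mounted into the runner pod; every name and path here is hypothetical.

```yaml
# Hypothetical replacement for keys baked into the runner image: store
# the already-encrypted key material in a Secret and mount it read-only,
# leaving the runner to decrypt it at startup. Names are illustrative.
apiVersion: v1
kind: Secret
metadata:
  name: runner-deploy-keys
  namespace: ci
type: Opaque
stringData:
  kubeconfig.enc: "<ciphertext, decrypted by the runner at runtime>"
---
apiVersion: v1
kind: Pod
metadata:
  name: gitlab-runner
  namespace: ci
spec:
  containers:
    - name: runner
      image: registry.example.com/ci/gitlab-runner:custom
      volumeMounts:
        - name: deploy-keys
          mountPath: /etc/runner/keys
          readOnly: true
  volumes:
    - name: deploy-keys
      secret:
        secretName: runner-deploy-keys
```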

But, what’s this?! While the infosec team is implementing new segmentation and separation policies, a new VPC and new cluster have arrived. Meet Bertram. Bertram is the latest release, even though it only has 12 months until the end of its life. Many will not even consider it feasible until it is mature, but by then the growing herd will have left it to die. Will this young upstart Bertram take its opportunity to challenge the dominant cluster? No. As part of the segmentation, this new cluster lives solely for PCI compliance, and has even gone through a completely separate and quite rigorous qualification process. While it may drink from the same lake… and it may look similar… its path will not cross the other cluster’s. Now when the predators return… they will find this prey difficult and hardened. Not exactly… what they were after.

Despite all the new measures put in place after a laborious and disastrous Day 1, the herd cannot, however, think itself completely safe. These birds of prey will continue to circle, watching from afar… forever searching for their next meal.

EPISODE 3: Exploration

By now, you have surely decided that this world is one worth exploring, but you may be wary of the threats looming around you. The world of Cloud Native, however, is not all predators and danger. There is plenty of fun, excitement, and even leisure to be found here. The excitement that awaits you, is just part of a much larger ecosystem…

Here, though… in this unforgiving land… It can be difficult to get one’s bearings, especially with interesting denizens competing for your attention. It is easy to get lost… even with a map.

Luckily, for all of us… there are guides a-plenty. The Cloud Native locals are not just friendly… they are incredibly hospitable. They are always eager and happy to help, because it is the people that make this community so vibrant. They can help you understand Service Meshes…or maybe some of the indigenous GitOps platforms… Or perhaps you may need help with something ubiquitously problematic… like DNS. It’s always DNS.

And let’s not even get started… on how many denizens there are that call “Observability” their home category. Why, the “Monitoring” section alone is so densely populated it is hard to tell those taking action… from those who sit by and idly watch. But many of them can bring you peace of mind… and help you sleep at night. But there is even more in motion here. There is also a hierarchy among the projects, vying for dominance and contributions. With enough market acceptance, some of them will graduate, while others… may be left in the sandbox. But, perhaps you would like to join them in this sandbox… There are dozens of projects here that would be happy for some of your attention, and would just love for you to come… and play with them… and contribute.

Once you have your bearings, you will find a wondrous world stretched out before you… with projects ranging from security to database scaling… CI/CD to network management… and from cross-cloud control planes to cleaning up those cloud workloads… On the human side, there are many meetups, conferences, and other gatherings… The denizens sometimes congregate simply to discuss how much they love to congregate. You will always find an outstretched hand here from one of these ambassadors… While the real savanna is full of endangered species and faces many crises… The Cloud Native ecosystem is healthy, thriving, and growing.

Whatever part of this ecosystem you find yourself in… you can rest assured that there are people willing to help you on your journey… and events to make it informative and enjoyable.

That’s all for this evening, and we hope you will join us next time as we take a trip… to the clouds.

EPILOGUE

Thank you, Sir David Attenborough.

So, yes, I opted for the cold open on that talk. I thought it would be fun to just sort of go in. A couple of people suggested it, and what would normally be the intro slides, we put at the end. So, who am I? Why did you even come to my talk?

My name is Noah Abrahams. I am the Open Source Advocate at StormForge, which means I am split between the open source program office and dev advocacy. If you want to know more about StormForge, find us at the booth. A talk is not the place for talking about that.

I’m a CNCF ambassador. I run the meetups out of Las Vegas, and I’ve been doing this cloud thing for quite a while, since about 2008. I’ve spent that time bouncing back and forth between low-level hands-on work, doing DevOps or SRE, being the guy that broke the thing because I didn’t realize it wasn’t namespaced – whoops – and working at a much higher level, having conversations like this with broader audiences.

Why did I make this absurd talk? This talk started from a tweet actually. It was a tweet by Kris Nova who said: what would your entrance music be if you were walking onto stage for a keynote at KubeCon? I have a feeling you all can guess what the music was that immediately popped into my head because I’ve hummed it like 15 times through the course of this talk. You all know the non-words. 

I love nature shows, and I think getting in touch with the natural world, with what’s around you, helps ground you as a human being. That evolved from “what would the theme music be” to “how would you build a talk around that,” which eventually morphed into “how would you get Sir David Attenborough to come give a talk like that.” The answer is: you don’t, because he’s very busy. So I had to do it for him. And we did it because it’s another set of analogies. There are a lot of analogies in the Kubernetes environment. There’s no lack of nautical analogies, that’s for sure, or buildings with concrete foundations building up stacks and layers, or shipping containers, or spider webs, or any number of them.

I wanted this one to be something that people could associate with, to show you some starter patterns. So now when someone says Prometheus, you don’t think “Prometheus, that’s the monitoring solution.” You think “Prometheus, that’s the yelling meerkat.” So you’re welcome. That’s all in the back of your minds now.

We wanted to call out a couple patterns that were good to avoid too. The flip side of why did I make the absurd talk is why did I make the talk absurd? Partly because I think I’m funny.

Fewer of you do now than when you walked in that door. Also, I really wanted to do an impersonation on stage, so thank you for letting me get that lifelong dream out. But I did it so you’d remember it, because we have to have those associations. We have to have something to build upon as we fumble our way through this incredibly vast ecosystem, and if you’ve got something to associate things with, it helps you learn the projects, the people, the ecosystems. It helps you learn everything. So that’s why I did it. And also because it’s been two years since we were all together in person, and you all deserve some fun. I hope you all had fun.

A few points to remember, just as you’re all getting up to run out and go to lunch, which I’m sure will happen very shortly. This is a living, breathing ecosystem, the same way that the savanna is full of plants and animals. This ecosystem is full of people and projects, and the projects are just more people. So always remember that there are people under the hood for everything that you do, and it’s important to care for these resources.

One analogy that you’ll see in a lot of places: people talk about pets versus cattle. That was obviously an analogy written by someone who never had to take care of cattle. There’s an idea that some things are disposable and some things you just don’t give attention to. Well, as I just pointed out, if there are things you don’t give attention to, suddenly your entire system can be compromised. So treat things well, give them respect, and remember, as I mentioned just a moment ago, that everything is built on people, in both technology and in naturalism and conservationism. Teamwork is important. You have to be able to work together as a group. You have to be able to go hand in hand and sing kumbaya and go down this road happily together. We don’t do this alone. None of us gets through this without the help of at least a couple thousand other people that are here today. Cooperation is far more important than competition in this industry, because we’re all relying on open source projects.

I also want to take a moment to point out that the work we do is tied to the physical ecosystems. A number from a previous report: every year, data centers create over 100 million metric tons of CO2. If you don’t pay attention to the resources, if you don’t pay attention to scaling, if you give your devs unfettered access to the system and allow them to spin up eight billion new instances, it doesn’t just harm your wallet, it harms the planet.

So, keep that in mind. That happens. It’s real and you know, I mean, this was tongue-in-cheek, it was meant to be fun, it was meant to be silly, but the issues are serious, so don’t forget them.

Apologies to Sir David Attenborough for my terrible impersonation. Apologies to the National Geographic Society for using an orange or yellow rectangle, to public broadcasting, and to viewers like you. Thank you all for coming to my talk. It was very image heavy.

Any questions?

I’m going to take that as a no. Do we have any questions from online?

Thank you all for coming.
