Foundations of The Cloud With Mark Burgess, CFEngine

Episode Description

In this special episode of the Platform Engineering Podcast on Foundations of The Cloud, Cory O'Daniel chats with Mark Burgess, the creator of CFEngine and a pioneer in configuration management. They dive into Mark's journey from physics to computer science, the birth of CFEngine, and the dynamic world of system administration and DevOps. They also explore the evolution of IT operations, the challenges of managing modern infrastructures, and the future promises of AI and autonomous systems in tech. Join us for an insightful discussion filled with historical anecdotes, practical advice, and visionary ideas for the future of platform engineering.

Episode Transcript

Hi, I'm Cory O'Daniel, and welcome to the Platform Engineering Podcast. In the next few episodes, we're going to talk about the history of operations, its evolution into DevOps, and platform engineering. We'll talk about the tools, the culture, and how the cloud has shaped our industry over the years. My first guest is Mark Burgess, creator of CFEngine and a physicist focused on promise theory, semantic spacetime, graph theory, and knowledge management. 

Mark, I'd love to learn about your background and how you got started in computer science and sysadmin, but first, I've got to know. Have you figured out time dilation? Because you're getting way more done than anybody else I know.

If only. If it could keep me young, I would have been happy about that. But no, I haven't figured that one out yet.

Nice. So how did you get started in computer science and system administration?

Great question. Hello everyone, and thanks for the invitation, Cory. I was a lowly physicist doing a postdoc. I had just arrived in Oslo for my first postdoc after finishing my PhD. Basically, the funding for the physics PhD started running out, and I had taken over the computing system we had bought to conduct our research. It was a newfangled system of Sun workstations. Over time, I got my feet wet, or my hands dirty, as they say, diving into that system and managing it. As the funding for physics ran out, I thought maybe I should look for work in computer science. One of my colleagues at the time happened to tell me about a lectureship position in computer science at the university down the road. I applied, got the job, and the rest is history.

Nice. Before we hop into CFEngine, since this is a DevOps-oriented platform engineering podcast and some listeners may not be familiar with it, could you give a short description of what CFEngine is and maybe explain how your role led to its creation or how you came up with the idea?

The Birth of CFEngine

It all came out of my being a physicist, right? Because I like to understand things. My goal in life is to understand how stuff works, rather than necessarily to build stuff. But as I was diving into the computers and figuring out how they worked, it was taking up too much of my time. I decided that maybe I could try to automate some of it. Having seen some of the scripting stuff that people had done at the university to automate things like monthly backups and purge file systems, and all kinds of clever things, I thought, "This is quite impressive." At the time, I was reading a lot about artificial life, and I had always had an interest in artificial intelligence. So I thought, what if you could make a kind of system that would enable me to think about computers as I think about physics? I could study the problem, see what computers do—not just what we ask them to do, but actually what they do. Some of it's good, some of it's bad. If we could identify the bad parts, filter those out somehow, and steer it towards the good, maybe that would be a way to regulate computers, as you might regulate a physical system.

In physics, we think about thermodynamic systems, boilers, engines, heat systems, air conditioning, and stuff like that. But computers don't have those kinds of regulation principles built in, and I thought they should. If we built a system that was stable to begin with, it could just fly like a plane on autopilot and correct its minor mistakes. We would set the course that we wanted, the policy that we wanted it to adopt, and somehow it would be smart enough to analyze its sensory data and figure out how to steer towards that desired end state.
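
The autopilot idea described above can be sketched in a few lines of Python. This is only an illustration, not CFEngine code: the `converge` function, the dictionaries, and the setting names are hypothetical, and a real agent would inspect files and processes rather than a dictionary.

```python
# Illustrative sketch only: all names here are hypothetical, not CFEngine's.

def converge(actual: dict, desired: dict) -> list:
    """Repair only the settings that have drifted from the desired state.

    Returns the list of keys that were corrected on this pass.
    """
    repaired = []
    for key, wanted in desired.items():
        if actual.get(key) != wanted:   # sense: compare reality with policy
            actual[key] = wanted        # steer: apply the minimal correction
            repaired.append(key)
    return repaired


desired = {"sshd": "running", "motd_mode": "0644"}
actual = {"sshd": "stopped", "motd_mode": "0644", "extra": "untouched"}

first_pass = converge(actual, desired)    # corrects only what drifted
second_pass = converge(actual, desired)   # nothing left to fix: already converged
```

Running the same policy repeatedly is safe because a pass that finds nothing out of line changes nothing, which is what lets the check run continuously in the background.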

I started to look into how that could be done, and that's what got me on the path of writing CFEngine. I think people got their expectations for configuration management reset when it became more popular around the cloud, and several other systems like Puppet, Chef, and Ansible came out. But my idea goes back to 1993, right? Early days, when system administration was fly-by-the-seat-of-your-pants stuff done by bearded wizards with special arcane knowledge and spells and hocus pocus. Trying to turn that domain knowledge into something that could be automated, formalized, and made reproducible was a tall order.

Did you see at that point the same types of resistance from the sysadmins that we still see in some organizations today as you were trying to give them this new tool?

Absolutely, some people were 100% against this because they wanted their magical incantations to be something that only they could perform. But others jumped on this and thought it was really cool to have a kind of software robot sensing the virtual world inside computer networks and figuring out: Where am I? What's going on? How do I rebuild the system? Over time, expectations changed because systems got simpler. Back in the nineties, there were 15-20 different kinds of Unix. It wasn't all Linux. Every flavor of Unix was different and they had different shell commands with different options. Writing shell scripts was a nightmare because you'd have to say, if this is version 4.13, then use this shell command; if this is version five, use another command; if this is AIX version, yet another command. It was like 10 million if-then-else statements just to determine the first command, never mind the desired end state.

It seemed pretty obvious that you needed to abstract away all of that variability and turn it into a simple declaration of what you wanted the system to be. That's what CFEngine was about. I figured that the system was smart enough to figure out a lot of it where you didn't have to literally declare every shell command and all the options. We could figure that out using smart, intelligent methods. Over time, this developed into separating out the intent in a kind of declarative way and then burying the details of evaluating that intent and figuring out the path towards success as a kind of block-solving puzzle like you did in old-fashioned AI with robotics.
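
A rough sketch of that separation, in Python rather than CFEngine's own language: the intent ("this user exists") is declared once, and a lookup table hides the per-platform variation. The table, the function name, and the choice of platforms are assumptions made for the example, and the commands are only built as lists, never executed.

```python
# Illustrative sketch: the platform table and function names are assumptions
# chosen for this example. Commands are built as lists and never executed.

USER_ADD_COMMANDS = {
    "linux": ["useradd", "{name}"],
    "aix": ["mkuser", "{name}"],
    "freebsd": ["pw", "useradd", "{name}"],
}


def plan_user(platform: str, name: str, existing_users: set) -> list:
    """Turn the intent 'this user exists' into a platform-specific command.

    Returns an empty list when the desired state already holds.
    """
    if name in existing_users:
        return []                       # already in the desired state
    template = USER_ADD_COMMANDS[platform]
    return [part.format(name=name) for part in template]


# The declaration stays the same on every platform; only the plan differs.
linux_plan = plan_user("linux", "alice", existing_users={"root"})
aix_plan = plan_user("aix", "alice", existing_users={"root"})
noop_plan = plan_user("freebsd", "alice", existing_users={"root", "alice"})
```

The if-then-else branching has not disappeared; it has moved into one table that the person stating the intent never has to read.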

At the time, the Free Software Foundation was just coming out. Richard Stallman was looking for new projects. I thought we could put this online, let people use it, and see if they found it useful. Before I knew it, a lot of people were using it at CERN, the particle accelerator place, and all of the national labs in the US. Lawrence Berkeley, Los Alamos, a lot of these academics, especially those associated with physics because of my background, took on this tool and started to use it. And then it spread from there.

So, at the time, pretty much everybody was just scripting, and that's what we had. We had this very imperative way of describing systems. It's funny because you're using the word declarative. I feel like, even in my experience, I started getting into operations around 2005 or 2006, and it was still very if-based. The first time I went into a data center where we were doing any sort of automation, it was still just bash scripts. I remember we first introduced—we actually built our own system that was very similar to Puppet before we discovered Puppet. Even just getting that to the operations team, again, this was before the term DevOps was coined, there was so much resistance and it was so hard.

But it sounds like, at least for you, the network of physicists working on these systems knew that they had to be reproducible and configurable. They didn't have time for things that weren't part of their work to slow them down. That was really a boon.

The other side of that, though, was that at the university, things were very different from your average data center environment today, like the cloud. We somehow imagine everything should just be the same because it's easy; it's kind of the stem cell approach to computing: churn out a lot of stem cells, and eventually they'll differentiate into specialized things by uploading some software. Back then, at the university, every environment was a special kid. They wanted their own software, their own configurations. Any attempt to standardize would be met with fierce resistance by the academic community in particular.

So, not only did this software have to adapt to everybody's special needs, it then had to convert those needs into a stable target state, a desired end state that it could converge towards and maintain—again, like the autopilot idea. But everybody had their own course setting, right? Some people wanted to go to Tokyo, some to New York, and you had to be able to fly their system in exactly the direction they wanted, while simultaneously protecting them from themselves against all kinds of issues, like garbage collection.

Even today, people rarely think about garbage collection in systems. They will run their systems until the disks fill up, the memory leaks to fullness, and then the system crashes. You restart it—Ctrl-Alt-Delete—you run the system by rebooting. As we move towards the cloud, this kind of biological approach to systems—just let things die and then reboot and restart—becomes more common. But back then, computers were much more valuable and expensive resources. They couldn't go down. It was a matter of pride to keep systems running for years at a time. If you managed to keep your system running for a year without rebooting, it was a celebration day.
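
As a small illustration of what routine garbage collection can look like, here is a Python sketch that removes files older than a given age from a directory. The `tidy` function and its parameters are hypothetical names for this example, not a CFEngine interface, and the demonstration runs in a throwaway temporary directory.

```python
import os
import tempfile
import time
from typing import Optional


def tidy(directory: str, max_age_days: float, now: Optional[float] = None) -> list:
    """Delete regular files in `directory` not modified within `max_age_days`.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Snapshot the listing first so deletion does not disturb the iteration.
    for entry in list(os.scandir(directory)):
        if not entry.is_file(follow_symlinks=False):
            continue                    # leave directories and symlinks alone
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    old_path = os.path.join(workdir, "old.log")
    new_path = os.path.join(workdir, "new.log")
    for path in (old_path, new_path):
        with open(path, "w") as handle:
            handle.write("data")
    ten_days_ago = time.time() - 10 * 86400
    os.utime(old_path, (ten_days_ago, ten_days_ago))   # backdate one file

    removed = tidy(workdir, max_age_days=7)
    remaining = sorted(os.listdir(workdir))
```

Run on a schedule, a check like this keeps a disk from slowly filling, rather than waiting for a crash and a reboot.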

Times were very different then, and the attitudes towards configurations were very different from the kind of disposable computing we have today. CFEngine emerged in that environment where failure was not an option. Systems could never go down; they had to continuously be stabilizing, bolstering, securing, and recovering in real-time against their worst enemies—the users who were constantly trying to crash them in a sense.

In a sense, yeah. It's funny to think about that. It just reminded me of one of the first data centers I worked in, which was in healthcare. I remember logging into a system that had been up for a long time. There was a banner when you logged in: "This system has been online for 400-something days." The head of Ops was very proud of that. We had to do an upgrade once that required a reboot, and he was like, "I don't want to reset it to zero."

It's funny because compute is a commodity today—it's disposable. The idea of making a service that might be backed by 20, 30, or 40 different containers close to 100% uptime is hard, let alone a single system, like a single operating system or a single mainframe. That is a world of difference in approach to achieving that kind of uptime.

I may have had something to do with that kind of shift in the end, because as CFEngine developed, I ended up taking it to the conference circuit in America. The USENIX LISA conference was kind of the mecca for system administrators back in the day. My first attempt to present CFEngine there amazed me, because I hadn't realized how many people were using it in 1997 when I first went. I gave a talk, an academic talk about some of the theory and programming behind it. But then in the evening, I gave a birds-of-a-feather session for interested people. I said, "Come along, I'm going to talk about this thing I made called CFEngine." I rented a small room in the hotel for about 100 people, thinking that if I had a few attendees, that would be great. The room was full when I got there, and people were crowded outside because they couldn't get through the door. In the end, I was standing at the front, planning to show a few slides, but people were literally sitting around my feet. I could barely stand in the room. I just couldn't believe how many people were already using it, even by 1997. It actually threw me a bit.

On the plane going home, I got sick, as I often do on long flights, but I was thinking about how I could explain CFEngine to the academics, because they didn't really get it. The users could see the utility of it, but the academics didn't understand the concept of desired state and stabilization and the different ways of thinking about managing and regulating systems. So I thought, "Okay, how can I describe this?" My first thought, because I was sick, was that it's a bit like an immune system. I went back to Oslo and wrote a paper called "Computer Immunology," which I then presented in 1998 at USENIX. It is still online and widely read.

I made the point that in biology, our strategy for survival has a lot in common with disposable technology. We've got billions and billions of cells, and if you scratch your arm, a few million cells roll off onto the floor, but it doesn't matter. We just make new ones and keep going. There's sufficient redundancy to make systems robust. Back then, we couldn't afford to do that because we only had a limited number of systems. The really big institutions maybe had a thousand, but that was rare.

At that conference, it was interesting. After my talk, some folks from the Sandia government laboratory came up to me and said, "Hey, do you know these guys at Los Alamos? They also have a project called Computer Immunology." One of the other guys at the conference was Steve Traugott, my friend who now runs infrastructures.org. He had similar notions of how to scale systems in production. He'd worked at NASA and other places.

Anyway, long story short, a discussion came out about how to balance the notion of throwing systems away and rebuilding them versus stabilizing this valuable thing that you can't possibly throw away or do without. These competing methodologies, converging towards a perfect, stable state and living forever versus throwing everything away and letting the next generation take over, were in play for a while.

When virtual machines came along in the early 2000s, suddenly there was a way to do both. You could have this kind of cellular reproductive approach to throwing away systems, rebuilding them, even freezing the state and transporting them elsewhere, and re-energizing them, recovering them from suspended animation, so to speak. That kind of approach took over as it became cheaper and cheaper to do that, and those technologies had to scale quickly. People put more effort into that side than into the desired end-state approach. By the time the end of the 2000s came along, when Puppet and Chef entered the picture, the cloud had definitely got a foothold, and the "throw it all away and rebuild" approach was starting to take over, eroding the focus on the desired state.

Yeah, I want to talk about that, but I want to rewind to that room you were talking about, where the entire room was packed full. We have a lot of startups that listen to the podcast, and I feel like most of them would kill for that experience. It’s really funny, I feel like you always see your first users using your system in ways that you would not have expected. That's where you start to gravitate towards these people who are using it. When you met this room full of people who weren't academics but were using CFEngine, were there any surprising use cases or significant reactions that you just didn't foresee people using CFEngine for?

Always, people never used it the way I thought they should use it, and to this day, they still don't. But there are a few aficionados who are really good and understand all the principles. It does way too much, right? I envisaged this whole artificial life system that was monitoring the system. In the nineties, it was doing early machine learning, which was almost unheard of back then. It was monitoring, reacting, and compensating and doing these various things together. Some people would use one tiny bit of it, and others would use another. It blew me away that by the early to mid-2000s, you could go to literally any computer data center in the world, from the mega software places like Amazon or AT&T to the smallest startups, and find CFEngine being used. You could turn over any rock and you would find CFEngine growing underneath it like a moss. It was on millions and millions of systems. It was just extraordinary to me. Even today, that version, which was version two back then, is still running in some of the largest providers, who shall not be named. They might be running just one script that does hardly anything, but it's still there, which is extraordinary when you think about it. The lifetime is now up to 33 years or something like that.

Yeah, that is awesome. As Google, Amazon, and other big clouds started to emerge, how did that affect CFEngine? How did CFEngine change to meet these challenges, moving beyond just configuration management to encompass infrastructure management?

So back then, around the early millennium, say 2000-ish, I was very active in the research community around system administration. I was the chair of the mega-conference in 2001. I was involved in a lot of these processes. We had many workshops around CFEngine, which started out as the CFEngine workshop and gradually evolved into the Configuration Management workshop, where tools like Puppet emerged. So there's a whole history there. That meant that we were all talking to each other and adapting to the changes.

My PhD student at the time, Kyrre Begnum, and his friend John Sechrest, whom I met at LISA, were very interested in virtual machines. Honestly, they didn't interest me that much at the time, but they were really keen on these virtual machines. John, in particular, saw the business potential of virtualization as a way to manage data centers. So we built an early model of cloud using CFEngine to instantiate machines and control ongoing processes on different machines.

Because maybe not everybody knows this: CFEngine is designed to make every individual computer be an autonomous entity. It's not a system where you deploy one image to every machine; it had to adapt to various special uses. Every machine had its integrity and autonomy to the point where it was actually impossible to send instructions or “destructions” to the machine from outside, by design decision, both for security and privacy reasons. Basically, people didn't want their autonomy overridden by some central entity.

That model made it less straightforward to deploy to, say, a million machines and decide what each one would do and orchestrate all that behavior. So that required some pretty fancy jujitsu to make that happen, which all became part of the theory behind CFEngine. At some point, I realized that the original model I had was missing some pieces.

Promise Theory: A New Paradigm

And I kind of went back to the drawing board and came up with something that ended up being called Promise Theory in 2004; that was actually the first time I wrote it down. I thought it was maybe something I'd work on for a couple of months while I figured it out. But it's actually taken over 20 years of my life, figuring out all the consequences of this model, how things cooperate and form systems in everything from biology to physics to computer science to AI and all that stuff, today. It's all embeddable within this framework of Promise Theory, which is essentially about how initially autonomous things cooperate. Turning it into something that can be scaled has kind of taken over my life.

I read up a bit on Promise Theory, and I'd love for you to give an ELI5 to people who aren't familiar with it. But it seems like that idea is especially relevant with the speed at which AI is starting to do many things - whether good or bad, as you were saying earlier - if we focus on those good parts and filter out the bad parts, it seems like that becomes almost more important as we get to these systems that can behave on their own.

The whole thing came out of an argument that I had around 2003 with some guys doing network management in Europe. The classical model that everybody in computer science still uses to this day is that you have to force computers and anything around you to do what you want. So it goes back to our manual intervention kind of stuff. We like to use our hands, get in there. We imagine ourselves wrestling systems with our hands and turning them into whatever we want by our control. But actually, that's not the way things really work. We need their cooperation to be able to do that. People would make systems based on the notion of obligation. "We must make the system do this. We must force the system to do that. It must obey our commands." I turned it on its head and asked: what if you can't do that? What if you need its permission to make a change? And it will only cooperate on a voluntary basis with your instructions or your wishes. So if you say, "I would like you to obey this security guideline," and it says, "Screw you. I'm not going to do it," there's not much you can do about that. If you can get the cooperation of the system, it could say, "I will take the bits that I want, and I will leave the bits that I don't, and I will cooperate to some extent."

So instead of forcing, obliging, and making things mandatory, I wanted things to promise what they would comply with or what they would agree to do or behave as. And to build the scaling of systems from that idea that we respect one another's autonomy and react to one another's promises. So if you won't promise to be a good guy, I'm going to promise to protect myself against you, and that's the way it's going to be.

As a scaling paradigm, that works extremely well because it requires a lot less communication to get on with your own thing and occasionally listen to hints and rumors about this and that, than it does to actually send every single instruction over the Internet, character by character, and often repeated 20 times a day. So it's enormously wasteful to do that mandatory push-based engineering, and it's very scalable and economical to do pull-based voluntary cooperation.

Promise Theory is about that. It says, imagine a world in which every agent, every thing, person, machine, whatever, is an independent entity. You can't force it to do anything. It will only promise what it intends to do. It's a statement of its intent, that declaration of intent that we were talking about in CFEngine. That's the key to the desired state, its lifestyle, what it intends to do. And you have this lifestyle model of engineering in which everything declares its intent to behave in a certain way, and we manage that and corral it in ways, much as we do with humans.
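
A minimal sketch of that voluntary, pull-based cooperation, written in Python purely for illustration. The `Agent` class, the `accepts` set, and the policy keys are all hypothetical and are not taken from CFEngine or from the Promise Theory literature. The only point shown is that a source can publish, while each agent adopts just the parts it has promised to accept.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A hypothetical autonomous agent: nothing outside it can change its state."""
    name: str
    accepts: set = field(default_factory=set)   # what this agent promises to accept
    state: dict = field(default_factory=dict)

    def pull(self, published: dict) -> dict:
        """Voluntarily adopt only the published settings this agent promised to accept."""
        adopted = {k: v for k, v in published.items() if k in self.accepts}
        self.state.update(adopted)
        return adopted


# A policy source can only publish; it has no way to push into an agent.
published_policy = {
    "ssh_root_login": "no",
    "ntp_server": "ntp.example.org",
    "wallpaper": "corporate",
}

web = Agent("web01", accepts={"ssh_root_login", "ntp_server"})
lab = Agent("lab-pc", accepts={"ntp_server"})

web_adopted = web.pull(published_policy)
lab_adopted = lab.pull(published_policy)
```

Here `lab` takes the time server and ignores the rest, which is the "I will take the bits that I want" behavior described earlier; anyone relying on `lab` has to plan around what it actually promised, not what they wished it would do.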

And this is something that we kind of pursue. I feel like as operations engineers, kind of everywhere, whether it's our secrets management or compliance around PII, we wish we had systems that could actually ensure that a healthcare patient's Social Security number is never accessible by a human. But we know that someone will find a way around that system. There's a database somewhere that someone can log into and access, and eventually, it will end up in an application where it shouldn't be, right? These are the challenges we're tackling today, but I feel like we are often reinventing the wheel as we move from one technology to another.

Absolutely. The reason I've dedicated 20 years to this work is because I saw how powerful and fundamental this idea is, not just for scaling engineering systems like physics, but also for protecting autonomy and information integrity. It's inherently tied to security and scalability, but when we talk about managing functional adaptations, it's also at the core of how we create autonomous software components—from object-orientation to functional programming. Nowadays, people are developing actor-based systems where autonomous entities exchange hints to adapt and respond while maintaining their safety and integrity, yet remaining responsive without being overwhelmed by external requests.

This concept has been hugely important across various domains I've worked on. I moved away from configuration management years ago, after starting the company CFEngine and releasing the latest version, CFEngine 3, around 2008. For the past decade I've applied these principles to other areas: software-defined networking, data pipelining, monitoring, data consistency platforms, and microservices integration. Microservices illustrate another instance where we seek autonomy in small silos. In the past, silos were viewed as negative, but now we see their autonomy as beneficial, though they need to communicate effectively to avoid problems in scaling our human workforce. This has led to a deeper understanding of scaling not just technical systems, but human engineering systems as well. Human interactions, crucial in maintaining, reprogramming, updating, and developing these systems, often receive less attention than technical aspects, as many people find it easier to communicate with computers than with fellow humans.

The story of Promise Theory also plays a role here, where managing teams and fostering communication aligns with a DevOps approach. It revolves around how we cooperate by making promises to each other, rather than throwing commands and instructions over the wall and obliging one another, forcing each other to do stuff.

Yeah, is there a practical implementation of Promise Theory, or is it more of a theory that we should apply to how we develop software, contracts, and APIs? Are there existing tools or frameworks that can actually be used to wrap around systems?

There are many implementations of Promise Theory. The theory itself is articulated in a book, grounded in mathematics without specific allegiance to computer science or biology; it's quite universal. Then there's CFEngine, which was the first IT implementation of promises. It introduced a language for expressing promises autonomously, akin to "I promise to do this, cross my heart and hope to die" rather than "you better do that or else" based systems. Cisco's software-defined networking product, ACI I think it was called, was a first software-defined networking model built on Promise Theory. Then there were the smart data pipelines that we built around Aljabr… it was a project that was a great way to understand cloud-based systems at a higher level, similar to how data flows between microservices today. APIs play a crucial role in this ecosystem as well.

What I see is that, even though I've taken a more hands-off approach lately, more systems are following the ideals of Promise Theory, whether I had anything to do with it or not. It seems that people are converging on the idea that this is how we should build systems now. One of the first things that jumped out at me about Kubernetes was that the initial presentation was basically a Promise Theory talk. Whether my past talks influenced this direction is uncertain, but I see a broader shift toward autonomy, not just in technical systems but also in human teams. Concepts like the Agile Manifesto and Lean management emphasize letting individuals leverage their expertise, make promises about what they'll deliver, and uphold those commitments with minimal coordination. These ideas are very general and recur across various domains, and I'm hopeful they will continue to help solve problems by striking the right balance between push and pull strategies.

I feel like, with the rise of AI, initially, when we started seeing things like chatbots and GPTs, the talk was about automating interactions with APIs. Now, everyone's trying to use AI to write code, but I feel like it hasn't quite hit the mark yet. It has exposure to code but lacks the experience of running it. When we write software, I find a lot of the least valuable time is spent interfacing with someone else's API. I have a business goal and user needs, but I'm wrestling with an API designed to expose a system to me.

Looking ahead, AI could streamline these interactions. Imagine systems that focus on what they're meant to do and clearly define what they can and cannot allow others to do. These systems would seamlessly interact, like Stripe handling payments without needing me to parse through documentation and potentially make implementation errors. These systems should be able to behave together, and it seems like that would be really important as AI starts to take over maybe some of the way that our systems interact. We want to be able to wrap these guarantees around it.

We're already seeing glimpses of this in Kubernetes and other operational software. Whether we know the term or not, engineers are intuitively moving towards systems that cooperate effectively. 

I see a strong resistance in IT toward learning from the past. We're not very good at passing on knowledge from generation to generation. Part of this is because technology changes rapidly, creating a steep learning curve to keep up. It often seems easier to create something new than to absorb knowledge from the past. I believe this reluctance will somewhat hinder us. Anyone believing that AI will solve this issue for us is kind of fooling themselves. Knowledge isn't something you can simply input into a system to handle on your behalf. It must be very much in your mind as you are interacting with it.

There are these Wiki systems companies buy to store corporate knowledge. I often joke about them saying that they are data graveyards where knowledge goes to die. People write documents like "How to Install a New User on Our System," they're added to the wiki, and then never revisited. The next person figures it out the hard way and writes their own document, which also goes unread. This creates a growing waste of knowledge that's rarely revisited.

This pattern extends to academic literature as well. Lessons learned are documented, but seldom reviewed more than once or twice. Over generations, this accumulation of knowledge loss accelerates. In some ways, I feel honored that Promise Theory survived despite these challenges. It's heartening that some people still remember and discuss it, preserving its ethos even if the name or its originator fades.

My dream has always been for Promise Theory to permeate the industry, empowering people to use it for good and improve systems, thereby enabling us to solve problems we couldn't solve before.

Yeah, as we mentioned earlier, knowledge management is one of your focus areas. Could you tell us a bit about some of the work you've been doing in that space?

When I started the CFEngine company, I didn't think we'd be able to sell configuration management as a thing. Because it was free, right? It was open source. Everyone was already using CFEngine 2, again literally moss under every stone in every data center. Then came Puppet and Chef the year after, and they were supported by these VCs. And I thought maybe there is a story. They had some customers, so people were buying this stuff. It was amazing, so why not? So we had a go, and that's how the company started.

But my thought was, we already know how to configure systems. What we don't know is how to understand the monster we've created, especially as we got these enormous systems, thousands and tens of thousands and millions of machines, after a while. It's easy to set something up, but as it evolves and degrades and misbehaves or whatever, how do we understand that? How do we keep it on track? We can't possibly keep it in our heads, so the machine needs to do some of the work. Hence the AI connection. Hence the machine learning connection.

But again, the fierce resistance of engineers to adopting those automation methodologies has held us back. And what we've ended up doing is, instead of mastering the scaling through technology, we've divided up the scaling into small pizza teams. Everybody is dealing with their small system and everybody is managing an interaction… a symbiosis between a bunch of developers and a bunch of containers, which never scales beyond a few dozen boxes and which people can keep in their heads. This is the opposite of what I anticipated back in 1993, when I thought the machine would take over and we'd just interact with it as we do other technologies today.

Fun fact: CFEngine started making waves as early as 1993, transforming how we think about system administration.

But people valued their own expertise and saw it as a game, right? I believe many IT workers are gamers, and for them solving problems is like killing dragons on the command line. They would much rather figure out this perfect jutsu move with all the backslashes and minus options to some shell command to make something happen by magic than they would ask some AI or some automatic system, please figure this out and solve it for me. Because they want to have the honor of having done that and prove their skill to their coworkers. And that persists. It shifted away from sysadmins now to developers. So developers are doing a lot of those things, whether they like it or not, many of them don't. That's been forced upon them by that method of working. And it's evolved out of that need to scale the human management task rather than the technological management task.

Interestingly, there are still some of the old users of CFEngine. Pure systems, like, I think I'm allowed to say, LinkedIn, because they never hid that from anyone. They only had like five engineers running their entire world operation and they did it using CFEngine. Five people could scale millions of machines. That was the idea back in the day, because there weren't that many skilled people. But now it's shifted more towards that pizza team model, and I see that definitely the way things are going.

And so the scalability challenges and the communication and cooperation challenges also change. Now, you never go to an IT conference without almost 90% of the conference being about IT guys saying, how do we make friends? How do we talk to each other? How do we form communities? How do we hold each other's hands and be good to each other? Because they know how to do the tech. They can build things on a penny, but they don't know how to talk to each other as human beings and cooperate and solve those human challenges of working together and productizing things and serving one another and being a good service industry, in a way.

There’s a State of CD report, and one of the things they measure is the adoption of infrastructure as code configuration management. It's actually a terrifyingly low number: only 27% of respondents. And it's probably not 100% rolled out in most of the companies these people are at.

I have friends that I would consider graybeard or older school operations people who resist this idea of community and using tools like CFEngine, Terraform, Ansible, etc., to codify their expertise. And the wild thing is, at the same time we've always had this joke of the 10x engineer that recruiters throw out, and operations people, truly, if they lean into community and building software to reproduce systems, are the only ones who can really be a 10x engineer. They can actually force multiply and make other teams way more efficient.

And I think that a lot of them have a struggle acknowledging that. We're not necessarily trying to all amass power, but like you have a lot of power in being able to share your knowledge, not just hoarding your knowledge. And I feel like a lot of them think if I automate me, I don't have a purpose here anymore. And it’s like, no, you do. You have your continued expertise on how this system behaves in production.

Absolutely. Going back to your original question, which I lost in my diatribe, apologies. A lot of my research has been about how to represent knowledge and how to write it down in a way that allows it to live in people's heads and not just in some database. Because the worst thing you can do is put it in a wiki or database and let it die. You need to be getting it out and looking at it.

Right now, I'm learning a new language. And you understand the importance of practicing, rehearsing, using, and thinking about how to apply the language when you try to say something, when you're trying to make use of knowledge. We don't often think about that with respect to IT or technical skills, or when we go to school or learn physics or something. We don't try to apply it every day. We'll be happy to consult a book after a few years and see if we can pick something up. But really, it's active knowledge on a day-to-day basis that is useful to us, and that requires keeping it in your mind.

This actually led me to apply Promise Theory in a different way, something that became known as semantic spacetime. Where you can imagine a world in which every volume in space and time has some kind of function, has some kind of message or purpose. It's solving a problem. Like the table in front of me is holding my glass up. It has the purpose of holding things up, whereas the shelf is for storing books, obviously, and the computer is for doing this and that, and everything keeps a kind of promise. And we organize space and time around us in order to be functional in that way.

This is how everything from ants to human beings adapts the environment around them to create this kind of symbiotic, accelerative process around us. I've spent a lot of time talking about how this works. I wrote a book about it called "Smart Spacetime." I wanted to write it to bring more people along on this journey of understanding how we get from the very basic ideas about organizing stuff, all the way through understanding physics and biology, to AI at the end, with a very small number of basic principles that come out of Promise Theory.

Everything's an autonomous thing. You offer something, somebody accepts it or not, and you make that work, or you don't. You kind of evolve that point of view. It’s the basis of Darwin's evolution. Things offer their random mutations, and they're either selected or not by something else. It's offer and accept. It's this contract idea between autonomous entities. It's not under any grand control; it's an emergent thing. And yet it's the basis on which all of our complex systems have come about.

I do feel we've got a long way to go in understanding how to make these things work. We tend to run after the latest shiny object in tech because it's cool, but figuring out how to put all this to the benefit of humanity is something we still don't know or understand very well.

I was in New York, just as the cloud was coming out, and I had a fierce argument with a good friend of mine about the cloud. He thought the cloud was going to take over everything: it’s so cheap, there's no argument against it. And I said, it's not going to be cheap forever; because it's so easy, so good, it's going to end up being expensive. Supply and demand. He said, no, it's going to take over. I said, well, eventually people will go back to their small systems again. And I think we see that now: things fragmenting, people walking away from the cloud, back to on-premises systems, because of not only the financial cost but also the mental cost of coping with that complexity.

The reason we've built microservices and pizza teams is to protect humans from having too much to cope with. The machines can do it; it's we who can't do it. And that's why we scale things in small cells. So we have this cellular - again, the biological model - creating new cells, new organelles, new organs, which we recombine on different levels. And this is the way we need to understand smart spacetime as the technological basis of our next-level civilization.

Huge number-crunching things, like the ultimate AI, are not going to happen because we don't want them to happen. We don't want them to replace us, and therefore they won't. We will scale them back until they help us and make us better life forms independent of one another, because we value our autonomy too much to let it go.

The Future of DevOps and System Administration

Definitely agree. I think AI is one of those things that if you think it will replace you, it will. And if you don't think it will, it won't. And I think that's the distinction there: realizing that you can use this thing as a tool to make yourself and your team better versus how will I compete with this?

So we're getting close to the top of the hour. I would love to know - it's funny, I feel we don't necessarily learn from our past and we recreate the wheel in the way that we've managed systems. We've gone from data centers to VMs to containers to serverless containers, and we're about to see this big shift to systems that may or may not be managed by AI - What advice would you give to current or aspiring DevOps and system administrators trying to become more efficient and become a better member of their operational community?

Read my books, damn it.

There will be a link in the show notes for sure. I just ordered it yesterday.

Closing Thoughts and Advice

That's number one, obviously number one, the most important thing of all. But read books because there are so many ideas out there. I reckon, on any scale, I'm a cross-disciplinary person, whether it's coming from physics to biology to computer science, whatever. It's true because I read a lot and I have a lot of ideas at the same time. So I'm working on three things at the same time and end up kind of putting them together, which leads to innovation. So read everything that you can, understand everything that you can. 

Maybe the two most important things for everybody to have a good life are to be humble and to be good to yourself. Because whipping yourself to learn every last thing, or to master every skill, or to work every hour, day and night to produce as much as possible, and make as much money as you can, none of those things mean anything at the end of the day as much as your own well-being and your own state of mind. You do your best work when you feel good about yourself. So taking care of yourself first is always a good thing to do. And being happy is also the place from which creativity and generosity spring. Again, that's how you make friends, that's how you form good communities, and it's how we end up making the next generation.

Simple answer, right? There's no rocket science to living your life. It's buy my books and be good to yourself.

And be good to yourself. It's that easy! I think that's the perfect way to end the show. I absolutely love it. Mark, it has been wonderful having you on, I really appreciate the time. 

Thank you so much for the invitation. I very much appreciate being remembered.

Well, I mean, CFEngine started it all. Thanks so much.

Important Links

Featured Guest

Mark Burgess

Founder of CFEngine