What replaces CentOS [Distance DevOps Dec 15]

Mar 19, 2021 21:43 · 3271 words · 16 minute read

Hello, I’m Rob Hirschfeld, CEO and co-founder of RackN. This is the December 15 DevOps Lunch and Learn, where we talk about CentOS and the changes Red Hat is making, what the alternatives are, and how the industry is going to react. It’s a great conversation, and it really stands the test of time. Even as we release this in March, I think the discussion brought up exactly the issues that we’re still grappling with. So enjoy the conversation.

00:27 - Rocky, Linux has been busy. neighborhood Linux is reshift. You.

00:37 - Yay. I’m surprised that Rocky hasn’t come over there yet. But, well, I have to find the right people over there to migrate. Actually, what I’d love to do is join up the folks starting, or spinning off, Rocky Linux with the decentralized web folks and see how far we can go.

01:05 - Is it actually gonna be a company? Or is it just a full org, a foundation? Working the whole set offs thing and trying to rebuy foundation doesn’t say company, right? It won’t be a company. It won’t be. It’ll be an organization with governance and a few other different things. But any contact there? Yeah, I do. I’ll jump for that.

01:39 - So I haven’t been in these meetings at one o’clock, because I’ve been on the HPC mg meetings at one o’clock. And this week, the management structure of HPC got so extremely overloaded with Rocky that we’re not having a meeting today. So that’s why I’m able to actually be here today.

02:00 - So what does that mean? What do you mean that it got so overloaded with it, just considering changing it, or? So, one tweet from Jim Kay ended up causing the Slack to go from, I think, like 1,000 users to 5,000 in days. And it just got crazy busy. Everybody signing up, managers jumping in saying, hey, here’s hardware, where can we throw our money? Where can we get this going? Wow. The forums went from, I don’t know how many people we have now.

Again, we turned it on, and then within a few hours we have, let’s see, I can’t even, it hasn’t tallied yet. It’s thousands and thousands and thousands of users. So yeah.

03:15 - Wow. The abandonment of CentOS. CentOS basically announcing that they’re EOLing CentOS eight next year.

03:28 - It’s moving to continuous updates, right? I mean, well, it’s essentially moving CentOS upstream as opposed to downstream. Yeah, actually, that’s right. And so you have, you know, Fermilab, you have all the national laboratories, you have all of the, you know, huge HPC infrastructure, all reliant on this 10-year cycle of releases that they basically just abandoned.

My backgrounds are going crazy today. They basically just abandoned the long-term release structure, so it kind of caused a mass migration. It was kind of a, excuse my language, a dick move by Red Hat. And yeah, you know, it was definitely IBM. I was wondering when IBM was going to really up the volume on Red Hat dick moves, because there’s a tendency towards them anyway. Well, is that an assumption, or? I mean, nobody had a warning that this was coming.

I mean, that that’s why it’s being called a dick move.

04:51 - Right. I mean, do we know, or is that just the most, you know, perhaps IBM has made an official statement that Red Hat was gonna do this anyhow, quote unquote, “It wasn’t us.”

05:09 - Britain bedded into Red Hat was just released.

05:15 - But it was IBM holding the night. It was Red Hat who actually pushed it forward.

05:23 - So is there consensus to migrate off of CentOS to Ubuntu LTS, especially for HPC use cases? Absolutely not. Ubuntu does not support. I mean, long term Ubuntu is not empty. Yeah.

05:40 - Ubuntu is not useful in a, except maybe as a container, or as a virtualization system. capizzi generally, yeah.

05:55 - It’s my desktop. Yeah. FreeBSD What? Oh, sorry. Somebody, man, but I have to whip out the apple about the tattoo.

06:04 - I was just gonna say, we might end up seeing migration to FreeBSD or similar, just for the underlying platform, the non-container one? Um, sadly, I don’t think so, simply because, again, the tooling is not there for HPC. You know, the IPMI libraries, the, yeah.

06:26 - I mean, you know, there’s so much there. We count on CentOS as the base, right? I mean, for what we do, we bake CentOS, it was seven, we just migrated to eight, as the discovery image.

06:41 - And we do that because it’s so universally supported. The worst thing is, they waited right until the wave of people who had been putting off updating to eight started kind of pushing over. You know, everybody had been saying, okay, we’ll update to eight next year, kind of thing. Well, next year, and then this year, yeah, COVID. Everybody started pushing over to eight.

And so that wave of people switching to eight started. But now you’ve got people who were switching to eight who are locked in and saying, oh, we just spent all of this time, money, and effort to switch over to eight.

07:29 - And now you’re telling me it’s EOL next year? Tell me your life cycle. Wait, wait.

07:37 - But I mean, one, Red Hat sells support for RHEL, right? Let’s get on a five-year cycle. Yeah. Life if you really, really want.

07:51 - So put on a few. So first off, it makes me ask the question, what’s the next move? This wasn’t done in isolation. But it was forced to Rh do I mean, it was really, truly, let’s take all of these samples. Let’s take all of the CentOS movement and people and turn it into our upstream. Let’s do with CentOS what we’ve been doing with Fedora. Yeah, let’s put everybody on what’s happened.

08:20 - What’s happened with Fedora? Nothing. I mean, it’s Fedora. People use it as a desktop operating system, not as a server, for that same reason. It’s essentially Fedora now. But then the new CentOS kind of competes with Fedora. It does.

08:41 - Yeah. But IBM owns all of them. So what does it matter to them? Yeah, as long as they can funnel everyone into RHEL and paid subscriptions, it doesn’t matter to them.

08:54 - Right, that was the plan. This is what to me is funny, right? That was the plan all along, you know, that you were supposed to go. Yeah, that’s why they bought the CentOS team. Because they were trying to, you know, they want you to move into the position music expected to be consultative, right? They don’t know why they bought it.

09:21 - I put my marketing hat on and I go, look, I’ve got too many people jumping to eight, right? They’re locked in for a 10-year cycle. I want to force them to basically go upstream, and now’s the time to make the change. And then I’ve got a lack of HPC features. So the next thing I’m looking for is how do I start seeing more of those in the core platform, so they have a migration path to it. And then there’s this thing where I sit here looking at this release cycle as a software person, and that’s hideous.

Yeah. Ten years of support is unmaintainable. So I don’t wait for the market back for that. So I’m kind of asking, what are the next steps? I think it’s probably a bit more thought out than just, I don’t know. And the irony to me is that we never saw CentOS as that stable, right? They yank the ISOs for older releases, like, you know, the day they come out with a new drop. So we were always battling the libraries and stuff like that not being there when people needed it.

Well, that’s why you aim at the vault instead of at the main line. So yeah, on a release, it goes into the vault. So if you want stability, you aim at the vault. Releases move into the vault on launch, including a copy of what was released on that launch day.

10:43 - So if you want ultra stability, you aim at that. Well, previously you’d aim at the vault; everything was vaulted. And so that’s where, you know, your release structure would work for that.
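
[Editor’s sketch, not part of the conversation.] The idea of aiming at the vault instead of the main line can be illustrated by generating a yum repository stanza that points at one archived point release. This is a minimal sketch under stated assumptions: the mirror root `https://vault.centos.org`, the CentOS 7-style path layout `<release>/<repo>/<arch>/`, and the helper name `vault_repo_stanza` are all illustrative choices and were not mentioned by the speakers. Newer releases may use a different directory layout.

```python
# Hypothetical helper: build a yum .repo stanza pinned to one archived release.
# The root URL and the path layout below are assumptions (CentOS 7-style tree).
VAULT_ROOT = "https://vault.centos.org"


def vault_repo_stanza(release: str, repo: str = "os", arch: str = "x86_64") -> str:
    """Return the text of a .repo stanza that targets a single vaulted release."""
    # Pinning to an exact point release means the package set never changes
    # underneath you, which is the stability the speakers describe.
    baseurl = f"{VAULT_ROOT}/{release}/{repo}/{arch}/"
    lines = [
        f"[vault-{release}-{repo}]",
        f"name=CentOS {release} {repo} (vault, pinned)",
        f"baseurl={baseurl}",
        "enabled=1",
        # Assumes the distribution signing key is already imported on the host.
        "gpgcheck=1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example release string; substitute the point release you actually depend on.
    print(vault_repo_stanza("7.9.2009"))
```

Pinning to an exact release string rather than a major version alias is the design choice here: the alias follows the newest drop, while the pinned path stays fixed.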

10:54 - But yeah, the Rocky stuff is interesting, and it has potential, and, holy crap, does it have a lot of people pushing, trying to move forward. And there seems to be a good amount of structure and a good amount of intention. And the amount of donations that people are willing to offer, like hardware and software and time and effort, is pretty extreme. So I’m hoping that a lot of good things will come out of it.

11:28 - It seems to me, for a long time, with a 10-year release cycle, a 10-year support cycle, the hardest problem is maintaining the codebase.

11:39 - Right, so keeping it patchable. That’s expensive, backporting patches, and, well, in addition to that, also maintaining forward movement. I mean, that’s always been one of the most difficult aspects of this. But it’s also solved a lot, at least in HPC, through containers. You know, one of the joys of Singularity is the fact that you can have this beautiful 44,000-node HPC structure that works beautifully with these ultra-stable libraries underneath.

And then if you want your latest, greatest, whiz-bang editor, compiler, whatever, you just fire up Singularity in your, you know, CentOS 12 container or whatever, and away you go.

12:33 - What about device drivers? They’re not that bad, at least, specifically, for, like, HCI and Nvidia drivers and common release drivers and things like that. Maintaining it, at least underneath Singularity, hasn’t been bad at all. There’s a lot of effort out there to maintain that. But again, that’s when you’ve got a single, like everyone’s supported sometimes.

So that was one of those things: it’s, you know, the standard Linux kernel.

13:25 - Everybody pretty much supported CentOS to some extent or another. I don’t know if that’s going to be the case moving forward, though, which is a major concern.

13:36 - Yeah. Like you said, everybody was moving to eight. So most people are not sticking with a 10-year cycle. But having the ability to stick with the cycle, if necessary, matters. If you take a look at HPC, and if you take a look at EDA, where I’m more focused, silicon development, that sort of thing, it’s very common to stick with older releases, just because the software vendors don’t move their support very quickly, at least in the EDA world.

Getting Synopsys to support, you know, or getting Cadence to support a modern operating system is amazingly painful. Same. So, I was in release at Cadence for a while.

14:32 - Very good. Yes. Then, you know, well, yeah, I have people over there. So I actually release all of my EDA tools inside Singularity containers now. My users don’t even know they’re calling Singularity containers, which is beautiful. When they’re using their, you know, DV scripts or their design scripts, they’re actually firing up a Cadence tool that’s inside of a Singularity container.

They don’t even know that it’s actually running in a CentOS seven container inside of a RHEL 8.3 machine on the most modern kernel. But they do know that, hey, they can run their latest, greatest editor, they can run the latest, greatest Python, they can run the latest, greatest whatever they want, but then still also get access to these legacy tools. And these legacy tools believe that they’re running in a fully compatible environment, because they are.
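
[Editor’s sketch, not part of the conversation.] The pattern described here, where users call a script and never see that the tool runs inside a container, might look roughly like the wrapper below. It relies only on the standard `singularity exec <image> <command> [args]` invocation. The image path `/opt/images/centos7-eda.sif`, the tool name `legacy_tool`, and the function names are hypothetical placeholders, not details from the speaker’s environment.

```python
import shlex
import subprocess
from typing import List, Sequence


def containerized_command(image: str, tool: str, args: Sequence[str] = ()) -> List[str]:
    """Build the argv that runs `tool` inside the given Singularity image."""
    # `singularity exec IMAGE COMMAND [ARGS...]` runs one command in the
    # container's userspace while sharing the host kernel.
    return ["singularity", "exec", image, tool, *args]


def run_tool(image: str, tool: str, args: Sequence[str] = ()) -> int:
    """Run the tool in the container and return its exit code."""
    # Requires the singularity binary on PATH; the caller only sees the tool.
    return subprocess.call(containerized_command(image, tool, args))


if __name__ == "__main__":
    # Hypothetical image and tool names, shown only to illustrate the shape.
    argv = containerized_command("/opt/images/centos7-eda.sif", "legacy_tool", ["--version"])
    print(shlex.join(argv))
```

A site could install one small script per legacy tool that calls `run_tool`, so design scripts keep invoking the same command names while the host moves to a newer release.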

But yeah, what I was going to say, though, is that a lot of these older installations, HPC installations, are still running on CentOS six. Yeah, in some instances on CentOS seven. And six is EOL, right. But they’re still being forced to run on it.

15:55 - Because, for whatever reason, whether it be a specific driver, whether it be a piece of software that cannot be re-engineered, whether, you know, some of the particle physics laboratories have very specific hardware, with an interface already written, that they can’t change. But that 10-year lifecycle gave them design time to migrate.

Sorry, rocket, I didn’t mean to talk over you, but space applications too: you’re talking to satellites that are 10, 20, 30 years old.

16:33 - Yeah, finding an X.25 or something driver for CentOS eight, it’s very difficult.

16:40 - They’re out there. But, you know, or take it another step backwards: 3270 emulation. And the industrial IoT world is another one that feeds into the need for those long-lived OSes, because once something’s out in the field, it stays out there. For instance, there are oil field monitors out in the permafrost of Canada that are connected with CDMA modems, and they haven’t been touched for five to seven years.

17:19 - And they just keep broadcasting and you can’t really update those puppies, or the sensors attached to them. Yeah.

17:28 - I just had a fight with Gartner, which is always a pleasure, about AI at the edge. And, you know, because we do tend to step in smart TVs, and about software to run on the traffic light.

17:47 - You know, I’m sorry, that stuff’s not gonna get changed ever.

17:55 - I do think, with cloud stuff, we’ve gotten really lazy about the longevity and life cycle of an infrastructure.

18:04 - You know, the hypervisor was a try at fixing that, right? It was the low-level hardware interface, with an ability to offer at least a clean abstraction to VMs above it. It turns out, you know, containers are another way, but it’s certainly a way to do it. And I think the LF Edge stuff, right, which is hypervisors for edge things, may work, but may not. That is, it’s not clear to me that vendors will support that; they’ll just continue to embed stuff.

But, you know, supporting abstractions, which abstract away hardware problems, and allowing the software to move on is a good thing.

18:58 - It is in general, but again, you know, if you start looking at the broader use of yet another layer of indirection, functionally, yes, it does solve the problem.

19:13 - But, you know, let’s use the same HPC use case we’re talking about, right? There are lots and lots of sticking points, reasons why stability matters and why we have to contend with it. In my particular use case, we have this massive cluster, and it needs to be hybrid, because the emulation that I’m working on is done in hardware; specifically, these are cognitive radios. And then, you know, there’s a compute cluster built on a cluster of cognitive radios.

And there, stability matters, because what you’re getting out of these emulation scenarios is so precise, and, I mean, the cardinality is so high. To add yet another layer, you have to start saying, okay, now we are going to do, quote unquote, rolling upgrades through multiple quadrants of this compute cluster. That adds another layer of complexity: you could have a heterogeneous environment in hardware, software, OS, different patch levels, and you have to work out how to add that in, how to start saying, okay, we’re going to start normalizing this.

So the top end captures, and already accounts for, heterogeneity at the, you know, software layer and at the hardware layer. So in some cases abstraction helps. In other cases, where abstraction is just there to, you know, kind of abstract away the hardware per se, it doesn’t help, because the problem becomes intractable, right? How do you normalize across this disparate infrastructure footprint, whether that’s hardware or software?

So in some cases abstraction helps. In other cases, where you’ve got to say, okay, now I’ve got to, you know, go to a shoe and, you know, run another experiment on how hard the normalization would be, that becomes really hard. There, you have to say, okay, if I’m abstracting at every layer, I have to do a full uplift, right, at each abstraction layer, and that’s haros cost.

21:27 - Isn’t it like the smart NIC conversation? What you wind up with is the lowest common denominator, right? What’s the level of abstraction that everyone supports, so I can run across these components? And then there’s a predominance? Yeah, that’s right. You’re absolutely right.

21:45 - Yep. All right. So wait, I’m gonna have a hard stop at 12:45. I got invited to be part of theCUBE re:Invent analysis. I can promote somebody and y’all can keep going if you want, but I was wanting to restructure his code.

22:09 - I do want to, yeah. Actually, no, but he was all about what we’re talking about here, which is: things are super complex, we need a whole bunch of operational tooling, you know, our system is really hard to use and confusing, and we’ve got to help people use it. I think smart NICs are both amazing and, operationally, I think you’re entirely right.

22:35 - That’s like, smart NICs are going to make people’s heads explode as a management plane. Thank you for joining us in another Cloud 2030 DevOps Lunch and Learn discussion.

22:47 - If this was exciting to you and these topics are interesting, we have discussions like this every week. Just come in and enjoy some DevOps discussion and camaraderie at the 2030 Cloud. Thanks.