Helm with Kubernetes: Package Manager

Dec 30, 2020 08:15 · 6330 words · 30 minute read

Welcome to the session “Helm: Past, Present, and Future.” I’m Matt Butcher. I’m Matt Farina. And I’m Bridget Kromhout.

I think a lot of people are interested in what Helm 4 will look like, and there are many possible answers here. Butcher, do you want to give us some of them?

I think Helm 4 is going to present a very interesting problem that we will want to solve, and that is: what exactly is Helm? Is Helm really and truly just a package manager, in which case have we gone too far with many of the things we’ve done in Helm 3, and is it time to tear down and get back to basics? Or are there still major features out there that Helm needs and is still lacking? Or is it some mixture of the two? I think that’s going to be the number one thing that the Helm maintainers are going to have to solve as we look at and move toward the roadmap for Helm 4.

So wait, you’re saying we have the floor wax and dessert topping problem, and we need to figure out the answer to it. What’s our move here, Farina?

I don’t know. The one thing that I want to see in Helm 4 is probably a little bit more consistency, because some of our APIs and some of our stuff behave differently, like the way we have a couple of different default time formats. If you’re using Helm as a building block for other things, which happens with package managers, you want a certain amount of consistency. Because we really put a focus on that in Helm 3, we can learn a lot and improve on that in Helm 4. So that’s what’s on my mind. But again, I don’t know, because we’ve got time.

Yeah, absolutely. You both bring up some interesting points, and I think we’ll explore those a bit more: the idea of package management, deployment, will it help solve all of your code problems? There are so many possibilities here that I think we would probably answer this by going into the wilds of the past. Farina, weren’t you just telling us that you just looked at some dates? How far past are
we talking?

So, if I remember right, the first commit of Helm was October 19, 2015. Just over five years ago, the first commit landed that started this whole thing off. And it all started because a company that Matt Butcher worked at, Deis, had a problem with their PaaS, called Workflow. Matt, can you talk a little bit about what that problem was?

Yeah. Coming out of Engine Yard, we had already built a product that was a general PaaS system that ran on a number of orchestrators, and we decided that we wanted to take that platform and rebase it on Kubernetes, and really focus on taking advantage of all the things that Kubernetes, which was at 1.1 or 1.2 at the time, had to offer. But one of the first things we discovered was that the process of installing our own PaaS on Kubernetes was tedious, sometimes taking hours and hours just to get all the YAML manifests uploaded and things like that. We realized that there had to be a better way. In reality, we were just dealing with a set of YAML manifests that we needed to upload and install in a particular order. That was the core intuition behind what became Helm.

And this came out of a hackathon, which would lead you to believe that it perhaps didn’t start with very specific and detailed planning. Farina, you joined the project slightly after that hackathon. Can you talk a little bit about the decisions being right versus right now, or how you even evolve a project that starts small and becomes giant?

So by the time I had joined the project, Helm had already merged with a Kubernetes project called Deployment Manager, and that’s where Helm v2 came about. After that, it was under the Kubernetes umbrella, and Kubernetes after that joined the CNCF. So we had Helm version 2 and we had charts, and that’s where I got involved: after Helm version 2 was out the
door, and we were looking at how you grow Helm and what’s useful. If you’re going to have a package manager and you want it to be useful, one of the things you need is tons of charts, tons of content, tons of packages for people to install and use. When I came in, there was lots of manual curation happening, and quite frankly, if you’re doing lots of manual reviews of the same thing over and over, it presents a great opportunity for automation. That’s where I jumped in: I helped automate a lot of these reviews so you could install tons of things. We ended up with hundreds and hundreds of packages in the stable repository that people could use, and then packages elsewhere through people’s own repositories.

Ironically, this is not the first package manager Matt Farina and I worked on together. Matt Farina and I had written Glide, which was a package manager for Go, just a few years prior to this.

I remember that.

Yeah, and it was interesting to go from that set of design constraints, the way you think about a dependency management system that’s largely about managing source code, and then switching over to something like Helm, which was really more an exercise in something closer to an operating system package manager. Back then we used to talk about that all the time: in our minds, if we talked about Kubernetes like an operating system, the next evolution of an operating system up the stack, then Helm was really closer to something like apt-get or Homebrew. A lot of our very early design inspirations came out of those, because of the metaphor that we kept saying to each other: what if we just treat Kubernetes like an operating system? What if we just treat packages the same way you treat installing a new binary package onto your operating system? So while at the same time we had already come
from a background of working together on package management systems, it actually presented so many differences this time that Matt dove in on the packaging side at the same time that I was really focusing more on the command line side and the Tiller side back then. So in spite of the fact that it was common ground in some ways, it was a totally different experience for us than building Glide had been.

Yeah. Now, you’ve mentioned a number of pieces there where I think you’ve gone one direction and then another. With the benefit of all of the hindsight, what would you do differently now than you did in the last five years?

So this begins the four-hour portion of our show. There are some things that I have often wondered if they could have been done better, and I try to remind myself of what the constraints were at the time. The original Helm did not have templating at all, and an early prototype of it used Unix-style environment variables and had no real programming logic in there. But we ended up settling on Go templates, and largely we settled on Go templates because it required us to build very little. We could basically start with core Go and build from there. The one thing I go back and forth on the most when I look back on Helm is: did we make the right choice there? Earlier you mentioned the contrast between doing it right and doing it right now. That is exactly where Go templates fit in, because we said, okay, there are languages like Jinja that are fabulous template languages, but there was at the time no Go implementation of Jinja. So we did talk about what it would take for us to write an entire template engine, and do we want to maintain that? Do we want to be able to pipe this out to any external templating engine, something we tried that went, in my opinion, very, very poorly? Or do we go with a default template
language and hope that it works well enough? We really ended up choosing path C on that one. In some ways it served us well, because we haven’t had to maintain this huge body of code. But in some ways, we had to cope with the template engine changing from under us when the Go developers decided that things were going to work differently. The documentation, honestly, has been very poor. Matt could probably speak to this better than I, but we’ve tried several different ways of documenting this, and it ends up becoming keeping the Go documentation in sync with the Helm documentation in sync with the Sprig documentation. It’s been an interesting tour. In the end, was it a mistake? I guess if I were in the same place again, I would probably make the same decision, but that is the number one that I still lose sleep over.

That’s a really good insight, and I think actionable for people when they’re trying to do technical decision making. There will never be perfect future knowledge, so you have to do what is expedient and hopefully also offers you options.

Yeah, and the thing that I look back on is a little different. I look back at the security angle. You can sign charts and have provenance files, and it uses PGP to sign those, but it hasn’t taken off in the public avenue. There are some people who use it to a great degree for internal security and things like that, but when you go get lots of public charts, they’re not signed. You don’t have that provenance, where you know who you’re getting it from. I would love to have seen something that made it more accessible to people and made it easier for them to sign their charts, to verify them, and to make that security angle more widespread. I don’t know how to do that, but that is one of those things I would like to see.

And I remember telling Matt, you know, it’s going to be
hard for new developers to come to it, but they’re all going to see the benefit of doing this, and by default we’re going to see 90-plus percent of people signing their charts. I would guess that now, four and a half or five years later, we’re still at what, maybe two or three percent of people who sign their charts?

Are we even that high?

Yeah. So if we can improve that security angle, maybe that’s something to try to figure out for Helm 4.

So what’s going on with Helm right now? I know I’ve been giving talks and writing blog posts, and we have a workshop that we’ll be doing about Helm v2 to v3 migrations; that’s very front of mind. But I would love to hear from you folks: what are some of the right-now things in the Helm project that you’re focused on?

So I guess the first thing I’m going to talk about is charts. Helm had this period of huge growth in popularity, and we had the stable and incubator charts repositories, and we had so many charts in them. Quite frankly, maintainers would come in and try to help maintain them, and we went through periods of burnout because there was so much activity, so much to do every day. Lots of automation was added, but it still wasn’t enough. We had maintainers burning out, running out of time. So instead of having this one repository, which was really example charts that morphed into “hey, here’s the common repository,” it’s now a distributed model where you have charts everywhere. Then we have the Helm Hub, because once you move everything distributed, you still need to be able to find it. The Helm Hub had so many users and it helped them do that, but the software that powered it was originally designed for on-premise, smaller installations, and there are more things than just charts out there. So the Artifact Hub project, which is a CNCF project, was started, and so now
we point people there, because with these things distributed from companies and organizations all over, we still have this central way of finding them. When we broke this up, it was done so that people could maintain their charts themselves and not have to wait on a handful of maintainers in the Helm project. It allowed us to scale, because we don’t scale. And then those people who are interested in charts can work on automation tools to help all of those different people scale out their chart repositories: chart testing, chart linting, and things like that.

Yeah, I would add two things to that. Artifact Hub has been, I think, one of the most exciting developments on the CNCF landscape in the last year, in part because the danger as any community gets to be the size that the CNCF community is, is that we will begin to fracture by default. Everybody will naturally drift apart, because we’ve all got our little fiefdoms that are all getting their own development. Artifact Hub was a great example of representatives from different sub-communities inside of CNCF coming together and saying, at the end of the day, we all have to figure out how to get our particular software packages into the hands of our users. That’s a common problem; can we do a common solution? So I think the work that Matt Farina and many others have done in unifying different projects around the idea of the Artifact Hub, and then getting Helm migrated over right away as one of four or five projects, has been a very low-key victory for CNCF as a whole, and I’m still very enthusiastic about that. I would also like to reinforce another thing that Matt Farina said, which is how important this idea of switching to a distributed chart development model is going to end up being. One of the big insights that the
chart maintainers had was that many of the more vibrant packaging communities don’t require central, authoritarian editorial control over things. If you look at npm or any of the major programming language package management systems now, they’re all largely developers self-motivating and self-curating their own stuff, and the core maintainer roles are really more about keeping the infrastructure running. I think that moving from a centralized chart repository into those kinds of areas will not just protect Helm developers from burnout, which has been a very big problem for every single one of us, to be honest, but it will also really empower the community at large to take on a role of curating, and also to rapidly build their own charts and their own packages for the things that they’re actively developing.

As the Helm project has grown, and you’ve realized that it doesn’t make sense to try to do everything centrally, or on a small scale just making the one-off call here and there, the project has also introduced the Helm Improvement Proposal, or HIP. Is it fair to say that these HIPs don’t lie? What are we doing with the improvement proposals?

So that was started by another Matt on the project, Matt Fisher, and they’re kind of like Kubernetes KEPs, Kubernetes Enhancement Proposals. When you’re a fast-moving project that’s just figuring things out, you can just put up pull requests, or file an issue, or throw a proposal out there. But whenever you have a big change on a mature project, just like Kubernetes says, you want to take the time to think it through, to look at the implications, to understand who the end user is and the impact on them. There are so many users of Helm, and we don’t want to hurt the 80 percent to please somebody in the one
percent. So we need to think through features and how they’re implemented, and the HIP process lets us propose something, talk it through, analyze it, and add details before we ever go create it. We’ve had many cases where somebody comes with a pull request for a neat idea, and it’s a lot of code, and we want them to rewrite it because it doesn’t fit the mold. I feel bad for them when they have to go rewrite it, or when they’ve put a lot of code into something that may fit better as a plugin, or as something that should live as a layer on top of Helm and use Helm as a dependency. This gives us an opportunity to think it through beforehand, give good feedback, and curate any big ideas.

Yeah, I very much like the way you frame it as something that is intended to be more respectful of our many contributors, because it’s really hard to say to someone, “I know you did a lot of work, but there are a number of things here that we need to consider before doing that. Do you mind rewriting several thousand lines of code?” I’ve also liked the way that the HIPs have become a form of documentation, so that we can point people toward a very definitive explanation of how something is supposed to function and what is intended. I believe that one of the things we’re going to notice in the future is that HIPs prove invaluable when we start dealing with bug reports and we’re trying to figure out if, according to the HIP, it should have been doing X and it wasn’t, or if it was really more of a sotto voce feature request or something like that. So I think this is going to be a good thing for the community going forward, because it’s going to give us not just assuredness that we’re doing the right things in the PRs, but also the benefit of being able to refer back to something in writing
that’s definitive.

All right.

Oh, I was going to say, it also helps future maintainers. One of the things when you come in to be a new maintainer on a project is that you’ve got to figure out what’s going on. Why are we doing things this way? What’s the history behind it? Because you may not know. I know that because I’ve stepped into more than one project after it had already been started and had to understand: how did we get here? Why did we make those decisions? I’m reminded of a HIP you’re working on, Matt Butcher, around CRDs and the pain points, because we’ve had more than one go at how to solve this and where the hard spots are, so we can try to solve those hard problems while thinking through cases for people who aren’t like me but are really relevant to the project. Another area where we’re really seeing the HIPs work well: I think, Bridget, your HIP number three is about to get merged, or is already merged.

Governance. Turns out you’ve got to write it down. It’s probably good to think through your governance for your projects out there before you have a problem. Just a pro tip. Farina, because you already jumped on that third rail of CRDs, do you want to give us the short version, the CliffsNotes: what is going on with CRDs and Helm, and where do you think it’s going?

So CRDs are hard, right? Because if you delete a CRD, it deletes all the custom resources of that type in your cluster, and that could cause production outages. Imagine doing some kind of helm uninstall because your app has an issue, and the next thing you know, you’ve deleted your CRDs, and then all your custom resources are gone, and all this stuff got deleted from your cluster, not even just for you but maybe for other tenants. There are really possible cascading problems. So we’re very cautious around CRD handling, because the gotchas can really hurt you. We’re looking for ways to try to make it easier while also trying to
shield the gotchas. If somebody wants to sit down and talk about that next feature, we’re happy to walk through it and see: will it help, will it not, does it require Helm 4, can we fit it into Helm 3? But realize we’re taking the conservative route, because breaking production, as Matt said earlier, is something we don’t want to do. We do have a couple of different ways to handle CRDs today that take that cautious route, but anything else is going to be very slow, worked through, and experimented on.

I think one thing that we should explain when we talk about the present and the future is how seriously we try to take backward compatibility in Helm, and following semantic versioning rules, which say that you do not ever introduce a breaking change without incrementing the major version number. So in many ways, when we start talking about Helm 4, we have these dueling instincts; in fact, it was illustrated earlier in the talk. On one hand, we look at Helm 4 and say, “Finally, I can fix that one function call that’s had a vestigial argument for the last eight months,” those kinds of things that in the grand scheme are minor, but that because of our compatibility guarantees we will not change until Helm 4. On the other hand are things like the CRD problem, where we’re saying, okay, this is a very big issue that is impacting many, many users of the Helm ecosystem. We committed to something for Helm 3, and we’re going to keep plugging away at that for Helm 3, but does Helm 4 offer us some new chances to try something that we can’t do today because it would break? This is where a lot of the HIP that I had worked on was saying, look, we’re stuck for Helm 3 inside of these fairly narrow guardrails, but in Helm 4, if
somebody can come up with a really intriguing way to work around many of these limitations in the system, we can go big at that point and work on thousand-line PRs that break interfaces and change the chart format and things like that. I’m curious if either of you know of other ones like that, big issues coming in Helm 4 where we could potentially do something big with charts, or with the indexing formats, or how repositories work, or things like that.

Some of the things I think are coming in Helm 3 anyway. There are the OCI artifacts: if the OCI distribution spec goes 1.0 with artifact support, then we can push and pull Helm charts from OCI repositories. But that could be a Helm 3 thing. It’s a future thing, but it’s not something that requires Helm 4; it’s not a breaking change.

That’s good. Butcher, I know you were talking a little bit before about how we have these theoretical HIPs and pull requests coming in from some amorphous person. What does that look like? How does somebody get involved?

Yeah, and this is one of the things that I think has really changed over time too. Taking that look all the way back to the beginning, every pull request was immediately considered, and within days we usually had things merged. We’ve had to take a more metered approach now, and HIPs are really helpful; I’m really happy with the way they’re going. But we’ve had to redefine the way that we’re asking people to join the community and participate in the ongoing development of Helm because of that. On one hand, it’s fairly simple: we’ve added more templates to fill out when you file an issue, to help us rapidly figure out what it is, because the volume of issues that comes in is high and many of them are answerable by reading the documentation. But when people don’t
find it in the documentation, they’ll file an issue, which of course is fine, but it’s meant that we’ve had to ask for lots of different things to be filled in in those issue templates. It’s always very helpful to us when people jump in on the issue queue and say, “Oh, I can also reproduce this; here’s some more detailed information about how I reproduced it,” whereas it’s very unhelpful for people to say “me too” with no additional context. It’s been interesting how reliant we are getting on the detailed analysis that our users are able to provide. That is fabulous, and I’ve been thrilled whenever I see people following on and saying, “Here are more details about how to reproduce this,” or even better, “I traced it down to this area of the code.” You might not be able to do the PR yourself, but giving us detailed debugging information will often help us get from the point where we’re saying, “Oh, this is frustrating, I don’t have enough, I’ll come back and look at it two weeks later when I go through my next sweep,” to “Oh, okay, I’ve got enough that I can reproduce this now and get going.”

Yeah. Some of the bugs we get now are such edge-case bugs that it’s something in a dependency of Helm that we pick up from Kubernetes that’s only affected on a certain operating system, and so those of us primarily on a different operating system may not even experience it. Trying to troubleshoot that and get an issue filed or a pull request fixed somewhere in the chain can be difficult. A lot of Helm maintainers currently work at cloud providers or vendors, but if we don’t have a maintainer we can tap to say, “Take a look at the thing for your cloud,” sometimes we just reach out to other members of the Kubernetes community and beyond and say, “Hey, it looks like this is a problem that is specifically affecting the users of your product or your service. Can you take a look at it?” And the community has been
really responsive with that.

It’s always nice when you can get the warm handoff instead of the “just go start over in some repo over there, we don’t know.” I have seen members of other communities come over into our issue and start talking to the user there, and it’s like, oh, that’s fantastic.

Most people think of the Helm client, the Helm CLI, and it has the library that other things will build on too. But we also have the website; the charts maintainers; all kinds of tools for using GitHub Actions to deploy your repository on GitHub Pages, or for testing charts in ways that go beyond what helm test and helm lint do. We’ve got a number of different projects; there’s ChartMuseum. Anybody can become a maintainer on each one of these projects. Once you’re a maintainer, you don’t just write code and push it up; you actually help other people be successful in contributing. So we’re looking for people across all of the code bases we have who are interested in and willing to do that. And there can be things that aren’t scary like the Helm client, and aren’t just code; they can be docs and other things as well.

Yeah, I really appreciate that perspective, because I think that tech in general can often prioritize and elevate the code contributions, which are obviously important but, as you’re alluding to, are also not all of the everythings. So, Butcher, I’m going to bring it back to the beginning. You mentioned templating as something that maybe could change in the future, and I thought, spoiler alert! Okay, what is your thinking about where Helm could go? Be futurist.

Strictly in terms of templating, we have this chance again to revisit some of the crazier things we’ve talked about in the past. We have talked about doing embedded runtimes, Lua, JavaScript, WebAssembly, something like that, that can help build a chart dynamically. Not off the table yet;
definitely, no, we definitely haven’t signed up for it yet. But there are some things like that that really excite me, where we might be able to push some boundaries in interesting ways, to really open it up for people to do some deeper and more robust things with their charts as they’re either rendering them or installing them.

Fabulous. And Farina, I’ll ask you the same question, because you mentioned security as a space where we could do a lot. What’s your vision for that?

Well, I know the experience I’d like. The technology I haven’t figured out, which is a hard thing to figure out, especially if you break away from PGP, which is what’s most widely known. But a really easy way to sign, to verify, and to share keys would be just a fantastic thing. That may end up being Notary version 2, which is trying to figure out some of the problems we’ve got with Notary version 1 in OCI and our ability to use those registries. But how can we make it easy, where somebody can sign without having to learn how to use GPG and try to remember what those command line flags are? I do it all the time and I’ve still got to look them up. How can we make it simple and yet work with hardware security and things like that? Because security is important. That would be a great thing to dig into.

I love it. So let’s wrap this with what we’ve learned that lets us give dispatches from the future to people who are on this journey with their cloud native projects. I’ll start: it’s a lot harder for us to deal with translations in our command line output, which we also generate some docs from, because that English is hardcoded into the code and output by the Helm binary itself. We do have people translating the website docs into Japanese, Korean, and Chinese currently, and hopefully more languages. If you’re into that, it would be awesome: just
translate a few pages; you don’t have to translate the entire website. But then there’s the problem of, oh gosh, people are starting to roll up with translations that we can’t currently put into the command line to output in their language of choice. I don’t have a solution to that, but it is a thorny problem that I would love to see us work on in the future. So my advice to people who are on this path is to be very careful about what is possible and not possible to localize, because people are going to want to localize all sorts of things. In places where they can’t, think hard about where you can send them instead, to output that can be put in their language of choice instead of hardcoding English or whatever. So, Farina, what are you thinking along these lines?

The thing that I keep coming back to is listening to our users. I think one of the biggest things that a project can do is listen to who your end users are and try to understand them. I remember a KubeCon + CloudNativeCon several years ago where we literally had appointments with different end users, and we sat down and interviewed them and listened to them. What are they being successful with, and where are their issues? You want to learn what they would like to do that they can’t do now. I think other projects should do the same thing. For every project that wants to be successful, listening to your users and understanding them is so important. Even as we roll into Helm version 4, spending a good amount of time sitting down and listening to those users who rely on Helm will help us have more insight into where we should go and what we should do next.

Well, for a talk about the past, present, and future, I suppose it fits in well to say I feel like I’ve come full circle on something here. Early on, we were obsessed with the user story of getting somebody started in
Kubernetes. Kubernetes was young, moving fast, and very conceptually difficult, and we wanted Helm to be the way you got from zero to your first Kubernetes installation in a couple of seconds. The Helm 2 period, up into the mid Helm 3 period now, has largely been focused on solving a whole bunch of other problems around that. But what we’re experiencing now is a huge influx of new users to Kubernetes, and Kubernetes is not any simpler; it’s actually far more complicated than it was several years ago. So once again, the thing that I find myself feeling most passionate about is: can we keep, or even improve, the story for Helm for that first experience coming to Kubernetes, getting that first installation done in a few seconds, and then using that installation as a springboard to understanding what’s actually going on in Kubernetes, how the pieces fit together, and how they work?

And I feel like we’ve got to leave it at that and say this is a community that is interested in you. Come talk to us; we have all sorts of community engagement opportunities, and we’ll see you on GitHub.
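
As an aside on the semantic versioning rule mentioned in the session (no breaking change without incrementing the major version number), the check can be sketched in Go, the language Helm is written in. This is only a minimal illustration and not Helm’s own code: the function names `majorOf` and `AllowsBreakingChange` are made up for the example, and it assumes plain MAJOR.MINOR.PATCH strings with an optional leading “v”.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major component from a version string such as
// "v3.4.2" or "3.4.2". Hypothetical helper for this sketch only.
func majorOf(version string) (int, error) {
	trimmed := strings.TrimPrefix(version, "v")
	parts := strings.SplitN(trimmed, ".", 3)
	if len(parts) != 3 {
		return 0, fmt.Errorf("%q is not in MAJOR.MINOR.PATCH form", version)
	}
	return strconv.Atoi(parts[0])
}

// AllowsBreakingChange reports whether moving from current to next
// increments the major version, which is the only kind of release in
// which semantic versioning permits a breaking change.
func AllowsBreakingChange(current, next string) (bool, error) {
	cur, err := majorOf(current)
	if err != nil {
		return false, err
	}
	nxt, err := majorOf(next)
	if err != nil {
		return false, err
	}
	return nxt > cur, nil
}

func main() {
	// Example version strings chosen for illustration.
	for _, next := range []string{"v3.5.0", "v4.0.0"} {
		ok, err := AllowsBreakingChange("v3.4.2", next)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("v3.4.2 -> %s: breaking changes allowed = %t\n", next, ok)
	}
}
```

Under this rule, the fix to a function call with a vestigial argument has to wait for a 4.0.0 release, while anything that ships in a 3.x release has to stay compatible.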