ConfigMaps in Kubernetes

Dec 6, 2020 09:00 · 5690 words · 27 minute read

My name is Trevor McKay. I’m a principal software engineer at Red Hat, and this is Config Maps 102. You’re still here. It’s 4:30 on Friday, and a lot of you could be on your way home on a nice airplane, but I know why you’re here, right? You were looking at that schedule and you said, “Wait a minute, this guy’s going to talk about config maps. I’ve got to hear this. I’m staying.” Is that what happened? Yeah, I know it. That’s why I’m here.

Okay, all kidding aside, you’ve heard a lot of great tech this week. There have been presentations on CRDs, on extending kube, all kinds of enterprise Kubernetes success stories, machine learning stuff. There’s been a lot of really great stuff. But today we’re going to take a new spin on a golden oldie.

We’ll take a quick look at what config maps are. I’m not going to assume that you necessarily know. This might be your first KubeCon, or you might have been around Kubernetes for a while but just never really run into a config map. No worries, we’ll go over exactly what those things are. We’ll look at some restrictions related to typical usage. Restrictions are fine; every designed system has some limitations, and we’ll see what those are. Then we’ll look at another way to interact with these things. It’s not necessarily the method, it’s just a method that we’ve been using, and I hope that it’s useful to you and gives you another tool in your Kubernetes toolbox. At the end I will show you a mini example of this stuff running in a contrived example, and then I’ll show you part of a real-world application that we’re working on that is using some of these techniques.

So first of all, what is a config map? It’s simply a key-value store that lives as a Kubernetes object. Let me get my pointer turned on here. So here’s our key, right, there’s “kubecon”, and “it’s awesome” is the value. Here’s “kubernetes”: “it’s awesome”, excuse me, “brilliant”, that’s its value. This is an export
of a config map object in Kubernetes, after I’ve already created one. So it’s just a key-value store. If you’re familiar with Python, think dictionary; for other languages, think associative arrays or string maps. It’s just names and values, that’s it, and it happens to live in etcd.

All right, so how do these things make my life better? The major benefit of a config map is that it lets you modify application behavior without re-spinning an image. So here we have a couple of pods running. Go back to pointer mode. Here’s our application one, and we have a config map here, configmap1, mounted in the pod as a volume. Likewise, here’s our second application, and we have configmap2 mounted as a volume in the application pod. If you design your applications so that the malleable parts are expressible as key-value pairs, you can export the changeability outside of your application image, and then you can use the same image over and over again. You don’t have to respin to change something like the number of back-end threads you have on a database; you just make that part of your config. This has been the power of config maps since the beginning. This is why they were invented.

All right, so that’s how they make my life better. How can I create one? There are a bunch of different ways, and you should read the documentation, but this is a pretty broad view here, which gives you a lot of flexibility. In the top example we’re just calling kubectl create. We’re making a config map called my config, and we pass in some literal values. You can stack multiple --from-literal flags; you give each one an equals sign and then a key and a value. So in this case we’re creating a key named verbose whose value is true, and another key named debug whose value is also true.

A second way to do this is to descend over a directory. In this case, if you use --from-file, you can stage a directory with a bunch of
config files in it, and what Kubernetes will do is create a config map where every key is a file name and, for each key, the value is the entire contents of the file. That can be quite powerful if you have YAML or JSON descriptors, all kinds of stuff. The only place you can get into trouble there is that etcd puts a limit of one megabyte on objects, so your config map can’t be over a megabyte. If you’re putting lots and lots of large files in there, just be careful. And of course, if you’re doing things a little more interactively, you can just create an empty config map this way. This will have no data section in it at all, and then you can call kubectl edit and add the data section by hand. With those few techniques you should be off to the races creating config maps.

All right, that was how to create one. How do I consume one? Again, there are a bunch of different ways, and you should check the documentation. You can expose these things as environment variables, you can do all kinds of stuff, but one way that I find really flexible is to consume them as a volume. The way that you do this is, in your pod spec, you simply declare a volume and give it a name. I’m waiting for that menu to go away. You tell it that the data source is a config map (I guess it’s not going to cooperate with me) and you give it a name. The follow-on to this, after you’ve specified the pod volume, is that in your container you add a corresponding definition, which you can see up here. You list volume mounts, you give it a name again, referencing the name from the pod, and then just give it a path. So in a nutshell, for any pod spec you have, if you want to mount a config map you add a chunk like this to your container or containers in the pod. What this will result in is that when your pod spins up, in this container, at the path /etc/config, everything that was in the config
map will be expressed as a file. The file name for each of them will be a key, and the contents will be the corresponding value. These things can be tiny. They can be a single value, so you might have something where the file name is dog and the content is fido and that’s all that’s in there, or it may be an entire YAML file for some application that you’re running.

All right, so I love config maps. I think these things are great. If you haven’t used them, you should give them a try. It’s a small piece of tech in the Kubernetes world, but it’s really powerful. But there are some limitations. There are always limitations; you can’t design for everything. So what are some of these?

Well, the name must be known. What this means is that the name of a config map that you’re consuming in a pod has to be known at pod creation time. You can’t get away with an empty string. Kubernetes won’t let you do that; it’s an error. So it has to be a definite name. If you’re using some kind of templating engine, that gives you a little more flexibility, so that you can generate your pod spec from a template and pass the name in that way, but the fact remains that you must have a definite name for this thing when the pod gets created.

Update latency. If you have one of these things mounted into a pod as a volume and you change the config map, there will be up to one pod sync interval of latency before the updates get written through, and by default that’s a minute. For some applications this may not matter. It might be fine that the config changes up to a minute after you’ve actually written to the config map. Some applications may care.

They’re not easily composable. What I mean by this is that there is no built-in way to have a config map where a field value in one config map actually refers to another config map. So you can
kind of imagine that you could take a large, complicated configuration and break it up into a hierarchy, where you have your top-level config, and then you break some piece of it out, some aspect of your app, and put it in a different config map, and so on and so on. You could nest these things theoretically as deep as you want to, but there’s no simple way to do this with config maps out of the box.

And lastly, order matters. There’s a caveat on this, which I’ll get to in a minute. If you reference a config map in a container spec, the config map must exist before the container will be created. I’ve done this to myself. It’s not always immediately apparent as a developer why your container is stalled and is not being created unless you start poring over the events. But if you reference a config map that’s not there, your container will just (I forget which state it ends up in, ContainerCreating or Pending or something) sit there forever.

Now, the caveat on this is that in 1.6 there was this awesome PR where the optional keyword was added. You can actually call out a config map as optional, and in that case, if it doesn’t exist, your container will start up anyway. Then, in the case of volume-mounted config maps, if you do add it later, that data will get written through. However, you still run up against the issue that you have to have a definite name ahead of time (an empty string isn’t good enough), and you have the latency issue. But this was a big help. So if you’re running on 1.6 or greater, that’s good. There are probably some enterprise production-type people who are still on 1.5 or earlier, because you can’t be changing every three months.

So what is an alternative method? Here’s my path in the woods. Like I said, I have another way. I’m not saying it’s the way; it’s just hopefully something that’s useful to you. And what is that? All right, so it’s a more dynamic
application model. In this case what we do is embed a tiny Kubernetes client inside our application, which talks to the API server for us and allows us to consume config maps dynamically. Now, this isn’t necessarily novel. If you’ve existed in the Kubernetes core, or you’re already making CRDs or API extensions, or you’re writing deployment pods, that kind of stuff, you’re already used to having access to the API inside your pods. But there are a lot of us developers, I think, who view our applications really strictly as running on Kubernetes. We don’t think of ourselves as part of the ecosystem. So for those of us on that side of the game, this is still a relatively novel approach. What I’m going to show you is that you can do this very easily and safely with just a tiny bit of code, and it doesn’t have to be a big, involved project. Yes? Yeah, and I’m going to show you how all this works.

Okay, so that’s the basic model. We get some benefits from this. What are those? Well, really quickly, referring back to some of the limitations from before:

Order doesn’t matter. If you’re controlling this stuff yourself, create things in whatever order you want.

Likewise, you have flexible naming. If you want to leave your config map name as an empty string, go right ahead, as long as your application can handle it. If you want to change the name midstream, that’s fine too, as long as you have some way to communicate that to your app, through an event or some trigger that you come up with, a URL or something. You could change it as many times as you want.

Update handling. You can handle latency any way you see fit. If you have a polling interval that is appropriate for your app, go ahead and poll. If you want to do a read on demand every time you hit a particular piece of processing, and you know that it’s infrequent and it’s not going to overdrive the API server or anything,
do it that way. If you receive some event through a Kafka queue or something that tells you, “Hey, your config just got updated, why don’t you re-read it,” you could do that. The point is, you have a choice. You’re not limited to updates with this variable duration of up to a pod sync interval anymore. You can update definitely, in bounded time.

All right, you also get composability, like I talked about before. Let’s pretend we have a web app here. We’ve broken out our theme and our scale as separate configs. Maybe I’ll grab the pointer here. Just assume “dot dot dot” means really huge, which is why you would want to break these things out in the first place. We have a western theme with some desert and cactus, and we have some scale. But let’s say that we run this thing for a while and we decide, okay, I really need a large scale. So I go and define a large scale, and then I want to take my running app and scale it up. I can just do this: I can go into my top-level config map and change this one entry to large. As long as my application understands that this is the convention I’ve set up, this is my design, it can go and read the large config on the fly in an update. You could nest this to however many levels you like.

All right, another thing that’s great about this is that you can run this stuff in a pod or outside of a pod, and the only thing that’s different is the authentication mechanism. You’ll see that in a few minutes. If you’re running a client, there’s a little bit of auth code, and if you pull that out and separate it from your main routine, you can reuse the same code. So here we are in a pod, and here we are out of a pod, running on a laptop. You can use the same client snippet.

Okay, so some people might say, well, maybe this is too much. The Kubernetes API is big. There might be a security risk here because I’m exposing more of
the API surface area in my pod. I might be pulling down lots and lots of dependencies, and this has me worried. I’m not sure that I really want to go here. But what if you can do this? What if you take the API, that huge stack of books, and turn it into just the tiny piece that you need? In this example, all we need is a single call to get a config map in a particular namespace. That’s a tiny part of the Kubernetes API. Your client doesn’t need to include anything other than that, so your dependencies are limited and your surface area is limited. I think if you do that, then the risks are negligible and it’s worth doing.

Okay, so let’s actually look at this thing in Go. This is the actual routine I’m going to show you in a demo later; the only difference is that I put a loop around it for the demo. So what’s in here? Well, there’s our client set. This is the result of the auth call, which we’ll see in a minute. This is the part that you break out so that you can run in a pod or outside of a pod. We pass in a name, a directory, and a namespace as arguments, so those are pretty simple. This right here is the core of this whole thing: it’s a get on a config map in a particular namespace, and what you get from this is a string map. It’s just a map of strings; I don’t know how else to explain it. If you loop over this thing in Go, you get key-value pairs back, and from this point on this is just normal processing. It doesn’t have anything to do with Kubernetes; it’s what you would do anywhere. I get keys and values, I create a new path based on the key, I open a file, and I write my string. Boom, that’s it. So with this little fragment of code you can read config maps all day long and write them to a directory.

Okay, so the authentication stuff. In a pod it looks like this: this reads your config from the pod environment, this gives you a client set, and you return
it. Outside of a pod it’s very similar (this is using client-go, by the way): you pass it a path to your config, build your config based on that, return your client set, and you’re off to the races.

Okay, so I’m going to show you this in action. Let’s see here, let’s start fresh. All right, I don’t know if you can see that okay. Let me blow that up a little bit. Does that look pretty good? Let’s look at what’s in our YAML file first. It’s very simple. We have a pod here, and we’re just going to run this pre-staged image that I created. You’ll notice, if you’re really observant, that this is set up for local access, so if you’re doing this stuff on Minikube or on OpenShift Origin, just know that this example is set up to pull from the local Docker daemon. Then we pass the name of the config map we want to consume as an environment variable. That’s it.

So if we run this, okay, there is our pod running, and now we want to look at its log. Oops. Yeah, loop, there we go. It’s sitting there saying it hasn’t found config map bob. Well, that’s not surprising, because we haven’t created it yet. So let’s get another terminal over here and blow that up a little. Okay, so now I’ll create bob. All right, so that’s empty at the moment, and what we should see here is, is it off the bottom of the screen? Let’s see. I might have forgotten to do one thing. Oh no, it’s reading it. Okay, so here you can see the point: it’s reading bob. It’s not doing anything very interesting, because bob doesn’t have any values in it yet. If we go back over here, we can edit it, and this is how you add a data section. We will give bob a car. He’s got a Ford, and he likes antiques, so we’ll give him a Model T. Now we should see this update here. There we go. We’re polling every 10 seconds, we saw that it changed, and now we’re rewriting our config directory. You see that we’ve got a file for make and a file for model. We can do this all day: we can change the car he owns, we can delete bob, and it
will keep updating, but I think you get the idea.

Let’s see, if we go back to our presentation, there’s something else I want to show you. We’ve got about 12 minutes left. I want to show you a real-world example. That last one was intentionally simple so you could see how it worked; hopefully that made sense to you. I work on a project called radanalytics.io, and as the text here says, it’s about empowering insightful, data-driven application development on OpenShift. The piece I want to show you today is a tool we have for creating Spark clusters very easily, and it uses some of these techniques to let you configure the Spark cluster. This is one little part of what we’re doing, but I have links to our site at the end if you are interested.

So what does a containerized Spark application look like? Using a standalone Spark scheduler, you end up with a driver, which is where the Spark application is running, you have a master, which is part of the Spark cluster, and you have a couple of workers. How do you create one using radanalytics.io tools? Well, we have a tool called oshinko. It’s very simple, a little CLI, and you say create my cluster. The default number of workers is one, but here we’re going to tell it to create two workers. So that’s pretty simple, but you can also do it this way. Let’s say that you wrote some cluster configurations that you want to save for later. We let you do that, and we let you reference them with a flag called stored config. In this case we’re going to use debug, so you just pass this flag to create and oshinko will go and read the config map and figure out what your cluster should look like.

This is an idealized view of how it uses them. Here’s our top-level config. We include things like how many workers you want. We let you turn on optional services like this (oops), which turns on Spark metrics; we won’t actually see that today. And then here, this is where we
use the composability. Spark takes its configuration from a bunch of files, just like Hadoop, if you’ve ever done anything with HDFS; it’s configured the same way, with a collection of files in the config directory. So we put in an entry for the Spark master config and an entry for the Spark worker config, but oshinko expects these to be the names of other config maps. So we have another one called debug master. It’s got a log4j.properties file in there that controls logging. You can put metrics.properties in there, you can put spark-defaults.conf in there. Those should actually be lowercase; there’s a typo in the slide. Anyway, I wouldn’t want that to throw you off. And we’ve got the same thing for the worker config down here. So we break these things out.

So what does this look like in an actual run? Here’s oshinko running on our laptop, outside of Kubernetes. We’re going to create a cluster from our command line, and we’re going to hook a Spark application up to it later. It uses an embedded client like the one we just saw to go and read the debug config. It reads that and finds out, “Oh, okay, this thing has a Spark master config in it that points to another config map named debug m, so I am going to go create the master pod.” In this case, since I’m actually generating the pod spec, I go ahead and mount it the good old-fashioned way: I generate the pod spec and just add the config map as a volume. So it’s taking a hybrid approach here. It’s using a dynamic approach where it needs to, and it’s using the static approach where there’s really no downside in that particular case. And it does the same thing for the workers.

All right, so I will show you quickly what this actually looks like in practice. Let’s go back over here. We’ll stop monitoring this, and I should have oshinko on my path, so we will create a cluster called first cluster. Now if we go back and look at the OpenShift
console. This stuff runs on OpenShift from Red Hat, which, if you’ve seen other presentations, is really Kubernetes packaged for the enterprise. It’s kube with extensions, but the heart of it is Kubernetes, so there’s really nothing different here. So here we spun up a master and we spun up a worker. This is our tiny Spark cluster. I want to show you the effect of the config, so something really easily observable (I don’t know if you can see that) is the logging level. You can see here that you’ve got regular old info-level logging in the master pod.

All right, so let’s go back to our command line. What if we want to create another one? Actually, let me get that up on screen a little more so you can see it. Oh, that’s not going to help, is it? Maybe I can shrink the window. All right, that’s a little higher up. So if I do oshinko create my second cluster and I give it a stored config called cluster config, it went and created a second cluster. Now let’s just do an oc export here. oc is the OpenShift client, and it has a cool extra command called export, which gives you a nice view of objects without all the extra values in them. You can see that our config map in this case called for three workers, and it had a Spark master config called out and a worker config. So if we go back here we should see a different kind of cluster, and we do. Here’s our master, and we’ve got three workers this time because we asked for them. If we go in and start looking at the logging, we can see that we’ve got debug log entries in here now instead of info. So the config took; we got something different.

If you want to see what the result is of putting config files in a config map, what did I call that? Okay, hold on a second, I must have spelled it wrong. Master config, there we go. So this is what a config map with files embedded in it looks like. You can see here’s the data section, here’s the key name,
which ends up being the file name, and then you’ve got the YAML designation for multi-line stuff. You have the license file and everything in here. So that’s what we passed in, but it’s nice that you can hide these things away using that little composability trick. Okay, so I need to go back to, where’s my presentation? There it is. All right, so that was the end of my demos.

Takeaways. So why are config maps good? In general, you can change behavior without a respin. They’re powerful, with a few limitations which may not affect you, but if they do, reading them dynamically is very easy and gives you a ton of flexibility. And it’s safe if you use a small, scoped client that doesn’t have superpowers. You’re not really endangering anything. I mean, the only thing our extra client can do in that pod is read a config map, and it’s scoped to the project, so you can’t even read anybody else’s stuff. Not very dangerous.

Here are links to all of our stuff. I’ll leave this up here if you want to take a picture of it. radanalytics.io is our community landing site if you want to see more about the work that we’re doing. We have images up on Docker Hub, our GitHub is listed here, and for the little looping example that I showed you first, I put the pod spec and the code up on my GitHub account in case you want to take a look at it and play with it. It’s based on the client-go examples, so it’s pretty easy to consume. Go away, go away. And my email address is there on the bottom, but it’s occluded. There we go. All right, so the easiest way to get in touch with me, if you have questions or comments on this stuff or you want to know more about what we’re doing, is just to send me an email. And with that, we’ve got three minutes for questions. I mean, the truth is the conference is over, so we have as long for questions as you want.

So are you asking, can multiple applications use the same config map at the same time? Yeah,
yeah, that’s fine. A config map can be mounted in multiple places. The downside is that if you change the config map for one, you change it for everybody. So in that case you have to decide: do I actually want to be able to change the configuration for applications individually, or is it fine to change them as a group? And then you just design accordingly.

Is it atomic? So the question is, am I aware of any problems with updating config maps, and I’m guessing you mean race-condition kind of stuff. I would guess not. I don’t know for sure. You’re right, by reading this yourself you’re kind of stepping out of the Kubernetes pod sync behavior, but at the same time Kubernetes has to know in general that people can do gets on config maps and people can edit config maps, so I’m assuming there are guardrails around atomic operations at a lower level. I think it’s probably fine, but I can’t say for sure.

The question is, if you’re doing this in OpenShift, do you have to set up a service account? And the answer is yes. In fact, there’s a service account in that project, which I didn’t show you, that I created before you came in. Same thing in Kubernetes. When did RBAC get included, like 1.6, I think? If you’re running Kubernetes with RBAC turned on, it’s the same stuff as in OpenShift. So yes, you need it. Out of the box, the role that you need is edit. Edit gives you read and write, but that’s the one that you get out of the box. However, it’s scoped to the namespace, so you don’t get edit for everybody. And if you want to lock it down further than that, you can create a custom role which only has access to get, not even access to list or other read operations, and definitely not access to write. You can define that on your own so that it’s locked down to read. But you’re correct about running this without the service account. So
what I did is I took the default service account for the namespace and just added edit to it, so it’s system:serviceaccount:myproject:default. If you run without doing that, and you run a version of this with the error handling still in the code, it will tell you that such-and-such service account doesn’t have privileges to read config maps in this project. So it’s easy to see whether you actually have that turned on, like, “Oh yeah, I have to go add this.” So thank you very much.
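
The read-and-write routine described in the talk (get a config map, loop over its string map, write one file per key) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the speaker's code: the get against the API server is left out so that only the standard library is needed, and the names `writeConfigDir`, `readConfigDir`, and `roundTrip` are hypothetical. In client-go the map would typically come from the `Data` field of the object returned by a get on the config map.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeConfigDir writes every key in data as a file under dir, with the value
// as the file contents. The data map is assumed to have been fetched already
// (for example from a config map's Data field); no Kubernetes call happens here.
func writeConfigDir(data map[string]string, dir string) error {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return fmt.Errorf("creating %s: %w", dir, err)
	}
	for key, value := range data {
		// filepath.Base is an added precaution so a key cannot escape dir.
		path := filepath.Join(dir, filepath.Base(key))
		if err := os.WriteFile(path, []byte(value), 0o644); err != nil {
			return fmt.Errorf("writing %s: %w", path, err)
		}
	}
	return nil
}

// readConfigDir reads the files in dir back into a map, the way an
// application would consume the written directory.
func readConfigDir(dir string) (map[string]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	out := make(map[string]string, len(entries))
	for _, entry := range entries {
		if entry.IsDir() {
			continue
		}
		contents, err := os.ReadFile(filepath.Join(dir, entry.Name()))
		if err != nil {
			return nil, err
		}
		out[entry.Name()] = string(contents)
	}
	return out, nil
}

// roundTrip is a demo helper: write data to a temporary directory,
// read it back, and clean up.
func roundTrip(data map[string]string) (map[string]string, error) {
	dir, err := os.MkdirTemp("", "configmap-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	if err := writeConfigDir(data, dir); err != nil {
		return nil, err
	}
	return readConfigDir(dir)
}

func main() {
	// Values borrowed from the demo: bob's car.
	got, err := roundTrip(map[string]string{"make": "ford", "model": "model t"})
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(got["make"], got["model"])
}
```

Keeping the file writing separate from the API call mirrors the talk's point that everything after the get is ordinary processing, and it lets the same function sit behind a polling loop, an on-demand read, or an event trigger.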
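
The composability convention from the talk, where a value in a top-level config map names another config map to read, might be sketched like this. Everything here is an assumption for illustration: the `getter` type stands in for whatever get call the embedded client makes, `mapGetter` fakes the API server with an in-memory map, and the config map names and the `workers` value are made up to echo the web app example.

```go
package main

import "fmt"

// getter fetches the data of a named config map. In a real client it would
// wrap a get against the API server; here it is a hypothetical abstraction.
type getter func(name string) (map[string]string, error)

// mapGetter returns a getter backed by an in-memory store, standing in for
// the API server in this sketch.
func mapGetter(store map[string]map[string]string) getter {
	return func(name string) (map[string]string, error) {
		data, ok := store[name]
		if !ok {
			return nil, fmt.Errorf("config map %q not found", name)
		}
		return data, nil
	}
}

// resolve reads the top-level config map and, for each key in refKeys,
// treats that key's value as the name of another config map and fetches it.
// The result maps each reference key to the data of the config map it names.
func resolve(get getter, top string, refKeys []string) (map[string]string, map[string]map[string]string, error) {
	root, err := get(top)
	if err != nil {
		return nil, nil, err
	}
	children := make(map[string]map[string]string)
	for _, key := range refKeys {
		name, ok := root[key]
		if !ok || name == "" {
			continue // reference not set; the application decides what that means
		}
		child, err := get(name)
		if err != nil {
			return nil, nil, fmt.Errorf("resolving %s=%s: %w", key, name, err)
		}
		children[key] = child
	}
	return root, children, nil
}

func main() {
	// Illustrative data only: a web app whose theme and scale live in
	// separate config maps.
	store := map[string]map[string]string{
		"webapp":  {"theme": "western", "scale": "large"},
		"western": {"background": "desert", "plants": "cactus"},
		"large":   {"workers": "10"},
	}
	_, children, err := resolve(mapGetter(store), "webapp", []string{"theme", "scale"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(children["theme"], children["scale"])
}
```

Changing the `scale` entry in the top-level map and calling `resolve` again picks up a different child config, which is the on-the-fly update the talk describes; deeper nesting would repeat the same step on each child.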