Performance Profiling | CPU Usage Tool
Oct 9, 2020 15:01 · 3831 words · 18 minute read
Are you wondering why your application is taking so long to run? Well, did you know in Visual Studio there’s a tool that can help you investigate the issue? Learn more on this episode of Visual Studio Toolbox. [MUSIC] >> Hey everyone. Welcome to Visual Studio Toolbox. I’m your host, Leslie Richardson, and today I’m joined by Esteban Herrera, who is a PM on the Visual Studio profiling team. >> Hey Leslie. >> Hey Esteban. >> How is it going? >> It’s going well. Glad to have you back. >> Thanks, glad to be here. >> Sweet. This is part of our larger performance profiling series, and today we’re actually going to kick-start it and go in depth on every profiling tool that you should be using in Visual Studio. What are we talking about today? >> Today we’re going to do a deep dive into the CPU usage tool, and also remind you that you can reach these tools by hitting the Alt+F2 shortcut on your keyboard, or by going to the Debug top menu and selecting “Performance Profiler.” >> Sounds good. What is the CPU usage tool? It sounds important. >> Yeah, so let me just jump right in.
01:14 - The CPU usage tool is one of the many diagnostics tools that we offer in the suite of tools. It’s one of the most powerful, and the easiest way to talk about it is to lay a little bit of groundwork first by answering some questions. The first is, what does it actually measure? It measures two things specifically. One is the percentage of CPU utilization over time. That’s a percentage of your total CPU usage, the same thing that comes up when you look at your Task Manager, and after you’re done recording a trace, it’ll show up in the graph like you see on your screen right now.
01:58 - The interesting thing about this view is you can see peaks and valleys of CPU utilization and hopefully narrow in on an interesting time during your trace where your CPU was either spiking when you weren’t expecting it, or maybe it was in a valley or plateauing when you were expecting it to be working harder. It’s just a really easy way to quickly narrow in on what part of your program you want to be looking at. The next thing it measures is how many times we saw a call to a function on the call stack over time. For this explanation, we’re going to imagine a really simple program, like you see on your screen now, where a main function calls functionA, functionA calls functions B and D, and functionB ends up calling functionC. It’s a pretty straightforward execution of code.
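(For readers following along without the video, here’s a minimal sketch of the kind of toy program being described. The on-screen code isn’t reproduced in the transcript, so the exact method bodies here are illustrative.)

```csharp
// Illustrative reconstruction of the on-screen example:
// main calls functionA; functionA calls functionB and functionD;
// functionB calls functionC.
class Program
{
    static void Main() => FunctionA();

    static void FunctionA()
    {
        FunctionB();
        FunctionD();
    }

    static void FunctionB() => FunctionC();

    static void FunctionC() { /* leaf work */ }

    static void FunctionD() { /* leaf work */ }
}
```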
02:52 - Now if you can imagine the call stack during the execution of this code, if you were to sample it at different moments in time, that first sample would have just your main function, that’s that first tick on that graph, and then when it calls functionA, you would get a frame on the call stack on top of main that says functionA, and so on and so forth as the program executes. What we do is we record two things. One is how many times total we saw a certain function over that sample. In this case, main was at the bottom of every single call stack in this sample. This is way oversimplified of course, but you see it every time, so that’s a total of eight times. Then we also count how many times you see a function by itself, meaning it’s at the top of the call stack. So functionA, for example, we see seven times total, but only at the second, sixth, and eighth intervals do you see functionA at the top of the call stack by itself. We keep track of that number as well.
04:04 - Now, because we know how often we’re gathering the samples, by default, it’s 1,000 times per second, by seeing how many times we saw a certain frame on the call stack and multiplying it by the sample rate, we can estimate that functionA took seven total milliseconds of CPU time in this case, and for three of those milliseconds functionA was the only thing executing. With the combination of those two things, you should be able to narrow in on which parts of your program and which specific functions are responsible for a lot of the time or work.
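(To make that arithmetic concrete, here’s a small, hypothetical sketch of the estimate being described. The 1,000-samples-per-second rate and the seven/three sample counts come from the example above; the helper itself is illustrative, not the profiler’s actual code.)

```csharp
using System;

// Hypothetical illustration of how sample counts become time estimates.
class SampleMath
{
    const double SamplesPerSecond = 1000.0; // the tool's default sampling rate

    // Every sample in which a function appears on the stack contributes
    // one sampling interval (1 ms at 1,000 Hz) to its time estimate.
    static double EstimateMs(int sampleCount) =>
        sampleCount * (1000.0 / SamplesPerSecond);

    static void Main()
    {
        // functionA from the example: on the stack in 7 samples,
        // on top of the stack (running by itself) in 3 of them.
        Console.WriteLine($"Total CPU time: {EstimateMs(7)} ms"); // 7 ms
        Console.WriteLine($"Self CPU time:  {EstimateMs(3)} ms"); // 3 ms
    }
}
```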
04:57 - One quick thing is we can actually customize that sampling rate. Like I mentioned, by default, it’s 1,000 times per second, but there’s a little gear icon, which I’ll point out when I switch over to Visual Studio, that allows you to change the sampling rate to any custom rate. With that, unless you have any questions, we can just jump into some code. >> Yeah, it’s cool. This is one of those profiling tools that’s technically available in the Diagnostics Hub window, which we talked about in an earlier part, right? >> Yeah, that’s true. The CPU usage graph, you’ll actually see it’s on by default when you’re debugging. I know some people tend to hide that. But again, it’s a really useful way to narrow in on interesting times during your program’s execution. >> I agree. I try to close it a lot less often these days. >> Yeah. I mean, it really depends on your scenario.
05:59 - I mean, if you’re doing a performance investigation, I think that’s a great place to start. As we mentioned in the previous episodes, this tool can be run in parallel with a bunch of other tools in the diagnostics suite, and so sometimes you want to see how CPU and your database are behaving during a certain time, and a certain CPU spike might point to an interesting time in your database or vice-versa. Again, it’s all about how quickly we can get you to the spot in code where you might be able to make some changes to make the program more performant. >> Sounds good. Well, I’d love to see it in action. >> Yeah. For this demo, I just brought up a really simple application, where I start 10 timers, each in its own thread, that are just timing from one to 10 seconds. >> Lot of timers. >> Yeah, that’s a lot of timers, and this is purposely bad code just to demonstrate high CPU usage. >> [inaudible] I’m a bad timer.
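(The demo project itself isn’t shown in the transcript. As a rough sketch, a deliberately wasteful timer like the one described might look something like this; the class and method names are made up for illustration.)

```csharp
using System;
using System.Threading;

// Deliberately bad "timer" demo: each thread busy-polls DateTime.Now
// instead of sleeping, which burns CPU the whole time it "waits".
class CpuDemo
{
    static void BadTimer(int seconds)
    {
        DateTime start = DateTime.Now;
        // Spin as fast as possible until the target duration elapses.
        while ((DateTime.Now - start).TotalSeconds < seconds)
        {
            // No sleep, no yield: 100% CPU on this thread.
        }
    }

    static void Main()
    {
        var threads = new Thread[10];
        for (int i = 0; i < 10; i++)
        {
            int seconds = i + 1; // timers run from 1 to 10 seconds
            threads[i] = new Thread(() => BadTimer(seconds));
            threads[i].Start();
        }
        foreach (var t in threads) t.Join(); // wait for all timers to finish
    }
}
```

Busy-polling DateTime.Now like this is exactly the kind of pattern that surfaces later in the episode, when a call to get the current time dominates the samples.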
07:20 - >> Exactly, and so we’ll just jump into how to launch the tool. I’m going to press Alt+F2, and you’ll see this window where you can select what tools you’re going to run during this trace. I’m going to make sure that the CPU usage tool is selected, and here’s that gear icon that I was talking about earlier, where you can change the sampling rate of the tool. >> By changing it to a higher sampling rate, does that come at the cost of how long it’s going to take to get all that info, or what’s the trade-off? >> The trade-off is the size of the trace, and it’s just a lot more data. I mean, if you change it from 1,000 samples per second to 2,000, you’re doubling the amount of data that we’re gathering.
08:13 - The overhead shouldn’t be too high during the execution of your program. But when the trace ends and we’re trying to build all these call stacks, it might take a little longer. But you might need that granularity. We found that 1,000 samples a second is almost always enough resolution to figure out what’s going on. But as you might see in this trace, sometimes it isn’t enough. I’ll point out where you might want more granularity in the tool if it comes up. >> Got you. That makes sense.
08:52 - >> Oh, one other thing to point out is you’ll see this warning here. When you try to start any profiling tool and you’re in debug mode, it’ll warn you about that and recommend that you switch over to release mode. The reason for that is, when you’re running in debug mode, we incur some additional overhead in order to give the debugger and some of the profiling tools more information, and to be able to step forward and back and do some of those clever things. But that’s not how it runs in production. To get an accurate performance trace, you really want to make sure that you’re running in release mode. So when I switch to release mode, you’ll notice that warning changes. We’ll make sure that my startup project is set to this demo, and then I’ll just hit “Start.” When I hit “Start,” my program is just going to create 10 timers and wait for all of them to finish. The whole time, the CPU usage tool will be sampling the call stacks, and when it’s over, you’ll see it build a report on that. So we’re running the program.
10:07 - >> That is nice that it gives you the warning about changing to release mode, though. I would definitely be that person who forgets, and then I’d be sitting through the whole profiling thing with like 4,000 samples. >> We land on this, what we call the summary page. You’ll notice a few things right away. Quite a lot of information. That’s the CPU graph, and you see it uses a ton of CPU here at the beginning. Then as timers start ending, those resources are winding down. You’ll notice as I highlight different time selections on this graph, the views on the bottom are updating.
10:53 - I am really filtering the data to just the things that happened between these two timestamps. If you see a really big spike, perhaps like right here, that’s interesting, and I can narrow in on it and see information scoped to just that time period. >> That’s pretty neat. I’m also digging the pie chart on the side there; that always makes things look more official. But it looks like it’s breaking down where the usage is being spent. >> Yeah, I love a good pie chart. This pie chart, what it does is: when we know the module that a function is in, we know what that module is related to.
11:46 - For example, if it’s in something having to do with I/O, or something that interacts with the kernel or the file system, we tag each of those calls. Then we can give you the top five categories of how your time is spent in your program. For some more complex things, you might see that networking is taking up a third of your time, or I/O is taking up three-fourths, as in this example. That’s another way that you might be able to focus your work to get a more and more performant application. One thing, just going back to this timeline real quick, is when you run another tool in conjunction with the CPU usage tool, for example, that database tool.
12:33 - If you highlight a time selection on this timeline, it will also filter the other tools. That’s a very useful way to quickly get to interesting spots in your code. >> If you had a database tool tab open as well, then it would just automatically pivot over to that interval? >> Exactly. In that case, it would tell you what database queries were happening during that time. >> Sweet. Yeah. That saves a lot of time. It’s not fun going back and forth and changing the selection later. >> Yeah.
13:13 - In general with profiling investigations, we get so much data that it’s all about narrowing in on the important data as quickly as possible. That’s what this summary page is attempting to do by showing you the graph, the pie chart, and then these two tables that I’m about to go through. The first one is the top functions. If you recall the PowerPoint slide, it’s that list of functions where it tells you how many times a function showed up in total over a certain amount of time, and how many times that function was by itself. These are the same two numbers right here. We’ll show you the top five for both of those.
14:08 - For example, in this time sample, this external call to System.DateTime.get_Now happened 99.5 percent of the time, and it was by itself almost every single time. That’s a pretty good indication that whatever is calling this system function is probably the place to go tweak the performance of your code. >> Great. >> Then below that, you’ll see the hot path. The hot path tries to get you to your most common function call and the most common path by which you arrive there. Code isn’t always as simple as this example.
14:46 - There are several ways that you can arrive at a certain method call. But what we try to do is identify the one that is the most common. That’s defined by: if the function that we’re looking at is responsible for at least roughly a third of its caller function’s time, then we tag it as part of the hot path, and we just drill down from there. Again, this might be a way to see the part of your code that’s most responsible for the work or time. >> Cool.
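(As a sketch of the heuristic being described, under the assumption that the tool walks the call tree looking for a child responsible for roughly a third or more of its parent’s samples. The types and threshold handling here are illustrative, not the tool’s actual implementation.)

```csharp
using System;
using System.Collections.Generic;

// Hypothetical call-tree node with inclusive sample counts.
class CallNode
{
    public string Name = "";
    public int InclusiveSamples;
    public List<CallNode> Children = new List<CallNode>();
}

static class HotPath
{
    // Follow the tree downward while a single child accounts for
    // roughly a third or more of its parent's samples.
    public static IEnumerable<string> Expand(CallNode node)
    {
        while (node != null)
        {
            yield return node.Name;
            CallNode next = null;
            foreach (var child in node.Children)
            {
                if (child.InclusiveSamples >= node.InclusiveSamples / 3.0)
                {
                    next = child;
                    break; // take the first dominating child
                }
            }
            node = next; // stops when no child dominates
        }
    }

    static void Main()
    {
        var leaf = new CallNode { Name = "functionB", InclusiveSamples = 4 };
        var root = new CallNode { Name = "main", InclusiveSamples = 8 };
        root.Children.Add(leaf);
        foreach (var name in Expand(root))
            Console.WriteLine(name); // prints: main, functionB
    }
}
```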
15:30 - Clicking on one will redirect you to the- >> Depending on which one you click. If you click on something in this functions table, it’ll take you to something we call the functions view, which I’ll show you right now. The functions view is again that same table that I talked about in the PowerPoint. It’s just a flat list. It doesn’t tell you exactly how you arrived somewhere. But it tells you that in this CPUDemo.exe, these are all the functions that were called.
16:11 - This is how many times they showed up in total, this is how many times they showed up by themselves, and it will give you a percentage. You can sort by these columns to find a function that you find interesting. We also tell you what module the function is in. If you have a complex project, that’s another way that you can focus your investigation. >> Cool. Got you. >> Going back to that summary page, if you click on a hot path, or, another way to get there, change the view to the call tree, you’ll get to a call tree view, which, when I click on a function, will tell me exactly how I arrived at that function. You’ll notice that this time, it’s not a flat table. You’ll actually be able to expand it. You can see everything that happened during the CPUDemo.exe run.
17:13 - Looks like it called some native code, which then called some external code. Then it looks like this is the first call to my actual code. We can see how many times that call frame occurred in that sample. You can follow that path down to a call that you might find interesting. Then wherever we can, if we have the symbols, we will show you a preview of your code down on this pane. It’ll show you exactly what line of code was making that call.
17:58 - In this example, it looks like this call to get a new timer was happening a bunch of times in that sample. I think this is still narrowed to just that time selection. If I clear the selection, you’ll see I get a much higher number of samples. These percentages show you what percentage of time was spent on that line or on the calls that that line makes. >> That’s a really nifty tool, if I do say so myself. I think that can save people a lot of time trying to narrow things down instead of playing trial and error to see what you get each time. >> Hopefully.
19:03 - One other great tool in this call tree view is that you can expand the hot path, and that’s that same concept that I talked about on the summary page, except there can be several hot paths within an application. You can highlight a node and then expand the hot path under that node, and it’ll step into the calls that node makes until it reaches a point where no one thing is responsible for a third or more of the time spent there. In this case, it steps you all the way down into some external code, and if I were to investigate here, I would probably just start by looking at its caller function. Double-click on it and it’ll take me exactly to where in my code that’s happening. >> Great, I like that shortcut; it finds it for you. >> Yeah.
20:05 - Another way to navigate that same nested tree is in this caller/callee view. If you select any method and say “View in Caller/Callee,” you’ll get almost the same data, but you’ll get this view that shows you the function that you’re currently looking at, the functions that it calls here in this box on the right, and you also see its parent function, or the function that called it. You can step backwards and forwards through code until you get somewhere, like the bottom of the call stack, where you can hopefully find some code to change. >> Awesome, so many different views. >> Yeah, there are quite a few, and the nice thing is that you can switch between them pretty seamlessly. The last view I want to highlight is the modules view, which will show you every module that your program is utilizing. Again, it’ll try to attribute time spent during that trace to those DLLs and tell you specifically what functions within them are responsible.
21:37 - External code, I probably don’t have much power to change in this scenario, same with this System.Private.CoreLib. But if I step into my DLL, I can see the functions that my program is utilizing, and again, here something might pop out, like, for example, a bad timer. I can then see a preview when I double-click on it, and if I click on this link above the preview, I can actually go to my code to change it. >> Yeah, I like that there’s so much flexibility with these. If they want one view over another, they have that option, and if they’re a more visual learner, then they have that option too.
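(Once the profiler points at something like the busy-wait timer from the earlier sketch, the fix is often the easy part. Here’s a hypothetical before/after; Thread.Sleep is just one option, and a real app might prefer a timer or an async delay.)

```csharp
using System;
using System.Threading;

// Hypothetical fix for the busy-wait timer sketched earlier: sleeping
// yields the CPU for the whole wait, so the spike in the trace disappears.
class FixedTimer
{
    static void GoodTimer(int seconds) =>
        Thread.Sleep(TimeSpan.FromSeconds(seconds));

    static void Main() => GoodTimer(3); // waits 3 seconds with ~0% CPU
}
```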
22:23 - >> Yeah, and another awesome thing, which I think we talked about during the higher-level overview episode, is that you can save these diagnostic sessions, and that actually saves the trace that I collected. I can share it with somebody on my team, or I can look at it later to compare runs, and so that’s a really good way to measure performance improvements or to try to diagnose things with a team. >> Absolutely. >> There is one last thing I wanted to call out, and that is, on this summary page, these filters in the top-right corner allow you to filter your sample to any specific thread. If you know a thread that you’re particularly trying to diagnose, you can filter to what’s happening on it, and it’ll tell you the percentage of utilization of those threads. There might be one that’s at, like, 95 percent, and you might want to narrow to just that thread.
23:35 - It also allows you to hide native code, to hide or show external code, and to stitch asynchronous code so it gets squashed into one single line instead of one for each asynchronous method. Again, those categories that this pie chart represents, you can also filter by those. If I see that I have a bunch of work happening in I/O, then I can just look at the I/O top functions and hot path and filter that way. >> That’s awesome. >> Yeah, go ahead. >> It’s cool that you can save this file for later and still get that same functionality, so you don’t have to re-run the whole trace again. >> Yeah. The only tricky thing, and I wouldn’t even call it tricky, the only thing there is that in order to show those code previews and take you to your source code, you’re going to need access to the same symbols.
24:39 - But within a team, it can be a really powerful way to get help diagnosing an issue, or even to compare traces and see if something runs faster than before. >> Sweet. Yeah, I learned a lot more about the CPU usage tool than I thought I would. I feel like the CPU usage tool has hidden depths that most people, even ones who do use the CPU usage tool, aren’t aware of. It’s really cool to see the whole expanded tool. >> Yeah, and I think it’s true that the CPU usage tool can run with most other tools out of the ones that we offer.
25:31 - I think it may be able to run with every other tool, but I would just recommend leaving it on, even if you’re doing a database investigation or some memory investigation. It’s just a really good starting point, because if an application is performing slower than you expect, the CPU is a pretty good place to look, whether it’s peaking and doesn’t have the resources to do more, or it’s not being utilized as much as you would expect. It can be a really good way to start your investigation. >> Yeah, absolutely. Anything else before we close out? >> I think that’s it. In the description, we’ll have a link to the docs for this tool, and I’ll make sure that my contact information is there as well. If you have any questions, I would encourage everyone to try it; you might see an issue with your program that you didn’t know was there. There might be just some low-hanging fruit to make your application more performant.
26:43 - So I would just encourage you to try it and reach out if you have any questions. >> Sounds great. Well, thanks for being here, Esteban, and I’m sure we’ll see you in a future profiling video in the near future. >> Happy to be here. I look forward to it. >> Great, so until then, happy profiling. >> Happy profiling. [MUSIC]