(For the previous article, see Optimization, Part 1: Figuring out your problem)
Although the word optimization probably makes you think “make the program run faster,” it isn’t always about optimizing algorithms. Often, you can make noticeable improvements by changing when your code is executed rather than trying to reduce how long it takes. In this post, I’ll discuss three options: loading screens, amortization, and performing calculations at build time.
Loading Screens
Although a loading screen is traditionally reserved for things your program can’t run without (for example, a game can’t start until the upcoming level has loaded), you can also use it to pre-load content. This is primarily useful for video games: business software taking a tenth of a second for some operation is fine, but in a game, stopping for a tenth of a second is a noticeable hitch in your frame rate.
Let’s take a simple example: you’re building a platformer in the vein of Mario. The first time you spawn a given enemy type in a level, the game stutters. On further investigation, you discover that the game is loading all of that enemy’s animations into memory for the first time, and the combination of parsing the animation data and transferring it into RAM causes a brief stutter. One solution is to pre-load that animation data into your animation system before the level begins.
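As a minimal sketch of that idea, imagine the engine keeps animations in a cache that loads lazily on first use. The names here (`AnimationCache`, `preload`) are illustrative, not from any real engine; the point is that calling `preload` during the loading screen turns the first in-game spawn into a cheap cache hit.

```cpp
#include <map>
#include <string>
#include <vector>

// Illustrative stand-in for a set of parsed animation clips.
struct AnimationSet { std::vector<std::string> clips; };

class AnimationCache {
public:
    // Expensive on a cache miss: stands in for parsing files and
    // copying animation data into RAM.
    const AnimationSet& get(const std::string& enemyType) {
        auto it = cache_.find(enemyType);
        if (it == cache_.end()) {
            it = cache_.emplace(enemyType, loadFromDisk(enemyType)).first;
        }
        return it->second;
    }

    // Called during the loading screen so the first spawn is a cache hit.
    void preload(const std::vector<std::string>& enemyTypes) {
        for (const auto& type : enemyTypes) get(type);
    }

    bool isLoaded(const std::string& enemyType) const {
        return cache_.count(enemyType) != 0;
    }

private:
    AnimationSet loadFromDisk(const std::string& type) {
        // Placeholder for the real parse-and-upload work.
        return AnimationSet{{type + "_idle", type + "_walk", type + "_attack"}};
    }
    std::map<std::string, AnimationSet> cache_;
};
```

The level loader would call `preload` with the list of enemy types the level uses, paying the parsing cost while the player is already staring at a progress bar.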
“But aren’t loading screens bad?” Well, yes, and moving more things to a loading screen will make it take even longer. However, if you get creative, you can hide a “loading screen” in your game. Have you ever played a game where you have to sit in an elevator? That elevator was a loading screen. You know when a game has you walk through a door and then locks it behind you? The game is unloading the section of the level you just passed through, and that upcoming winding, empty corridor is there to keep the rest of the level out of line of sight until the game has had time to load it in.
Portal hides most of its loading time in elevator sequences
+ Easily “fixes” most slow operations
+ With some creativity, can be hidden so it’s less noticeable
– Nobody likes sitting at loading screens doing nothing. Making them take even longer just makes it worse.
– Paying for things you aren’t using is a Bad Thing. You’ll probably need that content loaded eventually, but until then, you’re wasting memory on it.
Amortization
Amortize is originally a financial term, but it just means “to write off the cost of (an asset) gradually.” In other words, rather than doing an expensive computation all at once right now, we’ll spread it out over time. What are some ways we can do that?
A common example can be found with streaming video. Rather than blocking everything waiting for the entire video to download ahead of time, the “cost” of downloading the video is paid over time, allowing the user to start watching right away.
If your program is multithreaded, having a worker thread do the calculation in the background accomplishes this goal, but a single-threaded application can apply the same concept. Let’s use a video game example: you’re building an endless runner like Canabalt, Robot Unicorn Attack, or Temple Run. Because the game goes on until the player dies, you need to dynamically generate more platforms, obstacles, and enemies in front of the player. You set the game up to generate more content once the player gets close to the end of what’s currently loaded, but every time that level generation code runs, the game freezes for 100ms while the new content is created. Rather than unloading the old content and loading the next “chunk” all at once, you can load a couple of new objects every frame. Spreading that work over two seconds at 30 FPS gives you 60 frames to work with, so you only need to spend an average of 100 / 60 ≈ 1.7ms per frame instead of 100ms all at once.
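A minimal sketch of that amortized generation might look like the following. The names (`ChunkGenerator`, `update`) are illustrative; the key idea is that the work sits in a queue and each frame drains only a small slice of it.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Amortized level generation: instead of building a whole chunk at once,
// queue the work and do a slice of it each frame.
class ChunkGenerator {
public:
    // Queue up everything the next chunk needs
    // (platforms, obstacles, enemies), represented here as plain ints.
    void queueChunk(int objectCount) {
        for (int i = 0; i < objectCount; ++i) pending_.push_back(i);
    }

    // Called once per frame: spawn at most budgetPerFrame objects,
    // spreading the total cost over many frames.
    void update(int budgetPerFrame) {
        for (int i = 0; i < budgetPerFrame && !pending_.empty(); ++i) {
            spawned_.push_back(pending_.front());
            pending_.pop_front();
        }
    }

    std::size_t pendingCount() const { return pending_.size(); }
    std::size_t spawnedCount() const { return spawned_.size(); }

private:
    std::deque<int> pending_;
    std::vector<int> spawned_;
};
```

A real game would more likely budget by elapsed time (stop once, say, 1.5ms of the frame has been spent) rather than by object count, since individual objects can vary wildly in cost.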
Before you apply this type of optimization, dig deep enough to make sure it will actually fix your problem. If you’re trying to smooth out a 100ms load, but 95ms of it comes from one single object, amortizing won’t get that lag spike any lower than 95ms.
+ Good for smoothing out spikes in CPU usage
– Potentially adds a lot more complexity to your code
– May require saving a lot more of your algorithm’s current state mid-calculation, using up more memory and opening up lots of room for bugs
Performing Calculations Ahead of Time
Not having to pay a cost at all is even better than paying it at an opportune moment. If your calculation relies only on static data, you can perform it ahead of time, before your program ever runs. For a simple example, a math library may ship with pre-calculated tables for functions like sine and cosine and simply look up the correct value rather than computing it at run time.
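Here is a sketch of that lookup-table idea. For brevity, the table is filled once at startup; in a true ahead-of-time setup, a tool would bake these values into a generated header instead. `TABLE_SIZE` and `fastSin` are illustrative names, not a real library API.

```cpp
#include <cmath>
#include <vector>

constexpr int TABLE_SIZE = 1024;
constexpr double TWO_PI = 6.283185307179586;

// Build the table of precomputed sine values. In a build-time version,
// a script would emit this array as a constant in a generated header.
static std::vector<double> buildTable() {
    std::vector<double> table(TABLE_SIZE);
    for (int i = 0; i < TABLE_SIZE; ++i)
        table[i] = std::sin(TWO_PI * i / TABLE_SIZE);
    return table;
}

// Approximate sin(x) by looking up the nearest precomputed entry.
double fastSin(double radians) {
    static const std::vector<double> table = buildTable();
    double t = std::fmod(radians, TWO_PI);  // wrap into (-2*pi, 2*pi)
    if (t < 0) t += TWO_PI;                 // then into [0, 2*pi)
    int index = static_cast<int>(t / TWO_PI * TABLE_SIZE) % TABLE_SIZE;
    return table[index];
}
```

With 1024 entries the step between table values is about 0.006 radians, so the lookup is accurate to roughly that much; a real implementation would interpolate between neighboring entries to tighten the error.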
Optimizations of this nature can often be integrated directly into your tools and workflow. For example, let’s say you’re building a 3D simulation and you need to calculate some physics data from the 3D models of your objects: maybe a bounding box, the object’s center of gravity, or a convex hull. Instead of waiting until your program is running to calculate this data for every object, you could write a custom exporter for your 3D modeling program of choice that calculates it ahead of time. While you’re at it, you could even replace a standard 3D model file format with one tailored exactly to how your engine works. This sort of optimization is why a lot of AAA games use custom file formats, and why modding them often requires downloading a plugin for modeling tools.
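As a sketch of the kind of work such an exporter might do, here is an axis-aligned bounding box computed once from a model’s vertices. An exporter would run this at export time and write the result into the asset file, so the engine never computes it at run time. `Vec3` and `BoundingBox` are illustrative types, not a real exporter API.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };
struct BoundingBox { Vec3 min, max; };

// Walk every vertex once, tracking the per-axis minimum and maximum.
// Assumes the vertex list is non-empty.
BoundingBox computeBoundingBox(const std::vector<Vec3>& vertices) {
    BoundingBox box{vertices.front(), vertices.front()};
    for (const Vec3& v : vertices) {
        box.min.x = std::min(box.min.x, v.x);
        box.min.y = std::min(box.min.y, v.y);
        box.min.z = std::min(box.min.z, v.z);
        box.max.x = std::max(box.max.x, v.x);
        box.max.y = std::max(box.max.y, v.y);
        box.max.z = std::max(box.max.z, v.z);
    }
    return box;
}
```

This is cheap for one model, but across thousands of assets (and for pricier work like convex hull generation), doing it at export time keeps it off the player’s machine entirely.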
Programmers can also have some calculations performed ahead of time. The major downside is that, unlike the above example of custom asset formats, anything that runs on build makes every single build take longer for every member of your team, but sometimes it may still make sense. One straightforward choice is using pre-build and post-build scripts. If you’re using C++, template metaprogramming is a technique for performing calculations when your program compiles; however, template metaprogramming code is often difficult to read and, by extension, can be rather difficult to maintain. In either case, compiling code already takes long enough; you don’t want to make it worse unless it’s worth the cost.
Well, okay, you _probably_ don’t want to make compilation times worse
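For a taste of what template metaprogramming looks like, here is the classic compile-time factorial. The compiler evaluates the recursion while building, so `Factorial<10>::value` is baked into the binary as a constant; the `static_assert` proves the value exists before the program ever runs. (Modern C++ gets the same effect more readably with `constexpr` functions.)

```cpp
// Recursive case: computed entirely by the compiler via
// template instantiation.
template <unsigned N>
struct Factorial {
    static const unsigned long long value = N * Factorial<N - 1>::value;
};

// Base case: a template specialization stops the recursion.
template <>
struct Factorial<0> {
    static const unsigned long long value = 1;
};

// Checked at compile time; a build error here would mean the
// calculation is wrong, with zero run-time cost either way.
static_assert(Factorial<10>::value == 3628800ULL,
              "factorial computed at compile time");
```

Even this tiny example hints at the readability cost: the “function” is a pair of structs, and the base case lives in a specialization far from the logic it terminates.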
+ Can make expensive calculations effectively free. Yay!
– May slow down developer iteration time, and developer time is expensive
– Most likely adds complexity to your workflow and/or build process, making it harder for new team members to get up to speed, and giving more potential places for things to go wrong