Category Archives: Programming

Posts about anything programming-related

Optimization, Part 2: When do you pay the cost?

(For the previous article, see Optimization, Part 1: Figuring out your problem)

Although the word optimization probably makes you think “make the program run faster,” it isn’t always about optimizing algorithms. Often, you can make noticeable improvements by changing when your code is executed rather than trying to reduce how long it takes. In this post, I’ll discuss three options: loading screens, amortization, and performing calculations ahead of time.

Loading Screens

Although a loading screen is traditionally used for anything your program can’t run without–for example, a game can’t run until the upcoming level is loaded–you can also use it to pre-load things you’ll only need later. This is primarily useful for video games: business software taking a tenth of a second to do some operation is fine, but in a game, stopping for a tenth of a second is a noticeable hitch in your frame rate.

Let’s take a simple example: you’re building a platformer game similar to Mario. The first time you spawn a given enemy type in a level, the game stutters. On further investigation, you discover that the game is loading all of the enemy’s animations into memory for the first time, and the combination of parsing the animation data and transferring it into RAM causes the brief stutter. One solution is to pre-load that animation data into your animation system before the level begins.
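Here’s a minimal sketch of that idea in Python. The AnimationCache class, the load_animation function, and the animation names are all made up for illustration (they aren’t from any real engine): everything the level needs gets loaded while the loading screen is up, so spawning an enemy later is just a dictionary lookup.

```python
import time


def load_animation(name):
    """Hypothetical stand-in for slow parsing and uploading of animation data."""
    time.sleep(0.01)  # pretend this is the expensive part
    return {"name": name, "frames": list(range(8))}


class AnimationCache:
    """Hypothetical cache: pay the load cost once, ideally behind a loading screen."""

    def __init__(self):
        self._loaded = {}
        self.slow_loads = 0  # counts how many times we paid the full cost

    def preload(self, names):
        # Called while the loading screen is up, before gameplay starts.
        for name in names:
            self.get(name)

    def get(self, name):
        if name not in self._loaded:
            self._loaded[name] = load_animation(name)
            self.slow_loads += 1
        return self._loaded[name]


cache = AnimationCache()
cache.preload(["enemy_walk", "enemy_die"])  # during the loading screen
loads_before_gameplay = cache.slow_loads

# Mid-level spawn: the data is already cached, so no slow load happens here.
anim = cache.get("enemy_walk")
```

The trade-off is exactly the one described above: the loading screen gets a little longer, and the memory is spent whether or not that enemy ever shows up.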

“But aren’t loading screens bad?” Well, yes, and moving more things to a loading screen is going to make it take even longer. However, if you get creative, you can hide a “loading screen” in your game. Have you ever played a game where you have to sit in an elevator? That elevator was used as a loading screen. You know when games have you walk through a door then lock it behind you? It’s unloading the section of the level you just passed through, and that upcoming winding, empty corridor is being used to keep the rest of the level out of line of sight until the game has had time to load it in.

Portal hides most of its loading time in elevator sequences

+ Easily “fixes” most slow operations
+ With some creativity, can be hidden so it’s less noticeable
– Nobody likes sitting at loading screens doing nothing. Making them take even longer just makes it worse.
– Paying for things you aren’t using is a Bad Thing. You’ll probably need that thing loaded eventually, but until then you’re wasting memory on it.


Amortization

Amortize is originally a financial term, but it just means to pay off a cost gradually. In other words, rather than doing an expensive computation all at once right now, we’ll spread it out over time. What are some ways we can do that?

A common example can be found with streaming video. Rather than blocking everything waiting for the entire video to download ahead of time, the “cost” of downloading the video is paid over time, allowing the user to start watching right away.

If your program is multithreaded, having a worker thread do a calculation in the background accomplishes this goal. However, a single-threaded application can apply the same concept of multitasking.

Let’s use a video game example: you’re building an endless runner game like Canabalt, Robot Unicorn Attack, or Temple Run. Because the game goes on until the player dies, you need to dynamically generate more platforms, obstacles, and enemies in front of the player. You set the game up to generate more content once the player gets close to the end of what’s currently loaded, but every time you hit that level generation code, your game freezes for 100ms while the new content is created. Rather than unloading the old content and loading the next “chunk” all at once, you can load in a couple of new objects every frame. Spreading that loading time over two seconds at 30 FPS gives you 60 frames to work with, so you only need to spend an average of 100 / 60 ≈ 1.7ms per frame rather than 100ms all at once.
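One way to sketch this in a single-threaded game is with a Python generator that hands back one object at a time, plus a loader that only pulls a couple of objects per frame. The names here (build_chunk, AmortizedLoader, objects_per_frame) are hypothetical, and the “objects” are just placeholder dictionaries.

```python
def build_chunk(object_count):
    """Generator that creates one object per step instead of all at once."""
    for i in range(object_count):
        # Hypothetical placeholder for real spawning work (platforms, enemies...).
        yield {"id": i, "kind": "platform"}


class AmortizedLoader:
    """Pulls a small, fixed number of objects from the generator each frame."""

    def __init__(self, work, objects_per_frame):
        self._work = work
        self._objects_per_frame = objects_per_frame
        self.loaded = []
        self.done = False

    def update(self):
        """Call once per frame; does a small, bounded slice of the work."""
        for _ in range(self._objects_per_frame):
            try:
                self.loaded.append(next(self._work))
            except StopIteration:
                self.done = True
                return


loader = AmortizedLoader(build_chunk(120), objects_per_frame=2)
frames = 0
while not loader.done:  # stands in for the game's main loop
    loader.update()
    frames += 1
```

With 120 objects at two per frame, the work is spread across 60 frames (plus one last call that notices the generator is finished). The generator is also what holds the “current state mid-calculation,” which is where the extra complexity tends to creep in.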

Before you apply this type of optimization, dig deep enough to make sure it will actually fix your problem. If you want to smooth out loading content taking 100ms, but 95ms of it is caused by one single object, you’re still not going to get that lag spike any lower than 95ms.

+ Good for smoothing out spikes in CPU usage
– Potentially adds a lot more complexity to your code
– May require saving a lot more of your algorithm’s current state mid-calculation, using up more memory and opening up lots of room for bugs

Performing Calculations Ahead of Time

Not having to pay a cost at all is even better than paying it at an opportune moment. If your calculation only relies on static data, you can calculate it ahead of time before your program ever runs. For a simple example, math libraries may have functions like sine and cosine pre-calculated and simply look up the correct value rather than having to calculate it at run time.
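As a rough illustration of the lookup-table idea, here’s a sketch in Python. The one-entry-per-degree table size and the function name are assumptions for the example, not how any particular math library does it, and in Python itself a table like this won’t necessarily beat math.sin; it only shows the trade of memory and precision for computation.

```python
import math

TABLE_SIZE = 360  # one entry per degree; an assumption, pick what your precision needs

# Computed once, up front. In a real project this could be generated ahead of time.
SINE_TABLE = [math.sin(math.radians(d)) for d in range(TABLE_SIZE)]


def fast_sin_degrees(degrees):
    """Look up sine for a whole number of degrees instead of computing it."""
    return SINE_TABLE[int(degrees) % TABLE_SIZE]
```

Anything between whole degrees gets truncated here, so a real version would need a bigger table or interpolation between neighboring entries.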

Optimizations of this nature can often be integrated directly into your tools and workflow. For example, let’s say you’re building a 3D simulation and you need to calculate some physics data from the 3D models of your objects: maybe a bounding box, or the object’s center of gravity, or generating a convex hull. Instead of waiting until your program is running to calculate this data for every object, you could instead write a custom exporter for your 3D modeling program of choice that calculates this data ahead of time. While you’re at it, you could even tweak a standard 3D model file format to instead be tailored exactly to how your engine works. This sort of optimization is why a lot of AAA games use custom file formats, and modding them often requires downloading a plugin for modeling tools.
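A toy version of that exporter idea might look like the following. The bake_physics_data function and the JSON layout are invented for this sketch; a real exporter would hook into your modeling tool and write whatever format your engine expects.

```python
import json


def bake_physics_data(vertices):
    """Compute an axis-aligned bounding box and vertex centroid for a mesh."""
    xs, ys, zs = zip(*vertices)
    count = len(vertices)
    return {
        "aabb_min": [min(xs), min(ys), min(zs)],
        "aabb_max": [max(xs), max(ys), max(zs)],
        # Average of the vertices; only a rough proxy for a true center of gravity.
        "centroid": [sum(xs) / count, sum(ys) / count, sum(zs) / count],
    }


# Exporter side: run once, when the artist exports the model.
cube = [(x, y, z) for x in (0.0, 2.0) for y in (0.0, 2.0) for z in (0.0, 2.0)]
baked_text = json.dumps(bake_physics_data(cube))

# Engine side: at run time, just read the stored numbers back.
baked = json.loads(baked_text)
```

The run-time side never touches the vertex list at all, which is the whole point: the expensive part happened on the artist’s machine.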

Programmers can also have some calculations performed at build time. The major downside is that, unlike the above example of custom asset formats, anything that runs on build is going to make every single build take longer for every member of your team, but sometimes it makes sense. One straightforward choice is using pre-build and post-build scripts. If you’re using C++, template metaprogramming is a technique that can be used to perform calculations when your program builds. However, template metaprogramming code is often difficult to read and, by extension, difficult to maintain. In either case, compiling code already takes long enough; you don’t want to make it worse unless it’s worth the cost.

Well, okay, you _probably_ don’t want to make compilation times worse

+ Can make expensive calculations effectively free. Yay!
– May slow down developer iteration time, and developer time is expensive
– Most likely adds complexity to your workflow and/or build process, making it harder for new team members to get up to speed, and giving more potential places for things to go wrong

Optimization, Part 1: Figuring out your problem

“Dang, my game is running really sluggish. I know, I’ll just optimize it! (*opens code editor*) …ok, how do I actually do that?”

First things first: You can’t fix something until you know what to fix. You’ll probably know a few symptoms: Maybe the game freezes up every once in a while, or the game crashes on old phones because it runs out of memory, or maybe the framerate just seems to get choppy every now and then. You might even have a guess as to what’s causing the problem, but before you dive into your code to start fixing everything, make sure you’re fixing the right thing.

If you can, an easy first step is to run a profiler. Essentially, a profiler tracks how long different parts of your program take to execute, letting you drill down deep into your code to figure out how long each function call takes. While off-the-shelf tools are generally the easiest option, sometimes they’re not available or are impractical. You can still do the same thing manually, inserting code to check the current time before and after a function is done running and dumping that info out to a log file.
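If you do end up timing things by hand, a small helper keeps the clutter down. This is a minimal sketch in Python; the timed helper, the timings dictionary, and update_physics are all hypothetical names, and it prints a summary where a real game would probably write to a log file.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> list of elapsed milliseconds


@contextmanager
def timed(label):
    """Record how long the wrapped block takes, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        timings.setdefault(label, []).append(elapsed_ms)


def update_physics():
    time.sleep(0.005)  # hypothetical stand-in for real work


with timed("physics"):
    update_physics()

for label, samples in timings.items():
    print(f"{label}: {max(samples):.2f} ms worst, {len(samples)} sample(s)")
```

Wrapping each subsystem’s update in its own label gives you a crude version of the per-function breakdown a real profiler shows.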

Unity3D has a built-in profiler in the Pro version

So, what data are we looking for? Anything that’s taking too long, of course! But what counts as “taking too long”?

To answer that, let’s talk about frames per second (FPS), also known as frame rate. This is a measurement of how many images are presented to the player in one second. From a programmer’s perspective, this measures how many times you can loop through your game code in one second. Generally speaking, your goal is to never dip below 30 FPS. If your game requires a lot of really responsive controls, like a first-person shooter game, you might even want to push for 60 FPS. Having a target FPS is important, but FPS itself isn’t a good measure of our game’s performance when optimizing because it’s non-linear. Going into the details behind that is a bit of a tangent, but as a quick example: dropping from 60 to 30 FPS costs about 16.7ms per frame, and so does dropping from 30 to 20 FPS.

Instead of frames rendered per second, what we really want to measure is: how long does each frame of our game take? Instead of (frames / second), what we want is (seconds / frame). A second is a really long time in terms of computation, though, so we typically use milliseconds. To convert from (frames/sec) to (sec/frame), we just take the inverse.

And that is where these numbers in Unity’s profiler come from

If our goal is 30 FPS, we have (1 / 30) = 0.03333… sec = 33.33… milliseconds per frame
If our goal is 60 FPS, we have (1 / 60) = 0.01666… sec = 16.66… milliseconds per frame
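The same conversion as a tiny Python helper (the function name is made up for illustration):

```python
def frame_budget_ms(target_fps):
    """Convert a target frame rate into a per-frame time budget in milliseconds."""
    return 1000.0 / target_fps


for fps in (60, 30):
    print(f"{fps} FPS -> {frame_budget_ms(fps):.2f} ms per frame")
```

This prints 16.67 ms for 60 FPS and 33.33 ms for 30 FPS, matching the numbers above.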

Finally, we’ve reached a reference point! Assuming our target is 30 FPS, if one iteration through our game’s code takes longer than 33.3ms, it is officially “taking too long”. Huzzah!

Armed with this knowledge, what do we look for in the profiler data?

If the game briefly “freezes” every now and then, look for spikes in performance time:

This one frame has a giant jump in computation time

Not all performance hitches will be this visually obvious, but it’s the same idea: look for a sudden jump up in frame time. The frame pointed to by the arrow took a lot longer to process than the others. If we look at the color-coding, we can deduce that something in our physics setup caused the problem. Click on that frame in the window, then look at the list of function calls to see what happened that frame. Try and find a way to consistently reproduce the problem and get as much data as you can. Maybe we just created a bunch of objects? Maybe we moved around a bunch of “static” colliders that don’t have rigidbodies and forced the engine to re-calculate all the physics data?
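If you’re logging frame times yourself instead of using a profiler’s graph, the same “look for the jump” check is easy to automate. A minimal sketch in Python, with a hypothetical find_spikes function and made-up frame times:

```python
def find_spikes(frame_times_ms, budget_ms=33.3):
    """Return (frame_index, time_ms) for every frame that blew the budget."""
    return [(i, t) for i, t in enumerate(frame_times_ms) if t > budget_ms]


# Hypothetical frame times logged from a play session, in milliseconds.
frame_times = [15.9, 16.4, 16.1, 92.7, 16.3, 16.0]
spikes = find_spikes(frame_times)
```

Here only frame 3 is flagged. Knowing which frame spiked is just the starting point; you still need to dig into what ran during that frame.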

If the performance issue comes from your scripts, look for what function calls took the most time. Dig as far down as possible to make sure you’re fixing the right problem. I recently spent a bunch of time “fixing” a performance spike that happened whenever we would load in new level data in an endless runner. “Of course, adding all of those objects at once is causing it, duh!” I said to myself. After a few hours changing the code to slowly stream in the new objects a handful at a time… the problem was still there. Oops. Turns out the performance hitches were actually caused by loading in animation data for objects we hadn’t used yet. Fixing that took all of twenty minutes to make the game pre-load animations beforehand; I didn’t need to re-work our level loading system at all. I could’ve avoided wasting that time if I had just spent a few more minutes digging deeper into the profiler data.

If your game is consistently slow and sluggish rather than just having “spikes” like the above, you’ll need to work a bit harder. If you’re really, really lucky, it’s one subsystem or function taking way too long and you can just fix it and move on with your day. More often than not, though, your update loop is just taking too long overall across… well, everything. Rather than your performance being killed by one big orcish axe to the chest, it is instead being subjected to death by ten thousand paper cuts.

The main question you now have to answer is, how much time does each subsystem get? Remember, you only get 33.3ms to spread across every single system in your game at 30 FPS. In a big AAA project, there will probably already be a performance budget given to each subsystem: “OK, AI gets [X] ms, physics gets [Y] ms…”. For a smaller team, you probably don’t have this level of formality yet. Now you have to change that. Even after you get your performance back to a manageable level, every single thing you add will potentially push it right back over, so it’s important to keep things in check. Generally, you’ll want to start with whatever subsystem takes the largest amount of time, and work your way down from there.
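A performance budget doesn’t need to be fancy to be useful. Here’s a sketch in Python where the subsystem names, the budget split, and the measured numbers are all invented for illustration; the only real constraint is that the budgets have to add up to no more than one frame.

```python
FRAME_BUDGET_MS = 1000.0 / 30  # 30 FPS target

# Hypothetical budgets; the split is made up for illustration.
budgets_ms = {"rendering": 14.0, "physics": 8.0, "ai": 6.0, "scripts": 5.0}


def over_budget(measured_ms, budgets):
    """List subsystems exceeding their budget, worst offender first."""
    overruns = {
        name: measured_ms[name] - limit
        for name, limit in budgets.items()
        if measured_ms.get(name, 0.0) > limit
    }
    return sorted(overruns.items(), key=lambda item: item[1], reverse=True)


assert sum(budgets_ms.values()) <= FRAME_BUDGET_MS  # budgets must fit in one frame

# Hypothetical per-subsystem timings for one frame, in milliseconds.
measured = {"rendering": 13.2, "physics": 12.5, "ai": 7.0, "scripts": 4.1}
offenders = over_budget(measured, budgets_ms)
```

In this made-up frame, physics is 4.5ms over and AI is 1ms over, so physics is where you’d start.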

Once we figure out the source of our performance issues, how do we fix it? Well, that’s a big, complex problem that I’ll write about more later. In the meantime, hopefully this post has helped you figure out where to get started!

(For the next article, see Optimization, Part 2: When do you pay the cost?)

Advantages of a Beginner

When you’re a beginner, it’s easy to feel like you don’t have anything to offer. After all, there are a gazillion people out there who are better than you, right? It turns out newbies bring a lot to the table.

Beginners know how to learn new skills. Let’s say you want to learn programming. Who do you think is better to ask for advice: a friend who’s been doing it professionally for thirty years, or your friend who took a single college course on it about two years ago? Well, the 30-year seasoned professional hasn’t had to learn how to program in decades, and they learned back on a now-ancient Commodore 64. Even though your other friend is quite inexperienced, they’ll have a much better idea of where to get started.


Not that you couldn’t still learn to program on one of these today, but it’s a few decades behind current technology. (Source: Wikipedia)

A beginner can write a guide or tutorial in language other beginners will understand; experts often have trouble doing the same.

Newbies (potentially) know a lot about new tools and technology. When you’re teaching yourself a new skill, you’re inherently forced to put a lot of research into that topic. Odds are that you’ll stumble upon a lot of new and upcoming tech during that research. It’s also easier to objectively compare different tools and tech when you don’t have several years invested into a particular tech stack.

Beginners don’t yet know what isn’t possible. It’s easier to think outside of the box when there is no box. When you have a lot of experience in a field, it’s easy to dismiss a lot of ideas that seem impractical without giving them the attention they deserve. Obviously, this same trait can also backfire horridly and bite you in the rear, but it’s not without its advantages.

A Swift Review of Swift

Over at Ludisto, we’ve been using Apple’s new programming language Swift for our latest mobile game. We started back in July while Xcode 6 was still in beta, and the language has evolved and improved a lot over that time period. Most of the time, the experience is fantastic! Other times, well…

swiftc segfault error

…the compiler segfaults with no useful error message.

The upside is, you know the issue was caused by your most recent code change(s). Tracking it down usually isn’t too hard, but figuring out a workaround can be annoying. Most of the major issues were hammered out during Xcode 6’s beta phase, but you do still run into problems from time to time. Some form of continuous integration (i.e. automated builds) would likely be pretty useful for catching issues early. For example, right now, we’re dealing with an issue where our project won’t Archive, which is needed to build the app for the app store; that would’ve gotten caught much earlier if we had an automated build system checking things nightly.

For anyone with any reasonable amount of programming experience, Swift is super easy to learn. Apple’s book The Swift Programming Language covers it really well, and is a good reference manual. The catch here is that you won’t learn any of Apple’s APIs from that book, so jumping into actually writing an app in Swift is probably a lot easier if you’ve used Objective-C in the past. As someone who had no prior experience with native iOS development, I struggled a little, but it was absolutely manageable. Adobe Flash was über easy to get started with back in the day; SpriteKit is even easier. Apple’s Swift resources page gives a lot of useful links. For a really good tutorial project to get started with SpriteKit, I highly recommend How to Make a Game Like Candy Crush with Swift. Additionally, Apple’s WWDC 2014 videos have a ton of useful talks to watch for free.


Getting to use Xcode is also a nice perk

Overall: Swift is great! I highly recommend checking it out and using it, especially for smaller projects. There are still some small kinks that need to get hammered out, though, so I’d recommend against using Swift exclusively in a large project just yet; check back in maybe six months to a year.

Why Depth is Important

Generally speaking, your education will expose you to a lot of different ideas and make you a well-rounded individual. That said, while breadth is important, having an area of specialization can make all the difference in the world. As a newbie, the value of this isn’t inherently obvious: doesn’t a more varied skill set keep more options open? Sure, big teams might need focused specialists, but wouldn’t small teams favor generalists? It’s actually surprisingly simple: having a specialist on your team, or at least knowing someone to go to, makes things a lot easier when you get stuck.

When I was working on Carbon Conquest (an RTS being made in Unity3D), I had a lot of trouble optimizing the fog of war. Essentially, the fog was made by creating a semi-transparent black texture, then drawing fully transparent circles in it at each unit’s position. The texture was then projected onto the terrain. This looks exactly how you’d want it to, but generating the texture was a real bottleneck. The naive implementation I started with just created a 2D array of colors in code, drew all of the circles into it via a C# script, then copied the color data over to the fog projector’s texture. Obviously, doing this every frame in a component script is going to slow things down.
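To give a feel for why that gets slow, here’s a rough reconstruction of the naive approach in Python. This is not the original C# code: the function name, grid size, fog opacity, and unit data are all assumptions, and it stores a single opacity value per texel instead of a full color.

```python
def build_fog(width, height, units, fog_alpha=0.6):
    """Return a height x width grid of fog opacity; 0.0 means fully revealed."""
    fog = [[fog_alpha] * width for _ in range(height)]
    for unit_x, unit_y, radius in units:
        # Only visit texels inside the circle's bounding square.
        for y in range(max(0, unit_y - radius), min(height, unit_y + radius + 1)):
            for x in range(max(0, unit_x - radius), min(width, unit_x + radius + 1)):
                if (x - unit_x) ** 2 + (y - unit_y) ** 2 <= radius ** 2:
                    fog[y][x] = 0.0
    return fog


# Two hypothetical units as (x, y, sight_radius) on a tiny 16x16 fog texture.
fog = build_fog(16, 16, units=[(4, 4, 2), (12, 10, 3)])
```

Every call rebuilds the whole grid and then touches every texel near every unit, so the cost grows with both texture size and unit count. Doing that each frame on the CPU adds up fast.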

As this looked perfect but just ran too slowly, I wanted to re-write the texture generation as a shader to speed things up. Couldn’t be too hard: you just need to go through a bunch of location data and change the texture’s color if it’s close enough to a unit at that location. I started to learn Unity’s shader language via a pretty awesome tutorial, but I got stuck: how should I get the unit position data over to the shader? Based on what little I’d learned, I had this dilemma: “I can pass data to the shader, but only via constants, and we have a variable number of units… huh.” And right there, I was stuck. To the internet! ….well, googling didn’t help whatsoever. To other human beings!

Alright, I’ve got friends who’ve done graphics programming, I’ll ask them! As it turned out, everyone I thought to ask had only really dabbled in graphics programming, so they weren’t really sure either. At that point, I was stuck with two options: 1. learn graphics programming in depth, or 2. hack it and actually finish the game before the deadline. Obviously, we went with option two. The fog of war would now only update once per second. Turns out this actually looks alright by virtue of older RTSes having similarly “choppy” fog of war (w00t!), but it would’ve been nice to have something more responsive.

Had either I or one of my friends had the appropriate depth of knowledge in graphics programming, we could’ve probably figured out a more efficient approach that would have ultimately led to a better gameplay experience. Instead, we had to settle on a hackish workaround. Generalists are good, but if you want to deliver the best possible product/game/experience, you’re going to want to have someone who really knows a given subject. Ideally, you’ll be someone with a broad skill set who also has an area of expertise; this is the “t-shaped” skill set that companies like Valve are looking for.

Licenses page added to my site!

I’ve finally gotten around to creating a page that details licenses for source code posted on my site. Check the Licenses page for details!

This is something I really should have gotten around to earlier. If you run a site with code posted, it’s also something you should do. Unless you have a license available for your code somewhere, others cannot use it in their own projects. For example, even if you write a super-helpful blog post about how to solve a programming problem, or offer useful code snippets in your posts, others cannot legally use them.

I opted for the MIT License because it’s super open. I’m in the process of adding a text file with the license to applicable files on my site, but in the meantime, feel free to contact me if there is any confusion. If you’re not sure where to start on picking a license for your code, consider this list of Open Source Licenses.

Also, as a disclaimer: I am not a lawyer. If you need legal advice, you should consider consulting a lawyer or attorney.

GDC Day 1: Math for Games Programmers

For the second year in a row, I attended the Math for Games Programmers tutorial at GDC today. As expected, there was a fair amount of review, but I also learned some cool new things!

The earlier talks on splines, blending, matrices, etc. were basically review for me, but it was nice to have that refresher. That said, one tip came up that really stood out to me for matrix math. Matrices are multiplied in a sort of “reverse” order, for example:

Rotation * Translation = translate-then-rotate

Incidentally, this is actually the same order you’d write functions while programming:

rotate( translate( point ) ) = translate-then-rotate( point )
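To check this for myself, here’s a small sketch in Python using 3x3 homogeneous matrices for 2D points. It assumes the column-vector convention (the matrix goes on the left of the point); libraries that use row vectors flip the order. All of the helper names are made up for the example.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform(m, point):
    """Apply a 3x3 homogeneous matrix to a 2D point (column-vector convention)."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


R = rotation(math.pi / 2)  # 90 degrees counter-clockwise
T = translation(1, 0)
point = (1, 0)

combined = transform(mat_mul(R, T), point)     # (R * T) applied to the point
nested = transform(R, transform(T, point))     # rotate(translate(point))
other_order = transform(mat_mul(T, R), point)  # rotate first, then translate
```

Both combined and nested come out at roughly (0, 2): the point is moved to (2, 0) and then rotated. Swapping the order to T * R gives roughly (1, 1) instead.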


Jim van Verth’s talk on quaternions was absolutely fantastic. He spent the talk discussing how quaternions work rather than why we use them. Questions like “why four values?”, “why do we input theta over 2 instead of the entire angle?”, “how can we visualize 4D space?” and more are covered. As a bonus, the tutorial will actually make it to the GDC vault this year, so definitely check out the talk if you can! If you don’t have vault access, you should be able to pick up the slides once they’re posted.

Dual numbers sound useful, but I’m not sure how often I’ll use the content of the lecture. That said, “you can basically get the derivative for free” is a pretty awesome thing to keep in mind. They’re pretty interesting from a mathematical standpoint.
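Here’s my own minimal sketch of the “derivative for free” trick in Python (not code from the talk). A dual number is a + b*eps where eps squared is zero; if you seed b with 1 and run a function on it, the b part of the result is the derivative. The Dual class below only supports addition and multiplication, which is enough for polynomials.

```python
class Dual:
    """A number a + b*eps where eps**2 == 0; b carries the derivative."""

    def __init__(self, real, dual=0.0):
        self.real = real
        self.dual = dual

    @staticmethod
    def _lift(other):
        # Treat plain numbers as duals with a zero derivative part.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.real + other.real, self.dual + other.dual)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, since eps**2 == 0
        return Dual(self.real * other.real,
                    self.real * other.dual + self.dual * other.real)

    __rmul__ = __mul__


def f(x):
    return 3 * x * x + 2 * x + 1  # by hand, f'(x) = 6x + 2


result = f(Dual(2.0, 1.0))  # seed the dual part with 1 to track d/dx
```

After this runs, result.real is f(2) = 17 and result.dual is f'(2) = 14, without ever writing the derivative out by hand.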

The talk on Orthogonal Matching Pursuit and K-SVD for Sparse Encoding went way over my head, but I still pulled a good deal of information from it. I think I have a rough idea of how compression works now. I have some research to do!

The talk on Computational Geometry was pretty interesting, though it seemed more or less the same as last year. Still, it was good to get the review – I had forgotten a lot of it. I also learned about higher order surfaces on the GPU (i.e. tessellation, etc.), which was new to me!

Finally, the talk on Interaction With 3D Geometry by Stan Melax was amazing. It was a super-fast-paced crash-course on a huge number of subjects that left me really inspired to start writing some tech demos to learn how all of the concepts work. I’ll definitely be watching it on the vault.

Tomorrow: Physics!