
Programming Pariahs – goto Statements

xkcd: goto (Source: http://xkcd.com/292/)

Interestingly, although I do not believe I’ve ever been formally taught how goto works, I have been told on numerous occasions in classes that it’s something to avoid while programming. In fact, during one of my introductory programming courses, I distinctly remember the TA scoffing at my usage of goto in a C# program (where it was actually required). I’ll walk through the rationale behind the advice after a quick explanation of how goto is used in programming.

So, how does this newfangled goto thing work?

goto is used to jump to another location in a program. Upon reaching a goto statement, your program will jump to wherever the goto commands it to.

Though the details of goto differ between programming languages, here’s an example in C++ to print “I hate using loops!” five times:

#include <stdio.h>

int main()
{
int i = 0;

this_is_a_label:
   printf("I hate using loops!\n");
   i = i + 1;
   if (i < 5)
      goto this_is_a_label; // jump to the label if we're not done

   return 0;
}

This code has the following output:

I hate using loops!
I hate using loops!
I hate using loops!
I hate using loops!
I hate using loops!

Notice the label on line 7, “this_is_a_label:”. Labels are used as locations that you can jump to with a goto statement. The statement “goto this_is_a_label;” tells the program to jump to that label and continue execution from there.

Alright, now why “shouldn’t” I use goto?

Jumping around in code can make it very hard to process mentally while reading it, especially once the code becomes more complex than just one loop. For example, try figuring out what this program does:

#include <stdio.h>

int main()
{
    // Using nested goto loops
    int x = 1;
    outer_loop_begin:
        int y = 1;

        inner_loop_begin:
            printf("(%i,%i)\n", x, y);
            ++y;
            if(y < 4) goto inner_loop_begin;
        // inner loop done

        ++x;
        if (x < 4) goto outer_loop_begin;
    // outer loop done

    return 0;
}

It’s certainly less intuitive than this example using for loops:

#include <stdio.h>

int main()
{
    // Using nested for loops
    for(int x = 1; x < 4; ++x)
    {
        for (int y = 1; y < 4; ++y)
        {
            printf("(%i,%i)\n", x, y);
        }
    }

    return 0;
}

Both of the above examples have the same output:

(1,1)
(1,2)
(1,3)
(2,1)
(2,2)
(2,3)
(3,1)
(3,2)
(3,3)

As I’m sure you could imagine, improper/abusive use of goto can make a program significantly harder both to understand and to debug. You can use goto in a number of clever ways, but most places where you could be using goto you should generally be using some other part of the language (a loop structure, a function call, etc.).

When should I use goto?

The summary version is: whenever you are required to, or whenever it makes your code clearer. The latter is very subjective, so I won’t touch too much on that; however, it’s important to know when you need to use a goto statement.

Exiting from Nested Loops (C++, other languages)

Say hypothetically you’re inside nested loops, and you want to exit the entire structure, not just the loop you are in. Traditionally, a break statement is your trusty tool for exiting a loop, but that will only get you to the outer loop. For example:

#include <iostream>

int main()
{
    int searchValue = 2;

    // Search for a certain j value. Once it's found, exit the search loops
    // and exit the program.
    for (int i = 0; i < 5; ++i)
    {
        for (int j = 0; j < 5; ++j)
        {
            if (j == searchValue)
            {
                std::cout << "Found value " << searchValue
                          << " at (i,j): (" << i << "," << j << ")"
                          << std::endl;
                break; // exit our loops and print the completion message
                // BUG: breaks to the outer i loop
            }
        }
    }

    std::cout << "Search completed!" << std::endl;

    return 0;
}

You were hoping for the following output:

Found value 2 at (i,j): (0,2)
Search completed!

But you ended up with this!

Found value 2 at (i,j): (0,2)
Found value 2 at (i,j): (1,2)
Found value 2 at (i,j): (2,2)
Found value 2 at (i,j): (3,2)
Found value 2 at (i,j): (4,2)
Search completed!

The break statement at line 18 is only exiting your inner for loop. Once that break is hit, i gets increased by one and the loops continue going! To get the desired result, a goto statement will do the trick:

#include <iostream>

int main()
{
    int searchValue = 2;

    // Search for a certain j value. Once it's found, exit the search loops
    // and exit the program.
    for (int i = 0; i < 5; ++i)
    {
        for (int j = 0; j < 5; ++j)
        {
            if (j == searchValue)
            {
                std::cout << "Found value " << searchValue
                          << " at (i,j): (" << i << "," << j << ")"
                          << std::endl;
                goto search_done; // exit our loops and print the completion message
            }
        }
    }
search_done:

    std::cout << "Search completed!" << std::endl;

    return 0;
}

An alternative would be wrapping the loops in a function and returning where the goto is, but depending on where your loop is being called and what work is being done inside of it, goto may be the better choice.

“Falling through” non-empty cases in a switch statement (C#)

Traditionally, if you want multiple cases to do the same thing in a switch statement, your code will look something like this:

switch(direction)
{
    case WEST:
        goWest();
        break;
    case NORTH:
    case SOUTH:
    case EAST:
        Console.WriteLine("There is no exit in that direction");
        break;
    default:
        // Invalid argument
        throw new InvalidArgException("Ya dun goofed!"); // they dun goofed.
        break;
}

This works fine in C#, unless you want to do something in a case before falling through it. In many languages, simply omitting the “break;” statement at the end of the case will allow this functionality; however, this is not the case in C#: “A jump statement such as a break is required after each case block, including the last block whether it is a case statement or a default statement. With one exception, (unlike the C++ switch statement), C# does not support an implicit fall through from one case label to another. The one exception is if a case statement has no code.” (MSDN source). This is a good thing in that you won’t have an obscure bug caused by forgetting to put in a break statement; however, it requires using a different technique to fall through your cases.

If you want to “fall through” a non-empty case, you must explicitly state which case to go to:

switch(direction)
{
    case WEST:
        goWest();
        break;
    case NORTH:
        logSomethingHere();
        goto case EAST;
    case SOUTH:
    case EAST:
        Console.WriteLine("There is no exit in that direction");
        break;
    default:
        // Invalid argument
        throw new InvalidArgException("Ya dun goofed!"); // they dun goofed.
        break;
}

Notice the “goto case” at line 8.

Other Uses

There are assuredly other reasons to use goto out there, especially because each programming language is different; however, two examples are enough for this particular article. Check the documentation for your programming language of choice!

References / Recommended further reading

The goto Statement – MSDN

goto (C# Reference) – MSDN

Wikipedia’s article on Goto, especially the Criticism and Decline section.

Programming Pariahs – Don’t Make Everything Public

Recently, while talking to my classmates, I have found myself frequently explaining the rationale behind common programming no-nos. Though many students have been told that we should “avoid using goto,” and we are often told “don’t make everything public,” a lot of these “pariahs” of programming are never actually explained; they are simply re-iterated. To help combat this, I plan on posting the why behind these words of warning in a series of posts.

So, on to today’s topic: Why not make everything public?

Perhaps the most obvious benefit to private data members is that you can restrict other classes’ access to that data member. Providing an accessor (getter) gives users read access, and providing a mutator (setter) gives them write access. By providing one, both, or neither, you can control whether users have read access, write access, read/write access, or no access to your class’s data field; the choice is yours. First and foremost, this helps make your code safer. For example, you cannot accidentally change a value that was not meant to be altered if your code cannot change that value in the first place. Second, your class’s interface (in the sense of what methods, fields, etc. programmers using your class are supposed to use, not in the “ISerializable” sense) is dictated by what users have access to publicly. In other words, the public parts of your class dictate how your class is meant to be used. If the interface is cluttered with implementation details, that intent becomes a lot more ambiguous.

Another important reason to declare data members private is encapsulation. If you don’t know, or perhaps don’t fully understand, what encapsulation is, read this article for a solid explanation. Encapsulation allows you to change implementation details with minimal (or, ideally, no) effect on code outside your class. You could add logging features to a method call, add verification for pre-/postconditions (e.g. with assert statements), add thread synchronization, clamp input values, or whatever else you can think of. If your implementation is not encapsulated well, although it is still possible to add such functionality, it’s a much larger headache – you’ll have to re-write all the code that used the (now-broken) old implementation. Thus, even if you’re using only your own code, encapsulation is beneficial.

A quick side note: protected is (for all intents and purposes) no more encapsulated than public. The main idea behind encapsulation is to minimize the chance of other code breaking when you change something. If another class derives from yours, it has access to all of your class’s protected members, and thus you’re still stuck with the same dilemmas as with public members. With this in mind, a good rule of thumb is to prefer private data members to public or protected whenever possible.

Finally, a not-so-obvious reason is for consistency. Scott Meyers explains this one excellently in his book Effective C++: “If data members aren’t public, the only way for clients to access an object is via member functions. If everything in the public interface is a function, clients won’t have to scratch their heads trying to remember whether to use parentheses when they want to access a member of the class. They’ll just do it, because everything is a function. Over the course of a lifetime, that can save a lot of head scratching.”

References / Recommended further reading

Effective C++: 55 Specific Ways to Improve Your Programs and Designs, Third Edition by Scott Meyers, pp. 94-98.

P.S. Thanks to Somara Atkinson for inspiring this.