Code Complete: A Practical Handbook of Software Construction. Redmond, Wa.: Microsoft Press, 880 pages, 1993. Retail price: $35. ISBN: 1-55615-484-4. 

Buy Code Complete from Amazon.com.


16.1 Using gotos   

Computer scientists are zealous in their beliefs, and when the discussion turns to gotos, they get out their jousting poles, armor, and maces, mount their horses, and charge through the gates of Camelot to the holy wars.

No one quarrels with using gotos to emulate structured constructs in languages that don't support structured control constructs directly. The debate is about languages that support structured constructs, in which gotos are theoretically not needed. Here's a summary of the points on each side.

The Argument Against gotos

The general argument against gotos is that code without gotos is higher-quality code. The famous letter that sparked the original controversy was Edsger Dijkstra's "Go To Statement Considered Harmful" in the March 1968 Communications of the ACM. Dijkstra observed that the quality of a programmer's code was inversely proportional to the number of gotos the programmer used. In subsequent work, Dijkstra has argued that code without gotos can more easily be proven correct.

Code containing gotos is hard to format. Indentation should be used to show logical structure and gotos have an effect on logical structure. Trying to use indentation to show the logical structure of a goto, however, is difficult or impossible.

Use of gotos defeats compiler optimizations. Some optimizations depend on a program's flow of control remaining within a few statements. An unconditional goto makes the flow harder to analyze and reduces the ability of the compiler to optimize it. Thus, even if introducing a goto produces an efficiency at the source-language level, it may well reduce overall efficiency by thwarting compiler optimizations.

Proponents of gotos sometimes argue that they make code faster or smaller. But code with gotos is rarely the fastest or smallest possible. Donald Knuth's marvelous article, "Structured Programming with go to Statements," gives several examples of cases in which using gotos is slower and larger (Knuth 1974).

In practice, use of gotos tends to violate structured programming principles. Even if gotos aren't confusing when used carefully, once gotos are allowed, they spread through the code like termites in a rotting house and aren't used carefully. If any gotos are allowed, the bad creep in with the good, so it's better not to allow any of them.

Overall, in the two decades since publication of Dijkstra's original letter, experience has shown the badness of goto-laden code. In a survey of the literature, Ben Shneiderman concluded that the evidence supports the badness of the goto (Shneiderman 1980).

The Argument for gotos

The argument for the goto is characterized by advocating its use in specific circumstances rather than its indiscriminate use. Most arguments against gotos are based on indiscriminate use, rather than careful use. The goto controversy began when Fortran was the most popular language. Fortran lacked any presentable loop structures, and, in the absence of any good advice on programming structured loops with gotos, programmers wrote a lot of spaghetti code. Such code undoubtedly was correlated with low quality programs but has little to do with careful use of a goto to make up for a gap in a structured language's capabilities.

A well-placed goto can eliminate duplicate code. Duplicate code leads to problems with the two sets of code being modified differently. It increases the size of source and executable files. The bad effect of the goto is outweighed in such a case by the worse effect of duplicate code.

The gotos is useful in a routine that allocates resources, performs operations on those resources, then deallocates the resources. With gotos, you can cleanup in one section of code, and they reduce the danger of forgetting to deallocate the resources in each place you detect an error.

In some cases, gotos can result in faster and smaller code. Knuth's marvelous 1974 article cited a few cases in which gotos produce a legitimate gain (Knuth 1974).

Good programming doesn't mean eliminating gotos. Methodical decomposition, refinement, and selection of control structures automatically leads to goto-free programs in most cases. gotolessness is not the aim, but the outcome, and putting the focus on no gotos isn't helpful.

Two decades worth of research with gotos has been inconclusive in demonstrating their badness. In a survey of the literature, B. A. Sheil concluded that unrealistic test conditions, poor data analysis, and inconclusive results failed to support the claim that the number of bugs was proportional to the number of gotos (Sheil 1981). That criticism applies to Shneiderman's survey of the literature, cited in the argument against gotos, as well as other studies. Sheil did not conclude that gotos were good, rather that experimental evidence against them was not conclusive.

Finally, goto was included as part of the Ada language, the most carefully engineered programming language in history. Ada was developed long after both sides of the goto debate were fully developed, and, after considering all sides of the issue, included the goto.

The Phony goto Debate

The primary feature of most goto discussions is a shallow approach to the question. People on the "gotos are evil" side usually present a trivial code fragment that uses gotos, then show how easy it is to rewrite the code fragment without gotos. This proves mainly that it's easy to write trivial code without gotos. On the other hand, people on the "I can't live without gotos" side usually present a case in which eliminating a goto results in an extra comparison or two lines of duplicated code. The significance of the gain is questionable, and this proves mainly that there's a case in which using a goto results in one less comparison, rarely a significant gain on today's computers.

Most textbooks don't help since they merely provide a trivial example of rewriting code without a goto and think they're done. Here's an example of a trivial piece of code from such a textbook:

Pascal Example of Code that's Supposed to be Easy to Rewrite Without gotos

repeat 
   GetData( InputFile, Data ); 
   if eof( InputFile ) then 
      goto LOOP_EXIT; 
   DoSomething( Data );
until ( Data = -1 );
LOOP_EXIT:

The book quickly replaces this with gotoless code:

Pascal Example of Supposedly Equivalent Code, Rewritten Without gotos

GetData( InputFile, Data );
while ( not eof( InputFile ) and ( Data <> -1 ) do 
   begin 
   DoSomething( Data ); 
   GetData( InputFile, Data ) 
   end;

This "trivial" example is disguised because it contains an error. In the case in which Data equals -1, the translation detects the -1 and exits the loop before executing DoSomething(). The original code executes DoSomething() before the -1 is detected. In other words, a programming book trying to show how easy coding is using only structured programming techniques translated its own example incorrectly! The author of that book shouldn't feel too bad, however, because other books make similar mistakes! Even the pros have difficulty achieving gotoless nirvana.

Here's a faithful translation of the code with no gotos:

Pascal Example of Truly Equivalent Code, Rewritten Without gotos

repeat 
   GetData( InputFile, Data ); 
   if ( not eof( InputFile )) then 
      DoSomething( Data );
until ( Data = -1 or eof( InputFile ) );

Even with a correct translation of the code, the debate is still phony because it trivializes the case in which a goto is needed. Cases like this are not the ones in which thoughtful programmers choose a goto as the preferred form of control.

It would be hard by now to add anything worthwhile to the theoretical debate about gotos. One level of discussion isn't usually addressed, however, and that's the situation in which a programmer who is fully aware of gotoless alternatives chooses to use a goto on the basis of its readability and maintainability.

The following sections present cases in which some experienced programmers argue for using gotos. The sections provide examples of rewriting the code without gotos, and discuss the tradeoffs between the various versions.

Error Processing and gotos

Writing highly interactive code creates additional programming demands. In particular, it demands that you pay a lot of attention to error processing and cleaning up resources when errors occur. Here's an example of code that purges a group of files. It first gets a group of files to purge, then finds each file, opens it, overwrites it, and erases it. It checks for errors at each step:

Pascal Example of Code with gotos that Processes Errors and Cleans up Resources

PROCEDURE PurgeFiles( var ErrorState: ERROR_CODE );
{ This routine purges a group of files }
var
   FileIndex:        Integer;
   FileHandle:       FILEHANDLE_T;
   FileList:         FILELIST_T;
   NumFilesToPurge:  Integer;

label
   END_PROC;

begin
   MakePurgeFileList( FileList, NumFilesToPurge );

   ErrorState := Success;
   FileIndex   := 0;
   while ( FileIndex < NumFilesToPurge ) do
      begin
      FileIndex := FileIndex + 1;

      if not FindFile( FileList[ FileIndex ], FileHandle ) then
         begin
         ErrorState := FileFindError;
         goto END_PROC
         end;

      if not OpenFile( FileHandle ) then
         begin
         ErrorState := FileOpenError;
         goto END_PROC
         end;

      if not OverwriteFile( FileHandle ) then 
         begin
         ErrorState := FileOverwriteError;
         goto END_PROC
         end;

      if Erase( FileHandle ) then
         begin
         ErrorState := FileEraseError;
         goto END_PROC
         end

      end; { while }

   END_PROC:

   DeletePurgeFileList( FileList, NumFilesToPurge )

end; { PurgeFiles }

This routine is typical of circumstances in which experienced programmers select a goto. Other, similar cases occur when a routine needs to allocate and clean up resources such as memory or handles to fonts, windows, brushes, and printers. The alternative to gotos in those cases is usually duplicating code to clean up resources. In such cases, a programmer might balance the evil of the goto against the maintenance headache of duplicate code and decide that the goto is the lesser evil.

You can rewrite the above routine in a couple of ways that avoid gotos, and you make tradeoffs in both cases. Here are the possible rewrite strategies:

Rewrite with Nested if Statements. To rewrite with nested if statements, nest the if statements so that each is executed only if the previous test succeeds. This is the standard, textbook, structured-programming approach to eliminating gotos. Here's a rewrite of the routine using the standard approach:

Pascal Example of Code that Avoids gotos by using Nested ifs

PROCEDURE PurgeFiles( var ErrorState: ERROR_CODE );

{ This routine purges a group of files }

var
   FileIndex:        Integer;
   FileHandle:       FILEHANDLE_T;
   FileList:         FILELIST_T;
   NumFilesToPurge:  Integer;

begin
   MakePurgeFileList( FileList, NumFilesToPurge );

   ErrorState  := Success;
   FileIndex    := 0;

   while ( FileIndex < NumFilesToPurge and ErrorState = Success ) do
      begin
      FileIndex := FileIndex + 1;

      if FindFile( FileList[ FileIndex ], FileHandle ) then
         begin
         if OpenFile( FileHandle ) then
            begin
            if OverwriteFile( FileHandle ) then 
               begin
               if not Erase( FileHandle ) then
                  begin
                  ErrorState := FileEraseError
                  end
               end 
            else { couldn't overwrite file }
               begin
               ErrorState := FileOverwriteError
               end
            end 
         else { couldn't open file }
            begin
            ErrorState := FileOpenError
            end
         end 
      else { couldn't find file }
         begin
         ErrorState := FileFindError
         end

      end; { while }

   DeletePurgeFileList( FileList, NumFilesToPurge )

end; { PurgeFiles }

For people used to programming without gotos, this code might be easier to read than the goto version, and if you use it, you won't have to face an inquisition from the goto goon squad.

The main disadvantage of this approach is that the nesting level is deep. Very deep. Deeeeeeeep. With nesting like this, to understand the code, you have to keep the whole set of nested ifs in your mind at once. Moreover, the distance between error-processing code and code that invokes it is too far: the code that sets ErrorState to FileFindError, for example, is 23 lines from the if statement that invokes it.

With the goto version, no statement is more than four lines from the condition that invokes it. Moreover, it doesn't require that you keep the whole structure in your mind at once. You can essentially ignore any preceding conditions that were successful and focus on the next operation. For these reasons, in this case, the goto version is more readable and more maintainable than the nested-if version.

Rewrite with a Status Variable. To rewrite with a status variable (also called a state variable), create a variable that indicates whether the routine is in an error state or not. In this case, the routine already uses the ErrorState status variable, so you can use that:

Pascal Example of Code that Avoids gotos by Using a Status Variable

PROCEDURE PurgeFiles( var ErrorState: ERROR_CODE );

{ This routine purges a group of files }

var
   FileIndex:        Integer;
   FileHandle:       FILEHANDLE_T;
   FileList:         FILELIST_T;
   NumFilesToPurge:  Integer;

begin
   MakePurgeFileList( FileList, NumFilesToPurge );

   ErrorState := Success;
   FileIndex  := 0;
   while ( FileIndex < NumFilesToPurge ) and ( ErrorState = Success ) do
      begin
      FileIndex := FileIndex + 1;

      if not FindFile( FileList[ FileIndex ], FileHandle ) then
         begin
         ErrorState := FileFindError
         end;

      if ( ErrorState = Success ) then
         begin
         if not OpenFile( FileHandle ) then
            begin
            ErrorState := FileOpenError
            end
         end;

      if ( ErrorState = Success ) then
         begin
         if not OverwriteFile( FileHandle ) then 
            begin
            ErrorState := FileOverwriteError
            end
         end;

      if ( ErrorState = Success ) then
         begin
         if not Erase( FileHandle ) then
            begin
            ErrorState := FileEraseError
            end
         end

      end; { while }

   DeletePurgeFileList( FileList, NumFilesToPurge )

end; { PurgeFiles }

The advantage of the status-variable approach is that it avoids the deeply nested if-then-else structures of the first rewrite, so it's easier to understand. It also places the action following the if-then-else test closer to the test than the first rewrite and completely avoids else clauses.

Understanding the nested-if version requires substantial mental gymnastics, but this version is easier to understand because it closely models the way people think about the problem. You find the file. If everything is OK, you open the file. If everything is still OK, you overwrite the file. If everything is still OK, ...

The disadvantage of this approach is that the using status variables isn't as common a practice as it should be. Document it carefully, or some programmers might not understand the general approach. In this example, the use of well-named enumerated types helps significantly.

Comparison of Approaches

Each of the three methods has something to be said for it. The first avoids unnecessary tests and deep nesting but has gotos. The second avoids gotos but is deeply nested and gives an exaggerated picture of the logical complexity of the routine. The third avoids gotos and deep nesting but introduces extra tests.

The last approach is slighty preferable to the first two because it's more readable and models the problem better, but that doesn't make it the best approach in all circumstances. Any of these techniques works well when applied consistently to all the code in a project. Consider all the factors that have been presented, then make a project-wide decision about which method to favor in your programs.

gotos and Sharing Code in an else Clause

One challenging case in which some programmers would use a goto is the case in which you have two conditional tests and an else clause, and want to execute code in one of the conditions and the other else clause. Here's an example of a case that could drive someone to goto:

C Example of Sharing Code in an else Clause with a goto

if ( StatusOK )
   {
   if ( DataAvail )
      {
      ImportantVar = x;
      goto MID_LOOP;
      }
   }
else
   {
   ImportantVar = GetVal();

   MID_LOOP: 

   /* lots of code */
   ...
   }

This is a good example because it's logically tortuous-it's nearly impossible to read as it stands and one of the hardest cases to rewrite correctly without a goto. If you think you can easily rewrite it without gotos, ask someone to review your code! Several expert programmers have rewritten it erroneously.

You can rewrite it in several ways. You can duplicate code, put the common code in a routine and call it from two places, or retest the conditions. The rewrite won't be as fast as the original in most languages, but it will be almost as fast. Unless the code is in a really hot loop, rewrite it without thinking about efficiency.

The best rewrite is to put the /* lots of code */ part in its own routine. You can then call the routine in the places you would otherwise have used as the origin or destination of a goto and preserve the original structure of the conditional. Here's how it looks:

C Example of Sharing Code in an else Clause by Putting Common Code in a Routine

if ( StatusOK )
   {
   if ( DataAvail )
      {
      ImportantVar = x;
      DoLotsOfCode( ImportantVar );
      }
   }
else
   {
   ImportantVar = GetVal();
   DoLotsOfCode( ImportantVar );
   }

Normally, writing a new routine (or a macro in C) is the best approach. Sometimes, however, it's not practical to put duplicated code in its own routine. In this case you can work around it by restructuring the conditional so that you keep the code in it rather than in a new routine. Here's how it looks:

C Example of Sharing Code in an else Clause Without a goto

if ( (StatusOK && DataAvail) || ! StatusOK )
   {
   if ( StatusOK && DataAvail )
      ImportantVar = x;
   else
      ImportantVar = GetVal();

   /* lots of code */
   ...
   }

This is a faithful and mechanical translation of the logic in the goto version. It tests StatusOK two extra times and DataAvail one, but the code is equivalent. If retesting the conditionals bothers you, notice that the value of StatusOK doesn't need to be tested twice in the first if test. You can also drop the test for DataAvail in the second if test. Try rewriting it yourself if you want the practice.

Summary of Guidelines for Using gotos

Use of gotos is a matter of religion. My dogma is that in modern languages, you can easily replace nine out of ten gotos with equivalent structured constructs. In these simple cases, you should replace gotos out of habit. In the hard cases, you can still exorcise the goto in nine out of ten cases. In these cases, you can break the code into smaller routines; use nested ifs; test and retest a status variable; or restructure a conditional. Eliminating the goto is harder in these cases, but it's good mental exercise, and the techniques discussed in this section give you the tools to do it.

In the remaining one case out of 100 in which a goto is a legitimate solution to the problem, document it clearly and use it. If you have your rain boots on, it's not worth walking around the block to avoid a mud puddle. But keep your mind open to gotoless approaches suggested by other programmers. They might see something that you don't.

Here's a summary of guidelines for using gotos:

  • Use gotos to emulate structured control constructs in languages that don't support them directly. When you emulate structured constructs, emulate them exactly. Don't abuse the extra flexibility the goto gives you.
  • Don't use gotos when an equivalent structured construct is available.
  • Measure the performance of any goto used to improve efficiency. In most cases that use gotos, you can recode them without gotos with improved readability and no loss in efficiency. If your case is the exception, document the efficiency improvement so that gotoless evangelists won't remove it when they see it.
  • Limit yourself to one goto label per routine, unless you're emulating structured constructs.
  • Limit yourself to gotos that go forward, not backward, unless you're emulating structured constructs.
  • Make sure all goto labels are used. They might be a symptom of missing code, namely the code that goes to them. If they're not used, delete them.
  • Make sure a goto doesn't create unreachable code.
  • If you're a manager, adopt the perspective that a battle over a single goto isn't worth the loss of the war. If the programmer is aware of the alternatives and is willing to argue, the goto is probably OK.

Further Reading

These articles contain the whole goto debate. It erupts from time to time in most workplaces, textbooks, and magazines, but you won't hear anything that wasn't fully explored 20 years ago.

Dijkstra, E. "GOTO Statement Considered Harmful," Communications of the ACM, v. 11, no. 3, March 1968, pp. 147-8. This is the classic paper in which Dijkstra put the match to the paper and ignited a controversy that shows no signs of abating.

Wulf, W. A. "A Case Against the GOTO," Proceedings of the 25th National ACM Conference, August 1972, pp. 791-97. This paper was another argument against the indiscriminate use of gotos. Wulf argued that if programming languages provided adequate control structures, gotos would become largely unnecessary. Since the paper was written in 1972, languages such as Pascal, C, and Ada have proven him correct.

Knuth, Donald. "Structured Programming with go to Statements," 1974, in (Yourdon 1979), pp. 259-321. This long paper isn't entirely about gotos, but it includes a horde of examples of code that's made more efficient by eliminating gotos and another horde of examples of code that's made more efficient by adding them.

Rubin, Frank. "'GOTO Considered Harmful' Considered Harmful" (letter to the editor), Communications of the ACM, vol. 30, no. 3 (March 1987), pp. 195-6. In this rather hot-headed letter to the editor, Rubin asserts that gotoless programming has cost businesses "hundreds of millions of dollars." He then offers a code fragment that uses a goto and argues that it's superior to gotoless alternatives.

The response that the letter generated was more interesting than the letter itself. For five months, the CACM published letters that offered different versions of Rubin's original 7-line program. The letters were evenly divided between those defending gotos and those castigating them. Readers suggested roughly 17 different rewrites, and the rewritten code fully covered the spectrum of approaches to avoiding gotos. The editor of Communications of the ACM noted that the letter had generated more response by far than any other issue ever considered in pages of the CACM.

For the follow-up letters, see:

Communications of the ACM, vol. 30, no. 5 (May 1987), pp. 351-355.
Communications of the ACM, vol. 30, no. 6 (June 1987), pp. 475-478.
Communications of the ACM, vol. 30, no. 7 (July 1987), pp. 632-4.
Communications of the ACM, vol. 30, no. 8 (August 1987), pp. 659-62.
Communications of the ACM, vol. 30, no. 12 (December 1987), pp. 997, 1085.

This material is Copyright 1993 by Steven C. McConnell. All Rights Reserved.

 

Email me at stevemcc@construx.com.