Handling Bad Inputs Early in the Book

Published on 11 Jun 2022.

This question comes up on Discord all the time:

How do I make sure that bad/garbage user input doesn’t crash my program, and is handled reasonably well?

This is going to be a long one, so let’s just start with the TL;DR: Validating input and ensuring the code reasonably handles garbage data is an important thing to be doing, but early on in the book, we just don’t have the tools to do that well. So it is safe to just assume that all inputs are good inputs in your code, or at least that “garbage in, garbage out” is good enough. Over the course of the book, you’ll pick up a large set of tools to help with checking inputs, and you’ll be able to start applying those as you go. But it’s okay to just worry about the “happy path” for the short term.

In the long run, TryParse is what I’d typically use, but it demands output parameters, which are a bit complicated. You could, of course, start using it without fully understanding how it works, and there wouldn’t be any huge repercussions.

But in the next edition of the book, I plan on weaving in another “thread” to the reader’s progression. (Not sure of a better way to phrase that.)

Here’s a preview of generally what I expect to do. (Feedback appreciated.)

Even as early as Level 3 (Hello World), I plan on saying something like this, with a (!) by it to indicate that it’s an important point/pitfall:

Almost as soon as you start using Console.ReadLine(), you have to start thinking about the implications of the user entering bad data. Unless the set of all possible strings and the utter lack of a string are expected, your program may not work correctly if you just take what the user types and try to run with it. In certain situations, this could result in your program crashing and ending prematurely. Which is far from ideal.

But how bad is it, really, to have a program crash because of bad input?

That depends on the nature of the program. If your program was controlling the nuclear launch codes for half the planet and crashing meant “launch the nukes,” then (aside from needing to change that design immediately) it is extremely critical to detect and reject bad inputs. On the other hand, if the program is for your own use only, and is a little tool meant to do something simple like, say, calculate x in the quadratic equation, then you already know what good data looks like, and you know how to re-run the program if you fat-finger your inputs. The stakes are low, bad inputs are unlikely, and mistakes are easy to recover from.

As a general principle, it is wise to account for bad user inputs. However, at this point in time, we simply don’t have a very rich set of tools available to detect and recover from errors. With the small programs shown in the book and with the challenges you’ll do throughout the early parts of the book, we’re in that second scenario, where the stakes are low, the user’s knowledge of expected inputs is high (it will probably only be you running your programs), and the time to recover is small.

To that end, for the short term, it is completely reasonable to write code that simply assumes the inputs are good, even if that means crashing in the rare cases where the inputs are not good. Over the course of this book, we’ll pick up more tools to do better error handling, and be able to start adding those to the mix as we go.

Then in Level 6 (The C# Type System, where Convert is introduced) I’m planning on adding in something like this in the section about Convert and [type].Parse:

We’re going to use methods like Convert.ToInt32 and int.Parse very heavily, moving forward. The console window has no real mechanism for getting input beyond raw text and characters, and these methods allow us to take that text and change it into other types. But what happens if we feed Convert.ToInt32 something that can’t be turned into a number? What if we call it with Convert.ToInt32("Hello, World!") or even Convert.ToInt32("four")? (Note, on that last one, that Convert.ToInt32 is expecting digit characters, and doesn’t understand the spelled-out version of numbers.)

If you feed invalid data to Convert’s methods or to the various Parse methods, they crash. There are things we can do about that. In Level 35, we’ll learn how to handle these types of errors before they crash the program. In Level 34, we’ll learn about some other options that don’t blow up on bad data. But both of those require a bit more knowledge than we actually have right now.
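
To make that concrete, here is a minimal sketch of what these methods do with good and bad text. The variable names are made up for illustration. The two failing calls are commented out so the program runs to completion; un-commenting either one should end the program with a System.FormatException.

```csharp
using System;

// Text made of digit characters (with an optional sign) converts fine.
int fromConvert = Convert.ToInt32("42");
int fromParse = int.Parse("-7");

Console.WriteLine(fromConvert + fromParse); // Displays 35

// Neither of these strings is made of digit characters, so each call
// would throw a System.FormatException and crash the program:
// int greeting = Convert.ToInt32("Hello, World!");
// int spelledOut = Convert.ToInt32("four");
```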

We’ll get there soon enough, but for now, it is still reasonable to just assume the user will enter good data, knowing that for the programs we’re currently making, the stakes are low, and it is easy to just run the program again.

Then in maybe Level 9 or 10, where if statements and switch statements are covered, I might add something like:

I’ve been promising that we’ll learn tools for handling bad user input as we go. The default case gives us our first tool to handle bad inputs. In situations where there’s a fixed set of options to choose from, we can check for those specific cases in the other arms of the switch, and do something different if the input wasn’t one of the known, valid options.

This doesn’t handle all possible forms of bad input. While it works in this type of multiple-choice scenario, it wouldn’t really help us if the user needs to enter an arbitrary number.

We also currently have the limitation that if the user didn’t pick one of the known, good values, we’ve got to continue on anyway. We can maybe display an error message such as, “That wasn’t one of the choices!” but the show must be able to go on, which might mean assigning default values to any variables that needed to be updated. This particular limitation, we’ll account for in the next level when we discuss loops.
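
A rough sketch of that idea might look like the following. The menu, the variable names, and the choice of "North" as the fallback are all invented for illustration. The Console.SetIn line is only there to simulate a user typing 9 so the sketch runs unattended; remove it to type an answer yourself.

```csharp
using System;
using System.IO;

// Simulated keyboard input (an assumption, for illustration only).
Console.SetIn(new StringReader("9\n"));

Console.WriteLine("Pick a direction: 1) North  2) South  3) East");
string? choice = Console.ReadLine();

string direction;
switch (choice)
{
    case "1":
        direction = "North";
        break;
    case "2":
        direction = "South";
        break;
    case "3":
        direction = "East";
        break;
    default:
        // Not one of the known options. Report it, then assign a default
        // value, because the program has to continue either way.
        Console.WriteLine("That wasn't one of the choices!");
        direction = "North";
        break;
}

Console.WriteLine("Heading " + direction + ".");
```

Without a loop, the best the default arm can do is complain and pick something on the user's behalf.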

Then in Level 11, when loops are introduced, something like this:

With loops, we pick up another tool to help with input validation. We can structure our code to ask a question repeatedly until we get a valid answer. In the previous level, we were in a state where “the show must go on,” but with loops, we now have the option to stop the show temporarily, for as long as the user keeps giving us bad data.

This isn’t bulletproof, however. Our while loop is able to ensure that the user picks a number between 0 and 10 inclusive, but if they type in "asdf", we’re still in trouble, because Convert.ToInt32 can’t make that conversion correctly. We’ve begun picking up tools to help check user inputs, and we should start using them, but we still have more to learn in the future. We’ll start applying the tools that we know, but for other failure modes, we’ll still keep letting it slide until we learn better tools in the coming levels.
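
Here is one way the range check described above could be sketched. The prompts and variable name are assumptions, and Console.SetIn feeds in three simulated answers (15, -3, then 7) so the sketch runs without a keyboard; delete that line to try it interactively.

```csharp
using System;
using System.IO;

// Simulated input: two out-of-range answers, then a valid one.
Console.SetIn(new StringReader("15\n-3\n7\n"));

Console.Write("Enter a number between 0 and 10: ");
int number = Convert.ToInt32(Console.ReadLine());

// Keep asking for as long as the answer is outside 0 through 10.
while (number < 0 || number > 10)
{
    Console.Write("That is out of range. Enter a number between 0 and 10: ");
    number = Convert.ToInt32(Console.ReadLine());
}

Console.WriteLine("You picked " + number + ".");
```

The loop only guards the range. Typing "asdf" at either prompt would still throw a FormatException from Convert.ToInt32.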

I think the next place where this input validation comes up is Level 34, when output parameters are introduced, as well as the TryParse methods. This level does already call out the fact that TryParse is an improvement over Parse and Convert because it is able to handle bad inputs without blowing up. However, I think it deserves to be called out far more directly, with all of the other changes going into the book that I’ve described above:

At last! We’ve found a tool that makes it easy to handle bad inputs! Now that we’ve seen output parameters, and TryParse has become an option, this should generally be your go-to tool for safely parsing user inputs. In fact, you should typically only use Convert’s methods and the various Parse methods if you are confident the parsing will succeed or if you have other error handling wrapped around the code.
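
As a sketch of that go-to pattern, int.TryParse can be combined with a loop so the program keeps asking until the text really is a number. The prompts are invented, and Console.SetIn simulates a user typing "asdf" and then "42" so the sketch runs unattended; remove it to type your own answers.

```csharp
using System;
using System.IO;

// Simulated input: one garbage answer, then a valid one.
Console.SetIn(new StringReader("asdf\n42\n"));

int number;
Console.Write("Enter a whole number: ");

// TryParse returns false instead of crashing when the text isn't a number.
// On success, the parsed value is placed in the output parameter.
while (!int.TryParse(Console.ReadLine(), out number))
{
    Console.Write("That isn't a whole number. Try again: ");
}

Console.WriteLine("You entered " + number + ".");
```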

And then I think Level 35 needs a bit more added to more clearly illustrate that it is the culmination of the bad input journey. I do think that’s already reasonably clear, especially because it specifically brings up the failures of Convert.ToInt32, but I might add something like this:

With exception handling, we’ve finally achieved critical mass in our error handling and input validation journey. We’ve been assuming that user inputs are “good enough” up until now, but we now have all of the major tools in our inventory to ensure that even somebody trying to intentionally crash the software won’t be able to.

So is that it now? Handle all possible bad data with exception handling?

To answer that, let’s return to the beginning of our error handling journey once more. I mentioned that if your code controls the nuclear launch codes and getting it wrong launches all the nukes, you’d want to be extremely precise about user input, and reject anything bad. That’s true of many programs; it doesn’t take nukes to justify exception handling. If users or businesses stand to lose money, time, patience, or confidence in your software by having it crash, you’ll want to handle those exceptions to avoid that. But if you compare exception handling code against code that doesn’t handle exceptions, there’s definitely a cost to readability. That cost is typically well worth it, but it isn’t nothing. You may rightly decide that, for some low-stakes code, it’s acceptable to crash. Those situations are somewhat rare, and close to non-existent in any code that somebody might refer to as “production code,” used by the masses. But they do come up. (And experiments and practice exercises may often qualify.) Handling exceptions is usually worth it, but there are occasional cases where it’s not worth the trouble.

I do also wonder about adding something into Level 35 along these lines:

If you can write code in such a way that it can’t throw exceptions, that’s often desirable. Using int.TryParse is cleaner code than calling int.Parse and handling FormatExceptions and OverflowExceptions. You can find similar strategies in many other places as well.
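
To illustrate the comparison, here is a sketch that parses the same text both ways. The sample text and the decision to fall back to 0 are assumptions made for the example. The text is too large to fit in an int, so int.Parse takes the OverflowException path.

```csharp
using System;

string text = "99999999999"; // Stands in for user input; too large for an int.

// Option 1: int.Parse, with a handler for each exception it can throw here.
int parsed;
try
{
    parsed = int.Parse(text);
}
catch (FormatException)
{
    parsed = 0; // The text wasn't made of digit characters.
}
catch (OverflowException)
{
    parsed = 0; // The digits describe a number outside the range of int.
}

// Option 2: int.TryParse reports failure through its return value instead.
int tryParsed;
if (!int.TryParse(text, out tryParsed))
{
    tryParsed = 0; // Same fallback, with no exception handling needed.
}

Console.WriteLine(parsed + " " + tryParsed); // Displays 0 0
```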

And there might also be room for a comment like this:

Some exceptions represent exceptional circumstances that you can’t completely escape. A network connection becoming disconnected unexpectedly will generally throw an exception. That’s an extenuating circumstance that makes sense for the software to just handle.

But other exceptions represent programmer errors. NullReferenceException is a good example of this. It means you failed to check some value for null, but just went ahead and used it anyway. This shouldn’t be fixed by catching the exception, but by changing the code to ensure that either a null value is never used in that situation or to ensure that the code checks for null before attempting to use it. The exception indicates a definite problem, but the fix is not to just catch it. Rather, it is to fix the mistake in the code itself.
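
A small sketch of fixing the code rather than catching the exception might look like this. The variable names and the fallback length of 0 are assumptions. Console.SetIn supplies an empty input here, which makes Console.ReadLine return null and stands in for the "missing value" case.

```csharp
using System;
using System.IO;

// With no input available at all, Console.ReadLine returns null.
Console.SetIn(new StringReader(""));

string? name = Console.ReadLine();

// Using name.Length directly would throw a NullReferenceException when
// name is null. Wrapping that in a try/catch would only hide the mistake.
// Checking for null first removes the mistake itself.
int length;
if (name != null)
{
    length = name.Length;
}
else
{
    length = 0;
}

Console.WriteLine("Name length: " + length);
```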

That’s probably the end of the journey. I can’t think of any other part of the book I’d expect to change to weave this thread into the book.

I’m definitely interested in people’s thoughts on these additions in future books, so feel free to reach out to me and let me know what you think.