Sunday, June 29, 2014

Emotions and Programming

I've always been an emotional coder. For most of my early life, there was one dominant emotion that I experienced while coding: joy. I was that nerdy kid who was excited to get home from school most afternoons because it meant I could fire up my trusty IBM PC XT / 486 DX / Pentium II computer of the time, open up my beloved GWBASIC / Turbo Pascal / Borland Pascal / vim / Emacs editor/IDE of the time, and get back to whatever silly thing I had been working on until later the preceding evening than I'd like to admit to my mom.

I'd feel other emotions too, like uncertainty and surprise, but for the most part, I felt joy. At least that's how I remember it. Joy at making something out of nothing; joy at discovering something new; joy at deepening my understanding of some hitherto opaque mystery.

These days, I still get emotional when I write code. My emotions while programming today are a much more mixed bag, though. I still experience a taste of that childhood joy, but it's usually reserved for moments when a particular piece of code just works really well. Other times, I'm typically either stoic about the experience, or I'm feeling some form of frustration.

As far as negative sensations around code go, people talk about code smells, but that's not a good analogy for how I experience it. For me, smell is a sensation that's purely restricted to the physical realm (perhaps owing to my exposure to it as a parent of two children and owner of, or rather servant to, two dogs). Instead, I typically feel negative emotions related to programming in two ways: as pain in my back and upper shoulders, and as mental discomfort not unlike the experience of cognitive dissonance.

Let me tell you about some of the ways in which I get emotional about code.

Bad tools give me back pain. Literally. If I'm using a tool that's unfamiliar to me yet behaving in an erratic or unintuitive fashion, especially if that tool requires burdensome and unnecessary steps, or responds in a non-deterministic manner, I will eventually get back pain. First, my shoulders and upper back will tense up. Eventually this tension will extend to my lower back, and at that point, I'll start noticing a sharp or dull pain down my back. I try hard to maintain a relatively ergonomic work environment, and most of the time I can be comfortable at my desk for hours, but expose me to the default Windows command prompt window with its insane copy/paste interface, and it won't be long until I feel the manifestation of its crappiness pulsating down my back.

Weird, I know.

There are a few things that give me a sense of frustration similar to what I experience when I'm facing a tough decision and have conflicting opinions bouncing around in my head. One is not knowing exactly what to do next. Most of the time, when I'm coding, I have a pretty good sense of what the next step is that I need to take. Sometimes, though, I don't have enough constraints to make that clear. Or, I haven't made up my mind about some key design decision. Or, I'm struggling to find the right way to bend some particular API to my will. In any of these cases, I'll experience a sense of frustration that lasts until I've sorted out the particular problem.

Another source of frustration is something I seem to have picked up somewhere in my mid twenties: inability to test. If I can't write good, repeatable, automated test cases for something, I get very frustrated and find it incredibly difficult to churn out code. Sometimes I can get over this with the realization that testing something thoroughly and automatedly would just not be worth the effort. Sometimes the code is simple enough, and whatever it's based on is solid enough, that I simply have no concerns about whether it'll work or not. But most of the time, I know I could do better, and I remember all the pain that a lack of testing has cost me in the past, and I just don't feel right until I can bang out a few functional tests.

This becomes especially painful when I'm working on some code that's difficult to test, not because of some inherent reason, but simply because of how it's structured. I don't really buy into the TDD kool-aid wholeheartedly, and I scoff at the overuse of dependency injection and mocking that seems to be so common in some coding cultures, but I sure hate it when I know there's no real reason something can't be tested, and yet I'm prevented from doing so. Hate is a strong word, but it reflects the emotional state I experience when faced with such a situation.

Considering I spent the first 15 years or so of my "coding life" not writing a single test case (forgive me, I was a teenager for most of that time…) this still astounds me a little bit. My best guess is that the experience I gained when first implementing longer-term, commercial, shipping code and all the maintenance nightmares associated with it really stuck with me at a visceral level. Combine that with an instinct to automate all that can be automated and I am where I am today. Even as I'm writing this, I'm feeling mild discomfort at a few pieces of currently maintained code that I know aren't tested as well as they could or should be: a sort of potpourri of back pain, drowsiness, and primal guilt that evokes memories of secretly eating candy before I've had my supper as a five year old when I know I'm not supposed to.

Seriously, what's wrong with me?

Something I contemplate every now and then is whether this emotional attachment to the state of my code, and the process of developing it, is a good thing or a bad thing. Most of the time I've come to the conclusion that, like all emotion, it's probably a good thing as long as it doesn't completely dominate. Emotions help stir action and can be powerful motivators. When they work well, they nudge you in a positive direction. On the other hand, I could certainly use less back pain, and I know my partner would probably appreciate it if I came home not quite as emotionally drained from work more often.

Perhaps the worst disservice I do to myself is sometimes not listening to the emotions that are clearly there. "I've got a deadline to meet", I'll say to myself, as I plow along (un)happily churning out code that I know really should be tested better. I try to convince myself that "it's just work, and I shouldn't really feel strongly about it", stubbornly trying to ignore the very real emotional state that I happen to be in. That, I find, is when I get into trouble — sooner or later I'm staring at the blinking cursor in my editor, wondering how I got to this point, and considering whether Office Space had a point about construction being a much happier line of work than software development. Ignore your emotions, and soon you won't notice they're there anymore, because they have entirely engulfed you.

It's not always like that, though. I've noticed that I only seem to get (negatively) emotional about code that I'm writing to accomplish something. If I'm writing code purely for fun (something exceedingly rare these days, with my family life consuming most of my free time), I have none of these qualms. I get most emotional about the code that I'm writing when I'm trying to get somewhere: when the code that still needs to be written merely seems like an obstacle standing in my way. While I find not knowing what to do next frustrating, I also sometimes despise knowing exactly what code needs to be written, feeling almost as if I've lost my sense of free will by being forced by circumstance to churn out the "unavoidable thousand lines of code" needed to solve some problem. Creativity lies somewhere between uncertainty unbound by constraints and exact knowledge of what must occur next. Sometimes, it's a fine line.

That's not to say that writing code for work (or for any set goal other than coding for its own purpose) is always a negative experience. It's just that, when I don't have a particular purpose in mind, I never notice these emotions. More than anything, I think that's probably because I just avoid the stuff that bugs me in the first place when I'm coding for fun.

I have great respect for people who can maintain a Zen-like state of detachment in whatever it is they do, including coding. Detachment can be blissful and highly productive. I also have a great deal of empathy for people for whom coding is a creative, emotional, disturbing, joyful, frustrating, fulfilling experience. I think great talent exists in both camps, and sometimes both extremes are embodied by a single person over time.

I think I'll probably spend the rest of my programming life occasionally perplexed as to how it is that interacting with a machine, writing software in a completely abstract landscape, and something as mundane as testing can have such a strong emotional impact on myself and others in my field. And I expect I will need to learn, over and over again, how to harness those emotions positively. Perhaps most importantly, I'll need to remind myself to recognize them in others despite how clinically technical the act of programming can seem from the outside.

In the meantime, I'll continue to fret about those tests I haven't written. And maybe soon I'll get to write some of them, and the back pain will lessen a little bit.

Thursday, May 1, 2014

A C error handling style that plays nice with C++ exceptions

I've written a lot of library code using what I call an "hourglass" pattern: I implement a library (in my case, typically, using C++), wrap it in a C API which becomes the only entry point to the library, then wrap that C API in C++ or some other language(s) to provide a rich abstraction and convenient syntax. When it comes to native cross-platform code, C APIs provide unparalleled ABI stability, and portability to other languages via FFIs. I even restrict the API to a subset of C that I know is portable to a wide variety of FFIs and insulates the library from leaking changes in internal data structures — expect more on that in future blog posts.
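To make the hourglass shape concrete, here is a minimal sketch of the three layers in a single file. Everything in it (the libacc_ prefix, the Accumulator classes, the bare int status code) is made up for illustration and is not taken from any real library; error handling is deliberately simplistic at this point.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <vector>

// Bottom of the hourglass: the real implementation, free to use any C++.
namespace impl {
class Accumulator {
public:
  void add(int value) { _values.push_back(value); }
  long sum() const {
    long total = 0;
    for (int v : _values) total += v;
    return total;
  }
private:
  std::vector<int> _values;
};
}  // namespace impl

// Narrow middle: a small C API that is the only way into the library.
extern "C" {
typedef void* libacc_t;  // opaque handle; callers never see impl::Accumulator

libacc_t libacc_create(void) {
  // Non-throwing new: no C++ exception may escape through a C function.
  return new (std::nothrow) impl::Accumulator();
}
int libacc_add(libacc_t handle, int value) {
  try {
    static_cast<impl::Accumulator*>(handle)->add(value);
    return 0;
  } catch (...) {
    return -1;  // translate any C++ exception into a plain status code
  }
}
long libacc_sum(libacc_t handle) {
  return static_cast<impl::Accumulator*>(handle)->sum();
}
void libacc_destroy(libacc_t handle) {
  delete static_cast<impl::Accumulator*>(handle);
}
}  // extern "C"

// Top of the hourglass: a C++ wrapper that talks only to the C API.
namespace acc {
class Accumulator {
public:
  Accumulator() : _handle(libacc_create()) {
    if (_handle == nullptr) throw std::bad_alloc();
  }
  ~Accumulator() { libacc_destroy(_handle); }
  Accumulator(const Accumulator&) = delete;
  Accumulator& operator=(const Accumulator&) = delete;

  void add(int value) {
    if (libacc_add(_handle, value) != 0) throw std::runtime_error("libacc_add failed");
  }
  long sum() const { return libacc_sum(_handle); }
private:
  libacc_t _handle;
};
}  // namespace acc
```

In a real project the three layers would live in separate files (and the top layer possibly in another language entirely), with only the C declarations shared between them.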

C Error Reporting Styles

For now, I want to talk about the style of error reporting I've come to adopt for such APIs. Options for error reporting from C functions include the following:

  • Return an error code from functions that can fail.
  • Provide a function like Windows's GetLastError() or OpenGL's glGetError() to retrieve the most recently occurring error code.
  • Provide a global (well, hopefully, thread-local) variable containing the most recent error, like POSIX's errno.
  • Provide a function to return more information about an error, possibly in conjunction with one of the above approaches, like POSIX's strerror function.
  • Allow the client to register a callback when an error occurs, like GLFW's glfwSetErrorCallback.
  • Use an OS-specific mechanism like structured exception handling.
  • Write errors out to a log file, stderr, or somewhere else.
  • Just assert() or somehow else terminate the program when an error occurs.

There are loads of tradeoffs among these options, and it's a matter of opinion as to which ones are better than others. Like all matters of style, unless a style is just plain unsuitable (like the last two options above might be for many uses), it's probably most important to apply the style you choose consistently.
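As a small illustration of how some of these styles combine in practice, the sketch below wraps the standard strtol function, which reports range errors through errno, in a function that returns an error code instead. The names parse_result, parse_long, and describe_errno are invented for this example; strtol, errno, ERANGE, and strerror are the standard C facilities.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <cstring>
#include <string>

// Return-code style: every outcome is a value of this (hypothetical) enum.
enum parse_result { parse_ok = 0, parse_not_a_number, parse_out_of_range };

// Parse a base-10 integer from text and store it in *out_value on success.
parse_result parse_long(const char* text, long* out_value) {
  char* end = nullptr;
  errno = 0;  // errno style: strtol signals overflow by setting errno to ERANGE
  const long value = std::strtol(text, &end, 10);
  if (end == text) return parse_not_a_number;  // no digits were consumed
  if (errno == ERANGE) return parse_out_of_range;
  *out_value = value;
  return parse_ok;
}

// "More information" style: turn an errno value into a readable message.
std::string describe_errno(int code) { return std::strerror(code); }
```

The caller only ever sees the return code, while the errno convention stays hidden inside the wrapper.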

The Hourglass C Error Handling Style

Let me illustrate the style of error handling I use for hourglass C APIs with the following example function for a fictional library called libfoo:

/** Create a set of widgets.
  * Return libfoo_success and store the widgets in *out_widgets if the widgets were created
  * successfully, otherwise return
  *  - libfoo_invalid_argument if count is less than 1
  *  - libfoo_invalid_argument if out_widgets is NULL
  *  - libfoo_runtime_error if memory for the widgets could not be allocated
  */
libfoo_result_t libfoo_create_widgets(int count, libfoo_widget_container_t* out_widgets, 
                                      libfoo_error_details_t* out_error_details);

This made-up function creates objects of some sort and stores the created objects via a pointer passed in by the client. Assuming that libfoo_widget_container_t is a pointer type (perhaps a typedef to void*), a valid call to libfoo_create_widgets might look as follows:

libfoo_widget_container_t container = NULL;
libfoo_create_widgets(12, &container, NULL);

While valid, this call doesn't check for errors in any way, so a better approach might be:

libfoo_widget_container_t container = NULL;
if (libfoo_create_widgets(12, &container, NULL) != libfoo_success) {
   /* handle the error appropriately */
}

By now you might wonder what's up with that last parameter, out_error_details. The documentation for libfoo_create_widgets() doesn't mention it, and in the above examples we simply pass NULL into it. Elsewhere in our C API we might define and document the libfoo_error_details_t type and some related functions as follows:

/** Many functions in this API take a libfoo_error_details_t* argument as their last
  * parameter. Such an argument is always permitted to be NULL. If the argument is not NULL,
  * and the function returns a value other than libfoo_success, a libfoo_error_details_t
  * object will be stored at its location, from which more error information can be obtained.
  */
typedef void* libfoo_error_details_t;

/** Obtain a NULL-terminated string with more details about an error that occurred.
  */
const char* libfoo_error_details_c_str(libfoo_error_details_t error_details);

/** Free all resources associated with error_details. If error_details is NULL, do nothing.
  */
void libfoo_error_details_free(libfoo_error_details_t error_details);

Let's adjust our example call to libfoo_create_widgets() above to make use of this last parameter:

libfoo_widget_container_t container = NULL;
libfoo_error_details_t error = NULL;
if (libfoo_create_widgets(12, &container, &error) != libfoo_success) {
   printf("Error creating widgets: %s\n", libfoo_error_details_c_str(error));
   libfoo_error_details_free(error);
   abort(); // goodbye, cruel world!
}

While each (possibly failing) function in the API returns an error code, it also accepts an optional parameter, the error details object, that can provide additional error information. In the above example, we query the error details to retrieve a string with a (hopefully) human-readable description of the exact error that occurred.
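For readers who want to try the calls above, here is one way the library side might look. This is only a sketch under assumptions: the post never shows the definition of libfoo_result_t or what an error details object contains, so the enum values, the Widget struct, the fail() helper, the choice of a heap-allocated std::string as the details object, and libfoo_destroy_widgets() are all invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

extern "C" {
// Assumed definitions; only the names come from the post.
typedef enum { libfoo_success = 0, libfoo_invalid_argument, libfoo_runtime_error } libfoo_result_t;
typedef void* libfoo_widget_container_t;
typedef void* libfoo_error_details_t;
}

namespace {
struct Widget { int id; };

// Store a message if the caller asked for details, then hand back the code.
libfoo_result_t fail(libfoo_result_t code, const char* message,
                     libfoo_error_details_t* out_error_details) noexcept {
  if (out_error_details != nullptr) {
    try {
      *out_error_details = new std::string(message);
    } catch (...) {
      *out_error_details = nullptr;  // out of memory: only the code survives
    }
  }
  return code;
}
}  // namespace

extern "C" {
libfoo_result_t libfoo_create_widgets(int count, libfoo_widget_container_t* out_widgets,
                                      libfoo_error_details_t* out_error_details) {
  if (count < 1)
    return fail(libfoo_invalid_argument, "count must be at least 1", out_error_details);
  if (out_widgets == nullptr)
    return fail(libfoo_invalid_argument, "out_widgets must not be NULL", out_error_details);
  try {
    *out_widgets = new std::vector<Widget>(static_cast<std::size_t>(count));
    return libfoo_success;
  } catch (...) {
    // Exceptions must not escape through a C function, so translate them.
    return fail(libfoo_runtime_error, "could not allocate widgets", out_error_details);
  }
}

const char* libfoo_error_details_c_str(libfoo_error_details_t error_details) {
  if (error_details == nullptr) return "";
  return static_cast<std::string*>(error_details)->c_str();
}

void libfoo_error_details_free(libfoo_error_details_t error_details) {
  delete static_cast<std::string*>(error_details);  // deleting NULL is a no-op
}

// Hypothetical counterpart to libfoo_create_widgets(); not part of the post.
void libfoo_destroy_widgets(libfoo_widget_container_t widgets) {
  delete static_cast<std::vector<Widget>*>(widgets);
}
}  // extern "C"
```

Note that on success the details pointer is never written, which matches the documented contract that a details object is only stored when the function fails.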

If you think this form of error handling looks somewhat cumbersome from the point of view of someone writing plain C code, I agree with you! Remember that this API is the middle of an hourglass API, so it's really targeted primarily at being wrapped by another language that provides a more convenient interface. We could, though, make some tradeoffs if we wanted to make it less cumbersome — for example, we could reserve some space for the error details structure and only allow one valid error details object at a time, which would remove the need to have a libfoo_error_details_free() function.
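The "reserved space" tradeoff could be sketched roughly as follows. The ErrorSlot struct, the per-thread slot, the 256-byte limit, and the function libfoo_set_widget_count() are all assumptions made up for this example; the point is only that the details handle refers to storage owned by the library, so there is nothing for the client to free.

```cpp
#include <cassert>
#include <cstring>

extern "C" {
typedef enum { libfoo_success = 0, libfoo_invalid_argument } libfoo_result_t;  // assumed values
typedef void* libfoo_error_details_t;
}

namespace {
// One slot per thread; each new error overwrites the previous one.
struct ErrorSlot { char message[256]; };
thread_local ErrorSlot g_last_error;

libfoo_result_t fail(libfoo_result_t code, const char* message,
                     libfoo_error_details_t* out_error_details) {
  if (out_error_details != nullptr) {
    // Copy (and, if necessary, truncate) the message into the reserved slot.
    std::strncpy(g_last_error.message, message, sizeof(g_last_error.message) - 1);
    g_last_error.message[sizeof(g_last_error.message) - 1] = '\0';
    *out_error_details = &g_last_error;
  }
  return code;
}
}  // namespace

extern "C" {
// Hypothetical function that exists only to exercise the error path.
libfoo_result_t libfoo_set_widget_count(int count, libfoo_error_details_t* out_error_details) {
  if (count < 1)
    return fail(libfoo_invalid_argument, "count must be at least 1", out_error_details);
  return libfoo_success;
}

const char* libfoo_error_details_c_str(libfoo_error_details_t error_details) {
  return static_cast<ErrorSlot*>(error_details)->message;
}
}  // extern "C"
```

The price is that a details handle is only meaningful until the next failing call on the same thread, and that long messages get truncated.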

Moving up into C++

Let's focus, though, on the hourglass case, and let's say that we're wrapping this API in some C++ code. How might we handle errors from our C API there?

For starters, we might want to use an RAII type to wrap up our error details object, so we don't have to manually free it. In C++11, such a class might look as follows:

class ErrorDetails {
public:
  ErrorDetails()
  : _details(nullptr) // initialize _details to NULL
  {
  }
  
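  ~ErrorDetails()
  {
    // Note: libfoo_error_details_free() is safe to call on NULL values
    libfoo_error_details_free(_details);
  }

  // Copying would free the same details object twice, so forbid it.
  ErrorDetails(const ErrorDetails&) = delete;
  ErrorDetails& operator=(const ErrorDetails&) = delete;
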
  // Provide access to the underlying libfoo_error_details in a way that's
  // convenient for passing to the libfoo API.
  libfoo_error_details_t* get() { return &_details; }

  const char* c_str() const {
    return libfoo_error_details_c_str(_details);
  }

private:
  libfoo_error_details_t _details;
};

With this class, a C++ code snippet calling libfoo_create_widgets() might look as follows:

libfoo_widget_container_t container = nullptr;
ErrorDetails error;
if (libfoo_create_widgets(12, &container, error.get()) != libfoo_success) {
  std::cout << "Error creating widgets: " << error.c_str() << std::endl;
  std::terminate(); // goodbye, cruel world!
}

In addition to using RAII to automatically free the error details object when we're done with it, we've also wrapped libfoo_error_details_c_str() in a member function for convenience.

Using Exceptions

On top of all of the error handling styles available in C, C++ has an entire language construct just to deal with errors: exceptions. What if we wanted to throw an exception when we notice the error, instead of dramatically terminating the entire program? We might write something like:

libfoo_widget_container_t container = nullptr;
ErrorDetails error;
if (libfoo_create_widgets(12, &container, error.get()) != libfoo_success) {
  throw std::runtime_error(error.c_str());
}

As before, our use of RAII in defining ErrorDetails will ensure that the error details object is freed up as the exception propagates up the stack, and because std::runtime_error keeps its own copy of the message, nothing is left dangling. Great! And we can now handle the exception however we like somewhere higher up in the program. But, this is still pretty verbose: if we wanted to throw an exception for every error that occurs when making a call to a libfoo function, we'd need this kind of code repeated over and over again.

We can do better. To do so, we're going to legitimately do something that's usually considered a no-no in C++: we're going to throw from a destructor. Have a look at this class:

class ThrowOnError {
public:
  ~ThrowOnError() noexcept(false)
  {
    if (*_details.get() != nullptr) {
      throw std::runtime_error(_details.c_str());
    }
  }

  // Allow an instance of ThrowOnError to be passed directly as the last argument
  // of a call to a libfoo function.
  operator libfoo_error_details_t*() { return _details.get(); }
  
private:
  ErrorDetails _details;
};

Before we go into the details of ThrowOnError, let's look at how we would adjust our example call to libfoo_create_widgets() to use it:

libfoo_widget_container_t container = nullptr;
libfoo_create_widgets(12, &container, ThrowOnError());

Whoa! Our four-line function call is back down to a single line! If libfoo_create_widgets fails, for whatever reason, we throw an exception of type std::runtime_error, and we include whatever error message we got from the call to libfoo_create_widgets. And, as before, our error details will be cleaned up nicely because of our use of RAII in ErrorDetails.

I hope you agree that this is a lot less syntactic overhead than explicitly checking for an error and throwing every time.

How ThrowOnError Works

Let's tease apart what's happening in our final call to libfoo_create_widgets():

  1. First, a temporary object of type ThrowOnError is constructed. This, in turn, constructs its ErrorDetails member, which allocates space for a libfoo_error_details_t object, ErrorDetails::_details, and sets it to NULL.
  2. In order to pass the temporary ThrowOnError object to libfoo_create_widgets(), the ThrowOnError::operator libfoo_error_details_t* user-defined conversion function is invoked and returns a pointer to the ErrorDetails::_details member via ErrorDetails::get().
  3. Next, libfoo_create_widgets() is called. When it returns:
    • If the function succeeded, its last argument, the libfoo_error_details_t, has been left untouched.
    • If an error occurred, a non-NULL libfoo_error_details_t object has been assigned to the ErrorDetails::_details member.
  4. After libfoo_create_widgets() returns, but before the statement completes, the ThrowOnError destructor is called. At this point, if ErrorDetails::_details is not NULL, the destructor retrieves the error message and throws an exception.
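The sequence above can be observed with a small tracing experiment. The sketch below repeats the two classes from this post with trace points added, and uses a made-up stand-in, libfoo_check_count(), in place of a real libfoo function; the enum values and the std::string used as the details object are assumptions, as is the global g_trace log.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Events are appended here so their order can be inspected afterwards.
static std::vector<std::string> g_trace;

// Stand-in for the C API (assumed; a real one would be extern "C" and
// would never let an allocation failure escape as an exception).
enum libfoo_result_t { libfoo_success = 0, libfoo_invalid_argument };
typedef void* libfoo_error_details_t;

libfoo_result_t libfoo_check_count(int count, libfoo_error_details_t* out_error_details) {
  g_trace.push_back("3. C function runs");
  if (count >= 1) return libfoo_success;
  if (out_error_details != nullptr)
    *out_error_details = new std::string("count must be at least 1");
  return libfoo_invalid_argument;
}
const char* libfoo_error_details_c_str(libfoo_error_details_t d) {
  return static_cast<std::string*>(d)->c_str();
}
void libfoo_error_details_free(libfoo_error_details_t d) {
  delete static_cast<std::string*>(d);
}

class ErrorDetails {
public:
  ErrorDetails() : _details(nullptr) { g_trace.push_back("1. temporary constructed"); }
  ~ErrorDetails() { libfoo_error_details_free(_details); }
  ErrorDetails(const ErrorDetails&) = delete;
  ErrorDetails& operator=(const ErrorDetails&) = delete;
  libfoo_error_details_t* get() { return &_details; }
  const char* c_str() const { return libfoo_error_details_c_str(_details); }
private:
  libfoo_error_details_t _details;
};

class ThrowOnError {
public:
  ~ThrowOnError() noexcept(false) {
    g_trace.push_back("4. destructor runs");
    if (*_details.get() != nullptr) throw std::runtime_error(_details.c_str());
  }
  operator libfoo_error_details_t*() {
    g_trace.push_back("2. conversion operator called");
    return _details.get();
  }
private:
  ErrorDetails _details;
};

// Make one call and return the exception message ("" if nothing was thrown).
std::string call_and_trace(int count) {
  g_trace.clear();
  try {
    libfoo_check_count(count, ThrowOnError());
    g_trace.push_back("5. statement completed");
  } catch (const std::runtime_error& e) {
    return e.what();
  }
  return "";
}
```

After call_and_trace(12), g_trace holds all five entries in order; after call_and_trace(0), the fifth entry is missing because the destructor threw before the next statement could run.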

Throwing From Destructors

Usually, throwing from a destructor is considered a Bad Idea, and for good reason. As an exception makes its way up the stack, the stack is unwound and destructors get called, and if such a destructor lets another exception escape, std::terminate is called and the C++ program ends on the spot.

In this case, however, we have a legitimate reason for throwing from a destructor. If we stick to using ThrowOnError for its intended purpose — passing it as the last argument to C functions that use this style of error handling — we would need to be calling such a function from a destructor (directly, or indirectly via another function), for us to run the risk of a throw-during-unwind. That is no more or less risky than any other function call from a destructor.

This does suggest another rule for our hourglass C APIs, though: functions that are responsible for freeing resources, such as libfoo_error_details_free, should never fail, so as not to create the temptation to handle their failure using ThrowOnError.

You might wonder about the noexcept(false) decorating ThrowOnError's destructor, especially if you're unfamiliar with C++11. In C++11, the noexcept specifier was introduced to mark functions that don't throw exceptions. By default, functions are assumed to be able to throw exceptions, like in C++98, except for destructors. In C++11, by default, a destructor is assumed to be noexcept, and it takes an explicit noexcept(false) to mark it as throwing. If you leave this out, std::terminate will be called when the destructor throws, ending the program, just as happens with any other function marked noexcept if it throws.
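These defaults can be checked at compile time with the standard std::is_nothrow_destructible trait. The three structs and the destructor_may_throw() helper below are invented purely for the check.

```cpp
#include <cassert>
#include <type_traits>

struct DefaultDtor {
  ~DefaultDtor() {}  // no specifier: implicitly noexcept since C++11
};

struct ThrowingDtor {
  ~ThrowingDtor() noexcept(false) {}  // explicitly allowed to throw
};

// The implicit destructor of a class picks up noexcept(false) from a member.
struct HoldsThrowing {
  ThrowingDtor member;
};

// Small helper (hypothetical) so the property reads naturally at call sites.
template <typename T>
constexpr bool destructor_may_throw() {
  return !std::is_nothrow_destructible<T>::value;
}

static_assert(!destructor_may_throw<DefaultDtor>(), "destructors default to noexcept");
static_assert(destructor_may_throw<ThrowingDtor>(), "noexcept(false) opts out");
static_assert(destructor_may_throw<HoldsThrowing>(), "members propagate noexcept(false)");
```

The last assertion is worth keeping in mind: any class that stores a ThrowOnError-like object as a member silently gets a potentially throwing destructor too.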

Wrapping Up Loose Ends

By choosing a particular convention for a C API's error handling, we were able to very conveniently translate errors from the C API to C++ exceptions. You might wonder why the functions still bother to return an error code. I've found this helpful when writing C code that uses the C API directly, as it allows me to continue to handle errors using an if statement, and ignore the "error details" mechanism if I so wish by passing NULL as the last argument.

You might also wonder why we used an error details abstraction at all, rather than just making the last argument something like a char** that is set to the error message if an error occurred. Having an abstraction allows us to extend libfoo_error_details_t with additional functionality. For example, if we want to throw a different exception type based on the type of error that occurred, we can easily add an error_details_type() function that returns a libfoo_result_t given a libfoo_error_details_t. Then, we can use that function in the ThrowOnError destructor to decide what kind of exception we want to throw.
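A sketch of that extension might look like the following. The function name libfoo_error_details_type() (with the library prefix added), the Details struct, the stand-in libfoo_check_count(), the classify() helper, and the mapping from result codes to standard exception types are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for the C API (assumed; a real one would be extern "C" and
// would never let an allocation failure escape as an exception).
enum libfoo_result_t { libfoo_success = 0, libfoo_invalid_argument, libfoo_runtime_error };
typedef void* libfoo_error_details_t;

namespace {
struct Details { libfoo_result_t type; std::string message; };
}

libfoo_result_t libfoo_check_count(int count, libfoo_error_details_t* out_error_details) {
  if (count >= 1) return libfoo_success;
  if (out_error_details != nullptr)
    *out_error_details = new Details{libfoo_invalid_argument, "count must be at least 1"};
  return libfoo_invalid_argument;
}
libfoo_result_t libfoo_error_details_type(libfoo_error_details_t d) {
  return static_cast<Details*>(d)->type;
}
const char* libfoo_error_details_c_str(libfoo_error_details_t d) {
  return static_cast<Details*>(d)->message.c_str();
}
void libfoo_error_details_free(libfoo_error_details_t d) { delete static_cast<Details*>(d); }

class ErrorDetails {
public:
  ErrorDetails() : _details(nullptr) {}
  ~ErrorDetails() { libfoo_error_details_free(_details); }
  ErrorDetails(const ErrorDetails&) = delete;
  ErrorDetails& operator=(const ErrorDetails&) = delete;
  libfoo_error_details_t* get() { return &_details; }
private:
  libfoo_error_details_t _details;
};

class ThrowOnError {
public:
  ~ThrowOnError() noexcept(false) {
    const libfoo_error_details_t details = *_details.get();
    if (details == nullptr) return;
    // Copy the message first; the _details member frees the object afterwards.
    const std::string message = libfoo_error_details_c_str(details);
    switch (libfoo_error_details_type(details)) {
      case libfoo_invalid_argument: throw std::invalid_argument(message);
      default: throw std::runtime_error(message);
    }
  }
  operator libfoo_error_details_t*() { return _details.get(); }
private:
  ErrorDetails _details;
};

// Report which kind of exception, if any, a call produces.
std::string classify(int count) {
  try {
    libfoo_check_count(count, ThrowOnError());
    return "ok";
  } catch (const std::invalid_argument&) {
    return "invalid_argument";
  } catch (const std::runtime_error&) {
    return "runtime_error";
  }
}
```

Callers can then catch std::invalid_argument separately from other failures without ever looking at the C result code.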

Every experienced C++ programmer I know has opinions on how errors should be handled. I've found this style to be quite useful when layering C++ on top of C, and I think there's some beauty in the interplay between C and C++ here. Whether you love or hate exceptions, and whether you prefer returning error codes or setting errno, I hope you agree with me on that last part!


Edit: As pubby8 points out below, something to watch out for with this form of ThrowOnError is that it shouldn't be used more than once in an expression, since a second ThrowOnError destructor throwing while the first exception is still unwinding would terminate the program. In practice, with this particular C API convention, this is unlikely, but it's nonetheless something to watch out for and just goes to show that throwing from destructors is fraught with danger.