Archive for the ‘Programming’ Category

Easy Cocoa Setup Assistants with TESetupAssistant

Sunday, January 31st, 2010

Setup assistants can be a great tool when you need to guide users through a series of steps.

TESetupAssistant was born during my work on the 2.0 update to Espionage, when I discovered that many of its UI elements could stand to benefit from a generic setup assistant class.

The gallery below shows some of the places in Espionage where we use TESetupAssistant, illustrating its versatility:

You can create these sorts of UIs very quickly and use them wherever the user needs to complete a series of steps, or even a single step (as shown in the first image above). Here’s a basic overview of it:

There are two main classes: TESetupAssistant and TEBaseAssistant. TESetupAssistant is associated with a nib file that determines the overall layout. It manages one or more assistants, each of which inherits from TEBaseAssistant. Each TEBaseAssistant subclass has its own nib file, usually just containing a single NSView container object.

Example of a Modal Assistant

Here’s an example of a very simple assistant using the minified UI and running modally:

And here is all the code needed to create it:

#import <Cocoa/Cocoa.h>
#import "TESetupAssistant.h"

@interface MiniAssistant : TEBaseAssistant {
    IBOutlet NSTextField *textField;
}
@end

@implementation MiniAssistant
- (NSArray *)orderedSteps
{
    return [NSArray arrayWithObject:@"Mini Step"];
}
- (void)start
{
    [[controller nextButton] setTitle:@"Finish"];
    [textField setStringValue:NSSTR_FMT(
        "Hi there! I'm a mini-assistant %srunning modally!",
        [controller modal] ? "" : "not ")
    ];
}
@end

// To run it modally is just 4 lines of code (somewhere):

TESetupAssistant *sa = [[[TESetupAssistant alloc] initMini] autorelease];
[sa setModal:YES];
[sa addAssistant:[MiniAssistant assistant]];
[sa run];

Notice that we don’t even need to load the nib file. That’s because our nib file is named after our assistant (MiniAssistant.nib). You can of course override the -assistantNib method to specify a different one, but this illustrates one key aspect of TESetupAssistant: there are sensible defaults for almost everything, allowing you to quickly throw together these kinds of interfaces.

Get It On Bitbucket

I’m releasing TESetupAssistant as open source under a liberal license (just an attribution is asked for), and I’ve included a little demo app to help you hit the ground running. You can download it on bitbucket.

If you use it in your application I’d love to know! Shoot us an email or post a comment below and I’ll place a link here to your app.

Enjoy! :)

You can follow me on twitter here.

How newLISP Took My Breath (And Syntax) Away

Friday, January 8th, 2010

A few years ago, a little-known language called newLISP completely changed my understanding of what “good” programming languages look like.

Why newLISP?

Before saying another word, I’d like to address the question that some of my LISP-familiar readers may be asking right now: Why newLISP? Why not Clojure, Scheme, or Common LISP?

The answer is that after evaluating these dialects of LISP, I’ve come to the conclusion that newLISP has several important advantages over other LISPs.

Today, unfortunately, whenever someone mentions newLISP on an online forum frequented by adherents of another LISP, an almost clan-like flame war will erupt. An adherent of one particular dialect, usually someone who has never used newLISP, will shout obscenities such as, “newLISP goes back on just about every advancement made by LISPs”, or “dynamic scope is evil!”

The historical context out of which these sentiments are born is mostly unknown to those observing the debate. As signal gives way to noise, the discussion collapses, the melee disperses, and the bystanders go back to sipping their Java and eating their Pythons.

It is fortunate, I think, that my introduction to the language did not sprout out of one of these battles. It was, for the most part, largely unbiased.

Something called LISP

While attending the University of Florida, I had signed up for a course on artificial intelligence. Our professor predictably introduced the class to a programming language called LISP.

Up to that point in time my knowledge of LISP consisted of the usual hearsay and mantra of those unfamiliar with the language:

“People who like it are crazy zealots who think they’re superior to everyone!”
“It’s mainly used for artificial intelligence stuff.”
“No one uses it for practical purposes.”
“It’s mainly used as a research language.”
“It stands for ‘Lots-of-Irritating-Silly-Parenthesis’. Haha!”

You get the idea. I did not have a clue as to what it was, but I was excited to finally be forced into the position of having to find out. My grade depended on it, after all.

What I discovered was that LISP did in fact live up to its reputation of zealot-inducing awesomeness. Simply being exposed to some of the basic concepts and philosophies of LISP had an immediate and positive impact on my abilities as a programmer.

Syntax: Programmer Enemy #1

LISP’s relative lack of syntax was perhaps the greatest insight, for I immediately felt as though a great weight had disappeared. I realized that it was syntax that was at the root of most programming errors. It was syntax that created a subconscious burden that I had simply been unaware of, causing errors and bugs and making it difficult to simply turn my thoughts into code. Despite having many years of experience with C-based languages, and being intimately familiar with their syntax, I realized that it was nevertheless a totally unnecessary burden that was slowing me, and everyone else, down.

This is one great advantage of Lisp-like languages: They have very few ways of forming compound expressions, and almost no syntactic structure. . . . After a short time we forget about syntactic details of the language (because there are none) and get on with the real issues.

—Abelson and Sussman

It’s not just the size of the syntax that matters, it’s what you can do with it. What requires layers of special syntax in languages like PHP, Python and Ruby, LISP can do with its basic concepts of lists, functions and symbols. It can do everything those languages can do, in a more elegant fashion, and still have enough tricks left up its sleeve to accomplish feats that are simply not possible in other languages.

Common LISP: A Series of Unfortunate Mistakes

Despite all of these exciting discoveries I still had an uneasy feeling.

Common LISP (CL) was a great departure from what I had known previously, but it reeked of antiquity, and worse, its syntax simply wasn’t very well thought out. The “Zen-like” perfection that it seemed to be yearning for was missing. It was a feeling that Mac users are all too familiar with: There were too many buttons, and most of them were unnecessary.

Common LISP Syntax in a Nutshell

Compared to C++, that’s fantastic. Then again, compared to C++, most languages appear favorable. Having had a taste of the liberty offered by the drop in syntax from C/C++/Java to CL, I did not see why all of this syntax was necessary, and indeed, most of it wasn’t.

Every little piece of syntax that’s introduced into a language adds to the programmer’s mental burden, be he conscious of it or not. I was not prepared to spend the effort learning Common LISP if a better alternative could be found.

What do you mean by “syntax”?

You may object to my inclusion of the functions defvar, =, eq, eql, etc. as they are functions. I include them because they constitute low-level functionality that cannot be expressed in the syntax of LISP itself. In other words, the functions = and eq are low-level primitives that must be defined in the language that LISP itself is implemented in, and their meaning and usage cannot be deduced from any other existing LISP syntax.

Consider PHP’s != and !==: they are both operators and are used in the same way, yet that doesn’t tell you anything about what the difference between them is. There’s no way to deduce their meaning from the existing semantics of the language and thus they each represent new syntax that must be learned.

How I Discovered newLISP

The professor revealed to us our “major class project”: we were to implement a text-based version of the game Chainshot:

Chainshot

An implementation of Chainshot

Chainshot starts with a grid completely filled with colored cells. Your goal is to clear the board and you do this by clicking on each cell. If the cell has any adjacent cells of the same color, they all disappear. The effect spreads to include all of the cells adjacent to those, and so on. Cells then drop down to fill in the gap left by the vanished group. If an entire column disappears then all cells to the right of it move left to fill it in. There’s a wonderful and free implementation of this for OS X called Otis.
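The "effect spreads" rule is a flood fill. Here is a minimal newLISP sketch of just that step, not the code from the class project. It assumes the board is a list of rows of color symbols and that "adjacent" means the four horizontal and vertical neighbors. The names board, flood and cluster are made up for illustration, and dropping cells and shifting columns are left out.

```newlisp
;; Hypothetical 3x3 board: a list of rows, each cell a color symbol.
(setf board '((R G G)
              (R R G)
              (B R B)))

;; Return the list of (row col) pairs reachable from (row col) through
;; horizontally or vertically adjacent cells of the given color.
;; seen accumulates the cells found so far; pass '() to start.
(define (flood grid row col color seen)
    (if (or (< row 0) (< col 0)
            (>= row (length grid))
            (>= col (length (grid 0)))
            (!= (grid row col) color)
            (find (list row col) seen))
        seen
        (begin
            (push (list row col) seen)
            (setf seen (flood grid (+ row 1) col color seen))
            (setf seen (flood grid (- row 1) col color seen))
            (setf seen (flood grid row (+ col 1) color seen))
            (flood grid row (- col 1) color seen))))

(setf cluster (flood board 0 0 'R '()))

;; A click only clears cells when the cluster has at least two members.
(if (> (length cluster) 1)
    (println (length cluster) " cells would vanish: " cluster))
```

Note that grid is passed in as a parameter rather than read as a free variable, so the function does not depend on whatever the caller happens to have bound.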

For the midterm we were to make a version of this playable by a human, and for the final we had to write an AI to play the game.

“One more thing…” he said, “If you do it in LISP, you’ll get a 10% bonus.”

The problem, though, was that most of the lectures focused on various algorithms and theory for doing AI. Those who wanted the 10% bonus would have to teach themselves the language.

Most students chose to forsake the bonus in favor of using a language that they were already familiar with, and like most students I had very little time, so I was partial to that sentiment. However, I decided to do a Google search to see if I could find a Common LISP alternative that was more appetizing.

To my delight I found a language that seemed to check all the right boxes, and surprisingly it wasn’t Scheme (although that is a wonderful language as well). Like Scheme, this language had greatly simplified Common LISP’s syntax, but at the same time it came with a standard library full of useful functions for performing modern scripting tasks, and all you needed to run it was a single tiny executable!

Discovering newLISP

Here is what ultimately turned me into such a fan of newLISP.

We had several months to complete the first part of the project, and the night before it was due, I did not have a single part of it complete. I had spent the night working on other things, and for perhaps an hour I spent some time looking over the newLISP website, reading about it.

The next day, approximately four hours before I was to hand in my finished, playable version of Chainshot, I sat down at a desk, put on my headphones, and proceeded to simultaneously learn newLISP while creating this game with it.

I finished in about 3 hours, of which only about ten minutes was spent debugging. I was dumbfounded. I had discovered something new, a language that allowed me to rapidly write what, after years of C-based languages, seemed like virtually “bug-free code” that just worked. It was a language that I had come close to mastering in a matter of minutes, having never used it before! In the time I had learned newLISP and written a text-based game in it I would probably only be finishing the tutorials for Ruby or Python.

newLISP’s Strengths

At the beginning of this post I made the claim that newLISP has several advantages over the other LISPs. They can all be summarized as follows: If you want a LISP-based scripting language, choose newLISP.

Before getting into the details let me warn:

newLISP is not a general-purpose programming language.

In the same sense that you wouldn’t use JavaScript to write an iPhone app (some beg to differ), you wouldn’t use newLISP to write an operating system, a music player like iTunes, or a web browser like Firefox. For such endeavors I recommend without hesitation Clojure, Scheme, C, Objective-C, etc. In other words, languages geared for solving complex, low-level problems, as quickly as possible.

The very first sentence on newLISP’s website states (emphasis mine):

newLISP is a Lisp-like, general-purpose scripting language.

Long ago, when computers were slow, the LISP community was mercilessly mocked by their C and Assembly-wielding counterparts for the crime of being too slow. Ever since that time the subject of performance has been a sore spot for the LISP community from which, I dare say, it has yet to fully recover. Its focus turned towards compilation and proving to the world that it too, could be fast. As a result, few seem to have noticed the need for a general-purpose LISP focused instead on interpretation and scripting.

Luckily, newLISP seems to fit that role rather well. It is a general-purpose interpreted scripting language. It’s my understanding that because of how dynamic it is, it cannot even be compiled to bytecode (this does not mean it is not fast, though).

Without this understanding, some of its design decisions will not make sense. Why choose fexprs over macros? Why dynamic scope instead of lexical scope? Why One-Reference-Only memory management instead of garbage collection?

Design and Syntax

newLISP comes in a single, tiny 200+KB binary executable. Out of all the LISP derivatives I’ve tried, it is the easiest to set up, deploy, and develop for. Somehow that tiny package contains the entire language and includes functions for reading and writing files, parsing text, regular expressions, running code in parallel, over a network, and much more. For the final project I was the only person in the class to submit a fully parallelizable AI (scalable to any number of cores) to solve each grid. The only reason I did it was because I could do it without breaking a sweat. newLISP made it mind-numbingly easy, and this was before it had all the actor and Cilk stuff.

newLISP’s syntax is minimalist and well thought out. For the sake of comparison with the Common LISP syntax card, I’ve kept most of the attributes (such as font size) the same:

newLISP in a nutshell

Functions do not need any of the &rest, &optional flags. Simply pass in variables or don’t, the parameters that don’t get anything are set to nil, and extra stuff can be accessed through the function (args) or the symbol $args.
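A small sketch of that behavior. The function name show-args is made up for illustration.

```newlisp
;; Two named parameters; anything beyond them is available through (args).
(define (show-args a b)
    (list a b (args)))

(println (show-args 1))         ; b receives nothing, so it is nil: (1 nil ())
(println (show-args 1 2 3 4))   ; 3 and 4 are the extras: (1 2 (3 4))
```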

Functions, like most other things, evaluate to themselves. You don’t need a special #’ syntax to access them. Functions are also real lists. This means you can get their source after they’ve been defined, and even modify them while they are executing.
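To make that concrete, here is a short sketch along the lines of the example in the newLISP manual. It only rewrites the function between calls, not while it is running.

```newlisp
(define (double x) (+ x x))

(println double)             ; the lambda list itself, no #' needed
(println (first double))     ; the parameter list: (x)
(println (last double))      ; the body expression: (+ x x)
(println (map double '(1 2 3)))

;; The definition is an ordinary list, so its body can be replaced in place.
(setf (nth 1 double) '(* x 3))
(println (double 5))         ; the "double" function now triples
```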

fexprs and eval

Instead of macros newLISP chose to use fexprs, or functions that simply don’t evaluate their arguments (although to the chagrin of some, newLISP calls them macros). This decision makes sense because in an interpreted LISP, almost everything happens at runtime and there are situations where fexprs can be much faster to execute than macros. It also means that newLISP’s “macros” don’t need the special backquote syntax, making them easier to write and read.
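As a rough illustration, here is a hypothetical my-unless written as a fexpr with define-macro. The arguments arrive unevaluated and the body decides when to eval them, so no backquote is needed. The underscore-prefixed parameter names are only a convention to lower the chance of clashing with the caller's variables under dynamic scope.

```newlisp
;; _test and _body are received as unevaluated expressions.
(define-macro (my-unless _test _body)
    (if (not (eval _test))
        (eval _body)))

(setf n 3)
(my-unless (> n 5) (println "n is not greater than 5"))   ; body runs
(my-unless (> n 1) (println "never printed"))             ; body is skipped
```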

You may have heard the mantra against using eval in other languages. In newLISP, this just doesn’t apply. newLISP’s eval is faster than eval in other LISPs.

This has many consequences, one of which is that sometimes newLISP’s fexprs can be faster than compiled macros in the other LISPs, but also it means that using eval is no longer frowned upon, which opens up all sorts of coding possibilities.

Equality and Memory Management

Notice that there’s a single equality operator, the equals sign. newLISP can get away with this luxury because of its memory-management model, called One-Reference-Only (ORO).

In short, most things are passed by-value and so you end up not needing all of those ridiculous comparison functions. If two things have the same value they are equal—end of story (except in the case of Objective newLISP).

This is not as crazy as it sounds. Internally, newLISP passes data by reference between built-in functions and does other optimizations. You can pass data by reference through the use of contexts and symbols, or by using Objective newLISP. newLISP’s ORO also means repeatable code execution times; you’ll never experience “GC Hell” because there is no garbage collector.
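A tiny sketch of what by-value semantics means in practice (the variable names are arbitrary):

```newlisp
(setf a '(1 2 (3 4)))
(setf b a)                      ; b gets its own copy of the list
(push 0 b)                      ; changing b ...
(println a)                     ; ... leaves a untouched: (1 2 (3 4))
(println b)                     ; (0 1 2 (3 4))

;; One operator compares numbers, strings and nested lists by value.
(println (= a '(1 2 (3 4))))    ; true
(println (= "abc" "abc"))       ; true
```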

Dynamic Scope

Much fuss is often made over newLISP’s use of dynamic scope. It is true, dynamic scope can be dangerous!

In a similar way, pointers and alcohol can be dangerous too! That doesn’t mean you shouldn’t ever program in C or enjoy yourself at a party. Check for NULL, don’t drink and drive, and beware of “free variables.”

Remember, newLISP is an interpreted language. Lutz Mueller, the author of newLISP, made a simple cost/benefit analysis and chose dynamic scope because it’s faster than lexical scope, and because it’s very easy to avoid the potential pitfalls of dynamic scope. Instead of this:

(define (my-unsafe-func)
    (println my-var)
)

Do this:

(define (my-safe-func my-var)
    (println my-var)
)

It’s a small price to pay for the performance improvement, and oftentimes it’s actually quite handy to have dynamic scope (especially in combination with the no-longer taboo eval). If you need lexical scope though, newLISP has you covered.
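To see what the free-variable pitfall actually looks like, here is a minimal sketch with made-up function names. A function that reads a free variable sees whichever binding is most recent on the call stack, not the one that was in place where the function was defined.

```newlisp
(define (current-x) x)          ; x is a free variable here

(define (caller)
    (let (x 42)                 ; caller's local binding ...
        (current-x)))           ; ... is what current-x sees

(setf x 1)
(println (current-x))   ; 1: the global binding
(println (caller))      ; 42: the binding made inside caller
(println x)             ; 1: the global is restored once caller returns
```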

Parallel Processing

newLISP takes an interesting approach for running code in parallel. Whereas Clojure uses advanced methods for multi-threading and ensuring safety, newLISP simply uses its small size and lets the Operating System do all the work!

There are no threads. Writing safe, parallel code is simple through actors and spawn/sync because newLISP simply forks itself. Its modest stature makes this a fairly cheap operation, allowing the OS to handle scheduling and memory-protection. Try forking a JVM… :P
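A minimal sketch of the spawn/sync style, assuming a Unix-like system where newLISP can fork. The slow-square function and the 5 second timeout are made up for illustration.

```newlisp
(define (slow-square n)
    (sleep 100)                 ; pretend this is expensive (milliseconds)
    (* n n))

;; Each spawn forks a child process; its result lands in the given symbol.
(spawn 'r1 (slow-square 3))
(spawn 'r2 (slow-square 4))

;; sync waits (here up to 5000 ms) for the children and collects results.
(if (sync 5000)
    (println "results: " r1 " " r2)
    (println "timed out waiting for children"))
```

Because each child is a separate process, the two computations cannot trample each other's data; the only thing that comes back is the value assigned to r1 and r2.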

Excellent Documentation and Community

newLISP’s documentation is some of the best that I’ve ever seen, simply surpassing the documentation for any other LISP that I’m aware of. You don’t need to shell out money for a book to learn it, and that’s because it doesn’t need a book! The short manual included with the reference documentation is all you’ll need to learn the language. Its documentation is one of the primary reasons that I was able to successfully procrastinate for my midterm.

If I had to pick a word to describe newLISP’s community it would probably be “cozy” (and for Common LISP it would probably be “abrasive”). Everyone’s question is heard and answered in a friendly and rapid manner, and there is no formality. Lutz Mueller will often answer your question or incorporate your suggestions directly into the language. It’s a small community, yes, but it’s also agile and capable of rapid change without politics.

Other Goodies

There are many other goodies that I won’t dwell on:

Conclusion + Related Links

newLISP is a true diamond in the rough, sorely under-hyped, but a thing of beauty nevertheless.

If you found this post interesting, you may find some of the links below worth visiting:

Introducing Objective newLISP

Tuesday, December 8th, 2009

newLISP is an awesome language that I use for all of my scripting needs, but one thing that is missing from it is a nice way of doing real object oriented programming.

By default it supports a pseudo-OOP paradigm called FOOP, but FOOP is simply inadequate for doing some of the most rudimentary of OOP tasks, such as allowing objects to hold references to each other.

That is why I’m announcing Objective newLISP: Real Object Oriented Programming for newLISP.

Let’s Dive In

Objective newLISP—ObjNL for short—is modeled after parts of Objective-C and Java. Let’s open up a REPL and begin:

$ newlisp ObjNL.lsp
newLISP v.10.1.6 on OSX IPv4 UTF-8, execute 'newlisp -h' for more info.

> 

Classes

Classes are simply contexts and are defined using the function new-class:

> (new-class 'Foo)
Foo

If we wanted to create a subclass of Foo called Bar we can easily do so:

> (new-class 'Bar Foo)
Bar

We can see that Foo is the superclass of Bar:

> Bar:@super
Foo

And that all classes inherit from ObjNL:

> Foo:@super
ObjNL

Objects

Objects are instantiated from classes using the function instantiate. They are contexts too:

> (setf obj (instantiate Foo))
Foo#1

As we’re subverting newLISP’s ORO memory management model to gain real OOP, we should deallocate it manually when we’re through using it. I will cover the topic of memory management last.

Constructors

Constructors are defined using the default function. Let’s define constructors for Foo and Bar (suppose we entered this into the REPL between a pair of [cmd][/cmd] tags):

(context Foo)
(define (Foo:Foo _bar)
    (setf bar _bar)
    true
)

(context Bar)
(define (Bar:Bar _foo)
    (setf foo _foo)
    true ; don't allow ourselves to be deallocated
)
(context MAIN)

Note the extra true at the end of each constructor. This is important because if the constructor returns nil that tells ObjNL that an error occurred and to therefore deallocate the object immediately. Thus if _bar were nil and we didn’t have that true the object would be deallocated, and we don’t want that.

When we call instantiate with extra arguments they are passed to the constructor:

> (setf obj (instantiate Foo (instantiate Bar)))
Foo#2

We can see that the instance variables were properly set:

> obj:bar
Bar#1
> obj:bar:foo

ERR: symbol expected : "obj:bar:foo"

Huh. We were able to check obj:bar but obj:bar:foo resulted in an error. It seems newLISP treats the entire thing as a symbol if there’s more than one colon, instead of assuming we’re doing multiple context lookups.

Thankfully Objective newLISP has you covered.

Deep Value and Symbol Access

> (. obj bar foo)
nil

The dot macro lets us look up the value of a symbol that we want through several object references. I’ll refer to this as “deep value access”. Sometimes we want the symbol instead of the value. For example, say that for fun we want to create a circular reference between the objects obj and obj:bar. We can do this using the dot-reference macro:

> (.& obj bar foo)
Bar#1:foo
> (set (.& obj bar foo) obj)
Foo#2

The dot-reference macro allows for “deep symbol access”: it returns the context-qualified symbol for an object’s instance variable. Now we can show that our circular reference works:

> (. obj bar foo bar foo)
Foo#2
> (= obj (. obj bar foo bar foo))
true

Interfaces

Most object oriented systems have the concept of an interface, sometimes referred to as a protocol. Interfaces define a set of functions that a class can choose to implement or “conform” to. Objective newLISP has them too, and refers to them as interfaces even though they are technically mixins.

Let’s define a simple interface called protocol:

> (define (protocol:test) "hello!")
(lambda () "hello!")

There are two ways to implement an interface. You can specify a list of them when creating a new class:

> (new-class 'Foo ObjNL '(protocol))
Foo

Or you can add them to a class or object after its definition. We actually want to do this right now because we instantiated obj prior to adding protocol to Foo’s list of interfaces. We can check to see this is true by asking if obj implements protocol:

> (implements? protocol obj)
nil

So the second way to add an interface to an object or class is to use the function add-interface:

> (add-interface protocol obj)
(protocol Foo ObjNL)

Now obj should implement it, so we can try it out:

> (if (implements? protocol obj) (obj:test))
"hello!"

The only real difference between an interface and a class is that a class has a constructor (default function) and ultimately inherits from ObjNL. You can use implements? to check inheritance as well:

> (implements? ObjNL obj)
true

Memory Management

The last, and perhaps most important topic, is what to do with all those objects you’ve got lying around, also referred to as “memory management.”

Objective newLISP supports two styles of memory management: manual and reference counting.

Manual memory management is simple: instantiate your object, and when you’re done with it, deallocate it!

> (setf b (instantiate Bar))
Bar#2
> (deallocate b)
true

Reference counting is done the same way it is done in Objective-C. Each object starts with a reference count of 1. When you want to hold onto that object you retain it, and when you’re through with it you release or autorelease it (which decrements the reference count). When the reference count hits zero the object is deallocated by deallocate:

> (setf b (instantiate Bar))
Bar#3
> (release b)
true

I will cover autorelease next, but I won’t go to great lengths to explain how all of this reference counting stuff works. If you’re unfamiliar with it, just know that it’s not complicated. If you want some practice make an iPhone app. :P

To illustrate autorelease I will implement the method ObjNL:dealloc, which is called on an object just before it is deallocated.

> (define (Bar:dealloc) (println Bar:@self " has been deallocated!"))
(lambda () (println Bar:@self " has been deallocated!"))
> (push-autorelease-pool)
(())
> (dotimes (_ 5) (autorelease (instantiate Bar)))
Bar#8
> (pop-autorelease-pool)
Bar#8 has been deallocated!
Bar#7 has been deallocated!
Bar#6 has been deallocated!
Bar#5 has been deallocated!
Bar#4 has been deallocated!
true

One important point to mention is that deallocating objects in newLISP versions 10.1.8 or older is very slow. The reason has to do with safety (which I discuss in the box below), but needless to say it was too slow to be acceptable. I contacted Lutz Mueller, the author of newLISP, and he agreed to introduce an “unsafe” optimization into the delete function. In versions 10.1.9 and later, deallocating Objective newLISP objects is approximately 480 times faster.

Because of this, it’s strongly recommended to use Objective newLISP with newLISP 10.1.9 or later. Currently the latest development release is 10.1.8; however, Lutz graciously made this optimization available online in a development version of 10.1.9. Click here to grab the source for this version. This link will expire soon; when it does, you can get the latest development release here.

Cautionary Note!

There are two situations to watch out for when using Objective newLISP:

#1: Unbound References in Functions

Instead of this:

(define (modify-obj)
    (setf obj:bar 5)
)
(setf obj (instantiate Foo))
(modify-obj)

Do this:

(define (modify-obj obj)
    (setf obj:bar 5)
)
(setf obj (instantiate Foo))
(modify-obj obj)

If you don’t do that, newLISP will read the obj:bar in the definition of modify-obj and instantly create and protect a context called obj, making it impossible to setf the obj later on.

#2: Dangling References

Use extreme caution when holding reference(s) to an object in a list or some other container! If that object is later deallocated and you try to access it through the stale reference, bad things will happen:

> (setf b (instantiate Bar))
Bar#9
> (push b alist)
Bar#9
> (deallocate b)
Bar#9 has been deallocated!
true
> alist
Bus error

Normally this would not be a problem: the object in alist would simply be replaced with nil upon its deallocation. However, since we’re using the fast, unsafe version of delete to do our deallocation, newLISP will not do that. It is the same situation as when attempting to access free’d memory in C/C++/Objective-C.

Instead we should use retain/release:

> (setf b (instantiate Bar))
Bar#9
> (push (retain b) alist)
Bar#9
> (release b)
nil
> alist
(Bar#9)
> (release (pop alist))
Bar#9 has been deallocated!
true
> alist
()

When to use FOOP

Objective newLISP is not the answer to all OOP problems in newLISP. FOOP has its place too. If you’re dealing with a situation where you may end up needing lots of objects, FOOP is probably the better choice. Although you can’t do full-blown OOP with it, FOOP objects can use far less memory than ObjNL objects because in ObjNL, methods are stored in each object, not in the class. After trying out both you should have a good feeling for when to use one over the other (i.e., if the limitations of FOOP start to become obvious).

Download and API

Download Objective newLISP here:

Download Objective newLISP

Access the Objective newLISP API.

And for news, follow @taoeffect and @newlisp on twitter.

Thanks for checking out Objective newLISP!

Enjoy! :-D

Building a better lock: TESharedObject

Monday, August 31st, 2009

While I’m happy to see Grand Central in Snow Leopard, I won’t be using it in any of our applications anytime soon because that means we’d have to turn our backs on all those PPC users out there, and everyone who has yet to upgrade to Snow Leopard. I suspect that this represents a sizable chunk of the OS X using population, at least at the moment.

It would also be nice if we could use Clojure for writing Cocoa apps, but Apple decided to drop the ball on that one.

However, that doesn’t mean we still can’t write good, fast, multithreaded code.

Actors and Shared Data

Right now we’re in the process of rewriting parts of Espionage’s helper program, which is fairly multithreaded and does most of Espionage’s heavy-lifting. Currently we are using mutex locks (in the form of NSLock) to synchronize some of the shared data in the application, and while locks are “OK”, they can start to get messy when you’ve got a lot going on.

That’s why we’re going to convert the helper to use actors (courtesy of Plausible Labs‘ great PLActorKit). But even when you’ve structured your code to use the actor approach, you still have a problem if those actors need to operate on data that’s shared with other actors or other threads. This is where locks could come in, but locks tend to suck.

Locks are slow; everyone knows that. But in our experience they can also encourage bad code, because you can associate a lock with just about anything: it doesn’t have to be a specific piece of data, it can be a group of actions operating on that data, or data related to it. When you don’t have consistency, things can get out of hand.

So I’ve written a class that aims to solve these two problems with locks.

TESharedObject

When you hold a lock, you prevent any other thread from accessing the data that it’s protecting, regardless of whether that is necessary or not. What if the data that you’re protecting is often only read from? Then despite the fact that it’s perfectly fine for multiple threads to read from a piece of data simultaneously, each reader has to wait in line for the lock to become available. This can really slow things down.

TESharedObject is a replacement for locks that takes this into account. It changes the “lock” paradigm in two ways:

First off, it’s a wrapper around the shared data itself. A lock, by contrast, is just another “thing” that you arbitrarily decide is associated with a piece of data, a decision that you may or may not change your mind about as your code evolves.

The other difference is that unlike a lock, it allows multiple readers to access the data at the same time, provided no one’s writing to it. Databases often take this same approach to improve performance.

Semaphores

TESharedObject implements a basic algorithm using the semaphore primitive. Semaphores aren’t used very often in Cocoa programming, so if you’re unfamiliar with them you’ll be forgiven. Quick overview: a semaphore is an entity that has a count associated with it. You can increase the count by calling, say, “up” on it, or decrease it by calling “down” on it. If you call “down” on it when the count is zero, you block until some other thread increases the count.

So, say we have a database, and we want people to be able to read from it safely simultaneously when no one’s writing to it. By using two semaphores and keeping track of how many people are reading we can accomplish this like so (pseudo-code):

semaphore sRead = 1;
semaphore sAccess = 1;
int readerCount = 0;

reader:
    down(sRead);
    if (++readerCount==1)
        down(sAccess);
    up(sRead);
    access_database();
    down(sRead);
    if (--readerCount == 0)
        up(sAccess);
    up(sRead);

writer:
    down(sAccess);
    access_database();
    up(sAccess);

Here the semaphore sAccess is used as the “lock” on the database, or more accurately, to suspend the next thread that calls “down” on it. Only the first reader will call down on sAccess. A second semaphore sRead is used as a backup to sAccess in the situation that another reader is already suspended on sAccess.

The code for the writer is simple: every writer decrements sAccess’s count, meaning a single writer is enough to stop everyone.

Building A Better Lock

Now that we have the pseudo-code, we need a design for our lock, and to get the design we need to have some sort of an idea of how we plan on using this lock in practice. I know! It should look something like this:

NSMutableString *sharedData = [NSMutableString stringWithString:@"Poop"];
TESharedObject *superLock = [[TESharedObject alloc] initWithObject:sharedData];

reader:
    NSObject *obj = [superLock borrowForReading]; // like "lock"
    NSLog(@"We've got an object! Take a look: %@", obj);
    [superLock returnObject]; // like "unlock"

writer:
    NSMutableString *aStr = [superLock borrowForWriting];
    [aStr setString:@"Harro!"];
    [superLock returnObject];

There, that looks pretty good. Our superLock is bound to the data it’s protecting. When we want to have a look at the data we call -borrowForReading to “borrow” it, and once we’re finished with it we “return” the data by calling -returnObject. Simple enough, and it works just like using a lock. All we have to do is make sure that we don’t write to the data. If we want to write to it, we call -borrowForWriting instead.

Let’s have a look at what’s inside.

-borrowForReading

- (id)borrowForReading
{
    semaphore_wait(sRead);
    if ( ++readerCount == 1 )
        semaphore_wait(sAccess);
    semaphore_signal(sRead);
    return obj;
}

There’s our pseudo-code! Well, about half of it; I bet you can guess where the other half is. But before we get to that, let’s take a look at -borrowForWriting:

-borrowForWriting

- (id)borrowForWriting
{
    semaphore_wait(sAccess);
    writing = YES;
    return obj;
}

Here the code diverges a bit with the introduction of a new variable writing. We use it so that whether we called -borrowForReading or -borrowForWriting, we only have to call:

-returnObject

- (void)returnObject
{
    if ( writing )
    {
        writing = NO;
        semaphore_signal(sAccess);
    }
    else
    {
        semaphore_wait(sRead);
        if ( --readerCount == 0 )
            semaphore_signal(sAccess);
        semaphore_signal(sRead);
    }
}
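For readers who want to try the three methods outside of Cocoa, here is a rough C port. It is a sketch under assumptions: the te_shared struct, its initializer, and the small pthread-based semaphore standing in for the Mach semaphore calls are all hypothetical, since the snippets above don't show the instance variables or how they are set up.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Tiny counting semaphore standing in for the Mach semaphore calls above. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  nonzero;
    unsigned        count;
} te_sem;

void te_sem_init(te_sem *s, unsigned initial)
{
    pthread_mutex_init(&s->mutex, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
}

void te_sem_wait(te_sem *s)      /* "down" */
{
    pthread_mutex_lock(&s->mutex);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->mutex);
    s->count--;
    pthread_mutex_unlock(&s->mutex);
}

void te_sem_signal(te_sem *s)    /* "up" */
{
    pthread_mutex_lock(&s->mutex);
    s->count++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->mutex);
}

/* Hypothetical C counterpart of TESharedObject. */
typedef struct {
    te_sem s_read;        /* guards reader_count */
    te_sem s_access;      /* held by one writer, or by the readers as a group */
    int    reader_count;
    bool   writing;       /* set only while a writer holds s_access */
    void  *obj;           /* the shared data being protected */
} te_shared;

void te_shared_init(te_shared *so, void *obj)
{
    te_sem_init(&so->s_read, 1);
    te_sem_init(&so->s_access, 1);
    so->reader_count = 0;
    so->writing = false;
    so->obj = obj;
}

void *te_borrow_for_reading(te_shared *so)
{
    te_sem_wait(&so->s_read);
    if (++so->reader_count == 1)     /* first reader locks out writers */
        te_sem_wait(&so->s_access);
    te_sem_signal(&so->s_read);
    return so->obj;
}

void *te_borrow_for_writing(te_shared *so)
{
    te_sem_wait(&so->s_access);
    so->writing = true;
    return so->obj;
}

void te_return_object(te_shared *so)
{
    if (so->writing) {
        so->writing = false;
        te_sem_signal(&so->s_access);
    } else {
        te_sem_wait(&so->s_read);
        assert(so->reader_count > 0);
        if (--so->reader_count == 0) /* last reader lets writers back in */
            te_sem_signal(&so->s_access);
        te_sem_signal(&so->s_read);
    }
}
```

As in the Objective-C version, every borrow has to be paired with exactly one return, and the writing flag is only ever touched by the thread that currently holds s_access.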

And that’s it. We’re almost done now. If you’ve made it here, thanks for sticking with me. I only have two more things to show you, and I think it’ll be worth it.

TESharedMap

Another aspect of shared data that we’ve neglected to address is the notion of “globality”. Yes, I did just make that word up, but it has important consequences! When you’re dealing with shared data, you’re often dealing with global variables, and dammit, now you’ve gotta find a place to put them!

A lot of people just put them at the top. They make long laundry lists of static declarations at the top of some file, and for each piece of shared data two declarations are required: the data, and the lock for the data. This can get kinda ugly, and ugly code is often harder to read and maintain. Our TESharedObject suffers from this same problem. It’d be nice if we could just focus on the data and not have to deal with the lock that’s associated with it.

We can get close to this with TESharedMap. TESharedMap acts as a “summoner”, we tell it: “Give us our object!” And it does. We don’t need to worry about keeping track of the associated TESharedObject, TESharedMap handles that for us. Put something into the map and it magically becomes thread-safe, so long as you remember to retrieve it only through the map.

Here’s its interface:

@interface TESharedMap : NSObject {
    TESharedObject *sharedMap;
}

+ (TESharedMap *)map;

- (id)borrowObjectForKey:(NSString *)key forReadingOnly:(BOOL)readonly;
- (void)returnObjectForKey:(NSString *)key;
- (void)setObject:(id)obj forKey:(NSString *)key;
- (void)removeObjectForKey:(NSString *)key;

@end
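To give a feel for what sits behind an interface like this, here is a small C sketch of a keyed map that hands objects out under a per-object readers-writer lock. It is not TESharedMap's implementation: it swaps in the POSIX pthread_rwlock_t for TESharedObject, guards the table itself with a plain mutex, uses a fixed-size table, and leaves out removal. Every name in it (shared_map, shared_map_borrow, and so on) is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAP_CAPACITY 16   /* arbitrary fixed size to keep the sketch short */

typedef struct {
    const char      *key;   /* NULL marks an unused slot; caller owns the string */
    void            *obj;
    pthread_rwlock_t lock;  /* the per-object readers-writer lock */
} map_entry;

typedef struct {
    pthread_mutex_t mutex;  /* guards the entries table itself */
    map_entry       entries[MAP_CAPACITY];
} shared_map;

void shared_map_init(shared_map *m)
{
    pthread_mutex_init(&m->mutex, NULL);
    for (size_t i = 0; i < MAP_CAPACITY; i++)
        m->entries[i].key = NULL;
}

/* Caller must hold m->mutex. */
static map_entry *find_entry(shared_map *m, const char *key)
{
    for (size_t i = 0; i < MAP_CAPACITY; i++)
        if (m->entries[i].key && strcmp(m->entries[i].key, key) == 0)
            return &m->entries[i];
    return NULL;
}

/* Register obj under key. Returns false if the key exists or the table is full. */
bool shared_map_set(shared_map *m, const char *key, void *obj)
{
    assert(key != NULL);
    bool ok = false;
    pthread_mutex_lock(&m->mutex);
    if (find_entry(m, key) == NULL) {
        for (size_t i = 0; i < MAP_CAPACITY && !ok; i++) {
            map_entry *e = &m->entries[i];
            if (e->key == NULL) {
                pthread_rwlock_init(&e->lock, NULL);
                e->key = key;
                e->obj = obj;
                ok = true;
            }
        }
    }
    pthread_mutex_unlock(&m->mutex);
    return ok;
}

/* Find the entry under the table mutex, then take its lock after releasing
   the mutex, so a blocked borrower doesn't stall access to unrelated keys. */
void *shared_map_borrow(shared_map *m, const char *key, bool read_only)
{
    pthread_mutex_lock(&m->mutex);
    map_entry *e = find_entry(m, key);
    pthread_mutex_unlock(&m->mutex);
    if (e == NULL)
        return NULL;
    if (read_only)
        pthread_rwlock_rdlock(&e->lock);
    else
        pthread_rwlock_wrlock(&e->lock);
    return e->obj;
}

void shared_map_return(shared_map *m, const char *key)
{
    pthread_mutex_lock(&m->mutex);
    map_entry *e = find_entry(m, key);
    pthread_mutex_unlock(&m->mutex);
    if (e != NULL)
        pthread_rwlock_unlock(&e->lock);
}
```

The calling code only ever mentions the key; the lock that goes with each object never appears outside the map, which is the whole point of the “summoner” idea.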

Benchmarks

What are the performance benefits of using TESharedObject and TESharedMap instead of NSLock and the like? To find out, I’ve written 3 programs that each do the same thing; the only difference is which of the synchronization primitives we’ve discussed each one uses (TESharedObject, TESharedMap, or NSLock).

Here’s the TESharedMap version:

#import <Foundation/Foundation.h>
#import "TESharedObject.h"
#import "Common.h"
#import "Config.h"

static int reader = 0;
static int writer = 0;
static int msgIdx = 0;
static int tCount = NUM_READERS + NUM_WRITERS;

static NSString *msgs[] = {
    @"Hello World!", @"how are you?", @"random message!", @"hope we have enough of these...",
    @"I'm sure we will", @"there so many!", @"How many messages does it take", @"to screw in a lightbulb?"
};

#define newMsgIdx (msgIdx++%(sizeof(msgs)/sizeof(msgs[0])))

@interface Actor : NSObject
- (void)readerMain;
- (void)writerMain;
@end

@implementation Actor
- (void)readerMain
{
    NSAutoreleasePool *pool = [NSAutoreleasePool new];

    int i=READ_TIMES, readerID = ++reader;

    while ( --i > 0 )
    {
        fprintf(stderr, "Reader %d getting message...\n", readerID);
        NSString *message = [[TESharedMap map] borrowObjectForKey:OBJ_KEY forReadingOnly:YES];
        fprintf(stderr, "Reader %d got message: %s\n", readerID, [message UTF8String]);
        usleep(READ_SLEEP);
        [[TESharedMap map] returnObjectForKey:OBJ_KEY];
    }

    fprintf(stderr, "Reader %d done!\n", readerID);
    if ( --tCount == 0 )
    {
        printf("good-bye\n");
        exit(0);
    }
    [pool release];
}
- (void)writerMain
{
    NSAutoreleasePool *pool = [NSAutoreleasePool new];

    int i=WRITE_TIMES, writerID = ++writer;

    while ( --i > 0 )
    {
        fprintf(stderr, "Writer %d getting message...\n", writerID);
        NSMutableString *message = [[TESharedMap map] borrowObjectForKey:OBJ_KEY forReadingOnly:WRITE_MEANS_READ];
        [message setString:msgs[newMsgIdx]];
        fprintf(stderr, "Writer %d set message to: %s\n", writerID, [message UTF8String]);
        usleep(WRITE_SLEEP);
        [[TESharedMap map] returnObjectForKey:OBJ_KEY];
    }

    fprintf(stderr, "Writer %d done!\n", writerID);
    if ( --tCount == 0 )
    {
        printf("good-bye\n");
        exit(0);
    }
    [pool release];
}
@end

int main(int argc, char const *argv[])
{
    NSAutoreleasePool *pool = [NSAutoreleasePool new];

    NSMutableString *message = [[NSMutableString alloc] initWithString:msgs[newMsgIdx]];
    [[TESharedMap map] setObject:message forKey:OBJ_KEY];

    for (int i=0; i<NUM_READERS; ++i)
        [NSThread detachNewThreadSelector:@selector(readerMain) toTarget:[Actor new] withObject:nil];
    for (int i=0; i<NUM_WRITERS; ++i)
        [NSThread detachNewThreadSelector:@selector(writerMain) toTarget:[Actor new] withObject:nil];

    // wait for threads to terminate...
    while (tCount)
        sleep(1);

    printf("good-bye\n");
    [pool release];
    return 0;
}

The program launches a certain number of readers and writers, and they each access an NSMutableString a certain number of times. To simulate computation, each reader and writer sleeps for a certain amount of time upon accessing the string. The number of readers and writers, as well as other parameters, can be adjusted by modifying the “Config.h” file:

// play with these parameters
#define NUM_READERS 4
#define NUM_WRITERS 4
#define READ_TIMES 4000
#define WRITE_TIMES 4000
#define READ_SLEEP 1000
#define WRITE_SLEEP 1000

I ran 4 benchmarks using the *NIX ‘time’ command comparing the 3 synchronization primitives against each other by adjusting the number of readers and writers and keeping the other parameters constant.

Dramatic Results

TESharedObject results

First we see that the extra layer that TESharedMap adds on top of TESharedObject is pretty much negligible in terms of performance.

The 0r/4w run is the worst-case scenario for TESharedObject (no readers), and as expected it performs pretty much exactly like the typical lock.

The 4r/0w run is the best-case scenario, when there are only readers accessing the data. This is the real payoff: almost no penalty for accessing shared data! You can’t get this with mutex locks.

The other two results show what happens when you have both readers and writers, in which case TESharedObject quickly overtakes the mutex lock, but what’s most interesting is that as you add more readers, TESharedObject takes a fairly negligible hit while the lock’s performance is significantly degraded. Why?

The reason becomes pretty obvious if you actually run these tests yourself. What happens is that the readers dominate the lock. This happens because in this program, the data is besieged by a constant and unrelenting stream of readers and writers who lust after the data until they’ve had their fill. In this situation, the more readers you add, the less likely it becomes that a writer will be able to get a hold of it, so what happens is that there’s suddenly a stampede of readers with virtually unfettered access to the data which they quickly gobble up, and then after most or all of the readers have had their fill the writers get their turn.

So while TESharedObject can provide a significant performance boost, if your lock is highly contested by readers they can shut out any writers. In most of the situations that I’ve seen locks used, this doesn’t happen. But if you are using a TESharedObject in a maelstrom like this, you’ll probably want to subclass it and modify the -borrowForReading method so that it sleeps if the readerCount is too high, which will make it a bit slower, but it will still be at least as fast as a lock, and you’ll have better looking code.
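If writers getting shut out is a real concern, another option is a lock that gives waiting writers priority. The sketch below is a different technique from the readerCount tweak suggested above and is not part of TESharedObject: it is the common “writer preference” scheme built on a pthread mutex and condition variable, and the wp_lock type and function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  changed;        /* broadcast whenever the state changes */
    int             active_readers;
    int             waiting_writers;
    bool            writer_active;
} wp_lock;

#define WP_LOCK_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0, false }

/* Readers wait for active AND waiting writers, so a steady stream of
   readers can no longer keep a writer out indefinitely. */
void wp_borrow_for_reading(wp_lock *l)
{
    pthread_mutex_lock(&l->mutex);
    while (l->writer_active || l->waiting_writers > 0)
        pthread_cond_wait(&l->changed, &l->mutex);
    l->active_readers++;
    pthread_mutex_unlock(&l->mutex);
}

void wp_return_from_reading(wp_lock *l)
{
    pthread_mutex_lock(&l->mutex);
    assert(l->active_readers > 0);
    if (--l->active_readers == 0)
        pthread_cond_broadcast(&l->changed);  /* let a waiting writer in */
    pthread_mutex_unlock(&l->mutex);
}

void wp_borrow_for_writing(wp_lock *l)
{
    pthread_mutex_lock(&l->mutex);
    l->waiting_writers++;                     /* announce before waiting */
    while (l->writer_active || l->active_readers > 0)
        pthread_cond_wait(&l->changed, &l->mutex);
    l->waiting_writers--;
    l->writer_active = true;
    pthread_mutex_unlock(&l->mutex);
}

void wp_return_from_writing(wp_lock *l)
{
    pthread_mutex_lock(&l->mutex);
    l->writer_active = false;
    pthread_cond_broadcast(&l->changed);      /* wake readers and writers */
    pthread_mutex_unlock(&l->mutex);
}
```

The trade-off flips: with this scheme a constant stream of writers can hold readers off instead, so it only makes sense when writes are the rarer operation.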

I think that’s it for now. All the code in this post, including the code for TESharedObject and TESharedMap, is provided under a BSD-style license and can be downloaded by clicking the icon below:

TESharedObject

Enjoy! ;-)

- Greg (twitter: @taoeffect)

Using Bazaar Like Git + ‘repoalias’ Plugin

Sunday, June 28th, 2009

Continuing my exploration of distributed version control systems, I decided to take a quick look at Git:

Managing Branches

A single git repository can maintain multiple branches of development. To create a new branch named “experimental”, use:

$ git branch experimental

If you now run

$ git branch

you’ll get a list of all existing branches:

  experimental
* master

That’s strange, I thought. Bazaar does it a bit differently. In Bazaar, branches are often stored in a shared repository, but unlike Git’s, the repository is typically a “normal” folder that holds each branch as a subdirectory in plain sight of the developer.

Here’s an example from the Bazaar User Guide:

repository/       # Overall repository
 +- trunk/        # The mainline of development
 +- branches/     # A container directory
 |   +- foo/      # Branch for developing feature foo
 |     ...
 +- tags/         # Container directory
     +- release-X # A branch specific to mark a given release version
        ...

Hmm… I actually like how Git does it. Git “already works” with the way I have my existing projects set up. I don’t want to move my projects to match the directory structure shown above. Is there any way to use Bazaar in a Git-like manner? Well, I suppose you already know the answer to that, as I wouldn’t have much of a blog post to write if there weren’t! :-)

Setting up a Git-like workflow with Bazaar

Being completely new to Bazaar, and to DVCSs in general, I decided to ask the folks over at #bzr. They were very helpful, and fairly quickly ‘mzz’ had an answer:

First, we cd into our project, which is already set up as a Bazaar branch:

cd path/to/myproject

Next, we run the following command:

bzr init-repo --no-trees .bzr-repo

That creates a hidden shared repository inside the project folder called .bzr-repo. We add the --no-trees option to tell Bazaar not to create unnecessary copies of the working tree.

Note that this is quite different from how repositories in Bazaar are normally used; usually they’re the parent directory.

bzr ignore .bzr-repo

We add the repository itself to the ignore list so that it’s not tracked.

Next, we commit the changes and create the main branch, called “trunk”:

bzr commit -m "added .bzr-repo to ignore list"
bzr branch . .bzr-repo/trunk

At this point we have two branches, the one in .bzr-repo/trunk and the branch we started with. This is no good. We want to deal with only one branch at a time, the one we’re currently working with, and we want that branch to be one of the ones in .bzr-repo.

Thus enters the concept of a lightweight checkout. A checkout is very similar in concept to its SVN counterpart: you have a folder where you can make changes to files, and when you commit them they are sent off to some central location. One of the main differences between checkouts in Bazaar and checkouts in SVN is that in Bazaar the revision history is stored locally within the checkout itself. But in our setup, we don’t want that since each branch is stored locally anyway. Having two copies of it wouldn’t offer any advantage and would just take up space. A lightweight checkout lets us have a checkout without a revision history.

We can tell Bazaar to change the current branch into a checkout that points to .bzr-repo/trunk by using the reconfigure command:

bzr reconfigure --lightweight-checkout --bind-to .bzr-repo/trunk .

That’s it! :-)

Now we can continue our work as usual. Let’s test this out by adding a new file to the project:

$ echo "Version 1.0" > CHANGES
$ bzr add CHANGES
adding CHANGES
$ bzr commit -m "added CHANGES"
Committing to: /Users/gslepak/myproject/.bzr-repo/trunk/
added CHANGES
Committed revision 2.

Everything seems in order. We can now use Bazaar like Git:

$ bzr branch . .bzr-repo/experimental
Branched 2 revision(s).
$ bzr switch .bzr-repo/experimental
Tree is up to date at revision 2.
Switched to branch: /Users/gslepak/myproject/.bzr-repo/experimental/

Since Bazaar doesn’t have an equivalent to the args-less git branch to list the available branches, we just use plain-old ‘ls’:

$ ls .bzr-repo/
experimental/ trunk/

At this point you might be wondering if there’s a way to get around having to type out the path to the repository each time you use bzr branch and bzr switch.

Introducing the ‘repoalias’ Plugin

I asked that question on #bzr and mzz snapped into action. Within a few minutes he had a working plugin ready that did just that. Needless to say my head was reeling from his Python kung-fu. :-)

Repoalias allows you to reference the branch’s repository like so:

bzr branch . repo:experimental
bzr switch repo:experimental
... do work ...
bzr switch repo:trunk
bzr merge repo:experimental
... etc ...

Getting the plugin is simple:

mkdir -p ~/.bazaar/plugins    # create the plugins folder if it doesn't exist
cd ~/.bazaar/plugins
bzr branch lp:~marienz/+junk/repoalias

That’s it! Many thanks to mzz and the folks at #bzr for helping me understand this stuff!

Installing Bazaar 1.16 on OS X [Updated]

Monday, June 22nd, 2009

Recently I decided to bite the bullet and move away from Xcode’s Snapshots to using a DVCS. After checking out Git, Mercurial, and Bazaar I finally settled on the latter, for now at least.

Bazaar is great. It has excellent documentation on a very well designed website, it’s incredibly easy to use (one of its design goals) so you can focus on what’s important (coding), it supports both distributed and centralized workflows, it supports pushing and pulling from SFTP servers that do not have Bazaar installations, and it uses revision numbers (instead of hashes) in a very intelligent way.

Unfortunately the latest version doesn’t seem to come in a tidy OS X Installer package, nor is it available via MacPorts (yet), so here are some tips to install it from source.

If you’ve previously installed Bazaar using the package installer you’ll probably want to clear out some of the files it installed prior to installing the new version:

cd /Library/Python/2.5/site-packages
sudo rm -rf bzr*
sudo rm -rf Loom-1.4.0dev0-py2.5.egg-info
sudo rm -rf subvertpy*
sudo rm -rf Extmerge-r10-py2.5.egg-info
sudo rm -rf qbzr-0.9.9-py2.5.egg-info
# we'll install the latest versions of these in a bit
sudo rm -rf BzrTools-1.14.0-py2.5.egg-info
sudo rm -rf pycrypto-2.0.1-py2.5.egg-info
sudo rm -rf paramiko*

We do that to avoid conflicts and because we want to install these plugins ourselves, so that we can use the latest versions.

Installing Dependencies

As mentioned in the note above, and on the InstallationFaq, there are 3 things that come with the package installer that you should install: Paramiko and pyCrypto (for SFTP support), and BzrTools.

Because of the way we’re installing Bazaar, you should install BzrTools *after* installing Bazaar 1.16 (detailed below). Installing Paramiko and pyCrypto is simple though, as you only need to download Paramiko (its installer will grab pyCrypto for you):

  1. Grab the latest version of Paramiko, as of this writing it’s 1.7.2:
  2. Extract the contents of the archive and ‘cd’ into the directory
  3. To build and install run these commands:
    python setup.py build
    sudo python setup.py install

Optionally, the InstallationFaq suggests installing Pyrex to speed things up.

Installing Bazaar 1.16

Download the source for 1.16 (direct link). After you’ve extracted it run these commands from the source dir:

python setup.py build
sudo python setup.py install --home /usr/local
sudo mv /usr/local/lib/python/bzr* /Library/Python/2.5/site-packages/

Note: On Snow Leopard the default Python install is 2.6, not 2.5.

Installing Bazaar Plugins on OS X

This too isn’t very clear from the Bazaar Plugins page. To install a plugin follow these steps:

  1. Pick a plugin from the plugins page and grab its branch URL, for example the URL for Rebase is:
  2. In some directory run:
    bzr branch [branch-URL]
  3. That will download the plugin source. ‘cd’ into the directory, in the above example it’s “trunk”.
  4. Build and install the plugin using the same commands as you used to install the dependencies.

That should do it. One note is that a lot of the plugins are hosted on launchpad, in which case grabbing them is simple (using ‘bzr-search’ as an example):

bzr branch lp:bzr-search

Now you can enjoy the benefits of running the latest version. :-)

ebswitch: EventBox Profile Switcher

Tuesday, May 26th, 2009

My favorite Twitter client is EventBox by The Cosmic Machine. And while it has a million great features like support for Instapaper and a great interface, it’s missing one critical piece of functionality and that is support for multiple profiles. However, it’s still possible to use EventBox with multiple profiles, but perhaps not at the same time.

One solution is a great little program called rooSwitch. It can be used with EventBox to give you the ability to switch between different isolated profiles, each with its own settings. You could configure rooSwitch with multiple EventBox profiles, say for example one for each Twitter account that you use:

rooSwitch with two twitter profiles

While rooSwitch is great, I don’t have much use for it besides switching EventBox profiles, and I’m a terminal fiend anyway, so I wrote a simple newLISP script called ebswitch that does this for me. Here’s an example session:

$ ebswitch
ebswitch version 0.2
Usage: /usr/local/bin/ebswitch twitter_profile_name
$ ebswitch espionageapp
taoeffect => espionageapp
Creating fresh account for: espionageapp
Successfully switched to profile: espionageapp
    ... EventBox opens, enter 'espionageapp' login information ...
$ ebswitch taoeffect
Quit EventBox? [y|n]: y
espionageapp => taoeffect
Successfully switched to profile: taoeffect

Installing ebswitch

ebswitch is a newLISP script, so to use it you’ll need to make sure that newLISP is installed (Intel/PPC), and don’t worry, one of the great things about newLISP is how small it is. Then, after downloading ebswitch to your Desktop, install it into your /usr/local/bin (or /usr/bin) like so:

$ cd ~/Desktop
$ sudo install ebswitch.lsp /usr/local/bin/ebswitch
Password: enter admin password

Enjoy! :-)

Update: ebswitch 0.2 adds more intelligence and can now quit EventBox for you.

eb_switch.lsp

Programmers: Win a license to Espionage!

Saturday, April 4th, 2009

I’m part of the newLISP Fan Club. I consider myself a fan. :-)

Over on the boards there is a challenge, and whoever solves it first wins a license to Espionage, just because that’s something that I can offer to the winner. :-D

Contest ends when someone posts the solution on that forum, or on the 11th of April.

Who’s stealing your memory?

Thursday, January 1st, 2009

Terminal fiends will likely find this post useful.

A while ago, I was sitting in the library at the University of Florida under the pretense of preparing for a final exam that was scheduled for the following day. I had, however, made the idiotic mistake of bringing my laptop with me.

Instead of studying I became inexplicably fascinated with how much memory my various running applications were taking up. Actually, it was really the fault of Alex Harper’s fantastic MenuMeters application, because I noticed that I was running low on free memory, despite having 2 gigabytes installed and very few applications running.

This led to another discovery, namely that Safari was hoarding over a gigabyte of RAM for itself. This upset me, as I’m rather neurotic about how much RAM applications use. Every time the OS has to page out I cringe inside with the knowledge that my laptop’s battery life, performance, and theoretically, the lifespan of its hard drive, are all affected. So I set aside the textbook and wrote memusage, a shell script that reports back the largest offenders:

gslepak$ memusage
Top 10 memory intensive apps:

	Name			Percentage	Size

#1:	Xcode                   5.3		217.688 MB
#2:	firefox-bin             4.4		181.754 MB
#3:	WindowServer            4.1		165.961 MB
#4:	Finder                  2.3		95.2305 MB
#5:	iTunes                  2.0		81.7227 MB
#6:	Mail                    1.8		75.7031 MB
#7:	Interface               1.7		67.7344 MB
#8:	coreservicesd           1.3		53.1914 MB
#9:	mds                     1.1		45.0312 MB
#10:	Quicksilver             0.9		38.4531 MB

As you can see, I don’t use Safari anymore. :P

I wonder what iSpy is using right now…

gslepak$ memusage ispy
ispyd: 0.0 %  0.441406 MB

Ten is too many, just show me the top 5:

gslepak$ memusage 5
Top 5 memory intensive apps:

	Name			Percentage	Size

#1:	Xcode                   5.3		217.688 MB
#2:	firefox-bin             4.5		182.457 MB
#3:	WindowServer            4.1		166.281 MB
#4:	Finder                  2.3		95.2305 MB
#5:	iTunes                  2.0		81.7227 MB

If you’re wondering why the percentages don’t match up with 2GB, it’s because I recently upgraded to 4GB, and I highly recommend it!

memusage

Error handling conventions

Sunday, November 16th, 2008

Programmers have many options available to them when it comes to error handling.  A very common convention among C programmers is to make a function return a non-zero value if an error has occurred.  This post is about what to do with that non-zero value.

Let’s start with a simple example:

    1 OSStatus initKeychainAccess()
    2 {
    3     OSStatus err;
    4
    5     err = SecKeychainSetUserInteractionAllowed(TRUE);
    6
    7     if ( err ) {
    8         log_err("couldn't enable keychain user interaction");
    9         return err;
   10     }
   11
   12     err = SecKeychainUnlock(gKeychain, 0, NULL, FALSE);
   13
   14     if ( err ) {
   15         log_err("couldn't unlock keychain");
   16     }
   17     else {
   18         err = SecKeychainAddCallback(MyKeychainCallback, kSecEveryEventMask, NULL);
   19         if ( err ) {
   20             log_err("couldn't set callback for keychain");
   21         }
   22     }
   23
   24     if ( err == 0 ) {
   25         doThatFancyThingYouDo();
   26     }
   27
   28     return err;
   29 }

This example demonstrates three common error handling techniques that I’ve encountered in the wild:

  1. Return immediately (line 9)
  2. Building nested if-else clauses (lines 14-22)
  3. Repeatedly checking error status (line 24)

This is just a simple, short example, but these error checking patterns can really add up in more complicated code, making it unwieldy and unnecessarily complex, not to mention a pain to maintain and debug. There’s also the nuisance of having to write a custom error message for each situation as well, and most of the time developers tend to just avoid doing that altogether, making it difficult to troubleshoot problems when they occur on a remote system.

To get around these problems developers often use macros with gotos. Here’s one technique that I’ve seen:

    1 OSStatus initKeychainAccess()
    2 {
    3     OSStatus err;
    4     err = SecKeychainSetUserInteractionAllowed(TRUE);
    5     require_noerr(err, fail_label);
    6     err = SecKeychainUnlock(gKeychain, 0, NULL, FALSE);
    7     require_noerr(err, fail_label);
    8     err = SecKeychainAddCallback(MyKeychainCallback, kSecEveryEventMask, NULL);
    9     require_noerr(err, fail_label);
   10     doThatFancyThingYouDo();
   11 fail_label:
   12     return err;
   13 }

Now that’s certainly an improvement: we went from 29 lines down to 13, and the code is much more readable. That’s pretty good, but I think we can do better. Here’s my version:

    1 OSStatus initKeychainAccess()
    2 {
    3     OSStatus err;
    4     DO_FAILABLE(err, SecKeychainSetUserInteractionAllowed, TRUE);
    5     DO_FAILABLE(err, SecKeychainUnlock, gKeychain, 0, NULL, FALSE);
    6     DO_FAILABLE(err, SecKeychainAddCallback, MyKeychainCallback, kSecEveryEventMask, NULL);
    7     doThatFancyThingYouDo();
    8 fail_label:
    9     return err;
   10 }

In the event of an error, you’ll get all of the important information that you need to pinpoint exactly what happened: the function that failed, the error code, and the line number. Here are the definitions for DO_FAILABLE and a few variants thereof:

#define DO_FAILABLE(_errVar, _func, args...) do { \
    if ( (_errVar = _func(args)) != 0 ) { \
        log_err(#_func ":%d returned: %d\n", __LINE__, (int)_errVar); \
        goto fail_label; \
    } \
} while (0)

// useful when the error code isn't the return value
// ex: DO_FAILABLE_SUB(err, errno, setuid, getuid());
#define DO_FAILABLE_SUB(_errVar, _subst, _func, args...) do { \
    if ( (_errVar = _func(args)) != 0 ) { \
        _errVar = _subst; \
        log_err(#_func ":%d resulted in: %d\n", __LINE__, (int)_errVar); \
        goto fail_label; \
    } \
} while (0)

// just note that an error occurred, but don't do anything about it
#define FAILABLE(_errVar, _func, args...) do { \
    if ( (_errVar = _func(args)) != 0 ) { \
        log_err(#_func ":%d returned: %d\n", __LINE__, (int)_errVar); \
    } \
} while (0)

This cuts down on the number of lines of code by implicitly assuming that fail_label exists (more often than not, a single fail label is enough). Also note the cast to int. It’s necessary because sometimes your error value might be stored in a type that will cause gcc to give a warning (e.g. if you have it enabled via -Wall) because of a mismatch between the type and the %d in the printf statement.
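Here is a small self-contained program fragment that exercises two of the macros, in case you want to watch them work before touching real code. A few things in it are assumptions: log_err is never defined in this post, so the one below just forwards to stderr; step, run_steps, enter_directory and demo are made-up names; and the macros are written with the standard C99 __VA_ARGS__ form instead of the GNU-style args... used above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical logger: stands in for whatever log_err really is. */
#define log_err(...) fprintf(stderr, __VA_ARGS__)

#define DO_FAILABLE(_errVar, _func, ...) do { \
    if ( (_errVar = _func(__VA_ARGS__)) != 0 ) { \
        log_err(#_func ":%d returned: %d\n", __LINE__, (int)_errVar); \
        goto fail_label; \
    } \
} while (0)

#define DO_FAILABLE_SUB(_errVar, _subst, _func, ...) do { \
    if ( (_errVar = _func(__VA_ARGS__)) != 0 ) { \
        _errVar = _subst; \
        log_err(#_func ":%d resulted in: %d\n", __LINE__, (int)_errVar); \
        goto fail_label; \
    } \
} while (0)

/* Stand-in for a real API call: "fails" with whatever code it is given. */
int step(int code)
{
    return code;
}

/* Runs two steps and counts how many completed before any failure. */
int run_steps(int second_step_code, int *steps_done)
{
    int err;
    *steps_done = 0;
    DO_FAILABLE(err, step, 0);
    ++*steps_done;
    DO_FAILABLE(err, step, second_step_code);
    ++*steps_done;
fail_label:
    return err;
}

/* chdir returns -1 on failure, so the real error code comes from errno. */
int enter_directory(const char *path)
{
    int err;
    DO_FAILABLE_SUB(err, errno, chdir, path);
fail_label:
    return err;
}

/* Usage: returns normally if the macros behave as described above. */
void demo(void)
{
    int done;
    assert(run_steps(0, &done) == 0 && done == 2);  /* both steps ran */
    assert(run_steps(7, &done) == 7 && done == 1);  /* bailed at step two */
    assert(enter_directory("/no-such-dir-for-this-example") == ENOENT);
}
```

Each failing call writes one line to stderr naming the function, the line number, and the error code, which is exactly the information you'd otherwise have to type out by hand for every check.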

Another nice thing about these macros is that they’re very easy to adopt. Oftentimes you can throw them into your code via a regex find&replace: put DO_FAILABLE( in front of the error assignment, convert the equals sign into a comma, and replace the first open parenthesis in the function call with another comma. Then add a fail_label somewhere.

The real fun comes afterward, when you get to delete hundreds of lines of unnecessary error-checking code. :-)