We Must Be Doing Something Right

Yesterday, while Elliot, Audrey and I were walking through DFW[1], Elliot, who is 4 years old, asked me, “Where are we going?”

“Concourse C, but I am not sure which gate yet,”[2] I said.

“Daddy, is it C dot com or C and some other letters?” Elliot asked in reply.

  1. Which, I must say, is really a pretty nice airport these days. I have not been there since late last century, when I was traveling a lot for work. At that time, it was the airport in the US which I loathed the most.

  2. A two-hour layover has its disadvantages, but when one adult is traveling with two children it is well worth it.

Caffeine Howto

What the world really needs is more of this sort of science blogging: Caffeine: A User’s Guide to Getting Optimally Wired.

Concurrency and System Architecture

Mr Dekorte's take on concurrency in shared memory systems:

If you’re looking for languages or concurrency tools that will scale to the high core count desktop machines of the near future, I wouldn’t put stock in MISD oriented solutions such as transactional memory or elaborate functional programming compiler techniques. Shared memory systems simply won’t survive the exponential rise in core counts.

He is right: what we have now is not going to scale in the long run. I am not sure we will see much change on the ground any time soon, though. People, and industries, have a strong inclination to hang onto the status quo, even when there are better alternatives available. On the other hand, I would not be surprised if the future is largely populated by virtual shared memory systems running on top of physical MIMD machines.

RESTful Service Discovery and Description

There has been a great deal of discussion regarding RESTful web service description languages this week. The debate is great for the community, but I think Steve Vinoski has it basically right:

never once — not even once — have I seen anyone develop a consuming application without relying on some form of human-oriented documentation for the service being consumed

When you start writing an application that makes use of some services, you are not writing some sort of generic web services consumer. You are writing a consumer of one very specific web service, and the semantics of a service, as with everything else, turn out to be a lot more complicated, subtle and interesting than the syntax.

Human-oriented documentation is necessary because only humans can understand the really interesting parts of a service description. Based on my experience, it also seems to be sufficient. Sure, we could all jump on the full-fledged service description language bandwagon, but I don’t think that service consumers would get much, if any, value out of it.[1]

Discoverability

Discoverability is the most important capability that interface definition languages bring to the table. However, most service description languages provide discoverability almost as a side effect, rather than it being their primary purpose.

I think it would be better to promote discoverability by working on a more focused capabilities publishing mechanism. To that end, I want to describe what my team has done on this front. It is not entirely suitable for general use, but useful standards often emerge from extracting the common parts of many bespoke solutions.

First, I want to be clear about the terminology I am using, just to make sure we all understand one another:

service
A cohesive set of resources that are exposed via HTTP.
resource
A single entity which exposes a RESTful interface via HTTP.
service provider
A process or set of processes that implement one or more services.
container
Another name for a service provider.

Background

We needed the ability to discover the actual URIs of resources at runtime from very early in our project because of our basic architecture. Our system is composed of at least four services[2]. The containers that provide these services may be deployed in various (and arbitrary) ways. Maintaining the list of top level resources of other services in configuration files became unmanageable long before we ever actually deployed the system in production.

We needed a way for any component in the system to discover the URIs of resources exposed by other components in the system. We handled this by providing a system-wide registry of all the services that are available, and a description resource for each service that provides links to the resources contained within that service.

Service Description

Containers that provide a service are responsible for exposing a “service description” for that service.

A service description is a resource that provides links to all the top level resources in the service. Currently we support just one type of representation (format) for service descriptions, a JSON format that looks like this:

{
  "_type":        "ServiceDescriptor",
  "service_type": "http://mydomain.example/services/something-interesting",
  "resources": [
    {
      "_type": "ResourceDescriptor",
      "name":  "OpenIdProvider",
      "href":  "http://core.ssbe.example/openid"
    },
    {
      "_type": "ResourceDescriptor",
      "name":  "AllAccounts",
      "href":  "http://core.ssbe.example/accounts"
    }
  ]
}

service_type is the globally unique name for the type of service that is being described. It should be a URI that is owned by the creator of the service. Each top level resource that is exposed as part of this service has a resource descriptor in the resources set.

If you wanted to know about all the accounts of the system, you would:

  1. GET the service descriptor resource
  2. iterate over the resources collection until you found the AllAccounts resource descriptor
  3. GET the URI found in the href pair of the resource descriptor (http://core.ssbe.example/accounts in this example)
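
A minimal sketch of steps 2 and 3 in Python, assuming the descriptor has already been fetched and parsed. The helper name find_resource_href is made up for illustration; the document is the example shown above.

```python
# Sketch only: find_resource_href is a hypothetical helper name.
def find_resource_href(service_descriptor, name):
    """Return the href of the resource descriptor with the given name."""
    for resource in service_descriptor["resources"]:
        if resource["name"] == name:
            return resource["href"]
    # Raise rather than return None so a missing resource cannot be
    # mistaken for a real value.
    raise LookupError(f"no resource named {name!r} in service descriptor")


# The example service descriptor from above, as a parsed JSON document.
descriptor = {
    "_type": "ServiceDescriptor",
    "service_type": "http://mydomain.example/services/something-interesting",
    "resources": [
        {"_type": "ResourceDescriptor", "name": "OpenIdProvider",
         "href": "http://core.ssbe.example/openid"},
        {"_type": "ResourceDescriptor", "name": "AllAccounts",
         "href": "http://core.ssbe.example/accounts"},
    ],
}

# This is the URI a client would GET next.
accounts_uri = find_resource_href(descriptor, "AllAccounts")
```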

One important thing to note is that each resource descriptor refers to exactly one resource, and not to a type of resource. If you are looking for a particular account, you have to get the AllAccounts collection and find the account you are looking for in that set.

Capabilities

The Capabilities resource is the only well known entry point for our system. If a program wants to interact with our system, it always starts with the capabilities resource and then works its way down, using the links in documents, to the resource it actually cares about.

The JSON representation we support looks like this:

{  
  "_type": "SystemCapabilities",
  "services": [
    {
      "_type":        "ServiceDescriptor",
      "href":         "http://alarm.ssbe.example/service_descriptors/escalations",
      "service_type": "http://mydomain.example/services/something-interesting"
    }
  ]
}

To discover the URI of a particular top level resource, a consumer must:

  1. GET the capabilities document
  2. iterate through the objects in services until it finds the one with the correct service_type
  3. GET the full service descriptor using the URI in the href pair
  4. iterate over the resources until it finds the one with the correct name
  5. extract the URI from its href pair
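
The five steps above can be sketched in Python roughly as follows. This is an illustration, not a definitive client: the function names are hypothetical, and the fetch parameter is an assumption added so the lookup can be exercised against canned documents instead of a live system.

```python
import json
from urllib.request import urlopen


def fetch_json(uri):
    """GET a URI and parse the body as JSON (assumes the server returns JSON)."""
    with urlopen(uri) as response:
        return json.load(response)


def discover_resource_uri(capabilities_uri, service_type, resource_name,
                          fetch=fetch_json):
    """Follow capabilities -> service descriptor -> resource descriptor."""
    capabilities = fetch(capabilities_uri)                # step 1
    for service in capabilities["services"]:              # step 2
        if service["service_type"] != service_type:
            continue
        descriptor = fetch(service["href"])               # step 3
        for resource in descriptor["resources"]:          # step 4
            if resource["name"] == resource_name:
                return resource["href"]                   # step 5
    raise LookupError(
        f"no resource {resource_name!r} in a service of type {service_type!r}")
```

Passing a dictionary's lookup method as fetch (for example fetch=canned_documents.__getitem__) is enough to try the walk without any HTTP at all.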

Services are registered with the capabilities resource, by POSTing a service description to it, when the containers that provide those services are started.

Issues

No supported methods or format information

This approach only provides a way to discover the URIs of top level resources. It makes no attempt to describe the representations (formats) or methods those resources support. That sort of thing would not be hard to add, but so far I have had absolutely no need for it. That information is provided by the human-oriented documentation, and since it does not change from one deployment to the next there is no need for it to be included in the dynamic resource discovery mechanism.

Non-top level resources are not represented

Resources that are not top level (by which I mean resources that are not listed in a service description document) are not represented at all. This is a feature, really, but it makes extending this format to include method and data format information less compelling because only a relatively minor subset of the resources in the system are surfaced in the service descriptions.

Encourages large representations

The fact that only singleton resources are supported can lead to top level documents that are excessively large. In fact, we have already had to deal with this issue. We have basically punted on the issue but I think the correct approach would be to introduce a ResourceTypeDescriptor that would operate much like a ResourceDescriptor except that the link would be a URI template rather than a concrete URI.
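
To make that idea concrete, here is a hedged sketch of what such a descriptor and its expansion might look like. None of this exists in our system: the href_template field name, the Account entry and the expand_template helper are all invented for illustration, and the expansion handles only simple {name} placeholders rather than a full URI template syntax.

```python
import re
from urllib.parse import quote


def expand_template(template, **variables):
    """Expand simple {name} placeholders, percent-encoding each value."""
    def substitute(match):
        # A missing variable raises KeyError rather than leaving a hole.
        return quote(str(variables[match.group(1)]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)


# Hypothetical descriptor: one entry stands for a whole family of resources.
descriptor = {
    "_type": "ResourceTypeDescriptor",
    "name": "Account",
    "href_template": "http://core.ssbe.example/accounts/{account_id}",
}

account_uri = expand_template(descriptor["href_template"], account_id="42")
```

With something like this, a client looking for one account could build its URI directly instead of downloading the whole AllAccounts collection.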

  1. On the other hand, service providers might get some value. Something like WADL does give you a way to declaratively define a suite of regression tests. Then again, you might be better off using a tool specifically built for that purpose.

  2. That is the base number of services. Additional functionality is added to the system in the form of additional services, so the actual number of services varies based on what you need the system to do.

I ♥ DVCS

This week I worked on a project that uses Subversion and, man, what a difference a year makes. Back then I dreamed of being able to use Subversion instead of Perforce. Now using svn feels a bit like walking around waist deep in water.

I have been using Git almost exclusively for the last couple of months. I am now firmly convinced that distributed version control systems, such as Git and Mercurial, are the way of the future. The basic model of dVCSs matches real world software development much more cleanly than the model imposed by most (if not all) centralized VCSs.

Consider the scenario I ran into. I checked out an svn project and made my changes, then I svn up-ed and found that one of the files I had edited had been changed in a way that resulted in a merge conflict, and a fairly complicated one at that. I manually resolved the conflict, merging the files by hand, and then committed the merged file. But what if I had gotten it wrong? My version of the file was never stored, which means that after doing the svn resolved the merge can never be undone or fixed.

svn up is a branch merge which throws away the entire history of the target branch. Working directories are branches whether the VCS acknowledges it or not. Not acknowledging it simply results in branches in which the history is not tracked. A branch that does not track its history sounds silly. Because it is. Nonetheless, that is the model that is forced by most VCSs.

Distributed VCSs, on the other hand, treat your working directory as the branch it really is. And that makes all the difference in the world.

OpenID 2.0’s Killer Feature

The OpenID 2.0 spec has been finalized. On the surface, it does not seem to be very different from the 1.1 spec but it does include at least one sweet new feature. It provides protocol support for directed identity.

Directed identity is the concept of having a single identity that appears to be a different identity for every relying party (ie, an application that wants to verify your identity). The identity provider would, of course, understand that all these single use identities are really all part of the same identity. This would, theoretically, prevent unscrupulous people from building a profile about all the things you do online, because each website you visited would know you by a different identity. However, you could still log into all those websites using the same credentials since the identity provider would know that all those single use identities belong to you.

The change that was made to the OpenID protocol to support directed identity is brilliantly simple. It amounts to defining that if a provider receives a normal authentication request with a predefined URI[1] as the identity, then the request is a directed identity request. The provider then verifies the user’s credentials and responds with the appropriate identity for the relying party. OpenID provider responses always include the identity being verified, so it turned out to be a very minor change to support that in our provider.[2]
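
As a rough illustration of how small the change is on the wire, the sketch below builds the URL for such a request in Python. The parameter names come from the OpenID 2.0 authentication spec; the function name and the example return_to and realm values are assumptions, and discovery, associations and response verification are deliberately left out.

```python
from urllib.parse import urlencode

OPENID2_NS = "http://specs.openid.net/auth/2.0"
# The predefined URI that turns a normal request into a directed identity one.
IDENTIFIER_SELECT = "http://specs.openid.net/auth/2.0/identifier_select"


def directed_identity_request_url(provider_endpoint, return_to, realm):
    """Build a checkid_setup URL that lets the provider choose the identity."""
    params = {
        "openid.ns": OPENID2_NS,
        "openid.mode": "checkid_setup",
        "openid.claimed_id": IDENTIFIER_SELECT,
        "openid.identity": IDENTIFIER_SELECT,
        "openid.return_to": return_to,
        "openid.realm": realm,
    }
    separator = "&" if "?" in provider_endpoint else "?"
    return provider_endpoint + separator + urlencode(params)


# Hypothetical relying party; the provider URI is the example used earlier.
url = directed_identity_request_url(
    "http://core.ssbe.example/openid",
    return_to="http://reports.ssbe.example/openid/complete",
    realm="http://reports.ssbe.example/",
)
```

The only difference from an ordinary request is that both identity fields carry the identifier_select URI instead of something the user typed in.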

Now, just to be clear, I don’t actually have any use for directed identity in the applications on which I work. However, the protocol mechanism that supports directed identity can also be used to implement multi-application single sign-on. And that is a killer feature for the stuff on which I work.

For maintainability, we have divided our application into five[3] separate components. This works great as a way to keep the code simple, the architecture comprehensible and the system distributable. However, in the past, it left a bit to be desired with regards to the user experience because a user was forced to log in five different times just to use the different sections of the application. Well, actually, we have been using OpenID for a while so the user did not actually have to log in five times, but they did have to type in their user name five times. And that is not any better.

The directed identity support in OpenID 2.0 provides a solution to this repeated challenge problem. Each supplementary application discovers the trusted OpenID provider for the system and then performs a directed identity authentication request against that provider. The provider figures out who the user is and conveys that information to the supplementary application in the id_res response.

This means that once you have logged into any component of our system you will never be asked for your identity info again. We may make half a dozen requests to determine and verify your identity when you navigate to a component for the first time, but all that work is completely seamless from the user’s point of view. To the user it seems like just another page in the application.

  1. The predefined URI is <http://specs.openid.net/auth/2.0/identifier_select>.

  2. It did take a bit more work in the consumers, but that is because ruby-openid does not support directed identity authentication yet. It does not like the fact that the identity in the id_res response does not match the identity of the initial authentication request. However, a bit of consulting the source led me to a way of tricking it into accepting the responses even though they appear, on the surface, to be unrelated to the initial authentication requests.

  3. That is five so far. All new functionality is implemented as a new supplementary application so this number is on an ever increasing trajectory.

How REST Can Relieve Your (Lack of) Documentation Guilt

A couple of months ago we hired a contractor to write a reporting interface for our high volume monitoring system. Our system exposes all of its data in RESTful web services, and his job has been to take that data and allow users to create reports based on it.

This morning a couple of my teammates and I asked him if he thought our documentation was sufficient to allow supplementary applications, like the one he was finishing up, to be written without having direct access to the developers. He replied to this effect,

To be honest, I did not really look at the documentation. I just fetched the URLs you gave me, and the ones I found in those documents, and so on. It did not take me very long to get a pretty good idea of what kind of data was available and where.

That ability to understand a large system by simply exploring it is one of the most powerful features of RESTful architectures. But only if you are using all the precepts of REST, including that resources are represented by documents with links[1] (or, hypermedia is the engine of application state, if you prefer a more traditional phrasing).
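
That kind of exploration is easy to automate, too. The sketch below is only an illustration of the idea: it assumes JSON documents whose links live under "href" keys, and it takes a caller-supplied fetch function (a hypothetical parameter) that returns the parsed document for a URI.

```python
def find_links(document):
    """Yield every "href" string found anywhere in a parsed JSON document."""
    if isinstance(document, dict):
        for key, value in document.items():
            if key == "href" and isinstance(value, str):
                yield value
            else:
                yield from find_links(value)
    elif isinstance(document, list):
        for item in document:
            yield from find_links(item)


def explore(start_uri, fetch, limit=100):
    """Breadth-first walk from start_uri; returns {uri: parsed document}."""
    seen = {}
    queue = [start_uri]
    while queue and len(seen) < limit:
        uri = queue.pop(0)
        if uri in seen:
            continue  # already visited, so cycles cannot loop forever
        seen[uri] = fetch(uri)
        queue.extend(find_links(seen[uri]))
    return seen
```

The limit is there only so a sketch like this cannot wander through an entire production system by accident.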

A RESTful architecture will let you scale and distribute your application beyond all reasonable expectations. But even better, since you know that anyone who cares can just go exploring, it will also let you feel less guilty about not writing all that documentation that you never quite get around to.

  1. Hat tip to Stefan Tilkov for either reporting that Sanjiva Weerawarana used this phrase in his QCon 07 presentation, or for coining that phrase himself (I cannot tell for sure which it was).

The Sad State of Mobile Phones

The current crop of phones is depressingly lame. I am currently using a Samsung t519. It has served me well for pretty close to two years now. My two biggest complaints with it are that the phone allowed T-Mobile to permanently hijack the right soft button, and that it is, perhaps, a tiny bit too thick. That second complaint is particularly unfortunate given the fact that I believe it to be the thinnest phone ever available in the US.

A few days ago I dropped my phone. It still works but the case cracked, so I decided that it was probably time for a new phone. Two years is a pretty good run for a phone. So I moseyed over to my local T-Mobile kiosk to take a look at what is available. Man, does the selection suck. They are all bricks with absolutely no sense of style. Hell, my very first cell phone, the Nokia 2190 (in 1997), looked almost as good as most of the phones available today. And it was only a little bigger.

How is it that phones have gotten less interesting, and larger, in the last two years? I guess I am not the target market for phones being produced today. If someone were to produce a thin candy bar with a decent sense of style and a reasonable feature set I would definitely be interested. Until that happens I think I am just going to continue to use my two-year-old-but-still-better-than-anything-that-is-available-today phone until it actually stops working.

When To Use Exceptions

Marty Alchin recently posted about the “evils” of returning None (or nil or null depending on your language of choice). I think he has it basically right. Sure, there are situations where returning nil[1] is appropriate, but they are pretty rare. For example, if a method actually does what the client asked and there is nothing meaningful to return, then by all means return nil. If, however, the method was not actually able to do what the client asked, raising an exception is the appropriate thing to do. Unfortunately, I think that most methods that return nil in modern libraries and programs actually do so as a way to indicate a failure condition, and that is evil.

Cédric Beust responded to Mr Alchin saying basically, a) “problems caused by returning null are easy to debug” and b) “programmers are all knowing, about now and the future, so they can decide when to return nil and when to raise an exception.” (I am, as you might have guessed, taking significant liberties in my paraphrasing. You should go read Mr Beust’s post if you want to know what he actually said.)

Ease of debugging

On the debugging point I would say that Mr Beust is generally correct. It is usually the case that nil returns are fairly easy to debug. However, this is not always true. In fact, it is not uncommon, in my experience, to find a nil where it is not supposed to be but then to spend a fair bit of time tracking down where that value became nil. That time is usually spent walking my way up the call stack trying to find the subtle bug that results in a nil return from a method in only some odd situations.

That sort of debugging is not the end of the world, but it is annoying, particularly because it is so easily avoidable.

All knowing programmers

To be fair, Mr Beust did not actually say that he believes that programmers are all knowing. But he did describe a way of using exceptions that would only make sense if programmers were all knowing. From that, one might infer that he does believe that programmers are in fact omniscient. In my experience this misconception is fairly common in the Java community. (Actually, this mentality exists to varying degrees in most programming communities.) Java itself includes many decisions that seem only to make sense in the presence of this assumption. But I digress.

The statement to which I am referring is

Here is a scoop: exception should only be thrown for exceptional situations. Not finding a configuration value is not exceptional. Not finding a word in a document is not exceptional: it can happen, it’s even expected to happen, and it’s perfectly okay if it does.

My response to this is: Who exactly are you to decide that not finding a configuration value is a non-exceptional event for my application? Library programmers[2] should not be deciding what is and is not an “exceptional” event. That is a value judgement they cannot possibly make correctly, unless they understand every way that every program will make use of that library (or class) in perpetuity.

Exceptions are not about “exceptional situations”, whatever that means. They are a way for a method to tell its caller that it was not able to do what the caller asked it to do. If I say, “dictionary, give me the value associated with a particular key” and the key does not exist, the appropriate response is a NoSuchKey exception. Returning a nil in that situation is a lie. A lie with real, and negative, consequences. Consider this: I ask a dictionary for the value associated with a key and it returns nil. Does that mean the key does not exist, or that the key does exist and its value is nil? Those two are very different, but a nil returning method conflates them, requiring, at the very least, an additional method call to figure out which possibility is actually the case.
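
The conflation is easy to demonstrate. The snippet below uses Python, where None plays the role of nil; the settings dictionary and its keys are made up for the example.

```python
settings = {"proxy": None}  # the key exists and its value is None

# dict.get returns None in both cases, so the caller cannot tell them apart.
present_but_none = settings.get("proxy")
absent = settings.get("timeout")
assert present_but_none is None and absent is None

# Telling them apart takes an additional call...
proxy_is_set = "proxy" in settings
timeout_is_set = "timeout" in settings

# ...whereas plain indexing raises for the missing key and nothing else.
try:
    settings["timeout"]
    missing = False
except KeyError:
    missing = True
```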

If having methods actually inform you of their inability to perform the desired action is complicated or hard to understand, it is time to upgrade your language or write a better interface. For example, if catching an exception from a dictionary lookup is too complicated for the cases where you just want to use a default value, that exception catching and default value behavior could easily be put in a get_with_default(key, default_value) method that either returns the value from the dictionary or the default value. That would certainly be clearer than returning nil and having every consumer add an ugly if block after the get. Or you could switch to a language (such as Ruby) with a compact single line exception handling syntax.
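
Here is one way that interface could look, sketched in Python. The Dictionary class is a toy wrapper invented for the example; only the NoSuchKey and get_with_default names come from the discussion above.

```python
class NoSuchKey(KeyError):
    """Raised when a key is not present in the dictionary."""


class Dictionary:
    """Toy dictionary whose get() never uses None to signal a missing key."""

    def __init__(self, items=None):
        self._items = dict(items or {})

    def get(self, key):
        try:
            return self._items[key]
        except KeyError:
            # Tell the caller plainly that the request could not be satisfied.
            raise NoSuchKey(key) from None

    def get_with_default(self, key, default_value):
        # The try/except lives here once, not in every consumer.
        try:
            return self.get(key)
        except NoSuchKey:
            return default_value


config = Dictionary({"retries": 3, "proxy": None})
retries = config.get_with_default("retries", 1)    # key present
timeout = config.get_with_default("timeout", 30)   # key absent, default used
proxy = config.get_with_default("proxy", "none")   # present, stored value is None
```

A stored None still comes back as None, so the caller can always tell "no such key" apart from "the value is nil".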

Either way my advice is: Do not use nil as a way to indicate that the object was unable to perform the requested operation; that is the job of exceptions. If you see a method that returns nil, demand an affirmative defense for that behavior because it is often incorrect.

  1. nil is the Ruby equivalent of None in Python and null in Java. Since all other programming languages are but pale shadows in comparison to Ruby I shall henceforth be using nil to describe this concept.

  2. By library programmer I mean someone that is writing code that will be used by someone else at some point in the future. If that does not include you it is because a) you are not a programmer or b) you are writing completely unmaintainable spaghetti code.

Craftsmanship

My dad is a carpenter. He has always taken a great deal of pride in his craft. By the time I was old enough to work with him he was doing mostly finish work. I occasionally got to work with him when I was younger. But there were always lots of things I was not allowed to do. Mostly, as it turns out, because I would not have been able to do them very well. Now that I am older, I really appreciate the fact that my dad took enough pride in a job well done not to let me mess it up. And I am sure his customers appreciated it even more.

Based on my experiences with my dad, I had sort of assumed that attention to fit and finish was a part of doing “finish” work. Not so much, it turns out.

We are having our kitchen remodeled at the moment, and our new granite counter tops are not quite level and there are some spots that really should have epoxy that don’t. The new floor is beautiful, if you don’t look too close. However, if you do look close you will see that there is a lot of dirt, hair and who knows what else in the finish. Not to mention that most of the planks are only marginally smoother now that the floor is “done” than they were when they were first brought into our house.

We are, thankfully, nearing the end of our kitchen remodel. I will be glad when it is over, regardless of the outcome. I am, however, really disappointed by the workmanship that has gone into it. I find this extremely frustrating, and not just because I feel ripped off. Don’t these people want to do a good job and explore and refine their craft?

The answer is, obviously, no. I suppose that is because for most of them this is “just a job”. I have had quite a few different jobs in my life but none of them has been “just a job”. In fact, I have a hard time even imagining what that would be like.

It makes me sad to think of these people spending so much of their time doing something that is not even worth doing well. But mostly it makes me angry that they are doing it on my dime. Perhaps companies could just offer an “it won’t suck” upgrade to their normal bids and use real craftsmen for those jobs. I, for one, would be willing to pay a bit extra up front for a job well done.