This Program Has Been Tested: It Works Perfectly!

... or at least that's what I was told when I found this implementation of Hello World in C++:

#include <iostream.h>

main()
{
    cout << "Hello World!";
    return 0;
}

Stashed away at the bottom of the page in the section "Program Notes" is this helpful bit of information:

This program has been tested, it works perfectly without any errors or warnings.

... and Yet, GCC Complains

You may be wondering to yourself "What on Earth prompted them to include that little tidbit of information?" Even in C++, Hello World should not be too hard to get right:

aaron@athena:~/scratch$ g++ hello.cpp
In file included from /usr/include/c++/4.2/backward/iostream.h:31,
                 from hello.cpp:1:
/usr/include/c++/4.2/backward/backward_warning.h:32:2: warning:
    #warning This file includes at least one deprecated or antiquated
    header. Please consider using one of the 32 headers found in section
    17.4.1.2 of the C++ standard. Examples include substituting the <X>
    header for the <X.h> header for C++ includes, or <iostream> instead
    of the deprecated header <iostream.h>.
    To disable this warning use -Wno-deprecated.

... but GCC still complains. The specifics of what it's complaining about are elucidated nicely in this article, but the gist is that <iostream.h> refers to Bjarne Stroustrup's original implementation, which modern compiler distributions include (if they include it at all) only for compatibility reasons. <iostream> is the standardized version, which is better, faster, cheaper, more portable and should be used in preference to <iostream.h> whenever possible.

After switching to <iostream> and informing the compiler that we're using namespace std, GCC happily compiles our simple program, which then runs:

aaron@athena:~/scratch$ g++ hello.cpp
aaron@athena:~/scratch$ ./a.out
Hello World!aaron@athena:~/scratch$

For stylistic reasons, I'll add an endl at the end of the message, but that's not fatal. Unfortunately, I'm not quite done. As someone who thinks that my compiler might know a thing or two about code, I like to turn on all of the warnings that my compiler is able to give me. Anything that the compiler is worried enough to raise a warning over is probably worth looking into. Compiling with -Wall -pedantic shows us that GCC is still not happy:

aaron@athena:~/scratch$ g++ -Wall -pedantic hello.cpp
hello.cpp:5: error: ISO C++ forbids declaration of ‘main’ with no type

... easily enough fixed, just add an int return type to the main function.

All Fixed Up

After changing the iostream header, adding a namespace declaration, adding a return type and a newline (which is a surprising number of things for "Hello World!" when you think about it) we can now compile with -Wall and -pedantic and GCC won't emit so much as a peep:

#include <iostream>

using namespace std;

int main()
{
    cout << "Hello World!" << endl;
    return 0;
}

aaron@athena:~/scratch$ g++ -Wall -pedantic hello.cpp
aaron@athena:~/scratch$ ./a.out
Hello World!

The Moral of the Story

When your students send you email complaining about your implementation of "Hello World" emitting a slew of obscure errors when compiled, posting a note indicating that your program has been compiled and tested and is bug free is probably not the most productive thing you could do.

Reading Stack Overflow Makes Me Do Strange Things

When browsing Stack Overflow I occasionally find myself doing ridiculous (programming-related) things just because someone says something that my insane mind interprets as a challenge.

XML to JSON

For instance, after reading "There is no 'one-to-one' mapping between XML and JSON..." I thought to myself "Sure there is!"

JSON has much less meta-data (or baggage...) than XML generally rendering it more compact and, most importantly, much simpler. So:

<?xml version="1.0" ?>
<eveapi version="2">
  <currentTime>2009-01-25 15:03:27</currentTime>
  <result>
    <rowset columns="name,characterID,corporationName,corporationID" key="characterID" name="characters">
      <row characterID="999999" corporationID="999999" corporationName="filler data" name="someName"/>
    </rowset>
  </result>
  <cachedUntil>2009-01-25 15:04:55</cachedUntil>
</eveapi>

... could instead be written as:

{
    "time": "2009-01-25 15:03:27",
    "cachedUntil": "2009-01-25 15:04:55",
    "characters": [
        {
            "characterID": "999999",
            "corporationID": "999999",
            "corporationName": "filler data",
            "name": "someName"
        }
    ]
}

... it doesn't capture everything but it has all the data, it's easier to read and it is simple. Unfortunately, to achieve a 1-to-1 mapping from XML to JSON we need to store attributes, namespaces and other nasty XML stuff, but JSON nodes don't have metadata, they just have regular data.

You can, however, represent an Element Node (think DOM) using a JSON dictionary relatively easily. So:

<ant:property name="value" />

... becomes:

{
    "name": "property",
    "namespace-uri": "http://ant.apache.org/",
    "children": [],
    "attributes": [
        ["name", "value"]
    ]
}

Which means that our previous XML snippet could be represented as:

{
 "attributes": [["version", "2"]],
 "children": [{"attributes": [],
               "children": ["2009-01-25 15:03:27"],
               "name": "currentTime",
               "namespace-uri": ""},
              {"attributes": [],
               "children": [{"attributes": [["key", "characterID"],
                                            ["columns",
                                             "name,characterID,corporationName,corporationID"],
                                            ["name", "characters"]],
                             "children": [{"attributes": [["corporationName",
                                                           "filler data"],
                                                          ["corporationID",
                                                           "999999"],
                                                          ["characterID",
                                                           "999999"],
                                                          ["name",
                                                           "someName"]],
                                           "children": [],
                                           "name": "row",
                                           "namespace-uri": ""},
                                          ""],
                             "name": "rowset",
                             "namespace-uri": ""},
                            ""],
               "name": "result",
               "namespace-uri": ""},
              {"attributes": [],
               "children": ["2009-01-25 15:04:55"],
               "name": "cachedUntil",
               "namespace-uri": ""},
              ""],
 "name": "eveapi",
 "namespace-uri": ""
}

... and you thought XML was verbose. So you can represent an XML document in JSON, but you essentially have to store the parse tree. Welcome to Missed the Pointville, population: you (or me, as the case may be).
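A minimal Python sketch of that "store the parse tree" mapping, for the curious. It assumes the standard library's xml.etree.ElementTree; the function name element_to_dict is my own invention, and the xmlns:ant declaration is added so that the snippet parses on its own. Unlike the dump above, it drops whitespace-only text nodes, and namespaced attribute names would come out in ElementTree's "{uri}name" form.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an ElementTree element into the dictionary shape shown above."""
    # ElementTree spells a namespaced tag as "{uri}localname".
    if elem.tag.startswith("{"):
        uri, name = elem.tag[1:].split("}", 1)
    else:
        uri, name = "", elem.tag

    children = []
    # Text before the first child element.
    if elem.text and elem.text.strip():
        children.append(elem.text)
    for child in elem:
        children.append(element_to_dict(child))
        # Text that follows a child element, still inside this element.
        if child.tail and child.tail.strip():
            children.append(child.tail)

    return {
        "name": name,
        "namespace-uri": uri,
        "attributes": [[key, value] for key, value in elem.attrib.items()],
        "children": children,
    }


# The namespace declaration is an assumption made so the example is well-formed.
doc = ET.fromstring(
    '<ant:property xmlns:ant="http://ant.apache.org/" name="value" />'
)
print(json.dumps(element_to_dict(doc), indent=4))
```

Even for a one-element document the output is several times longer than the XML it came from, which is rather the point.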

I guess the statement stands.

What the Date?

Today's Code Snippet of the Day (CSOD) from The Daily WTF shows how not to validate a date. Inspired by boredom and the knowledge that I could do it shorter and better, I set about writing my own date parsing/validation routines as a form of Code Kata.

In Python, Take I

A first crack, written in Python:

import datetime
import re

# DD/MM/YYYY: exactly two, two and four digits separated by slashes
date_pattern = re.compile(r'^(?P<day>\d\d)/(?P<month>\d\d)/(?P<year>\d\d\d\d)$')

def parse_date(input):
    if not date_pattern.match(input):
        raise ValueError("'%s' is not in DD/MM/YYYY format" % input)

    day, month, year = map(int, input.split('/'))
    d = datetime.date(year, month, day)  # raises ValueError for impossible dates
    if d > datetime.date.today():
        raise ValueError("'%s' is in the future" % input)

    return d

  1. We validate the date against a regex so that we know what we're dealing with.
  2. We split up the input string and construct the date.
  3. We test that the date is not in the future, and return the result.

This implementation is better than the CSOD in a number of ways:

  1. It uses a regex to validate the format of the input string, which is so much faster, more expressive and more productive than writing our own validation code.
  2. We use the platform's built-in date object rather than storing and manipulating the year/month/day ourselves, which helps to avoid all kinds of silly bugs.

Unfortunately, we still parse and construct the date ourselves, duplicating functionality present in the standard library.

In Python, Take II

A second attempt, this time relying on strptime rather than parsing the string ourselves:

import datetime

def parse_date(input):
    d = datetime.datetime.strptime(input, "%d/%m/%Y").date()
    if d > datetime.date.today():
        raise ValueError("'%s' is in the future" % input)
    return d

This implementation is even better, as it relies on strptime to handle the parsing and validation. The only real code that we write is the test for whether the date is in the future, which is our own logic.
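To see what strptime actually accepts and rejects for this format string, here is a small sketch. The helper name is_valid and the sample strings are mine, chosen for illustration.

```python
import datetime


def is_valid(text):
    """Return True if text parses as a real calendar date in %d/%m/%Y form."""
    try:
        datetime.datetime.strptime(text, "%d/%m/%Y")
        return True
    except ValueError:
        return False


samples = [
    "02/12/2008",  # an ordinary date
    "29/02/2008",  # 2008 was a leap year
    "31/02/2008",  # February never has 31 days
    "2008-12-02",  # wrong layout entirely
    "2/12/2008",   # single-digit day
]
for sample in samples:
    print(sample, is_valid(sample))
```

One difference from the first take is worth knowing about: %d and %m also match single digits, so "2/12/2008" is accepted here even though the regex in the first implementation would reject it.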

In JavaScript

An implementation in JavaScript because the CSOD was submitted in JS. This is essentially a transcription of the first Python implementation, as none of the JS date parsing utilities seem to take a format string:

var date_pattern = new RegExp('^\\d\\d/\\d\\d/\\d\\d\\d\\d$');

function parse_date(input){
    if(!date_pattern.test(input)){
        alert("'" + input + "' does not conform to the dd/mm/yyyy format");
        return null;  // stop here rather than building a Date from bad input
    }

    var ordinals = input.split('/');
    // months are 0-indexed in the Date constructor, hence the - 1
    var d = new Date(ordinals[2], ordinals[1] - 1, ordinals[0]);
    if(d > new Date()){
        alert("'" + input + "' is in the future");
        return null;
    }

    return d;
}

It would be preferable to achieve the simplicity of the second Python implementation, but that would require writing (or including third-party) code comparable to strptime.

Aside

It is very, very strange that months are 0-indexed while day and year are not in the Date constructor:

js> new Date('12/02/2008')
Tue Dec 02 2008 00:00:00 GMT-0500 (EST)
js> new Date(2008, 12, 02)
Fri Jan 02 2009 00:00:00 GMT-0500 (EST)
js> new Date(2008, 11, 02)
Tue Dec 02 2008 00:00:00 GMT-0500 (EST)

Seriously, what's up with that?
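For contrast, a quick sketch of how Python's datetime.date treats the same values: months are 1-based, and an out-of-range month is an error rather than a silent roll-over into the next year. The helper name try_date is mine.

```python
import datetime


def try_date(year, month, day):
    """Return the date, or None if the values do not name a real day."""
    try:
        return datetime.date(year, month, day)
    except ValueError:
        return None


print(try_date(2008, 12, 2))  # December is month 12 here
print(try_date(2008, 13, 2))  # there is no month 13, so this is rejected
```

This is why the Python versions get range checking for free, while the JS transcription does not.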

Greenspun's Tenth Rule

Greenspun's Tenth Rule states:

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

Which is why it brought a smile to my face to see the following while digging through the PostgreSQL sources:

#define NIL  ((List *) NULL)

... and later on:

extern List *lcons(void *datum, List *list);

So they've got lists, now all they need is "processing".

Unfortunately, car, cdr, cadr, caadr, caaadr and their ilk were nowhere to be found.

Aside

For those of you who are a little confused, cons is used to construct a linked list in the various dialects of Lisp, while nil is generally the value of the next pointer at the end of such a linked list. The inference that can be drawn from the above facts is: whoever designed Postgres's linked list implementation probably modeled it on the linked lists provided in a Lisp environment.
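If a picture helps, here is a rough Python sketch of Lisp-style lists built from pairs. It illustrates the idea only and is not how PostgreSQL implements its List type; all of the names are the traditional Lisp ones, with None standing in for nil.

```python
NIL = None  # marks the end of a list, like nil in Lisp


def cons(head, tail):
    """Build a new list cell: one value plus a pointer to the rest."""
    return (head, tail)


def car(cell):
    """The value stored in the first cell."""
    return cell[0]


def cdr(cell):
    """Everything after the first cell."""
    return cell[1]


def to_python_list(cell):
    """Walk the chain of cells until NIL, collecting the values."""
    items = []
    while cell is not NIL:
        items.append(car(cell))
        cell = cdr(cell)
    return items


numbers = cons(1, cons(2, cons(3, NIL)))
print(to_python_list(numbers))
print(car(cdr(numbers)))  # what Lisp would call cadr
```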

With Statement (Redux)

My previous implementation of the with_statement in Ruby had one major flaw: it allowed only a single context manager per with block. While effective, it leads to cluttered code like the following:

with(open("hello.txt")){|f|
    with(open("bonjour.txt")){|g|
        puts f.read()
        puts g.read()
    }
}

Effective, but it isn't as clear as it could be and introduces more nesting than is really necessary. This, coupled with my desire to explore varargs methods and blocks in Ruby, has led me to a more robust implementation that can take an arbitrary number of managers as parameters.

def with(*managers, &block)
    def with_(managers, resources, block)
        if managers.size > 0  # Array#nitems was removed in Ruby 1.9
            begin
                resource = managers.delete_at(0)
                resources << resource.enter()
                with_(managers, resources, block)
            ensure
                resource.exit()
            end
        else
            block.call(*resources)
        end
    end

    with_(managers, [], block)
end

The preceding implementation is more complex than the original. In addition to introducing "star magic" [1], it relies on the recursive creation of begin-ensure blocks, which complicates exception handling. However, this implementation allows for such beautiful usage as:

with(open("hello.txt"), open("bonjour.txt")) {|f, g|
    puts f.read()
    puts g.read()
}

Should a manager's exit method raise an exception, that exception will supersede any exception raised within the block, though it will not interfere with exiting other managers.
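The nesting and exit order can be sketched in Python, where try/finally plays the role of begin/ensure. This is an illustration under assumptions rather than a port: the Manager class, its log and the with_ function are hypothetical, and unlike the Ruby version this sketch calls enter before the try, so a manager whose enter fails is not exited.

```python
class Manager:
    """Hypothetical manager that records its enter/exit calls in a shared log."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def enter(self):
        self.log.append("enter " + self.name)
        return self.name

    def exit(self):
        self.log.append("exit " + self.name)


def with_(managers, block, resources=()):
    """Enter each manager in turn, call block, then exit in reverse order."""
    if not managers:
        return block(*resources)
    first, rest = managers[0], managers[1:]
    resource = first.enter()
    try:
        # One try/finally per manager, created recursively.
        return with_(rest, block, resources + (resource,))
    finally:
        first.exit()


log = []


def failing_block(f, g):
    log.append("block %s %s" % (f, g))
    raise RuntimeError("boom")


try:
    with_([Manager("f", log), Manager("g", log)], failing_block)
except RuntimeError:
    log.append("caught")

print(log)
```

Both managers are exited, innermost first, before the exception from the block reaches the caller.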

An implementation and unit tests are available in a Mercurial repository. The implementation can be retrieved as follows:

hg clone http://hg.crimzon.ath.cx/with_statement

Both the implementation and unit tests are licensed under the MIT license.

[1] As it is called by pylint.