My Secret Life as a Spaghetti Coder
home | about | contact | privacy statement
Someone from our last houston.rb meetup asked me about contributing to open source software, so I wrote an email that talks about the little things I do. I thought others might find it helpful, so I'm also posting the email here. It is below:

It occurred to me I totally ignored your question about open source, so apologies for that, and here I go attempting to rectify it.

Here's basically what I do: More...

Hey! Why don't you make your life easier and subscribe to the full post or short blurb RSS feed? I'm so confident you'll love my smelly pasta plate wisdom that I'm offering a no-strings-attached, lifetime money back guarantee!

(or your favorite regex engine)

I have a feeling this post is going to go over like a ton of bricks.

The subject of regular languages, context free languages, and just formal language theory in general caught my eye today with a question about prime numbers. This was one of my favorite classes as an undergraduate, so I thought I'd join in the discussion.

Raganwald quotes Sam, who said
If you can provide a regular expression such that it actually matches the string representation of only prime (or only non-prime) integers, that would be pretty sweet. A proof that such a thing could not be created would be equally impressive.
(Sam also linked to a blog post that linked to another that constructed a regular expression to decide if the number of 1s in a string of 1s is not prime, or /^1?$|^(11+?)\1+$/.)

A couple of comments from people at Raganwald mentioned using the pumping lemma for regular languages to prove that such a thing was not possible.

Indeed, even determining if 1* is not prime (the regular expression above) can be shown not to be a regular language. Using the pumping lemma for regular languages:
  1. Let the language L = { 1i, i is a number greater than 0 and is not prime }
  2. Assume L is a regular language. Then, by the pumping lemma, there exists a number p >= 1 such that every string w in L of length p or greater can be decomposed into substrings w=xyz, where |y| (length of y) > 0, |xy| <= p, and for all i >= 0, xyiz is also in L.
  3. Choose a string, w from L whose length is greater than p.
  4. Since there are infinitely many prime numbers, we can find one greater than any p.
  5. Therefore, we can choose an i in w=xyiz such that repeating it some number of times will be a prime. We arrive at a contradiction, and note that because w cannot be pumped, L is not a regular language.
Another commenter mentioned that "Regular expressions these days can match any context-free language through the use of sub-expressions." Clearly, since our language L is not regular, but it is matched by a regex, we can see that today's regexes are more powerful than regular languages would allow them to be. But, is our regex even restricted enough to be a CFL?

A similar proof using the pumping lemma for CFLs would show that our language L is more powerful than even a CFL. (Don't let me slide here if that's not the case.)

Still, that doesn't tell us anything useful for the problem at hand - only that the regexen of (at least) Perl and Ruby are more powerful than CFLs. But how much more? If we want to prove that a regular expression (in the practical sense of the word) cannot take a string representation of a number, (e.g., "13" or "256") and determine if it is not prime (or prime), then we need to know how powerful regex are.

But I don't know where to start on that front. Any ideas?

Alternatively, if we want to prove that it can be done, we need only demonstrate so by coming up with the regex to do it. I'm not convinced it's possible, but I'm not convinced it's not possible either. Ideally, I'd like to find the formal definition of how powerful regex are, and prove at least that we don't know if the language is in that class or not. (The pumping lemmas, for example, are necessary but not sufficient to prove membership of L amongst the respective class of languages.)

Comments are appreciated. I'm sort of stuck at the moment, so I wanted to bounce these ideas out there to see if someone might bounce one back.

One cool and sunny winter day, a beautiful young woman named Kate Libby (a.k.a "Acid Burn") was writing some mean code. That code was part of a new system that would integrate aspects of the Organization's SharePoint site with its Active Directory and various databases. (Kate had since grown out of her hackerish ways.)

It was early in the morning when the piece of code had finally passed all the tests and was ready to deploy - just about two hours into the day. At this rate, Kate would be at the pub by noon for a couple of pints, and then off to play a game of pickup Football in the early afternoon down at the park by her office.

Kate hit the deploy button and started packing her things, getting ready to leave. Just a quick verification on the live site and off she'd be.

"Oh jeez, what's wrong now?" Kate asked herself. "Why can't anything just work?" All the unit tests had passed. All the integration tests had passed. All the acceptance tests had passed. All the tests-by-hand had passed. What was the difference between the test site and the live one?

The error messages were largely useless. The last time Kate was getting useless error messages, there was an issue with differences in credentials to AD, SharePoint, the database server, or any combination of them. Kate proceeded under that assumption - it had a relatively high degree of being correct.

Twelve hours into three rewrites (trying different strategies to do the same thing) and several hundred WTFs later, it finally hit her:

"OMFG," yelled Kate.

She noticed that it worked in Firefox, but not in Internet Explorer. After another round of WTFs and some time spent wondering why the different browsers caused a server error, and then some time wondering why SharePoint should error out on its own browser, she realized: The issue wasn't with her code at all - it never had been.

The form was supposed to be a normal form - nothing to do with ASP.NET at all. But ASP didn't care - it wanted to see EnableEventValidation in the @Page directive to let you submit your own form - but only through Internet Explorer.

Kate's story is a tragedy: the essential complexity of the problem took her a couple of hours, while the accidental complexity ate up twelve. It cost her a couple of pints and some good fun on the field.

Luckily, you can avoid twelve hours of useless re-work if you just learn the lesson from Kate Libby's horrific tale: isolate errors before you fix them. Otherwise, you might spend a ridiculous amount of time "fixing" the parts that already work.

In the past, I've asked a couple of times about how you design algorithms. Of course, I'm not talking about algorithms with obvious solutions or those which are already well-known.

Lately, I've been working on a project that I can't share too much about, where we're exploring what is, to my knowledge, mostly virgin territory. One thing I have learned that I can share, is something about algorithm design.

It seems obvious to me now, and may be to you, but before recently, I had never thought about it: you can use an iterative approach in designing new algorithms. In the past, I had thought the opposite - since as a unit, the algorithm would rarely have the ability to be broken into chunks for testing purposes, how could we break it into chunks for iteration purposes?

But with the latest one we've been working on, we were able to take an iterative approach in it's design.

The process went something like this:
  1. Decide what it is you need to accomplish - not necessarily how to accomplish it.
  2. Based on that, find the parameters to the algorithm that will vary in each invocation.
  3. Fix the parameters, and run through the algorithm. This will be it's simplest version.
  4. Go ahead and write it if you want - but be aware it will likely change dramatically between now and when you are finished.
  5. For each parameter p, vary it and go through the algorithm keeping the rest of the parameters fixed.
  6. When you've got that working, now you can start varying combinations of parameters. Always observe what your algorithm is outputting versus expected output, and fix it when it's wrong.
  7. Finally, allow all the parameters to vary as they would in working conditions.
  8. Clean it up, keeping its essence.
A final bit of advice: don't force looping when the code won't go there. The code I was working on could have had any number of nested loops, and I spent a long time trying to find the relationships behind the parameters and their depth in the loop structure, in an attempt to find a way to fix the number of loops (so we don't have to change the code each time we need another loop in the nest). It was quickly becoming a hard-to-understand mess.

Instead, take a few minutes to step back and look at it from a distance. Do you really need to be looping? In my case, not as much as I thought. Use recursion when the code wants to be recursive. (If you still really want to avoid recursion, you'll probably need to simulate it with your own stack - but why do all the extra work?)

That iterative approach to algorithm design worked for me. It turned what at first seemed like a daunting, complicated algorithm into just several lines of elegant, understandable code.

Do you have an approach to designing algorithms whose solutions you don't know (and to be clear - can't find via search)?

Something's been bothering me lately. It's nothing, really. ?, ?, null, nil, or whatever you want to call it. I think we've got it backwards in many cases. Many languages like to throw errors when you try to use nothing as if it were something else - even if it's nothing fancy.

I think a better default behavior would be to do nothing - at most log an error somewhere, or allow us a setting - just stop acting as if the world came to an end because I *gasp* tried to use null as if it were a normal value.

In fact, just because it's nothing, doesn't mean it can't be something. It is something - a concept at the minimum. And there's nothing stopping us from having an object that represents the concept of nothing.

Exceptions should be thrown when something exceptional happens. Maybe encountering a null value was at some time, exceptional. But in today's world of databases, form fields, and integrated disparate systems, you don't know where your data is at or where it's coming from - and encountering null is the rule, not the exception.

Expecting me to write paranoid code and add a check for null to avoid every branch of code where it might occur is ludicrous. There's no reason some sensible default behavior can't be chosen for null, and if I really need something exceptional to happen, I can check for it.

Really, aren't you sick of writing code like this:

string default = "";
if(form["field"] != null and boolFromDBSaysSetIt != null
  and boolFromDBSaysSetIt)
    default = form["field"];

when you could be writing code like this:

    default = form["field"];

I think this is especially horrid for conditional checks. When I write if(someCase) it's really just shorthand for if(someCase eq true). So why, when someCase is null or not a boolean should it cause an error? It's not true, so move on - don't throw an error.

Someone tell me I'm wrong. It feels like I should be wrong. But it feels worse to have the default path always be the one of most resistance.

What do you think?

When I was younger I was "an arrogant know-it-all prick" at one point in the "middle years" of my programming experience, as many of you know from the stories I often relate on this weblog.

The phrase "middle years" doesn't give us a frame of reference for my age though. For instance, if I were 50 years old right now, my "middle years" of programming may have been when I was in my thirties. That's not the case, and I want to give you that frame of reference: I'm 28 at the time of this writing. The middle years as I talked about them would have referred to my late teens to early twenties. Maybe even up to the the middle of my twenties.

Old or young? By most standards, that's young.

And I know a thing or two about being set in your ways. We can all see the laugh I have at myself with the title here being "My Secret Life as a Spaghetti Coder" and some of the stories I've told as well.

In fact, let me add to the wealth of stodginess, idiocy, and all around opposite-of-good-developerness here:

I once said I preferred Windows to Linux. While that's not a completely shocking statement, the reason behind it was: I said I preferred Windows because 14 year olds work on Linux. Not because of any experience I'd had with it, but because of my fear of learning it.

Like with operating systems, your ignorance does not make a programming language suck. Although I've been tempted to say .NET sucks because of my early troubles with it, I've refrained, admitted my ignorance, and asked for help removing it.

Because of my prior experience being unwilling to learn, I was quite interested when I read this:
When you are young, you don't have that sense of self to protect. You're driven by a need to find out who you are, to turn the pages of your biography and see how the story turns out. If people around you are doing something you don't understand, you assume the problem is your inexperience and you go to work trying to understand it.

But when you are old, when you know who you are, everything is different. When people around you are doing something you don't understand, you have no trouble at all explaining why they are assholes mistaken.

. . .

If you want a new idea, you have to silence your inner critic. Your sense of right and wrong, of smart and stupid works by comparing new ideas to what you already know. Your sense of what would be a good fit for you works by comparing new things to who you already are. To learn and grow, you must let go of you, you must be young again, you must accept that you don't understand and seek to understand rather than explaining why it doesn't make any sense.

In a couple of paragraphs, Reg sums up almost precisely some of what I've been thinking and writing about for the last several months. He's so close, but misses a fundamental point: the old and young parts are incidental.

My hypothesis is that the level of learning and idea absorption you can attain has little to do with age. Instead, it is influenced more by your perceived level of experience. Normally, age is highly correlated to experience - but it doesn't have to be. In my case, when I was younger I thought I knew everything. Now that I've aged, I came to the realization I know very little.

My conclusion is not that different from Reg's, and this is not some scientific experimental contest, so let me explain why I feel the difference is worth noting: If we blame our reluctance to try new things on age, we are dooming ourselves to think of it as some unchangeable, deterministic process. By thinking of it in terms of perception of experience, we admit to being able to control it with more ease. (My belief is that we have control over what and how we perceive things.)

In other words, we lose our ability to blame anyone but ourselves. That's a powerful motivator sometimes.

Thoughts? Disagreements? Please be kind enough to let me know.

A sequence alignment from using BLAST on the NCBI website In the field of bioinformatics, one way to measure similarities between two (or more) sequences of DNA is to perform sequence alignment:
"a way of arranging the primary sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences."
Think of it this way: you've got two random strands of DNA - how do you know where one starts and one begins? How do you know if they come from the same organism? A closely related pair? You might use sequence alignment to see how the two strands might line up in relation to each other - subsequences may indicate similar functionality, or conservation through evolution.

In "normal" programming terms, you've got a couple of strings and want to find out how you might align them so they they look as much like one another as possible.

There are plenty of ways to achieve that goal. Since we haven't done much programming on here lately, I thought it would be nice to focus on two very similar algorithms that do so: Needleman-Wunsch and Smith-Waterman.

The idea behind these two algorithms is that we have a scoring scheme we want to maximize as successive "matches" occur. One popular substitution matrix for scoring protein alignment is BLOSUM62 (and here's a good PDF describing how BLOSUM came about).

The particular scoring matrix you use will be determined by the goals you want to acheive. For our purposes, a simple matrix or two will suffice:

@substitution_matrix =
    [[" ", "a","c","g","t"],
     ["a",  1 , 0 , 0 , 0 ],
     ["c",  0 , 1 , 0 , 0 ],
     ["g",  0 , 0 , 1 , 0 ],
     ["t",  0 , 0 , 0 , 1 ]]

@substitution_matrix2 =
    [[" ", "s", "e", "n", "d", "a"],
     ["s",  4 ,  0 ,  1 ,  0 ,  1 ],
     ["e",  0 ,  5 ,  0 ,  2 , -1 ],                     
     ["n",  1 ,  0 ,  6 ,  1 , -2 ],                 
     ["d",  0 ,  2 ,  1 ,  6 , -2 ],           
     ["a",  1 , -1 , -2 , -2 ,  4 ]]

The first @substitution_matrix is fairly simplistic - give one point for each match, and ignore any mismatches or gaps introduced. In @substitution_matrix2 what score should be given if "s" is aligned with "a"? (One.) What if "d" is aligned with another "d"? (Six.) The substitution matrix is simply a table telling you how to score particular characters when they are in the same position in two different strings.

After you've determined a scoring scheme, the algorithm starts scoring each pairwise alignment, adding to or subtracting from the overall score to determine which alignment should be returned. It uses dynamic programming, storing calculations in a table to avoid re-computation, which allows it to reverse course after creating the table to find and return the best alignment.

It feels strange to implement this as a class, but I did it to make it clear how trivially easy it is to derive Smith-Waterman (SW) from Needleman-Wunsch (NW). One design that jumps out at me would be to have a SequenceAligner where you can choose which algorithm as a method to run - then SW could use a NW algorithm where min_score is passed as a parameter to the method. Perhaps you can think of something even better.

Anyway, here's the Ruby class that implements the Needleman-Wunsch algorithm.

class NeedlemanWunsch
    @min_score = nil
    def initialize(a, b, substitution_matrix, gap_penalty)
        @a = a
        @b = b
        # convert to array if a/b were strings
        @a = a.split("") if a.class == String
        @b = b.split("") if b.class == String
        @sm = substitution_matrix
        @gp = gap_penalty

    def get_best_alignment
        return extract_best_alignment_from_score_matrix
    def construct_score_matrix
        return if @score_matrix != nil #return if we've already calculated it
        traverse_score_matrix do |i, j|
            if i==0 && j==0
                @score_matrix[0][0] = 0
            elsif i==0 #if this is a gap penalty square
                @score_matrix[0][j] = j * @gp
            elsif j==0 #if this is a gap penalty square 
                @score_matrix[i][0] = i * @gp
                up = @score_matrix[i-1][j] + @gp
                left = @score_matrix[i][j-1] + @gp
                #@a and @b are off by 1 because we added cells for gaps in the matrix
                diag = @score_matrix[i-1][j-1] + s(@a[i-1], @b[j-1])
                max, how = diag, "D"
                max, how = up, "U" if up > max
                max, how = left, "L" if left > max
                @score_matrix[i][j] = max
                @score_matrix[i][j] = @min_score if @min_score != nil and max < @min_score
                @traceback_matrix[i][j] = how
    def extract_best_alignment_from_score_matrix
        i = @score_matrix.length-1
        j = @score_matrix[0].length-1
        left =
        top =
        while i > 0 && j > 0
            if @traceback_matrix[i][j] == "D"
                i -= 1
                j -= 1
            elsif @traceback_matrix[i][j] == "L"
                left.push "-"
                top.push @b[j-1]
                j -= 1
            elsif @traceback_matrix[i][j] == "U"
                left.push @a[i-1]
                top.push "-"
                i -= 1
                puts "something strange happened" #this shouldn't happen
        return left.join.upcase.reverse, top.join.upcase.reverse   

    def print_score_visualization

    def print_traceback_matrix

    def print_as_table(the_matrix)
        puts "a=" + @a.to_s
        puts "b=" + @b.to_s
        print "   "

        @b.each_index {|elem| print " " + @b[elem].to_s }

        puts ""
        traverse_score_matrix do |i, j|
            if j==0 and i > 0
                print @a[i-1]
            elsif j==0
                print " "
            print " " + the_matrix[i][j].to_s

            puts "" if j==the_matrix[i].length-1
    def traverse_score_matrix
        @score_matrix.each_index do |i|
            @score_matrix[i].each_index do |j|
                yield(i, j)
    def initialize_score_matrix
        @score_matrix =
        @traceback_matrix =
        @score_matrix.each_index do |i|
            @score_matrix[i] =
            @traceback_matrix[i] =
            @traceback_matrix[0].each_index {|j| @traceback_matrix[0][j] = "L" if j!=0 }
        @traceback_matrix.each_index {|k| @traceback_matrix[k][0] = "U" if k!=0 }
        @traceback_matrix[0][0] = "f"
    def s(a, b) #check the score for bases a. b being aligned
        for i in 0..(@sm.length-1)
            break if a.downcase == @sm[i][0].downcase

        for j in 0..(@sm.length-1)
            break if b.downcase == @sm[0][j].downcase
        return @sm[i][j]

Needleman-Wunsch follows that path, and finds the best global alignment possible. Smith-Waterman truncates all negative scores to 0, with the idea being that as the alignment score gets smaller, the local alignment has come to an end. Thus, it's best to view it as a matrix, perhaps with some coloring to help you visualize the local alignments.

All we really need to get Smith-Waterman from our implementation of Needleman-Wunsch above is this:

class SmithWaterman < NeedlemanWunsch
    def initialize(a, b, substitution_matrix, gap_penalty)
        @min_score = 0
        super(a, b, substitution_matrix, gap_penalty)

However, it would be nice to be able to get a visualization matrix. This matrix should be able to use windows of pairs instead of each and every pair, since there can be thousands or millions or billions of base pairs we're aligning. Let's add a couple of methods to that effect:

#modify array class to include extract_submatrix method
class Array
  def extract_submatrix(row_range, col_range)

require 'needleman-wunsch'
class SmithWaterman < NeedlemanWunsch
    def initialize(a, b, substitution_matrix, gap_penalty)
        @min_score = 0
        super(a, b, substitution_matrix, gap_penalty)
    def print_score_visualization(window_size=nil)
        return super() if window_size == nil
        #score_matrix base indexes
        si = 1
        #windowed_matrix indexes
        wi = 0
        windowed_matrix = initialize_windowed_matrix(window_size)
        #compute the windows
        while (si < @score_matrix.length)
            sj = 1
            wj = 0
            imax = si + window_size-1
            imax = @score_matrix.length-1 if imax >= @score_matrix.length
            while (sj < @score_matrix[0].length)
                jmax = sj + window_size-1
                jmax = @score_matrix[0].length-1 if jmax >= @score_matrix[0].length
                current_window = @score_matrix.extract_submatrix(si..imax, sj..jmax)
                current_window_score = 0
                current_window.flatten.each {|elem| current_window_score += elem}
                    windowed_matrix[wi][wj] = current_window_score   
                wj += 1
                sj += window_size
            wi += 1
            si += window_size
        #find max score of windowed_matrix
        max_score = 0
        windowed_matrix.flatten.each{|elem| max_score = elem if elem > max_score}

        max_score += 1 #so the max normalized score will be 9 and line up properly 
        #normalize the windowed matrix to have scores 0-9 relative to percent of max_score
        windowed_matrix.each_index do |i|
            windowed_matrix[i].each_index do |j|
                normalized_score = windowed_matrix[i][j].to_f / max_score * 10

                windowed_matrix[i][j] = normalized_score.to_i

        #print the windowed matrix
        windowed_matrix.each_index do |i|
            windowed_matrix[i].each_index do |j|
                print windowed_matrix[i][j].to_s
    def initialize_windowed_matrix(window_size)
        windowed_matrix =
        windowed_matrix.each_index do |i|
            windowed_matrix[i] =
        return windowed_matrix

And now we'll try it out. First, we take two sequences and perform a DNA dotplot analysis on them:

Local sequence alignment performed by DotPlot application

Then, we can take our own visualization, do a search and replace to colorize the results by score, and have a look:

Local sequence alignment performed by the version of Smith-Waterman presented here

Lo and behold, they look quite similar!

I understand the algorithms are a bit complex and particularly well explained, so I invite questions about them in particular. As always, comments and (constructive) criticisms are encouraged as well.

If you get too smart, you start to think a lot. And when you think a lot, your mind explores the depths of some scary places. If you're not careful, your head could explode.

In The Butterfly Effect, Ashton Kutcher's head could have exploded from the number of choices he could make when exploring the depths of his brain.

So to combat the effects of increasing intelligence due to reading books like The Mythical Man Month and Code Complete, I'm careful about maintaining a subscription to digg/programming in my feed reader. Incidentally, this tactic is also useful in preemptive head explosion. However, this second type of explosion is usually caused by asininity, as opposed to the combinatorial explosion due to choices you gain from reading something useful.

This man's head exploded, and had to be put back together with duct tape.

Reading digg/programming is how I came across Ohloh.

As Matt Marshall writes in VentureBeat,
Ohloh, a company that ranks the nation's top open source coders, is opening its service to let other developers to track and rank their own teams. [Strong emphasis is mine.]

It's the latest move by Ohloh, a Bellevue, WA company that already distributes its coder profiles and related data to about 5,000 open source sites. The Ohloh profiles can serve as advertising for these sites, because the profiles show how active their open source development projects are.

Here's how it works. Ohloh ranks individual coders by tracking their activity. Ohloh can do this because open source projects publish their code, along with a record of updates each coder makes. Ohloh exploits this publicly available information and analyzes which coders are the most active in making key contributions to the most important open source projects. It assigns them a "KudoRank" to each coder between 1 (poor) through 10 (best).

Teams now have access to Ohcount - "a source code line counter" that "identifies source code files in most common programming languages, and prepares total counts of code and comments."

Unfortunately, since Ohcount helps power the normal Ohloh website, I'd bet it can track commits and lines of code by committer.

As is well known to many people, if you want something done, measure it. In this case, presumably you want more lines of code.

O RLY? Are you sure about that?

The trouble is that lines of code is not a relevant indicator of productivity, so tracking it per developer may not tell you what you think it's going to tell you.

And what makes measuring lines of code per developer (and saying more == better) completely stupid is that program size is code's worst enemy. You'll end up doing the opposite of what you intended.

Still, Ohloh has some interesting stats for you to look at. And you know you want to be ranked #1.


Is there a perfect way to teach programming to would-be programmers? Let's ask the Magic 8-Ball.

8-ball says, 'Outlook not so good'

Outlook not so good.

Does that mean we shouldn't teach them? Of course not. Does it mean we shouldn't look for better methods of teaching them? Emphatically I say again, "of course not!"

And what of the learner? Should beginners seek to increase their level of skill?

Only if they want to become a level 20 Spaghetti Code Slingmancer (can you imagine the mess?). Or, that's how some make it seem.

Spaghetti Slingmancer
Photo from Ken Stein found via Spaghetti cables @ Geek2Live

All it means to me is that we shouldn't let our paranoia about the wrong ways of learning stop us from doing so. For instance, take this passage about the pitfalls of reading source code:
Source code is devoid of context. It's simply a miscellaneous block of instructions, often riddled with a fair bit of implicit assumptions about preconditions, postconditions, and where that code will fit in to the grand scheme of the original author's project. Lacking that information, one can't be sure that the code even does what the author wanted it to do! An experienced developer may be able to apply his insight and knowledge to the code and divine some utility from it ("code scavenging" is waxing in popularity and legitimacy, after all), but a beginner can't do that.
Josh Petrie, "Don't Read Source Code"

Josh also mentions that source code often lacks rationale behind bad code or what might be considered stupid decisions, and that copy and paste is no way to learn.

They're all valid points, but the conclusion is wrong.

Which one of us learned the craft without having read source code as a beginner? Even the author admits that he was taught that way:
Self-learning is what drives the desire to turn to source code as an educational conduit. I have no particular problem with self-learning -- I was entirely self-taught for almost three quarters of what would have been my high school career. But there are well-known dangers to that path, most notably the challenge of selecting appropriate sources of knowledge for a discipline when you are rather ill-informed about that selfsame discipline. The process must be undertaken with care. Pure source code offers no advantages and so many pitfalls that it is simply never a good choice.
This is a common method of teaching - "do as I say, not as I do." It's how we teach beginners anything, because their simple minds cannot grasp all the possible combinations of choices which lead to the actual Right Way to do something. It's a fine way to teach.

But I'd wager that for all X in S = {good programmers}, X started out as a beginner reading source code from other people. And X probably stumbled through the same growing pains we all stumble through, and wrote the same crapcode we all do.

Of course, there are many more bad programmers than good, so lets not make another wrong conclusion - that any method of learning will invariably produce good programmers.

Instead, let's acknowledge that programming is difficult as compared to many other pursuits, and that there's not going to be a perfect way to learn. Let's acknowledge that those who will become good programmers will do so with encouragement and constant learning. Instead of telling them how they should learn, let them learn in the ways that interest them, and let's guide them with the more beneficial ways when they are open to it.

Let's remember that learning is good, encourage it, and direct it when we can. But let people make mistakes. Learning in the wrong manner will produce good programmers, bad programmers, and mediocre ones. Independent, orthogonal, and irrelevant are all words that come to mind. The worst it will do is temporarily delay someone from reaching their desired level of skill.

I would be knowledgeable having read programming books with no practical experience. But I wouldn't have any understanding. Making mistakes is fundamental to understanding. Without doing so, we create a bunch of angry monkeys, all of whom "know" that taking the banana is "wrong," but none of whom know why.

What do you think?

Several days ago I wrote about common excuses people use for commenting code and things you can do to get rid of them. I took the stance that rarely is there something so complex that it really requires comments.

In other words, much of the time, comments are crutches for understanding bad code.

To show a counter-example where comments were needed, Peter Farrell posted a snippet of code that was hard to understand:

<cfloop condition="patArr[patIdxEnd] NEQ '*' AND strIdxStart LTE strIdxEnd">
  <cfset ch = patArr[patIdxEnd] />
  <cfif ch NEQ '?'>
    <cfif ch NEQ strArr[strIdxEnd]>
      <cfreturn false />
  <cfset patIdxEnd = patIdxEnd - 1 />
  <cfset strIdxEnd = strIdxEnd - 1 />

Indeed, that code is hard to understand, and comments would clear it up. And I'm not trying to pick on Peter (the code is certainly not something I'd be unlikely to write), but there are other ways to clear up the intent, which the clues of str, pat, * and ? indicate may have something to do with regular expressions. (I'll ignore the question of re-implementing the wheel for now.)

For example, even though pat, str, idx, ch, and arr are often programmer shorthand for pattern, string, index, character, and array respectively, I'd probably spell them out. In particular, str and array are often used to indicate data types, and for this example, the data type is not of secondary importance. Instead, because of the primary importance of the data type, I'd opt to spell them out.

Another way to increase the clarity of the code is to wrap this code in an appropriately named function. It appears as if it was extracted from one as there is a return statement, so including a descriptive function name is not unreasonable, and would do wonders for understandability.

But the most important ways in which the code could be improved have to do with the magic strings and boolean expressions. We might ask several questions (and I did, in a follow-up comment to Peter's):
  1. Why are we stopping when patArr[patIdxEnd] EQ '*' OR strIdxStart GT strIdxEnd?
  2. Why are we returning false when ch=="?" and ch!=strArr[strIdxEnd]?
  3. What is the significance of * and ?
In regular expression syntax, a subexpression followed by * tells the engine to find zero or more occurrences of the subexpression. So, we might put it in a variable named zeroOrMore, and set currentPatternToken = patArr[patIdxEnd]. We might also set outOfBounds = strIdxStart GT strIdxEnd, which would mean we continue looping when currentPatternToken NEQ zeroOrMore AND NOT outOfBounds.

Similarly, you could name '?' by putting it in a variable that explains its significance.

And finally, it would be helpful to further condense the continue/stop conditions into variable names descriptive of their purpose.

In the end, regular expression engines may indeed be one of those few applications that are complex enough to warrant using comments to explain what's going on. But if I was already aware of what this piece of code's intent was, I could also have easily cleared it up using the code itself. Of course it is near impossible to do after the fact, but I think I've shown how it might be done if one had that knowledge before-hand.

What do you think?

Your boss gave you three weeks to work on a project, along with his expectations about what should be done during that time. You started the job a week before this assignment, and now is your chance to prove you're not incompetent.

You're a busy programmer, and you know it will only take a couple of days to finish anyway, so you put it on the back-burner for a couple of weeks.

Today is the day before you're supposed to present your work. You've spent the last three days dealing with technical problems related to the project. There's no time to ask anyone for help and expect a reply.

Tonight is going to be hell night.

And you still won't get it done.

What can you do to recover? Embrace failure. Here's how I recently constructed (an anonymized) email along those lines:
  1. Take responsibility. Don't put the blame on things that are out of your control. It's a poor excuse, it sounds lame, and it affords you no respect. Instead, take responsibility, even if it's not totally your fault. If you can't think of an honest way to blame yourself, I'd go so far as to make something up.
    I've been having some technical troubles getting the support application to work with the project.

    To compound that problem, instead of starting immediately and spreading my work across several days, I combined all my work this week into the last three days, so when I ran into the technical problems, I had very little time to react.

    After trying to make the support application run on various platforms, I finally asked a teammate about it, where I learned that I needed to use a specific computer, where I did not have access.

    As such, I don't think I can meet your expectations about how much of the project should be done by tomorrow.
  2. State how you expect to avoid the mistake in the future. Admitting your mistake is not good enough. You need to explain what you learned from the experience, and how that lesson will keep you from making a similar mistake in the future.
    I just wanted to say that I take responsibility for this mistake and in the future, I will start sooner, which will give me an opportunity to receive the feedback I need when problems arise. I've learned that I cannot act as a one man team and I by starting sooner I can utilize my teammates' expertise.
  3. Explain your plan to rectify the situation. If you don't have a plan for fixing your mistake, you'll leave the affected people wondering when they can expect progress, or if they can trust you to make progress at all. Be specific with what you intend to do and when you will have it done, and any help you'll need.
    I already sent an email request to technical support requesting access to the specific computer, and await a response.

    In the mean time, here's how I expect to fix my mistake:

    a) I need to run the support program on data I already have. It will analyze the data and return it in a format I can use in the next process. I can have this completed as soon as I have access to the machine, plus the time it takes to run.

    b) I need to learn how to assemble another source of data from its parts. I have an article in-hand that explains the process and I am told we have another support program that will be available next week. I do have the option to write my own "quick and dirty" assembler, and I will look into that, but I do not yet know the scope.

    c) I need to use another one of our tools on the two sets of data to get be able to analyze them. Assuming I am mostly over the technical problems, I wouldn't expect this to cause any more significant delay.

    d) Finally, I'm unsure of how to finish the last part of the project (which is not expected for this release). If possible, I'd like to get feedback on how to proceed at the next meeting with our group.
After that, close the email with a reiteration that it was your fault, you learned from it, you won't let it happen again, and that it will be resolved soon.

Since I rarely make mistakes, I'm certainly no expert at how to handle them. Therefore, I pose the question to you all, the experts: How would you handle big mistakes? What strategies have worked (or failed) for you in the past?

Don't be afraid to make connections with other programmers, even if you might consider them a "rockstar." Those connections can make you a much better software developer. That's the point Chad Fowler makes in this week's chapter of MJWTI.

After relating the concept to the music scene (with which at one time I was also familiar), Chad (not surprisingly) sums up the matter in a few well-chosen words:
The most serious barrier between us mortals and the people we admire is our own fear. Associating with smart, well-connected people who can teach you things or help find you work is possible the best way to improve yourself, but a lot of us are afraid to try. Being part of a tight-knit professional community is how musicians, artists, and other craftspeople have stayed strong and evolved their respective artforms for years. The gurus are the supernodes in the social and professional network. All it takes to make the connection is a little less humility.
One of the reasons I started blogging was to make connections with other programmers. I enjoy it. Before I started reaching out to my fellow code-enthusiasts, I sucked. I still suck (don't we all?), but I suck progressively less each day. Part of the reason I'm on the road to Notsuckington can be attributed to the connections I've made with all of you.

Some of you taught me better design. To argue better. To write clearly, in code and prose. The value of being a wishy-washy flip-flopper.

I'm a wishy-washy flip-flopping programmer

Some of you helped me find flaws in my own work. Some helped correct them. The list could literally continue much further. However, in the interest of not publicly proclaiming myself a leech, I'll stop there.

Boo! Are you scared? Am I a zombie who wants to feed on your brain?

Lucy Liu photoshopped to be a zombie
From a contest @

Ok, so I am a zombie who wants to feed on your brain. Luckily, it's not a zero-sum proposition. You can retain your thoughts, knowledge, and memories while also sharing them with me.

Feel free to drop me a line any time. You might be able to teach me something, even if you're asking a question. I might be able to teach you something too. I won't be offended at you contacting me if you won't be offended if I don't always have the time to respond.

Let's travel to Notsuckington together.

Have you any stories of how connections with other programmers have made you better? Please, share them with the rest of us! (You can leave the names out, if you want.)

It's been said and repeated and retweeted:
Code is read much more often than it is written, so plan accordingly
I think that's sound advice. But is there ever a time when you ought to use cryptic one-letter variable names and strange symbols in your code?

If we're admonished to write code so that it's easier to be read, then I think, yes, depending on your intended audience, there are times when it's OK.

For example, if you work in a science lab and are implementing a math theorem, everyone on the team knows how to read and probably prefers the concise notation. Would that not be a good time to break the no-single-letter-variable-name rule?


I'm playing with an idea and I want to see your ideas of the craziest code that perhaps ought not be.

If you've got a moment, please help me by filling out the questionnaire or share it on Twitter.

Thanks for your help!

I just read an excerpt from @avdi's new alpha Confident Ruby ebook and it prompted some thoughts:

In the article, he talks about dealing with an account_balance where you iterate over the transactions of the account and sum up their amounts to arrive at a final balance.

A special case arrives when he points out you're dealing with a transaction.type whose value is "pending". You clearly don't want to include this in the account_balance because when the transaction processor introduces a new transaction of "authorized" for the same purchase, your overall balance will be incorrect.

A lot of the code I see (and used to write) looks like Avdi's example:

def account_balance
  cached_transactions.reduce(starting_balance) do |balance, transaction|
    if transaction.type == "pending"
      balance + transaction.amount

It cannot be stressed enough how important the advice is to go from code like that to introducing a new object. In my experience, many cases are solved by simply introducing an "you need", to: "support"), but Avdi advocates going further than that, and introducing a new object entirely.

I'm a fan of that, but typically I'll wait until YAGNI is satisfied, like when I need a method call with parameters.

Doing that is a huge win. As Avdi points out, it
solves the immediate problem of a special type of transaction, without duplicating logic for that special case all throughout the codebase
But for me, the second benefit he mentions is the biggest, and I hope he'll revisit its importance over and over again:
But not only that, it is exemplary: it sets a good example for code that follows. When, inevitably, another special case transaction type turns up, whoever is tasked with dealing with it will see this class and be guided towards representing the new case as a distinct type of object.
It's something I mentioned in my How to avoid becoming a formerly-employed Rails developer standing in line at the OOP Kitchen presentation, and I'll continue to stress its importance: "it's always easier to go with the flow, even when the flow is taking you through the sewers." So if you can introduce a little example of how to avoid the sewers, do it.

Anyway, I bought the book (which is literally just an introduction right now) based on the strength of that article and Avdi's commitment in his blog post. The article I linked to on Practicing Ruby is not yet in the book, but I hope it makes its way in.

I really enjoy the style of what I've read so far as a narrative, and if the article is any indication, this will be better than Objects on Rails (which I loved). One bit of feedback though: I'd like to see a "Key Takeaway" at the end of every section, so it can double as a quick reference book when I need to remind myself of its lessons.

Just two years ago, I was beyond skeptical towards the forces telling me that comments are worse-than-useless, self-injuring blocks of unexecutable text in a program. I thought the idea was downright ludicrous. But as I've made an effort towards reaching this nirvana called "self-documenting code," I've noticed it's far more than a pipe dream.

The first thing you have to do is throw out this notion of gratuitously commenting for the sake of commenting that they teach you in school. There's no reason every line needs to be commented with some text that simply reiterates what the line does.

After that, we can examine some seemingly rational excuses people often use to comment their code:
  • The code is not readable without comments. Or, when someone (possibly myself) revisits the code, the comments will make it clear as to what the code does. The code makes it clear what the code does. In almost all cases, you can choose better variable names and keep all code in a method at the same level of abstraction to make is easy to read without comments.

  • We want to keep track of who changed what and when it was changed. Version control does this quite well (along with a ton of other benefits), and it only takes a few minutes to set up. Besides, does this ever work? (And how would you know?)

  • I wanted to keep a commented-out section of code there in case I need it again. Again, version control systems will keep the code in a prior revision for you - just go back and find it if you ever need it again. Unless you're commenting out the code temporarily to verify some behavior (or debug), I don't buy into this either. If it stays commented out, just remove it.

  • The code too complex to understand without comments. I used to think this case was a lot more common than it really is. But truthfully, it is extremely rare. Your code is probably just bad, and hard to understand. Re-write it so that's no longer the case.

  • Markers to easily find sections of code. I'll admit that sometimes I still do this. But I'm not proud of it. What's keeping us from making our files, classes, and functions more cohesive (and thus, likely to be smaller)? IDEs normally provide easy navigation to classes and methods, so there's really no need to scan for comments to identify an area you want to work in. Just keep the logical sections of your code small and cohesive, and you won't need these clutterful comments.

  • Natural language is easier to read than code. But it's not as precise. Besides, you're a programmer, you ought not have trouble reading programs. If you do, it's likely you haven't made it simple enough, and what you really think is that the code is too complex to understand without comments.

There are only four situations I can think of at the moment where I need to comment code:
  1. In the styles of Javadoc, RubyDoc, et cetera for documenting APIs others will use.
  2. In the off chance it really is that complex: For example, on a bioinformatics DNA search function that took 5 weeks to formulate and write out. That's how rare it is to have something complex enough to warrant comments.
  3. TODOs, which should be the exception, not the rule
  4. Explaining why the most obvious code wasn't written. (Design decisions)
In what other ways can you reduce clutter comments in your code? Or, if you prefer, feel free to tell me how I'm wrong. I often am, and I have a feeling this is one of those situations. What are some other reasons you comment your code?

Update: I forgot to mention that discussions on relatively recent blog posts by Brian Kotek (Don't Comment Your Code) and Ben Nadel (Not Commenting and The Tipping Point of Poor Programming) got the creative juices flowing in my brain for this post. Thanks for bringing it up, gentlemen.

Here's a 35 minute recording of the presentation which I gave to houstonrb on February 21, 2012. It is a practice run I did before the live presentation, so you won't get the discussion, but hopefully you'll find it useful anyway.

How to avoid becoming a formerly-employed Rails developer standing in line at the OOP Kitchen from Sammy Larbi on Vimeo.

You can find the slides here: Slides for the Rails OOP presentation

There is also reference to a project whose purpose is to eventually be a full-scale demonstration of the techniques: Project for the Rails OOP presentation

Let me know what you think in the comments below.

Updated to use HTML5 player at Vimeo.

Someone rip this idea apart:

When you write an if statement in one class that's using data from an object of a different class to make a decision, ask yourself:
  1. Would this code be more appropriate in the other object?
  2. Would it be better to introduce a new object, whose purpose it is to do this?
Thoughts appreciated.

You know when you see code like this:

class CompulsionsController < ApplicationController
    # ... standard actions above here
    def update
      if params[:obsessions].include?(ObsessionsTypes[:murdering_small_animals])
        redirect_to socio_path and return
      elsif params[:obsessions]
        redirect_to standard_obsessions_path and return
      # normal update for compulsions
      @compulsion = Compulsions.find(params[:id])
      # ... remainder of the standard actions below here

and the phrase "WTF were they thinking?" runs through your mind? More...

My vocabulary is failing me right now. What do you call it when a piece of code checks the type of an object before doing something to it?

Type Casing is the act of using case statements in a program to determine what to do with an object based on what type of object it is. It's an OO fail, often hoping to implement Multiple Dispatch. (See also Case Statements Considered Harmful)

Here are three passive-aggressive ways to feel like you're getting back at typecasers.


With a name like each_cons, I thought you were going to iterate through all the permutations of how I could construct a list you operated upon. For example, I thought

[1,2,3,4].each_cons do |x| # I did not notice the required argument
  puts x.inspect

would output: More...

It's a small step, but emcee-3PO can now identify the staves in an image of sheet music for my single test case of "My Darling Clementine." I need to include hundreds more test cases, and I plan to when I implement code to make the tests mark the sheet music with what emcee3po detected so I can visually inspect the accuracy.

Ichiro Fujinaga's "Optical Music Recognition using Projections" (PDF) explains the process in detail, but it turns out to be relatively simple.

To locate the staves:
  1. Do a y-projection on the image.
    A projection just reduces the number of dimensions in an image. In this case, we just take the number of dark-colored pixels in a row of the image. It's similar in theory to 3D projection, but instead of projecting three dimensions onto a plane, we're projecting a plane onto a line.

    I used a threshold of 50% to determine if a pixel was dark enough to include in the projection. So, if R+G+B < (FF+FF+FF) / 2, I count the pixel as dark.

  2. Find the local maxima.
    We want to find the places where the number of dark pixels in a row is highest - those will indicate the horizontal lines on the staff. To do that, we find all the places where the number of pixels stops growing and starts getting smaller -- or where the slope changes from positive to negative. To ignore noise, we set a threshold as Fujinaga suggests at the average of each row, so we don't include anything less than that in our collection of local maxima.

  3. Find the tightest groups of 5.
    We want to find all the places where 5 local maxima are the smallest distance apart, which should indicate the 5 lines in a staff. This part is accomplished by examining each 5-element window in the array of local maxima, and finding the one with the smallest distance between its points. Then you can remove all the windows that include any of those points, and continue until there are no more windows.

  4. Expand those indexes to collect the places where the notes fall outside the staff lines.
    I don't remember Fujinaga mentioning this in the paper I linked to above, but I'm thinking it must be in there. Essentially, since the local maxima get us only what's in between the 5 lines of the staff, we need to expand it a bit so we can get the notes that don't fall directly between the 5 lines. Right now, I've used 1/4 of the average of the rows in the projection, but I think it will need to be an even smaller threshold because I'm still not reliably getting all of the notes.
Up next: reading the notes on the staves. That's going to be cool.

Frequent changes and deprecations to technology you rely upon cause dependencies to break if you want to upgrade. In many cases, you'll find yourself hacking through someone else's code to try to fix it, because a new version has yet to be release (and probably never will). That can be a chore.

I get embarrassed when something I've recommended falls prey to this cycle. Backwards compatibility is one way to deal with the issue, but in the Rails world, few seem worried about it. It doesn't bother me that code and interface quality is a high priority, but it does cause extra work.

There's a trick you can use to help isolate the pain, decrease the work involved in keeping your app up to date, and improve your code in general. You've probably heard of it, but you might not be using it to help you out in this area: encapsulation.


emcee-3PO takes a text file of music, runs it through a Markov Model, and generates new music to be played with Archaeopteryx.

It's an idea I've had for a while, which _why day gave me a good excuse to start.

There's not much there yet -- the first sentence above tells you exactly what it does -- but I hope to add some features to it as time allows:
  • Read from sheet music
  • Use different instruments instead of simply playing the notes as written / make it symphonic
  • Choose probabilities in some smart way to provide to Arkx (right now it's just 100% play this note)
  • Mixing multiple songs
I just thought it'd be a fun exercise, but I bet there's some really cool practical stuff this could turn into.

Feel free to watch the repository if you're interested in seeing this develop.

More importantly, and use your imagination here: Do you have any thoughts about what something like this could do?

When you first introduce someone to source control, things often go smoothly until the first time they have to merge conflicting changes. Then they wonder,
What's the point of this? I thought it was supposed to be easy but it's a PITA.
Two responses on how to combat this immediately come to mind.

The first thing that can help in this situation is to ensure they aren't having to merge useless files. I'm thinking of files here like the CSS generated from SASS: they change frequently but the changes don't affect them one way or the other. (In this case, because the CSS will be regenerated). Another example is a user-specific project settings file.

Two strategies to avoid useless merging are to ignore files (do not have the repository track them) and to automatically use one version of a file or another. Each has it's place.

In the case of a file that needn't be in the repository to begin with -- things like Thumbs.db or .DS_Store -- you should obviously ignore them. In the cases where files should be in the repository, but where you know which version you want all the time, you should consider telling it to always choose one file over another.

If you're using git, .gitignore will help you ignore files, while .gitattributes will help you choose one file over another without specifying it every time. I only wanted to make you aware of this, and Pro Git explains it better than I could, so I'll let you read about how to do it over there.

Thanks to Markus Prinz who helped me find .gitattributes when I didn't know the term I was looking for.

So what's the second thing that helps a newcomer to source control overcome their hatred of merging conflicting changes?

Remind them the alternative is to consistently have their own work overwritten.

I was introduced to the concept of making estimates in story points instead of hours back in the Software Development Practices course when I was in grad school at UH (taught by professors Venkat and Jaspal Subhlok).

I became a fan of the practice, so when I started writing todoxy to manage my todo list, it was an easy decision to not assign units to the estimates part of the app. I didn't really think about it for a while, but recently a todoxy user asked me
What units does the "Estimate" field accept?

Yesterday I got sick of typing rake test and rake db:migrate and being told
You have already activated rake 0.9.2, but your Gemfile requires rake 0.8.7. Consider using bundle exec.
I know you should always run bundle exec, but my unconscious memory has not caught up with my conscious one on that aspect, so I always forget to run rake under bundle exec.

So I wondered aloud on twitter if I could just alias rake to bundle exec rake, but confine that setting to specific directories (with bash being my shell).

Turns out, it is possible with the help of another tool that Calvin Spealman pointed me towards: capn.


I have a job where in any given week I might be working on any one of 30 projects spread across a half dozen product lines. I freelance, sometimes with a single company, but I also work a lot through another company for several different customers. I have my personal projects too, of course, and then there's non-work type things like getting a haircut, building a garden, or changing the air filters around the house.


My list of things to do is too complex for me to keep in my head. It doesn't fit in a typical list because I might want to see what needs to be done, not just at work, or not just for a client, but also for different customers or projects, or both.

Furthermore, it doesn't quite fit into a project management tool either. I need something more flexible that lets me keep my professional and personal lists in the same place, and that gives me just a list when I need it, or some predictions and statistics and data when those things are appropriate.


So when I started on this project, I wanted something more than a todo list, but not as involved as a project management suite. More...

I don't remember what I thought the first time saw the title of this chapter ("Learn to Love Maintenance") from My Job Went To India. I felt sick though. The thought process probably went something like this:
Oh no. You mean I'm going to be stuck here so long I better learn to love it?

I've got it really bad - I have to maintain a bunch of the code I wrote. Mere mortals cannot comprehend how bad that is. When do I get to toss my refuse off to some other sorry excuse for a programmer?
Apparently that's a common response programmers have, given the prospect of enjoying an illustrious career as a an elephant dung-slinger.

But Chad Fowler (in the book) turns it around, saying that not much is expected of maintenance programmers. You just need to fix the occasional bug or add a small feature here or there. It really boils down to this:
[In a greenfield project,] when we don't have the constraints of bad legacy code and lack of funding to deal with, our managers and customers rightfully expect more from us. And, in project work, there is an expected business improvement. If we don't deliver it, we have failed. Since our companies are counting on these business improvements, they will often put tight reigns on what gets created, how, and by when. Suddenly, our creative playground starts to feel more like a military operation - our every move dictated from above.

But in maintenance mode, all we're expected to do is keep the software running smoothly and for as little money as possible. Nobody expects anything from the maintenance crew. Typically, if all is going well, customers will stay pretty hands-off with the daily management of the maintainers and their work. Fix bugs, implement small feature requests, and keep it running. That's all you have to do.
Moreover, after enough code is written, that new project isn't much different than your maintenance work. It's just missing the benefits.

Consequently, you've got a lot more freedom in maintenance to do as you will. Get creative. Spruce up the UI a little bit. Since you get to interact with your "customer" more often, "more people will know who you are, and you'll have the chance to build a larger base of advocates in your business."

On top of that, being responsible for the entire application, it's likely that "even without much effort, you will come to understand what the application actually does." In doing so, you're well on your way to becoming a domain expert.

As I've mentioned before in several of the "Save Your Job" series' posts, as of this writing, I'm working with a small company. So, not only am I a maintenance programmer, I'm a greenfield project programmer too. I've been known to wear more than one hat (As I'm sure many of you can say).

Because of that and the push to drive maintenance costs down - I don't get as many opportunities to get creative in maintenance as Chad suggests. That's a bit of a downer for me.

But he ends it on a motivational high note in the "Act on it!" section: Pick the most important aspect of your maintenance work and find a way to measure it. Make it a point to improve your performance in that area, and measure again. Continuously improve. It's more motivating than having the mindset laid out in the introduction to this post, and you'll likely raise a few eyebrows.

Suppose you want to write an algorithm that, when given a set of data points, will find an appropriate number of clusters within the data. In other words, you want to find the k for input to the k-means algorithm without having any a priori knowledge about the data. (Here is my own failed attempt at finding the k in k-means.)

def test_find_k_for_k_means_given_data_points()
  data_points = [1,2,3,9,10,11,20,21,22]
  k = find_k_for_k_means(data_points)
  assert(k==3, "find_k_for_k_means found the wrong k.")

The test above is a reasonable specification for what the algorithm should do. But take it further: can you actually design the algorithm by writing unit tests and making them pass?

I've previously expressed my doubt that TDD makes an effective approach to algorithm design. More recently, I alluded to a some optimism towards the same idea. Then, in a comment to the same post, Dat Chu asked about using unit tests in algorithm design, also referencing something that Ben Nadel had said about asserting in code-comments how the state of the algorithm should look at certain points.

That all led to this post, and me wanting to lay my thoughts out a little further.

In the general case, I agree with Dat that it would be better to have the executable tests/specs. But, what Ben has described sounds like a stronger version of what Steve McConnell called the pseudocode programming process in Code Complete 2, which can be useful in working your way through an algorithm.

Taking it to the next step, with executable asserts - the "Iterative Approach To Algorithm Design" post came out of a case similar to the one described at the top. Imagine you're coming up with something completely new to you (in fact, in our case, we think it is new to anyone), and you know what you want the results to be, but you're not quite sure how to transform the input to get them.

What good does it do me to have that test if I don't know how to make it pass? The unit test is useful for testing the entire unit (algorithm), but not as helpful for testing the bits in between.

Now, you could potentially break the algorithm into pieces - but if you're working through it for the first time, it's unlikely you'll see those breaking points up front. When you do see them, you can write a test if you like. However, if it's not really a complete unit, then you'll probably end up throwing the test away.

Because of that, and the inability to notice the units until after you've created them, I like the simple assert statements as opposed to the tests, at least in this case.

When we tried solving Sudoku using TDD during a couple of meetings of the UH Code Dojo, we introduced a lot of methods I felt were artificially there, just to be able to test them. We also created an object where one might not have existed had we known a way to solve Sudoku through code to begin with.

Now, we could easily clean up the interface when we're done, but I don't really feel a compulsion to practice TDD when working on algorithms like I've outlined above. I will write several tests for them to make sure they work, but (at least today) I prefer to work through them without the hassle of writing tests for the subatomic particles that make up the unit.

When I try out a product, I like it to work. Sometimes, I like to tinker with things to gain a better understanding of how they work. Occasionally, I can manipulate them with skill. At other times, I'm tinkering in the true sense of the word.

I'm going to point out a few problems I've had with some products I've been using. I won't name names, but some of you who also use these products (or similar ones with the same problems) might understand what I'm talking about. Hopefully, you all can draw a good conclusion from it.

I'm not a mechanic, but sometimes I might want to look into my radiator. Is there a reason I need to disassemble the front of my car to unscrew the radiator cap? Likewise, diagnosing a problem with, and subsequently changing a starter or alternator isn't a task that only automobile doctors should perform. These are relatively high-failure parts that shouldn't require taking apart the exhaust system to replace. That's the tinkerer in me talking.

Picture of an engine
Image courtesy Daimler-Chrysler via How Stuff Works

I am a software developer, but sometimes I don't want to act like one when I work with software written by someone else. Don't make me build your product from source. I'm not asking you to build it for me, and then email it to me. I'm just asking if it's possible for you to set up an automated build on all of the platforms where you intend your software to work.

I enjoy the ability to tinker with open source software, not the requirement to do so.

The situation is even worse for proprietary programs. If you're lucky, there might be a way to mess with the settings that satisfies the tinkerer in you. But if you're not lucky, you could do things like upgrade your operating system and have it break something in 3rd party code. Then you'll be at the mercy of that vendor to make their software run properly with your upgrade. In the mean time, you're stuck manually starting the software, or writing your own scripts to do it.

In the first release of an accounting package, I might expect to edit a text file to set my tax rate. But I better not have to do that in the ninth version. Moreover, if your software makes certain assumptions about its running environment and those assumptions are parts that are likely to fail, you better make sure I can change them in your program without tearing it apart.

In the end, I understand that for early versions of products, you might not have worked out all the kinks or even found its purpose or audience. I expect that as an early adopter, and when I take that plunge, I enjoy it. But once you're several years old and multiple versions in, I don't think it's too much to have certain expectations. At that point, your software should just work and make it easy to make small adjustments.

When I work in Windows, I don't get as much done as when I'm in MacOS X.

It's not because MacOS is inherently better than Windows productivity-wise. It's because my calendar and time-boxing mechanism resides on MacOS. So when I've got an entire day of work to do in Windows, I don't have anything telling me "it's time to switch tasks."

Why is that a problem? That's the focus of this week's chapter in MJWTI. (Last week, I took a mini-vacation / early bachelor party to go fishing at Lake Calcasieu in Southwest Louisiana, so I didn't get around to posting then in the Save Your Job series.)

The "Eight-Hour Burn" is another of the chapters in Chad's book that really stuck with me after I first read it. The premise is that instead of working overtime, you should limit each day's work to an 8 hour period of intense activity. In doing so, you'll get more done than you otherwise would. Our brains don't function at as high a level as possible when we're tired. And when we're working on our fifth 60-hour week, we're probably tired.

We may love to brag about how productive we are with our all-nighters [paraphrasing Chad], but the reality is we can't be effective working too many hours over a long period of time.

A Brain Scan

And it's more than just our brains being tired that should prevent us from working too long. It's the fact that when we know we've got so much extra time to do something, we end up goofing off anyway:
Think about day 4 of the last 70-hour week you worked. No doubt, you were putting in a valiant effort. But, by day 4, you start to get lax with your time. It's 10:30 AM, and I know I'm going to be here for hours after everyone else goes home. I think I'll check out the latest technology news for a while. When you have too much time to work, your work time reduces significantly in value. If you have 70 hours available, each hour is less precious to you than when you have 40 hours available.
That's why I'm in trouble when I work in Windows all day. I do work 12-16 hours most days between job, school, and personal activity (like this blog). I get burnt out every few weeks and have to take a couple of days off, but when I'm in MacOS X, at least my working days are very productive: I've got each task time-boxed and any time I spend reading blogs or news or just getting lost on the web is always scheduled.

When I'm in Windows with nothing to remind me but myself, I drift off quite a bit more easily. After all, it's only 6:30 in the morning. I've still got eight hours to get everything done (I'm leaving early to go check the progress on the house I'm having built).

The good news is that I've got an item on my longer-term to-do list to remedy the situation. Hopefully after reading this chapter again, I'll be more motivated to get it done. The bad news is, since it means working in Windows all day to get it done, I'll probably be off doing my best to read all of Wikipedia instead.

Anyway, how much do you work? Do you have any tricks for staying fresh and motivated?

An interesting discussion going on over at programming reddit got me thinking about what features I'd like my ideal job to have. Aside from "an easy, overpaid job that will get bring me fame and fortune," here's what would make up my list:
  1. Interesting Work: You can only write so many CRUD applications before the tedium drives you insane, and you think to yourself, "there's got to be a better way." Even if you don't find a framework to assist you, you'll end up writing your own and copying ideas from the others. And even that only stays interesting until you've relieved yourself of the burden of repetition.

    Having interesting work is so important to me that I would contemplate a large reduction in income at the chance to have it (assuming my budget stays in tact or can be reworked to continue having a comfortable-enough lifestyle, and I can still put money away for retirement). Of course,the more interesting the work is, the more of an income reduction I could stand.

    Fortunately, interesting work often means harder work, so many times you needn't trade down in salary - or if you do, it may only be temporary.

    The opportunity to present at conferences and publish papers amplifies this attribute.

  2. Competent Coworkers: One thing I can't stand is incompetence. I don't expect everyone to be an expert, and I don't expect them to know about all the frameworks and languages and tricks of the trade.

    I do expect programmers to have basic problem solving skills, and the ability to learn how to fish.

  3. Sane Management: Don't make me delighted to have a broken watch. I shouldn't have to turn back the dial on it to meet deadlines. Or put more strongly, management/customers shouldn't be given the latitude to control all three sides of the iron triangle of software development.

    Scope, Schedule, and Resources. Choose two. We, the development team, get to control the third.

  4. Trust: One comment in the reddit discussion mentioned root access to my workstation and it made me think of trust. If you cannot trust me to administer my own computer, how can you trust me not to subvert the software I'm writing?

    Additionally, I need you to trust my opinion. When I recommend some course of action, I need to know that you have a good technical reason for refusing it, or a good business reason. I want an argument as to why I am wrong, or why my advice is ignored. If you've got me consistently using a hammer to drive marshmallows through a bag of corn chips, I'm not likely to stay content.

    I don't mind if you verify. In fact, I'd like that very much. It'll help keep me honest and vigilant.

  5. Personal Time: I like the idea of Google's 20% time. I have other projects I'd like to work on, and I'd like the chance to do that on the job. One thing I'd like to see though, is that the 20% time is enforced: every Friday you're supposed to work on a project other than your current project. It'd be nice to know I'm not looked down upon by management as selfish because I chose to work on my project instead of theirs.

    I wouldn't mind seeing something in writing about having a share of the profits it generates. I don't want to be working on this at home and have to worry about who owns the IP. And part of that should allow me to work on open source projects where the company won't retain any rights on the code I produce.

  6. Telecommuting: Some days I'll have errands to run, and I dislike doing them. I really dislike doing them on the weekends or in the evenings after work. I'd like to be able to work from home on these days, do my errands, and work late to make up for it. It saves the drive time on those days to help ensure I can get everything done.
There are some other nice-to-have's that didn't quite make the list above for me. For example, it would be nice to be able to travel on occasion. It would also be nice to have conferences, books, and extended learning in general paid for by the company. I'd like to see a personal budget for buying tools I need, along with quality tools to start with.

But if you can meet some agreeable combination of the six qualities above, I'll gladly provide these things myself. If you won't let me use my own equipment, and you provide me with crap, we may not have a deal.

That's my list. What's on yours?

When I wrote about things I'd like to have in a job, I didn't expect that one of the items on my list would draw the kind of reaction it did. A couple of comments seemed to think I'm off my rocker about personal project time:
Why should I pay for the time you spend doing your own projects? You are free to have 20% of the time to yourself if you can take 20% pay cut (from Adedeji O.)
Did I understand you correct? You want to work on something that has no relation to your employer and still get paid by him?
I don't know the exact rates in the US, but, these 20% will easily sum up to more than 10,000 USD per developer per year. Would YOU really pay that for nothing? (from Christoph S.)
To be fair, Christoph did say that he's "not talking about time for personal and technical development and not about work related projects that MAY include benefit for your employer. I'm talking about working on 'my projects' and 'open source projects.'"

Here's what I had said that provoked the reactions:
I like the idea of Google's 20% time. I have other projects I'd like to work on, and I'd like the chance to do that on the job. One thing I'd like to see though, is that the 20% time is enforced: every Friday you're supposed to work on a project other than your current project. It'd be nice to know I'm not looked down upon by management as selfish because I chose to work on my project instead of theirs.

I wouldn't mind seeing something in writing about having a share of the profits it generates. I don't want to be working on this at home and have to worry about who owns the IP. And part of that should allow me to work on open source projects where the company won't retain any rights on the code I produce.
Calling it "my project," talking about who owns the intellectual property, and working on open source appear to be where I crossed the line. At Google, we see things like News and GMail coming from the personal time. What I stated could mean that a programmer for a bank's website that's done in ColdFusion could end up working on a GUI for Linux in C. It's rather hard to make the connection there.

First, I didn't mean to imply that the company would not own anything of any part of what I worked on. If I am willing to take a 20% cut in pay, then certainly I wouldn't expect them to own anything. What I was looking for is some equitable way to share what I create with my time. For example, I might take one day a week at work to build my new widget, but if I'm taking 2 days on the weekend to work on it, I ought to get more than a "thank you" if the project goes on to make hundreds of millions of dollars.

And, I was just saying there needs to be some way to allow me to work on open source software during that time. I don't know how the details would work, but there surely is an equitable way to deal with it. I certainly could have explained it better.

In any case, I don't expect that 20% of my time spent away from my main project translates to the project taking 20% longer, or 20% less profit or revenue or productivity for my company. The degradation may be linear if we are data entry clerks - if all we are doing is typing instructions given to us from on high (if the work is physical). But that's not typically what programming is about.

I'm not running a company though, and I've not done a study about such companies. If I were to do such a study, that would be my hypothesis, and what follows is what I would expect to see.

The great divide between average a good programmers is well-known. Let us suppose that an average programmer is being paid sixty thousand dollars a year. Let us also suppose that you have a team of good programmers at the lower end of the "good" scale - those who are 10 times more productive than their average counterparts.

Are you paying them six hundred thousand dollars per year? I'd be surprised if anyone said yes to that question. You may pay them more than average programmers, but not an order of magnitude more. Even if they are only twice as productive as the average ones, you aren't likely to be paying them enough to worry about 20 percent of their time compared to what you would be paying less-competent people and getting out of them.

But what does that have to do with anything? Suppose I don't have good programmers - what then? If all of your programmers are average or worse, 20% time is not justified in their salaries the same way it is for a good programmer. So what benefits would we expect to see for companies who provide some amount of unrestricted free time?

First, I think the cases where a developer does something unrelated with his free time will be rare. But even if what I'm choosing to work on in my time is the polar opposite of the company's direction, does the company still benefit from it?

I think so. Here are some ideas:
  • The company can list projects it sponsors by paying developers to work on them. We've seen examples of companies hiring developers to work on open source projects. This has been especially prevalent in the Ruby community lately, where Sun hired the JRuby guys to work on JRuby, Microsoft hired John Lam to work on IronRuby, and Engine Yard hired Rubinius developers. Relevance uses a lot of open source software in their solutions, so to give back, they have Open Source Fridays.

    I would expect that allowing your developers that free time to work on anything would bring better developers your way. If I'm looking for a job and one company offers me unrestricted free time plus the same salary as you offer, who do you think will be my first choice?

  • If your work is generally not all that interesting, letting your programmers have a weekly break to work on things that are more interesting will help retain them and keep their minds fresh for the real work. As I've noted before, I almost quit the industry altogether when I got stuck in the boring-development cycle.

  • Working on open source software can make your existing developers better by interacting with other good developers and learning new things.

  • Your developers are learning things and keeping up with the newest technology, allowing you to react to changes in the market place quicker than your competitors. (You might not make iPhone applications at the moment, but if I spent a couple of Fridays digging into the iPhone SDK, you might already have an application ready to go.)

  • You might spin off new companies or departments based on products your employees are writing in their free time.
When I first read the reactions, I thought I did go overboard. But after some more consideration, I'm not so sure. Of course, we need a way to quantify and test these hypotheses to be sure what type of a positive or negative impact such a thing has on companies. But, I think it's very safe to say that going to a four day work week does not automatically mean we've lost 20% in one way or another. In the best of cases, it could mean gains in productivity if it allows you to attract better developers.

What do you think? I assume we can agree that a four day work week for thought workers would not necessarily translate into getting only 80% of the value they would normally output. I'm more interested in hearing what gains or drawbacks you can see in having employees take personal programming Fridays, or something like it.

For those who disagree with the idea entirely, what if we restrict it to open source projects the company utilizes in day-to-day work? What if we restrict it to projects the company can use as infrastructure or turn into products (as I understand Google's to work)? Would you feel better about it then?


I recently decided to cut the cord on cable and downloaded PlayOn to do the job of serving video content from the internet to my Xbox 360 and Wii, so I could watch it on my TV. Naturally, this led me to figure out how it worked, and how I could play around with it.

One thing I immediately wanted to do was improve the organization functionality. I thought I'd start with a plugin that simply takes the media server's location of video output and provides it as input back to the same server, but in a different organization scheme. Unfortunately, that didn't work as PlayOn didn't know what to do with its own URIs as input for a VideoResource in the Lua plugin API.

I didn't want to get into the scraping, transcoding, and serving of files from the internet -- that's already done well and I didn't want to spend all that time creating a media server of my own. But I did want to see better fastforward and rewind capability. To solve that, I thought I'd create a DVR for PlayOn (or TVersity, or any other media server really) and knock out both the organization features I wanted, along with the seek functionality.

Launching a business

This will be my first attempt at launching a money-making venture online (aside from programming for other people's money-making ventures). I don't expect this will turn into a full-time job, nor do I expect I'll make retirement money off of it, but I think it can make a decent return that won't require a ton of ongoing work on my part, and it might make a fun experiment.


I'm writing a Client of FooServer which is reliant upon a rather clunky library whose functionality I'd like to encapsulate in a wrapper library to make the usage less clunky.

My first thought is to choose FooServer as the name of what will likely be the most used class in the library. That way, when I want to get some property of the server or tell it to perform some action I can do something like:

FooServer foo = FooServer.findById(12);
string property =;

That seems innocent enough. But my fear is that by calling the class FooServer, it may violate the principle of least surprise because it does not fully implement a Foo server, nor would it be of use in implementing a Foo server.

I also dislike tagging it with "Wrapper" or something of the sort, because reading that in the client code doesn't seem right either.

I know I'm probably over-thinking it, but names are important to me, because they are the start of clean, readable code that does what it says and says what it does. So that's why I come to you, dear reader.

What would you name the class?

Rails Rumble has nothing on this.

Of course, you could just click the edit button in your database management studio of choice and achieve the same functionality.

SELECT DISTINCT 'script/generate scaffold ' + + ' ' + column_names
FROM sys.tables t
    SELECT + 
           case when max_length > 255 then ':text' else ':string' end + ' '
    FROM sys.columns c
    WHERE c.object_id = t.object_id
    ORDER BY c.column_id
    FOR XML PATH('') ) dummy_identifier ( column_names )

A similar discovery was made in the 1930's. One important difference to note is that, since my program does not simulate the input on it's output program, I am able to achieve speeds that are logarithmically faster than what Turing could accomplish.

It seems easy to add an if statement the first time you need to deal with a difference between legacy data and new data models. It's still easy the second and third times.

Certainly it's easier than transforming the legacy data to fit the new schema.

Induction doesn't hold though. At some point, it becomes a tangled mess of code that has to deal with too many conditions and your mental model of it turns into total disarray.

This is one case where laziness and instant gratification can steer you in the wrong direction. Avoid the spaghetti; just transform the old data like you should have in the first place.

I was first introduced to the concept that choice could be a bad thing back in 2007 at NFJS in Austin, during Scott Davis's keynote.

It might have been obvious to me before then, had I paid attention to my own thoughts and actions. Instead, it took a little prodding and the knowledge that others experienced the same things I did to get me thinking about it.

I was reintroduced to the concept while listening to an episode of my favorite podcast what might be the best radio show on the planet, titled "Choice." Taking it in while traveling nowhere on my elliptical machine, I realized I had to share the podcast and its lessons with the rest of you.

The theme weaved by the lessons is that choice is not all it's cracked up to be. For programmers, I think this is especially so.


When I was studying practice test questions for Exam 70-431 last week, the type of questions and answers I read led me to the thought that certifications attempt to commodify knowledge and use it in place of thought. More...

Through the last forty-plus weeks, we've explored looking at yourself as a product and service, different ways of positioning yourself in the programming market, ways to invest in yourself, how to execute decisions to be a better programmer, and several options you have for marketing yourself, guided by Chad Fowler's excellent book, My Job Went To India.

Today we'll start looking at the "Maintaining Your Edge" section of the book. More...

I put faith in web application development as an income source like I put faith in the United States Social Security system. That is to say, it's there now, but I don't expect to be able to rely on it in its current incarnation very far into the future.

James Maguire quotes Robert Dewar hitting the nail on the head: More...

I didn't expect many people to agree with me on the issue of web application development walking the plank.

Why? More...

Even though the practice of developing a specific piece of software is better enjoyed as a journey than as a goal, the same is not necessarily true when looking at your career as a whole.


Automatic download of libraries and packages
I'd like to see automatic downloads of libraries/packages from IDEs.

The idea came to me a couple of weeks ago on twitter.

What if our programming environment automatically checked for libraries?

Suppose I decide I want to use ActiveRecord, but I don't have it installed on my system. Wouldn't it be nice to just type


The Questions
Daniel Spiewak asks, "how do you apply polyglotism?"

Daniel mentions Google, "one of the most open-minded and developer friendly companies around," and points out that they have a strict limit in languages to use: Python, Java, C++, and JavaScript. He also says,
To my knowledge, this sort of policy is fairly common in the industry. Companies (particularly those employing consultants) seem to prefer to keep the technologies employed to a minimum, focusing on the least-common denominator so as to reduce the requirements for incoming developer skill sets.
We're afraid of being eaten by the poly-headed polyglot monster. More...

Some people call them fat pants. Some people call them stretch pants. Others might call them exercise pants or sweat pants. Whatever you call them, they're comfortable to wear. The problem with sweat pants is the reason they're comfortable is because they're big and expandable. And that expandability means they have a lot of room to accommodate growth as well. More...

If you want to trap a monkey, it's not very hard. Hollow out a hole (in a coconut, the ground, or whatever) just large enough for a monkey's hand to fit in when relaxed, but too small to pull out as a fist. Put some food in the hole, and wait. Soon enough, a monkey will come, fall in love with the food, grab at it and refuse to let go.

You see, monkeys value food higher than life or freedom, and their devotion to it will not allow them to let go. Or so the story of the south Indian monkey trap goes. (I am merely relating the parable, I have not actually tried to capture a monkey in this manner.) More...

Outsourcing is not going away. You can delude yourself with myths of poor quality and miscommunication all you want, but the fact remains that people are solving those problems and making outsourcing work.

As Chad Fowler points out in the intro to the section of MJWTI titled "If You Can't Beat 'Em", when a company decides to outsource - it's often a strategic decision after much deliberation. Few companies (at least responsible ones) would choose to outsource by the seat of their pants, and then change their minds later. (It may be the case that we'll see some reversal, and then more, and then less, ... , until an equilibrium is reached - this is still new territory for most people, I would think.)

Graph showing much more growth in IT outsourcing with statistics from 2006.
Chad explains the situation where he was sent to India to help improve the offshore team there:
If [the American team members] were so good, and the Indian team was so "green," why the hell couldn't they make the Indian team better? Why was it that, even with me in India helping, the U.S.-based software architects weren't making a dent in the collective skill level of the software developers in Bangalore?

The answer was obvious. They didn't want to. As much as they professed to want our software development practices to be sound, our code to be great, and our people to be stars, they didn't lift a finger to make it so.

These people's jobs weren't at risk. They were just resentful. They were holding out, waiting for the day they could say "I told you so," then come in and pick up after management's mess-making offshore excursions.

But that day didn't come. And it won't.
The world is becoming more "interconnected," and information and talent crosses borders easier than it has in the past. And it's not something unique to information technologists - though it may be the easiest to pull-off in that realm.

So while you lament that people are losing their jobs to cheap labor and then demand higher minimum wages, also keep in mind that you should be trying to do something about it. You're not going to reverse the outsourcing trend with any more success than record companies and movie studios are going to have stopping peer-to-peer file sharing.

That's right. In the fight over outsourcing, you, the high-paid programmer, are the big bad RIAA and those participating in the outsourcing are the Napsters. They may have succeeded in shutting down Napster, but in the fight against the idea of Napster, they've had as much strategic success as the War on Drugs (that is to say, very little, if any). Instead of fighting it, you need to find a way to accept it and profit from it - or at least work within the new parameters.

How can you work within the new parameters? One way is to "Manage 'Em." Chad describes several characteristics that you need to have to be successful with an offshore development team, which culminates in a "new kind" of PM:
What I've just described is a project manager. But it's a new kind of project manager with a new set of skills. It's a project manager who must act at a different level of intensity than the project managers of the past. This project manager needs to have strong organizational, functional, and technical skills to be successful. This project manager, unlike those on most onsite projects, is an absolutely essential role for the success of an offshore-developed project.

This project manager possesses a set of skills and innate abilities that are hard to come by and are in increasingly high demand.

It could be you.
Will it be?

Chad suggests learning to write "clear, complete functional and technical specifications," and knowing how to write use cases and use UML. These sorts of things aren't flavor-of-the-month with Agile Development, but in this context, Agile is going to be hard to implement "by the book."

Anyway, I'm interested in your thoughts on outsourcing, any insecurities you feel about it, and what you plan to do about them (if anything). (This is open invitation for outsourcers and outsourcees too!) You're of course welcome to say whatever else is on you mind.

So, what do you think?

Side Note: Posted Thursday instead of Friday because I'm off to Lone Star Ruby Conference in the morning. Hit me up on twitter or contact me if you're going to be there.

If we accept the notion that we need to figure out how to work with outsourcing because it's more likely to increase than decrease or stagnate, then it would be beneficial for us to become "Distributed Software Development Experts" (Fowler, pg 169).

To do that, you need to overcome challenges associated with non-colocated teams that exceed those experienced by teams who work in the same geographic location. Chad lists a few of them in this week's advice from My Job Went To India (I'm not quoting): More...

I was having trouble with dragging and dropping elements using Scriptaculous's dragdrop.js. Apparently, I'm not the only one. The problem was:
I have a div with overflow = auto, so when there is more content than the size of the div, scrollbars appear. My draggables and droppables are all elements inside of that div. Everything works fine when the scrollbar is scrolled all the way to the top, but when you scroll it down any amount, the draggables fail.

You might think that "tech support" is a solved problem. You're probably right. Someone has solved it and written down The General Procedures For Troubleshooting and How To Give Good Tech Support. However, surprisingly enough, not everyone has learned these lessons. And if the manual exists, I can't seem to find it so I can RTFthing.

The titles of the two unheard of holy books I mentioned above might seem at first glance to be different tales. After all, troubleshooting is a broad topic applicable to any kind of problem-solving from chemistry to mechanical engineering to computer and biological science. Tech support is the lowliest of Lowly Worms for top-of-the-food-chain programmers.


I've spent some time recently building a tool that makes my life a bit easier.

Finishing up a simple Rails log filter. Tired of tracking down sessions for support requests and it takes too long even for support kids.

I've browsed plenty of Rails log analyzers that help me find performance bottlenecks and potential improvements. But what I really need is a faster way to filter my logs to trace user sessions for support purposes. Maybe it's just me, but I've got apps where users report problems that make no sense, where their data gets lost, and who can't tell me what they did. Add to that the fact that I've got the same app running on dozens of different sites, and you can see why performance analyzers aren't what I'm looking for to solve my problem.

Because of that, I need a solution that lets me filter down and search parameters to figure out what a particular user was doing on a certain date. Hence, Ties. More...

Shitty variable names, unnecessary usage of pointer arithmetic, and clever tricks are WTFs.

When I saw this WTF code that formats numbers to add commas, I had forgotten how low-level C really is. Even after using C++ fairly regularly for the last several months.

With 500+ upvotes, the people over at the reddit thread regarding the WTF apparently think the real WTF is the post, not the code.
nice_code, stupid_submitter - in which TheDailyWTF jumps the shark by ridiculing perfectly good code.
Let's forgive the misuse of the wornout phrase and get to whether or not looking at the code should result in utterance of WTFs. More...

This post might be better titled, "How (and how not) to help yourself when Google doesn't have the answer: A whirlwind tour through Rails' source" if only I wasn't too lazy to change the max length of the database field for titles to my blog entries.

Google sometimes seems as if it has the sum of all human knowledge within the confines of its search index. It might even be the case that it does. Even if you prefer to think that's true, there may come a time when humanity does not yet have the knowledge you are seeking.

How often is that going to happen? Surely someone has run up against the problems I'm going to have, right? That hasn't been the case for me the last couple of months.

I may be the only developer writing Rails apps on MacOSX to be deployed to the world on Windows where SQL Server 2008 is the backend to a Sharepoint install used by internal staff to drive the data. I'm not so presumptious to think I'm a beautiful and unique snowflake, but I wasn't finding any answers. More...

This is the first in a series of my answers to Jurgen Appelo's list of 100 Interview Questions for Software Developers. Jurgen is a software development manager and CIO at ISM eCompany, according to his About page.

The list is not intended to be a "one-size-fits-all, every developer must know the correct answer to all questions" list. Instead, Jurgen notes:
The key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills.
This list covers most of the knowledge areas as defined by the Software Engineering Body of Knowledge. Of course, if you're just looking for brilliant programmers, you may want to limit the topics to [the] Construction, Algorithms, Data Structures and Testing [sections of the list]. And if you're looking for architects, you can just consider the questions under the headings Requirements, Functional Design and Technical Design.

But whatever you do, keep this in mind:
For most of the questions in this list there are no right and wrong answers!
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. The format will first list the question, my initial response (to start the discussion), followed by a place I might have looked for further information had I seen the questions and prepared before answering them.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy. More...

This is the second in a series of answers to 100 Interview Questions for Software Developers.

The list is not intended to be a "one-size-fits-all" list. Instead, "the key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills." Even still, "for most of the questions in this list there are no right and wrong answers!"

Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. Here's that attempt.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy.

My lack of understanding the angle from which these questions come may influence other seemingly off-base answers. Help me out by explaining how you might answer these questions on functional design: More...

This is the seventh in a series of answers to 100 Interview Questions for Software Developers.

The list is not intended to be a "one-size-fits-all" list. Instead, "the key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills." Even still, "for most of the questions in this list there are no right and wrong answers!"

Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. Here's that attempt.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).

This week's answers are about testing. More...

I like to use descriptive variable names, and I try to err on the side of more-descriptive if I think there's any chance of confusion. contract_participants isn't terribly long, but if you're building up all of its members from different sources (i.e., you can't really loop over it), it can get cumbersome to type and worse, to read. Moreover, it's different from just "participants" and they certainly aren't "contracts," so shortening it in this case wasn't going to happen. More...

Being a programmer, when I see something repetitive that I can automate, I normally opt to do so. We know that one way to save your job is by automating, but another is to know when not to automate. It sounds obvious, but when you get into the habit of rooting out all duplication of effort, you can lose sight of the fact that sometimes, it costs more to automate something than to do it by hand. More...

What's with this nonsense about unit testing?

Giving You Context

Joel Spolsky and Jeff Atwood raised some controversy when discussing quality and unit testing on their Stack Overflow podcast (or, a transcript of the relevant part). Joel started off that part of the conversation:
But, I feel like if a team really did have 100% code coverage of their unit tests, there'd be a couple of problems. One, they would have spent an awful lot of time writing unit tests, and they wouldn't necessarily be able to pay for that time in improved quality. I mean, they'd have some improved quality, and they'd have the ability to change things in their code with the confidence that they don't break anything, but that's it.

But the real problem with unit tests as I've discovered is that the type of changes that you tend to make as code evolves tend to break a constant percentage of your unit tests. Sometimes you will make a change to your code that, somehow, breaks 10% of your unit tests. Intentionally. Because you've changed the design of something... you've moved a menu, and now everything that relied on that menu being there... the menu is now elsewhere. And so all those tests now break. And you have to be able to go in and recreate those tests to reflect the new reality of the code.

So the end result is that, as your project gets bigger and bigger, if you really have a lot of unit tests, the amount of investment you'll have to make in maintaining those unit tests, keeping them up-to-date and keeping them passing, starts to become disproportional to the amount of benefit that you get out of them.

A while ago, I was working with a problem in C# where where our code would get deadlocked, and since someone must die or several must starve, I thought it would be nice to just toss a "try again if deadlocked" statement into the exception handler. I muttered this thought on twitter to see if there was any language with such a try-catch-try-again-if construct. More...

This is the fourth in a series of answers to 100 Interview Questions for Software Developers.

The list is not intended to be a "one-size-fits-all" list. Instead, "the key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills." Even still, "for most of the questions in this list there are no right and wrong answers!"

Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. Here's that attempt.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).


This is the fifth in a series of answers to 100 Interview Questions for Software Developers.

The list is not intended to be a "one-size-fits-all" list. Instead, "the key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills." Even still, "for most of the questions in this list there are no right and wrong answers!"

Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. Here's that attempt.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!). More...

SOAP can be a huge PITA in Ruby if you're not dealing with a web service that falls under the defaults. In particular, if your web service falls under HTTPS where you need to change the default certificate acceptance, or if you need to authenticate before seeing the WSDL, you're SOL as far as I can tell as of writing this post. (If you know of a way that doesn't resort to this complexity, please speak up!)

I was using Ruby 1.8.7 and soap4r 1.5.8, but this may apply to other versions. Anyway, here are a couple of monkey patches to help get you there if you're having trouble. More...

We observe that organizational structure metrics are significantly better predictors for identifying failure-prone binaries in terms of precision, and recall compared to models built using code churn, code complexity, code coverage, code dependencies, and pre-release defect measures.
(Link is to abstract page, quote is from the PDF linked to from there, chart below is from the paper as well) More...

SOAP can be a huge PITA in Ruby if you're not dealing with a web service that falls under the defaults. In particular, if your web service falls under HTTPS where you need to change the default certificate acceptance, or if you need to authenticate before seeing the WSDL, you're SOL as far as I can tell as of writing this post. (If you know of a way that doesn't resort to this complexity, please speak up!)

I was using Ruby 1.8.7 and soap4r 1.5.8, but this may apply to other versions. Anyway, here are a couple of monkey patches to help get you there if you're having trouble.

If you need to change the SSL verify mode, for example, to accept a certificate unconditionally, you can use this monkeypatch:

def monkeypatch_httpclient_sslsocketwrap(ssl_verify_mode)
    return unless ssl_verify_mode 
    @sslsocket_monkeypatched = true

    require 'soap/nethttpclient'
    block = <<END 
      alias :original_initialize :initialize
      def initialize(socket, context, debug_dev = nil)

        unless SOAP::NetHttpClient::SSLEnabled
          raise'Ruby/OpenSSL module is required')

        @context = context
        @context.verify_mode = #{ssl_verify_mode}
        @socket = socket
        @ssl_socket = create_openssl_socket(@socket)

        @debug_dev = debug_dev
      HTTPClient::SSLSocketWrap.class_eval block

If you need to authenticate before seeing the WSDL, you'll need this patch:

def monkeypatch_authentication(username, password)
    return unless username && password
    @auth_monkeypatched = true

    require 'wsdl/xmlSchema/importer'
    block = <<END
      alias :original_fetch :fetch
      def fetch(location)

        warn("importing: " + location) if $DEBUG
        content = nil

        normalizedlocation = location
        if location.scheme == 'file' or

          (location.relative? and FileTest.exist?(location.path))

        content =
          normalizedlocation = URI.parse('file://' + File.expand_path(location.path))

        elsif location.scheme and location.scheme.size == 1 and

          content =

          client =, "WSDL4R")
          client.proxy = ::SOAP::Env::HTTP_PROXY
          client.no_proxy = ::SOAP::Env::NO_PROXY
          client.set_auth(location, "#{username}", "#{password}") #added for auth

          if opt = ::SOAP::Property.loadproperty(::SOAP::PropertyName)
            http_opt = opt["client.protocol.http"]

            ::SOAP::HTTPConfigLoader.set_options(client, http_opt) if http_opt
          content = client.get_content(location)

        return content, normalizedlocation
      WSDL::XMLSchema::Importer.class_eval block

Hope that helps someone else avoid days' long foray into piecing together blogs posts, message boards, and searching through source code.

And because you might get here via a search for related terms, normal access that only requires basic authentication could be done like this, without opening existing classes:

class SendBasicAuthFilter < SOAP::Filter::StreamHandler
  def initialize(loginname, password)

    @authorization = 'Basic ' + [ loginname + ':' + password ].pack('m').delete("\r\n")


  def on_http_outbound(req)

    req.header['Authorization'] = @authorization

  def on_http_inbound(req, res)


list_service_url = 'https://somesite/path/?WSDL'
list_service_driver =

user = 'username'

pass = 'password'
list_service_driver.streamhandler.filterchain <<,pass)

I'm very welcoming of suggestions regarding how these things might be better accomplished. Resorting to this messy level of monkeypatching just sucks. Let me know in the comments.

This is the sixth in a series of answers to 100 Interview Questions for Software Developers.

The list is not intended to be a "one-size-fits-all" list. Instead, "the key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills." Even still, "for most of the questions in this list there are no right and wrong answers!"

Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. Here's that attempt.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).


Every day, psychological barriers are erected around us, and depending on what task they are a stumbling block for, they can be helpful or a hindrance.

Ramit Sethi wrote a guest post on personal finance blog Get Rich Slowly about passive barriers that got me thinking about passive barriers in software development, or perhaps just any easily destroyed (or erected barrier) that prevents you from doing something useful (or stupid). One example he uses that comes up a lot in my own work references email: More...

A friend of mine from graduate school recently asked if she could use me as a reference on her resume. I've worked with her on a couple of projects, and she was definitely one of the top few people I'd worked with, so I was more than happy to say yes.

Most of the questions were straightforward and easy to answer. However, one of the potential questions seemed way off-base: I may be asked to "review her multi-tasking ability."

Is that a trick question? Is it relevant? More...

This is the "I'm trying my hardest to be late to that meeting that spans lunch where they don't serve anything to tide you over" edition of Programming Quotables.

If you don't know - I don't like to have too many microposts on this blog (I'm on twitter for that), so save them up as I run across them, and every once in a while I'll post a few of them. The idea is to post quotes about programming that have one or more of the following attributes:
  1. I find funny
  2. I find asinine
  3. I find insightfully true
  4. And stand on their own, with little to no comment needed
It's up to you decide which category they fall in, if you care to. Anyway, here we go:


This is the eight in a series of answers to 100 Interview Questions for Software Developers.

The list is not intended to be a "one-size-fits-all" list. Instead, "the key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills." Even still, "for most of the questions in this list there are no right and wrong answers!"

Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. Here's that attempt.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).


Code in Views and Code in the Wrong Place are two of the top 20 Rails development No-No's that came up in Chad Fowler's straw poll on Twitter about poor practices in Ruby on Rails.

Domain code in controllers and views isn't a problem that's limited to Rails, of course. It's a problem everywhere, and one you generally need to remain vigilant about. Rails doesn't make it easy by making it easy - it's much too easy to do the wrong thing.

You've got the view open and think, "I need to get a list of Widgets."


slarbi@nibbler> make
make: 'dwight_conrad' is up to date.

slarbi@nibbler> make anyway
make: *** No rule to make target 'anyway'. Stop.

slarbi@nibbler> make rule to make target anyway
make: *** No rule to make target 'rule'. Stop.

slarbi@nibbler> alias makeanyway='make -B' #ohthatmakesalotofsense
slarbi@nibbler> makeanyway
c++ main.o -o dwight_conrad -g

slarbi@nibbler> thank you
-bash: thank: command not found

This is the ninth in a series of answers to 100 Interview Questions for Software Developers.

The list is not intended to be a "one-size-fits-all" list. Instead, "the key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills." Even still, "for most of the questions in this list there are no right and wrong answers!"

Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. Here's that attempt.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).

Browsing through the questions, I'm not confident here of my ability to answer without asking some preliminary questions (which I have no one to answer), so please chime in if you have something to add. More...

The Pragmatic Bookshelf has recently published The Passionate Programmer: Creating a Remarkable Career in Software Development, (or you can save a few bucks at Amazon). It's the 2nd edition of My Job Went To India, a book I think is a must-read for any programmer.

It was important enough to me that I dedicated an entire category of this blog to discussion about about it, in fact.


Many people see spectacular plays from athletes and think that the great ones are the ones making those plays.

I have another theory: It's the lesser players who make the "great" plays, because thier ability doesn't take them far enough to make it look easy. On top of it all, you could say guys who make fewers mistakes just aren't fast enough to have been in a position to make the play at all.


This is the tenth and final post in a series of answers to 100 Interview Questions for Software Developers.

The list is not intended to be a "one-size-fits-all" list. Instead, "the key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills." Even still, "for most of the questions in this list there are no right and wrong answers!"

Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. Here's that attempt.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).


For the last few months, I've been having trouble getting out of "next week" mode. That's what I call it when I don't know what I'll be working on outside of the next week at any given time. It's not necessarily a bad thing, but when you're working on projects that take longer than a couple of weeks, it doesn't let you keep the end in sight. Instead, you're tunneling through the dirt and hoping you've been digging up instead of down. More...

Programmers are fond of telling each other that you can be a better programmer by reading other people's code. It's a common bit of advice.

I get the impression most people think you get better by imitating masters. It's a common theme in self improvement. Aspiring writers read great authors. Aspiring musicians listen to great musicians. Artists study artists and coders study coders.

I've certainly espoused that point of view. I'm fond of quoting Ron Jeffries as saying,
My advice is to do it by the book, get good at the practices, then do as you will. Many people want to skip to step three. How do they know?
In fact, I think that's the third time I've done so in almost as many years.

But what if that's not the primary benefit of reading other people's code? I don't mean scanning it - I mean reading it until you understand exactly what it's doing. Is there something else you can get out of it?

I think so. Perhaps it's not the mimicking of a particular style like a monkey that makes us better for reading code. What if it's tracing through an unfamiliar thought process that flexes the brain and makes it think in ways it previously had not?

By reading unfamiliar code and forcing yourself to trace through until you understand it, you end up thinking in ways that were previously foreign to you.

Ergo, reading others' code makes you better regardless of the quality. As long as you don't mimic poor style and practice.

I think that's where the real value in reading code exists. What are your thoughts?

From time to time I like to actually post a bit of code on this programming blog, so here's a stream-of-conscious (as in "not a lot of thought went into design quality") example that shows how to:
  1. Open Excel, making it invisible (or visible) to the user.
  2. Create a workbook and access individual worksheets
  3. Add data to a cell, or retrieve data from a cell
  4. Add a chart to a worksheet, with constants for various chart types
  5. Save as Excel 97-2003 format and close Excel
If you know where I can find the constants for file type numbers, that would be appreciated. Calling SaveAs without the type seems to use whatever version of Excel you are running, but I'd like to find how to save as CSV or other formats.

Needless to say, this requires Excel be on the computer that's running the code.

require 'win32ole'
xl ="Excel.Application")

puts "Excel failed to start" unless xl

xl.Visible = false

workbook = xl.Workbooks.Add
sheet = workbook.Worksheets(1)

#create some fake data
data_a = []
(1..10).each{|i| data_a.push i }

data_b = []
(1..10).each{|i| data_b.push((rand * 100).to_i) }

#fill the worksheet with the fake data
#showing 3 ways to populate cells with values
(1..10).each do |i|
  xl.ActiveCell.Formula = data_a[i-1]

  sheet.Range("B#{i}").Formula = data_b[i-1]

  cell = sheet.Range("C#{i}")
  cell.Formula = "=A#{i} - B#{i}"


#chart type constants (via
xlArea = 1
xlBar = 2

xlColumn = 3
xlLine = 4
xlPie = 5
xlRadar = -4151

xlXYScatter = -4169
xlCombination = -4111
xl3DArea = -4098

xl3DBar = -4099
xl3DColumn = -4100
xl3DLine = -4101 

xl3DPie = -4102
xl3DSurface = -4103
xlDoughnut = -4120

#creating a chart 
chart_object = sheet.ChartObjects.Add(10, 80, 500, 250)

chart = chart_object.Chart
chart_range = sheet.Range("A1", "B10")

chart.SetSourceData(chart_range, nil)
chart.ChartType = xlXYScatter

#get the value from a cell

val = sheet.Range("C1").Value
puts val

#saving as pre-2007 format
excel97_2003_format = -4143 

pwd =  Dir.pwd.gsub('/','\\') << '\\'

#otherwise, it sticks it in default save directory- C:\Users\Sam\Documents on my system
workbook.SaveAs("#{pwd}whatever.xls", excel97_2003_format)


It's also posted in my Miscellany project at GitHub

This is the third in a series of answers to 100 Interview Questions for Software Developers.

The list is not intended to be a "one-size-fits-all" list. Instead, "the key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills." Even still, "for most of the questions in this list there are no right and wrong answers!"

Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not prepared for the interview at all. Here's that attempt.

Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).

Last week's answers on Functional Design had me feeling that way. Luckily, this week we come to technical design - a topic I feel quite a bit stronger on. More...

low cou-pling and high co-he-sion
  1. A standard bit of advice for people who are learning to design their code better, who want to write software with intention as opposed to coincidence, often parroted by the advisor with no attempt to explain the meaning.


It's a great scam, don't you think? Someone asks a question about how to design their code, and we have these two nebulous words to throw back at them: coupling and cohesion. We even memorize a couple of adjectives that go with the words: low and high. More...

Don't encode information into a string like "AAHD09102008BSHC813" and give that gibberish to people. Don't name your project that, don't give that to me as a value or way to identify something, and don't make humans see or interact with that in any form. (If you are generating something similar and parse it with a program in automated fashion, I don't care what you call it.)

Give it a name we can use while communicating with each other and keep the rest of the information in a database. I can look it up if I need to know it.

Do not use file names, folder names, or project names as your as your database. I don't want to be required to scan each item in whatever set you chose and translate it using a lookup table to find what I'm looking for. I don't want to memorize the lookup table either.

LDAP in Ruby is better than LDAP in C#/.NET. Looking at it, I can't say it's much different minus the cruft from .NET.

Lightweight? Seriously?

Experiencing it while actually writing code, it's very different. I can't explain it, except to show it to you and tell you try it. More...

The other day I went to a major pizza chain's website to order online. I had to create an account first, of course. No big deal.

As I was choosing my password, I was pleased to see a password strength indicator to the right. Excellent, it's telling my password is "too short" -- let me add some more characters. "Warning: Too Simple" it said. Great - now I'll add some numbers in there. My password strength was now "good," but since they were going to be storing my personal details, I wanted a "great" password. I like to throw characters in there that aren't letters or numbers, so I did. And it told me my password strength was "great." More...

The other day on twitter I asked about large projects, and I wanted to bring the question to the larger audience here.

'Large project' is often used without saying what it means. What does it mean to you? What do you think it means to others?

We hear vague descriptions about project size tossed about all the time: Ruby on Rails is better for small projects. Java is for large projects. Skip the framework for small projects, but I recommend using one for large projects. More...

JetBrains recently announced that IntelliJ IDEA will be open-sourced and have an Apache 2.0-licensed version with the release of 9.0.

Those who've been reading My Secret Life as a Spaghetti Coder for a long time will know I totally love IDEA.

I haven't written software in Java for quite some time, and I don't normally do "news" posts, but I know enough of you are into Java, Groovy, and Scala to make this worth your while if the $249 was pricey enough to force you into using lesser Java IDEs. Now you don't have to.

Get it and I don't think you'll be disappointed. But don't wait until 10.0. As Cedric Beust pointed out (and I'm inclined to agree), "IDEA going opensource means that it will now slowly die."

I hope not, of course, but one has to wonder.

A little while ago I was trying to think of ways to have a program compare strings and discover patterns among them (as opposed to knowing patterns and looking for particular ones).

Over the years, I've learned about a lot of algorithms, but there's no way I could recall all of them. I knew I'd probably need to look at artificial intelligence, or more specifically, machine learning. But that's all I had to go on.

At the time, I decided it would be helpful to have a list of algorithms and data structures with short descriptions to browse and jog my memory. More...

Last week, hgs asked,
I find it interesting that lots of people write about how to produce clean code, how to do good design, taking care about language choice, interfaces, etc, but few people write about the cases where there isn't time... So, I need to know what are the forces that tell you to use a jolly good bodge?
I suspect we don't hear much about it because these other problems are often caused by that excuse. And, in the long run, taking on that technical debt will likely cause you to go so slow that that's the more interesting problem. In other words, by ignoring the need for good code, you are jumping into a downward spiral where you are giving yourself even less time (or, making it take so long to do anything that you may as well have less time). More...

One step back from greatness lies the very definition of the impossible leadership situation: a president affiliated with a set of established commitments that have in the course of events been called into question as failed or irrelevant responses to the problems of the day... The instinctive political stance of the establishment affiliate -- to affirm and continue the work of the past -- becomes at these moments a threat to the vitality, if not survival, of the nations, and leadership collapses upon a dismal choice. To affirm established commitments is to stigmatize oneself as a symptom of the nation's problems and the premier symbol of systemic political failure; to repudiate them is to become isolated from one's most natural political allies and to be rendered impotent.
Stephen Skowronek, The Politics Presidents Make (pg. 39)

A little while ago Obie asked "What's this crap about a Ruby backlash?" The whole situation has reminded me of Skowronek's work, so I dug a couple of passages up.

We're at a crossroads right now between two regimes - one represented by Java, and the other represented by Ruby (although it is quite a bit more nuanced than that). My belief right now is that Java The Language is in a position where it can't win. People are fed up with the same old crap, and a change is happening (see also: Why Do I Have To Tell The Compiler Twice?, or Adventures in Talking To a Compiler That Doesn't Listen.) More...

When I posted about why it's important to test everything first, Marc Esher from MXUnit asked:
What do you find hard about TDD? When you're developing and you see yourself not writing tests but jamming out code, what causes those moments for you? And have you really, in all honesty, ever reaped significant benefits either in productivity or quality from unit testing? Because there's a pretty large contingent of folks who don't get much mileage out of TDD, and I can see where they're coming from.
My TDD Stumbling Blocks
I'll address the first bit in one word: viscosity. When it's easier to do the wrong thing than the right thing, that's when I "see myself not writing tests but jamming out code."

But what causes the viscosity for me? Several things, really: More...

Last week, Paul Graham and Robert Morris announced the release of Arc to a firestorm of negative commentary.

At one point, it seemed as if those on couldn't wait to find new blog posts to submit that tore into Arc.

There have been many complaints, but much of it has been on the lack of Unicode support.

It's something to be expected, but as I thought about it, I wondered why.

It's not my intent here to draw negative attention by questioning the conventional wisdom of the status quo, but I fear that may happen. I simply want to ask the obvious:

How many projects have you participated in where Unicode was an explicit or implicit requirement? What percentage of the total do those make up? In the remainder of cases, would something like Arc have been useful to you?

For the vast majority of projects I've worked on, having support for 9+ bit character sets or curly quotes was not a requirement, and Arc would have been useful on the ones that didn't have a specific language or platform requirement. (I understand if your work takes you there, but also understand many of ours don't.)

So I'll ask again: should we be slaughtering useful tools, or sacred cows?

Keep it civil and topical please. It's nothing but an observation and a question, not a statement of religious belief spread with the fervor of a crusader.

Because when you don't, how do you know your change to the code had any effect?

When a customer calls with a trouble ticket, do you just fix the problem, or do you reproduce it first (a red test), make the fix, and test again (a green test, if you fixed the problem)?

Likewise, if you write automated tests, but don't run them first to ensure they fail, it defeats the purpose of having the test. Most of the time you won't run into problems, but when you do, it's not fun trying to solve them. Who would think to look at a test that's passing?

The solution, of course, is to forget about testing altogether. Then we won't be lulled into a false sense of security. Right?

I don't like to have too many microposts on this blog, so I've decided to save them up and start a Programming Quotables series. The idea is that I'll post quotes about programming that have one or more of the following attributes:
  1. I find funny
  2. I find asinine
  3. I find insightfully true
  4. And stand on their own, with little to no comment needed
Here's the second in that series. I hope you enjoy them as much as I did: More...

Suppose you want to share some data that one object produces with another object as the consumer. How would you go about doing that?

If you took a straightforward approach, you might have Producer call consumer.method("data") and pass it the data that way. On the other hand, you could have Consumer get the data it needs by requesting it from Producer with something like this: myData = producer.getData().

However, perhaps Producer and Consumer shouldn't know anything about each other. Then you might introduce an Intermediary that gets the data from Producer and passes it to Consumer with something like consumer.myData = producer.getData()

Now if you want to get really creative, you could make Producer write its data to an XML file, and then have Consumer read the data from there.

But why?

Disagreements and horror stories are encouraged below.

This morning, Leila asked how I got my groove back:
I think you have a great concept going. I really would like to find out HOW you became passionate about programming? I just graduated with a BS in CIS and am looking for an entry level IT job, HOWEVER I am not a bit excited about computers anymore. Like you I was just planning on continuing my education -get my MBA. But I know an IT job is what I went to school for. HELP! How do I get excited about an IT job when I can't even figure out what title to put on a job search? just degree in CIS?!
I started to comment, but as it became longer, I decided it might benefit others as a standalone post. More...

You feel, look, and do better when you are accomplishing goals and showing progress. That's one reason you'll find it in this week's advice from MJWTI.

The chapter is called "Daily Hit" and in it Chad recommends "setting a goal (daily, weekly, or whatever you're capable of) and tracking this type of accomplishment." Make sure it's known to your manager as well - don't let the underpants gnomes take credit for your success. Also, the shorter the distance between hits the better, since "if you're supposed to produce a hit per day, you can't spend two weeks tracking the perfect task." More...

This is a story about my journey as a programmer, the major highs and lows I've had along the way, and how this post came to be. It's not about how ecstasy made me a better programmer, so I apologize if that's why you came.

In any case, we'll start at the end, jump to the beginning, and move along back to today. It's long, but I hope the read is as rewarding as the write.

A while back, Reg Braithwaite challenged programing bloggers with three posts he'd love to read (and one that he wouldn't). I loved the idea so much that I've been thinking about all my experiences as a programmer off and on for the last several months, trying to find the links between what I learned from certain languages that made me a better programmer in others, and how they made me better overall. That's how this post came to be. More...

1. In the interest of finding a time sink, what programming related channels do you frequent on IRC?

By coincidence, Peter also posted about little efficiencies this morning. I also asked about automating your workday a little while back. Which brings us to my important question:

2. In the interest of replacing that lost time, what's in your .bash_profile?

A note to myself (a .NET neophyte) and others who may not know how ASP.NET works:

I was writing a user control (we'll call it ContentBoxVariation) in ASP.NET which composes another (ContentBox). Both have a public property Title, with getters and setters.

You might call ContentBoxVariation in an .aspx page like this: More...

At the beginning of this week's chapter from My Job Went To India, Chad Fowler relates a story about Rao, a "mind-reading" programmer who could pick up on the subtleties in conversations and implement code before you realized you had asked for it.

Both when I first read this, and the second time around, alarms went off in my head: "What about YAGNI?" Why would you implement something that wasn't asked for? If you are wrong, you could be relieved of your position for wasting time and money.

Thankfully, Chad addressed my concerns. It turns out, More...

It's the new year, and it's time to get back in the swing of things after the hectic holiday season. I had planned on taking the rest of this week off from posting as well, but I'm starting to feel behind on things, so this will let me set down my goals and focus on them in the coming year.

I had actually planned a different post for today, but Dan Vega inspired me with his list of goals so that's why you're reading this instead. Like Dan, I'm going to try to keep mine positive and specific, with an emphasis on SMART objectives.

With no further ado, here are my top professional goals for 2008: More...

Suppose for the purposes of our example we have string the_string of length n, and we're trying to determine if string the_substring of length m is found within the_string.

The straightforward approach in many languages would be to use a find() or indexOf() function on the string. It might look like this: More...

This seems to be becoming a theme here lately: DIFN. That's the advice in MJWTI for this week, although Chad Fowler doesn't put it so bluntly. In the chapter, Chad describes a race where the first team to complete a project over the weekend wins $100 thousand. Could you do it? More...

A while back I started a Twitter account with the idea of using it as a tumblelog for quotes about software that I wanted to highlight. Unfortunately, the small limit on the number of characters Twitter enforces didn't allow me to post entire quotes, much less attribute them.

Likewise, I don't like to have too many microposts on this blog, so I've decided to save them up and start a Quotables series. The idea is that I'll post quotes about programming that have one or more of the following attributes:
  1. I find funny
  2. I find asinine
  3. I find insightfully true
  4. And stand on their own, with little to no comment needed
Here's the first in that series. I hope you enjoy them as much as I did: More...

This week I return to following the advice in Chad's book. It's something I've been doing now for a while: automation.

I'm really big into automation - one of the things I really like to do is create developer tools, or even just small throwaway scripts that get me through the day.

One paragraph that stuck with me was this one:
So, imagine your company is in the business of creating websites for small businesses. You basically need to create the same site over and over again, with contacts, surveys, shopping carts, the works. You could either hire a small number of really fast programmers to build the sites for you, hire an army of low-cost programmers to do the whole thing manually and repetitively, or create a system for generating the sites.
Sound like anyone you know? (Or any of the other people writing generators, automated testers, and the like?)

It was after reading that paragraph that I decided we needed to change things at work. Forget about code repetition, there was plenty of effort repetition as well. The first part of that process was getting cfrails together, the remaining part is to build a WYSIWYG editor for building sites - if I ever get around to it.

There are other things to automate besides frameworks that generate code. Neal Ford has a pair of talks (both links there are PDFs found via his past conferences page) he gives that illustrate a bunch of tips and "patterns" along these lines. I enjoyed both of them and will eventually get around to reviewing them. He also mentioned that a book covering the topic is coming soon.

Getting back to MJWTI, Chad lists a "simple (minded) formula" to calculate productivity:
productivity = # projects or features or amount of work / (# programmers * average hourly rate)
At the end of the chapter he shows it in action: 5 units of work with 3 fast programmers at $80 per hour would be as productive as 20 programmers at $12 per hour on the same project (obviously ignoring the communication deficiencies and other pitfalls of a group that large). But if you are able to automate enough of your work, you can be the single programmer at $80 per hour on the same 5 units of work.

The exact math isn't what's important - the fact that you are more productive by automating as much as possible is.

In what ways do you automate your workday, no matter how big or small?

I'm not talking about Just-In-Time compilation. I'm talking about JIT manufacturing.

When you order furniture, it likely gets assembled only after the order was received. Toyota is famous for doing it with cars. You can do it yourself with JIT published books. On top of that, Land's End offers custom tailored, JIT manufactured clothing.

It's easy to say, "yes, we can publish software in the same manner." Every time we offer a download, it's done just in time. This post was copied and downloaded (published) at the moment you requested it.

That's not what I'm talking about either. More...

Here's some pseudocode that got added to a production system that might just be the very definition of a simple change:
  1. Add a link from one page to cancel_order.cfm?orderID=12345
  2. In that new page, add the following two queries:
    1. update orders set canceled = 1, canceledOn=getDate() where orderID=#url.orderID#
    2. delete from orderItems
Now, upload those changes to the production server, and run it real quick to be sure it does what you meant it to do.

Then you say to yourself, "Wait, why is the page taking several seconds to load?"

"Holy $%^@," you think aloud, "I just deleted every item from every order in the system!"

It's easy enough for you to recover the data from the backups. It isn't quite as easy to recover from the heart attack.

Steve McConnell (among others) says that the easiest changes are the most important ones to test, as you aren't thinking quite as hard about it when you make them.

Are there any unbelievers left out there?

When someone starts complaining about customers who are making silly requests, I normally say something like, "I know! If it weren't for those damn customers, we'd have a perfect program!"

There'd be no one using it, but hey - the application would be sweeeeet.

This week I'm going to diverge from Chad's book on how to save your job. That's mostly because I don't have the book with me, but this has been on my mind the last couple of days anyway: the fear of success.

I've noticed it in myself and others from time to time - inexplicably sabotaging opportunities to succeed. I try not to listen to that voice now if I can help it.

More recently, I've started to notice it in companies and customers as well - groups as opposed to individuals. I've started wondering if reluctance to "go live" until the product was a symbol of perfection fits in with this phenomenon. More...

I'd like a codometer to count all the lines of code I write during the day. It should keep track of lines that get kept and lines that get removed. I don't know what that information would tell me, but I'm curious about it. It should probably work independent of the IDE, since I often use several during the day.

I'd like it if not only you would stop stealing my focus, but also provide updates in the corner of the screen. When I've put you in the background, you should let me know when you're done processing so I can come and click the "next" button. On top of that, give me an option to have you click next automatically for me.

Like 'considered harmful' being considered harmful as a cliché, I'm starting to have a distinct distaste for website or product names of the class e-removr. Or ending-vowel-removr when the last letter is an 'r'. The first time it seemed refreshing and perhaps a bit cute. By now, I'm starting to wish someone would flush them down the shittr. (Well, the names at least.)

Someone found a set of bicycle pedals that fit under the desk for me. Excellent to be able to get a little exercise while I do my morning blog-reading. I couldn't find one the last time I looked, but I did this time. I'm not sure if mine are the same, or how it will work, but I will let you know when I do.

Today is December 5th. Have you sent me your code for the contest?

It's not a hard thing to come up with, but it's incredibly useful. Suppose you need to iterate over each pair of values or indices in an array. Do you really want to duplicate those nested loops in several places in your code? Of course not. Yet another example of why code as data is such a powerful concept: More...

Although computer science is a young field of study, it is rife with examples of good and bad ways to do things. This week's advice from MJWTI instructs us to focus on the past - learn from the successes and failures of those who came before us.

Chad includes a quote from a Jazz musician that illustrates the concept of how we learn to become masters quite well:
Imitate. Assimilate. Innovate.
We saw a longer version of that in a quote from Ron Jeffries in last week's post about getting good at the process of software development: More...

The other day I was working on a crossover function to be used by a genetic algorithm. The idea is that you select two individuals in your population of possible solutions as parents (or more - bits aren't predisposed to monogamy or bisexual reproduction) with the idea that you'll combine their "DNA" in hopes of producing a more fit individual.

The idea of the crossover for my case was a single point, and it goes somewhat like this (a slightly simplified version, for the sake of discussion): More...

Just a reminder about the contest to win an iPod Nano for learning something new: The window of opportunity is getting smaller - submissions need to be in by next Wednesday, December 5th, 2007.

My goal is to review everything before the end of the weekend, and send the iPod out on Monday (along with an announcement here of the winner, and recognition of the other participants - so if you want to be excluded for some reason, let me know that as well).

If you haven't started, there's still enough time to come up with a solution: it needn't be long or difficult - just demonstrate something new in a language you haven't had much experience in.

If you've got a blog, post the solution there and let me know about it. If not, send it to me directly - first get in touch with me via my contact page and then send it via email.

Finally, I'd like to give special thanks to Chad Fowler for helping spread the word.

Last week I posted about why software developers should care about process, and how they can improve themselves by doing so. In the comments, I promised to give a review of what I'm doing that seems to be working for me. So here they are - the bits and pieces that work for me. Also included are new things we've decided to try, along with some notes about what I'd like to attempt in the future. More...

In this week's advice from MJWTI, "The Way That You Do It," Chad Fowler talks about process and methodology in software development. One quote I liked a lot was:
It's much easier to find someone who can make software work than it is to find someone who can make the making of software work.
Therefore, it would behoove us to learn a bit about the process of software development. More...

Since the gift buying season is officially upon us, I thought I'd pitch in to the rampant consumerism and list some of the toys I've had a chance to play with this year that would mean fun and learning for the programmer in your life. Plus, the thought of it sounded fun. Here they are, in no particular order other than the one in which I thought of them this morning: More...

At the O'Reilly ONLamp blog, chromatic pointed out that we should "program as if [our] maintenance programmer were not a barely-competent monkey," quoting a paragraph from Mark-Jason Dominus.

Mark was describing the idiocy of programming-by-superstition, where you might put parentheses if you are unsure of the order of operations operator precedence (or, to help barely-competent monkeys know the order precedence), rather than re-defining that order as opposed to using them to change the "normal" precedence. More...

String fullName = new StringBuilder().Append(firstName + " " + middleName + " " + lastName).ToString();

The chapter this week from My Job Went to India tells us to "Practice, Practice, Practice." It's pretty obvious advice - if you practice something, you will get better at it. Assuming that skill has value, you'll be able to market yourself easier.

What's not so obvious is that we are "paid to perform in public - not to practice." Unfortunately, most of us practice at our jobs (which is another reason we started the Code Dojo).

Perhaps because the advice seems so obvious is why when I reread this chapter, I realized I hadn't done much about it. Sure, I do a lot of learning and coding on any given day, but it's been rare where I've truly stretched the bounds of my knowledge and skill. And as Chad notes in the chapter, that is precisely the point of practice.

Specifically, we might focus on three areas when practicing, which parallel Chad's experience as a musician: More...

The Turing Test was designed to test the ability of a computer program to demonstrate intelligence. (Here is Alan Turing's proposal of it.) It is often described as so: if a computer can fool a person into believing it too is a person, the computer has passed the test and demonstrated intelligence. That view is a simplified version.

Quoth Wikipedia about the rules:
A human judge engages in a natural language conversation with one human and one machine, each of which try to appear human; if the judge cannot reliably tell which is which, then the machine is said to pass the test.
Specifically, the test should be run multiple times and if the judge cannot decipher which respondent is the human about 50% of the time, you might say he cannot tell the difference, and therefore, the machine has demonstrated intelligence and the ability to hold a conversation with a human.

I bring this up because recently Giles Bowkett said that poker bots pass the test, and pointed to another post where he said the Turing test was beaten in the 1970s by a paranoid repeater named PARRY.

I suppose the idea of poker bots passing the test comes about because (presumably) the human players at the table don't realize they are playing against a computer. But if that is the case, even a losing bot would qualify - human players may think the bot player is just an idiot. More...

The following was generated using a 7th order Markov chain and several of my blog posts as source text:


2nd Generation iPod Nano The Contest
For the next month, I'll be running a contest here for programmers to promote learning something new. I've had this spare iPod Nano that I've yet to use (and likely never will), I've been covering how to save your job with Chad Fowler's My Job Went To India, and I'm passionate about learning new things. It seems the best way to combine all three is a contest to help me spread that passion.

In particular, since I think it's useful to learn languages and different programming paradigms, that's what this contest will focus on.

Here are the rules:
  • Write a program in any language you are unfamiliar with.
  • Choose a language or a program that is in a different paradigm than languages (or programs) you already know (how to write).
  • Use at least one idea from that language that you've never (or rarely) used in another language.
  • Make it a useful program, though it needn't be big.
  • Follow good programming practices appropriate to the paradigm you're programming in (as well as universal ones).
  • If you have a blog and want to participate, post the solution there to be scrutinized in comments.
  • If you don't have a blog and want to participate, email the solution to me via my contact page or my email address if you already know it, and let others scrutinize the solution here in the comments.
  • I'll get the prize out the door in time for you to receive it by Christmas, in case you want to give it away as a gift.
  • When you submit your program, be sure to point out all the ways you've done the above items.
But I need your help
  • If you have any ideas for possible programs, list them below. This would give people something to choose from, and make for more participation.
  • If you have an idea that you want to do, but aren't sure if it would work, it probably will. Feel free to ask if it makes you more comfortable though.
  • If you think this is as important as I do, spread the word - even if you don't want to participate.
  • The winner will likely be chosen in a random drawing, but I need people proficient in different paradigms to help me judge what qualifies. I'm not an expert in everything.
Overall, the point is to learn something new, to have fun doing it, and to get people involved in learning new concepts. If you accomplish any two out of those three goals within the next month, let me know about it and we'll enter you in the contest.

You have until December 5. Start coding.

When looping over collections, you might find yourself needing elements that match only a certain parameter, rather than all of the elements in the collection. How often do you see something like this?

foreach(x in a)
   if(x < 10)

Of course, it can get worse, turning into arrow code. More...

... and two things you can do to remove them

The way I see it, there are three hurdles for programmers to cross to get involved in an open source project:
  1. Motivation to help out in the first place
  2. Fear of rejection in asking to be involved
  3. Lack of knowledge about the project
There's nothing you can do about motivation - if a programmer doesn't want to contribute, he simply won't. But the other two roadblocks are something the project can remove from the programmer's path to contribution. Charles Nutter with JRuby recently demonstrated what I consider to be an excellent way of getting motivated programmers to contribute to a project by removing those other two impedences:
  1. Remove the fear of rejection by asking people to get involved
  2. Improve the knowledge and lower the barrier of entry by posting a list of easy to fix items
In particular, by posting the list of easy items you offer a great place to start where the newbie on the project will feel comfortable getting his feet wet, while at the same time learning the conventions and design of the code base.

Finally, once someone shows interest, don't forget to be helpful and follow-up on answering their questions. It will go a long way to show them that you value their help.


You need to take responsibility for your own improvement. That's a good part of what Chad Fowler's MJWTI focuses on getting you to realize. This week's advice follows along that same line: "Give a man a fish; feed him for a day. Teach a man to fish; feed him for a lifetime" (quoting Lao Tzu).

As Chad notes however, "education requires both a teacher and a student. Many of us are too often reluctant to be a student." He likens fish to the "process of using a tool, or some facet of a technology, or a specific piece of information from a business domain you're working in." Too many of us take the fish today, and "ask ... for another fish tomorrow." More...

There are plenty of people(?) who think we might move on to Javascript as the next big language. If anything, the fact that it runs server-side and browser-side means you might just need to know one language.

Some others think polyglot programming is where its at - the movement to two major platforms and implementations on either of them of most any language you'd want to use gives us tremendous power of integration while making it easy to use the right tool for the right job. If anything, we can stop using hammers to pound screws into rocks.

I'm partial to the second idea. What do you think? (Either these two or something completely different!)

(?) - Just to be clear, I don't think Steve Yegge ever said Javascript will be it. But a lot of the comments seem to think so.

I'm sitting in the conference right now and just found out they are broadcasting live on the web. If you want to attend but can't be here, view the conference here. You've missed Neal Ford, but Venkat is talking about DSLs right now. Bjarne Stroustrup is up after lunch.

Here is the conference schedule.

Update: It's quite late now, but Bjarne didn't make it. They announced this after I had shut down for lunch, and amazingly, I didn't turn my computer on the rest of the weekend.

It's such a bit of obvious advice that I almost skipped over it: "love it or leave it." There's no point staying in a job or career that you don't like - your performance will suffer as you sling code listlessly on a daily basis.

The flip side of that is that you'll excel in something you're passionate about. It's not hard to "take a big step away from mediocrity" just by being passionate (quoting Chad Fowler, in MJWTI).

So, if you're not passionate about programming, should you find another career? Perhaps, but why not just become passionate? It's not exceedingly hard. I know - I was there.

When making the decision to go to graduate school, I had originally planned to go for Political Science. I was bored with work, and I just wanted to get away from computers. A lot of that was self-inflicted with my spaghetti-coding ways, but I just didn't feel right programming any more. Someone with one of the coolest jobs in the world dreading to go to work? That was me.

Luckily for me, the Political Science department didn't accept Spring admissions for grad school, and that was when I wanted to enroll. So, I said "what the hell," and went for Computer Science instead. Of course, I have the benefit of having been passionate about this stuff at one point in my life. If you've never felt that way, what brought you here?

Whatever happened, I made the decision to become passionate about programming and computers again before that first semester - and now I'm hooked. My job is not appreciably different from what I was doing before - I've just added a lot of learning and exploration into my days, and figured out the benefits of dealing with crappy code. I think you can do the same.

Like many of you, I got into programming because I wanted to make games as a kid.

Wolfenstein Title Screen It was around the time games started looking better and being more complicated than Wolfenstein 3-D that I started thinking I'd never be able to make a game. Sure, I could do Tetris or Minesweeper, but how in the world could I ever match the game play and graphics of a Quake or Diablo. Let's not even get started with Halo 3 and Call of Duty. More...

Probably my absolute favorite chapter in Chad Fowler's book gives the advice of being the worst in order to save your job. (Incidentally, it must be a good book because I think I've said that about other chapters, and I'm certain I'll say it again as I rediscover later ones).

Saving your job by being teh suck (that's Latin for mightily unimpressive) coder in the bunch probably sounds odd to you, so let me explain: More...

Here's a WTF I ran into the other day (partially pseudocode):

XmlDataDocument goto = new XmlDataDocument();

XmlNode gotoThisUrl;
if(referer == "")
  gotoThisURL = goto.SelectSingleNode("gotoOnDirectory1");
else if (referer == "")
  gotoThisURL = goto.SelectSingleNode("gotoOnDirectory2");

Response.Redirect(gotoThisURL.InnerText.Trim(), true);

It was actually a bit more complex than that (probably 3x as many lines), but with the same purpose. There were literally two places from which to choose to direct the request. And no, the application wouldn't take incredibly long to compile. I felt like it was quite a bit of overkill.

Those fluent in English know well the phrase "don't put all your eggs in one basket" (kindly linked for those unfamiliar with the idiom). If it is foolish enough to risk dropping the basket and breaking all of your eggs, it is doubly so to put all your eggs in someone else's basket. How could you control your destiny? More...

A couple of weeks ago the UH Code Dojo embarked on the fantastic voyage that is writing a program to solve Sudoku puzzles, in Ruby. This week, we continued that journey.

Though we still haven't completed the problem (we'll be meeting again tenatively on October 15, 2007 to do that), we did construct what we think is a viable plan for getting there, and began to implement some of it.

The idea was based around this algorithm (or something close to it): More...

Is your code littered with checks for types and nulls? Don't get me wrong, now - coding defensively is quite alright: More...

How many of you have experienced the Ballmer peak?

The last bit of advice from Chad Fowler's 52 ways to save your job was to be a generalist, so this week's version is the obvious opposite: to be a specialist.

The intersection point between the two seemingly disparate pieces of advice is that you shouldn't use your lack of experience in multiple technologies to call yourself a specialist in another. Just because you develop in Java to the exclusion of .NET (or anything else) doesn't make you a Java specialist. To call yourself that, you need to be "the authority" on all things Java. More...

... Do you suck?

In a slightly invective tone, Reg makes some great points about assumptions in validation, and the real trouble it can cause. I'm not entirely sure how often I've been guilty, but I know I'll pay a lot closer attention from here on out.

When you build a web app, remember: if it's going public, it has the potential to mess with people's lives.

And now I'm off to re-check those regular expressions I've been using to validate email addresses...

A couple of days ago the UH Code Dojo met once again (we took the summer off). I had come in wanting to figure out five different ways to implement binary search. The first two - iteratively and recursively - are easy to come up with. But what about three other implementations? I felt it would be a good exercise in creative thinking, and pehaps it would teach us new ways to look at problems. I still want to do that at some point, but the group decided it might be more fun to tackle to problem of solving any Sudoku board, and that was fine with me.

Remembering the trouble Ron Jeffries had in trying to TDD a solution to Sudoku, I was a bit weary of following that path, thinking instead we might try Peter Norvig's approach. (Note: I haven't looked at Norvig's solution yet, so don't spoil it for me!) More...

If you don't care about the background behind this, the reasons why you might want to use rules based programming, or a bit of theory, you can skip straight the Drools tutorial.

One of the concepts I love to think about (and do) is raising the level of abstraction in a system. The more often you are telling the computer what, and not how, the better.

Of course, somewhere someone is doing imperative programming (telling it how), but I like to try to hide much of that somewhere and focus more on declarative programming (telling it what). Many times, that's the result of abstraction in general and DSLs and rules-based programming more specifically. More...

Bioinformatics is one area of computing where you'll still want to pay special attention to performance. With the human genome consisting of 3 billion bases, using one byte per base gives you three gigabytes of data to work with. Clearly, something that gives you only a constant reduction in computational complexity can result in huge time savings.

Because of that concern for performance, I expect to be working in C++ regularly this semester. In fact, the first day of class was a nice review of it, and I welcome the change since it's been many years since I've done much of anything in the language. More...

Why don't we see more application code that stays thin on the application side and heavy on the database side?

Just heard from Venkat about the Software Engineering Conference @ UH to be held Oct 19th - 20th, 2007.

Both Venkat from Agile Developer/Relevance and Neal Ford from ThoughtWorks are excellent, entertaining, and informative speakers that I've had a chance to see.

But I'm sure Vali Ali of HP, Sridhar Vajapey from Sun, Ben Galbraith of Ajaxian and more, Jon Schwartz of Phrogram Company, Peter A. Freeman (a former Assistant Director of NSF who is currently at Georgia Tech), and Bjarne Stroustrup, the creator of C++ will also have interesting talks. (Thanks to Venkat for background on the speakers I didn't know about)

If you're going to be in or around Houston (or can make it here) on that weekend, I suggest you register at no cost here.

Blaine Buxton has a nice post on Promises And String Concatenation.

In that post, Blaine notes that string concatenation "make[s] your code slow and consumes memory," and that you are often told to use something like StringBuilder (in Java) when doing a lot of string concatenations.

His position is that the language should abstract that for us and "do the right thing" when you use + (or the concatenation operator). Using + as a message to an object, it would be possible for us mere programmers to implement a string builder on top of + to really speed it up. Of course that's not the case in Java, but he shows an implementation of it in Ruby that, over 1 million iterations of concatenating 'a' to a string, takes only 23.8 seconds versus 4500+ the "normal way."

I'd like to see benchmarks in normal usage to see if that speed increase is typical in a system that uses tons of concatenation, but those numbers are still staggering.

And I agree the language should do it for us, assuming it is possible. I add the clause about possibility there because I can't comprehend why it hasn't been done to begin with (nor have I thought a lot about it). Good catch Blaine.

Update: Chad Perrin notices that String#<< and String#concat do the same thing.

The next few days in Houston are busy for programming technophiles. A couple of quick reminders:

BarCampHouston 2 is this Saturday, August 25, 2007 beginning at 9:00 AM at Houston Technology Center. Update: I had the map wrong since it was wrong on the BarCampHouston wiki page. I hope no one went to the wrong place. Here is the correct one: HTC. I also decided to take the day off and chill out instead of heading up there. My apologies to anyone who had planned to say hello!

HouCFUG is hosting a ColdFusion 8 release party on Tuesday, August 28 from noon to 1:00 PM at Ziggy's Healthy Grill where they'll be giving away a licensed copy of CF 8.

Finally, Agile Houston is hosting a session with Robert Martin, object mentor on Tuesday as well. It's at 6:30 PM in PGH 563 on the University of Houston Campus.

I should be at both BarCamp and at Robert's presentation, but I'll be in class during HouCFUG's meeting.

This week's advice from Chad Fowler's gem of a book really resonated with me when I read it, and it continues to do so. It was one of my favorite chapters in the book: Be a Generalist.

Don't be "just a coder" or "just a tester" or just anything. Doing so leaves you useful only in certain contexts, when reality dictates that it would be better to be generally useful. More...

I'm looking for a couple of pieces of software and was hoping to get some expert opinion (that's why I'm asking you!).

First, I need a standalone diff/merge tool for Windows. I've seen a couple from searching Google, but was hoping for a non-paid version as it is only a temporary solution. If you don't know of a free one, I'll still be glad to know what you use that you were willing to pay for (and what you think of it). More...

This week's advice from My Job Went to India is easy to follow: invest in yourself. In particular, Chad mentions some advice found in The Pragmatic Programmer. Namely, that you should learn a new language (every year). But don't just learn any language - it should be a paradigm shifting language. In other words, "don't go from Java to C#." Go from ColdFusion to Haskell or Java to Ruby or Visual Basic to Lisp.

To illustrate the point, Chad tells the story of the developer who, when asked about having used a particular technology, replied, "I haven't been given the opportunity to work on that." Unless you don't own a computer (which Chad says probably was the case for that developer), you don't need the opportunity to be given to you. You need to take some time and do it yourself. More...

A few days ago I posted a question of preference for implementation of a complicated boolean expression. One was a complete mish-mash of nastiness, while the other used descriptive variable names to which the original conditions were assigned.

Not surprisingly, everyone who responded preferred the second one. On the other hand, I was expecting at least someone to prefer the mish-mash because it has all the logic right there where you can see it rather than hiding it. I'm glad no one said that, but I've heard it said about using classes, functions, or different files for instance.

The point of the post was simple: I just wanted to highlight that you can use variables to store results of complex boolean expressions. Of course, most people already know this. Even I know it.

So why is it when I look through code I see so little use of it?

Before reading Chad's book, I was a one-"stack" kind of programmer. I knew a bit about Java and .NET, and I was fairly competent in C and C++ (though I wouldn't say I knew much about OO). I had done a couple of things in PHP and ASP. But for the most part, I was only using ColdFusion and Microsoft SQL Server on Windows with IIS as the web server. Consequently, I knew HTML and a bit of CSS too. I largely shied away from learning anything new, and tried my hardest to only work with what I knew well. More...

A couple of evenings ago, after I wrote about how I got involved in programming and helped a friend with some C++ (he's a historian), I got inspired to start writing a scripting engine for a text-based adventure game. Maybe it will evolve into something, but I wanted to share it in its infancy right now.

My goal was to easily create different types of objects in the game without needing to know much about programming. In other words, I needed a declarative way to create objects in the game. I could just go the easy route and create new types of weapons like this: More...

From a project I was working on recently I ran into this problematic and fairly complex boolean expression (pseudocode):

if not arguments.pending and arguments.umoneyamount is 0 and aname.agencyname is arguments.agencyname
    do something
elseif ((aname.agencyname is arguments.agencyname and arguments.pending) or ( is 1 or (isdefined("session.realid") and session.realid gt 0)) or (not arguments.pending and arguments.umoneyamount is 0 and local.isMyClient)) and ( is 1 or arguments.pending)
    do something else
    do the other thing

Or alternatively,

local.isMyClient = aname.agencyname is arguments.agencyname
local.isPending = arguments.pending
local.isAdmin = is 1 or (isdefined("session.realid") and session.realid gt 0)
local.canAddUtilities = not arguments.pending and arguments.umoneyamount is 0 and local.isMyClient
local.allow_edit = (local.isMyClient and local.isPending) or local.isAdmin or local.canAddUtilities

if local.canAddUtilities
    do something
elseif local.allow_edit and ( is 1 or arguments.pending)
    do something else
    do the other thing

Which do you prefer, and why?

As I've mentioned before, I used to try my hardest to avoid contact with the customer, and as software developers, I think we don't spend enough time concerned about the business aspects of what we do.

In My Job Went to India, Chad Fowler says he thinks so too.

To combat code monkey syndrome, Chad suggests we have lunch with business people and read trade magazines for our business domains as often as we can. More...

A while ago Ted Neward was reading The 33 Strategies of War and decided to relate those strategies of war to software development.

This weekend he put up the first in the series of 33: Declare War on Your Enemies (The Polarity Strategy).

In any case, I like the idea. Having had some education in politics and computer science, this appealed to me even more than it might have, but I'd think I'd recommend reading it anyway.

My favorite quote?
We will not ship code that fails to pass any unit test... Well, then, we'll not write any unit tests, and we'll have met that goal!
What do you think of viewing software projects as conflict?

Like many programmers, I started doing this because of my interest in video games. I was 6 years old when I first touched a computer. It was an Apple IIe and I would play a game involving Donald Duck, his nephews, and a playground (I forget the name of the game). I was hooked, and took every available chance to play that I could.

Subsequently, I got a Nintendo and played all sorts of games. Super Mario Bros. was my favorite, of course, and it greatly inspired me. After a while, I was spending more time planning and drawing levels in my notebook for two-dimensional side-scrolling video games than I was playing them. It wasn't long before I envisioned my own game console. More...

We could all stand to be better at what we do - especially those of us who write software. Although many of these ideas were not news to me, and may not be for you either, you'd be surprised at how you start to slack off and what a memory refresh will do for you.

Here are (briefly) 10 ways to improve your code from the NFJS session I attended with Neal Ford. Which do you follow? More...

In the past you used to give and receive advice that keeping form state in a session was a valid way to approach the problem of forms that span several pages. It's no longer sound advice, and it hasn't been for a while.

Even before tabs became popular, a few of us were right-clicking links and opening-in-new-windows. It's a nice way to get more things done quicker: loading another page while waiting for the first one to load so that you are doing something all the time, rather than spending a significant amount of time waiting. It even works well in web applications - not just general surfing. More...

The second talk I attended at NFJS was Stuart Halloway's JavaScript for Ajax Programmers. I had planned to attend a different session, but all the seats were full, and Stuart's presentation was in the large room, so I figured I could sneak in a bit late without anyone taking much notice.

After attending the session, I wasn't upset that the other one had been full - Stuart had quite a few tidbits of solid advice to give us. The funny thing is, I read his blog and didn't realize he was at the conference until I entered the room and saw the first slide. If I had known, I would have likely attended his presentation anyway, so the fact that the other one was full was a stroke of luck in that regard (although, I know it would have been good as well because I've been in courses from the speaker). More...

The latest Ruby Quiz asks us to find the maximum contiguous subarray, given an array. This ties in nicely to something I've been wanting to ask for a while: how do you design your algorithms? What heuristics do you use? What different approaches do you try? How can you improve your skill set at designing algorithms?

For the maximum subarray problem, if you didn't know any better, you'd probably implement a solution that analyzes every possible subarray, and returns the one with the maximum sum: More...

Software Developer has an article, Ghosts in the Machine: 12 Coding Languages that Never Took Off that spotlights twelve of many thousands of languages that never made it big. Some had potential, others were doomed to begin with.

ColdFusion makes the list. So do Haskell, Delphi, and PowerBuilder.

I don't know that I disagree with the assessment based on the thought that "the vast majority of us all use the same dozen or so."

What do you think of the list? I was surprised to see those four languages included with some of the others, but at the same time you still have to ask, have they made it? And if they did make it, are they still there?

(via Venkat)

I first read Chad Fowler's gem, My Job Went to India: 52 Ways to Save Your Job around this time last summer. Let me tell you, it was an awesome book (at least, it seriously changed the way I look at my career, and the way I've been managing it since then). It was a greatly inspiring book.

The reason I bring this up today is because despite the fact it was released a couple of years ago, its been making rounds through a couple of blogs lately. On top of that, I've wanted to write about it since reading it, and now seems like the perfect time. More...

I ran into a couple of stumbling blocks today using a particular company's XML request API, and it made me wonder about the restrictions we put in our software that have no apparent reason.

My plan was simple: create a struct/hash/associative array that matches the structure of the XML document, run it through a 10-line or so function that translates that to XML, and send the data to the service provider. Simple enough - my code doesn't even need to know the structure - the programmer using it would provide that, and the service provider doubles as the validator! My code was almost the perfect code. It did almost nothing, and certainly nothing more than it was meant or needed to do.

But alas, it was not meant to be. Instead, the service provider has a couple of restrictions: More...

A phrase that Venkat likes to use sums up my view on software religions: "There are no best practices. Only better practices." That's a powerful thought on its own, but if you want to make it a little more clear, I might say "There are no best practices. Only better practices in particular contexts." This concept tinges my everyday thinking.

As I mentioned before, I'm blind to rules that "always apply" in software development. This post about fundamentalism in software methodology got me thinking about a post I had meant to write up a while ago, which basically turned into this one. More...

There is a seemingly never-ending debate (or perhaps unconnected conversation and misunderstandings) on whether or not the software profession is science or art, or specifically whether "doing software" is in fact an engineering discipline.

Being the pragmatist (not a flip-flopper!) I aspire to be, and avoiding hard-and-fast deterministic rules on what exactly is software development, I have to say: it's both. There is a science element, and there is an artistic element. I don't see it as a zero-sum game, and its certainly not worthy of idolatry. (Is anything?) More...

About a month ago, Robert Martin posted what has to be the funniest critique on using XML I've ever read. More...

Kristian Dupont Knudsen's "Top Ten of Programming Advice to NOT follow" showed up on my radar in a couple of places recently (those links won't always be to the correct page in, I fear).

Most of it is pretty solid, but two of the ten stuck out at me. First, the advice not to "make sure your team shares a common coding standard." Kristian uses a couple of three line functions to illustrate that both are easily read: More...


Picture of me

.NET (19)
AI/Machine Learning (14)
Answers To 100 Interview Questions (10)
Bioinformatics (2)
Business (1)
C and Cplusplus (6)
cfrails (22)
ColdFusion (78)
Customer Relations (15)
Databases (3)
DRY (18)
DSLs (11)
Future Tech (5)
Games (5)
Groovy/Grails (8)
Hardware (1)
IDEs (9)
Java (38)
JavaScript (4)
Linux (2)
Lisp (1)
Mac OS (4)
Management (15)
MediaServerX (1)
Miscellany (76)
OOAD (37)
Productivity (11)
Programming (168)
Programming Quotables (9)
Rails (31)
Ruby (67)
Save Your Job (58)
scriptaGulous (4)
Software Development Process (23)
TDD (41)
TDDing xorblog (6)
Tools (5)
Web Development (8)
Windows (1)
With (1)
YAGNI (10)

Agile Manifesto & Principles
Principles Of OOD
Ruby on Rails

RSS 2.0: Full Post | Short Blurb
Subscribe by email:

Delivered by FeedBurner