Posted by Sam on Jun 25, 2008 at 08:01 AM UTC - 5 hrs
I don't like to have too many microposts on this blog, so I've decided to save them up and start
a Programming Quotables series. The idea is that I'll post quotes about programming that have one or more of the
following attributes:
- I find funny
- I find asinine
- I find insightfully true
- And stand on their own, with little to no comment needed
Here's the seventh in that series. I hope you enjoy them as much as I did:
More...
In the software industry, we've been chasing quality for years. The interesting thing is there are a number of things that work. Design by Contract works. Test Driven Development works. So do Clean Room, code inspections and the use of higher-level languages.
All of these techniques have been shown to increase quality. And, if we look closely we can see why: all of them force us to reflect on our code.
That's the magic, and it's why unit testing works also. When you write unit tests, TDD-style or after your development, you scrutinize, you think, and often you prevent problems without even encountering a test failure.
Perhaps I've made it seem like I'm on the side of the pirates. Just to make it clear that I'm not sailing under the jolly roger: In my own view, piracy is wrong. It's wrong even when the people making and selling the game are senseless, self-destructive fools. It's wrong even if the game sucks. It's wrong if you're broke. It's wrong even if "you weren't going to buy it anyway." It's wrong and I don't do it, ever.
It is not my intention to preach at pirates and get them to change their habits. I'm not anyone's mum, and it's not my place to tell people how to act. I actually think that having lots of people repent of piracy right now would be horrible. The managers would conclude their monstrous policies were working, and we'd get a double helping of the same, forever after, in every game they put out.
Regardless of the approach taken, I definitely no longer believe that sprocs should play any significant role in any application. The current mandate in the software industry is to strive to lower costs by increasing developer productivity and ORM's clearly help to do this by eliminating the need to write and maintain countless simple CRUD sprocs.
It's definitely time for all of us .NET developers to abandon our convention sproc wisdom and start playing catch-up with the rest of the industry when it comes to using ORM's.
I am not at the mercy of some big up-front UML diagrams or "non-agile" models grounded in getting something wrong in its entirety and very thoroughly before you take measures to fix it (or even begin to detect it).
Hey! Why don't you make your life easier and subscribe to the full post
or short blurb RSS feed? I'm so confident you'll love my smelly pasta plate
wisdom that I'm offering a no-strings-attached, lifetime money back guarantee!
Last modified on Jun 25, 2008 at 08:03 AM UTC - 5 hrs
Posted by Sam on Jun 02, 2008 at 07:57 AM UTC - 5 hrs
I don't like to have too many microposts on this blog, so I've decided to save them up and start
a Programming Quotables series. The idea is that I'll post quotes about programming that have one or more of the
following attributes:
- I find funny
- I find asinine
- I find insightfully true
- And stand on their own, with little to no comment needed
Here's the sixth in that series. I hope you enjoy them as much as I did:
More...
Ahhh... configuration. I sometimes think this is a misnomer. At least in the way that the Java and .NET community have approached config in practice. We've had this trend in which we started jamming everything into XML configuration.
So much so, we often get asked to provide XML to configure features I think ought to be set in code along with unit tests. We've turned XML into a programming language, and a crappy one at that. Ayende talks about one issue with sweeping piles of XML configuration under a tool. This is not an intractable problem, but it highlights the fact that XML is code, but it is code with a lot of ceremony compared to the amount of essence. To understand what I mean by ceremony vs essence read Ending Legacy Code In Our Lifetime.
Phil Haack, Dynamic Language DSL vs Xml Configuration
I thought about this for a minute, and realized that implicit in the conversation are several assumptions (or perhaps more accurately, conventionally perceived "truths") with regard to the craft of software development.
1)"there isn't that much to it" = "software is really easy to write"
2)"We have everything important figured out" = "in a business, the actual software is just icing on the cake"
3)"All we need is a programmer" = "software developers are cogs in a machine, or interchangeable components of an assembly line"
Ethan Vizitei, "All I need is a Programmer"
But do I belong to the company I work for? No! Never!
If that means I'm doomed to walk the Earth for eternity writing code and building beautiful ideas, then that's ok.
No matter how much my job makes me happy, my family and my life outside work are just as important and more. Obsessing about anything is not good. Moderation is good. Do everything well but know when to stop. Do your job well but remember to go and hang out with your friends. Put down the mouse and call someone to go out. Liking your life outside work does not mean you suck at work. It means you are good at living.
Damana Madden,
You don't own me, I'm not that kind of girl anymore
The folks who insist on 100% code coverage [for unit tests] are making a useful tool unpalatable to serious programmers.
Andrew Binstock, Is the popularity of unit tests waning?
Posted by Sam on Apr 14, 2008 at 06:11 PM UTC - 5 hrs
If I am healthy, my body may come to rely on being so and forget what to do when I am sick. Therefore, it is better to be sick than to be healthy.
Since I spent my morning reading reddit and typing comments there instead of writing today's blog post, I'll let you in on this discussion that's going on over there: Larry O'Brien's article 30K application lines + 110K testing lines: Evidence of...? was posted to this thread on reddit, and the FUD started to fly. (If you're interested in the subject, there's also a thread about programmers not getting it, or not wanting to.)
It started with Larry quoting himself on praising extreme programming, and mentioning 110 thousand lines of test code to 30 thousand lines of application code, with the application having been developed in Python. Alan Holub took that as an indictment of dynamic languages, with Larry quoting him as saying:
More...
I want to take exception to the notion that Python is adequate for a real programming project. The fact that 30K lines of code took 110K lines of tests is a real indictment of the language. My guess is that a significant portion of those tests are addressing potential errors that the compiler would have found in C# or Java. Moreover, all of those unnecessary tests take a lot of time to write, time that could have been spent working on the application.
In fact, many people were shocked at the amount of tests compared to code, and that's what the discussion (at least the part I was interested in) centered around. Four times as much test code as application logic is too much. It would shackle you, instilling fear in your heart and soul. No changes would ever be made with that kind of viscosity. Furthermore, tests can provide a false sense of security, a blankie, if you will.
You've got to be kidding. Having a test that will tell you when you broke existing functionality is pressure to avoid changes? To me, that's liberating!
Contrast that with not having a test to tell you when something broke. Does it even make sense to say having tests pressures you to avoid changes? Only if you fear having a program that works over having one that you think works.
Let me try a different approach. Take the following simple program:
if (someCondition is true)
do something
else
something else
if (anotherCondition is true)
do another thing
There are four execution paths: One where both someCondition and anotherCondition are true, one where they are both false, and one each where one is true and the other isn't.
In other words, we have six lines of code and at least four tests we should write to cover all the cases. If each test is just a single line, we still need to write the method names and end lines, so that would give us three lines per test - for a total of twelve lines of test code.
The test code size is already double the number of lines in our application code for this simple, six line program with four execution paths.
How many execution paths are in a 30 thousand line program?
Seeing as the number of execution paths in code is more likely to grow exponentially than linearly with each new line that gets written, 110 thousand lines of test code isn't actually all that much.
Further, the solution to the blankie problem is not "have fewer tests," it is to recognize that passing the tests is a necessary - not sufficient - condition of working software.
Following the blankie argument to its logical conclusion - that fewer tests mean you write better code because you are more careful - we should have no tests.
In fact part of the reason we want the tests is that we can make changes to the code with less fear of unknowingly breaking existing functionality or introducing defects in the software.
In the end, if someone is using the tests as a tool to do wrong, there is something wrong with the person, not the test. They will find another way to do wrong, even if we remove the tests from their arsenal.
Last modified on Apr 14, 2008 at 06:16 PM UTC - 5 hrs
Posted by Sam on Mar 17, 2008 at 10:26 AM UTC - 5 hrs
Suppose you want to write an algorithm that, when given a set of data points, will find an appropriate number
of clusters within the data. In other words, you want to find the k for input to the
k-means algorithm without having any
a priori knowledge about the data. (Here is my own
failed attempt at finding the k in k-means.)
def test_find_k_for_k_means_given_data_points()
data_points = [1,2,3,9,10,11,20,21,22]
k = find_k_for_k_means(data_points)
assert(k==3, "find_k_for_k_means found the wrong k.")
end
The test above is a reasonable specification for what the algorithm should do. But take it further: can you actually
design the algorithm by writing unit tests and making them pass?
More...
I've previously expressed my doubt that TDD makes an effective approach to algorithm design.
More recently, I alluded to a some optimism towards the same idea.
Then, in a comment to the same post, Dat Chu asked about using unit tests in algorithm design, also referencing
something that Ben Nadel had said about asserting in code-comments
how the state of the algorithm should look at certain points.
That all led to this post, and me wanting to lay my thoughts out a little further.
In the general case, I agree with Dat that it would be better to have the executable tests/specs.
But, what Ben has described sounds like a stronger version of what Steve McConnell called
the pseudocode programming process
in Code Complete 2, which can be useful in working your way through an algorithm.
Taking it to the next step, with executable asserts - the "Iterative Approach To Algorithm Design" post
came out of a case similar to the one described at the top. Imagine you're coming up with something
completely new to you (in fact, in our case, we think it is new to anyone), and you know what you want
the results to be, but you're not quite sure how to transform the input to get them.
What
good does it do me to have that test if I don't know how to make it pass?
The unit test is useful for testing the entire unit (algorithm), but not as helpful for
testing the bits in between.
Now, you could potentially break the algorithm into pieces - but if you're working through it for the
first time, it's unlikely you'll see those breaking points up front.
When you do see them, you can write a test if you like. However, if it's not really a complete unit,
then you'll probably end up throwing the test away.
Because of that, and the inability to notice the units until
after you've created them, I like the simple assert statements as opposed to
the tests, at least in this case.
When we tried solving Sudoku using
TDD during a couple of meetings of the UH Code Dojo, we introduced a lot of methods I felt were artificially there, just to be able to test them.
We also created an object where one might not have existed had we known a way to solve Sudoku through
code to begin with.
Now, we could easily clean up the interface when we're done, but I don't really feel a compulsion to practice
TDD when working on algorithms like I've outlined above. I will write several tests for them to make sure
they work, but (at least today) I prefer to work through them without the hassle of writing tests for the
subatomic particles that make up the unit.
Posted by Sam on Feb 18, 2008 at 06:43 AM UTC - 5 hrs
Last week, hgs asked,
I find it interesting that lots of people write about how to produce clean code,
how to do good design, taking care about language choice, interfaces, etc, but few people
write about the cases where there isn't time... So, I need to know what are the forces that tell you
to use a jolly good bodge?
I suspect we don't hear much about it because these other problems are often caused by that excuse.
And, in the long run, taking on that technical debt will likely cause you to go so slow that that's the
more interesting problem. In other words, by ignoring the need for good code, you are jumping into
a downward spiral where you are giving yourself even less time (or, making it take so long to do anything
that you may as well have less time).
More...
I think the solution is to start under-promising and over-delivering, as opposed to how most of us do it
now: giving lowball estimates because we think that's what they want to hear. But why lie to them?
If you're using iterative and incremental development, then if you've over-promised one iteration, you
are supposed to dial down your estimates for what you can accomplish in subsequent iterations, until
you finally get good at estimating. And estimates should include what it takes to do it right.
That's the party-line answer to the question. In short: it's never OK to write sloppy code, and
you should take precautions against ever putting yourself in a situation where those
viscous forces pull you in that direction.
The party-line answer is the best answer, but it doesn't fully address the question, and I'm not
always interested in party-line answers anyway. The viscosity (when it's easier
to do the wrong thing that the right thing) is the force behind the bodge. I don't like it, but I
recognize that there are going to be times you run into it and can't resist.
In those cases where you've already painted yourself into a corner, what then? That's the interesting
question here. How do you know the best
places to hack crapcode together and ignore those things that may take a little longer in the short run, but
whose value shows up in the long run?
The easy answer is the obvious one: cut corners in the code that is least likely to need to change or
be touched again. That's because (assuming your hack works) if we don't have to look at the code again,
who really cares that it was a nasty hack? The question whose answer is not so easy or
obvious is "what does such a place in the code look like?"
By the definition above, it would be the lower levels of your code. But if you do that, and inject a bug, then
many other parts of your application would be affected. So maybe that's not the right place to do it.
Instead, it would be better to do it in the higher levels, on which very little (if any) other code
depends. That way, you limit the effects of it. More importantly, if there are no outgoing dependencies
on it, it is easier to change than if other code were highly dependent on it. [ 1]
Maybe the crapcode can be isolated: if a class is already aweful, can you derive a new class from it and
make any new additions with higher quality? If a class is of high quality and you need to hack something together,
can you make a child class and put the hack there? [ 2]
Uncle Bob recently discussed when unit and acceptance
testing may safely be done away with. He put those numbers around 10 and a few thousand lines of code,
respectively.
In the end, there is no easy answer that I can find where I would definitively say, "that's the place for a bodging."
But I suspect there are some patterns we can look for, and I tried to identify a couple of those above.
Do you have any candidates you'd like to share?
Notes:
[1] A passing thought for which I have no answers:
The problem with even identifying those places is that by hacking together solutions, you are more likely
to inject defects into the code, which makes it more likely you'll need to touch it again.
[2] I use inheritance here because the new classes should be
able to be used without violating LSP.
However, you may very well be able to make those changes by favoring composition.
If you can, I'd advocate doing so.
Posted by Sam on Feb 11, 2008 at 05:56 AM UTC - 5 hrs
When I posted about why it's important to test everything first, Marc Esher
from MXUnit asked:
What do you find hard about TDD? When you're developing and you see yourself
not writing tests but jamming out code, what causes those moments for you?
And have you really, in all honesty, ever reaped significant benefits either in
productivity or quality from unit testing? Because there's a pretty large contingent
of folks who don't get much mileage out of TDD, and I can see where they're coming from.
My TDD Stumbling Blocks
I'll address the first bit in one word: viscosity. When it's easier to do the wrong thing
than the right thing, that's when I "see myself not writing tests but jamming out code."
But what causes the viscosity for me? Several things, really:
More...
-
When I'm working with a new framework or technology and I don't know how to test it: I'm trying
to avoid this now by learning languages by unit testing.
However, it's still tough. I started writing tests in C# .NET recently, but moving things to
ASP.NET has made me stumble a bit. That's mostly because I didn't take the time to understand
how it all worked before I started using it, and now I'm in the process of rewriting that code before it becomes too
entrenched.
- UIs: I still don't understand how to test them effectively. I like Selenium for the web,
but most tests I write with it are brittle. Because of that, I write them flippantly. It's a
vicious cycle too: without learning what works, I won't get better at identifying strategies to
remove the viscosity, so I won't write the tests.
- Finally, and most of all: legacy code bases, with no tests and poor design. It's so much easier to hack a fix than it
is to refactor to something testable. I've yet to read Michael Feathers' Working Effectively
With Legacy Code, so that may help when I finally do. (You can find a twelve page PDF article
@ the Object Mentor Resources Website.)
At the minimum, it should help motivate me to follow the elephant more often.
That last one is a killer for me. When I'm working on new projects, it's incredibly easy to write
tests as I develop. So much so that I don't bother thinking about not doing it. Unfortunately, most
of my work is not in new code bases.
I should also note that I often don't bother unit testing one-off throwaway scripts, but there
are times when I do.
On top of that, my unit tests rarely stay unit-sized. I generally just
let them turn into integration tests (stubbing objects as I need them when they are still
unit-sized). The only time I bother with mocks are if the integration piece is taking too long
to run tests.
For example, I might let the tests hit a testing database for a while, but as the tests get unbearable
to run, I'll write a different class to use that just returns some pre-done queries, or runs
all the logic except for save().
What about rewards?
In Code Complete 2, Steve McConnell talks about why it's important to measure experiments when
tuning your code:
Experience doesn't help much with optimization either. A person's experience might have
come from an old machine, language, or compiler - when any of those things changes, all
bets are off. You can never be sure about the effect of an optimization until you
measure the effect. (McConnell, 603)
I bring that up because I think of TDD (and any other practice we might do while
developing) as an optimization, and to be sure about it's effects, I'd have to measure it.
I haven't measured myself with TDD and without, so you can take what follows as anecdotal
evidence only. (Just because I say that, don't think you can try TDD for a couple of days
and decide it's slowing you down so it doesn't bring any benefit - it takes a while to
realize many of the benefits.)
So what rewards have I noticed? Like the problems I've had, there are a few:
-
Better design: My design without TDD has been a train wreck (much of that due to my
past ignorance of design principles), but has (still) improved as a result of TDD.
After all, TDD is a design activity. When writing a test, or determining what test to write next, you
are actively involved in thinking about how you want your code to behave, and how you want to
be able to reuse it.
As a byproduct of writing the tests, you get a very modular design - it becomes harder to do
the wrong thing (bad design), and easier to keep methods short and cohesive.
-
Less fear: Do you have any code that you just hate to touch because of the horror it sends
down your spine? I do. I've had code that is so complex and wrapped up within itself that I've
literally counseled not changing it for fear of it breaking and not being able to fix it. My
bet is that you've probably seen similar code.
The improved design TDD leads to helps that to some extent obviously. But there may be times
when even though you've got a test for something, it's still ugly code that could break easily.
The upside though, is you don't need to fear it breaking. In fact, if you think about it,
the fear isn't so much that you'll break the code - you fear you won't know you've broken it.
With good tests, you know when you've broken something and you can fix it before you deploy.
-
Time savings: It does take some time to write tests, but not as much as you might think.
As far as thinking about what you want your code to do, and how you want to reuse it, my
belief is that you are doing those things anyway. If not, you probably should be, and your
code likely looks much the same as some of that which I have to deal with
(for a description, see the title of this weblog).
It saves time as an executable specification - I don't have to trace through a big code base
to find out what a method does or how it's supposed to do it. I just look up the unit tests
and see it within a few clean lines.
Most of your tests will be 5-7 lines long, and you might have five tests per method. Even
if you just test the expected path through the code, ignoring exceptions and negative tests,
you'll be a lot better off and you'll only be writing one or two tests per method.
How long does that take? Maybe five minutes per test? (Which would put you at one minute per line!)
Maybe you won't achieve that velocity as you're learning the style of development, but certainly you could
be there (or better) after a month or two.
And you're testing anyway, right? I mean, you don't write code and check it in to development
without at least running it, do you? So, if you're programming from the bottom up, you've
already written a test runner of some sort to verify the results. What would it cost to
put that code into a test? Perhaps a minute or three, I would guess.
And now when you need to change that code, how long does it take you to login to the application,
find the page you need to run, fill out the form, and wait for a response to see if you were right?
If you're storing the result in the session, do you need to log out and go through the same process,
just to verify a simple calculation?
How much time would it save if you had written automated tests? Let's say it takes you two
minutes on average to verify a change each time you make one. If it took you half-an-hour
of thinking and writing five tests, then within 15 changes you've hit even and the rest is gravy.
How many times do you change the same piece of code? Once a year? Oh, but we didn't include all the
changes that occur during initial development. What if you got it wrong the first
time you made the fix? Certainly a piece of code changes 15 times even before you've got it
working in many cases.
Overall, I believe it does save time, but again, I haven't measured it. It's just all those little
things you do that take a few seconds at a time - you don't notice them. Instead, you think
of them as little tasks to get you from one place to another. That's what TDD is like: but
you don't see it that way if you haven't been using it for a while. You see it as an
extra task - one thing added to do. Instead, it replaces a lot of tasks.
And wouldn't it be better if you could push a button and verify results?
That's been my experience with troubles and benefits. What's yours been like? If you haven't
tried it, or are new, I'm happy to entertain questions below (or privately if you prefer) as
well.
Posted by Sam on Jan 30, 2008 at 07:34 AM UTC - 5 hrs
Because when you don't, how do you know your change to the code had any effect?
When a customer calls with a trouble ticket, do you just fix the problem, or do you
reproduce it first (a red test), make the fix, and test again (a green test, if you fixed the
problem)?
Likewise, if you write automated tests, but don't run them first to ensure
they fail, it defeats the purpose of having the test. Most of the time you won't
run into problems, but when you do, it's not fun trying to solve them. Who would
think to look at a test that's passing?
The solution, of course, is to forget about testing altogether. Then we won't be lulled into
a false sense of security. Right?
Last modified on Jan 30, 2008 at 07:35 AM UTC - 5 hrs
Posted by Sam on Jan 14, 2008 at 06:42 AM UTC - 5 hrs
This is a story about my journey as a programmer, the major highs and lows I've had along the way, and
how this post came to be. It's not about how ecstasy made me a better programmer, so I apologize if that's why you came.
In any case, we'll start at the end, jump to
the beginning, and move along back to today. It's long, but I hope the read is as rewarding as the write.
A while back,
Reg Braithwaite
challenged programing bloggers with three posts he'd love to read (and one that he wouldn't). I loved
the idea so much that I've been thinking about all my experiences as a programmer off and on for the
last several months, trying to find the links between what I learned from certain languages that made
me a better programmer in others, and how they made me better overall. That's how this post came to be.
More...
The experiences discussed herein were valuable in their own right, but the challenge itself is rewarding
as well. How often do we pause to reflect on what we've learned, and more importantly, how it has changed
us? Because of that, I recommend you perform the exercise as well.
I freely admit that some of this isn't necessarily caused by my experiences with the language alone - but
instead shaped by the languages and my experiences surrounding the times.
One last bit of administrata: Some of these memories are over a decade old, and therefore may bleed together
and/or be unfactual. Please forgive the minor errors due to memory loss.
How QBASIC Made Me A Programmer
As I've said before, from the time I was very young, I had an interest in making games.
I was enamored with my Atari 2600, and then later the NES.
I also enjoyed a playground game with Donald Duck
and Spelunker.
Before I was 10, I had a notepad with designs for my as-yet-unreleased blockbuster of a side-scrolling game that would run on
my very own Super Sanola game console (I had the shell designed, not the electronics).
It was that intense interest in how to make a game that led me to inspect some of the source code Microsoft
provided with QBASIC. After learning PRINT, INPUT,
IF..THEN, and GOTO (and of course SomeLabel: to go to)
I was ready to take a shot at my first text-based adventure game.
The game wasn't all that big - consisting of a few rooms, the NEWS
directions, swinging of a sword against a few monsters, and keeping track of treasure and stats for everything -
but it was a complete mess.
The experience with QBASIC taught me that, for any given program of sufficient complexity, you really only
need three to four language constructs:
- Input
- Output
- Conditional statements
- Control structures
Even the control structures may not be necessary there. Why? Suppose you know a set of operations will
be performed an unknown but arbitrary amount of times. Suppose also that it will
be performed less than X number of times, where X is a known quantity smaller than infinity. Then you
can simply write out X number of conditionals to cover all the cases. Not efficient, but not a requirement
either.
Unfortunately, that experience and its lesson stuck with me for a while. (Hence, the title of this weblog.)
Side Note: The number of language constructs I mentioned that are necessary is not from a scientific
source - just from the top of my head at the time I wrote it. If I'm wrong on the amount (be it too high or too low), I always appreciate corrections in the comments.
What ANSI Art taught me about programming
When I started making ANSI art, I was unaware
of TheDraw. Instead, I opened up those .ans files I
enjoyed looking at so much in MS-DOS Editor to
see how it was done. A bunch of escape codes and blocks
came together to produce a thing of visual beauty.
Since all I knew about were the escape codes and the blocks (alt-177, 178, 219-223 mostly), naturally
I used the MS-DOS Editor to create my own art. The limitations of the medium were
strangling, but that was what made it fun.
And I'm sure you can imagine the pain - worse than programming in an assembly language (at least for relatively
small programs).
Nevertheless, the experience taught me some valuable lessons:
- Even though we value people over tools, don't underestimate
the value of a good tool. In fact, when attempting anything new to you, see if there's a tool that can
help you. Back then, I was on local BBSs, and not
the 1337 ones when I first started out. Now, the Internet is ubiquitous. We don't have an excuse anymore.
-
I can now navigate through really bad code (and code that is limited by the language)
a bit easier than I might otherwise have been able to do. I might have to do some experimenting to see what the symbols mean,
but I imagine everyone would.
And to be fair, I'm sure years of personally producing such crapcode also has
something to do with my navigation abilities.
-
Perhaps most importantly, it taught me the value of working in small chunks and
taking baby steps.
When you can't see the result of what you're doing, you've got to constantly check the results
of the latest change, and most software systems are like that. Moreover, when you encounter
something unexpected, an effective approach is to isolate the problem by isolating the
code. In doing so, you can reproduce the problem and problem area, making the fix much
easier.
The Middle Years (included for completeness' sake)
The middle years included exposure to Turbo Pascal,
MASM, C, and C++, and some small experiences in other places as well. Although I learned many lessons,
there are far too many to list here, and most are so small as to not be significant on their own.
Therefore, they are uninteresting for the purposes of this post.
However, there were two lessons I learned from this time (but not during) that are significant:
-
Learn to compile your own $&*@%# programs
(or, learn to fish instead of asking for them).
-
Stop being an arrogant know-it-all prick and admit you know nothing.
As you can tell, I was quite the cowboy coding young buck. I've tried to change that in recent years.
How ColdFusion made me a better programmer when I use Java
Although I've written a ton of bad code in ColdFusion, I've also written a couple of good lines
here and there. I came into ColdFusion with the experiences I've related above this, and my early times
with it definitely illustrate that fact. I cared nothing for small files, knew nothing of abstraction,
and horrendous god-files were created as a result.
If you're a fan of Italian food, looking through my code would make your mouth water.
DRY principle?
Forget about it. I still thought code reuse meant copy and paste.
Still, ColdFusion taught me one important aspect that got me started on the path to
Object Oriented Enlightenment:
Database access shouldn't require several lines of boilerplate code to execute one line of SQL.
Because of my experience with ColdFusion, I wrote my first reusable class in Java that took the boilerplating away, let me instantiate a single object,
and use it for queries.
How Java taught me to write better programs in Ruby, C#, CF and others
It was around the time I started using Java quite a bit that I discovered Uncle Bob's Principles of OOD,
so much of the improvement here is only indirectly related to Java.
Sure, I had heard about object oriented programming, but either I shrugged it off ("who needs that?") or
didn't "get" it (or more likely, a combination of both).
Whatever it was, it took a couple of years of revisiting my own crapcode in ColdFusion and Java as a "professional"
to tip me over the edge. I had to find a better way: Grad school here I come!
The better way was to find a new career. I was going to enter as a Political Scientist
and drop programming altogether. I had seemingly lost all passion for the subject.
Fortunately for me now, the political science department wasn't accepting Spring entrance, so I decide to
at least get started in computer science. Even more luckily, that first semester
Venkat introduced me to the solution to many my problems,
and got me excited about programming again.
I was using Java fairly heavily during all this time, so learning the principles behind OO in depth and
in Java allowed me to extrapolate that for use in other languages.
I focused on principles, not recipes.
On top of it all, Java taught me about unit testing with
JUnit. Now, the first thing I look for when evaluating a language
is a unit testing framework.
What Ruby taught me that the others didn't
My experience with Ruby over the last year or so has been invaluable. In particular, there are four
lessons I've taken (or am in the process of taking):
-
The importance of code as data, or higher-order functions, or first-order functions, or blocks or
closures: After learning how to appropriately use
yield, I really miss it when I'm
using a language where it's lacking.
-
There is value in viewing programming as the construction of lanugages, and DSLs are useful
tools to have in your toolbox.
-
Metaprogramming is OK. Before Ruby, I used metaprogramming very sparingly. Part of that is because
I didn't understand it, and the other part is I didn't take the time to understand it because I
had heard how slow it can make your programs.
Needless to say, after seeing it in action in Ruby, I started using those features more extensively
everywhere else. After seeing Rails, I very rarely write queries in ColdFusion - instead, I've
got a component that takes care of it for me.
-
Because of my interests in Java and Ruby, I've recently started browsing JRuby's source code
and issue tracker.
I'm not yet able to put into words what I'm learning, but that time will come with
some more experience. In any case, I can't imagine that I'll learn nothing from the likes of
Charlie Nutter, Ola Bini,
Thomas Enebo, and others. Can you?
What's next?
Missing from my experience has been a functional language. Sure, I had a tiny bit of Lisp in college, but
not enough to say I got anything out of it. So this year, I'm going to do something useful and not useful
in Erlang. Perhaps next I'll go for Lisp. We'll see where time takes me after that.
That's been my journey. What's yours been like?
Now that I've written that post, I have a request for a post I'd like to see:
What have you learned from a non-programming-related discipline that's made you a better programmer?
Last modified on Jan 16, 2008 at 07:09 AM UTC - 5 hrs
Posted by Sam on Dec 10, 2007 at 11:46 AM UTC - 5 hrs
Here's some pseudocode that got added to a production system that might just be the very definition of a simple change:
-
Add a link from one page to cancel_order.cfm?orderID=12345
-
In that new page, add the following two queries:
- update orders set canceled = 1, canceledOn=getDate() where orderID=#url.orderID#
- delete from orderItems
Now, upload those changes to the production server, and run it real quick to be sure it does what you meant it to do.
Then you say to yourself, "Wait, why is the page taking several seconds to load?"
"Holy $%^@," you think aloud, "I just deleted every item from every order in the system!"
It's easy enough for you to recover the data from the backups. It isn't quite as easy to recover from the heart attack.
Steve McConnell (among others) says that the easiest changes are the most important ones to test, as you aren't thinking quite as hard about it when you make them.
Are there any unbelievers left out there?
Last modified on Dec 10, 2007 at 11:46 AM UTC - 5 hrs
Posted by Sam on Oct 04, 2007 at 04:00 PM UTC - 5 hrs
A couple of weeks ago the UH Code Dojo embarked on the
fantastic voyage that is writing a program to solve Sudoku puzzles, in Ruby. This week, we
continued that journey.
Though we still haven't completed the problem (we'll be meeting again tenatively on October 15, 2007 to
do that), we did construct what we think is a viable plan for getting there, and began to implement some
of it.
The idea was based around this algorithm (or something close to it):
More...
while (!puzzle.solved)
{
find the most constrained row, column, or submatrix
for each open square in the most constrained space,
find intersection of valid numbers to fill the square
starting with the most constrained,
begin filling in open squares with available numbers
}
With that in mind, we started again with TDD.
I'm not going to explain the rationale behind each peice of code, since the idea was presented above.
However, please feel free to ask any questions if you are confused, or even if you'd just like
to challenge our ideas.
Here are the tests we added:
def test_find_most_constrained_row_column_or_submatrix
assert_equal "4 c", @solver.get_most_constrained
end
def test_get_available_numbers
most_constrained = @solver.get_most_constrained
assert_equal [3,4,5], @solver.get_available_numbers(most_constrained).sort
end
def test_get_first_empty_cell_from_most_constrained
most_constrained = @solver.get_most_constrained
indices = @solver.get_first_empty_cell(most_constrained)
assert_equal [2,4], indices
end
def test_get_first_empty_cell_in_submatrix
indices = @solver.get_first_empty_cell("0 m")
assert_equal [0,2], indices
end
And here is the code to make the tests pass:
def get_most_constrained
min_open = 10
min_index = 0
@board.each_index do |i|
this_open = @board[i].length - @board[i].reject{|x| x==0}.length
min_open, min_index = this_open, i.to_s + " r" if this_open < min_open
end
#min_row = @board[min_index.split(" ")]
@board.transpose.each_index do |i|
this_open = @board.transpose[i].length - @board.transpose[i].reject{|x| x==0}.length
min_open, min_index = this_open, i.to_s + " c" if this_open < min_open
end
(0..8).each do |index|
flat_subm = @board.get_submatrix_by_index(index).flatten
this_open = flat_subm.length - flat_subm.reject{|x| x==0}.length
min_open, min_index = this_open, index.to_s + " m" if this_open < min_open
end
return min_index
end
def get_available_numbers(most_constrained)
index, flag = most_constrained.split(" ")[0].to_i,most_constrained.split(" ")[1]
avail = [1,2,3,4,5,6,7,8,9] - @board[index] if flag == "r"
avail = [1,2,3,4,5,6,7,8,9] - @board.transpose[index] if flag == "c"
avail = [1,2,3,4,5,6,7,8,9] - @board.get_submatrix_by_index(index).flatten if flag == "m"
return avail
end
def get_first_empty_cell(most_constrained)
index, flag = most_constrained.split(" ")[0].to_i,most_constrained.split(" ")[1]
result = index, @board[index].index(0) if flag == "r"
result = @board.transpose[index].index(0), index if flag == "c"
result = index%3*3,index*3+@board.get_submatrix_by_index(index).flatten.index(0) if flag == "m"
return result
end
Obviously we need to clean out that commented-out line, and
I feel kind of uncomfortable with the small amount of tests we have compared to code. That unease was
compounded when we noticed a call to get_submatrix instead of get_submatrix_by_index.
Everything passed because we were only testing the first most constrained column. Of course, it will
get tested eventually when we have test_solve, but it was good that the pair-room-programming
caught the defect.
I'm also not entirely convinced I like passing around the index of the most constrained whatever along
with a flag denoting what type it is. I really think we can come up with a better way to do that, so
I'm hoping that will change before we rely too much on it, and it becomes impossible to change without
cascading.
Finally, we also set up a repository for this and all of our future code. It's not yet open to the public
as of this writing (though I expect that
to change soon). In any case, if you'd like to get the full source code to this, you can find our Google Code
site
at http://code.google.com/p/uhcodedojo/.
If you'd like to aid me in my quest to be an idea vacuum (cleaner!), or just have a question, please feel
free to contribute with a comment.
Last modified on Oct 04, 2007 at 04:12 PM UTC - 5 hrs
Posted by Sam on Sep 19, 2007 at 01:37 PM UTC - 5 hrs
A couple of days ago the UH Code Dojo met once again (we took the summer off). I had come in wanting to
figure out five different ways to implement binary search.
The first two - iteratively and recursively - are easy to come up with. But what about three
other implementations? I felt it would be a good exercise in creative thinking, and pehaps it
would teach us new ways to look at problems. I still want to do that at some point, but
the group decided it might be more fun to tackle to problem of solving any Sudoku board,
and that was fine with me.
Remembering the trouble
Ron Jeffries had in
trying to
TDD a solution
to Sudoku, I was a bit
weary of following that path, thinking instead we might try
Peter Norvig's approach. (Note: I haven't looked
at Norvig's solution yet, so don't spoil it for me!)
More...
Instead, we agreed that certainly there are some aspects that are testable, although
the actual search algorithm that finds solutions is likely to be a unit in itself, and therefore
isn't likely to be testable outside of presenting it a board and testing that its outputted solution
is a known solution to the test board.
On that note, our first test was:
require 'test/unit'
require 'sudokusolver'
class TestSudoku < Test::Unit::TestCase
def setup
@board = [[5, 3, 0, 0, 7, 0, 0, 0, 0],
[6, 0, 0, 1, 9, 5, 0, 0, 0],
[0, 9, 8, 0, 0, 0, 0, 6, 0],
[8, 0, 0, 0, 6, 0, 0, 0, 3],
[4, 0, 0, 8, 0, 3, 0, 0, 1],
[7, 0, 0, 0, 2, 0, 0, 0, 6],
[0, 6, 0, 0, 0, 0, 2, 8, 0],
[0, 0, 0, 4 ,1 ,9 ,0, 0, 5],
[0, 0, 0, 0, 8, 0, 0, 7, 9]]
@solution =[[5, 3, 4, 6, 7, 8, 9, 1, 2],
[6, 7, 2, 1, 9, 5, 3, 4, 8],
[1, 9, 8, 3, 4, 2, 5, 6, 7],
[8, 5, 9, 7, 6, 1, 4, 2, 3],
[4, 2, 6, 8, 5, 3, 7, 9, 1],
[7, 1, 3, 9, 2, 4, 8, 5, 6],
[9, 6, 1, 5, 3, 7, 2, 8, 4],
[2, 8, 7, 4 ,1 ,9 ,6, 3, 5],
[3, 4, 5, 2, 8, 6, 1, 7, 9]]
@solver = SudokuSolver.new
end
def test_solve
our_solution = @solver.solve(@board)
assert_equal(@solution, our_solution)
end
end
We promptly commented out the test though, since we'd never get it to pass until we were done. That
doesn't sound very helpful at this point. Instead, we started writing tests for testing the validity
of rows, columns, and blocks (blocks are what we called the 3x3 submatrices in a Sudoku board).
Our idea was that a row, column, or block is in a valid state
if it contains no duplicates of the digits 1 through 9. Zeroes (open cells) are acceptable in
a valid board. Obviously, they are not acceptable in a solved board.
To get there, we realized we needed to make initialize take the initial game board
as an argument (so you'll need to change that in the TestSudokuSolver#setup method and
SudokuSolver#solve, if you created it), and then we added the following tests (iteratively!):
def test_valid_row
assert @solver.valid_row?(0)
end
def test_valid_allrows
(0..8).each do |row|
assert @solver.valid_row?(row)
end
end
The implementation wasn't difficult, of course. We just need to reject all zeroes from the row, then run
uniq! on the resulting array. Since uniq! returns nil if each
element in the array is unique, and nil evaluates to false, we have:
class SudokuSolver
def initialize(board)
@board = board
end
def valid_row?(row_num)
@board[row_num].reject{|x| x==0}.uniq! == nil
end
end
At this point, we moved on to the columns. The test is essentially the same as for rows:
def test_valid_column
assert @solver.valid_column?(0)
end
def test_valid_allcolumns
(0..8).each do |column|
assert @solver.valid_column?(column)
end
end
The test failed, so we had to make it pass. Basically, this method is also the same for columns as it was
for rows. We just need to call Array#transpose# on the board, and follow along. The
valid_column? one-liner was @board.transpose[col_num].reject{|x| x==0}.uniq! == nil.
We added that first, made sure the test passed, and then refactored SudokuSolver
to remove the duplication:
class SudokuSolver
def initialize(board)
@board = board
end
def valid_row?(row_num)
valid?(@board[row_num])
end
def valid_column?(col_num)
valid?(@board.transpose[col_num])
end
def valid?(array)
array.reject{|x| x==0}.uniq! == nil
end
end
All the tests were green, so we moved on to testing blocks. We first tried slicing in two dimensions, but
that didn't work: @board[0..2][0..2]. We were also surprised Ruby didn't have something
like an Array#extract_submatrix method, which assumes it it passed an array of arrays (hey,
it had a transpose method). Instead, we created our own. I came up with some nasty, convoluted code,
which I thought was rather neat until I rewrote it today.
Ordinarily, I'd love to show it as a fine example of how much prettier it could have become. However, due
to a temporary lapse into idiocy on my part: we were editing in 2 editors, and I accidentally saved the
earlier version after the working version instead of telling it not to save. Because of that,
I'm having to re-implement this.
Now that that sorry excuse is out of the way, here is our test for, and implementation of extract_submatrix:
def test_extract_submatrix
assert_equal [[1,2,3],[4,5,6],[7,8,9]].extract_submatrix(1..2,1..2) , [[5,6],[8,9]]
end
class Array
def extract_submatrix(row_range, col_range)
self[row_range].transpose[col_range].transpose
end
end
Unfortunately, that's where we ran out of time, so we didn't even get to the interesting problems Sudoku
could present. On the other hand, we did agree to meet again two weeks from that meeting (instead of a month),
so we'll continue to explore again on that day.
Thoughts? (Especially about idiomatic Ruby for Array#extract_submatrix)
Update: Changed the misspelled title from SudokoSolver to SudokuSolver.
Update 2: SudokuSolver Part 2 is now available.
Last modified on Oct 04, 2007 at 04:05 PM UTC - 5 hrs
Posted by Sam on Sep 10, 2007 at 12:48 PM UTC - 5 hrs
If you don't care about the background behind this, the reasons why you might want to use
rules based programming, or a bit of theory, you can skip straight the Drools tutorial.
Background
One of the concepts I love to think about (and do) is raising the level of abstraction in a system.
The more often you are telling the computer what, and not how, the better.
Of course, somewhere someone is doing imperative programming (telling it how), but I like to
try to hide much of that somewhere and focus more on declarative programming (telling it what). Many
times, that's the result of abstraction in general and DSLs
and rules-based programming more specifically.
More...
As a side note, let me say that this is not necessarily a zero-sum game. In one aspect you may be declaratively
programming while in another you are doing so imperatively. Example:
function constructARecord()
{
this.name=readFileLineNumber(fileName, 1);
this.phone=readFileLineNumber(fileName, 2);
this.address=readFileLineNumber(fileName, 3);
}
You are telling it how to construct a record, but you are not telling it how to read the file. Instead,
you are just telling it to read the file.
Anyway, enough of that diversion. I hope I've convinced you.
When I finished writing the (half-working) partial order planner in Ruby,
I mentioned I might like "to take actions as functions, which receive preconditions as their parameters,
and whose output are effects" (let me give a special thanks to Hugh Sasse for his help and ideas in trying to use TSort for it while I'm
on the subject).
Doing so may have worked well when generalizing the solution to a rules engine instead of just a planner (they are conceptually
quite similar). That's
often intrigued me from both a business application and game programming standpoint.
The good news is (as you probably already know), this has already been done for us. That was the
subject of Venkat's talk that I attended at No Fluff Just Stuff at the end of June 2007.
Why use rules-based programming?
After a quick introduction, Venkat jumped right into why you might want to use a rules engine.
The most prominent reasons all revolve around the benefits provided by separating concerns:
When business rules change almost daily, changes to source code can be costly. Separation of knowledge
from implementation reduces this cost by having no requirement to change the source code.
Additionally, instead of providing long chains of if...else statements, using a rule engine
allows you the benefits of declarative programming.
A bit of theory
The three most important aspects for specifying the system are facts, patterns, and the rules themselves.
It's hard to describe in a couple of sentences, but your intuition should serve you well - I don't think
you need to know all of the theory to understand a rule-based system.
Rules Engine based on Venkat's notes
Facts are just bits of information that can be used to make decisions. Patterns are similar, but can
contain variables that allow them to be expanded into other patterns or facts. Finally, rules have
predicates/premises that if met by the facts will fire the rule which allows the action to be
performed (or conclusion to be made).
(Another side note: See JSR 94 for a Java spec for Rules Engines
or this google query for some theory.
Norvig's and Russell's Artificial Intelligence: A Modern Approach
also has good explanations, and is a good introduction to AI in general (though being a textbook, it's pricey at > $90 US)).
(Yet another side note: the computational complexity of this pattern matching can be enormous, but the
Rete Algorithm will help, so don't
prematurely optimize your rules.)
Drools
Now that we know a bit of the theory behind rules-based systems, let's get into the practical to show
how easy it can be (and aid in removing fear of new technology to become a generalist).
First, you can get Drools 2.5 at Codehaus or
version 3, 4 or better from JBoss.
At the time of writing, the original Drools lets you use XML with Java or Groovy, Python, or Drools' own
DSL to implement rules, while the JBoss version (I believe) is (still) limited to Java or the DSL.
Since I avoid XML like the plague (well, not that bad),
I'll go the DSL route.
First, you'll need to download Drools and unzip it to a place you keep external Java libraries.
I'm working with Drools 4.0.1. After that, I created a new Java project in Eclipse and added a new user
library for Drools, then added that to my build path (I used all of the Drools JARs in the library).
(And don't forget JUnit if you don't have it come up automatically!)
Errors you might need to fix
For reference for anyone who might run across the problems I did, I'm going to include a few of the
errors I came into contact with and how I resolved them. I was starting with a pre-done example, but
I will show the process used to create it after trudging through the errors. Feel free to
skip this section if you're not having problems.
After trying the minimal additions to the build path I mentioned above, I was seeing an error
that javax.rules was not being recognized. I added
jsr94-1.1.jar to my Drools library (this is included under /lib in the Drools download) and it was
finally able to compile.
When running the unit tests, however, I still got this error:
org.drools.RuntimeDroolsException: Unable to load dialect 'org.drools.rule.builder.dialect.java.JavaDialectConfiguration:java'
At that point I just decided to add all the dependencies in /lib to my Drools library and the error
went away. Obviously you don't need Ant, but I wasn't quite in the mood to go hunting for the minimum
of what I needed. You might feel differently, however.
Now that the dialect was able to be loaded, I got another error:
org.drools.rule.InvalidRulePackage: Rule Compilation error : [Rule name=Some Rule Name, agendaGroup=MAIN, salience=-1, no-loop=false]
com/codeodor/Rule_Some_Rule_Name_0.java (8:349) : Cannot invoke intValue() on the primitive type int
As you might expect, this was happening simply because the rule was receiving an int and was
trying to call a method from Integer on it.
After all that, my pre-made example ran correctly, and being comfortable that I had Drools working,
I was ready to try my own.
An Example: Getting a Home Loan
From where I sit, the process of determining how and when to give a home loan is complex and can change
quite often. You need to consider an applicant's credit score, income, and down payment, among
other things. Therefore, I think it is a good candidate for use with Drools.
To keep the tutorial simple
(and short), our loan determinizer will only consider credit score and down payment in regards to the
cost of the house.
First we'll define the HomeBuyer class. I don't feel the need for tests, because as you'll
see, it does next to nothing.
package com.codeodor;
public class HomeBuyer {
int _creditScore;
int _downPayment;
String _name;
public HomeBuyer(String buyerName, int creditScore, int downPayment) {
_name = buyerName;
_creditScore = creditScore;
_downPayment = downPayment;
}
public String getName() {
return _name;
}
public int getCreditScore(){
return _creditScore;
}
public int getDownPayment(){
return _downPayment;
}
}
Next, we'll need a class that sets up and runs the rules. I'm not feeling
the need to test this directly either, because it is 99% boilerplate and all of the code
gets tested when we test the rules anyway. Here is our LoanDeterminizer:
package com.codeodor;
import org.drools.jsr94.rules.RuleServiceProviderImpl;
import javax.rules.RuleServiceProviderManager;
import javax.rules.RuleServiceProvider;
import javax.rules.StatelessRuleSession;
import javax.rules.RuleRuntime;
import javax.rules.admin.RuleAdministrator;
import javax.rules.admin.LocalRuleExecutionSetProvider;
import javax.rules.admin.RuleExecutionSet;
import java.io.InputStream;
import java.util.ArrayList;
public class LoanDeterminizer {
// the meat of our class
private boolean _okToGiveLoan;
private HomeBuyer _homeBuyer;
private int _costOfHome;
public boolean giveLoan(HomeBuyer h, int costOfHome) {
_okToGiveLoan = true;
_homeBuyer = h;
_costOfHome = costOfHome;
ArrayList<Object> objectList = new ArrayList<Object>();
objectList.add(h);
objectList.add(costOfHome);
objectList.add(this);
return _okToGiveLoan;
}
public HomeBuyer getHomeBuyer() { return _homeBuyer; }
public int getCostOfHome() { return _costOfHome; }
public boolean getOkToGiveLoan() { return _okToGiveLoan; }
public double getPercentDown() { return(double)(_homeBuyer.getDownPayment()/_costOfHome); }
// semi boiler plate (values or names change based on name)
private final String RULE_URI = "LoanRules.drl"; // this is the file name our Rules are contained in
public LoanDeterminizer() throws Exception // the constructor name obviously changes based on class name
{
prepare();
}
// complete boiler plate code from Venkat's presentation examples follows
// I imagine some of this changes based on how you want to use Drools
private final String RULE_SERVICE_PROVIDER = "http://drools.org/";
private StatelessRuleSession _statelessRuleSession;
private RuleAdministrator _ruleAdministrator;
private boolean _clean = false;
protected void finalize() throws Throwable
{
if (!_clean) { cleanUp(); }
}
private void prepare() throws Exception
{
RuleServiceProviderManager.registerRuleServiceProvider(
RULE_SERVICE_PROVIDER, RuleServiceProviderImpl.class );
RuleServiceProvider ruleServiceProvider =
RuleServiceProviderManager.getRuleServiceProvider(RULE_SERVICE_PROVIDER);
_ruleAdministrator = ruleServiceProvider.getRuleAdministrator( );
LocalRuleExecutionSetProvider ruleSetProvider =
_ruleAdministrator.getLocalRuleExecutionSetProvider(null);
InputStream rules = Exchange.class.getResourceAsStream(RULE_URI);
RuleExecutionSet ruleExecutionSet =
ruleSetProvider.createRuleExecutionSet(rules, null);
_ruleAdministrator.registerRuleExecutionSet(RULE_URI, ruleExecutionSet, null);
RuleRuntime ruleRuntime = ruleServiceProvider.getRuleRuntime();
_statelessRuleSession =
(StatelessRuleSession) ruleRuntime.createRuleSession(
RULE_URI, null, RuleRuntime.STATELESS_SESSION_TYPE );
}
public void cleanUp() throws Exception
{
_clean = true;
_statelessRuleSession.release( );
_ruleAdministrator.deregisterRuleExecutionSet(RULE_URI |