Posted by Sam on Oct 16, 2009 at 12:00 AM UTC - 5 hrs
JetBrains recently announced that IntelliJ IDEA will be open-sourced and have an Apache 2.0-licensed version with the release of 9.0.
Those who've been reading My Secret Life as a Spaghetti Coder for a long time will know I totally love IDEA.
I haven't written software in Java for quite some time, and I don't normally do "news" posts, but I know enough of you are into Java, Groovy, and Scala to make this worth your while if the $249 was pricey enough to force you into using lesser Java IDEs. Now you don't have to.
Get it and I don't think you'll be disappointed. But don't wait until 10.0. As Cedric Beust pointed out (and I'm inclined to agree),
" IDEA going opensource means that it will now slowly die."
I hope not, of course, but one has to wonder.
Hey! Why don't you make your life easier and subscribe to the full post
or short blurb RSS feed? I'm so confident you'll love my smelly pasta plate
wisdom that I'm offering a no-strings-attached, lifetime money back guarantee!
Posted by Sam on Feb 13, 2008 at 08:44 AM UTC - 5 hrs
One step back from greatness lies the very definition of the impossible leadership situation:
a president affiliated with a set of established commitments that have in the course of
events been called into question as failed or irrelevant responses to the problems of the day...
The instinctive political stance of the establishment affiliate -- to affirm and continue the
work of the past -- becomes at these moments a threat to the vitality, if not survival,
of the nations, and leadership collapses upon a dismal choice. To affirm established
commitments is to stigmatize oneself as a symptom of the nation's problems and the premier
symbol of systemic political failure; to repudiate them is to become isolated from one's most
natural political allies and to be rendered impotent.
A little while ago Obie asked " What's this crap about a Ruby backlash?" The whole situation has reminded me of Skowronek's work, so I dug a couple of passages up.
We're at a crossroads right now between two regimes - one represented by Java, and the other represented by Ruby (although it is quite a bit more nuanced than that). My belief right now is that Java The Language is in a position where it can't win. People are fed up with the same old crap, and a change is happening (see also: Why Do I Have To Tell The Compiler Twice?, or Adventures in Talking To a Compiler That Doesn't Listen.)
More...
What these [reconstructive] presidents did, and what their predecessors could not do, was to
reformulate the nation's political agenda altogether, ... and to move the nation past the old
problems, eyeing a different set of possibilities... (Skowronek, pg. 38)
When the new regime starts gaining momentum, in the old regime there will be wailing and gnashing of teeth. We can see some of this in the dogma repeated by Ruby's detractors alluded to (but not sourced) by Daniel Spiewak. We hear it in the fear in people's comments when they fail to criticize the ideas, relying instead on ad hominem attacks that have little to nothing to do with the issues at hand.
(Unlike Obie, I don't have any reason to call attention to anyone by name. If you honestly haven't seen this, let's try i don't like ruby, ruby sucks, and ruby is slow and see if we can weed through the sarcasm, apologists who parrot the line so as not to offend people, or just those exact words with no other substance. )
There will also fail to be unity.
Java is considering closures, and some want multiline string literals. There's talk that Java is done and that there should be a return to the good ol' days.
Neal Gafter quotes himself and Joshua Bloch in Is Java Dying? (where he concludes that it isn't):
Neal Gafter: "If you don't want to change the meaning of anything ever, you have no choice but to not do anything. The trick is to minimize the effect of the changes while enabling as much as possible. I think there's still a lot of room for adding functionality without breaking existing stuff..."
Josh Bloch: "My view of what really happens is a little bit morbid. I think that languages and platforms age by getting larger and clunkier until they fall over of their own weight and die very very slowly, like over ... well, they're all still alive (though not many are programming Cobol anymore). I think it's a great thing, I really love it. I think it's marvelous. It's the cycle of birth, and growth, and death. I remember James saying to me [...] eight years ago 'It's really great when you get to hit the reset button every once and a while.'"
To me, the debate is starting to look a lot like the regime change Skowronek's work predicts when going from a vulnerable establishment regime where an outsider reconstructs a new one.
I'm not saying Ruby itself will supplant Java. But it certainly could be a piece of the polyglot programming puzzle that will do it. It's more of an overall paradigm shift than a language one, so although I say one part is represented by Java and another by Ruby, I hope you won't take me literally.
Franklin Roosevelt was the candidate with "clean hands" at a moment when failed policies,
broken promises, and embarrassed clients were indicting a long-established political order.
Agitating for a rout in 1932, he inveighed against the entire "Republican leadership." He
denounced them as false prophets of prosperity, charged them with incompetence in dealing with
economic calamity, and convicted them of intransigence in the face of social desperation.
Declaring their regime morally bankrupt, he campaigned to cut the knot, to raise a new standard,
to restore to American government the ancient truths that had first inspired it.
(Skowronek, pg 288)
Hoover's inability to take the final step in innovation and
repudiate the system he was transforming served his critic's well... Hoover would later
lament the people's failure to appreciate the significance of his policies, and yet he was
the first to deny it. The crosscurrents of change in the politics of leadership left him with
an impressive string of policy successes, all of which added up to one colossal political
failure... Hoover sought to defend a system that he had already dispensed with...
(Skowronek, pg. 284-285)
Which one sounds like which paradigm?
Last modified on Feb 13, 2008 at 08:45 AM UTC - 5 hrs
Posted by Sam on Jan 14, 2008 at 06:42 AM UTC - 5 hrs
This is a story about my journey as a programmer, the major highs and lows I've had along the way, and
how this post came to be. It's not about how ecstasy made me a better programmer, so I apologize if that's why you came.
In any case, we'll start at the end, jump to
the beginning, and move along back to today. It's long, but I hope the read is as rewarding as the write.
A while back,
Reg Braithwaite
challenged programing bloggers with three posts he'd love to read (and one that he wouldn't). I loved
the idea so much that I've been thinking about all my experiences as a programmer off and on for the
last several months, trying to find the links between what I learned from certain languages that made
me a better programmer in others, and how they made me better overall. That's how this post came to be.
More...
The experiences discussed herein were valuable in their own right, but the challenge itself is rewarding
as well. How often do we pause to reflect on what we've learned, and more importantly, how it has changed
us? Because of that, I recommend you perform the exercise as well.
I freely admit that some of this isn't necessarily caused by my experiences with the language alone - but
instead shaped by the languages and my experiences surrounding the times.
One last bit of administrata: Some of these memories are over a decade old, and therefore may bleed together
and/or be unfactual. Please forgive the minor errors due to memory loss.
How QBASIC Made Me A Programmer
As I've said before, from the time I was very young, I had an interest in making games.
I was enamored with my Atari 2600, and then later the NES.
I also enjoyed a playground game with Donald Duck
and Spelunker.
Before I was 10, I had a notepad with designs for my as-yet-unreleased blockbuster of a side-scrolling game that would run on
my very own Super Sanola game console (I had the shell designed, not the electronics).
It was that intense interest in how to make a game that led me to inspect some of the source code Microsoft
provided with QBASIC. After learning PRINT , INPUT ,
IF..THEN , and GOTO (and of course SomeLabel: to go to)
I was ready to take a shot at my first text-based adventure game.
The game wasn't all that big - consisting of a few rooms, the NEWS
directions, swinging of a sword against a few monsters, and keeping track of treasure and stats for everything -
but it was a complete mess.
The experience with QBASIC taught me that, for any given program of sufficient complexity, you really only
need three to four language constructs:
- Input
- Output
- Conditional statements
- Control structures
Even the control structures may not be necessary there. Why? Suppose you know a set of operations will
be performed an unknown but arbitrary amount of times. Suppose also that it will
be performed less than X number of times, where X is a known quantity smaller than infinity. Then you
can simply write out X number of conditionals to cover all the cases. Not efficient, but not a requirement
either.
Unfortunately, that experience and its lesson stuck with me for a while. (Hence, the title of this weblog.)
Side Note: The number of language constructs I mentioned that are necessary is not from a scientific
source - just from the top of my head at the time I wrote it. If I'm wrong on the amount (be it too high or too low), I always appreciate corrections in the comments.
What ANSI Art taught me about programming
When I started making ANSI art, I was unaware
of TheDraw. Instead, I opened up those .ans files I
enjoyed looking at so much in MS-DOS Editor to
see how it was done. A bunch of escape codes and blocks
came together to produce a thing of visual beauty.
Since all I knew about were the escape codes and the blocks (alt-177, 178, 219-223 mostly), naturally
I used the MS-DOS Editor to create my own art. The limitations of the medium were
strangling, but that was what made it fun.
And I'm sure you can imagine the pain - worse than programming in an assembly language (at least for relatively
small programs).
Nevertheless, the experience taught me some valuable lessons:
- Even though we value people over tools, don't underestimate
the value of a good tool. In fact, when attempting anything new to you, see if there's a tool that can
help you. Back then, I was on local BBSs, and not
the 1337 ones when I first started out. Now, the Internet is ubiquitous. We don't have an excuse anymore.
-
I can now navigate through really bad code (and code that is limited by the language)
a bit easier than I might otherwise have been able to do. I might have to do some experimenting to see what the symbols mean,
but I imagine everyone would.
And to be fair, I'm sure years of personally producing such crapcode also has
something to do with my navigation abilities.
-
Perhaps most importantly, it taught me the value of working in small chunks and
taking baby steps.
When you can't see the result of what you're doing, you've got to constantly check the results
of the latest change, and most software systems are like that. Moreover, when you encounter
something unexpected, an effective approach is to isolate the problem by isolating the
code. In doing so, you can reproduce the problem and problem area, making the fix much
easier.
The Middle Years (included for completeness' sake)
The middle years included exposure to Turbo Pascal,
MASM, C, and C++, and some small experiences in other places as well. Although I learned many lessons,
there are far too many to list here, and most are so small as to not be significant on their own.
Therefore, they are uninteresting for the purposes of this post.
However, there were two lessons I learned from this time (but not during) that are significant:
-
Learn to compile your own $&*@%# programs
(or, learn to fish instead of asking for them).
-
Stop being an arrogant know-it-all prick and admit you know nothing.
As you can tell, I was quite the cowboy coding young buck. I've tried to change that in recent years.
How ColdFusion made me a better programmer when I use Java
Although I've written a ton of bad code in ColdFusion, I've also written a couple of good lines
here and there. I came into ColdFusion with the experiences I've related above this, and my early times
with it definitely illustrate that fact. I cared nothing for small files, knew nothing of abstraction,
and horrendous god-files were created as a result.
If you're a fan of Italian food, looking through my code would make your mouth water.
DRY principle?
Forget about it. I still thought code reuse meant copy and paste.
Still, ColdFusion taught me one important aspect that got me started on the path to
Object Oriented Enlightenment:
Database access shouldn't require several lines of boilerplate code to execute one line of SQL.
Because of my experience with ColdFusion, I wrote my first reusable class in Java that took the boilerplating away, let me instantiate a single object,
and use it for queries.
How Java taught me to write better programs in Ruby, C#, CF and others
It was around the time I started using Java quite a bit that I discovered Uncle Bob's Principles of OOD,
so much of the improvement here is only indirectly related to Java.
Sure, I had heard about object oriented programming, but either I shrugged it off ("who needs that?") or
didn't "get" it (or more likely, a combination of both).
Whatever it was, it took a couple of years of revisiting my own crapcode in ColdFusion and Java as a "professional"
to tip me over the edge. I had to find a better way: Grad school here I come!
The better way was to find a new career. I was going to enter as a Political Scientist
and drop programming altogether. I had seemingly lost all passion for the subject.
Fortunately for me now, the political science department wasn't accepting Spring entrance, so I decide to
at least get started in computer science. Even more luckily, that first semester
Venkat introduced me to the solution to many my problems,
and got me excited about programming again.
I was using Java fairly heavily during all this time, so learning the principles behind OO in depth and
in Java allowed me to extrapolate that for use in other languages.
I focused on principles, not recipes.
On top of it all, Java taught me about unit testing with
JUnit. Now, the first thing I look for when evaluating a language
is a unit testing framework.
What Ruby taught me that the others didn't
My experience with Ruby over the last year or so has been invaluable. In particular, there are four
lessons I've taken (or am in the process of taking):
-
The importance of code as data, or higher-order functions, or first-order functions, or blocks or
closures: After learning how to appropriately use
yield , I really miss it when I'm
using a language where it's lacking.
-
There is value in viewing programming as the construction of lanugages, and DSLs are useful
tools to have in your toolbox.
-
Metaprogramming is OK. Before Ruby, I used metaprogramming very sparingly. Part of that is because
I didn't understand it, and the other part is I didn't take the time to understand it because I
had heard how slow it can make your programs.
Needless to say, after seeing it in action in Ruby, I started using those features more extensively
everywhere else. After seeing Rails, I very rarely write queries in ColdFusion - instead, I've
got a component that takes care of it for me.
-
Because of my interests in Java and Ruby, I've recently started browsing JRuby's source code
and issue tracker.
I'm not yet able to put into words what I'm learning, but that time will come with
some more experience. In any case, I can't imagine that I'll learn nothing from the likes of
Charlie Nutter, Ola Bini,
Thomas Enebo, and others. Can you?
What's next?
Missing from my experience has been a functional language. Sure, I had a tiny bit of Lisp in college, but
not enough to say I got anything out of it. So this year, I'm going to do something useful and not useful
in Erlang. Perhaps next I'll go for Lisp. We'll see where time takes me after that.
That's been my journey. What's yours been like?
Now that I've written that post, I have a request for a post I'd like to see:
What have you learned from a non-programming-related discipline that's made you a better programmer?
Last modified on Jan 16, 2008 at 07:09 AM UTC - 5 hrs
Posted by Sam on Nov 22, 2007 at 12:04 PM UTC - 5 hrs
Since the gift buying season is officially upon us, I thought I'd pitch in to the rampant consumerism and list some of the toys I've had a chance to play with this year that would mean fun and learning for the programmer in your life. Plus, the thought of it sounded fun.
Here they are, in no particular order other than the one in which I thought of them this morning:
More...
- JetBrains' IntelliJ IDEA: An awesome IDE for Java. So great, I don't mind spending the $249 (US) and using it over the free Eclipse. The Ruby plugin is not too shabby either, the license for your copy is good for your OSX and Windows installations, and you can try it free for 30 days. Martin Fowler thinks IntelliJ changed the IDE landscape. If you work in .NET, they also have ReSharper, which I plan to purchase very soon. Now if only we could get a ColdFusion plugin for IntelliJ, I'd love it even more.
- Programming Ruby, Second Edition: What many in the Ruby community consider to be Ruby's Bible. You can lower the barrier of entry for your favorite programmer to using Ruby, certainly one of the funner languages a lot of people are loving to program in lately. Sometimes, I sit and think about things to program just so I can do it in Ruby.
If they've already got that, I always like books as gifts. Some of my
favorites from this year have been: Code Complete 2, Agile Software Development: Principles, Patterns, and Practices which has a great section on object oriented design principles, and of course,
My Job Went to India.
I have a slew of books I've yet to read this year that I got from last Christmas (and birthday), so I'll have to
list those next year.
-
Xbox 360 and a subscription to
XNA Creator's Club (through Xbox Live Marketplace - $99 anually) so they can deploy their games to their new Xbox. This is without a
doubt the thing I'd want most, since I got into this whole programming thing because I was interested
in making games. You can point them to the
getting started page, and they could
make games for the PC for free, using XNA (they'll need that page to get started anyway, even if you
get them the 360 and Creator's Club membership to deploy to the Xbox).
-
MacBook Pro and pay for the extra pixels. I love mine - so much so,
that I intend to marry it. (Ok, not that much, but I have
been enjoying it.)
The extra pixels make the screen almost as wide as two, and if you can get them an extra monitor I'd do
that too. I've moved over to using this as my sole computer for development, and don't bother with
the desktops at work or home anymore, except on rare occasions. You can run Windows on it, and the
virtual machines are getting really good so that you ought not have to even reboot to use either
operating system.
Even if you don't want to get them the MacBook, a second or third monitor should be met with enthusiasm.
-
A Vacation: Programmers are notoriously working long hours
and suffering burnout, so we often need to take a little break from the computer screen. I like
SkyAuction because all the vacations are package deals, there's often a good variety to choose from (many
different countries), most of the time you can find a very good price, and usually the dates are flexible
within a certain time frame, so you don't have to commit right away to a certain date.
Happy Thanksgiving to those celebrating it, and thanks to all you who read and comment and set me straight when I'm wrong - not just here but in the community at large. I do appreciate it.
Do you have any ideas you'd like to share (or ones you'd like to strike from this list)?
Last modified on Nov 22, 2007 at 12:04 PM UTC - 5 hrs
Posted by Sam on Oct 31, 2007 at 04:26 PM UTC - 5 hrs
When looping over collections, you might find yourself needing elements that match only a certain
parameter, rather than all of the elements in the collection. How often do you see something like this?
foreach(x in a)
if(x < 10)
doSomething;
Of course, it can get worse, turning
into arrow code.
More...
What we really need here is a way to filter the collection while looping over it. Move that extra
complexity and indentation out of our code, and have the collection handle it.
In Ruby we have each
as a way to loop over collections. In C# we have foreach , Java's got
for(ElemType elem : theCollection) , and Coldfusion has <cfloop collection="#theCollection#">
and the equivalent over arrays. But wouldn't it be nice to have an each_where(condition) { block; } or
foreach(ElemType elem in Collection.where(condition)) ?
I thought for sure someone would have implemented it in Ruby, so I was surprised at first to see this in my
search results:
However, after a little thought, I realized it's not all that surprising: it is already incredibily easy to filter
a collection in Ruby using the select method.
But what about the other languages? I must confess - I didn't think long and hard about it for Java or C#.
We could implement our own collections such that they have a where method that returns the subset we are
looking for, but to be truly useful we'd need the languages' collections to implement where
as well.
Of these four languages, ColdFusion provides both the need and opportunity, so I gave it a shot.
First, I set up a collection we can use to exercise the code:
<cfset beer = arrayNew(1 )>
<cfset beer[1] = structNew()>
<cfset beer[1].name = "Guiness" >
<cfset beer[1].flavor = "full" >
<cfset beer[2] = structNew()>
<cfset beer[2].name = "Bud Light" >
<cfset beer[2].flavor = "water" >
<cfset beer[3] = structNew()>
<cfset beer[3].name = "Bass" >
<cfset beer[3].flavor = "medium" >
<cfset beer[4] = structNew()>
<cfset beer[4].name = "Newcastle" >
<cfset beer[4].flavor = "full" >
<cfset beer[5] = structNew()>
<cfset beer[5].name = "Natural Light" >
<cfset beer[5].flavor = "water" >
<cfset beer[6] = structNew()>
<cfset beer[6].name = "Boddington's" >
<cfset beer[6].flavor = "medium" >
Then, I exercised it:
<cfset daytypes = ["hot" , "cold" , "mild" ]>
<cfset daytype = daytypes[randrange(1 ,3 )]>
<cfif daytype is "hot" >
<cfset weWantFlavor = "water" >
<cfelseif daytype is "cold" >
<cfset weWantFlavor = "full" >
<cfelse>
<cfset weWantFlavor = "medium" >
</cfif>
<cfoutput>
Flavor we want: #weWantFlavor#<br/> <br/>
Beers with that flavor: <br/>
<cf_loop collection="#beer#" item="aBeer" where="flavor=#weWantFlavor#" >
#aBeer.name#<br/>
</cf_loop>
</cfoutput>
Obviously, that breaks because don't have a cf_loop tag. So, let's create one:
<!--- loop.cfm --->
<cfparam name="attributes.collection" >
<cfparam name="attributes.where" default = "" >
<cfparam name="attributes.item" >
<cfif thistag.ExecutionMode is "start" >
<cfparam name="isDone" default="false" >
<cfparam name="index" default="1" >
<cffunction name="_getNextMatch" >
<cfargument name="arr" >
<cfloop from="#index#" to="#arrayLen(arr)#" index="i" >
<cfset keyValue = attributes.where.split("=" )>
<cfset index=i>
<cfif arr[i][keyValue[1]] is keyValue[2]>
<cfreturn arr[i]>
</cfif>
</cfloop>
<cfset index = arrayLen(arr) + 1>
<cfexit method="exittag" >
</cffunction>
<cfset "caller.#attributes.item#" = _getNextMatch(attributes.collection,index)>
</cfif>
<cfif thistag.ExecutionMode is "end" >
<cfset index=index+1>
<cfset "caller.#attributes.item#" = _getNextMatch(attributes.collection,index)>
<cfif index gt arrayLen(attributes.collection)>
<cfset isDone=true>
</cfif>
<cfif not isDone>
<cfexit method="loop" >
<cfelse>
<cfexit method="exittag" >
</cfif>
</cfif>
It works fine for me, but you might want to implement it differently. The particular area of improvement I
see right away would be to utilize the item name in the where attribute. That way,
you can use this on simple arrays and not just assume arrays of structs.
Thoughts anybody?
Last modified on Oct 31, 2007 at 04:29 PM UTC - 5 hrs
Posted by Sam on Oct 26, 2007 at 08:53 AM UTC - 5 hrs
I just wanted to highlight a couple of interesting ideas I saw on InfoQ this morning:
- Somewhat-open classes in Java: I hadn't known about JavaRebel until now, but it looks interesting. It sounds as if you can do open-class type things, but they don't seem to market it on that aspect - instead, they focus on the time it saves you by not having to "redeploy or restart."
-
That sounds quite a lot like another language where you're working in the abstract syntax tree itself. So I found it good timing that InfoQ also mentioned several implementations-in-the-works of Lisp on the .NET platform. Hey Grant, why don't you join one of those teams?
Do I have any votes for polyglot programming?
Posted by Sam on Sep 25, 2007 at 06:39 AM UTC - 5 hrs
The last bit of advice from Chad Fowler's 52 ways to save your job was to be a generalist, so this week's version is the obvious opposite: to be a specialist.
The intersection point between the two seemingly disparate pieces of advice is that you shouldn't use your lack of experience in multiple technologies to call yourself a specialist in another. Just because you
develop in Java to the exclusion of .NET (or anything else) doesn't make you a Java specialist. To call yourself that,
you need to be "the authority" on all things Java.
More...
Chad mentions a measure he used to assess a job candidate's depth of knowledge in Java: a question of how to make the JVM crash.
I'm definitely lacking in this regard. I've got a pretty good handle on Java, Ruby, and ColdFusion. I've done a small amount of work in .NET and have been adding to that recently. I can certainly write a program that will crash - but can I write one to crash the virtual
machine (or CLR)?
I can relunctantly write small programs in C/C++, but I'm unlikely to have the patience to trace through a large program for fun. I might even still be able to figure out some assembly language if you gave me enough time. Certainly in these lower level items it's not hard to find a way to crash. It's
probably harder to avoid it, in fact.
In ColdFusion, I've crashed the CF Server by simply writing recursive templates (those that cfinclude themselves). (However, I don't know if that still works.) In Java and .NET, I wouldn't know where to start. What about crashing a browser with JavaScript?
So Chad mentions that you should know the internals of JVM and CLR. I should know how JavaScript works in the browser and not just how to getElementById() . With that in mind, these things are going on the to-learn list - the goal being to find a way to crash each of them.
Ideas?
Last modified on Sep 25, 2007 at 06:41 AM UTC - 5 hrs
Posted by Sam on Sep 10, 2007 at 12:48 PM UTC - 5 hrs
If you don't care about the background behind this, the reasons why you might want to use
rules based programming, or a bit of theory, you can skip straight the Drools tutorial.
Background
One of the concepts I love to think about (and do) is raising the level of abstraction in a system.
The more often you are telling the computer what, and not how, the better.
Of course, somewhere someone is doing imperative programming (telling it how), but I like to
try to hide much of that somewhere and focus more on declarative programming (telling it what). Many
times, that's the result of abstraction in general and DSLs
and rules-based programming more specifically.
More...
As a side note, let me say that this is not necessarily a zero-sum game. In one aspect you may be declaratively
programming while in another you are doing so imperatively. Example:
function constructARecord()
{
this.name=readFileLineNumber(fileName, 1);
this.phone=readFileLineNumber(fileName, 2);
this.address=readFileLineNumber(fileName, 3);
}
You are telling it how to construct a record, but you are not telling it how to read the file. Instead,
you are just telling it to read the file.
Anyway, enough of that diversion. I hope I've convinced you.
When I finished writing the (half-working) partial order planner in Ruby,
I mentioned I might like "to take actions as functions, which receive preconditions as their parameters,
and whose output are effects" (let me give a special thanks to Hugh Sasse for his help and ideas in trying to use TSort for it while I'm
on the subject).
Doing so may have worked well when generalizing the solution to a rules engine instead of just a planner (they are conceptually
quite similar). That's
often intrigued me from both a business application and game programming standpoint.
The good news is (as you probably already know), this has already been done for us. That was the
subject of Venkat's talk that I attended at No Fluff Just Stuff at the end of June 2007.
Why use rules-based programming?
After a quick introduction, Venkat jumped right into why you might want to use a rules engine.
The most prominent reasons all revolve around the benefits provided by separating concerns:
When business rules change almost daily, changes to source code can be costly. Separation of knowledge
from implementation reduces this cost by having no requirement to change the source code.
Additionally, instead of providing long chains of if...else statements, using a rule engine
allows you the benefits of declarative programming.
A bit of theory
The three most important aspects for specifying the system are facts, patterns, and the rules themselves.
It's hard to describe in a couple of sentences, but your intuition should serve you well - I don't think
you need to know all of the theory to understand a rule-based system.
Rules Engine based on Venkat's notes
Facts are just bits of information that can be used to make decisions. Patterns are similar, but can
contain variables that allow them to be expanded into other patterns or facts. Finally, rules have
predicates/premises that if met by the facts will fire the rule which allows the action to be
performed (or conclusion to be made).
(Another side note: See JSR 94 for a Java spec for Rules Engines
or this google query for some theory.
Norvig's and Russell's Artificial Intelligence: A Modern Approach
also has good explanations, and is a good introduction to AI in general (though being a textbook, it's pricey at > $90 US)).
(Yet another side note: the computational complexity of this pattern matching can be enormous, but the
Rete Algorithm will help, so don't
prematurely optimize your rules.)
Drools
Now that we know a bit of the theory behind rules-based systems, let's get into the practical to show
how easy it can be (and aid in removing fear of new technology to become a generalist).
First, you can get Drools 2.5 at Codehaus or
version 3, 4 or better from JBoss.
At the time of writing, the original Drools lets you use XML with Java or Groovy, Python, or Drools' own
DSL to implement rules, while the JBoss version (I believe) is (still) limited to Java or the DSL.
Since I avoid XML like the plague (well, not that bad),
I'll go the DSL route.
First, you'll need to download Drools and unzip it to a place you keep external Java libraries.
I'm working with Drools 4.0.1. After that, I created a new Java project in Eclipse and added a new user
library for Drools, then added that to my build path (I used all of the Drools JARs in the library).
(And don't forget JUnit if you don't have it come up automatically!)
Errors you might need to fix
For reference for anyone who might run across the problems I did, I'm going to include a few of the
errors I came into contact with and how I resolved them. I was starting with a pre-done example, but
I will show the process used to create it after trudging through the errors. Feel free to
skip this section if you're not having problems.
After trying the minimal additions to the build path I mentioned above, I was seeing an error
that javax.rules was not being recognized. I added
jsr94-1.1.jar to my Drools library (this is included under /lib in the Drools download) and it was
finally able to compile.
When running the unit tests, however, I still got this error:
org.drools.RuntimeDroolsException: Unable to load dialect 'org.drools.rule.builder.dialect.java.JavaDialectConfiguration:java'
At that point I just decided to add all the dependencies in /lib to my Drools library and the error
went away. Obviously you don't need Ant, but I wasn't quite in the mood to go hunting for the minimum
of what I needed. You might feel differently, however.
Now that the dialect was able to be loaded, I got another error:
org.drools.rule.InvalidRulePackage: Rule Compilation error : [Rule name=Some Rule Name, agendaGroup=MAIN, salience=-1, no-loop=false]
com/codeodor/Rule_Some_Rule_Name_0.java (8:349) : Cannot invoke intValue() on the primitive type int
As you might expect, this was happening simply because the rule was receiving an int and was
trying to call a method from Integer on it.
After all that, my pre-made example ran correctly, and being comfortable that I had Drools working,
I was ready to try my own.
An Example: Getting a Home Loan
From where I sit, the process of determining how and when to give a home loan is complex and can change
quite often. You need to consider an applicant's credit score, income, and down payment, among
other things. Therefore, I think it is a good candidate for use with Drools.
To keep the tutorial simple
(and short), our loan determinizer will only consider credit score and down payment in regards to the
cost of the house.
First we'll define the HomeBuyer class. I don't feel the need for tests, because as you'll
see, it does next to nothing.
package com.codeodor;
public class HomeBuyer {
int _creditScore;
int _downPayment;
String _name;
public HomeBuyer(String buyerName, int creditScore, int downPayment) {
_name = buyerName;
_creditScore = creditScore;
_downPayment = downPayment;
}
public String getName() {
return _name;
}
public int getCreditScore(){
return _creditScore;
}
public int getDownPayment(){
return _downPayment;
}
}
Next, we'll need a class that sets up and runs the rules. I'm not feeling
the need to test this directly either, because it is 99% boilerplate and all of the code
gets tested when we test the rules anyway. Here is our LoanDeterminizer :
package com.codeodor;
import org.drools.jsr94.rules.RuleServiceProviderImpl;
import javax.rules.RuleServiceProviderManager;
import javax.rules.RuleServiceProvider;
import javax.rules.StatelessRuleSession;
import javax.rules.RuleRuntime;
import javax.rules.admin.RuleAdministrator;
import javax.rules.admin.LocalRuleExecutionSetProvider;
import javax.rules.admin.RuleExecutionSet;
import java.io.InputStream;
import java.util.ArrayList;
public class LoanDeterminizer {
// the meat of our class
private boolean _okToGiveLoan;
private HomeBuyer _homeBuyer;
private int _costOfHome;
public boolean giveLoan(HomeBuyer h, int costOfHome) {
_okToGiveLoan = true;
_homeBuyer = h;
_costOfHome = costOfHome;
ArrayList<Object> objectList = new ArrayList<Object>();
objectList.add(h);
objectList.add(costOfHome);
objectList.add(this);
return _okToGiveLoan;
}
public HomeBuyer getHomeBuyer() { return _homeBuyer; }
public int getCostOfHome() { return _costOfHome; }
public boolean getOkToGiveLoan() { return _okToGiveLoan; }
public double getPercentDown() { return(double)(_homeBuyer.getDownPayment()/_costOfHome); }
// semi boiler plate (values or names change based on name)
private final String RULE_URI = "LoanRules.drl"; // this is the file name our Rules are contained in
public LoanDeterminizer() throws Exception // the constructor name obviously changes based on class name
{
prepare();
}
// complete boiler plate code from Venkat's presentation examples follows
// I imagine some of this changes based on how you want to use Drools
private final String RULE_SERVICE_PROVIDER = "http://drools.org/";
private StatelessRuleSession _statelessRuleSession;
private RuleAdministrator _ruleAdministrator;
private boolean _clean = false;
protected void finalize() throws Throwable
{
if (!_clean) { cleanUp(); }
}
private void prepare() throws Exception
{
RuleServiceProviderManager.registerRuleServiceProvider(
RULE_SERVICE_PROVIDER, RuleServiceProviderImpl.class );
RuleServiceProvider ruleServiceProvider =
RuleServiceProviderManager.getRuleServiceProvider(RULE_SERVICE_PROVIDER);
_ruleAdministrator = ruleServiceProvider.getRuleAdministrator( );
LocalRuleExecutionSetProvider ruleSetProvider =
_ruleAdministrator.getLocalRuleExecutionSetProvider(null);
InputStream rules = Exchange.class.getResourceAsStream(RULE_URI);
RuleExecutionSet ruleExecutionSet =
ruleSetProvider.createRuleExecutionSet(rules, null);
_ruleAdministrator.registerRuleExecutionSet(RULE_URI, ruleExecutionSet, null);
RuleRuntime ruleRuntime = ruleServiceProvider.getRuleRuntime();
_statelessRuleSession =
(StatelessRuleSession) ruleRuntime.createRuleSession(
RULE_URI, null, RuleRuntime.STATELESS_SESSION_TYPE );
}
public void cleanUp() throws Exception
{
_clean = true;
_statelessRuleSession.release( );
_ruleAdministrator.deregisterRuleExecutionSet(RULE_URI, null);
}
}
I told you there was a lot of boilerplate code there. I won't explain it all because:
- It doesn't interest me
- I don't yet know as much as I should about it
I fully grant that reason number one may be partially or completely dependent on the second one.
In any case, now we can finally write our rules. I'll be starting with a test:
package com.codeodor;
import junit.framework.TestCase;
public class TestRules extends TestCase {
public void test_poor_credit_rating_gets_no_loan() throws Exception
{
LoanDeterminizer ld = new LoanDeterminizer();
HomeBuyer h = new HomeBuyer("Ima Inalotadebt and Idun Payet", 100, 20000);
boolean result = ld.giveLoan(h, 150000);
assertFalse(result);
ld.cleanUp();
}
}
And our test fails, which is what we wanted. We didn't yet create our rule file, LoanRules.drl, so
let's do that now.
package com.codeodor
rule "Poor credit score never gets a loan"
salience 2
when
buyer : HomeBuyer(creditScore < 400)
loan_determinizer : LoanDeterminizer(homeBuyer == buyer)
then
System.out.println(buyer.getName() + " has too low a credit rating to get the loan.");
loan_determinizer.setOkToGiveLoan(false);
end
The string following "rule" is the rule's name. Salience is one of the ways Drools performs conflict resolution.
Finally, the first two lines tell it that buyer is a variable of type HomeBuyer with a credit score of less than 400
and loan_determinizer is the LoanDeterminizer passed in with the object list
where the homeBuyer is what we've called buyer in our rule. If either of
those conditions fails to match, this rule is skipped.
Hopefully that
makes some sense to you. If not, let me know in the comments and I'll try again.
And now back to
our regularly scheduled test:
running it again still results in a red bar. This time, the problem is:
org.drools.rule.InvalidRulePackage: Rule Compilation error : [Rule name=Poor credit score never gets a loan, agendaGroup=MAIN, salience=2, no-loop=false]
com/codeodor/Rule_Poor_credit_score_never_gets_a_loan_0.java (7:538) : The method setOkToGiveLoan(boolean) is undefined for the type LoanDeterminizer
The key part here is "the method setOkToGiveLoan(boolean) is undefined for the type LoanDeterminizer."
Oops, we forgot that one. So let's add it to LoanDeterminizer :
public void setOkToGiveLoan(boolean value) { _okToGiveLoan = value; }
Now running the test results in yet another red bar! It turns out I forgot something pretty big (and basic) in
LoanDeterminizer.giveLoan() : I didn't tell it to execute the rules. Therefore, the default
case of "true" was the result since the rules were not executed.
Asking it to execute the rules is as
easy as this one-liner, which passes it some working data:
_statelessRuleSession.executeRules(objectList);
For reference, the entire working giveLoan method is below:
public boolean giveLoan(HomeBuyer h, int costOfHome) throws Exception {
_okToGiveLoan = true;
_homeBuyer = h;
_costOfHome = costOfHome;
ArrayList<Object> objectList = new ArrayList<Object>();
objectList.add(h);
objectList.add(costOfHome);
objectList.add(this);
_statelessRuleSession.executeRules(objectList);
// here you might have some code to process
// the loan if _okToGiveLoan is true
return _okToGiveLoan;
}
Now our test bar is green and we can add more tests and rules. Thankfully, we're done with programming
in Java (except for our unit tests, which I don't mind all that much).
To wrap-up the tutorial I want to focus on two more cases: Good credit score always gets the loan and
medium credit score with small down payment does not get the loan. I wrote the tests and rules
iteratively, but I'm going to combine them here for organization's sake seeing as I already demonstrated
the iterative approach.
public void test_good_credit_rating_gets_the_loan() throws Exception {
LoanDeterminizer ld = new LoanDeterminizer();
HomeBuyer h = new HomeBuyer("Warren Buffet", 800, 0);
boolean result = ld.giveLoan(h, 100000000);
assertTrue(result);
// maybe some more asserts if you performed processing in LoanDeterminizer.giveLoan()
ld.cleanUp();
}
public void test_mid_credit_rating_gets_no_loan_with_small_downpayment() throws Exception {
LoanDeterminizer ld = new LoanDeterminizer();
HomeBuyer h = new HomeBuyer("Joe Middleclass", 500, 5000);
boolean result = ld.giveLoan(h, 150000);
assertFalse(result);
ld.cleanUp();
}
And the associated rules added to LoanRules.drl:
rule "High credit score always gets a loan"
salience 1
when
buyer : HomeBuyer(creditScore >= 700)
loan_determinizer : LoanDeterminizer(homeBuyer == buyer)
then
System.out.println(buyer.getName() + " has a credit rating to get the loan no matter the down payment.");
loan_determinizer.setOkToGiveLoan(true);
end
rule "Middle credit score fails to get a loan with small down payment"
salience 0
when
buyer : HomeBuyer(creditScore >= 400 && creditScore < 700)
loan_determinizer : LoanDeterminizer(homeBuyer == buyer && percentDown < 0.20)
then
System.out.println(buyer.getName() + " has a credit rating to get the loan but not enough down payment.");
loan_determinizer.setOkToGiveLoan(false);
end
As you can see, there is a little bit of magic going on behind the scenes (as you'll also find in Groovy) where
here in the DSL, you can call loan_determinizer.percentDown and it will call getPercentDown
for you.
All three of our tests are running green and the console outputs what we expected:
Ima Inalotadebt and Idun Payet has too low a credit rating to get the loan.
Warren Buffet has a credit rating to get the loan no matter the down payment.
Joe Middleclass has a credit rating to get the loan but not enough down payment.
As always, questions, comments, and criticism are welcome. Leave your thoughts below. (I know it was
the length of a book, so I don't expect many.)
For more on using Drools, see the Drools 4 docs.
Finally, as with my write-ups on
Scott Davis's Groovy presentation,
his keynote,
Stuart Halloway's JavaScript for Ajax Programmers,
and Neal Ford's 10 ways to improve your code,
I have to give Venkat most of the credit for the good stuff here.
I'm just reporting from my notes, his slides,
and my memory, so any mistakes you find are probably mine. If they aren't, they likely should have been.
Last modified on Sep 10, 2007 at 05:46 PM UTC - 5 hrs
Posted by Sam on Jul 07, 2007 at 06:08 PM UTC - 5 hrs
The first talk I attended at NFJS was Scott Davis's presentation about Groovy, and how it lives up to its name (you know, its grrrooovy baby!) The presentation went through the death of Java, how Groovy fills a need that Java misses, some sample code for utility programs, and finally a short introduction to Grails.
As a disclaimer, I give Scott Davis all the credit for the good stuff here. I'm just reporting from my notes, his slides, and my memory, so any mistakes you find are probably mine. If they aren't, they probably should have been.
More...
Scott first introduces us to the death of Java, or more precisely, that rumors "of [its] death have been greatly exaggerated." Instead, Java is the COBOL of the 21st century. Although it is over 10 years old now, it was designed with C and C++ in mind so you might consider it even older than that.
Scott notes that after ten years many technologies are dead (or rotten and decaying!). Does anyone still use DOS, Lotus 1-2-3, or WordPerfect? (Do you still want to?) I might be partial to DOS because that's where I cut my teeth (and the thought of using it still has me licking my chops), but other than that, I think the answer is a loud "hell, no!"
On the other hand, some technologies "age like a fine wine." Scott includes HTTP and TCP/ IP in this category.
Finally, the third class of applications adapt and evolve to stay alive. FTP moved to SFTP and now we're using BitTorrent. Telnet moved to SSH and POP to IMAP. It is in this vein that Java finds itself, struggling to adapt.
It's not that Java (the language) isn't capable - it's that it gets in your way. Scott points out that in essence, the main complaint about Java is "that it is one level of abstraction too low." The soup du jour is simplicity - we want things to be "better, faster, [and] lighter."
In fact, anyone who has written a line of code in Java knows he's got around 29 more to write before he's done anything useful. Scott then shows
a 35 line example of reading a file and printing it out line by line to illustrate his point.
In Java, we often "develop 'util' classes to overcome this abstraction layer mismatch," Scott says, noting that being forced into having com.mycompany.StringUtil due to Sun's declaration of String as final is a code smell. Surely there is a better way!
Enter Groovy, the "first scripting language to be officially sanctioned by Sun to run on the JVM." It can run in compiled or interpreted mode and it is easy to install (though it took me some time because of less than helpful error messages about me pointing to the bin directory under Java, rather than the base Java directory). Most importantly, it fills the void of that "abstraction layer mismatch." Reading and writing lines in a file with Groovy is reduced to:
import java.io.File
new File("simpleFile.txt").each { line -> println(line) }
Look familiar? It's definitely Ruby-esque, but based on my cursory inspection, also keeps some Java idioms where appropriate (for example, import over require ). I can definitely see how it would be easy for a Ruby developer to pick up, but on the other hand,
I think it would be easier for a Java developer to transition to this than to Ruby. I'm looking forward to playing with it some more.
Finally, Scott goes through some "Groovy hacks" related to File I/O, XML and HTML, SQL, Groovlets (surprisingly, Groovy's servlets), and Grails. In particular, you should have a look at the Groovy Markup way of writing markup, because its quite a bit more elegant than ><ML (that my 13375p34|< for an X).
I won't go into detail about the code samples, as you can find those elsewhere (.PDF) if you wanted (perhaps in GINA, the canonical Groovy book), but I do plan to cover an introduction to Grails in a future post related to another presentation from NFJS. When I start playing more with Groovy I'll put up some real code samples then.
Thoughts about Groovy?
Posted by Sam on Jul 01, 2007 at 09:25 PM UTC - 5 hrs
Let me start out by saying the NFJS conference was incredible!
I went in with the intention of blogging as the sessions and days went on, but I was incredibly busy, and felt like my notes didn't do justice to the presentations. So I'm going to review the slides and flesh out my comments, and hopefully do a good job at letting you know what went down.
More...
For this post, I just wanted to give an overview of the symposium in general. As soon as I walked in and found my name badge, I knew these guys were a seriously professional conference, and they would pay attention to the small details to make our experience great.
Why? The badge was inside a nice, hard plastic (think something you might store a baseball card in, but bigger) and worn around a neck clip - the kind you used to see people wearing to hold their keys around their necks. But that wasn't the impressive part - the most impressive was that they put the schedule inside too, upside down on the back side, so all you had to do was flip over the badge and you could read the schedule and room assignments easily.
We also got a nice laptop backpack, a leather notebook where we could store pre-printed slides and code samples from the sessions we attended, as well as a CD with all the presentation material from every session.
Now, that's all useful for going back and having something to reference, but it's also great to have so you can flip through and see a good overview of sessions before you attend, so you know if it will be valuable for you.
The rooms were outstanding. We stayed in the Marriott Austin Airport South. I got the feeling the hotel staff greatly appreciated us being there - I got even better service from them than I normally expect from Marriott. Now, I'm not sure of the weekend rates there, but Thursday they wanted 200 dollars (US), and Sunday they wanted $189 (if I recall correctly). But NFJS was able to negotiate $99 rooms for us for Friday and Saturday. It was certainly nice to get such great service at such a bargain. Of course, it would have been even nicer if the hotel provided free wireless Internet access. NFJS provided it in the conference rooms, but that didn't reach all the way up to our rooms. I'm getting free wireless next door in another Marriott-owned hotel for over 100 dollars less than I would be paying at Marriott proper tonight!
The food was amazing as well. Just good all-around, and even included options for vegetarians that didn't include side-dishes only. I was fairly impressed by that. Also, I think the hotel bar is the cheapest bar in Austin. =)
Everything went extremely smooth from what I could tell. Jay Zimmerman and any other staff (including the hotel) did an excellent job of ensuring that. Oh yeah (read: "most importantly"), the speakers were awesome. They really knew how to entertain and be informative. But, for being a Java-based conference, it was interesting (exciting?) to see how many bashes Java got - and not just from the speakers. I'll probably blog some funny quotes later.
The first night most of the speakers made themselves available (aside from in-between sessions) at the hotel bar. I had dinner with a friend of mine (who I didn't know was going to be there) and Scott Davis, who is an amazingly charismatic guy. The second night a lot of people went to Bruce Tate's house. Someone later told me the group I was hanging out with in the dynamic languages BOF had been invited, but by that time we didn't feel like going to crash the party (not to mention a second-hand invite of "I heard" isn't that strong a thing to go on in the first place).
Also, one of the presenters had Sean Corfield's blog on his bookmarks toolbar, so that was exciting to see (I'll have to go through my notes to remember who).
In any case, I've got tons of sessions to blog about, some of which may contain useful information for you. I'm staying another night in Austin so I won't start tonight (other than this one), but keep a look out for more detailed posts about the actual content of the conference in the coming days.
PS: I'm also looking forward to catching up with all the other blogs and seeing how CFUnited went. But if I see one more post about the iPhone I'm going to have to duct-tape my head like a mummy gets wrapped to avoid exploding.
Last modified on Jul 01, 2007 at 09:30 PM UTC - 5 hrs
Posted by Sam on Jun 25, 2007 at 05:58 PM UTC - 5 hrs
Last night John Lam posted about Steve Yegge's port of Rails to JavaScript (Running on Rhino, "JavaScript for Java" on the JVM," not in the browser).
Sounds interesting.
Update: Steve responds and explains how it's not much to get hyped up about.
Last modified on Jun 27, 2007 at 06:21 AM UTC - 5 hrs
Posted by Sam on Jun 18, 2007 at 07:03 PM UTC - 5 hrs
Last Saturday, I had the fortune of attending the JUnit Workshop put on by Agile Houston. It was great getting to meet and work with some of the developers in the Houston Area. We started a couple of hours late because of a mixup at the hotel, but it was a good chance to chat with the other developers.
I signed up for a story to implement forums for JUnit.org, which would be used to post hard to test code and receive tests for them. The twist was that we wanted to compile the code and run the unit tests that others posted in response against the code, providing pass/fail and code coverage statistics (that sounds a lot harder than it really is). The other set of stories I signed up for was related to articles, news, (something else related to those that I can't recall), and RSS feeds for each of them.
More...
There were other stories too, but I don't know how they all fared, so I'll just report on the ones I was involved in. I was on a team that started with four developers (and ended with 4, with one person leaving and another coming in), and we decided to start with the articles, news, RSS, etc, as that seemed simple.
We started talking about perhaps implementing something in Python (or perhaps Jython, either with Django), Ruby/ JRuby on Rails, or Groovy and Grails. Even though it was for JUnit, many of us weren't too keen on using the weekend to code in Java like we would during the week (plus, how would we ever get anything done?).
Then someone mentioned something about using Zope (which I had never heard of) and Python. At that point, I installed Python and Zope - and so began a long day of installing programs on my laptop.
The problem was compounded by the fact that a few months back my memory went bad in the laptop and I had to reinstall everything (I originally thought the problem was with Windows, and attempts to repair the installation didn't work). Well, I waited until last week to install anything, so I only showed up prepared for Java and Ruby development.
In any case, it was decided somewhere that a PHP solution would be preferable, since there is a lot of good open source content management systems to choose from, as well as just about any other application we might need (such as the forums). Somewhere around this time we realized the articles could be implemented by the team doing the CMS, and we could move on to
the forums. Ideally, we'd want to use a pre-made solution and modify it to our needs, but in the beginning we just wanted a proof of concept. The four of us all started getting development environment set up - just testing that our Java compiled and unit tests could work. I got to use annotations and JUnit 4 for the first time right here, so that was mildly exciting. =)
But, because there were only two of us who had development environments ready to go, and I was one of them, I got stuck working on the front end. That's not too terribly exciting, but it was fine since I was getting to work with others, and someone had to do it (it isn't like I disliked it, but I would have preferred to work on the back end).
Before we dipped all the way into installing a PHP forum and modifying it, we thought it would be a good idea to implement a simple forum in a JSP to post code to the back end, which would compile it and run tests against it.
That turned out to be a bad move. I had to install Apache, Apache Tomcat, Eclipse Web Platform Tools, and a seemingly endless list of dependencies. To make everything better, the WPT package (or one of the dependencies) was missing a file, and I could only find questions about it searching the internet - no answers. Luckily, I made my problems known and finally one of the other developers found the file on his machine and we transferred it over. Now, four hours into the day, I had a form with a textarea in it. See why web development with Java rocks? =) (Just kidding of course - it's not that bad.)
After that I moved on to try to modify the phpBB for our needs. I installed PHP, MySQL, and phpBB with phpBB telling me the version of MySQL was incompatible with my installation of PHP. But it wasn't! Then I installed PostgreSQL and tried it - to no avail. Finally, it was time for me to leave, and having accomplished nothing, I felt pretty useless.
I haven't heard an update on what the rest of the team accomplished yet, but when I turned on my computer today and tried to install phpBB - it just worked - no questions asked. So what happened? It didn't tell me I'd need to restart the computer, but that's the only thing that changed. In my frustration, I didn't think to do it "just in case." How funny!
Anyway, if you've stayed with me through what must be a boring story for most people, I may as well give you a bit of juicy news: Uncle Bob is supposed to be visiting us at Agile Houston in a short while. I don't know if anything is official yet, but it sure will/would be cool!
Posted by Sam on Jun 18, 2007 at 02:50 PM UTC - 5 hrs
This one refers to the 40+ minute presentation by Obie Fernandez on Agile DSL Development in Ruby. (Is InfoQ not one of the greatest resources ever?)
You should really view the video for better details, but I'll give a little rundown of the talk here.
Obie starts out talking about how you should design the language first with the domain expert, constantly refining it until it is good - and only then should you worry about implementing it (this is about the same procedure you'd follow if you were building an expert system as well). That's where most of the Agility comes into play.
More...
Later, he moves on to describe four different types of design for your DSL. This was something I hadn't really thought about before, therefore it was the most interesting for me. Here are the four types:
- Instantiation: The DSL consists simply of methods on an object. This is not much different from normal programming. He says it is "DSLish," perhaps the way Blaine Buxton looks at it.
- Class Macros: DSL as methods on some ancestor class, and subclasses can then use those methods to tweak the behavior of themselves and their subclasses. This follows a declarative style of programming, and is the type of DSL followed by Ruby on Rails (and cfrails) if I understand him correctly.
- Top-level methods: Your application defines the DSL as a set of top-level methods, and then invokes
load with the path to your DSL script. When those methods are called in the configuration file, they modify some central (typically global) data, which your application uses to determine how it should execute. This is like scripting, and (again) if I understood correctly, this is the style my (almost working correctly) Partial Order Planner uses.
- Sandboxing (aka Contexts): Similar to the Instantiation style, but with more magic. Your DSL is defined as methods of some object, but that object is really just a "sandbox." Interaction with the object's methods modifies some state in the sandbox, which is then queried by the application. This is useful for processing user-maintained scripts kept in a database, and you can vary the behavior by changing the execution context.
Note: These are "almost" quotes from the slides/talk (with a couple of comments by myself), but I didn't want to keep rewinding and such, so they aren't exact. Therefore, I'm giving him credit for the good stuff, but I stake a claim on any mistakes made.
The talk uses Ruby as the implementation language for the example internal DSLs, but I think the four types mentioned above can be generalized. At the least, they are possible in ColdFusion - but with Java, I'm not so sure. In particular, I think you'd need mixin ability for the Top-level methods style of DSL.
Next, Obie catalogs some of the features of Ruby you see used quite often in crafting DSLs: Symbols (because they have less noise than strings), Blocks (for delayed evaluation of code), Modules (for cleaner separation of code), Splats (for parameter arrays), the different types of eval (for dynamic evaluation), and define_method and alias_method .
Finally, he winds up the presentation with a bit about BNL. I liked this part because I found Jay Fields' blog and his series of articles about BNL.
As always, comments, questions, and thoughts are appreciated. Flames - not so much.
Posted by Sam on Jun 15, 2007 at 09:18 PM UTC - 5 hrs
Sean Corfield responded in some depth to " Is Rails easy?", and explained what I wish I could have when I said (awkwardly, rereading it now) "I think my cat probably couldn't [code a Rails app]."
Sean makes it quite clear (as did Venkat's original post) that it isn't that using a framework, technology, or tool in general is easy or hard (although, you can certainly do things to make it easier or harder to use). In many cases, what it does for you is easy to begin with - in the case of Rails, it is stuff you do all the time that amounts to time-wasting, repetitive, boring busy-work. Rather, the right way to look at them is that they are tools that make you more productive, and it takes a while to learn to use them.
If you go into them thinking they are easy, you're likely to be disappointed and drop a tool that can really save you time before you learn to use it. And that could be tragic, if you value your time.
Posted by Sam on Jun 14, 2007 at 08:13 PM UTC - 5 hrs
Just a friendly reminder that Agile Houston is hosting a JUnit website improvement workshop on Saturday, June 16, 2007. The workshop starts at 9 AM and continues all day.
It's at the Courtyard Marriott. We'll be TDDing improvements to the JUnit.org website and I should be there from about 9 AM to 3 PM. See you there!
Posted by Sam on Jun 11, 2007 at 09:52 AM UTC - 5 hrs
Want to get a report of a certain session? I'll be attending No Fluff Just Stuff in Austin, Texas at the end of June. So, look at all that's available, and let me know what you'd like to see.
I haven't yet decided my schedule, and it's going to be tough to do so. I'm sure to attend some Groovy and JRuby sessions, but I don't know which ones. In any case, I want to try to leave at least a couple of sessions open for requests, so feel free to let me know in the comments what you'd like to see. (No guarantees though!). Here's what I'm toying with so far (apologies in advance for not linking to the talks or the speakers' blogs, there are far too many of them =)):
More...
First set of sessions: No clue. Leaning toward session 1, Groovy: Greasing the Wheels of Java by Scott Davis.
Second set: 7 (Groovy and Java: The Integration Story by Scott Davis), 8 (Java 6 Features, what's in it for you? by Venkat Subramaniam), 9 (Power Regular Expressions in Java by Neal Ford), or 11 (Agile Release Estimating, Planning and Tracking by Pete Behrens). But, no idea really.
Third set: 13 (Real World Grails by Scott Davis), 15 (10 Ways to Improve Your Code by Neal Ford), or 17 (Agile Pattern: The Product Increment by Pete Behrens). I'm also considering doing the JavaScript Exposed: There's a Real Programming Language in There! talks in the 2nd and 3rd sets by Glenn
Vanderburg.
Fourth set: Almost certainly I'll be attending session 20, Drooling with Groovy and Rules by Venkat Subramaniam, which will focus on declarative rules-based programming in Groovy, although Debugging and Testing the Web Tier by Neal Ford (session 22) and Java Performance Myths by Brian Goetz (session 24) are also of interest to me.
Fifth set: Again, almost certainly I'll go to Session 27, Building DSLs in Static and Dynamic Languages by Neal Ford.
Sixth set: No clue - I'm interested in all of them.
Seventh set: Session 37, Advanced View Techniques With Grails by Jeff Brown and Session 39, Advanced Hibernate by Scott Leberknight, and Session 42, Mocking Web Services by Scott Davis all stick out at me.
Eighth set: Session 47, Pragmatic Extreme Programming by Neal Ford, Session 46, What's New in Java 6 by Jason Hunter, Session 45, RAD JSF with Seam, Facelets, and Ajax4jsf, Part One by David Geary, or Session 44, Enterprise Applications with Spring: Part 1 by Ramnivas Laddad all seem nice.
Ninth set: Session 50, Enterprise Applications with Spring: Part 2 by Ramnivas Laddad or Session 51, RAD JSF with Seam, Facelets, and Ajax4jsf, Part Two by David Geary are appealing, which probably means the eight set will be the part 1 of either of these talks.
Tenth set: Session 59, Productive Programmer: Acceleration, Focus, and Indirection by Neal Ford is looking to be my choice, though the sessions on Reflection, Spring/Hibernate Integration Patterns, Idioms, and Pitfalls, and NetKernel (which "turns the wisdom" of "XML is like lye. It is very useful, but humans shouldn't touch it," "on its head" all interest me.
Final set: Most probably I'll go to Session #65: Productive Programmer: Automation and Canonicality by Neal Ford.
As you can see, there's tons to choose from - and that's just my narrowed down version. So much so, I wonder how many people will leave the symposium more disappointed about what they missed than satisfied with what they saw =).
Anyway, let me know what you'd like to see. Even if its not something from the list I made, I'll consider it, especially if there seems to be enough interest in it.
Last modified on Jun 11, 2007 at 09:58 AM UTC - 5 hrs
Posted by Sam on Jun 11, 2007 at 07:50 AM UTC - 5 hrs
Don't forget to (learn how to) unit test it using XMLUnit.
Posted by Sam on Jun 11, 2007 at 07:36 AM UTC - 5 hrs
I take my second weekend in a row (mostly) off of a computer, and look at all the cool things happening!
Adobe releases AIR (previously known as Apollo) and Flex 3 public beta, both products have been on my list of things to do for quite some time, still with no action taken.
Ruby ( MRI) released bug fixes in version 1.8.6.
JRuby officially went 1.0 (though it has yet to be posted to the website as I write this). And Ruby.NET released version 0.8 (IronRuby uses its scanner and parser, according to the article -- and this happened a couple of days before the weekend).
More...
For web developers, InfoQ let us know about Browser Shots, which is a service that will test your web sites in tons of different browsers and present screen shots to you. It lets you choose whether to enable Flash, Javascript, or certain media players, and allows you to set the color depth and screen resolution. It's quite slow at the moment (set it and come back in four hours or so), but it shows us what can be possible! ( Screenshots of codeodor.com, as they are coming in more will pop up)
Also, InfoQ completed its first official year. It was "pre-released" (I forget the exact term they used) I think in May 2006, but it went officially live a year ago Friday. Congrats to Floyd Marinescu and the entire InfoQrue! (And they have a lot of relevant news (to me) I'll be posting links to today)
Last modified on Jun 11, 2007 at 08:29 AM UTC - 5 hrs
Posted by Sam on Jun 03, 2007 at 12:12 PM UTC - 5 hrs
Last night Thomas Enebo announced on Ruby Talk that JRuby 1.0 RC 3 has been released, and that it "will likely be our final release candidate before our 1.0 release."
I'm interested to deploy a web app trying Ruby on Rails with JRuby (or JRuby on Rails, perhaps), and also in experimenting with Sean Corfield's Scripting for CF 8 to run Ruby within ColdFusion.
Anyone else planning to do good things with JRuby?
Posted by Sam on May 30, 2007 at 08:14 AM UTC - 5 hrs
Kenji HIRANABE (whose signature reads "Japanese translator of 'XP Installed', 'Lean Software Development' and 'Agile Project Management'") just posted a video (with English subtitles) of himself, Matz (creator of Ruby), and Kakutani-san (who translated Bruce Tate's "From Java To Ruby" into Japanese) to his blog. According to Matz (if I heard correctly in the video), Kenji's a "language otaku," so his blog may be worth checking out for all you other language otakus (though, from my brief look at it, it seems focused between his company's product development and more general programming issues).
In any case it's supposed to be a six part series where the "goal is to show that 'Business-Process-Framework-Language' can be
instanciated as 'Web2.0-Agile-Rails-Ruby' in a very synergetic manner, but each of the three may have other goals."
The first part discusses the history of Java, Ruby, and Agile. I found it interesting and am looking forward to viewing the upcoming parts, so I thought I'd link it here.
Posted by Sam on May 29, 2007 at 07:14 AM UTC - 5 hrs
Often when we make a change to some code, if it is not properly designed, the result is that cascading changes need to be done throughout the application because bugs will ripple through the system. That's one of the ideas behind why we want to have low coupling between classes. We also have unit testing to combat the effects of this, but let's just suppose we haven't written any, and haven't used JUnit Factory to create regression tests.
Given that there is a lot of code out there that isn't quite perfect, wouldn't it be nice to have a tool that could analyze changes and where they would affect other code? I can imagine how such a tool might work, but I haven't heard of one before now (that I recall, anyway).
So the point of it all: I heard something about BMC and IBM teaming up on such a tool (my understanding is that BMC started it, and IBM later joined the project). I'm assuming it'd be in Java, but does anyone have information on this? Can anyone confirm or deny the story I heard?
Last modified on May 29, 2007 at 07:15 AM UTC - 5 hrs
Posted by Sam on May 27, 2007 at 11:15 AM UTC - 5 hrs
Hot off the presses: On Saturday, June 16, 2007 from 9:00 AM (and lasting all day) Agile Houston is hosting a JUnit workshop.
Attendees will be pairing and TDDing improvements to JUnit.org, including: (quoting Agile Houston's announcement)
More...
* A forum that lets developers post hard-to-test code, and "test case" responses that show coverage and test results.
* A JUnit roadmap that shows changes to upcoming versions, like the Hamcrest style API changes in JUnit 4.4
* Updated graphics, layout, and content.
I'm guessing the updated graphics won't be test-driven, but who knows what's possible these days?!
This sounds like a lot of fun so I'm tentatively planning on making it out, and I hope to see you people there! You just need to "bring your laptop" and show up "ready to sling some code." And Ben Rady said you'll want to get there early if you want some place to sit!
Last modified on Jun 03, 2007 at 12:14 PM UTC - 5 hrs
Posted by Sam on May 20, 2007 at 08:04 AM UTC - 5 hrs
What if we had functions compile(String sourceCode) and runCompiled(BinaryData compiledCode) ? What could you do with it? Could it act as a closure, and bring really dynamic coding to traditionally static languages, perhaps allowing dynamically named variables and functions to languages without them (and much more!)? Could we store our code in a database and mix and match chunks, building programs based on a SQL query?
What do you think? Useful or useless?
Posted by Sam on May 19, 2007 at 01:25 PM UTC - 5 hrs
Peter Bell's presentation on LightWire
generated some comments I found very interesting and thought provoking.
(Perhaps Peter is not simply into application generation, but comment generation as well.)
The one I find most interesting is brought up by several people whose opinions I value -
Joe Rinehart, Sean Corfield,
Jared Rypka-Hauer, and others during and after the presentation.
That is: what is the distinction between code and data, and specifically, is XML code or data
(assuming there is a difference)?
More...
The first item where I see a distinction that needs to be made is on, "what do we mean when we are talking about
XML?" I see two types - XML the paradigm where you use tags to describe data, and the XML you write - as in,
the concrete tags you put into a file (like, "see that XML?"). We're talking about XML you've written, not
the abstract notion of XML.
The second idea: what is code? What is data? Sean Corfield offers what I would consider to be a concice,
and mostly correct definition: "Code executes, non-code (data) does not execute." To make it correct (rather
than partially so), he adds that (especially in Lisp) code can be data, but data is not code. You see this
code-as-data any time you are using closures or passing code around as data. But taking it a bit further -
your source code is always just data to be passed to a compiler or interpreter, which figures out what the
code means, and does what it has been told to do.
So is XML code? Certainly we see it can be: ColdFusion, MXML, and others are languages where your
source code is written (largely) in XML. But what about the broader issue of having a programmatic
configuration file versus a "data-only" config file?
Is the data in the config file executable? It depends on the purpose behind the data. In the case of data
like
<person>
<name>
Bill
</name>
<height>
4'2"
</height>
</person>
I think (assuming there is nothing strange going on) that it is clearly data. Without knowing anything about the
back end, it seems like we're just creating a data structure. But In the case of
DI (and many others uses for config files),
I see it as giving a command to the DI framework to configure a new bean. In essence, as Peter notes,
we've just substituted one concrete syntax for another.
In the case of XML, we're writing (or using)
a parser to send data to an intepreter we've written that figures out what "real" commands to run based on
what the programmer wrote in the configuration file. We've just created a higher level language than we had before
- it is doing the same thing any other non-machine code language does (and you might even argue
about the machine code comment). In the configuration case,
often it is a DSL (in the DI case specifically, used to describe which objects depend on which other
objects and load them for us).
While we don't often have control structures, there is nothing stopping us from implementing them,
and as Peter also notes, just because a language is not
Turing complete), doesn't mean it is not
a programming language. In the end, I see it as code.
Both approaches are known to have their benefits and drawbacks, and choosing one over the other is largely a matter
of personal taste, size and scope of problem, and problem/solution domain. For me, in the worlds of
JIT compiling and interpreted langages, the programmatic way
of doing things tends to win out - especially with large configurations because I prefer to have
the power of control structures to help me do my job (without having to implement them myself).
On the other hand, going the hard-coded XML route is especially popular in the compiled world, if not
for any other reason than you can change configurations without recompiling your application.
I don't make a distinction between the two on terms of XML is data, while programming (or using an in-language DSL)
in your general-purpose language is code. To me, they are both code, and if either is done incorrectly it will
blow-up your program.
Finally, I'm not sure what value we gain from seeing that code is data (and in many cases config data is code),
other than perhaps a new way of looking at problems which might lead us to find better solutions.
But that isn't provided by the distinction itself, just the fact that we saw it.
Comments, thoughts, questions, and requests for clarifications are welcome and appreciated.
Posted by Sam on May 15, 2007 at 01:07 PM UTC - 5 hrs
It might be petty, but those two three-letter words really get to me. They clutter my code almost as much as semicolons, Object obj = new Object() // assign a new object to the object called "obj" which is is of type Object , and angle brackets. I'm want to do something like newCodebase=rereplace(entireCodebase,"[gs]et","","all") .
What if we had a new idiom where calling the method without arguments would cause it to return a value (the get), while calling it with an argument would run the setter? Its easy enough to do. What do you think, am I off my head?
Posted by Sam on May 14, 2007 at 08:01 AM UTC - 5 hrs
Due to a couple of requests in the comments about Sorting really BIG files,
I'm posting the Java source code that does the job.
First, we have the main externalSort algorithm. It expects parameters relation
(the name of a table, which in this case is the .csv file to sort without the ".csv" in the file name)
and the column to sort on (it checks the first row of the .csv file for the value passed). Now, of course
that is easy enough to change to fit your own needs - it worked fine for me and my CSVs.
There is a magic number in there that you might want to pull out: 10000. That is the number of rows to
read at a time.
Clearly, ten thousand rows is way smaller than we would want to sort on - depending on how many characters
are in each row, we could probably quite easily fit 10 million rows into memory. But, I didn't want to
spend time generating 100 million row files to test this on - so I just tried it with 100 thousand or so.
For a real use scenario, you'd probably want to go with something bigger - if not even calculate the size
and average row length.
More...
private static void externalSort(String relation, String attribute)
{
try
{
FileReader intialRelationInput = new FileReader(relation + ".csv");
BufferedReader initRelationReader = new BufferedReader(intialRelationInput);
String [] header = initRelationReader.readLine().split(",");
String [] row = header;
int indexToCompare = getIndexForColumn(header,attribute);
ArrayList<Integer[]> tenKRows = new ArrayList<Integer[]>();
int numFiles = 0;
while (row!=null)
{
// get 10k rows
for(int i=0; i<10000; i++)
{
String line = initRelationReader.readLine();
if (line==null)
{
row = null;
break;
}
row = line.split(",");
tenKRows.add(getIntsFromStringArray(row));
}
// sort the rows
tenKRows = mergeSort(tenKRows, indexToCompare);
// write to disk
FileWriter fw = new FileWriter(relation + "_chunk" + numFiles + ".csv");
BufferedWriter bw = new BufferedWriter(fw);
bw.write(flattenArray(header,",")+"\n");
for(int i=0; i<tenKRows.size(); i++)
{
bw.append(flattenArray(tenKRows.get(i),",")+"\n");
}
bw.close();
numFiles++;
tenKRows.clear();
}
mergeFiles(relation, numFiles, indexToCompare);
initRelationReader.close();
intialRelationInput.close();
}
catch (Exception ex)
{
ex.printStackTrace();
System.exit(-1);
}
}
Now, there were a couple of functions that the externalSort() method depended on.
First we see the getIndexForColumn . This method just wraps code that finds the index
where the value of attribute occurs in the array header . That's
easy enough to write, so I won't add to the clutter and post it here.
Next, we have getIntsFromStringArray . This is also simple and just uses parseInt
on each index in the array and returns a new one for int s. I'm assuming these are all integers,
so you may need to change that line to deal with String s or other data types.
Then we have the mergeSort algorithm, which is shown below. flattenArray is
also omitted here because it simply takes an array, puts a comma between each element, and returns
a String . Finally, mergeFiles is also shown below.
// sort an arrayList of arrays based on the ith column
private static ArrayList<Integer[]> mergeSort(ArrayList<Integer[]> arr, int index)
{
ArrayList<Integer[]> left = new ArrayList<Integer[]>();
ArrayList<Integer[]> right = new ArrayList<Integer[]>();
if(arr.size()<=1)
return arr;
else
{
int middle = arr.size()/2;
for (int i = 0; i<middle; i++)
left.add(arr.get(i));
for (int j = middle; j<arr.size(); j++)
right.add(arr.get(j));
left = mergeSort(left, index);
right = mergeSort(right, index);
return merge(left, right, index);
}
}
// merge the the results for mergeSort back together
private static ArrayList<Integer[]> merge(ArrayList<Integer[]> left, ArrayList<Integer[]> right, int index)
{
ArrayList<Integer[]> result = new ArrayList<Integer[]>();
while (left.size() > 0 && right.size() > 0)
{
if(left.get(0)[index] <= right.get(0)[index])
{
result.add(left.get(0));
left.remove(0);
}
else
{
result.add(right.get(0));
right.remove(0);
}
}
if (left.size()>0)
{
for(int i=0; i<left.size(); i++)
result.add(left.get(i));
}
if (right.size()>0)
{
for(int i=0; i<right.size(); i++)
result.add(right.get(i));
}
return result;
}
There is nothing special about the mergeSort and merge methods above. They simply
follow the well known merge sort algorithm, accounting
for the fact that we are sorting on one element in an array of arrays, rather that an array with only one
dimension.
Finally, we have the mergeFiles function, which simply reads all the files, one row at a time,
and puts the smallest row at the top of the sorted file.
private static void mergeFiles(String relation, int numFiles, int compareIndex)
{
try
{
ArrayList<FileReader> mergefr = new ArrayList<FileReader>();
ArrayList<BufferedReader> mergefbr = new ArrayList<BufferedReader>();
ArrayList<Integer[]> filerows = new ArrayList<Integer[]>();
FileWriter fw = new FileWriter(relation + "_sorted.csv");
BufferedWriter bw = new BufferedWriter(fw);
String [] header;
boolean someFileStillHasRows = false;
for (int i=0; i<numFiles; i++)
{
mergefr.add(new FileReader(relation+"_chunk"+i+".csv"));
mergefbr.add(new BufferedReader(mergefr.get(i)));
// get each one past the header
header = mergefbr.get(i).readLine().split(",");
if (i==0) bw.write(flattenArray(header,",")+"\n");
// get the first row
String line = mergefbr.get(i).readLine();
if (line != null)
{
filerows.add(getIntsFromStringArray(line.split(",")));
someFileStillHasRows = true;
}
else
{
filerows.add(null);
}
}
Integer[] row;
int cnt = 0;
while (someFileStillHasRows)
{
Integer min;
int minIndex = 0;
row = filerows.get(0);
if (row!=null) {
min = row[compareIndex];
minIndex = 0;
}
else {
min = null;
minIndex = -1;
}
// check which one is min
for(int i=1; i<filerows.size(); i++)
{
row = filerows.get(i);
if (min!=null) {
if(row!=null && row[compareIndex] < min)
{
minIndex = i;
min = filerows.get(i)[compareIndex];
}
}
else
{
if(row!=null)
{
min = row[compareIndex];
minIndex = i;
}
}
}
if (minIndex < 0) {
someFileStillHasRows=false;
}
else
{
// write to the sorted file
bw.append(flattenArray(filerows.get(minIndex),",")+"\n");
// get another row from the file that had the min
String line = mergefbr.get(minIndex).readLine();
if (line != null)
{
filerows.set(minIndex,getIntsFromStringArray(line.split(",")));
}
else
{
filerows.set(minIndex,null);
}
}
// check if one still has rows
for(int i=0; i<filerows.size(); i++)
{
someFileStillHasRows = false;
if(filerows.get(i)!=null)
{
if (minIndex < 0)
{
puts ("mindex lt 0 and found row not null" + flattenArray(filerows.get(i)," "));
System.exit(-1);
}
someFileStillHasRows = true;
break;
}
}
// check the actual files one more time
if (!someFileStillHasRows)
{
//write the last one not covered above
for(int i=0; i<filerows.size(); i++)
{
if (filerows.get(i) == null)
{
String line = mergefbr.get(i).readLine();
if (line!=null)
{
someFileStillHasRows=true;
filerows.set(i,getIntsFromStringArray(line.split(",")));
}
}
}
}
}
// close all the files
bw.close();
fw.close();
for(int i=0; i<mergefbr.size(); i++)
mergefbr.get(i).close();
for(int i=0; i<mergefr.size(); i++)
mergefr.get(i).close();
}
catch (Exception ex)
{
ex.printStackTrace();
System.exit(-1);
}
}
Ideally, you'd want to do better abstraction on a lot of this code and put it into different methods.
Like Ben Nadel did in his ColdFusion/Java version of this,
I'm keeping it inline so you don't need to track over to a function to see exactly what it does (though
a good name would certainly tell you without the details - which is what abstraction is about!). In
particular, I'd move all the file-specific operations into abstractions which better describe what
they're doing in this particular domain. It would really help to better understand what's going on
in the mergeFiles method at a glance, rather than reading all the details.
Questions? Comments?
Last modified on May 14, 2007 at 08:05 AM UTC - 5 hrs
Posted by Sam on Apr 07, 2007 at 04:29 PM UTC - 5 hrs
Have any of you Java guys or gals seen or tried JUnit Factory from Agitar? It generates functional unit tests for each method in a given class. A good description of how this can be used (since it can't detect how you expect the code should work, only how it does work) is provided in the FAQ:
More...
What do these so-called tests actually test? Don't they just check that the code is doing what it's doing?
You are absolutely right. The tests that JUnit Factory generates are best described as characterization tests. These are tests that characterize the actual behavior of the code. They record and test not what the code is supposed to do, but what it actually does. The term characterization tests was introduced by Michael Feathers in his book Working Effectively With Legacy Code – a great book by the way.
How are these characterization tests useful?
Characterization tests are change detectors and are particularly useful when working with legacy code. In Working With Legacy Code (you should really get this book), Michael Feathers defines legacy code as code without tests. When modifying legacy code you are working without a safety net. You make specific changes but you don't know if those changes have some unwanted or unexpected side effects. A good set of characterization tests can serve as a safety net because these tests will show you when you have changed some existing behavior. Of course, only you – the developer - can decide if the change is what you had intended or not. In many cases, the tests will show that your changes have some intended effects and some unintended side-effects.
Can I use the generated tests in some other ways?
Sure. Since the test generator does not know how you meant your code to be used, it tries all sorts of things and uses a wide range of valid and invalid inputs – including things that you did not plan for or anticipate. By reading the generated tests you will often discover behavior that you did not want or expect. Seeing a generated test method called testFooThrowsArrayIndexOfBoundException for example, lets you know that method foo() throws an AIOOBException. Is this exception something that you expected? If the answer is yes, then fine, you now have a test for that. If the answer is no, you can fix your code accordingly.
I tried a simple MathFunctions class in the online demo:
public class MathFunctions {
public int add(int x, int y){
return x+y;
}
public int subtract(int x, int y){
return x-y;
}
public int multiply(int x, int y){
return x*y;
}
public int divide(int x, int y){
return x/y;
}
}
and it responded with the following tests (pardon the lack of formatting - I'm just lazy at the moment):
/**
* Generated by Agitar build: Agitator Version 1.0.4.000276 (Build date: Mar 27, 2007) [1.0.4.000276]
* JDK Version: 1.5.0_09
*
* Generated on Apr 7, 2007 2:19:39 PM
* Time to generate: 00:03.788 seconds
*
*/
import com.agitar.lib.junit.AgitarTestCase;
public class MathFunctionsAgitarTest extends AgitarTestCase {
static Class TARGET_CLASS = MathFunctions.class;
public void testConstructor() throws Throwable {
new MathFunctions();
assertTrue("Test completed without Exception", true);
}
public void testAdd() throws Throwable {
int result = new MathFunctions().add(100, 1000);
assertEquals("result", 1100, result);
}
public void testAdd1() throws Throwable {
int result = new MathFunctions().add(0, 0);
assertEquals("result", 0, result);
}
public void testDivide() throws Throwable {
int result = new MathFunctions().divide(0, 100);
assertEquals("result", 0, result);
}
public void testDivide1() throws Throwable {
int result = new MathFunctions().divide(100, -1);
assertEquals("result", -100, result);
}
public void testMultiply() throws Throwable {
int result = new MathFunctions().multiply(100, 0);
assertEquals("result", 0, result);
}
public void testMultiply1() throws Throwable {
int result = new MathFunctions().multiply(100, 1000);
assertEquals("result", 100000, result);
}
public void testSubtract() throws Throwable {
int result = new MathFunctions().subtract(2, 2);
assertEquals("result", 0, result);
}
public void testSubtract1() throws Throwable {
int result = new MathFunctions().subtract(100, 1000);
assertEquals("result", -900, result);
}
public void testDivideThrowsArithmeticException() throws Throwable {
try {
new MathFunctions().divide(100, 0);
fail("Expected ArithmeticException to be thrown");
} catch (ArithmeticException ex) {
assertEquals("ex.getClass()", ArithmeticException.class, ex.getClass());
assertThrownBy(MathFunctions.class, ex);
}
}
}
It looks pretty promising.
Last modified on Apr 07, 2007 at 04:30 PM UTC - 5 hrs
Posted by Sam on Mar 30, 2007 at 03:12 PM UTC - 5 hrs
One of the great benefits of using a framework (in the general sense) is the freedom in portability it often gives you. Rather than writing your own Ajax routines which would need to handle browser incompatibilities, you can rely on Prototype/ Scriptaculous, AjaxCFC, Spry, or a seemingly infinite number of Ajax frameworks available. We see the same phenomenon in our use of ColdFusion, which uses Java to be cross-platform. And, you can get the benefit for databases by using a framework like Transfer or Reactor or ActiveRecord in Ruby (all also have a slew of other benefits). In general, we're simply seeing an abstracting away of the differences between low-level things so we can think at a much higher level than "oh crap, what if they're using Lynx!" (Ok, I'm doubting any of those Ajax frameworks fully support Lynx =), but you get the picture)
More...
I can honestly say I've never had to build an application that needed to be portable between databases, aside from between different versions by the same vendor (although there was once where it might have been nice, since I used the apparently impossible to find in hosting combination of Microsoft SQL Server and Java). My guess is that unless you're building software which is meant to be deployed on different databases (such as a framework, library, or software you plan to sell over and over again which you won't be hosting), you probably haven't needed to do so either... Or at least, not often enough to let it impact your coding.
So today, I started my quest to turn cfrails into a DB-independent framework (as of today, March 30, 2007 it works only with Microsoft SQL Server). Since one of my goals is to limit the amount of configuration the programmer has to provide to get up and running, simply telling cfrails your database metadata (instead of having it look in the database itself) was out of the question (XML or otherwise). I've looked at using CF's getMetaData(query_result) function in the past, but that doesn't have near the amount of data that would be useful for me.
I figured my best bet would be to drop down into Java and use its DatabaseMetaData interface. This way, I could let Java take care of the differences between DB implementations for me. Sure enough, looking over the docs, it had everything I needed. I went ahead and whipped up some code to use it and test it before trying to put it into Coldfusion. If you're interested, here's most of what I needed (only using the jdbc/odbc bridge). If not, feel free to skip this code:
// at the top of the file, you'll need to import java.sql.*
// and of course, define your class as you normally would.
Connection connection=null;
String driverPrefixURL = "jdbc:odbc:";
String dataSource = "ODBC Datasource Name";
String username = "user";
String password = "password";
String catalog = "which database";
String schema = "dbo";
String tableYouWantColumnsFor = "sometable";
try
{
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
connection = DriverManager.getConnection(driverPrefixURL+dataSource, username, password);
DatabaseMetaData dbmeta = connection.getMetaData();
ResultSet cols = dbmeta.getColumns(catalog,schema,tableYouWantColumnsFor,"%");
while (cols.next())
{
// this is all going to be unformatted as-is
System.out.print(cols.getString("COLUMN_NAME") + " ");
System.out.print(cols.getString("DATA_TYPE") + " ");
System.out.print(cols.getString("TYPE_NAME") + " ");
System.out.print(cols.getString("COLUMN_SIZE") + " ");
System.out.print(cols.getString("IS_NULLABLE") + " ");
// no clue why these two lines are throwing errors
//System.out.print(cols.getString("COLUMN_DEF") + " ");
//System.out.print(cols.getString("ORDINAL_POSITION") + " ");
System.out.println(); // skip to the next line
}
connection.close();
}
catch (Exception e)
{
e.printStackTrace();
// probably should be in finally block, but didn't worry about it as this was just a test
try
{
connection.close();
}
catch (Exception ex)
{
System.out.println("Could not close connection - probably wasn't open.");
}
}
(if you need more info, see the docs I linked above -- this is just a small sampling of the metadata available for columns in a table)
Sweet! I didn't really need ORDINAL_POSITION since it already ordered the results by it, and COLUMN_DEF (the column's default value) wasn't all that important (well, I could figure out a way to get that later, anyway). But, when I went to put it into ColdFusion, I realized - I have no clue how I might get the information I need to connect- forget about the fact that I'm not sure the average developer (CF or otherwise) will know what a catalog or schema is, or which one they need to access if they do. Who in the world isn't going to have to go through some work to figure out what driver they need and what the connection URL is?
So, I needed a way to let Java know these things through CF. I thought I could get at least some of that information from the CFAdminAPI (or whatever the proper case and spelling is), but I didn't really like that option. I couldn't think of any other option that would allow me to use Java to get the metadata, so I decided I'd have a look at the codebases for Transfer ORM, Reactor, and DataMgr and see how those smart guys did it. I only browsed, but I didn't find anything useful for me trying to use Java to do the trick. So, it was back to writing a DB-specific implementation for each database I wanted to support.
Luckily for me, even though my design doesn't approach the modularity and complexity of Reactor or Transfer (yet... I'm a big fan of YAGNI and haven't had any reason to break out of the simple design I currently have and Transfer is downright confusing for me to quickly browse it and figure out what's going on), it is quite flexible enough for me to do the different implementations and integrate them with ease. To do so, I only need to refactor the current routine that gets the MSSQL Server metadata, and provide a way to figure out which database type it is looking in (each of these only has to happen once). After that, all that remains is to write a query to get the metadata and convert its output (for each implementation) to the interface the rest of my code expects. Not hard by any means, but it would have been cool for maintenance purposes and greater overall simplicity if I could have let Java handle it.
So now that you've gotten this far in this boring post with no conclusion, have you used Java to get the DB metadata in CF (or even outside of CF)? Have you found a way to do it without making the programmer specify the connection URL and driver?
Posted by Sam on Mar 29, 2007 at 02:24 PM UTC - 5 hrs
It had been a while since I visited InfoQ, but the other day I got one of their mailings in my inbox, and tons of it was relevant to my interests (even more so than normal). Rather than having a separate post for all of it, I decided to combine much of it into this post.
First, this post let me know that Thoughtworks has released CruiseControl.rb, which is good news for Ruby users who also want continuous integration. I've yet to use it, but those guys are a great company and it seems like everything they touch turns to gold.
More...
Next, of interest to Java programmers and ORM fans, Google has contributed code to Hibernate that allows people to "keep their data in more than one relational database for whatever reason-too much data or to isolate certain datasets, for instance-without added complexity when building and managing applications."
Then, Bruce Tate, who is probably best known for his books Beyond Java and From Java to Ruby (by me anyway) has a case study on ChangingThePresent.org, his new project that uses Rails. It looks like a good starting point if you are developing a high-traffic Rails site. And, the site does some good charity work - taking only the credit card processing fee from your donation to the causes of your choice.
Finally, Gilad Bracha speaks about dynamic languages on the JVM. Until recently he was the Computational Theologist (is that someone who studies the religion of computation?) at Sun. The talk is just over 30 minutes long, and is at a surprisingly low level - so geeks only! He discusses a few of the challenges with implementation details on getting dynamic languages to run natively in byte-code (as opposed to writing an interpreter in Java which would then interpret the dynamic language). This includes a new instruction for invoking methods - invokedynamic, as well as a discussion on possible implementations of hotswapping - the technique that allows you to modify code at runtime. Very interesting stuff.
For today, I should be done referencing InfoQ, except to give them credit in my next post for turning me on to a couple of good resources for implementing DSLs in Ruby (which is the subject of my next post). I'll post the link in the comments when I'm done with it - I try not to edit these posts anymore since MXNA likes to visit every 30 seconds and thinks the date is the unique identifier of the post... you know, instead of using the XML field called "guid." =) (sorry guys, just poking a little fun - thanks for aggregating me).
Posted by Sam on Mar 12, 2007 at 08:43 AM UTC - 5 hrs
Yesterday I was working on a little Java program that, when given a table, a "possible" candidate key (which could be composite), and some non-key columns would check to see if the table was in 1st, 2nd, or 3rd normal form(s). One constraint is that this program needs to be procedural in style (or rather, all of my code must reside in the file that contains the main() function).
I started out with the pseudocode programming process in psvm. My listing started to look like:
More...
// validate the input args[]
// extract the table, "possible" candidate key, and non-keys
// validate that there are no duplicate columns in the "possible" candidate key and the non-keys
// validate the table exists
// validate the columns exist for the "possible" candidate key and the non-keys
...
// and so on
At that point, I decided to go ahead and start coding. Now, I wasn't using TDD, since I have no clue how I might go about doing so other than to first code in different classes, and then migrate everything over to main() . In that case, I figured there may very well be more bugs after migrating, rather than less. So, I just started coding.
Once I got down to having to hit the database table to verify it and the columns existed, I got stuck. I didn't know if I needed to test against an actual DBMS using JDBC or if I needed to check against a .CSV file, for example. I stayed stuck and thinking for a while, until I started thinking in objects (or more generally, and importantly, abstract data types). At that point I realized that whatever I did, I'd have to store it some how, and in doing so, I could "fake it" and implement that part later. All I needed was the interface, essentially.
Now, this is a technique you'll often use in OOP and especially if you're following test driven development, but given the nature of the constraints, I didn't think about it immediately.
The point is that even if you aren't programming with objects or using TDD, you can still think about abstraction, and use it to continue coding even when the detailed implementation-level requirements aren't fully specified. I'm not sure this is something I would have thought about when programming purely procedurally, so hopefully it will help someone out, as it certainly did for me.
Thoughts anybody?
Posted by Sam on Feb 27, 2007 at 06:31 AM UTC - 5 hrs
One of the things I've been dreading is getting Rails to work with IIS, although it appears to be getting easier (last time I checked, there was no such "seamless integration"). But eWeek has some good news. They note that core developer on the JRuby project, Charles Nutter
said the JRuby project will announce, possibly as soon as this week, that JRuby supports the popular Ruby on Rails Web development framework. Ruby on Rails is sometimes referred to as RoR or simply Rails.
"We're trying to finish off the last few test cases so we can claim that more than 95 percent of core Rails tests passed across the board," Nutter said.
Moreover, the JRuby team is inviting Rails developers to try out real-world usage of JRuby and help them find any Rails issues the unit tests do not cover, or any remaining failures that are crucial for real applications, Nutter said.
Support for Ruby on Rails is important because "once Rails can run on JVM alongside other Java apps, existing Java shops will have a vector to bringing Rails apps into their organizations. They won't have to toss out the existing investment in servers and software to go with the new kid on the block," Nutter said.
The article also mentions "Microsoft is looking at support for Ruby developers and broader uses of Ruby, including having it run on the CLR."
Both bits of good news for fans of Ruby who want it to play well with Windows and IIS. Not to mention that using JRuby (or NRuby, if it will be called that for .NET) should open up all of those libraries available to Java/.NET developers, which has been considered by some to be one of Ruby's weaker spots.
Posted by Sam on Feb 01, 2007 at 03:02 PM UTC - 5 hrs
On Monday (Jan. 29, 2007) we had our first meeting of the UH Code Dojo,
and it was a lot of fun. Before we started, I was scared that I'd be standing
there programming all by myself, with no input from the other members. But,
my fear was immediately laid to rest. Not only did everyone participate a lot,
one of the members, Matt (I didn't catch his last name), even took over the
typing duties after about 45 minutes. That turned out to be a good thing -
since I still can't type worth a crap on my laptop, and he was much faster than me.
Now, it had only been a couple of months since I last used Java - but it still amazes me
how little time off you need to forget simple things, like "import" for "require." I found
myself having several silly syntax errors for things as small as forgetting the
semicolon at the end of the lines.
Overall, we started with 7 people, and had 5 for most of the meeting. We finished with
four because one person had tons of homework to do. It seemed like those five of us
who stayed were genuinely enjoying ourselves.
In any case, we decided to stick with the original plan of developing a tic-tac-toe game.
You would think that a bunch of computer scientists could develop the whole
game in the two hours we were there. But, you'd have been wrong.
I came away from the meeting learning two main points, which I think illustrate the main
reasons we didn't complete the game:
- YAGNI is your friend
- Writing your tests first, and really checking them is very worthwhile
More...
Both points are things most of us already know, but things you sometimes lose sight
of when working like that all the time, or without paying attention. But, I was
paying attention on Monday, so let me explain how I came to identify those points.
We started by trying to identify a class we could test. This was harder than it looked. We
initially decided on Game . There was a lot of discussion on making it an
abstract class, or an interface, or should we just make it support only tic-tac-toe? After
everyone was convinced that we should first make it concrete, because we would not be
able to test it otherwise, we started trying to figure out what we could test.
So, we wrote a test_hookup() method to check that JUnit was in
place and working.
In the end, we also decided that we would probably want to try to
make it abstract after we tested the functionality, so we could support multiple games.
I think that decision proved to be a bad one - because none of us could figure
out a single peice of functionality that a Game should have.
For me, I think this was due mostly to the fact that I was trying to think of
general functionality, so I only came up
with run() , basically the equivalent of play() .
After thinking for a couple of minutes about things we could test, we decided we should
go with something more tangible for us - the Board class. Once we did that,
we never had a problem with what to test next. But, we still ran into problems with
YAGNI. We were arguing about what type of object we should use, so I just said "let's use
Object , then anything will fit and we can be more general when we get
back to Game ." This led to trouble later, when we had to have checks for
null all the time. So, we changed it to a string whose value could be "x", "o", or "".
That made life, and the code, a lot simpler.
Those are the two main places where ignoring the YAGNI prinicple hurt us. There are also
a couple of places where not writing our tests first hurt us.
The most obvious one is that we were just looking for a green bar. Because of that, we
actually were running the wrong tests, and some of ours hadn't worked in the first
place. We were running the test from the Game class, not the Board .
It wasn't too hard to fix the ones that were broken though.
Other than that, we spent
a lot of time trying to figure out if an algorithm to determine a winner would work (in
our heads). If we had just written the test first, we could have run it and seen
for sure if the algorithm was going to work correctly.
Overall, the only other thing I would have changed about how we developed would have been to
decide on several tests in advance, so we would have plenty to choose from,
rather than thinking of a test, then immediately implementing it.
For reference, I've posted our code below.
// Board.java
public class Board {
private String[][] _board;
private boolean current_player;
public Board(int row, int col)
{
_board = new String[row][col];
current_player = true;
for (int i = 0 ; i < row ; i++)
{
for (int j = 0 ; j < col ; j++)
{
_board[i][j] = "";
}
}
}
public boolean isEmpty()
{
for (int i=0; i<_board.length; i++)
{
for(int j=0; j<_board[i].length; j++)
{
if (!_board[i][j].equals("")) return false;
}
}
return true;
}
public void setMarkerAtPosition(int row, int col)
{
if (current_player)
_board[row][col] = "x";
else
_board[row][col] = "o";
current_player = !current_player;
}
protected Object getMarkerAtPosition(int row, int col)
{
return _board[row][col];
}
protected boolean isGameOver()
{
return false;
}
public boolean isRowWinner(int row)
{
for (int j = 0 ; j < _board[0].length - 1 ; j++)
{
if (!_board[row][j].equals(_board[row][j + 1]))
{
return false;
}
}
return true;
}
public boolean isColWinner(int col)
{
for (int i = 0 ; i < _board.length -1 ; i++)
{
if (!_board[i][col].equals(_board[i][col]))
{
return false;
}
}
return true;
}
public boolean isDiagWinner()
{
boolean diag_winner1 = true, diag_winner2 = true;
for (int i = 0 ; i < _board.length -1 ; i++)
{
if (!_board[i][i].equals(_board[i+1][i+1]))
{
diag_winner1=false;
}
if (!_board[i][_board[0].length-1-i].equals(
_board[i+1][_board[0].length-1-(i+1)]))
{
diag_winner2=false;
}
}
return diag_winner1||diag_winner2;
}
}
// TestBoard.java
import junit.framework.TestCase;
public class TestBoard extends TestCase {
private Board _b;
public void setUp()
{
_b= new Board(3,3);
}
public void testBoard()
{
assertTrue(_b != null);
assertTrue(_b.isEmpty());
}
public void testSetMarkerAtPosition()
{
_b.setMarkerAtPosition(0, 0);
//assertEquals(_b.getMarkerAtPosition(0,0),"x");
}
public void testIsGameOver()
{
assertFalse(_b.isGameOver());
for (int i = 0 ; i < 3 ; i++)
{
for (int j = 0 ; j < 3 ; j++)
{
_b.setMarkerAtPosition(i,j);
}
}
//assertTrue(_b.isGameOver());
}
public void testIsRowWinner()
{
_b.setMarkerAtPosition(0,0);
_b.setMarkerAtPosition(1,0);
_b.setMarkerAtPosition(0,1);
_b.setMarkerAtPosition(1,1);
_b.setMarkerAtPosition(0,2);
assertTrue(_b.isRowWinner(0));
}
public void testIsColWinner()
{
_b.setMarkerAtPosition(0,0);
_b.setMarkerAtPosition(0,1);
_b.setMarkerAtPosition(1,0);
_b.setMarkerAtPosition(1,1);
_b.setMarkerAtPosition(2,0);
assertTrue(_b.isColWinner(0));
}
public void testIsDiagWinner1()
{
_b.setMarkerAtPosition(0,0);
_b.setMarkerAtPosition(0,1);
_b.setMarkerAtPosition(1,1);
_b.setMarkerAtPosition(0,2);
_b.setMarkerAtPosition(2,2);
assertTrue(_b.isDiagWinner());
}
public void testIsDiagWinner2()
{
_b.setMarkerAtPosition(0,2);
_b.setMarkerAtPosition(0,1);
_b.setMarkerAtPosition(1,1);
_b.setMarkerAtPosition(0,0);
_b.setMarkerAtPosition(2,0);
assertTrue(_b.isDiagWinner());
}
}
As you can see, the board object really didn't do what its name implies. That should have
been game, I think. Also, the three check for winner functions are only temporarily
public - we planned on combining them to check for a winner in just one public function.
Other than that, we also need some negative tests, and obviously to complete the game.
Finally, thanks to everyone who showed up for making
it an enjoyable experience! I can't wait for the next one!
Last modified on Feb 12, 2007 at 10:12 AM UTC - 5 hrs
Posted by Sam on Jan 23, 2007 at 12:11 PM UTC - 5 hrs
I've seen a couple of posts around the ColdFusion world lately about the programming language, Groovy (which runs on the JVM). If you're interested in learning Groovy and using it for web application development, the framework Grails is for you. And even better, InfoQ has just released Getting Started with Grails. It's a free 131-page PDF mini-book available for download now. You can also buy the print version for just under 20 bucks.
Last modified on Jan 23, 2007 at 12:11 PM UTC - 5 hrs
Posted by Sam on Jan 22, 2007 at 05:16 PM UTC - 5 hrs
One of my guilty pleasures is reading The Daily WTF?!. I just saw this Java snippet, and wanted to share:
int x = new Integer( 0 ).intValue();
someMethod( new Integer( x ) );
I laughed my ass off when I read it. Then, of course, I had to have a laugh at myself, because I've written code like that. Within the last six months.
I did have an excuse though. For some reason, the compiler was complaining that it couldn't cast an int to an Integer (or vice versa). Then it was saying the opposite within the function. So I had to wrap and unwrap it. The weird thing was it was only giving this error on one machine.
As I recall, we just had to switch a compiler version in Eclipse to fix it. But, I still wrote it.
Last modified on Jan 22, 2007 at 05:17 PM UTC - 5 hrs
Posted by Sam on Jan 15, 2007 at 10:23 AM UTC - 5 hrs
Do you find yourself writing more classes in other languages than in Coldfusion? I do.
For instance, I certainly develop more classes in Java than Coldfusion.
Is it due to the fact that I've developed bad habits in CF and haven't yet broken out of them
(i.e., I'm just unable to spot needs for new classes) or is it because CF is so much more high-level
than Java? A third option may be that the performance issues with instantiating
CFCs contribute to me not wanting
to break out?
More...
I think part of it may be due to bad habits and not having the ability to notice a spot where I need
another class. But, I have no evidence of this, since I haven't been able to notice it =).
I do have evidence of the other two options however: CF being so much more high-level than Java is fairly
self-evident: Where I'll often use a façade to interact with a database in Java, in Coldfusion it would
be pretty useless - since there is no need to instantiate a connection, prepare a statement, prepare a
record set, and then close/destroy them all when you don't need them any more. The other,
about the performance hits, I know sometimes stops me from creating more classes. For instance,
I recently used a mixin to achieve some code reuse with a utility method that didn't belong to any of
the classes that used it. Clearly it broke the "is-a" relationship rule, but I didn't
want to incur the cost of instantiating an object just to use
a single, simple method (the application was already seeming to perform slow).
Of course, this is not a zero-sum game. It is certainly due in part to all three. I'd like to think
it is mostly due to the expressiveness of Coldfusion, but I'll always have to wonder about the bad habits.
In Ruby, I've developed a couple of "significant" applications, but those didn't approach
the size of the one's I've done in Coldfusion or Java. It's hard to say for sure, but I'd guess that I'm
writing much fewer classes in Ruby than I would in Java. Again, this is mostly due to Ruby's expressiveness.
In any case, have you noticed yourself writing fewer classes in CF than in other languages? Have you
thought about it? Come to any conclusions?
Last modified on Jan 15, 2007 at 10:23 AM UTC - 5 hrs
Posted by Sam on Dec 04, 2006 at 08:31 AM UTC - 5 hrs
Just finished writing a survey on some of relatively current literature on k-means, focusing on introducing it, some practical applications of it, some difficulties in it, and how to find k, the number of clusters. I'm still new to the area, so don't expect much groundbreaking to be done.
The second half focuses on my own experiment, trying to find k using two similar, but slightly different techniques. I failed, but if you'd like to go over it and either laugh at me, or perhaps figure out what I've done wrong, you are free to. =)
Obviously, this isn't going to interest many people, so I didn't take time to mark it up - it's just available as a DOC (I had planned on having a PDF version, but my PDF writer has taken a crap on me). If you don't have Word or Open Office, and would like to read it, contact me and I'll try to get the PDF for you in some way or another.
Anyway, the DOC is here if you want to read it. It's over 3600 words, so beware!
I'm interested to know if anyone has built any machine learning libraries or done anything with machine learning in Coldfusion? My immediate thought is "no way!" because I don't think Coldfusion has the performance for it. But, I wouldn't know, since I haven't tried it. Have you? What's been your experience? Drop me a line if you care to.
Posted by Sam on Nov 30, 2006 at 12:55 PM UTC - 5 hrs
There's been an ongoing discussion in the Coldfusion community for quite some time
now regarding the use of XML for configuration files to your applications.
More recently, Peter Bell has written quite a few posts about it in
The Benefits of XML Config Files (and why I don't use them),
Configuration Files vs. Scripts,
and most recently, Should you use XML for you configuration files?
He references Hal Helms,
Joe Rinehart
(and thanks Jim Collins who wrote (or co-wrote?)
config.cfc for these),
and Martin Fowler.
This post was sort of "sparked" by Jim's release of config.cfc, then my ensuing question about "when and why" I would want to use it (and no, I didn't mean it to sound that provocative), and then Peter's response to which this post title is addressed as "regarding."
I'm a little late to the party, but I thought I'd chime in nevertheless.
More...
First, I want to start off with my position that "there are no best practices, only better practices" (to quote my professor Venkat). And to paraphrase Peter, a lot of people who know a lot more than I do use XML configurations. On the other hand, a lot of people who know a lot more than me don't use them.
Second, I'd like to point out that I found it mildly funny (not in a bad way) in reading some of the reference documents that some of us consider XML to be human readable. I don't think you'd find that statement anywhere except the Coldfusion community.
Finally, now that I've got the mile-long introduction out of the way, let's continue.
In Hal Helms' article, he is touting the benefits of configuration files in general. He makes a convincing argument, and has motivated me to be a better job of extracting things I never thought would should be configurable, unless I was building something that needed to be internationalized. He compared the INI file to those of the XML variety, and mentions that "some configuration information is more complex and is a poor fit for the simple nature of the INI file. In such cases, XML can be the ideal solution." I think it would be hard to have a problem with that statement, especially given the context of the article.
Joe Rinehart also only touches on the topic - he says that XML "provides the easiest way for us to partition off our configuration into logical 'chunks,'" and doesn't go into detail beyond that (for explaining why XML is a Good Thing.
But, Peter goes into a lot more detail about the benefits of XML. In particular, he mentions (Peter, if you feel I've reproduced too much of your article here, let me know and I'll trim it down)
Validation
As long as you have a DTD/Schema for your XML, you can validate XML against the DTD/Schema to confirm that it meets the specificed requirements. This can be extremely useful in pre-checking your data before getting a ColdFusion error because your configuration data was not well formed. Better yet, you don’t need to write any of the validation code – just the DTD/schema.
Manipulation
With XSLT you can transform XML files from one structure to another and you can even query the data using XPath. There are lots of great tools for working with XML files and again, they’re just there. You don’t need to write them.
Editing
There are plenty of good free plug ins for Eclipse and stand alone editors (XMLBuddy seems
to be a popular Eclipse plug in). People are mostly used to editing in XML and there are
good tools with highlighting and that will validate against a DTD/Schema in editor.
Consistency
As in Steve Krug's "Don't Make me think", framework design choices should bear in mind that
many people spend most of their time using something other than your framework. I may not use
MG often, but apart from learning the vocabulary of the DSLs, there is no real learning curve
in terms of how to create an XML config file for Reactor, CS or MG. Same for Mach-II and
anything else that uses XML configuration files. It is usually better to do things the
way people expect so they have less to learn.
Like Peter, I find XML to have a low signal to noise ratio (in "Benefits of XML Config Files...",
though he said "high," if he'll allow me to put words in his mouth, I think he meant low).
I should also put a bit of a disclaimer here: I've spent the last three months working on a Java project, so
I'm sick of XML configuration. And in general, I've only rarely used XML configuration files with Colfusion,
and even then in fairly distant past. In that regard, let me profess some ignorance, especially regarding
DTDs. Lately, I've been preferring programmatic configs (written directly in
CFML/script). However, I've also not been needing much in the way of configuration, so that may help blind me
to the benefits of XML.
In any case, let me get into some reasons as to why, comparing that with the perceived benefits of XML.
Validation:
As opposed to validating my XML file with a DTD/Schema, I validate my code with unit tests. These can similarly
be provided to other programmers to validate their configurations. Now, I don't know how in depth validation
can go, but it seems to me I'll be writing some validation in CF even if I use XML, and if it wasn't well
formed, or did not meet my expectations, the unit tests would catch that. Another thing frequently mentioned
is that it would be better to catch any errors in the spot the error occurred. I agree with this, but
if you are validating your programmatic config file (with unit tests or otherwise), it is easy enough to
throw an exception in that one spot, rather than letting the error propagate through your code.
Manipulation: I don't suppose I can use a converter to change my programmatic config file, but I'm
not sure I see a reason to either. If I chose to, I could query the data with SQL using a query of query,
as opposed to XPath.
Editing: I feel it is as simple to edit CFML as it is XML, so I don't see a benefit here, except the
highlighting he mentions. But, if we're writing CFML, it is easy enough to check for syntax errors, and if
you're using a decent editor, it will also highlight syntax errors for you. I realize it won't highlight
"syntactic" errors by missing out on something you need in the config, but I don't value that highly, since
I cannot remember a time I wrote a config file from scratch, unless I was creating the application (and in
that case, the XML highlighting is unlikely to help you).
Consistency:
As he mentions, there is little to learn in how to configure different apps when they use similar XML. But,
assuming the person configuring the XML file is also the programmer, the part you have to learn is precisely
the part you need to learn in a programmatic configuration file! You don't need to relearn how to do
<cfset name = value> , you need to learn what valid names and values are. In the case
that the person configuring the application is not a programmer, I'd almost certainly have provided a GUI
frontend to the config file, or likely stored it in a database in the first place.
Finally, Peter mentioned that although just a few variables don't need XML, you might need it when
configuring 200 beans for ColdSpring or LightWire (both dependecy injection frameworks). I have a hard time
seeing that something which is weighty for a small project could be anything less than unwieldy for a larger
one. Especially in this example, if I was using something that required an XML configuration file, I would
undoubtedly store the information for all those beans in a database, and write a script that would write
my XML file for me.
Overall, I like what Martin Fowler said the best:
"I often think that people are over-eager to define configuration files. Often a programming language
makes a straightforward and powerful configuration mechanism... My advice here is to always provide
a way to do all configuration easily with a programmatic interface, and then treat a separate
configuration file as an optional feature."
So having said all that, if and when I find a good case for using an XML configuration file, I'll gladly do so,
and so should you. Nor should we discount XML - as Peter said and I agree, a lot of really smart people
use it. But at the moment, I don't see a need for the excess clutter in my life - CFML provides
enough of that for me =).
Last modified on Nov 30, 2006 at 04:58 PM UTC - 5 hrs
Posted by Sam on Sep 13, 2006 at 09:34 AM UTC - 5 hrs
It may be that I'm still so new to Java that I don't know a good way around this, or it may be that I'm so used to the convenience of Coldfusion that I build tools in Java to help it act more like Coldfusion, but I was having some trouble with MVC in Java. I didn't see why something in my View layer should be importing anything from the package java.sql, such as ResultSet .
Moreover, I wanted to open and close my Connection to the database (and the Statement / PreparedStatement in the Model layer, and doing so would have prevented the ResultSet from being available to the View.
Enter QueryResult . QueryResult simply takes as a parameter a ResultSet and creates a HashMap where each key is a column name, and accessing it gives you an array of the rows' values for that column.
Thus you can get the fifth row of the column email by doing this:
QueryResult qr = new QueryResult(someResultSet);
String emailInRow4 = (String) qr.get("email")[4];
It's not the greatest thing in the world, but it works better than the alternative. I plan to put the code up here for it later, so maybe it could help someone else as well.
Last modified on Sep 13, 2006 at 09:37 AM UTC - 5 hrs
|
Me
|