Posted by Sam on Jun 16, 2010 at 11:06 AM UTC - 5 hrs
It seems easy to add an if statement the first time you need to deal with a difference between legacy data and new data models. It's still easy the second and third times.
Certainly it's easier than transforming the legacy data to fit the new schema.
Induction doesn't hold though. At some point, it becomes a tangled mess of code that has to deal with too many conditions and your mental model of it turns into total disarray.
This is one case where laziness and instant gratification can steer you in the wrong direction. Avoid the spaghetti; just transform the old data like you should have in the first place.
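To make the contrast concrete, here is a minimal sketch in Python. The record shapes and the names `migrate_record` and `display_name` are made up for illustration and are not from any real system. The point is only that the legacy check lives in one place and runs once, so everything downstream sees a single schema.

```python
# Hypothetical record shapes: legacy rows store a single "name" field,
# while the new schema splits it into "first_name" and "last_name".

def migrate_record(record: dict) -> dict:
    """Return a copy of `record` in the new schema; new-style records pass through."""
    if "name" not in record:
        return dict(record)  # already in the new shape
    first, _, last = record["name"].partition(" ")
    migrated = {k: v for k, v in record.items() if k != "name"}
    migrated["first_name"] = first
    migrated["last_name"] = last
    return migrated


def display_name(record: dict) -> str:
    """Downstream code only ever has to understand the new schema."""
    return f"{record['last_name']}, {record['first_name']}"


legacy_rows = [
    {"id": 1, "name": "Ada Lovelace"},
    {"id": 2, "first_name": "Alan", "last_name": "Turing"},
]

# Run the transformation once, up front, instead of branching everywhere.
rows = [migrate_record(r) for r in legacy_rows]
names = [display_name(r) for r in rows]
```

There is still one `if` in there, but it is the only one, and it can be deleted once the old data is gone.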
It might have been obvious to me before then, had I paid attention to my own thoughts and actions. Instead,
it took a little prodding and the knowledge that others experienced the same things I did to get me thinking about it.
The more you have to remember, the worse decisions you make
In the episode, hosts Jad Abumrad and Robert Krulwich relate a story from
Baba Shiv at the Stanford School of Business,
whose research has to do with the brain and tricking people.
In one experiment, Shiv has subjects memorize a number (taking as much time as they want), then
go to another room and recite the number.
Some people only have to memorize a two-digit number, while the others are supposed to remember a seven-digit one.
The trickery comes in when the researchers interrupt the subjects on the way to the second room:
"Excuse me? Sorry to interrupt you, but would you like a snack?"
The research subjects are then asked to choose between chocolate cake or fruit salad for their snack.
The results show those who only need to remember two digits almost always choose the fruit, while the
seven-digit club nearly always choose the cake.
The pair also discuss Barry Schwartz's book, The Paradox of Choice,
that Scott Davis spoke about in this aforementioned keynote, and about which I wrote in 2007:
He [Scott Davis] told a story (I believe to be from the book) that succinctly showed what happens when consumers are presented with too many choices: Given 3-5 options for tasting Jellies/Jams, customers ended up buying more of it. However, increasing that to two-dozen left sales at levels lower than having no sampling at all.
We simply become overwhelmed by the number of options available, turn off, and refuse to make a choice at all. That leads us to the Maximizer/Satisficer concept, where the Maximizers are especially prone to analysis because they try to be more prepared when making decisions and never stop analyzing, even after making the choice. On the other hand, satisficers try to narrow their decision-making points down to a reasonable few.
When my to-do list is unordered and contains too many items, I often revert to the infamous analysis paralysis;
the too-many-things-to-remember causes the poor decision of making no decision. (I am working on some software to relieve me
of this feeling, but more on that some other time).
More powerful programming languages make programs shorter. And programmers seem to think of programs at least partially in the language they're using to write them. The more succinct the language, the shorter the program, and the easier it is to load and keep in your head.
Choosing quickly versus Deliberating
In the same Radiolab episode, the hosts visit Berkeley Bowl
and encounter the "too many choices" dilemma. In choosing an apple each, Jad deliberated as Robert
just chose on his first gut instinct.
The problem behind the kind of choice we make when we have too many things to choose from is that it
short-circuits our prefrontal cortex - the rational, deliberative system in our brains.
We can only hold so much data at a given moment, and when we encounter encephaloverlow, the emotional, unconscious, automatic
part of our brain takes over.
Trying to make that choice brings us back to the maximizers vs. satisficers bisection mentioned above:
in deliberating all of the options behind a series of choices, the maximizer is often less satisfied with his choice.
In general, I'm that guy; and I can say without a doubt that the decisions I'm most happy with are the ones I make in an instant.
Complete rationality leaves you nowhere. Feelings are what make decisions work.
In the Radiolab episode that inspired this post,
Antoine Bechara, a psychology professor at USC
tells the story of Elliot, a man who became completely rational after having a tumor removed from his brain. He had no mental impairment
that anyone could detect, but he had an impossible time deciding between the simple things in life: whether to use a blue pen or a black one; or which
cereal to eat in the mornings.
Neither feelings nor emotion seemed to play any part in Elliot's decision-making, and he could never decide on anything.
He ended up divorced, lost his job, and lost his life savings. After he became entirely rational, his life fell apart.
This segment shows us that gut feelings are shorthand averages of past wisdom, a theme Andy Hunt discusses in
Pragmatic Thinking and Learning:
L-mode is the analytical, rational process in our brain, but
We want to use R-mode more than we have because the R-mode
provides intuition, and that's something we desperately need in
order to become experts. We cannot be expert without it.
The Dreyfus model emphasizes the expert's reliance on tacit knowledge;
that's over here in the R-mode as well. Experts rely on seeing and
discriminating patterns; pattern matching is here too.
Justifying choices makes you choose pop culture.
The implication here is that pop culture is bad in some respect. That's not necessarily the case, but
when, as Malcolm Gladwell explains about the "perils of introspection," you have one group overwhelmingly
choosing this poster
over this one,
you might have a problem of the high-cultural taste variety.
It turns out, those who had to justify their choice were the ones choosing the precursor to the LOLcat. (And as Gladwell points out,
the scary part is that we ask our focus groups to explain their choices, and what we have left to purchase is what they favored.)
In that vein, I try not to think too hard about my programming. I don't mean to minimize it - after all, programmers are
wizards who
But at the same time, I believe
there are no best practices - only better practices in particular contexts.
And those "best practices" are often the stuff of pop culture programming legend.
You know - those message board posts that tell you "don't do that" instead
of answering your question? Yeah, I know you know them. I love those almost as much as the ones that tell me "that can't happen" after I've
been trying to find out why the hell it just did.
Even though having too many choices can cause you to make bad decisions - and the wide-open-world of programming has infinitely many choices -
you need to find and focus-in on the decisions worth making. It's quite alright to have shortcuts that help you narrow down the
options to choose from. Given the evidence, it seems like a good idea.
But let the R-mode part of your brain and your expertise be the guiding factor in your choice in eliminating extraneous options, not a
reliance on pop culture. If you have to stop to justify your decision, it's likely you're
reverting to the pop-culture decisions that have you appreciating charlatans over masters of their craft.
How have you noticed choice impairing you as a programmer? How has it enabled you?
Posted by Sam on Jul 16, 2008 at 12:00 AM UTC - 5 hrs
When I was studying practice test questions for
Exam 70-431 last week, the
type of questions and answers I read led me to the thought that certifications attempt to
commodify knowledge and use it in place of thought.
Not that this is a novel concept. I've always held a bit of disdain in my heart for certifications. Just as well,
it seems like most of the IT industry agrees with me (or, at least those elite enough to be writing
on the Internet).
After that, I wanted to find out what people thought about certifications in general. I didn't expect that almost
everyone would be
talking about how
worthless they are.
Nearly everyone questions their legitimacy.
But now, "We're settling into a pattern where a relatively small percentage of IT workers will need a certification to work in the area that they're in."
One of those areas is high-end IT architecture, and there are several certs that are profitable, in his view. They include the architecture-related certs offered by Cisco, Oracle, Microsoft, IBM, and EMC.
A certification like one of these can make or break a job interview.
Any hint that companies are de-emphasizing the technical certification process is troubling, because it could so easily have a negative effect on professional standards. I'm as big a believer as anyone in the importance of on-the-job training and real-world experience, but that doesn't begin to obviate the need for a formal certification program.
In a world where risk management and business continuity planning are essential pursuits for any healthy IT organization, every asset needs to be defined and monitored, and that includes technical skills. There's no better way to accomplish that than through a consistent, well-conceived means of documenting who has what skills. And that means certification.
Certainly if a certification helps you
get a job, it is monetarily worth it, whereas if it doesn't, then it's not. Just as well, some employers
want to see certifications while others do not. Some potential clients desire business partners to
be certified in the technologies that interest them. Others don't care.
So let's forget about "worth" as a measure of dollars for a moment.
Is it possible that conventional wisdom is wrong? Is it possible certifications can be worthwhile, personally or professionally?
It's simple to see that certifications are only worth the legitimacy conferred upon them by the community
as a whole. If the larger community sucks as much as we often say it does, that doesn't bode well for
certifications among elites. But the fact that so many people are making so much money in the certification
industry says that the community at large gives respect to the process.
So why do we see so many people questioning that process? (We see the same phenomenon in questioning
the value of a university degree, to be fair.)
The weak might bitch because they're lazy. The elite think they're above it, and that the tests focus on
too low a common denominator, if not the lowest. What about the rest of us?
I wasn't terribly excited after having passed the exam I took on Saturday. I certainly didn't feel like I
was a changed man. I know it helps out our company in becoming a Business Intelligence
partner of Microsoft, but I was less than enthusiastic.
Since, according to Microsoft, candidates for the exam should be experts in "wizard usage," I thought
I was an expert in Wizardry.
My reaction was a bit tongue-in-cheek. In fact, I do feel I learned something from the process - at
the minimum I had to know much more about scaling and back-up schemes than I would have known before-hand.
So I feel it was personally valuable in those respects.
Still, part of me feels the certification process is an attempt to commoditize knowledge and use it in place of thought.
So now I'll turn it on you:
Do you have any certifications? (Why, or why not?)
How do you feel about certifications? Why do you feel that way?
Today we'll start looking at the "Maintaining Your Edge" section of the book.
Chad starts off by explaining what can happen if you get too comfortable: Tiffany.
Not that Tiffany, this one:
Do you remember a pop star named Tiffany (no last name) from the 1980s? She was in the top of the top forty, and a constant sound on the radio back then. She enjoyed immense success, becoming for a short time a household name.
...
Apparently, if she tried, she didn't move fast enough to hold the affection -- or even the attention -- of her fans. When the tastes of the nation turned from bubble gum to grunge, Tiffany suddenly became obsolete.
The point remains: you need to stay sharp. You cannot sit back and become complacent. Doing so in this industry can cause you to become extinct. And you'll probably be less famous than Tiffany or the Dodo. However, Didus ineptus may end up describing you well.
One thing you can do to stay sharp is recognize that, relative to information growth, your knowledge and skill levels are deteriorating rapidly. The consequence of what Gordon Moore observed in 1965 is that new possibilities for computation arise at an astounding rate.
That graph looks linear, so what's the big deal? Look at the left - it's a logarithmic scale. That graph really looks like this:
That slope is so high it's almost negative.
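If the log-scale trick isn't obvious, a few lines of arithmetic show it. This is a rough sketch, assuming the commonly cited two-year doubling period and a made-up starting count of 1,000. The function name `transistor_estimate` is invented for the example.

```python
import math


def transistor_estimate(start_count, years, doubling_period=2.0):
    """Projected count after `years`, assuming one doubling per `doubling_period` years."""
    return start_count * 2 ** (years / doubling_period)


for years in (0, 10, 20, 30, 40):
    count = transistor_estimate(1_000, years)
    # log10 rises by the same amount each decade: a straight line on a log axis.
    print(f"{years:>2} years: {count:>15,.0f}   log10 = {math.log10(count):.2f}")
```

Each decade multiplies the count by 32, so forty years of two-year doublings is a factor of about a million. On a logarithmic axis that is a modest straight line. On a linear one it is a wall.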
You can't keep up with everything - but you can't afford to be late to the party when it comes to new trends in development either. If you were a desktop application programmer in 1992 and didn't look up until 2002, you'd probably say a few WTFs, and then start drowning in all the information you'd need to get started programming web applications. There's a lot to learn in new paradigms.
So you need to anticipate changes. You might not be able to jump the gun on the next big thing, but at worst you'll have augmented your arsenal, and you can stay close to other trends as well. Reading blogs and staying current in news and even journals can help you find new, up and coming developments. Thinking about how things will change and backing your hypotheses up with evidence from the literature can be a worthwhile activity in that regard.
Looking ahead and being explicit about your skill development can mean the difference between being blind or being visionary.
Know that you'll be obsolete. Don't accept obsolescence.
How do you deal with the pressure to stay current? What new things are you learning?
Posted by Sam on Jul 30, 2008 at 12:00 AM UTC - 5 hrs
I put faith in web application development as an income source like I put faith in the United States Social Security system. That is to say, it's there now, but I don't expect to be able to rely on it in its current incarnation very far into the future.
Java is mainly used in Web applications that are mostly fairly trivial. If all we do is train students to be able to do simple Web programming in Java, they won't get jobs, since those are the jobs that can be easily outsourced. What we need are software engineers who understand how to build complex systems.
Although Dewar was speaking in terms of Java, the statement applies to the broader world of web apps (and many desktop apps) in general.
That property is precisely what allowed frameworks like Rails and Django to come into existence and get popular.
Soon enough, the money will dry up for implementation because it's too easy to generate solutions for most problems you'll encounter - either using a framework, or a content management system like Sharepoint or Joomla, or even by hiring someone to generate it for you. Yesterday I recommended a potential client just go the CMS route.
Nowadays, most of the skill involved in writing web applications amounts to gluing the disparate pieces together. How long until someone figures out how to commoditize that? Instead of knowing only how to implement solutions to problems, you need to be skilled at problem solving itself.
Right now, you might be in a position where you can kick back and count your money while you smoke a cigar.
But if you're in the business of building web applications and you're not innovating new kinds of them, you're doomed. You can chase vertical after vertical and keep building the same apps for quite a while, but if you don't get into generating them, you're on the way out as people come in who can do it cheaper than you and with higher quality.
Generation is to web apps as prefabricated steel buildings are to construction. Except almost no one cares if their web application was generated or not - they just want the lower price.
I suspect that even if you are generating applications, at some point in the future, the number of applications needing to be generated will not have grown as quickly as the number of people who can generate them.
People are building complex data warehouses and doing analysis and reporting on them with GUIs and Wizards right now. You still need the knowledge pertaining to data warehouses, but that knowledge is becoming easier to obtain for more people with less effort than ever before. That trend, which fits in with the general trend of information democratization, is unlikely to reverse itself.
If you don't plan for change now, you'll end up shocked.
And then how long until you're pulling out the cloth-eared elephant?
What Dewar said is true: Web applications are mostly fairly trivial. To survive,
you need to learn the fundamentals so you are applicable in various kinds of programming and for different platforms. If you really want to be safe, you need to be innovating, not building copy-cat applications with a twist (and especially not from scratch?!?!).
Every programmer should also read Chad Fowler's "My Job Went To India" book, where he explains that as larger and larger numbers of programmers adopt a particular skill, that skill becomes more and more a commodity. Rails development becoming a commodity is really not in the economic interest of any Rails developer. This is especially the case because programming skill is very difficult to measure, which - according to the same economics which govern lemons and used-car markets - means that the average price of programmers in any given market is more a reflection of the worst programmers in that market than the best. An influx of programmers drives your rates down, and an influx of incompetent programmers drives your rates way the fuck down. (Bold emphasis mine)
The problem, in my view, is that the influx of incompetent programmers is inevitable.
So building well-known applications with twists becomes much like the would-be artist who looks at Pablo Picasso's work and says, "I could do that."
The obvious exception is that applications are not (usually) like art. Well-made knockoffs of the original aren't likely to be differentiable by customers from the cheap knockoffs, so the masses of incompetents and maybe the original end up defining the market in the long term.
After you've seen it, you could do this:
To which we all respond, "But you didn't, did you?"
As always, I welcome your thoughts in the comments below.
For one, if I'm right, it's something which no one wants to hear. Further, if what I said is correct, and it's a novel concept, most people will not yet be of that opinion. Backlash would occur naturally. On top of that, there's always the possibility that what I said is completely asinine.
Despite my expectations of imminent flaming, however, the people who responded raised some excellent points, which I'd like to address here, taking the opportunity they presented to clarify my initial thoughts.
First, it would be helpful to answer the question, "what do I mean when I say 'mostly fairly trivial?'"
Mostly: Most web applications. Fairly: For each web application included in "mostly," it is trivial to a reasonable degree.
Put together, I would say "most parts of most web applications are trivial."
I must have spent too much effort putting the focus on application generation, because by far, the biggest objections were there:
Barry Whitley noted,
The be-all end-all self building framework/generator has been the holy grail of software development since its inception, and it isn't really much closer to achieving that now than it was 20 years ago.
Along those same lines, Mike Rankin brought up CASE tools:
Anybody remember CASE (Computer Aided Software Engineering)? It was sold as the end of software development. To build an application, a layperson would just drag and drop computer generated components onto a "whiteboard" and connect them up by drawing lines. CASE was THE buzzword in the late eighties. 20 years later, it's nowhere to be found.
Software development will become a complete commodity the moment business decides to stop using their systems as a way to gain a competitive advantage.
To be clear, although I believe some applications can be entirely generated, I don't pretend that anywhere near most of them can. However, I do think that most parts of most web applications fall into that category.
Getting back to programming-by-wizard, at one point (very early) in (what you might call) my career, I programmed in G. This consisted mostly of connecting icons in what amounted to creating ridiculously complex flowcharts.
I think that's close to what many people envision as the "programmer killer" when they hear someone saying there will be one. But having used that style for a couple of months, I can issue my assurances that won't be it.
In fact,
as Mike said, since competitive advantage dictates businesses will continue innovating processes that will need to be codified in software, it's guaranteed there will always be software to write.
It's not a question of whether there will be software to write - it's a question of how much of it there is to go around for how many programmers, what skill level those programmers need to be at, and what those applications will look like and run on.
Whereas a decade ago a team of programmers might have built an application over several months, we're at a point now where a single programmer can build applications of similar scope in days to weeks. We've even got time to add all the bells and whistles nowadays.
Within an application, we need fewer programmers than we did in the past. To stay employed, you need to learn how to use the tools that abstract the accidental complexity away, in addition to learning new types of things to do.
Barry puts it well:
As for the skills required, I'd actually argue that the workplace is demanding people with MORE skills than ever before. There is a lot of crap work for sure, and that market is dying out. For companies that want to be serious players, though, the demands are higher than they've ever been.
Indeed. That's what I'm talking about. The repetitive tedious stuff is going to be generated and outsourced. But there are shitloads of people still doing the tedious stuff. And there are shitloads of capable programmers who can glue the rest of it together.
We don't even need to get into the discussion of what will supplant the web and the number of jobs that will need to move around. The marketplace won't support us all at our high salaries. To be around in the future, you're going to need to do a better job of coping with change than the mainframe and green-screen programmers who won't find a job now. You're going to need to be capable of picking up new technologies, and knowing the principles behind them will help you do it. Knowing how to build and design complex systems to solve complex problems is where you'll need to be. This is in contrast to being given specs and translating them into the newest fad-language. That's what Professor Dewar was getting at, and that's what I'm getting at.
I don't expect most of the readers here will need to worry. Not because of anything I've done, necessarily, but because it seems like most of you embrace change.
To be fair, it may be more enjoyable, but it might not be as profitable - at least that's what Chad Fowler talks about in this week's chapter from My Job Went to India, "Make Yourself a Map."
Your personal product road map is what you use to tell whether you've
moved. When you're going to the same office day in and day out, working
on a lot of the same things, the scenery around you doesn't change. You
need to throw out some markers that you can see in the distance, so you'll
know that you've actually moved when you get to them. Your product
"features" are these markers.
Unless you really lay it out and make a plan, you won't be able to see
beyond the next blip on the horizon. In Chapters 2 and 3, you discovered
how to be intentional about your choice of career path and how to invest
in our professional selves. Though I focused on what seemed like a onetime
choice of what to invest in, each choice should be part of a greater
whole. Thinking of each new set of knowledge or capability as equivalent
to a single feature in an application puts it in context really well. An
application with one feature isn't much of an application.
What's more, an application with a bunch of features that aren't cohesive
is going to confuse its users... A personal product road map can not only
help you stay on track, constantly evolving, but it can also show you the
bigger picture of what you have to offer...
While it's definitely OK to learn diverse skills -- it expands your thinking --
it's also a good idea to think about the story your skill set tells. (Bold emphasis applied by me.)
For a couple of vacations I've taken in the past, I spun a pen on a map and drove to where it pointed the same night (up to 15 hours away). So far, my career map looks the same: as if a monkey tossed darts at a bunch of options and I decided to follow whatever the darts landed on.
I'm mostly a web developer - in the sense that I derive most of my income, write most of my code, and spend most of my time writing code that will in some way show up on the web or affect something that will show up on the web.
But I am also interested in, and spend significant time programming and studying artificial intelligence and machine learning, bioinformatics, and game development. I'm also interested in business for the sake of business (though I only occasionally write about it here). I enjoy writing desktop software as well (though I rarely have done so).
AI and game development dovetail nicely with each other. There are a lot of similarities between and overlap in algorithms for bioinformatics and AI. But short of creating a bioinformatics game on the web, it's hard to imagine where all these skills and interests intersect.
Perhaps it would be better for me to try and create a coherent picture out of the skills I choose to learn. But I rather enjoy having my hands and mind roam freely.
How's your skill set? Is it too focused, where you might need some breadth, or do you have a bit of a programmer dissociative identity, where some cohesion could take you a long way?
I need an MP3 library. Instead of seeing that there's no file to load, wouldn't it be great if the editor
tried to find it?
All that needs to happen is that we have an index that checks for includes and references. If it doesn't find
it in the standard library or any installed libraries, it goes to the index to find possible matches. If there's
only one, it downloads it and continues. If more than one exists, it might ask you which one you want
to download and include.
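The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `PACKAGE_INDEX` is a hypothetical in-memory stand-in for a real remote index, the module and package names in it are invented, and `choose` stands in for the editor asking which match you want. The sketch stops at deciding what to fetch and does not download anything.

```python
import importlib.util
from typing import Callable, Optional

# Hypothetical stand-in for a remote index: module name -> candidate packages.
PACKAGE_INDEX = {
    "mp3_demo": ["mp3-tools", "tinymp3"],
    "yamlish_demo": ["yamlish-demo"],
}


def is_installed(module_name: str) -> bool:
    """True if the module is importable from the standard library or installed packages."""
    return importlib.util.find_spec(module_name) is not None


def resolve(module_name: str, choose: Callable[[list], str]) -> Optional[str]:
    """Decide which package (if any) the editor should fetch for `module_name`.

    Returns None when nothing needs downloading or the index has no match.
    """
    if is_installed(module_name):
        return None  # already available, nothing to do
    candidates = PACKAGE_INDEX.get(module_name, [])
    if not candidates:
        return None  # the index cannot help either
    if len(candidates) == 1:
        return candidates[0]  # unambiguous: fetch it and carry on
    return choose(candidates)  # several matches: ask the programmer
```

For example, `resolve("mp3_demo", choose=lambda options: options[0])` picks the first of the two invented candidates, while `resolve("json", ...)` returns `None` because the module is already there.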
This isn't limited to Ruby. In fact, I'd love it more in Java and .NET. I can't count the number of times
on those platforms where I've looked up how to do something, only to be denied by the fact they didn't mention
the appropriate package or namespace to use.
Automatic Parallel Programming
Around the same time, I also thought it would be nice to have compilers and interpreters decide when concurrency
would be appropriate:
This can get really tricky. In fact, we don't really want it to be automagical detection. There
are some cases where it could happen, as far as I can tell. However, it's not worth the apprehension we'd feel if
we didn't know when the compiler or interpreter was going to do so.
But there are plenty of cases where it is possible. I've been in several of them lately. Even setting
those aside, instead of
typing the boilerplate to make it happen, I really want something almost automagic:
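A minimal sketch of what that opt-in style could look like, written in Python rather than whatever syntax was originally imagined. `parallel_map` is a hypothetical helper, not an existing language feature. It hides the thread pool boilerplate behind one call, and the programmer still decides where concurrency happens.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(fn, items, max_workers=4):
    """Apply `fn` to each item on a thread pool, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))


def word_count(text):
    """Toy workload; real candidates would be independent, slow-ish calls."""
    return len(text.split())


documents = ["one two three", "four five", "six"]
counts = parallel_map(word_count, documents)
```

This only works when the calls are independent of each other. In CPython, threads mostly pay off for I/O-bound work, so CPU-heavy functions would likely want `ProcessPoolExecutor` instead.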
Daniel mentions Google, "one of the most open-minded and developer friendly companies around," and points out that they have a strict limit in languages to use: Python, Java, C++, and JavaScript. He also says,
To my knowledge, this sort of policy is fairly common in the industry. Companies (particularly those employing consultants) seem to prefer to keep the technologies employed to a minimum, focusing on the least-common denominator so as to reduce the requirements for incoming developer skill sets.
We're afraid of being eaten by the poly-headed polyglot monster.
Google avoids it by sticking to a limited set of languages. I don't work at Google, so I can't claim to know what their architecture looks like in terms of blending languages within applications. But the act of limiting languages does not in itself preclude polyglot programming. Choosing to limit yourself to one language, or one language per application, would do so. Likewise, choosing C# and Java as your languages would probably nullify most of the benefits. But the point is, just because you allow or encourage polyglot programming does not mean you let everyone choose whatever language suits them and then throw it all together in a melting pot, praying that everything works well together. You can have a method to sort out the madness.
In any case, what's going on with polyglotism, and how do we implement it?
Let's start with the impolite question: Should one really care about multilanguage support? When this feature was announced at .NET's July 2000 debut, Microsoft's competitors sneered that it wasn't anything anyone needed. I've heard multilanguage development dismissed, or at least questioned, on the argument that most projects simply choose one language and stay with it. But that argument doesn't really address the issue. For one thing, it sounds too much like asserting, from personal observation, that people in Singapore don't like skiing. Lack of opportunity doesn't imply lack of desire or need. Before .NET, the effort required to interface modules from multiple languages was enough to make many people stick to just one; but, with an easy way to combine languages seamlessly and effortlessly, they may -- as early experience with .NET suggests -- start to appreciate their newfound freedom to mix and match languages.
Even then, the web was all about polyglot programming: we see ActionScript, JavaScript, XML, SQL, and #{favorite_application_server_language} already.
Applications of the future will take advantage of the polyglot nature of the language world. We have 2 primary platforms for "enterprise" development: .NET and Java. There are now lots of languages that target those platforms. We should embrace this idea. While it will make some chores more difficult (like debugging), it makes others trivially easy (or at least easier). It's all about choosing the right tool for the job and leveraging it correctly. Pervasive testing helps the debugging problem (adamant test-driven development folks spend much less time in the debugger). SQL, Ajax, and XML are just the beginning. Increasingly, as I've written before, we're going to start adding domain specific languages. The times of writing an application in a single general purpose language is over. Polyglot programming is a subject I'm going to speak about a lot next year.
Since Meyer's article, we have indeed seen an explosion of languages running on both "enterprise" platforms. Interest seems to have really grown recently. Because of that, it's getting easier to integrate languages. But as Neal Ford noted, we're still in the beginning (if you believe that paradigm is going to take hold).
The first layer is what I called the stable layer. It's not a very large part of the application in terms of functionality. But it's the part that everything else builds on top off, and is as such a very important part of it. This layer is the layer where static type safety will really help. ...
The second layer is the dynamic layer. This is where maybe half the application code resides. The language types here are predominantly dynamic, strongly typed languages running on the JVM, like JRuby, Rhino and Jython.
...
The third layer is the domain layer. It should be implemented in DSL's, one or many depending on the needs of the system. In most cases it's probably enough to implement it as an internal DSL within the dynamic layer, and in those cases the second and third layer are not as easily distinguishable. But in some cases it's warranted to have an external DSL that can be interacted with. A typical example might be something like a rules engine (like Drools).
Let's talk briefly about a web application, as I envision it using Ola's layers as a guide:
In the stable layer, you might be talking about a banking application where transactions are performed in Java. You might have some tool set that performs important calculations, and which other applications depend on. It might be the parts of the application that really need to perform well.
In the dynamic layer, you might have code that integrates the static and domain layers. You might have code which glues together other stuff, or things which don't really turn into bottlenecks for the application, which aren't necessarily part of the application domain, but which don't require the ceremony of the stable layer. You're choosing to use a dynamic scripting language to increase productivity. You might define the DSL abstractions here, and use them in the DSL/domain layer.
The domain layer is where you'll see the meat of your application code: it's where all your business object abstractions will reside, and they will in turn depend on the other two layers. It should consist entirely of code that uses business language.
Ola may have meant something else entirely, or just slightly different, but that's the way I've come to see it. The concept is still new, so there's nothing saying that vision of a "best practice" design will win out over ad hoc solutions that throw modules written in different languages together to form an application.
How I'm Using Polyglot Programming
How am I using polyglot programming in my day-to-day programming? Aside from the JavaScript/AppLanguage/SQL web paradigm, there are a couple of ways I'm doing it.
In one of my jobs, we do a lot of work with Sharepoint. In fact, the bulk of the applications are built in Sharepoint: to manage documents, content, data and get all the integration with other Office products. However, Sharepoint is a beast. Often, doing something that would be trivial on another platform is an arduous task in Sharepoint. So we have to drop out of it.
In some cases, when we've needed the feature to reside within Sharepoint itself, we've opted to use C# within ASP.NET to create Sharepoint Web Parts. For the next one of these I come up against, I intend to look at using IronRuby and deriving from the appropriate parent classes.
In other cases, we're building stand-alone "sub-applications" that integrate with Sharepoint and Active Directory. For these, we have some utility classes written in C#, while the "sub-application" is built in Ruby on Rails. Soon, we'll be looking at running them on IronRuby, and taking further advantage of the .NET platform.
In my job with the bioinformatics lab at the University of Houston, things are a bit different. I'm building command line applications. After a machine takes photos of your DNA sample, there is a genome analysis pipeline from Illumina that analyzes the images, produces nucleotide sequences, and tries to align them to a reference sequence. That utilizes Perl, Python, and C/C++, though I don't know to what extent or in which "layers" each language does its job.
On my end, I'm using C++ to do the heavy lifting in analysis in proprietary libraries we've been producing to do our own analysis and post-analysis. When you're dealing with genome-size data sets, from 5 mega- to 3 giga-bases, even constant reductions in run-time can mean the difference of days of computation in algorithms with above-linear time complexity. But I'm also building scripts in Ruby that help piece it all together, and doing some analysis with it for linear-time algorithms. It's very ad hoc, in this instance.
Finally, my game development group plans to use C/C++ to forge our engine, while using one or more scripting languages to fill out the game details. As I understand it, much of the game industry already employs polyglot programming in this manner.
So that's how I'm doing it at the moment.
In 2008, the polyglot programming paradigm is still (at least in my opinion) a relatively novel concept. There aren't a lot of resources to show us how to do it the right way. As time moves forward and people publish their experiences, that should change.
On that note, do you have any stories about how you've done it, or any ideas on how you would if you were going to do it? Please feel free to share them - you'd be doing us all a favor.
Posted by Sam on Aug 22, 2008 at 12:00 AM UTC - 5 hrs
Some people call them fat pants. Some people call them stretch pants. Others might call them exercise pants or sweat pants. Whatever you call them, they're comfortable to wear. The problem with sweat pants is the reason they're comfortable is because they're big and expandable. And that expandability means they have a lot of room to accommodate growth as well.
In fact, some people wear them when they know they'll be eating a lot, just for that purpose.
It's hard enough to know when you're getting fat - after all, you're you and it's a slow process, so you wouldn't notice unless your pants got tighter or you were reviewing your weight on the scale regularly.
It's even harder to notice when you're wearing sweatpants. You can go for years - growing and growing ...
And before you know it, you've not just beat anorexia, you've pwned it.
As with the other chapters in this section of My Job Went To India, "That Fat Man in the Mirror" boils down to refusing to let yourself become comfortable in where you're at personally (not geographically) as a programmer. In this case, you need to put away your programming skill fatpants, and periodically review your skills. Take an inventory. Better yet, have someone else evaluate you. Sweats may allow for growth, but it's not the good kind.
An easy way to measure your progress is to use a trusted third party. A
mentor or a close colleague doesn't live in your head with you and can
help give you a more objective look at where you stand. You might discuss
your abilities as a software developer, project leader, communicator, team
member, or any other facet of the total package that makes you who you
are. (Chad Fowler, pgs. 155-156 of My Job Went to India)
If you're a bit uncomfortable asking someone to help in that way, you should make
use of review-time at your company (if there's such a thing where you're at).
If your company has such processes in place already, don't write them off
as HR nonsense. Take them seriously and make good come out of them.
Keep it written down and revise and review often, Chad says.
That sounds like solid advice to me. I got started with some goals earlier in the year, and had planned to periodically review them here on the weblog. But that hasn't happened, so it's something which I need to put more effort into.
Don't get lethargic about your skills. Instead, take off the fatpants and actively evaluate where you're at and where you need to be. Get some feedback. Otherwise, some day in the future you may end up wondering to yourself, "how did I lose my edge?"
Do you review yourself periodically? Have you used the reviews to become better?
Posted by Sam on Aug 29, 2008 at 12:00 AM UTC - 5 hrs
If you want to trap a monkey, it's not very hard. Hollow out a hole (in a coconut, the ground, or whatever)
just large enough for a monkey's hand to fit in when relaxed, but too small to pull out as a fist.
Put some food in the hole, and wait. Soon enough, a monkey will come, fall in love with the food, grab at it
and refuse to let go.
You see, monkeys value food higher than life or freedom, and their devotion to it will not allow them to let
go. Or so the story of the south Indian monkey trap goes.
(I am merely relating the parable, I have not actually tried to capture a monkey in this manner.)
In My Job Went to India, Chad Fowler's final bit of advice for keeping sharp and
up to date urges us to not allow ourselves the mental security blanket of value rigidity - or the mental crutch,
as it often turns out to be. You might not even be aware you're using one yet.
Chad tells the story of Novell's decline:
Many of us in the mid-1990s swore by Novell's NetWare platform when it
came to providing file and print services in the enterprise. Novell was way
ahead of its time with its directory services product, and those of us "in
the know" were almost cocky in our criticism of competing technologies.
Novell's product was enjoying a healthy majority in market share, and it
was hard to imagine the tide turning.
No single event made it obvious that Novell was losing to Microsoft.
Microsoft never made that magic Active Directory release that made us all
say, "Wow! Drop NetWare!" But, Netware has slowly gone from bleeding-
edge innovator to legacy technology. For many NetWare administrators,
the water was boiling before they ever even realized the pot was warm.
By allowing yourself the comfort and ease of such a mental crutch, you're doomed to keep repeating what worked in the past, even if it's not the best solution today. Before you know it, your technology of choice is no longer the soup du jour, and you're stuck knowing nothing else.
Instead of blindly advocating your technology of choice -- no matter the absurdity of that solution in the
situation -- have "strong opinions which are weakly held."
Realize "it depends" is a valid answer to programming questions.
Posted by Sam on Sep 04, 2008 at 12:00 AM UTC - 5 hrs
Outsourcing is not going away. You can delude yourself with myths of poor quality
and miscommunication all you want, but the fact remains that people are solving
those problems and making outsourcing work.
As Chad Fowler points out in
the intro to the section of MJWTI titled "If You Can't Beat 'Em", when a
company decides to outsource - it's often a strategic decision after much deliberation.
Few companies (at least responsible ones) would choose to outsource by the seat of their pants, and then change their
minds later. (It may be the case that we'll see some reversal, and then more, and then less, ... , until an equilibrium is reached - this is still new territory for most people, I would think.)
Chad explains the situation where he was sent to India to help improve the offshore team there:
If [the American team members] were so good, and the Indian team was so "green," why the hell
couldn't they make the Indian team better? Why was it that, even with me
in India helping, the U.S.-based software architects weren't making a dent
in the collective skill level of the software developers in Bangalore?
The answer was obvious. They didn't want to. As much as they professed
to want our software development practices to be sound, our code to be
great, and our people to be stars, they didn't lift a finger to make it so.
These people's jobs weren't at risk. They were just resentful. They were
holding out, waiting for the day they could say "I told you so," then come
in and pick up after management's mess-making offshore excursions.
But that day didn't come. And it won't.
The world is becoming more "interconnected," and information and talent cross borders more easily than they have in the past.
And it's not something unique to information technologists - though it may be the easiest to pull off in that realm.
So while you lament that people are losing their jobs to cheap labor and then demand higher minimum wages, also keep in mind that you should be trying to do something about it. You're not going to reverse the outsourcing trend with
any more success than record companies and movie studios are going to have stopping peer-to-peer file sharing.
That's right. In the fight over outsourcing, you, the high-paid programmer, are the big bad RIAA and those participating in the outsourcing are the Napsters. They may have succeeded in shutting down Napster, but in the fight against the idea of Napster, they've had as much strategic success as the War on Drugs (that is to say, very little, if any). Instead of fighting it, you need to find a way to accept it and profit from it - or at least work within the new parameters.
How can you work within the new parameters? One way is to "Manage 'Em." Chad describes several characteristics that you need to have to be successful with an offshore development team, which culminate in a "new kind" of
PM:
What I've just described is a project manager. But it's a new kind of project
manager with a new set of skills. It's a project manager who must act at
a different level of intensity than the project managers of the past. This
project manager needs to have strong organizational, functional, and technical
skills to be successful. This project manager, unlike those on most
onsite projects, is an absolutely essential role for the success of an offshore-developed
project.
This project manager possesses a set of skills and innate abilities that are
hard to come by and are in increasingly high demand.
It could be you.
Will it be?
Chad suggests learning to write "clear, complete functional and technical specifications," and knowing how to write use cases and use UML. These sorts of things aren't flavor-of-the-month with Agile Development, but in this context, Agile is going to be hard to implement "by the book."
Anyway, I'm interested in your thoughts on outsourcing, any insecurities you feel about it, and what you plan to do about them (if anything). (This is an open invitation for outsourcers and outsourcees too!) You're of course welcome to say whatever else is on your mind.
Posted by Sam on Sep 12, 2008 at 12:00 AM UTC - 5 hrs
If we accept the notion that we need to figure out how to work with outsourcing
because it's more likely to increase than decrease or stagnate, then it would be beneficial for us to become
"Distributed Software Development Experts" (Fowler, pg 169).
To do that, you need to overcome challenges associated
with non-colocated teams that exceed those experienced by teams who work in the same geographic location.
Chad lists a few of them in this week's advice from
My Job Went To India (I'm not quoting):
Communication bandwidth is lower when it's not face to face. Most will be done through email,
so most of it will suck comparatively.
Being in (often widely) different time zones means synchronous communication is limited to few overlapping
hours of work. If you get stuck and need an answer, you stay stuck until you're in one of those overlaps.
That sucks.
Language and cultural barriers contribute to dysfunctional communication. You might need an accent-to-accent
translator to desuckify things.
Because of poor communication, we could find ourselves in situations where we don't know what each other
is doing. That leads to duplicative work in some cases, and undone work in others. Which leads to
more sucking for your team.
The bad news is that there's a lot of potential to suck. The good news is there's already a model
for successful and unsuccessful geographically distributed projects: those of open source.
You can learn in the trenches by participating. You can find others' viewpoints on successes and
failures by asking them directly, or by reviewing
open source project case studies.
Try to think about the differences and be creative with ways to address them.
Doing that means you'll be better equipped to cope with challenges inherent
with outsourced development. And it puts you miles ahead of your bitchenmoaning colleagues who end
up trying to subvert the outsourcing model.
There are plenty of potential solutions out there, but none of them worked for me. I would get the scrolling working, and then the
draggable would move away from the cursor. I'd get it in sync with the mouse cursor and the scrolling would get crazy again.
I'd fix that and then no matter where I dropped it, if the div had been scrolled, dropping would fail.
Here's how I fixed the problem:
In the Draggable#updateDrag function (~ line 356), on the first line, I changed the value of the pointer argument to
take into account how much the container had scrolled:
pointer = new Array(pointer[0] + this.options.scroll.scrollLeft, pointer[1] + this.options.scroll.scrollTop);
At least one of the solutions I recall seeing mentioned this.
In the same function, I changed the first two elements in the p array before the last two elements get
pushed onto it:
p = new Array(p[0] + this.options.scroll.scrollLeft, p[1] + this.options.scroll.scrollTop);
This also just takes into account how far the container has been scrolled.
To ensure my droppables were able to receive the draggables given the adjusted coordinates, we need to adjust the
scroll position just as we did above. First, I adjusted the code in my webpage that produces the droppables and
added a scroll parameter that should be the name of the element that scrolls (the same parameter the draggable accepts):
Since Droppable elements don't generally take a scroll option, we'll need to modify that code in Scriptaculous's dragdrop.js
file as well. In the Droppable#fire function (~ line 109) add the following lines under Position.prepare();:
var point = [Event.pointerX(event), Event.pointerY(event)];
if(this.last_active.scroll){
point[0] += $(this.last_active.scroll).scrollLeft;
point[1] += $(this.last_active.scroll).scrollTop;
}
Finally, just underneath that where it calls this.isAffected, change the first parameter from
[Event.pointerX(event), Event.pointerY(event)] to use the variable we created above, point.
That should be it. If you've tried the above and still get problems, feel free to leave a comment below, or
contact me and I'll do my best to help out.
I haven't submitted a patch because I didn't check to see that this was a general solution. It seems like it should be, but
without testing it outside my intended usage, I don't think it'd be accepted anyway. Quite frankly, I'm not thrilled about
adding a new option to droppables, but it seemed like the simplest route to fix my problem at the time.
Yes, I tried setting includeScrollOffsets to true and using Position#withinIncludingScrolloffsets
in Prototype, and that failed for me too.
Posted by Sam on Mar 05, 2010 at 10:14 PM UTC - 5 hrs
You might think that "tech support" is a solved problem. You're probably right. Someone has solved it
and written down The General Procedures For Troubleshooting and How To Give Good Tech Support.
However, surprisingly enough, not everyone has learned these lessons.
And if the manual exists, I can't seem to find it so I can RTFthing.
The titles of the two unheard of holy books I mentioned above might seem at first glance to be
different tales. After all, troubleshooting is a broad topic applicable to any kind of
problem-solving from chemistry to mechanical engineering to computer and biological science.
Tech support is the lowliest of Lowly Worms for top-of-the-food-chain programmers.
(And don't ask me how sad it makes me feel that my favorite book as a kid has only a 240px image online. I need to find my copy and scan it.)
But just like its more enlightened brethren, tech support consists of troubleshooting. In fact, it should be
the first line of defense to keep your coders coding and off the phone. Who wants them to man the phones?
Certainly not the programmers. Certainly not management. Tech support is a cost center, not a customer
service opportunity.
Perhaps when you have a virtual monopoly over a market like most cable companies or utilities in a given locale,
you can afford to have poor customer service. The cable sphere seems to be opening up, what with satellite TV and internet
and now AT&T and Verizon offering television and decent-to-good internet packages.
Even still, AT&T's UVerse has its own problems, I've heard,
and (at least personally) I've not witnessed the kind of customer service that competition promises with regards to cable TV and internet access.
The fact is we tend to treat support like a second class citizen. It's a position we want to fill with a minimum-wage worker (or less, if we
can outsource it) who has no expertise, no clue, and doesn't care to learn the
product since he can get a job in the fast food industry at about the same rate. And with no stress!
It makes it worse that we don't even want to take the time to train him, since it would take away from the productive code-writing time to do so.
The person we want to treat as an ape or worse always seems expendable. We treat them so. Should they be?
I say no. Not only am I a big fan of dogfooding,
I feel like Fog Creek's
giving customer service people a career path nowadays
matches a lot of my ANSI artist peers' experiences
from back in the day. Smart people start in support, and they can move themselves up in the organization to play more "key roles."
I don't think it needs to be a full-time thing, but it certainly helps if programmers are their own support team.
Like Bruce Johnson who posted that linked message, I work on a small team and can vouch: it's downright embarrassing to have to support
our customers. I'm glad to do it, but when it happens, more than likely I've got to take blame for the problem I'm dealing with.
You know how hard I try to make sure my code works as expected before I deploy it?
"OMG I'm sorry, that's my fault, I'll fix it for you right away." Can you get better support than that?
I'm not so sure I'd have tried that hard without the customer experience pushing me.
I think I've made my first point: that customer support is customer service, and that it is important to the health of your business.
While I agree that tech support in the common use of the term is useful to shield your programmers from
inane requests, I also recognize the value in having programmers take those calls from time-to-time.
Given that, I do in fact have some do's and do-not's with regards to support. The list here deals mostly with
how to be a good support technician for your team, as opposed to the customer. Still, the customer is
central to the theory.
Although it does not make an exhaustive list, here are four contributions to The General Procedures For Troubleshooting:
After listening to the problem description, the first thing to do is recognize whether or not you can
solve the problem while the customer is on the phone, or if anyone can. If you can, then do it. If you think
only someone else can do it, and work for an organization that has multiple levels of live-support, then escalate it.
If you don't think solving the problem is possible without escalating it to a level of support that won't get
to it immediately, thank the customer for reporting the issue, let them know the problem is being worked on,
and boogie on to step 2.
As support, the first thing you need to do before escalating the issue is confirm there is an issue, and do it with a test account, not the user's.
It's ridiculous to ask for the user's
credentials. Don't do it. If someone were to ask you, "What is your username and password?" what would you think?
The average user isn't going to know your query is tantamount to stupidity, but if you get someone who is slightly
security-conscious, you're going to lose a customer. Hopefully, he's not a representative of your
whale.
If anyone finds out that you're in the habit of asking users for their passwords, they can easily call anyone
who uses your software and get in by just asking. Further, since many people use the same password for everything
or many things, that person would also have access to your customers' other sensitive information, wherever it resides.
You can point the blame at your stupid customer for using the same password everywhere they go all you want. You're being
just as stupid by opening the door for that type of attack. Further, you should always try to recreate and fix the problem
with as little inconvenience to the user as possible. That means doing it with test accounts as opposed to asking the
user for theirs, or changing their information.
Keep things simple for the user. Don't jump immediately to using their time to make things easier on the support team.
Doing that is lazy at best, sloppy most of the time, and could result in disaster at worst.
After confirming the existence of the problem, provide the steps of how to reproduce it. Give some screen shots.
If it's a web app, provide links. Don't constantly send an email and then ask the higher levels about it. Doing so once or twice is
one thing, but doing it for every request is a time-waster. Just send the email and the next level will get to it
when they can. If they don't get to it within the acceptable time-frame for your organization, send a reminder.
Include the boss if you need to. But don't do that prematurely (and that's another subject altogether).
Don't jump to conclusions about the source of the problem.
Although Abby Fichtner wasn't speaking
directly to support ...
... This is the opposite of my general approach. The parallel here is code : customer :: you : dumb [2].
I've learned (even if through a bit of self-torture) that I should always look at the code first, if for no other reason than I don't
want to be foolishly blaming others when I'm to blame. In the case of support, I've always hated the term "User Error,"
and that's what the tweet reminds me of.
By framing it as an external problem, we miss an opportunity to teach the user how to use the product, or a chance to
improve the product to make sure they can't use it "incorrectly."
What are your thoughts about tech support? What can you contribute to The General Procedures For Troubleshooting?
2) "word1 : word2 :: word3 : word4" is SAT (and elsewhere) notation for
the analogy "word1 is to word2 as word3 is to word4." See freesat1prep.com
for a few examples.
Posted by Sam on Jan 03, 2010 at 03:09 AM UTC - 5 hrs
I've spent some time recently building a tool that makes my life a bit easier.
I've browsed plenty of Rails log analyzers that
help me find performance bottlenecks and potential improvements. But what I really need is a faster way to filter my logs to
trace user sessions for support purposes. Maybe it's just me, but I've got apps where users report problems
that make no sense, where their data gets lost, and who can't tell me what they did. Add to that
the fact that I've got the same app running on dozens of different sites, and you can see
why performance analyzers aren't what I'm looking for to solve my problem.
Because of that, I need a solution that lets me filter down and search parameters to figure out
what a particular user was doing on a certain date. Hence, Ties.
What can Ties do?
Enter the path to a Ruby on Rails production log file, click the "Load Log" button and it reads in the file.
Then, choose from the years, months, and days of requests in that file. Tell Ties which controller, action, and URL you are interested in.
Finally, decide if you only want to see the log entries which contain an exception, enter a regular expression
to search the params, plus the output filename and click a button to filter the log entries you care to see.
Ties takes a many-megabyte Rails production log file and outputs only the entries you're interested in.
Keyboard Shortcuts: Shoes leaves it to the programmer to
implement keyboard shortcuts, so while familiar
actions like Copy (ctrl-C) and Paste (ctrl-V) are available
via the mouse, I have yet to implement them on the keyboard.
Error Handling: It's minimal. If you enter a non-existent file,
or non-Rails-production file, who knows what will happen?
Crazy web-of-a-graph: My intent is to build the data model
such that you can search most items in approximately O(1) time. Right now,
you drill down to the day in constantish time, and after that
it becomes linear search.
Testing on all platforms and Rails versions: I proudly certify
this version (0.1) WOMM.
That means I've only tested it on Mac OS 10.5.8 (Leopard), using straight log files from Rails 2.2 on Ruby 1.8.6 and 1.8.7.
That being said, Shoes is supposed to work on Windows and Linux as well, and I've not noticed any major differences
in the log files between Rails versions, so you might find it works great for you too. If not, I encourage you to
let me know and I'll fix it up quick for you. (Please have
a sample log file available if possible.)
nice_code, stupid_submitter - in which TheDailyWTF jumps the shark by ridiculing perfectly good code.
Let's forgive the misuse of the wornout phrase and get to whether or not looking at the code
should result in utterance of WTFs.
It goes something like this:
1. Setup, in which we use globally shared memory
2. Negate the number if it is negative, for no reason I can think of, and set a reminder flag.
3. Set buffer to point to the final spot. Move it back one, dereference it, and insert the null string terminator there.
4. Move the buffer backwards. Dereference the pointer and set the value there to the character '0'. Add the remainder of the number divided by ten to that, since '1' is 1 away from '0', and so forth.
5. Divide the number by 10 so we can get to the next digit.
6. If we've done the loop 3 times, move the buffer pointer back one location and insert a comma.
7. Repeat starting at step 4 until the number is 0.
8. Cleanup - if the front character is a comma, remove it by moving the buffer pointer forward.
9. Cleanup - Move the buffer pointer backwards and insert our '-' if the negative flag has been set.
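Pieced together from the steps above, the routine looks roughly like the following. To be clear, this is a sketch reconstructed from the description, not the listing TheDailyWTF published: the name nice_num_sketch, the 32-byte buffer, and the variable names are all my own assumptions.

```c
#include <assert.h>

#define NICE_BUF_SIZE 32  /* assumed size: room for a 64-bit long, commas, sign */

/* Setup: shared, statically allocated storage (hypothetical name). */
static char nice_buf[NICE_BUF_SIZE];

/* Returns a pointer into nice_buf; the next call overwrites the result. */
char *nice_num_sketch(long n)
{
    int negative = 0;
    int digits_until_comma = 3;
    char *buffer = nice_buf + NICE_BUF_SIZE;  /* one past the final slot */

    if (n < 0) {
        /* Work on the magnitude and remember the sign. One plausible reason
           for negating: n % 10 would otherwise be negative. Note that -n
           overflows for LONG_MIN. */
        negative = 1;
        n = -n;
    }

    *--buffer = '\0';  /* terminator goes in the last slot */

    do {
        *--buffer = (char)('0' + n % 10);  /* lowest digit as a character */
        n /= 10;                           /* move on to the next digit */
        if (--digits_until_comma == 0) {   /* every third digit gets a comma */
            digits_until_comma = 3;
            *--buffer = ',';
        }
    } while (n != 0);

    if (*buffer == ',')  /* cleanup: drop a leading comma */
        ++buffer;
    if (negative)        /* cleanup: put the sign back */
        *--buffer = '-';

    assert(buffer >= nice_buf);  /* we never walked off the front */
    return buffer;
}
```

Because the result lives in shared static memory, a caller has to copy it before calling the function again.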
I felt like that required too much brain power to follow for doing something so simple. So I decided to make my
own version. I thought I'd try the following:
Copy the number to a string.
Starting at the end, copy each character to another string, inserting a comma every 3rd time.
A thousand times simpler than the convoluted mess of nice_num. Here's that attempt, in C:
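A minimal sketch of that two-step idea, assuming a long argument and a malloc'd result that the caller frees. The signature and variable names here are illustrative assumptions, not necessarily what is in the repository:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: returns a newly allocated string such as "1,234,567".
   The caller must free() it. Returns NULL if allocation fails. */
char *commadify(long number)
{
    char digits[32];  /* plenty for the digits of a 64-bit long */
    int is_negative = number < 0;

    /* Step 1: copy the number to a string (magnitude only; the sign is
       added back at the end). Unsigned arithmetic avoids overflow. */
    unsigned long magnitude = is_negative ? 0UL - (unsigned long)number
                                          : (unsigned long)number;
    int digit_count = snprintf(digits, sizeof digits, "%lu", magnitude);
    assert(digit_count > 0 && (size_t)digit_count < sizeof digits);

    int comma_count = (digit_count - 1) / 3;
    int result_length = digit_count + comma_count + is_negative;
    char *result = malloc((size_t)result_length + 1);
    if (result == NULL)
        return NULL;

    /* Step 2: starting at the end, copy each character across, inserting
       a comma after every third digit. */
    int write_index = result_length;
    int copied = 0;
    result[write_index--] = '\0';
    for (int read_index = digit_count - 1; read_index >= 0; read_index--) {
        if (copied > 0 && copied % 3 == 0)
            result[write_index--] = ',';
        result[write_index--] = digits[read_index];
        copied++;
    }
    if (is_negative)
        result[write_index--] = '-';

    assert(write_index == -1);  /* the buffer was filled exactly */
    return result;
}
```

Calling commadify(12345) under these assumptions yields "12,345", and the caller owns the returned memory.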
I think it's clearer - but not by much. Certainly the variable names are better because you don't
have to first understand what's going on to know what they are for. I think moving the pointer
arithmetic back into the more-familiar array notation helps understandability. And removing the trick of
knowing that the ASCII code for '1' is 1 more than for '0' ... and for '9' is 9 more than for '0' means less thinking.
On the negative side, the commadify code is slower than nice_num, most of which is caused by using malloc
instead of having preallocated memory. Removing those two instances
and using preallocated memory shaves a couple of tenths of a second off of the one-million-runs loop. But
you have to compensate with more code that keeps track of the start position.
So what's the verdict?
I don't think we're reacting to the WTFs I mentioned above when we see the nice_num code.
I think we're reacting to C itself. We're so used to very high level languages
that common tricks and things you'd know as a C programmer are the WTFs to us.
This kind of stuff isn't outside the realm of what a strong programmer should know. It isn't even
close to the border. The truth is our low-level skills are out of practice and we should probably get some.
What do you think?
Code with main program and comparison with nice_num is available at my github repository, miscellany/commadify.
Update: Thanks to Dave Kirby's comment, I've fixed memory leaks and updated the code in this post and at the repository. Link to repo originally was to the specific commit - so I've changed that to link to master instead.
Posted by Sam on Dec 22, 2008 at 12:00 AM UTC - 5 hrs
This post might be better titled, "How (and how not) to help yourself when
Google doesn't have the answer: A whirlwind tour through Rails' source" if only I wasn't
too lazy to change the max length of the database field for titles to my blog entries.
Google sometimes seems as if it has the sum of all human knowledge within the confines of its search index.
It might even be the case that it does. Even if you prefer to think that's true,
there may come a time when humanity does not yet have the knowledge you are seeking.
How often is that going to happen? Surely someone has run up against the problems I'm
going to have, right? That hasn't been the case for me the last couple of months.
I may be the only developer writing Rails apps on MacOSX to be deployed to the world on Windows
where SQL Server 2008 is the backend to a Sharepoint install used by internal staff to drive the data. I'm
not so presumptuous as to think I'm a beautiful and unique snowflake, but I wasn't finding any answers.
Before I started this trek, I made a commitment to leave after an hour if I found my attention
drifting toward something else. I never started checking email, reading blogs, or obsessively reloading
Twitter to see my tweeps' latest tweets, so I thought I was in the clear.
However, even though I felt like I was focused, the fact that I had been sitting at the computer for so
long contributed to poor decision making. The first of these was to keep searching Google even though
every search was coming up useless. I always followed the path of least resistance - even
if it wasn't going to get me to the goal quicker than an alternative path. If it was less challenging,
it was for me.
After a while, I ran out of mentally casual paths and resigned myself to tracing through
the source code (it is open source, after all, and this is one of the benefits everyone claims but
so few practice). It was what I knew I should have been doing as I started out, and I had wasted
several hours trying to tip-toe around it for the sake of my poor, tired brain.
Now that I was sure I had the right data type being returned, I needed to narrow down where the problem was
occurring. I knew SQLServerAdapter was using DBI to connect to the database, so I figured I'd use a
quick irb session to test DBI itself. The test came back negative - DBI was getting the correct
data. I also ran a quick test in the Rails app, reaching through ActiveRecord::Base to use the connection
manually. That worked, as expected.
I had thought, and now confirmed, that the best place to look would be SQLServerAdapter.
If it were a Rails problem, certainly someone would have run into it by now. So it made sense the problem would be in the interface
between Rails and Microsoft.
Why? Because if Rails is a Ghetto,
Rails with Microsoft products is a fucking concentration camp.
Excuse the profanity. I don't often use it here, so you know I must mean it when I do.
class << self
  def string_to_binary(value)
    "0x#{value.unpack("H*")[0]}"
  end

  def binary_to_string(value)
    value =~ /[^[:xdigit:]]/ ? value : [value].pack('H*')
  end
end
But it wasn't obvious how it was being used elsewhere. I even tried using the reverse operations in my objects - to no avail.
And after searching in the source file, it certainly wasn't being called inside of SQLServerAdapter.
So I went on a quest for the answer inside /opt/local/lib/ruby/gems/1.8/gems/activerecord-2.1.1/.
For quite some time I went back and forth inserting and removing debugging code between Active Record and
SQLServerAdapter. select(sql, name=nil) is a protected method defined in the abstract connection
adapter in Active Record. SQLServerAdapter implements it privately, and it was both getting and returning
the correct data.
After ActiveRecord calls object.instance_variable_set("@attributes", record) when instantiating our object,
object.attributes["binary"].size becomes less than record["binary"].size. That was
the problem. I thought for sure instance_variable_set was a monkeypatched method on Object,
and that all I needed to do was issue a monkeyfix and all would be well.
Only I was wrong. It's there by default in Ruby, and Rails wasn't monkeypatching it (that I can tell).
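One quick way to check a suspicion like that, at least on CRuby 1.9 or later (newer than the 1.8 install used here), is to ask the method where it was defined. A minimal sketch; the patched_example method is hypothetical and exists only to show the contrast:

```ruby
# Built-in methods implemented in C report no Ruby source location (on CRuby).
builtin = Object.instance_method(:instance_variable_set)
builtin_location = builtin.source_location   # nil for a core method

# A method added from Ruby code (a monkeypatch) reports a file and line.
class Object
  def patched_example; end   # hypothetical patch, for contrast only
end
patched_location = Object.instance_method(:patched_example).source_location

puts "instance_variable_set: #{builtin_location.inspect}"
puts "patched_example:       #{patched_location.inspect}"
```

A nil location is a decent hint that nobody has redefined the method from Ruby code, though it is not proof of how a given framework behaves.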
All of a sudden things started looking bleak. By this time I knew how I could fix it as a hack. I even
had a nice little monkeypatch for my objects that I could apply to
make the hack feel less hacky to use. I had given up.
But for some reason I picked it back up after an hour and found that ActiveRecord was actually calling
that string to binary method in SQL Server Adapter. It allows them to register calls that should happen before
defining the read and write methods on the object. Excellent!
I opened up SQLServerAdapter, and there it was: a different binary_to_string method
that totally explained it. The pair in this version were encoding and decoding the data to/from base 64.
That would work fine, if my data was going through the encoding part. But it wasn't - it was coming straight
from Sharepoint.
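To see why that produces shorter data, here is a small sketch using Ruby's built-in base 64 packing (the "m" directive of pack and unpack). The raw bytes are made up for illustration, and this is not the adapter's actual code: it only shows that decoding data which was never encoded silently throws bytes away.

```ruby
# Made-up binary payload standing in for a column written by another system.
raw = "\x89PNG\r\n\x1A\n\x00\x00 some binary payload \xFF\xFE".b

# Data that goes through both halves of the pair survives intact.
encoded    = [raw].pack("m")       # base 64 encode
round_trip = encoded.unpack1("m")  # base 64 decode

# Data that skipped the encode step gets "decoded" anyway. The lenient
# decoder ignores bytes outside the base 64 alphabet and turns every
# 4 remaining characters into 3 bytes, so the result is shorter and wrong.
mangled = raw.unpack1("m")

puts round_trip == raw                 # true
puts mangled.bytesize < raw.bytesize   # true
```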
There's a comment in the code about some arbitrary 7k limit on data size from SQL Server
being the reason for encoding as base 64 before inserting the data. I don't know about
inserting, but retrieving works fine without it. If I could think of a way to distinguish,
I'd submit a patch for the patchwork. Ideally, I'd rather find a way around the restriction, if it
actually exists.
The original code I was looking at was on github. It (not surprisingly) differed from the code in use on
my machine. Another side effect of the 16 hour monitor stare.
It's called the 8 hour burn for a reason.
The only things burning after 16 hours are your brain from all the stimulants and your
wife, wondering WTF you're really doing because there's no way you're working from before
she wakes up until after she goes to bed.
What's the point?
There are two, if you ask me:
You have the source code. Look through it. You have no other choice when no one has had your problem,
and you might benefit by doing so even if someone already has.
Even when you think you're focused working late, and resolve to leave when you lose the focus, you're
still going to make stupid decisions that you won't notice until the morning. I turned
a 5 hour journey into a 12 hour marathon. Sleep, FTW.
The list is not intended to be a "one-size-fits-all, every developer must know the correct answer to all questions" list.
Instead, Jurgen notes:
The key is to ask challenging questions that enable you to distinguish the smart software developers from the moronic mandrills.
...
This list covers most of the knowledge areas as defined by the Software Engineering Body of Knowledge.
Of course, if you're just looking for brilliant programmers, you may want to limit the topics to [the]
Construction, Algorithms, Data Structures and Testing [sections of the list].
And if you're looking for architects, you can just consider the questions under the headings
Requirements, Functional Design and Technical Design.
But whatever you do, keep this in mind: For most of the questions in this list there are no right and wrong answers!
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers, as if I had not
prepared for the interview at all. The format will first list the question, my initial response (to start
the discussion), followed by a place I might have looked for further information had I seen the questions
and prepared before answering them.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy.
Requirements
Can you name a number of non-functional (or quality) requirements?
I'd first mention performance and security, from the user's perspective. I'd then mention meeting minimum
requirements for metrics like code coverage in testing and dependencies in our design. I don't consider
code quality out-of-order when it comes to requirements.
The non-functional requirements
page at Wikipedia lists several examples. Notable exceptions from my quickie-response: accessibility,
documentation, portability. There are several that are listed that I consider covered by what I've listed,
but I missed some that caused me to say
What is your advice when a customer wants high performance, high usability and high security?
My "advice" starts with the questions: "What do you consider X to be?" where X belongs to the set {"high performance",
"high usability", "high security"}. I might offer that I consider "high performance"
to be a misnomer, as it's either acceptable or not, and that unless the customer defines it, I don't
know how we'd even attempt to measure something as vague as "usability."
I'm not sure where I'd prepare for this question. Any suggestions are appreciated.
Can you name a number of different techniques for specifying requirements? What works best in which case?
I can name several: tell me in person, tell me over email, tell me over IM or
over the phone.
I know that's not what you're looking for. You're looking for answers like "use cases." It all boils down to the same.
I might even mention "unit tests" here. That's part of the spec, as far as I'm concerned, and for almost
any software I write for myself, it's the only way I specify requirements (except for maybe a very informal to-do
list).
Face-to-face works best in most cases, I'd gather.
The answer after doing some research: ¡Ay, ay, ay, no me gusta! I didn't see this coming. There are a number of things that could qualify as answers
(Prototyping, Storyboards, Modeling, ..., State transitions) that I knew about beforehand. I thought to include
none of them.
What is requirements tracing? What is backward tracing vs. forward tracing?
My response? "I don't know anything about requirements tracing. I'm willing to learn."
Which tools do you like to use for keeping track of requirements?
I generally use a combination of a text file and emails, as far as the client is concerned. If it's a larger
system, I'll use something like Sharepoint, Basecamp,
or another system that performs a similar function. I have no preferences, because nothing I've ever used
compares to a simple list. If it does, it's equally useful.
I don't know that I'd say I like any of them. In reality I prefer a simple to-do list that I encode in
tests (insofar as I'm capable of writing the tests) and knock them out one-by-one.
How do you treat changing requirements? Are they good or bad? Why?
I don't give a value judgement on changing requirements: they are inevitable. They can be good or bad depending on
how the client handles them.
I always try to let the client know: I can do X amount in Y timeframe. You asked for Z total.
Here's an estimate for each item in Z, pick X from it for our current Y timeframe. I'll get back to you every Y timeframe to show a demo,
and you can choose X from the remaining Z again (with changes based on circumstances if required). Feel free
to fire me when you have enough out of Z that's functional. (Ok, I probably wouldn't say the last sentence in those
terms, but I'd find a way to say it, if for no other reason than to sell them the rest of the process.)
As far as looking it up before the interview: I'd review Agile literature. Searching any of the agile Yahoo groups
for the question at hand ought to be good enough.
How do you search and find requirements? What are possible sources?
"What do you mean?" would be my response. I really don't know. What does searching and finding requirements mean?
Does it mean figuring out how to do requirements that I don't know how to accomplish?
How do you prioritize requirements? Do you know different techniques?
I rarely prioritize requirements. I let the customer decide. I give them a relative cost of implementing
X requirement vs. implementing Y requirement, and let them decide. If Y requires X, then I tell them
so.
I know of different techniques - take "random" for example. I don't know what they might be called. But I
cannot think of anything better, even if it were decreed as a Top 10 Commandment for Prioritizing Requirements.
No web search for this comes to mind. I'd review a couple of process management books if I had no clue.
This seems to be a decent discussion,
if you must have one from me from a cursory browse.
Can you name the responsibilities of the user, the customer and the developer in the
requirements process?
The user will be the person using the software, versus the customer being the one who pays for its development.
I hate that distinction. The developer programs it. Responsibilities? In my ideal organization, I'd have:
Developers working with the Customer to manage requirements.
Developers working with the User to make the application work for them regardless of the Customer (I've
seen too many projects where the User had to use whatever the Customer purchased, even
if the purchase was ... little yellow bus-ish.)
Customer/User having daily meetings with the developer
Developer making the best software he/she can given the constraints.
Again, I don't know where I'd look this up before being asked. Suggestions (again, as always) are most welcome.
What do you do with requirements that are incomplete or incomprehensible?
I send an email saying "I don't understand what you mean. Please read the very small attached book
and get back to me."
Just kidding, of course (at least in most cases. I've been recently tempted to send that exact email, as it happens.)
I just ask them to clarify. If I don't have contact with the customer, I ask the intermediary to clarify or
get clarification.
Outside reading: Agile, and hopefully other processes for some compare and contrast.
I think these are decent answers to start a discussion with. If you're a hiring manager, what do you say? Would you show me the door, or keep me around for a while longer?
It's not quite as hard as Steve Yegge's list
of things to know (I'll get to that eventually), but it's a good (and more well-rounded!) list nevertheless.
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not
prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy.
My lack of understanding the angle from which these questions come may influence other seemingly off-base
answers. Help me out by explaining how you might answer these questions on functional design:
What are metaphors used for in functional design? Can you name some successful examples?
I hate to start the first one off with an IDK,
but what else can you do? I feel like I can talk intelligently about
some of the issues involved in functional design, but until now, I never realized it was a process unto itself,
nor did I know what all it entailed until after I browsed the remaining questions.
How can you reduce the user's perception of waiting when some functions take a lot of time?
Show a progress bar. Distract them with something they need to read or stage data-collection/interaction with the user.
Of course,
the obvious yet overlooked answer here would be to make the offending functions faster. At
worst, if the function doesn't require user interaction at all, you could spawn a new thread to do the work
in the background and let the user continue working. Then they'd notice no waiting at all.
Reading: I picked this up from observation and deduction. I don't know where I'd go to read such information.
Which controls would you use when a user must select multiple items from a big list, in a minimal amount of space?
What controls do I have to pick from? Can the list be grouped by some recurring values? For instance, consider
how the iPod uses the scroll wheel to let you navigate a long list of music. Now apply finding one song to
finding several, and you have an idea of what I'm talking about.
Another option might be to have auto-complete, with a special key for "add to list" when you're satisfied, which
puts you back at the beginning of the text box. Another key would end the sequence.
Can you visualize the list in some way? The user could then click pictures to drill down and roll up on items
in the list, viewing it from different levels. For example, if the list was of colors,
you could take the typical color-picker from every image program out there, and use it to choose your list.
Reading: I came up with these on the spot, and can't say I've seen them completely implemented yet,
so again, when it comes time for advice on where to read up on this topic, I've got nothing.
Can you name different measures to guarantee correctness of data entry?
Aside from the pains of torture? You'll definitely want a fast-acting validation to let the user
know as soon as possible that you think their input is incorrect. When talking about the web, you'll
want to validate it server side as well. You'll want to show some examples if you need a certain format
of input (or just if the customer expects to be forced into a format, you'll want to relieve their anxiety
by providing an example). Finally, make the user type as little as possible, except if you're doing a progressive
lookup / autocomplete on their typing.
Reading list is vacant. Please advise.
Can you name different techniques for prototyping an application?
At first, I didn't realize there were different techniques. Unless we were talking about storyboards vs.
mark-up-only or a shell of an application. I don't imagine that was the point.
I did find something that I think was the point: for reading, Scott Ambler has a take from a different point of view: tips and techniques
in user interface prototyping that includes another list of resources. A treasure trove!
Now that I know what we're talking about, I'd have given similar answers. I feel I could have been effective
in a conversation about this, but the other party would have had to lead me that direction. I couldn't get
there from the question alone.
Can you name examples of how an application can anticipate user behavior?
Analyze it and look for patterns. Use a Markov Model to predict next action based on previous N actions.
That's the only way (well, there are variations on the same theme) I can think of that an
application would do it. We could of course observe the users and make changes ourselves. I don't think
that would be a bad idea.
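As a rough illustration of the Markov idea, here is a minimal sketch that predicts the next action from only the single previous one (N = 1). The class name, method names, and the sample actions are all hypothetical; a real application would need more history, smoothing, and per-user data.

```ruby
# First-order Markov sketch: count which action tends to follow which.
class ActionPredictor
  def initialize
    # @counts[previous][following] = number of times seen
    @counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  end

  # Record every adjacent pair in an observed sequence of actions.
  def observe(actions)
    actions.each_cons(2) { |previous, following| @counts[previous][following] += 1 }
  end

  # Return the most frequent follower of `previous`, or nil if unseen.
  def predict(previous)
    followers = @counts.fetch(previous, nil)
    return nil if followers.nil? || followers.empty?

    followers.max_by { |_action, count| count }.first
  end
end

predictor = ActionPredictor.new
predictor.observe(%w[open edit save open edit save open close])
puts predictor.predict("open")   # prints edit
```

An application could use the prediction to preload a screen or put the likely next command within easy reach.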
Reading is nonexistent. Those ideas came from elsewhere, but never as part of functional design.
Can you name different ways of designing access to a large and complex list of features?
There are the outmoded File/Edit/View tree menu controls (find link to discussion), ribbon interfaces,
and progressive search, or autocomplete.
I'd also consider re-purposing my answers to #3.
How would you design editing twenty fields for a list of 10 items? And editing 3 fields for a list of 1000 items?
For the latter, I'd almost certainly put them all in one page. Or 100+ at a time. The fewer clicks the better.
It's worth importing from a spreadsheet every time, if that's an option.
For the former: can the 20 fields be broken down into more cohesive units? Twenty fields in one form is often too many.
10 forms is not absurd, on the other hand.
What is the problem of using different colors when highlighting pieces of a text?
The highlight color can make the text hard(er) or downright impossible to read. Anything else?
I don't see room for discussion, unless you're flashing the colors -- then you might get into causing
epileptic seizures in those susceptible to them.
Can you name some limitations of a web environment vs. a Windows environment?
Access to the file system. Lag, mainly: the experience on most of the web is less responsive than on Windows.
As you can tell - while some of these questions are ones I've thought about in the past, or touch themes on others
that I do - I haven't thought about functional design as a process by itself.
So in that regard I ask,
Do you have any pointers?
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).
Do you know what a regression test is? How do you verify that new changes have not broken existing features?
You answered the second part of the question with the first: you run regression tests to ensure that
new changes have not broken existing features. For me, regression tests come in the form of already written tests,
especially unit tests that I've let turn into integration tests. However, you could write a regression test
before making a new change, and it would work as well.
The point is that you want to have some tests in place so that when you inevitably make changes, you can ensure
they didn't cascade throughout the system introducing bugs.
How can you implement unit testing when there are dependencies between a business layer and a data layer?
Generally I'd let that unit test become an integration test. But if the time to run the tests was becoming
too long, I'd build a mock object that represented the data layer without hitting the database or file
system, and that would be expected to decrease the running time significantly.
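A minimal sketch of that second option, assuming the business layer is handed its data layer rather than creating it. All of the class and method names here (OrderReport, FakeOrderStore, find_orders) are hypothetical, and the fake is hand-rolled rather than taken from any mocking library.

```ruby
# Business layer: depends on "something that responds to find_orders",
# not on a concrete database class.
class OrderReport
  def initialize(data_layer)
    @data_layer = data_layer
  end

  def total_for(customer_id)
    @data_layer.find_orders(customer_id).sum { |order| order[:amount] }
  end
end

# Hand-rolled stand-in for the data layer: canned rows, no database.
class FakeOrderStore
  def initialize(rows)
    @rows = rows
  end

  def find_orders(customer_id)
    @rows.select { |row| row[:customer_id] == customer_id }
  end
end

store = FakeOrderStore.new([
  { customer_id: 1, amount: 20 },
  { customer_id: 1, amount: 5 },
  { customer_id: 2, amount: 99 }
])
report = OrderReport.new(store)
puts report.total_for(1)   # prints 25
```

Because the fake never touches the database or file system, a test built on it checks only the business rule and runs in memory.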
Which tools are essential to you for testing the quality of your code?
I don't know if anything is essential. If you've got asserts or throws, you
can always implement testing yourself, and a good eye for bad code helps as well. That said, to reduce
psychological barriers
to testing, it would be nice to have tools already made for this purpose.
Luckily, we have such tools available: unit testing frameworks and static code analysis tools in your language of choice.
What types of problems have you encountered most often in your products after deployment?
Most recently I've encountered very specific integration errors, and written about some ideas on
fixing the polysystemic testing nightmare.
Do you know what code coverage is? What types of code coverage are there?
I've generally thought it refers to the percentage of code covered by tests. I don't know what
the second question here refers to, as I thought it referred exclusively to testing.
Do you know the difference between functional testing and exploratory testing? How would you test a web site?
I have to admit that before being asked this question, I wouldn't have thought about it. My guess is that
functional testing refers to testing the expected functionality of an application, whereas exploratory
testing involves testing without knowing any specific expectations.
As far as testing a web site, I'll have plenty of unit tests, some acceptance tests, perhaps some in
Selenium or a similar offering, as well as load testing. These aren't specific to web apps, however, except
for load testing in most cases.
I'm very interested in feedback here, given my misunderstanding of the question. If you can offer it, let me
thank you in advance.
What is the difference between a test suite, a test case and a test plan? How would you organize testing?
A test suite is made up of test cases. I'm not sure what a test plan is, aside from the obvious which the
name would suggest. As far as organizing testing: I tend to organize my unit tests by class, with the method
they test in the same order they exist within that class.
What kind of tests would you include for a smoke test of an ecommerce web site?
Again, here's another where I didn't know the terminology, so having to ask would result in demerits, but
knowing the answer of "what is a smoke test?" allows us to properly answer the question:
In software testing, a smoke test is a collection of written tests that are performed on a system prior to being accepted for further testing.
In that case, I'd click around (or more likely, write an application that could be run many times that does the same thing,
or use that application to write Selenium tests) looking for problems. I'd fill out some forms, and leave others blank.
Ideally, it would all be random, so as to find problems with the specs as often as possible without actually
testing all the specs, since the point seems to be to give us a quick way to reject the release without
doing full testing.
What can you do to reduce the chance that a customer finds things that he doesn't like during acceptance testing?
The best thing to do is to use incremental and iterative development that keeps the customer in the
loop providing feedback before you get down to acceptance testing. Have effective tests in place that
cover his requirements and ensure you hit those tests. When you come across something you know
won't pass muster, address it even though it might not be a formal requirement.
There are undoubtedly underhanded ways to achieve that goal as well, but I'm not in the habit of going
that direction, so I won't address them here.
Can you tell me something that you have learned about testing and quality assurance in the last year?
Again I'm going to reference my polysystemic testing nightmare,
because it taught me that testing is extremely hard when you don't have the right tools at your disposal, and that
sometimes, you've got to create them on your own.
As far as reading goes, I'd start with literature on TDD, as it's
the most important yet underused as far as I'm concerned.
Posted by Sam on Jan 12, 2009 at 12:00 AM UTC - 5 hrs
I like to use descriptive variable names, and I try to err on the side of more-descriptive if I think there's
any chance of confusion. contract_participants isn't terribly long, but if you're building
up all of its members from different sources (i.e., you can't really loop over it), it can get cumbersome
to type and worse, to read. Moreover, it's different from just "participants" and they certainly
aren't "contracts," so shortening it in this case wasn't going to happen.
Unfortunately, unlike some languages, Ruby doesn't have such a construct. Fortunately, it does have
the facilities to create one. I thought there was already an implementation, but I couldn't find it at
first. So I started to write my own.
For some reason I was looking through Utility Belt and noticed
it had an implementation of with():
class Object
  def with(object, &block)
    object.instance_eval &block
  end
end
Unbelievable! My implementation was running into a WTF in its own right, and here was this one-liner.
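A short usage sketch of that one-liner, with a made-up participant object, shows both the appeal and the catch: reads go to the target object, but a bare assignment such as name = "Ada" would create a local variable, so writes still need an explicit self.

```ruby
class Object
  def with(object, &block)
    object.instance_eval(&block)
  end
end

# Hypothetical stand-in for a contract participant.
participant = Struct.new(:name, :role).new

with(participant) do
  self.name = "Ada"      # plain `name = "Ada"` would only set a local
  self.role = "buyer"
end

summary = with(participant) { "#{name} (#{role})" }   # reads need no receiver
puts summary   # prints Ada (buyer)
```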
With was created as a result of those efforts. It
works how I want it to in the simple cases I've demonstrated. It still needs some work on left-hand-side
variables that are not members of the aforementioned @foo. It needs some tests for
more complex uses like nested blocks (and
code to make them pass). But it works for what I imagine the majority usage would be.
I opted for the syntax With.object(foo) do ... end so as to not force you to use a
change to Object. However, you can require 'with_on_object' if you prefer
to just use with(@foo). There's also a conditional patch on NilClass if
empty? does not exist. It is added and removed in the same method.
It requires Ruby2Ruby and ParseTree
to do its magic, though that will change soon due to compatibility problems with Ruby 1.9 and other implementations
(for which another project exists, if I've read correctly).
Posted by Sam on Jan 27, 2009 at 12:00 AM UTC - 5 hrs
Being a programmer, when I see something repetitive that I can automate, I normally opt to do so.
We know that one way to save your job is by
automating, but another is to know when not to automate. It sounds obvious, but when you get
into the habit of rooting out all duplication of effort, you can lose sight of the fact that sometimes, it costs more
to automate something than to do it by hand.
I came across such a situation the other day.
In this case I was working with static content on a website that wanted to go dynamic. It wasn't just a case of
writing a spider to follow all the links and dump all the HTML into a database - there was some structure to the
data, and the database would need to reflect it.
In this case, there was a hierarchy of data. For simplicity's sake, let's say there were three levels to the tree:
departments, sections, and products. At the top we have very few departments. In the middle, there are several
sections per department. And there are many products in each section.
Each level of the hierarchy is different - so you'll need at least three spider/parser/scrapers. Within each level,
most of the content is fairly uniform, but there are some special cases to consider. We can also assume each
level requires roughly the same amount of effort in writing an automaton to process its data.
It's natural to start at the top (for me, anyway -- you are free to differ), since you can use that spider to
collect not only the content for each department, but the links to the section pages as well. Then you'll
write the version for the sections which collects the content there and the links to the products. Finally, you get
to the bulk of the data which is contained in the products. (And don't forget the special cases in each level!)
But that's the wrong way to proceed.
You ought to start at the bottom, where you get the most return on your investment first. (Or at least skip the top
level.) Spidering each level to collect links to the lower levels is exceedingly easy. It's the parsing and special
cases in the rest of the content that makes each level a challenge.
Since there are so few cases at the top level, you can input that data by hand quicker than you can write the automation
device. It may not be fun, but it saves a few hours of your (and your customer's) time.
Posted by Sam on Feb 11, 2009 at 12:00 AM UTC - 5 hrs
What's with this nonsense about unit testing?
Giving You Context
Joel Spolsky and Jeff Atwood raised some controversy when discussing quality and unit testing on their Stack Overflow podcast (or, a transcript of the relevant part).
Joel started off that part of the conversation:
But, I feel like if a team really did have 100% code coverage of their unit tests, there'd be a couple of problems. One, they would have spent an awful lot of time writing unit tests, and they wouldn't necessarily be able to pay for that time in improved quality. I mean, they'd have some improved quality, and they'd have the ability to change things in their code with the confidence that they don't break anything, but that's it.
But the real problem with unit tests as I've discovered is that the type of changes that you tend to make as code evolves tend to break a constant percentage of your unit tests. Sometimes you will make a change to your code that, somehow, breaks 10% of your unit tests. Intentionally. Because you've changed the design of something... you've moved a menu, and now everything that relied on that menu being there... the menu is now elsewhere. And so all those tests now break. And you have to be able to go in and recreate those tests to reflect the new reality of the code.
So the end result is that, as your project gets bigger and bigger, if you really have a lot of unit tests, the amount of investment you'll have to make in maintaining those unit tests, keeping them up-to-date and keeping them passing, starts to become disproportional to the amount of benefit that you get out of them.
Joel was talking about people who suggest having 100% code coverage, but he said a couple of things about unit testing in general, namely the second and third paragraphs I quoted above: that changes to code may cause a ripple effect where you need to update up to 10% of your tests, and that "as your project gets bigger ... [effort maintaining your tests] starts to become disproportional to the amount of benefit that you get out of them."
One poster at Hacker News mentioned that it's possible for your tests to have 100% code coverage without really testing anything, so they can be a false sense of security (don't trust them!).
Bill Moorier said,
The metric I, and others I know, have used to judge unit testing is: does it find bugs? The answer has been no. For the majority of the code I write, it certainly doesn't find enough bugs to make it worth the investment of time.
He followed up by saying that user reports, automated monitoring systems, and logging do a much better job at finding bugs than unit tests do.
I don't really care if you write unit tests for your software (unless I also have to work on it or (sometimes) use it in some capacity). I don't write unit tests for everything. I don't practice TDD all the time. If you're new to it I'd recommend that you do it though -- until you have enough experience to determine which tests will bring you the value you want. (If you're not good at it, and haven't tried it on certain types of tests, how else would you know?)
The Points
All of that was there to provide you context for this simple, short blog post:
If changing your code means broken tests cascading through the system to the tune of 10%, you haven't written unit tests, have you?
(Further, the sorts of changes that would
needlessly break so many unit-cum-integration tests would be rare, unless you've somehow happened or tried very hard to design a tightly coupled spaghetti monster while writing unit tests too.)
I've not yet met a project where the unit tests are the maintenance nightmare. More often, it's the project itself, and it probably doesn't have unit tests to maintain. The larger the code base, with large numbers of dependencies and high coupling, the more resistant it is to change - with or without unit tests. The unit tests are there in part to give you confidence that your changes haven't broken the system when you do make a change.
If you're making changes where you expect the interface and/or behavior to change, I just don't see where the maintenance nightmare comes from regarding tests. In fact, you can run them and find out what else in your code base needs to change as a result of your modifications.
In short, these scenarios don't happen often enough to make testing worthless.
You may indeed write a bunch of tests that don't do anything to test your code, but why would you? You'd
have to try pretty hard to get 100% code coverage with your tests while successfully testing nothing.
Perhaps some percentage of your tests under normal development will provide a false sense of security. But without any tests whatsoever, what sense of security will you have?
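To make the "100% coverage, zero testing" scenario concrete, here is a minimal sketch. The method and test names (discount_cents, hollow_test, real_test) are made up for illustration; the point is only that the first "test" runs every line and both branches while asserting nothing.

```ruby
# Hypothetical code under test: members get 10% off (prices in integer cents).
def discount_cents(price_cents, member)
  member ? price_cents * 90 / 100 : price_cents
end

# Runs every line and both branches, so a coverage tool would report 100%,
# yet it can never fail because it checks nothing.
def hollow_test
  discount_cents(10_000, true)
  discount_cents(10_000, false)
  :passed
end

# Same coverage, but a wrong result actually raises.
def real_test
  raise "member price wrong" unless discount_cents(10_000, true) == 9_000
  raise "regular price wrong" unless discount_cents(10_000, false) == 10_000
  :passed
end

puts hollow_test
puts real_test
```

You have to go out of your way to write the first kind by accident; the second kind costs two extra lines.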
If you measure the value of unit testing by the number of bugs it finds (with more being better), you're looking at it completely wrong. That's like measuring the value of a prophylactic by the number of diseases you
get after using it. The value is in the number of bugs that never made it into production. As a 2008 study from
Microsoft finds [PDF], at least with TDD, that number can be astonishingly high.
As for user reports, automated monitoring systems, and logging doing a better job at finding bugs than unit testing: I agree. It's just that I'd prefer my shipped software to have fewer bugs for them to find, and I certainly don't look at my users as tests for my software quality once it's in production.
Posted by Sam on Mar 03, 2009 at 12:00 AM UTC - 5 hrs
A while ago, I was working with a problem in C# where our code would get deadlocked, and since someone must die or several must starve, I thought it
would be nice to just toss a "try again if deadlocked"
statement into the exception handler. I muttered this thought on twitter to
see if there was any language with such a try-catch-try-again-if construct.
A couple of my tweeps responded with what we're used to using: loops.
Another two mentioned redo in Ruby.
redo is certainly a cool construct (and underused), but it doesn't do what I want
it to:
begin
  raise from_the_dead
rescue
  redo
end
Ruby responds, "tryredo.rb:35: unexpected redo." As you might know, you need to use a loop:
Suppose I have a zombie who only really likes eating braaaiiinzzz. Most of the time, he gets exactly
what he wants. But every so often, we try to feed him some other body part. Now, if he's really hungry, he
eats it. We might only have fingers available at the time, so I want
to try to feed him fingers again until brains are available or until he's hungry enough to eat the fingers.
I know that conceptually this is just a loop even if we don't explicitly code it. But does your language have a
try again? What do you think the merits or demerits of such an approach would be? Is it just a harmful
goto?
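For what it's worth, Ruby does have a keyword that comes close: retry, which re-runs the begin block when used inside a rescue clause. Here's a rough sketch of the zombie scenario using it. The NotBrains exception, the feed method, and the hunger threshold of 3 are all invented for illustration.

```ruby
# Hypothetical exception: raised when the zombie refuses what he is offered.
class NotBrains < StandardError; end

# He always eats brains; he eats anything else only once he is hungry enough.
def feed(food, hunger)
  raise NotBrains, "wanted brains, got #{food}" unless food == :brains || hunger >= 3
  "ate #{food}"
end

def feed_until_eaten(food)
  hunger = 0
  begin
    feed(food, hunger)
  rescue NotBrains
    hunger += 1 # he gets hungrier each time we try
    retry       # jump back to the top of the begin block
  end
end

puts feed_until_eaten(:brains)
puts feed_until_eaten(:fingers)
```

It is still a loop in disguise, and without a condition that eventually changes (hunger here) it will spin forever, so the goto worry is not unfounded.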
My zombie is hungry and waiting. Rather impatiently I might add.
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).
How do you make sure that your code can handle different kinds of error situations?
I write tests that describe the expected error situations and check to see that they are handled appropriately.
If the software is legacy and prone to generating exceptions, I'll wrap it up to report the exceptions
to get an idea of what needs attention first, and start attacking it there. Of course where required,
we'll use try/catch.
Mostly I try to follow the rigorous WOMM development process.
Can you explain what Test-Driven Development is? Can you name some principles of Extreme Programming?
TDD: Red, Green, Refactor. You write a test that specifies some behavior the code should produce. Run the test
to make sure it fails. Write code to make the test pass, then run it to make sure it passes. Change the
code as needed to improve its design.
XP: Aside from testing, we want to continuously review code through pair programming, maintain sustainable
work conditions with sane work-weeks, and continually integrate our code to reduce the time we have to
spend working out kinks in that process. There's more, but those are a few.
Reading: Xprogramming.com is a good starting point.
The associated lists (TDD and XP) on Yahoo Groups are great for discussion. It's probably worth reading
a book or two on the subjects. I'd recommend Kent Beck or Ron Jeffries.
What do you care about most when reviewing somebody else's code?
Does it say what it's doing, and do it correctly? Is it readable?
Reading: As with most of these questions, Steve McConnell's Code Complete 2 is an outstanding
resource on the subject. If you don't read anything else, read that.
When do you use an abstract class and when do you use an interface?
I'd use an abstract class when I want to provide some implementation for reuse, but where some
also remains to be specified by the inheriting class. An interface is useful for multiple inheritance
in languages that don't allow it, as well as a decoupling device - allowing you to depend on
interfaces that don't change as often as implementations might.
Reading: Books on OO design are useful, especially those targeting static languages like Java.
Apart from the IDE, which other favorite tools do you use that you think are essential to you?
Continuous integration tools, testing frameworks, (some people might include dependency injection
frameworks), scripting languages, the command line, source control, ... What else would you include?
Reading: McConnell's aforementioned book, The Pragmatic Programmer,
Practices of an Agile Developer,
tons of blogs that talk about the tools they use to make themselves more productive, and (although
I've only seen the presentation and not read the book), Neal Ford's The Productive Programmer
probably contains some useful items.
How do you make sure that your code is both safe and fast?
The question seems to imply these goals are normally at odds. I haven't felt that way. I'd program for
security first, and then if it's slow, I'd try to identify the bottleneck and then find a way to improve
its time complexity. If the algorithm is already at its lower bound for time complexity, I'd move on to
micro improvements, like moving variable creation and function calls outside of loops.
When do you use polymorphism and when do you use delegates?
I don't have any hard and fast rules. I rarely need to use polymorphism since I
primarily program in dynamic languages that make it unnecessary. (I guess it's still polymorphism, but you're not
doing anything special to achieve it.) When I have been in static languages, I'll implement
the methods that accept different types as needed to make client code more friendly to work in. If we're
actually building an API for public consumption, then obviously we have to move from "as needed" to
a more aggressive schedule.
I'm at a loss for a better answer to this question, because (surprisingly to me) I've not thought about it before now.
When would you use a class with static members and when would you use a Singleton class?
I don't know how to answer this except for "when the situation calls for it." I'd normally opt for
the class with static members when it makes sense to do it. As far as a proper Singleton, I don't know
that I've ever written one that enforces that property. More often, if I need just one, I only create one.
Can you name examples of anticipating changing requirements in your code?
I write unit tests, so that helps with changing requirements. How? It helps keep the design very modular to
allow for extension and easy changes, and the tests themselves provide assurance I haven't broken anything
when I do need to make a change.
I don't generally go leaving hooks and pre-implementing code that I think will be needed. YAGNI
helps guide me in that regard.
Can you describe the process you use for writing a piece of code, from requirements to delivery?
Requirement -> unit test -> code -> run tests -> commit -> run tests -> deploy.
I think that explains it all, and explaining each step could be a blog post or more of its own.
My basic advice on how to prepare for this section of questions is: Read and internalize
Code Complete 2.
I should probably read it again, in fact.
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).
This week's answers are on a topic I've been wanting to explore more in depth here lately: algorithms (though it doesn't go into as much detail). I'll
wait until the end to give reference information because all of this post relies on experience, but there
are two sources where I'd start learning the information for every question. Trying to keep the post
DRY and all.
How do you find out if a number is a power of 2? And how do you know if it is an odd number?
To find out if a number is a power of two, you can divide by two until the number is 1 or odd. If the number is odd
at any point before you reach one, the number is not a power of two. To find out if a number is odd, I'd normally
take number mod 2 and see if the result is 1 or not (1 would mean the number is odd). If performance
is a concern and the compiler or runtime doesn't optimize for mod 2, you could use a bit mask that
checks if the smallest bit is set or not. If so, the number is odd. If not, the number is even.
Here's an example with the bit mask:
def even? n
  n == n & 0b11111111_11111111_11111111_11111110
end

n = 512
while even?(n) && n > 1 do
  n = n >> 1
end
puts n == 1
Note you'll need a large enough bit mask to cover the size of the number in bits.
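As an alternative sketch that sidesteps the mask-size problem, you can test only the lowest bit for oddness, and use the well-known n & (n - 1) trick for powers of two: a power of two has exactly one bit set, and subtracting one clears it. The method names below are mine, not from any library.

```ruby
# True when the lowest bit is set.
def odd_number?(n)
  (n & 1) == 1
end

# A positive power of two has exactly one bit set, so clearing the lowest
# set bit with n & (n - 1) must leave zero.
def power_of_two?(n)
  n > 0 && (n & (n - 1)).zero?
end

puts power_of_two?(512)
puts power_of_two?(6)
puts odd_number?(7)
```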
How do you find the middle item in a linked list?
If it's important to be able to find the middle element, I'd keep a pointer to it. If it's implemented using an
array and it's important, we can store the array length and divide by two. If it's implemented using pointers to
elements, we can iterate over the list while counting its length, then iterate from the beginning until we get
halfway there. We could also take the size of the structure in memory and divide by two, going straight to that
element by adding 1/2 the size to the pointer, but that'd be a mighty WTF to most programmers when trying
to understand it.
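The count-then-walk approach for a pointer-based list might look roughly like this. Node is a hypothetical singly linked node type; for an even-length list this sketch returns the second of the two middle elements.

```ruby
# Hypothetical singly linked list node.
Node = Struct.new(:value, :next_node)

def middle(head)
  # First pass: count the nodes.
  length = 0
  node = head
  while node
    length += 1
    node = node.next_node
  end

  # Second pass: walk halfway from the beginning.
  node = head
  (length / 2).times { node = node.next_node }
  node
end

# Build 1 -> 2 -> 3 -> 4 -> 5 from the tail forward.
list = [5, 4, 3, 2, 1].inject(nil) { |rest, value| Node.new(value, rest) }
puts middle(list).value
```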
How would you change the format of all the phone numbers in 10,000 static html web pages?
Not by hand if I could avoid it. I'd write a regex that matches any of the known formats in the set of pages
and use a language with a function that replaces based on a regular expression find.
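A sketch of what that could look like in Ruby with gsub. The pattern only covers a few formats I made up for the example (parentheses, dashes, dots, or spaces as separators), the target format 555-123-4567 is arbitrary, and the directory name is hypothetical; the real set of formats would come from looking at the pages.

```ruby
# Assumed input formats: (555) 123-4567, 555.123.4567, 555 123 4567, 555-123-4567.
PHONE = /\(?(\d{3})\)?[-. ]?(\d{3})[-. ](\d{4})/

# Rewrite every match into the (arbitrary) target format 555-123-4567.
def normalize_phones(html)
  html.gsub(PHONE) { "#{$1}-#{$2}-#{$3}" }
end

# Apply the rewrite to every .html file under a directory (not called here).
def rewrite_pages(dir)
  Dir.glob(File.join(dir, "**", "*.html")).each do |path|
    File.write(path, normalize_phones(File.read(path)))
  end
end

puts normalize_phones("<p>Call (555) 123-4567 or 555.987.6543</p>")
```

I'd run it against a copy of the pages first and diff the result, since a loose pattern can catch things that aren't phone numbers.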
Can you name an example of a recursive solution that you created?
I was creating a pattern enumeration algorithm in an effort to statistically identify potentially important
subsequences in a given genome. The more immediate goal was to identify Rho sites
in a set of bacterial genomes. Since we wanted to identify any potential pattern, the form needed to be general,
so a variable depth was required and we used recursion to achieve this. (This is a job interview, so I tried to
think of the most impressive sounding example from the last year I could think of.)
Which is faster: finding an item in a hashtable or in a sorted list?
Item retrieval is basically O(1) in a hash table, while O(log n) in a sorted list, so the hash table is faster on
average.
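To illustrate the two lookups side by side, here is a small sketch using Ruby's Hash#key? for the hash table and Array#bsearch (binary search) for the sorted list. The method names are mine.

```ruby
# O(1) on average: one hash computation and a bucket lookup.
def in_hash?(index, target)
  index.key?(target)
end

# O(log n): binary search for the first element >= target,
# then check whether it is actually the target.
def in_sorted?(sorted, target)
  sorted.bsearch { |x| x >= target } == target
end

values = [3, 8, 15, 23, 42]
index = {}
values.each { |v| index[v] = true }

puts in_hash?(index, 23)
puts in_sorted?(values, 23)
puts in_sorted?(values, 24)
```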
What is the last thing you learned about algorithms from a book, magazine or web site?
I guess it depends on what you'd consider learning. For instance, I recently looked up
merge sort to use as reference in writing a sorting
algorithm for a custom data structure, but I wouldn't say I "learned" it there. If you take "learning" as
being introduced to, it was in a course at school or via a book.
How would you write a function to reverse a string? And can you do that without a temporary string?
In most instances I'd be working with a language that already implements a reverse method for strings.
If not working in such a language, and I'm using a temporary string, the problem boils down to iterating
over the given string, and assigning tempstring[realstring_length - i - 1] = realstring[i]. If we restrict
the usage of a temporary string, then we can just use a variable to store the current character for swapping:
for(i=0; i<len/2; i++) { // stop at the midpoint, or the second half undoes the swaps
lowerchar = realstring[i];
realstring[i] = realstring[len-i-1]; // -1 for 0 based arrays
realstring[len-i-1] = lowerchar;
}
What type of language do you prefer for writing complex algorithms?
I prefer very super extremely high level languages (to distinguish from VHLL)
that tend to be dynamic. The reason is that, in using them, I don't have to worry about low level details
that might otherwise get in the way of understanding the algorithm. After that, I'll normally have to implement
the same algorithm in a lower level language and take care of the details I could otherwise ignore, because
normally performance is going to matter when developing that complex algorithm.
In an array with integers between 1 and 1,000,000 one value is in the array twice. How do you determine which one?
I'd insert each value as a key in a hash and when the key already exists, I know we've hit the duplicate. This
gives us O(n) time complexity, which I'm sure could be proven to be the lower bound.
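In Ruby that might look something like this, using a Set as the "hash" of seen values. The method name is just for the example.

```ruby
require 'set'

# Returns the first value seen twice, or nil if there is no duplicate.
def find_duplicate(values)
  seen = Set.new
  values.each do |value|
    # Set#add? returns nil when the value was already present.
    return value unless seen.add?(value)
  end
  nil
end

puts find_duplicate([4, 9, 1, 7, 9, 2])
```

It makes one pass over the array, at the cost of O(n) extra memory for the set.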
Do you know about the Traveling Salesman Problem?
Yes, it's famous. The problem asks us to find the shortest route that visits every node in a graph exactly once and returns to the starting node.
Posted by Sam on Mar 12, 2009 at 12:00 AM UTC - 5 hrs
SOAP can be a huge PITA in Ruby if you're not dealing with a web service that falls
under the defaults. In particular, if your web service falls under
HTTPS where you need to change the default
certificate acceptance, or if you need to authenticate before seeing the
WSDL, you're SOL as far as I
can tell as of writing this post. (If you know of a way that doesn't resort to this complexity, please speak up!)
I was using Ruby 1.8.7 and soap4r 1.5.8, but this may apply to other versions.
Anyway, here are a couple of monkey patches to help get you there if you're having trouble.
If you need to change the SSL verify mode, for example, to accept a certificate unconditionally, you can use this
monkeypatch:
Hope that helps someone else avoid a days-long foray into piecing together blog posts, message boards, and
searching through source code.
And because you might get here via a search for related terms, normal access that only requires basic authentication
could be done like this, without opening existing classes:
I'm very welcoming of suggestions regarding how these things might be better accomplished. Resorting to this
messy level of monkeypatching just sucks. Let me know in the comments.
(Link is to abstract page, quote is from the PDF linked to from there, chart below is from the paper as well)
Further research needs to be carried out to see if this generalizes beyond the Windows Vista team.
Needless to say, it's not a license to write crapcode, as those metrics are still good predictors of software defects, but it's interesting to note just how important organization is to software quality.
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).
This week's answers about data structures are complementary to (indeed very intertwined with) last week's questions about algorithms.
Also like last week, I'll
wait until the end to give reference information because all of this post relies on experience, but there
are two sources where I'd start learning the information for every question.
How would you implement the structure of the London underground in a computer's memory?
Without having travelled to London or on its subway system, I'd guess a graph would be the right
data structure. The set of vertices would represent the stations, and the edges connecting them would
be the tracks.
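A minimal sketch of that graph as an adjacency list, with a breadth-first search to count stops between two stations. The TubeMap class and the single-letter station names are placeholders, not real Underground data.

```ruby
require 'set'

# Undirected graph: stations are vertices, tracks are edges.
class TubeMap
  def initialize
    @adjacent = Hash.new { |hash, station| hash[station] = Set.new }
  end

  def add_track(a, b)
    @adjacent[a] << b
    @adjacent[b] << a
  end

  def neighbors(station)
    @adjacent.fetch(station, Set.new).to_a
  end

  # Breadth-first search: fewest stops from one station to another,
  # or nil if they are not connected.
  def stops_between(from, to)
    distance = { from => 0 }
    queue = [from]
    until queue.empty?
      current = queue.shift
      return distance[current] if current == to
      neighbors(current).each do |nxt|
        next if distance.key?(nxt)
        distance[nxt] = distance[current] + 1
        queue.push(nxt)
      end
    end
    nil
  end
end

map = TubeMap.new
[%w[A B], %w[B C], %w[C D], %w[A E]].each { |a, b| map.add_track(a, b) }
puts map.stops_between("E", "D")
```

A real model would probably also record which line each edge belongs to, so you could count changes as well as stops.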
Not safe for work (language):
I don't know the proper in-memory representation of tramps.
How would you store the value of a color in a database, as efficiently as possible?
Efficiently for retrieval speed, storage speed, size? I'm guessing size. After asking why such efficiency is
needed, and assuming we're talking about a range of up to 16^6 colors (FFFFFF), I'd just store it as the smallest
integer type where it would fit.
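Assuming the color is a 24-bit RGB value (my reading of the FFFFFF range), packing it into a single integer before storing it could be sketched like this. The method names are made up.

```ruby
# Pack three 0..255 channels into one 24-bit integer: 0xRRGGBB.
def pack_rgb(red, green, blue)
  (red << 16) | (green << 8) | blue
end

# Reverse the packing by shifting and masking each byte back out.
def unpack_rgb(color)
  [(color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF]
end

orange = pack_rgb(255, 165, 0)
puts orange.to_s(16)
puts unpack_rgb(orange).inspect
```

The largest value, 0xFFFFFF, is 16,777,215, so it needs 3 bytes; in practice that usually means a 4-byte integer column unless the database offers a 3-byte type.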
What is the difference between a queue and a stack?
A queue is typically FIFO (priority queues don't quite follow that)
while a stack is LIFO. Elements get inserted at one end of a
queue and retrieved from the other, while the insertion and removal operations for a stack are done
at the same end.
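A quick sketch of the difference using a plain Ruby array for both: push and pop work on the same end (a stack), while push and shift work on opposite ends (a queue). The method names are only for illustration.

```ruby
# Stack: insert and remove at the same end, so the last item in is first out.
def lifo_order(items)
  stack = []
  items.each { |item| stack.push(item) }
  out = []
  out << stack.pop until stack.empty?
  out
end

# Queue: insert at one end, remove from the other, so the first in is first out.
def fifo_order(items)
  queue = []
  items.each { |item| queue.push(item) }
  out = []
  out << queue.shift until queue.empty?
  out
end

puts lifo_order([1, 2, 3]).inspect
puts fifo_order([1, 2, 3]).inspect
```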
What is the difference between storing data on the heap vs. on the stack?
The stack is smaller, but quicker for creating variables, while the heap is limited in size only by how much
memory can be allocated. Stack would include most compile time variables, while heap would include anything
created with malloc or new. (This is for C/C++, and not strictly the case.)
How would you store a vector in N dimensions in a datatable?
I need a little direction for this question, as I know not what it means. I encourage my readers, who have
on most occasions proven themselves more adept than me, to come through again.
What type of language do you prefer for writing complex data structures?
I can't imagine using anything higher level than C or C++. Anything more advanced has most anything already
built and not very easily molded. Or perhaps I just wouldn't think of it as complex.
What is the number 21 in binary format? And in hex?
10101 in binary and 15 in hex, and no I didn't cheat and use a calculator. It works just like decimal.
Take the following digits of an arbitrary number in base B:
UVWXYZ
The number in decimal is U*B^5 + V*B^4 + W*B^3 + X*B^2 + Y*B^1 + Z*B^0
As more digits are added, you just increase the power by which it is raised. Also note that any number raised
to the zeroth power is 1, so the Z element is just itself, and the ones digit.
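If you'd rather let the computer check the arithmetic, here is a small sketch. Integer#to_s with a base argument is built into Ruby; from_base is a name I made up, and it evaluates the same positional sum by multiplying the running total by the base before adding each new digit.

```ruby
# Convert a string of digits in the given base to an integer.
# Equivalent to summing digit * base**position, evaluated left to right.
def from_base(digits, base)
  digits.each_char.reduce(0) { |total, ch| total * base + ch.to_i(base) }
end

puts 21.to_s(2)
puts 21.to_s(16)
puts from_base("10101", 2)
puts from_base("15", 16)
```

By hand: 10101 in binary is 1*16 + 0*8 + 1*4 + 0*2 + 1*1 = 21, and 15 in hex is 1*16 + 5*1 = 21.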
What is the last thing you learned about data structures from a book, magazine or web site?
As with my answer to this question with regard to algorithms, I'm certain I've used the web for reference here,
but I'd guess my introduction and original knowledge acquisition came from a book.
However, I would add journal article to the list of answers, because in both cases that would have been my
answer, even though I read them via the web.
How would you store the results of a soccer/football competition (with teams and scores) in an XML document?
<fixtures>
  <fixture>
    <team name="Chelsea FC">
      <score>0</score>
    </team>
    <team name="Fulham FC">
      <score>1</score>
      <!-- any other stats? -->
    </team>
  </fixture>
</fixtures>
That might be reasonable.
Can you name some different text file formats for storing unicode characters?
I have to be honest here and say I don't know what you're talking about. I can't think of a file format
that wouldn't take it.
Again the reading material is similar to last week:
Posted by Sam on Mar 24, 2009 at 12:00 AM UTC - 5 hrs
Every day, psychological barriers are erected around us, and depending on what task they are a stumbling block for,
they can be helpful or a hindrance.
Ramit Sethi wrote a guest post on personal finance blog Get Rich Slowly about passive barriers that got me
thinking about passive barriers in software development, or perhaps just any easily destroyed (or erected) barrier that
prevents you from doing something useful (or stupid). One example he uses that comes up a lot in my own work
references email:
I get emails like this all the time:
"Hey Ramit, what do you think of that article I sent last week? Any suggested changes?"
My reaction? "Ugh, what is he talking about? Oh yeah, that article on savings accounts ... I have to dig that up
and reply to him. Where is that? I'll search for it later. Marks email as unread"
Note: You can yell at me for not just taking the 30 seconds to find his email right then, but that's exactly
the point: By not including the article in this followup email, he triggered a passive barrier of me
needing to think about what he was talking about, search for it, and then decide what to reply to.
The lack of the attached article is the passive barrier, and our most common response to barriers is to
do nothing.
(Bold emphasis is mine).
If I can't immediately answer an email, it gets put on hold until I have time to go through and do the research
that I need to do to give a proper reply. Sometimes, that means your email never gets answered because eventually
I look down at the receipt date and say to myself "I guess it'd be stupid to respond now." But I digress.
In everyday software development, there are a number of barriers that can help us:
Minimizing or closing the browser. When a compilation is expected to take up to a minute, or a test suite
will run for too long, or a query takes forever, there's not much work that can be done, so I might
fire up the feed reader, email, or twitter to pass the time away. The problem here is that you'll often spend
far longer on your excursion than it takes for your process to complete. If you waste just 5 minutes each time,
you've accomplished nothing - you're just skimming and certainly not getting anything out of it, and you could
have been back working on what you were trying to accomplish. In these situations, I have my email, feed reader,
and twitter minimized, and that significantly reduces the urge to open them up and start a side quest.
If you wanted to get more to the active side of barriers, you might just add the line
to your hosts file. That turns a passive barrier to time waste into a downright pain.
Having a test suite with continuous integration and code analysis tools running. At various points in a day you might be tempted to
check in code that breaks the build or introduces a bug. This is especially true at the end of the day.
However, if you have a test suite that runs on every commit, you're much more likely to run it to avoid the embarrassment of checking
in bad code. If you've got static analysis tools that also report on potentially poor code, you're less
likely to write it.
Annoyance Driven Development. This isn't one that I know how to turn on or off, but I think it would be
a great feature to have in IDEs or text editors: it gets slow when your methods or classes or files get too big.
This would be a great preventative tool, if it exists. I guess it falls back to using test suites and
code analysis to provide instant feedback that annoys you into doing the right thing.
Working with others, or having others review your code. Most of us pay more attention to quality
when we know others will be looking at the code we write. Imagine how much more of your code you'd be
proud to show off if you just knew that someone would be looking at it later.
Just as well, there are also barriers that hinder us:
Interruptions. This one is obvious, but so pervasive it should be mentioned. IM,
telephone calls, email, coworkers stopping by to chat or ask questions - they all prevent us from working
from time to time. The easy answer is to close these things, and that's what I do. They all represent
passive barriers to getting work done, and you can easily turn that around to be a passive barrier
against wasting time (see above). Pair programming is an effective technique that erects its own
barrier to these time wasters.
Rotting Design: Rigidity, Fragility, Immobility, and Viscosity. Bob Martin discusses these
in his (PDF) article on Design Principles and Design Patterns. Quoting him for the
descriptions, I'll leave it to you to read for the full story:
Rigidity is the tendency for software to be difficult to change, even in
simple ways. Every change causes a cascade of subsequent changes in dependent
modules. What begins as a simple two day change to one module grows into a multi-
week marathon of change in module after module as the engineers chase the thread of
the change through the application.
...
Closely related to rigidity is fragility. Fragility is the tendency of the
software to break in many places every time it is changed. Often the breakage occurs
in areas that have no conceptual relationship with the area that was changed. Such
errors fill the hearts of managers with foreboding. Every time they authorize a fix,
they fear that the software will break in some unexpected way.
...
Immobility is the inability to reuse software from other projects or
from parts of the same project. It often happens that one engineer will discover that he
needs a module that is similar to one that another engineer wrote. However, it also
often happens that the module in question has too much baggage that it depends upon.
After much work, the engineers discover that the work and risk required to separate
the desirable parts of the software from the undesirable parts are too great to tolerate.
And so the software is simply rewritten instead of reused.
...
Viscosity comes in two forms: viscosity of the design, and viscosity of
the environment. When faced with a change, engineers usually find more than one
way to make the change. Some of the ways preserve the design, others do not (i.e.
they are hacks.) When the design preserving methods are harder to employ than the
hacks, then the viscosity of the design is high. It is easy to do the wrong thing, but
hard to do the right thing.
The point is that poor software design makes an effective barrier to progress. There are only two ways
I know to tear down this wall: avoid the rot, and make a conscious decision to fix it when you
know there's a problem. There are plenty of ways to avoid the rot, but books are devoted to them, so
I'll leave it alone except to say a lot of the agile literature will point you in the right direction.
Unit Tests. I struggled with the idea of putting this on here or not. If you're an expert, you already know this.
If you're a novice or lazy, you'll use it as an excuse to avoid unit testing. The point remains: unit testing
can be a barrier to producing software, if you are exploring new spaces and having trouble determining
test cases for it. I'll let the Godfather of TDD, Kent Beck, explain:
... I still didn't have any software. As with any speculative idea, the chances that this
one will actually work out are slim, but without having anything running, the chances are zero. In six
or eight hours of solid programming time, I can still make significant progress. If I'd just written
some stuff and verified it by hand, I would probably have the final answer to whether my idea is
actually worth money by now. Instead, all I have is a complicated test that doesn't
work, a pile of frustration, eight fewer hours in my life, and the motivation to write another essay.
These are just a few examples, so I'm interested in hearing from you.
What barriers have you noticed that positively affect your programming? Negatively?
Posted by Sam on Mar 30, 2009 at 12:00 AM UTC - 5 hrs
A friend of mine from graduate school recently asked if she could use me as a reference on her resume.
I've worked with her on a couple of projects, and she was definitely one of the top few people I'd
worked with, so I was more than happy to say yes.
Most of the questions were straightforward and easy to answer. However, one of the potential questions
seemed way off-base: I may be asked to "review her multi-tasking ability."
Of course I want to paint her in the best possible light, and in that regard, I'm unsure how to answer such
a question. Why? To understand that, we need to ask
What's the question they're really asking?
There are two disparate pieces of knowledge they can hope to glean from my answer to that question:
Does she concentrate on a single item well enough to finish it?
In this case, they are asking the opposite of what they want to find out. The trick relies on the reviewer to give
an honest opinion, whereas most people would assume they should answer each question in the affirmative. Because
the rest of the questions seem straightforward, I'd give this potential "real question" a low
probability of being what they really want to know.
Is the candidate able to juggle multiple different projects and work effectively?
I give this one the higher probability of being the question the employer really wants the
answer to. But it's a ridiculous question. On the one hand, you already know the job candidate has
successfully completed two levels of college, so it should be clear that they can handle multiple different
projects given the appropriate resources. On the other hand, I don't think they care about the
"appropriate resources" part. I think they're setting their employees up to fail because they
don't understand that
Is "multitasking ability" just code for unable to accomplish anything because you require employees
to work on so many different projects in parallel that progress cannot be made on any of them?
What's your opinion?
Update: John G. Miller (or someone claiming to be him) is author of a book and has asserted trademark rights to a phrase originally used in this article, so I've removed it.
Posted by Sam on Apr 09, 2009 at 12:00 AM UTC - 5 hrs
This is the "I'm trying my hardest to be late to that meeting that spans lunch where they don't serve anything to tide you over"
edition of Programming Quotables.
If you don't know - I don't like to have too many microposts on this blog (I'm on twitter for that), so I save them up as I run across them, and every once in a while I'll post a few of them. The idea is to post quotes about programming that have one or more of the
following attributes:
I find funny
I find asinine
I find insightfully true
And stand on their own, with little to no comment needed
It's up to you to decide which category they fall in, if you care to. Anyway, here we go:
This is my Quality is Dead hypothesis: a pleasing level of quality for end users has become too hard to achieve while demand for it has simultaneously evaporated and penalties for not achieving it are weak.
The entropy caused by mindboggling change and innovation in computing has reached a point where it is extremely
expensive to use traditional development and testing methods to create reasonably good products and get a
reasonable return on investment. Meanwhile, user expectations of quality have been beaten out of them.
When I say quality is dead, I don't mean that it's dying, or that it's under threat.
What I mean is that we have collectively- and rationally- ceased to expect that software normally works
well, even under normal conditions. Furthermore, there is very little any one user can do about it.
I haven't figured out yet exactly how I'm going to use this, but it'll probably look like this: one new beat per day, one new track per weekend. One new app per month, one milestone on the app per week. The goal is not to establish a far-off goal and find a way to hit it, but to establish a series of tiny, immediate goals that keep you forever moving forward. Aristotle argued that virtue was mostly a matter of having good habits; Lao-Tzu tells us that the voyage of a million miles starts with a single step. So the key is to get moving and keep moving.
But in computer games, it's impossible to have an equal match. It's humans versus machines. One side has an advantage of being able to perform a billion calculations per second, and the other has the massively parallel human brain.
Any parity here is an illusion, and it's that illusion that we seek to improve and maintain via the introduction of intelligent mistakes and artificial stupidity.
The computer has to throw the game in order to make it fun. When you beat the computer, it's an illusion. The computer let you win. We just want it to let you win in a way that feels good.
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).
What kind of tools are important to you for monitoring a product during maintenance?
I rely on logs and profiling tools on occasion. I'm
really interested to hear from the rest of you about this though.
What is important when updating a product that is in production and is being used?
I'd say it's important not to interrupt service. Surely there must be something else you're
getting at?
How do you find an error in a large file with code that you cannot step through?
cout, puts, printf, System.out.print, Console.Out.WriteLine, and ## have all
been useful for me at one time or another.
A good strategy here is to isolate the code that's causing the error by removing code and faking results.
By doing that, you can slowly add code back in until the error reappears. Rewrite that part.
How can you make sure that changes in code will not affect any other parts of the product?
Regression tests!
How do you create technical documentation for your products?
Ideally I'd have comments that can be harvested by tools like RDoc or JavaDoc, but times are often
less than ideal.
How can you debug a system in a production environment, while it is being used?
You can read logs if important events are being logged.
Profiling tools exist for this purpose, but I don't have experience with any outside of those for
use with databases.
Do you know what load balancing is? Can you name different types of load balancing?
One computer acts as the gatekeeper for an array of computers and directs requests to the others
to "balance the load" of the entire system.
I'm not familiar with different types, but just guessing I'd assume they have round-robin and need-based
load balancing. I'd also presume any other scheduling algorithmic scheme could be applied in load
balancing.
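To make my two guesses concrete, here is a minimal sketch of how a gatekeeper might choose a server under each scheme. It is an illustration only, not how any particular load balancer works: the class names, the `pick` and `release` methods, and the server names are all made up for the example, and "need-based" is interpreted here as "fewest active requests."

```ruby
# Hypothetical sketch: two ways a gatekeeper might pick a backend server.
class RoundRobinBalancer
  def initialize(servers)
    @servers = servers
    @next = 0
  end

  # Hand out servers in a fixed rotation, regardless of their load.
  def pick
    server = @servers[@next]
    @next = (@next + 1) % @servers.size
    server
  end
end

class LeastConnectionsBalancer
  def initialize(servers)
    # Track how many requests each server is currently handling.
    @active = servers.to_h { |s| [s, 0] }
  end

  # Pick the server with the fewest active requests.
  def pick
    server, _count = @active.min_by { |_s, count| count }
    @active[server] += 1
    server
  end

  # Call when a request finishes so the server looks less busy.
  def release(server)
    @active[server] -= 1 if @active[server] > 0
  end
end

rr = RoundRobinBalancer.new(%w[app1 app2 app3])
rr_picks = 4.times.map { rr.pick }

lc = LeastConnectionsBalancer.new(%w[app1 app2])
first  = lc.pick
second = lc.pick
lc.release(first)   # app1 finishes its request
third  = lc.pick    # so app1 is the least busy server again
```

Round-robin needs no feedback from the servers, which is why it is the easy default. The second scheme needs to hear when requests finish, which is the price of actually balancing by load.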
I'm more interested to know why this is on the maintenance list of questions. Is it because you've deployed
your application and now you need to scale it with hardware?
Can you name reasons why maintenance of software is the biggest/most expensive part of an application's life cycle?
One view is that after you write the first line of code, you begin maintenance. But more in line with the
popular view: it lasts the longest. You may take a month to build a system that will be in production over
several years. During that time, defects are found that need to be fixed, business rules may change, or
new features may be added.
Also, we suck at writing software.
What is the difference between re-engineering and reverse engineering?
I didn't know this one. I thought and would have responded that re-engineering would be
rebuilding an application with a white box, while reverse engineering would be done through a black box.
The reengineering of software was described by Chikofsky and Cross in their 1990 paper, as "The examination and alteration of a system to reconstitute it in a new form". Less formally, reengineering is the modification of a software system that takes place after it has been reverse engineered, generally to add new functionality, or to correct errors.
This entire process is often erroneously referred to as reverse engineering; however, it is more accurate to say that reverse engineering is the initial examination of the system, and reengineering is the subsequent modification.
How would you answer these questions about software maintenance?
Domain code in controllers and views isn't a problem that's limited to Rails, of course. It's a problem everywhere, and one you generally need to remain vigilant about. Rails doesn't make it easy by making it easy - it's much too easy to do the wrong thing.
You've got the view open and think, "I need to get a list of Widgets."
That was easy, and it's even easier in controllers where you don't have the hassle of angled brackets and percent signs. Worse yet, since you've got what you need right there, it's easy to add more logic around it. Before you know it, your views and controllers are cluttered with a bunch of crap that shouldn't be there.
I fall into the trap more often than I'd like to admit. And I know it's wrong. What of those who haven't a clue?
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).
Browsing through the questions, I'm not confident here of my ability to answer without asking
some preliminary questions (which I have no one to answer), so please chime in if you have something to add.
Do you know what a baseline is in configuration management? How do you freeze an important moment in a project?
A baseline in configuration management is the same as a baseline anywhere else - some place which serves as
a starting point or point we can return to to compare one point in time to another.
Freezing an important moment in a project to me sounds like creating a new major or minor version - so I may make
a new tag in the version control system.
Which items do you normally place under version control?
I can't think of anything I'd not put under version control aside from user-specific project settings
and files created while executing the software (like logs, for instance) that are always likely to cause
change conflicts between different developers.
How can you make sure that team members know who changed what in a software project?
Give everyone separate accounts to access the VCS.
Some people like to keep a change log at the top of each file, but I think that gets ignored and becomes
useless.
Do you know the differences between tags and branches? When do you use which?
A tag is for a static snapshot, a branch is intended for development outside the trunk.
How would you manage changes to technical documentation, like the architecture of a product?
I'd put them in VCS along with the rest of the project. I have to say though - I'm not sure I understand what
this question is really asking.
Which tools do you need to manage the state of all digital information in a project? Which tools do you like best?
This is another question which I'd need prodding to give a more useful answer. My favorite would be a VCS, but
what else exists? Are we talking about using Wikis and bug/issue trackers? Are we talking about using Excel to
create spreadsheets to manage burndown charts and make predictions as to project completion dates? Are we
talking about using Word to manage user documentation?
I'm unsure.
How do you deal with changes that a customer wants in a released product?
I mainly deal in web development, so in that case we just make the change and re-deploy the web application.
In desktop applications, I've only ever released projects to customers who will be using it on so few machines
that coming up with an automatic update strategy would have been a waste of resources.
If it was the right project, I'd have it do a call to a central server that looks for updates, and perform the updates automatically.
In that case, we just make the updates and set a flag when a customer wants a change to a released product.
Are there differences in managing versions and releases?
As with so many things in software development, the answer depends on how you view versions and releases.
If you view a release as the deliverable to customers and a version as a concept to get you there, then
the answer is undeniably yes.
If you view versions and releases as the same things, then perhaps not so much.
Do you have multiple versions between releases? There are many questions to ask here, and I certainly don't
have all the answers. Personally, I see them as a combination of the two views I mentioned above:
the version is the version of the code that we tag at certain points in time. The release happens in
tandem, but it is the deliverable we give to clients - so they differ in the way we manage them, because
it's not simply an issue of "commit->version->release." I'd love to get there, but I'm not close yet.
What is the difference between managing changes in text files vs. managing changes in binary files?
diff is much less useful in highlighting differences in binary files than it is in text files (at least
as far as the person running the diff is concerned). It is much harder to manage conflicting changes in binary
files as a result.
How would you treat simultaneous development of multiple RfC's or increments and maintenance issues?
I fear I don't understand what this question is asking. RfC == Request for Comments? What does that
have to do with increments and maintenance issues? Any advice as to what you think this question means
is truly appreciated.
Reading through and answering these questions has made one thing very clear to me: I'm stuck in the middle
of the forest and I'm only seeing the trees. I don't have a strategy when it comes to configuration management
and version control. I use it, because I know I should, and it has some benefits. However, focusing
only on the tactical side means I'm not getting as much out of it as I could.
How would you answer these questions about configuration management?
(A disclaimer for you: I read this book as a reviewer and haven't yet made the time to go through the finished
product, so some of what I'm about to say may change. That said, I can only imagine that it got better before
going to publication, so I don't expect anyone would be disappointed.)
The Passionate Programmer retains that status of being a must-read. It adds a few new chapters and
removes a couple of others, but more importantly it changes the framing from the negative view of "save your job"
to what My Job Went to India was always really about anyway: "creating a remarkable career in software
development."
Here's what I had to say about it for the blurb:
Six short months before I read Chad's book, I was on the verge of
changing careers. Through a series of accidents from November to
May, I decided not only to stick with software development but to be
passionate about it while striving to be great. With a healthy dose of
inspiration, the book you're now holding served as a road map for
achieving those goals.
It truly is an excellent map that helped me find my way from Quit Town to making the decision to be
passionate about hacking and life in general, starting this blog, and striving to leave the realm
of the unclean masses in our profession whose exploits we read about so often.
If you read MJWTI and understood the positive aspects of it, this book isn't that important since
you know most of it already.
I'd have purchased it anyway, but you may feel differently. That's Okay.
However, if you felt you'd be embarrassed if someone saw you holding the first version - or just
haven't read it before - I strongly recommend picking up a copy of this version and
going through it. Don't just read it though - apply it. At the end of every chunk of advice there is a list
of activities that you can perform. Don't just gloss over them; make it a point to actually do some of them.
It's short enough to read through in one or two sittings. But there's enough content in there to keep you busy for
a couple of years.
If you've read this book or the 1st edition, what did you think about it? Am I overenthusiastic?
I look forward to covering the new chapters as time allows over the next few weeks. I hope you'll join me in
the discussion.
Posted by Sam on May 14, 2009 at 12:00 AM UTC - 5 hrs
Many people see spectacular plays from athletes and think that the great ones are the ones making those plays.
I have another theory: It's the lesser players who make the "great" plays, because their ability doesn't take them
far enough to make it look easy. On top of it all, you could say guys who make fewer mistakes just
aren't fast enough to have been in a position to make the play at all.
In the case of sport, one might also make that argument against the lesser players in favor of the ones who
regularly make the highlight reel: their greatness lets them get just a tad closer, which allows them to make the play.
In the case of software development, that case is not so easily made.
When developers have to stay up all night and code like zombies on a project that may very well be on
a death march, you've got a problem, and it's not solely that your project might fail. Even when that super heroic
effort saves the project, you've still got at least three issues to consider:
Was the business side too eager to get the project out the door?
Are the developers so poor at estimating that it led to near-failure?
Is there a failure of communication between the two sides?
In saving the project, the spectacular effort and performance of your team or individuals on your team is
not something to be marveled at - it's a failure whose cause needs to be identified and corrected.
Handing out bonuses is a nice way to show appreciation for their heroic efforts, but it encourages poor
practices by providing disincentives for doing the right thing:
No incentive to make good estimates.
Incentive to give in to distractions since they "can always just stay late"
No reason not to have a foggy head half the day
A motive for waiting until the last minute, just to show off their prowess
Handing out bonuses to the individuals who displayed the most heroism brings friction and
resentment from
those who opted to sleep (especially among those who realize half the work was created by the
heroes!).
Yet, having only part of the team on board with the near-death march causes the same resentment from the
sleepless hackers.
Rewards encourage repetition of the behavior that led to the prize. When you do that, you're putting
future projects in peril.
There are plenty of ways to reduce the risk and uncertainty of project delivery - and substantially fewer
tend to work when you wait until the last week of a project - but those methods are the subjects of other stories.
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).
How many of the three variables scope, time and cost can be fixed by the customer?
Two. (See The 'Broken Iron Triangle' for a good
discussion.)
Who should make estimates for the effort of a project? Who is allowed to set the deadline?
The team tasked with implementing the project should make the estimates. The deadline can be set by
the customer if they forego choosing the cost or scope. There are cases where the team should set the deadline.
One of these is if they are working concurrently on many projects, the team can give the deadline to management,
with the knowledge that priorities on other projects can be rearranged if the deadline for the new project
needs to be more aggressive than the team has time to work on it.
Otherwise, I imagine management is
free to set it according to organizational priorities.
Do you prefer minimization of the number of releases or minimization of the amount of work-in-progress?
I generally prefer to minimize the amount of work on the table, as it can be distracting to
Which kind of diagrams do you use to track progress in a project?
I've tended to return to the burndown chart time after time. Big visible charts
has some discussion of different charts that can be used to measure various metrics of your project.
What is the difference between an iteration and an increment?
Basically, an iteration is a unit of work and an increment is a unit of product delivered.
Can you explain the practice of risk management? How should risks be managed?
I don't know anything about risk management formally, but I prefer to deal with higher risk items first
when possible.
Do you prefer a work breakdown structure or a rolling wave planning?
I have to be honest and say I don't know what you're talking about. Based on the names, my guess would be
that "work breakdown structure" analyzes what needs to be done and breaks it into chunks to be delivered
in a specific order, whereas rolling wave may be more like do one thing and then another, going with the flow.
In any case, I would think like most everyone else that I'd prefer the work breakdown structure, but it's
unrealistic in most projects (repetitive projects could use it very successfully, for instance). Therefore,
I'll take the rolling wave over WBS please.
What do you need to be able to determine if a project is on time and within budget?
Just the burndown chart, if it's been created out of truthful data.
Can you name some differences between DSDM, Prince2 and Scrum?
I'm not at all familiar with Prince2, so I can't talk intelligently about it. DSDM is similar to Scrum in that
both stress active communication with and involvement of the customer, as well as iterative and incremental
development. I'm not well versed in DSDM, but from what little I've heard, it sounds a bit more prescriptive than
Scrum.
I'd suggest reading the Wikipedia articles to get a broad overview of these subjects - they are decent starters.
It would be nice if there were a book that compared and contrasted different software development
methodologies, but in the absence of such a book, I guess you have to read one for each.
How do you agree on scope and time with the customer, when the customer wants too much?
Are they willing to pay for it? If they get too ridiculous, I'd just have to tell them that I can't do what they're asking
for and be able to pay my developers. Hopefully, there would be some convincing that worked before it came to
that point, since we don't want to risk losing customers. However, I must admit that I don't have any strategies
for this. I'd love to hear them, if you have some.
There are a couple of stories you can tell:
about 9 women having one baby in just one month. (Fred Brooks)
about your friend with an interesting first date philosophy (Venkat Subramaniam)
How would you answer these questions about project management?
Posted by Sam on May 21, 2009 at 12:00 AM UTC - 5 hrs
For the last few months, I've been having trouble getting out of "next week" mode.
That's what I call it when I don't know what I'll be working on outside of the next week at any given time.
It's not necessarily a bad thing, but when you're working on projects that take longer than a couple of weeks,
it doesn't let you keep the end in sight. Instead, you're tunneling through the dirt and hoping you've been digging
up instead of down.
I've delivered most projects during this period on schedule, but I did cave in to pressure to
over-promise and under-deliver on one occasion. And it sucked.
When I wrote that rewarding heroic development promotes bad behavior, I said reducing the risk and uncertainty of project delivery
is the subject of a different story, and the discussion in the comments
got me thinking about this. There are many stories worth telling regarding this issue.
The rest of this story is about how I'm intending to get out of my funk using
techniques that have worked for me in the past.
(Aside: As I write the words below, it occurs to me we have a chicken/egg problem of which comes first.
Just start somewhere.)
To make decent estimates there are 3 must-haves:
Historical data as to how much you can complete in a given time frame
Backlog of items you need to complete in the time frame you're wanting to estimate for
The ability to break requests into sweet, chunky, chewy, bite-sized morsels of estimable goodness.
Since you haven't been doing this ["ever", "in a while"][rand*2], you don't have historical data. Your backlog
is anything on your table that hasn't been completed yet - so you've got that. Now, you need to break your
backlog apart into small enough bits to estimate accurately. This way, you practice the third item and
in a couple of weeks, you'll have historical data.
About estimating tasks:
Don't worry about estimating in units of time. You're probably not good at it. Most of us aren't, and you haven't
even given it a fair shot with some data to back up your claims. Measure in points or tomatoes. Provide your estimate
in chocolate chips. The unit of measurement doesn't matter at this point, so pick something that
makes you happy. However, you should stay away from units of time at this point in the exercise. You're not
good at that, remember?
So I have some number of tasks that need to be completed. I write each of them down, and decide how many chocolate
chips it's going to take me to finish each one. I count in Fibonacci numbers instead of counting numbers, because
as tasks grow in time and complexity, estimates grow in uncertainty. I try to keep all of my tasks as 1, 2, or 3
chocolate chips. Sometimes I'll get up to a 5.
But if you start venturing into 8 and 13 or more, you're basically saying
IDFK anyway, so you might as well be honest and bring that out into the open. Such tasks are more like
Chewbaccas than chocolate chips, so take some time to think about
how you might break them down as far as possible.
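Since this is a programming blog, here is a rough sketch of that sizing rule in Ruby. Everything in it is illustrative: the task names are invented, `to_chips` and `triage` are hypothetical helpers, and the cutoff of 8 simply encodes my "if you reach 8 or 13, break it down" habit.

```ruby
# Allowed estimate sizes, in chocolate chips (Fibonacci numbers).
CHIP_SIZES = [1, 2, 3, 5, 8, 13].freeze
# Anything at or above this is a "Chewbacca" and should be broken down.
TOO_BIG = 8

# Round a raw gut-feel number up to the next allowed size.
def to_chips(raw)
  CHIP_SIZES.find { |size| size >= raw } || CHIP_SIZES.last
end

# Split a task list into estimable tasks and ones needing a breakdown.
def triage(tasks)
  tasks.partition { |_name, chips| chips < TOO_BIG }
end

# Hypothetical backlog for illustration.
tasks = {
  "fix login redirect" => to_chips(1),
  "add CSV export"     => to_chips(4),   # rounds up to 5
  "rewrite billing"    => to_chips(10)   # rounds up to 13
}
ready, chewbaccas = triage(tasks)
```

Rounding up rather than to the nearest size is a deliberate choice in this sketch: the gaps between sizes are where the uncertainty lives, so the estimate should absorb it.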
Now that you know how to estimate tasks:
Before you start on a task -- with a preference to earlier rather than later (hopefully as soon as you know it needs to be done) --
estimate how many points it should take you, then write it down on your list of items to complete. Take note
of how many chocolate chips you finish daily. Write down the number completed and the date.
Make a graph over time comparing the number of chocolate chips you have remaining (or how many you've completed)
on the Y-axis and the date that applied to. If you use points remaining, it's a Burn Down chart. If you
go the other way, it's not surprisingly called a Burn Up chart.
Keep a log of the number of chips you complete per week. The last two or three weeks' averages are a good indication
of how many you'll be able to do for the next few weeks, and helps planning for
individuals spanning several projects, or teams on a single project.
You can now reference your chips per week to extrapolate how long it's likely to take you to finish a particular
task or small project.
Further, you'll always want to know how many points you've got in your backlog and how many you need to
complete by a given date. If you keep a log of due dates you can reference it and your points per week
when someone asks you when you can have something done. Now, you can say "I can start on the 26th or you
can rearrange the priorities on my current work and I can be done by the end of the day."
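The bookkeeping above is small enough to sketch in a few lines of Ruby. Treat this as one possible way to do the arithmetic, not a tool recommendation: the dates and chip counts are invented, and `burn_down`, `velocity`, and `weeks_to_finish` are hypothetical helper names. It assumes you have logged at least one week.

```ruby
require 'date'

# Hypothetical daily log: date => chips completed that day.
daily_log = {
  Date.new(2009, 5, 4) => 3,
  Date.new(2009, 5, 5) => 2,
  Date.new(2009, 5, 6) => 0,
  Date.new(2009, 5, 7) => 5
}

# Burn Down series: chips remaining after each logged day.
def burn_down(total_chips, daily_log)
  remaining = total_chips
  daily_log.sort.map do |date, done|
    remaining -= done
    [date, remaining]
  end
end

# Average of the last few weeks' completed chips.
def velocity(weekly_totals, weeks = 3)
  recent = weekly_totals.last(weeks)
  recent.sum.to_f / recent.size
end

# Whole weeks needed to finish the backlog at the current velocity.
def weeks_to_finish(backlog_chips, weekly_totals)
  (backlog_chips / velocity(weekly_totals)).ceil
end

series = burn_down(30, daily_log)
weekly_totals = [8, 12, 10, 11]             # hypothetical chips per week
pace  = velocity(weekly_totals)             # averages the last three weeks
weeks = weeks_to_finish(30, weekly_totals)  # rounds up to whole weeks
```

Plot `series` with the date on the X-axis and you have the Burn Down chart. The `weeks` figure is what lets you answer "when can you have it done?" with something better than a guess.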
Any questions? As always, I'm happy to answer them.
The majority of these ideas are scrum-thought and I've used terms from that methodology, so if you want to
go deeper, that would be a good place to look.
I get the impression most people think
you get better by imitating masters.
It's a common theme in self improvement.
Aspiring writers read great authors. Aspiring musicians listen to great musicians. Artists study artists and coders
study coders.
I've certainly espoused that point of view. I'm fond of quoting Ron Jeffries as saying,
My advice is to do it by the book, get good at the practices, then do as you will. Many people want to skip to step three. How do they know?
In fact, I think that's the third time I've done so in almost as many years.
But what if that's not the primary benefit of reading other people's code? I don't mean scanning it - I mean reading it
until you understand exactly what it's doing. Is there something else you can get out of it?
I think so. Perhaps it's not the mimicking of a particular style like a monkey that makes us better
for reading code. What if it's tracing through an unfamiliar thought process that flexes the brain
and makes it think in ways it previously had not?
By reading unfamiliar code and forcing yourself to trace through until you understand it, you end up
thinking in ways that were previously foreign to you.
I think that's where the real value in reading code exists.
What are your thoughts?
Posted by Sam on Jun 09, 2009 at 12:00 AM UTC - 5 hrs
From time to time I like to actually post a bit of code on this programming blog, so here's
a stream-of-consciousness (as in "not a lot of thought went into design quality") example that shows how to:
Open Excel, making it invisible (or visible) to the user.
Create a workbook and access individual worksheets
Add data to a cell, or retrieve data from a cell
Add a chart to a worksheet, with constants for various chart types
Save as Excel 97-2003 format and close Excel
If you know where I can find the constants for file type numbers, that would be appreciated. Calling SaveAs
without the type seems to use whatever version of Excel you are running, but I'd like to find how to save as
CSV or other formats.
Needless to say, this requires Excel be on the computer that's running the code.
require 'win32ole'

xl = WIN32OLE.new("Excel.Application")
puts "Excel failed to start" unless xl
xl.Visible = false
workbook = xl.Workbooks.Add
sheet = workbook.Worksheets(1)

# create some fake data
data_a = []
(1..10).each { |i| data_a.push i }
data_b = []
(1..10).each { |i| data_b.push((rand * 100).to_i) }

# fill the worksheet with the fake data
# showing 3 ways to populate cells with values
(1..10).each do |i|
  sheet.Range("A#{i}").Select
  xl.ActiveCell.Formula = data_a[i-1]
  sheet.Range("B#{i}").Formula = data_b[i-1]
  cell = sheet.Range("C#{i}")
  cell.Formula = "=A#{i} - B#{i}"
end

# chart type constants (via http://support.microsoft.com/kb/147803)
xlArea = 1
xlBar = 2
xlColumn = 3
xlLine = 4
xlPie = 5
xlRadar = -4151
xlXYScatter = -4169
xlCombination = -4111
xl3DArea = -4098
xl3DBar = -4099
xl3DColumn = -4100
xl3DLine = -4101
xl3DPie = -4102
xl3DSurface = -4103
xlDoughnut = -4120

# creating a chart
chart_object = sheet.ChartObjects.Add(10, 80, 500, 250)
chart = chart_object.Chart
chart_range = sheet.Range("A1", "B10")
chart.SetSourceData(chart_range, nil)
chart.ChartType = xlXYScatter

# get the value from a cell
val = sheet.Range("C1").Value
puts val

# saving as pre-2007 format
excel97_2003_format = -4143
# otherwise, it sticks it in default save directory - C:\Users\Sam\Documents on my system
pwd = Dir.pwd.gsub('/','\\') << '\\'
workbook.SaveAs("#{pwd}whatever.xls", excel97_2003_format)
xl.Quit
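On the question of file type numbers: as far as I can tell they live in Excel's XlFileFormat enumeration, where the -4143 used above is xlWorkbookNormal and CSV is 6. The sketch below collects a few of those values and the path fix-up in one place. The module name, the constant names (Ruby constants can't start with a lowercase "xl"), and the `windows_path` helper are my own inventions for illustration, and the values are worth double-checking against Microsoft's reference for your Excel version.

```ruby
# A few values from Excel's XlFileFormat enumeration (names adapted for Ruby).
module XlFileFormat
  XL_CSV             = 6      # comma-separated values
  XL_WORKBOOK_NORMAL = -4143  # Excel 97-2003 (.xls), as used above
  XL_OPEN_XML        = 51     # Excel 2007+ workbook (.xlsx)
end

# Build an absolute Windows-style path so SaveAs does not fall back
# to Excel's default save directory.
def windows_path(dir, filename)
  "#{dir.gsub('/') { '\\' }}\\#{filename}"
end

csv_path = windows_path("C:/Users/Sam/Documents", "whatever.csv")

# With a live workbook from the script above, saving as CSV would then be:
#   workbook.SaveAs(csv_path, XlFileFormat::XL_CSV)
```

The win32ole library also has `WIN32OLE.const_load`, which can pull Excel's own constants into a Ruby module at runtime, so you may not need to hard-code the numbers at all. I haven't relied on it here because it needs a running Excel instance.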
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).
Last week's answers on Functional Design
had me feeling that way. Luckily, this week we come to technical design - a topic I feel quite a bit stronger on.
What do low coupling and high cohesion mean? What does the principle of encapsulation mean?
Coupling refers to how strongly or loosely components in a system are tied together. You want that to be
low. Cohesion refers to how well the individual parts of a unit of code fit together for a single purpose.
Encapsulation is about containing implementation of code so that outsiders don't need to know how it works
on the inside. By doing so you can reduce negative effects of coupling.
Reading: Robert C. Martin's SOLID
principles of OOD, which have been linked on this blog since day 1. His book, Agile Software Development: Principles, Patterns, and Practices
is another great resource for this topic. It's short and to the point, and I highly recommend it.
How do you manage conflicts in a web application when different people are editing the same data?
Set a flag when someone starts editing data unit A. If someone else loads it, let them know it's being edited
and that it's currently in read only mode. If the race was too fast, you can also have a check on the
commit side to let them know their changes conflict with another user, present them the data, and then let
them figure out how to merge the changes. This has rarely been a problem in my experience, but it could be,
and that's how I'd deal with it if the requirement came up. (If the changes don't conflict, you could
simply keep the user unaware as well.)
My answer comes from the things you see in normal usage of shared files or just about any shared resource,
for that matter. Originally,
it comes down to race conditions, so you might
be able to extrapolate some useful information from that low-level explanation.
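As a rough illustration of the commit-side check described above, here is a minimal sketch in Ruby. Everything in it is an assumption made for the example: the in-memory RecordStore, the version counter, and the StaleRecordError name are hypothetical, and a real web application would do the same comparison against a version column in its database.

```ruby
# Hypothetical in-memory store illustrating a commit-side conflict check.
class StaleRecordError < StandardError; end

class RecordStore
  Record = Struct.new(:data, :version)

  def initialize
    @records = {}
  end

  def create(id, data)
    @records[id] = Record.new(data, 1)
  end

  # Returns a copy of the data together with the version the reader saw.
  def load(id)
    record = @records.fetch(id)
    [record.data.dup, record.version]
  end

  # Saves only if nobody else committed since the caller loaded the record.
  def save(id, data, seen_version)
    record = @records.fetch(id)
    if record.version != seen_version
      raise StaleRecordError, "record #{id} changed since version #{seen_version}"
    end
    record.data = data
    record.version += 1
  end
end

store = RecordStore.new
store.create(:a, { title: "draft" })

# Two users load the same record before either one saves.
alice_data, alice_version = store.load(:a)
bob_data, bob_version = store.load(:a)

store.save(:a, alice_data.merge(title: "Alice's edit"), alice_version)

conflict_detected = false
begin
  store.save(:a, bob_data.merge(title: "Bob's edit"), bob_version)
rescue StaleRecordError
  # This is where you would show Bob the current data and let him merge.
  conflict_detected = true
end
```

The "being edited" flag from the first half of the answer would sit in front of this as a courtesy. The version check is what catches the race when the flag is too slow.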
Do you know about design patterns? Which design patterns have you used, and in what situations?
I know about design patterns. Most of the ones I'm familiar with, at least in the canonical book,
aren't of much daily use to me, as I tend to work in dynamic languages, where the sorts of flaws that precipitate
the patterns (as implemented in the book) just aren't factors as often as in other languages. (Yes, some of the
book is implemented in Smalltalk. I can implement them with as much superfluous junk as you desire in
any language - that doesn't make it a necessity.)
I suppose most frequently I've used the Strategy pattern.
(Perhaps the fact that I've focused so much on one in particular is a weakness in my coding style?) The situations
are when an interface should remain the same while the implementation should differ somewhat. I don't have a
concrete example on the top of my head.
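For illustration only, a made-up example of the shape described above might look like this in Ruby. The Report class and the two formatters are hypothetical names invented for the sketch, and in a dynamic language the "interface" is nothing more than both formatters responding to the same method.

```ruby
# Hypothetical example: Report's interface stays the same while the
# formatting strategy handed to it varies.
class PlainFormatter
  def format(title, lines)
    ([title] + lines).join("\n")
  end
end

class BulletFormatter
  def format(title, lines)
    ([title.upcase] + lines.map { |line| "* #{line}" }).join("\n")
  end
end

class Report
  def initialize(title, lines, formatter)
    @title = title
    @lines = lines
    @formatter = formatter
  end

  # Callers always use render; how the text is laid out is the strategy's job.
  def render
    @formatter.format(@title, @lines)
  end
end

plain = Report.new("Status", ["all good"], PlainFormatter.new).render
bullets = Report.new("Status", ["all good"], BulletFormatter.new).render
```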
If I were to start working in Java again, or building larger applications in .NET (I currently build very small
apps in that space as part of my job), I'd re-read the book. I might even scan the inner cover daily just as a
refresher.
I wouldn't say I'm strong on design patterns, but I've got reference information and know where to look
should I need to, along with the facilities to become strong should my situation call for it.
Do you know what a stateless business layer is? Where do long-running transactions fit into that picture?
I hadn't heard it as a single term until now, but knowing the individual terms lets me say that objects in
the business layer (or domain model) are transient - or that their state is not preserved in memory between
subsequent requests for the same object.
This may not bode well for long-running transactions, as state presumably must be set up each time an
object is loaded, along with any process that might be required for tear-down.
For reading, this is just information as I've come across it throughout my various readings, so I don't
know what to recommend.
What kinds of diagrams have you used in designing parts of an architecture, or a technical design?
UML, or some bastardization of it has always been enough.
Most likely the bastardized part where we just do a little design on paper or a whiteboard to gain a
better understanding of the intended design through some sketch-work.
I've never been a part of a team that practices BDUF, nor
have I felt the need for it in any personal projects, so I'm light on recommendations for reading.
The Wikipedia article on UML is
sufficient for my tastes, but I've known people who dove into Martin Fowler's books
and came away more knowledgeable, so that may help you.
Can you name the different tiers and responsibilities in an N-tier architecture?
For what value of N? (I mean, we could have N=1000000 and I wouldn't know -- or if I did know, we might
be here all day.) Normally N=3, so we might be talking about presentation, logic, and data tiers. Sometimes
we might talk about Entities and others, or we might be considering (mistakenly?) MVC.
I think the responsibilities are clear by their names, but if you'd like to discuss further, I'm
certainly okay with doing so.
Can you name different measures to guarantee correctness and robustness of data in an architecture?
I need a little direction here. It seems to me this is a product of many things, and I don't know where to start.
For instance, we could say that unit tests and integration tests can go part of the way there. We could
talk about validating user input, and that it matches some definition of "looking correct." We could
have checks coded and in place between the various systems that make up the architecture. Constraints on the
database. I could go on if I were giving myself more time to think about it.
Because of the open-endedness in this question, there are any number of references. I'd dive into
automated testing in its various forms, which when applied to the situation, should get you most of the
way there.
Can you name any differences between object-oriented design and component-based design?
To be honest, this is the first I've heard of component-based design, so no, I can't name the differences.
My thoughts would go towards having objects to design around (as in C++) vs. not having objects to
design around (as in C).
As it happens, there may be a reason the term "component-based design" seems new to me: IEEE held the
"1st ... workshop" on it not 6 months ago. They could very well be behind the times.
Searching with Google also indicates this may be designing from the view of the outside,
as in SOA.
I think the SOLID principles I mentioned above go beyond the availability of objects-proper, so I don't expect
to be surprised here. However, I can't offer you any reading advice and without a definitive source
from the Google results, I cannot even tell if I'm in the right ballpark.
Your thoughts are especially encouraged on this topic.
How would you model user authorization, user profiles and permissions in a database?
I wouldn't typically model the authorization piece in the DB. If I read you correctly, I'm
guessing you mean the storage of authorization information in the database, as opposed to the
act of authorizing. Under that assumption, I've modeled this situation in just about every
way I can imagine. A couple of scenarios:
a. Under a denormalized scenario, I'd keep a table of permissions and a table of users (which includes authorization
information, profile information, and a list of permissions from the permissions table). This isn't ideal if permissions
ever change, and especially not if you're returning a ton of users for the purpose of authorization while the
profile information is especially large. In that case you're transferring way more data than you need, and it could
result in performance problems. (The extra data transfer may only be a problem with ORM tools, as you
could always hand-write the queries to return only what you need.
On the other hand, storing of redundant data is a problem if storage space itself is an issue.)
b. Under a completely normalized scenario, we'd have a table of permissions, a table relating users
to permissions, and a table for users. For the sake of cohesion (and potentially optimizing data transfer)
we might separate the users table into one for authentication and another for profile, while keeping the
relationship with permissions based on user_auth.
c. Some variation in between the two extremes.
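To make scenario (b) a bit more concrete, here is a small sketch that stands in for the tables with plain Ruby arrays of hashes. The column names and sample rows are assumptions invented for the example, and the lookup method plays the role a join would play in SQL.

```ruby
# Hypothetical rows standing in for the normalized tables in scenario (b).
USER_AUTH = [
  { id: 1, login: "sam", password_hash: "<hash>" },
  { id: 2, login: "pat", password_hash: "<hash>" }
]
USER_PROFILES = [
  { user_auth_id: 1, display_name: "Sam", bio: "..." },
  { user_auth_id: 2, display_name: "Pat", bio: "..." }
]
PERMISSIONS = [
  { id: 10, name: "edit_posts" },
  { id: 11, name: "delete_posts" }
]
# Join table relating users to permissions, keyed on user_auth.
USER_PERMISSIONS = [
  { user_auth_id: 1, permission_id: 10 },
  { user_auth_id: 1, permission_id: 11 },
  { user_auth_id: 2, permission_id: 10 }
]

# The permission lookup only touches the small tables, never the profile rows.
def permitted?(login, permission_name)
  user = USER_AUTH.find { |row| row[:login] == login }
  permission = PERMISSIONS.find { |row| row[:name] == permission_name }
  return false unless user && permission

  USER_PERMISSIONS.any? do |row|
    row[:user_auth_id] == user[:id] && row[:permission_id] == permission[:id]
  end
end
```

Renaming a permission here means changing one row in PERMISSIONS, which is the point of normalizing in the first place.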
For reading? For me it's based on experience, and perhaps a couple of database courses in college. I guess
just about any book on database design would do. I wouldn't bother trying to understand the formal
academic descriptions of database normalization,
but if you want to, you can only be better for it (as long as you can recognize the tradeoffs due to extra
joins!) Reader suggestions are highly welcome, as always.
How would you model the animal kingdom (with species and their behavior) as a class system?
This one might deserve a blog post all on its own. It depends: If I'm working in a language with
multiple inheritance, I'd use a combination of class hierarchy that follows the animal kingdom along
with mixins (which are also inheritance, but with less of a hierarchical attitude) for behavior shared
between and among the hierarchy levels. Without multiple inheritance, I'd have to resort to
interfaces where available, and composition for actual code reuse where it made sense.
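A tiny sketch of the hierarchy-plus-mixins idea in Ruby, assuming a drastically simplified "kingdom" (the classes and behaviors are chosen only for illustration):

```ruby
# Behaviors shared across unrelated branches of the hierarchy live in mixins.
module Flying
  def move
    "#{name} flies"
  end
end

module Swimming
  def move
    "#{name} swims"
  end
end

# The class hierarchy follows the (heavily simplified) taxonomy.
class Animal
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def move
    "#{name} walks"
  end
end

class Mammal < Animal; end
class Bird < Animal; end

# A flying mammal and a swimming bird: behavior cuts across the hierarchy.
class Bat < Mammal
  include Flying
end

class Penguin < Bird
  include Swimming
end

class Dog < Mammal; end
```

Bat and Penguin pick up their movement from a mixin rather than from their place in the taxonomy, while Dog falls back to the default it inherits from Animal.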
The short answer though, is that I probably wouldn't implement it as a class system. If I really was working
with taxonomy and biological classification,
I don't think I'd model the real world with objects. I'd need to look into the subject quite a bit further
to tell you how I would do it, but suffice to say I don't think it'd be using objects to match
it one-for-one, or even something resembling one-to-one.
Reading: I wouldn't know where to begin. The SOLID principles will guide you, but I wouldn't think that's all
there is to it.
What do you think? Where would your answers differ?
Posted by Sam on Jun 17, 2009 at 12:00 AM UTC - 5 hrs
low cou-pling and high co-he-sion n.
A standard bit of advice for people who are learning to design their code better, who want to
write software with intention as opposed to coincidence, often parroted by the advisor
with no attempt to explain the meaning.
Motivation
It's a great scam, don't you think? Someone asks a question about how to design their code,
and we have these two nebulous words to throw back at them: coupling and cohesion.
We even memorize a couple of adjectives that go with the words: low and high.
Cohesion Good. Coupling, Baaaaad!
It's great because it shuts up the newbie who asks the question -- he doesn't want to appear dumb, after all --
and it gets all of those in-the-know to nod their heads in approval. "Yep, that's right. He's got it. +1."
But no one benefits from the exchange. The newbie is still frustrated, while the professional doesn't
give a second thought to the fact that he probably doesn't know what he means. He's just parroting
back the advice that someone gave to him. It's not malicious or even conscious, but nobody is getting smarter
as a result of the practice.
Maybe we think the words are intuitive enough. Coupling means that something is depending on something else, multiple
things are tied together. Cohesion means ... well, maybe the person asking the question heard something about
it in high school chemistry and can recall it has something to do with sticking together.
Maybe they don't know at all.
Maybe, if they're motivated enough (and not that we've done anything to help in that department), they'll look it
up:
Coincidental cohesion (worst)
is when parts of a module are grouped arbitrarily (at random); the parts have no significant relationship (e.g. a module of frequently used functions).
Logical cohesion
is when parts of a module are grouped because they logically are categorised to do the same thing, even if they are different by nature (e.g. grouping all I/O handling routines).
Temporal cohesion
is when parts of a module are grouped by when they are processed - the parts are processed at a particular time in program execution (e.g. a function which is called after catching an exception which closes open files, creates an error log, and notifies the user).
Procedural cohesion
is when parts of a module are grouped because they always follow a certain sequence of execution (e.g. a function which checks file permissions and then opens the file).
Communicational cohesion
is when parts of a module are grouped because they operate on the same data (e.g. a module which operates on the same record of information).
Sequential cohesion
is when parts of a module are grouped because the output from one part is the input to another part like an assembly line (e.g. a function which reads data from a file and processes the data).
Functional cohesion (best)
is when parts of a module are grouped because they all contribute to a single well-defined task of the module
Content coupling (high)
is when one module modifies or relies on the internal workings of another module (e.g. accessing local data of another module).
Therefore changing the way the second module produces data (location, type, timing) will lead to changing the dependent module.
Common coupling
is when two modules share the same global data (e.g. a global variable).
Changing the shared resource implies changing all the modules using it.
External coupling
occurs when two modules share an externally imposed data format, communication protocol, or device interface.
Control coupling
is one module controlling the logic of another, by passing it information on what to do (e.g. passing a what-to-do flag).
Stamp coupling (Data-structured coupling)
is when modules share a composite data structure and use only a part of it, possibly a different part (e.g. passing a whole record to a function which only needs one field of it).
This may lead to changing the way a module reads a record because a field, which the module doesn't need, has been modified.
Data coupling
is when modules share data through, for example, parameters. Each datum is an elementary piece, and these are the only data which are shared (e.g. passing an integer to a function which computes a square root).
Message coupling (low)
is the loosest type of coupling. Modules are not dependent on each other, instead they use a public interface to exchange parameter-less messages (or events, see Message passing).
No coupling
[is when] modules do not communicate at all with one another.
What does it all mean?
The Wikipedia entries mention that "low coupling often correlates with high cohesion" and
"high cohesion often correlates with loose coupling, and vice versa."
However, that's not the intuitive result of simple evaluation, especially on the part of someone who doesn't
know in the first place.
In the context of the prototypical question
about how to improve the structure of code, one does not lead to the other. By reducing coupling, on the face of
it the programmer is going to merge unrelated units of code, which would also reduce cohesion. Likewise, removing
unrelated functions from a class will introduce another class on which the original will need to depend, increasing
coupling.
To understand how the relationships become inversely correlated requires a larger step in logic, where examples
of the different types of coupling and cohesion would prove helpful.
Examples from each category of cohesion
Coincidental cohesion often looks like this:
class Helpers;
class Util;
int main(void) {
where almost all of your code goes here;
return 0;
}
In other words, the code is organized with no special thought as to how it should be organized.
General helper and utility classes,
God Objects,
Big Balls of Mud, and other anti-patterns
are epitomes of coincidental cohesion.
You might think of it as the lack of cohesion: we normally talk about cohesion being a good thing, whereas
we'd like to avoid this type as much as possible.
(However, one interesting property of coincidental cohesion is that even though the code in question should not be stuck together,
it tends to remain in that state because programmers are too afraid to touch it.)
With logical cohesion, you start to have a bit of organization. The Wikipedia example mentions "grouping
all I/O handling routines." You might think, "what's wrong with that? It makes perfect sense." Then consider that
you may have one file:
IO.somelang
function diskIO();
function screenIO();
function gameControllerIO();
While logical cohesion is much better than coincidental cohesion, it doesn't necessarily go far enough in terms
of organizing your code. For one, we've got all IO in the same folder in the same file, no matter what type of
device is doing the inputting and outputting. On another level, we've got functions that handle both input and
output, when separating them out would make for better design.
Temporal cohesion
is one where you might be thinking "duh, of course code that's executed based on some other
event is tied to that event." Especially considering the Wikipedia example:
a function which is called after catching an exception which closes open files,
creates an error log, and notifies the user.
But consider that we're not talking simply about the relationship in time. We're really interested in the code's structure.
So to be temporally cohesive, your code in that error handling situation should keep the closeFile,
logError, and notifyUser functions close to where they are used. That doesn't mean
you'll always do the lowest-level implementation in the same file -- you can create small functions that take
care of setting up the boilerplate needed to call the real ones.
It's also important to note that you'll almost never want to implement all of that directly in the catch
block. That's sloppy, and the antithesis of good design. (I say "almost" because I am wary of absolutes, yet I cannot think
of a situation where I would do so.) Doing so violates functional cohesion, which is what we're really
striving for.
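Here is one way that error-handling example could be laid out, as a sketch in Ruby. The helper names mirror the closeFile, logError, and notifyUser functions mentioned above. The handle_failure wrapper and the use of StringIO objects in place of a real file and console are assumptions made so the example runs on its own.

```ruby
require 'stringio'

# Hypothetical helpers, named after closeFile, logError and notifyUser.
def close_file(io)
  io.close unless io.closed?
end

def log_error(log, error)
  log << "ERROR: #{error.message}"
end

def notify_user(out, error)
  out.puts "Sorry, something went wrong: #{error.message}"
end

# The one place that groups the steps which happen together after a failure.
def handle_failure(error, io:, log:, out:)
  close_file(io)
  log_error(log, error)
  notify_user(out, error)
end

io  = StringIO.new("some data") # stands in for an open file
log = []                        # stands in for an error log
out = StringIO.new              # stands in for the user's screen

begin
  raise IOError, "disk unplugged"
rescue IOError => e
  # The rescue block stays tiny; the real work lives in small functions.
  handle_failure(e, io: io, log: log, out: out)
end
```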
Procedural cohesion
is similar to temporal cohesion, but instead of time-based it's sequence-based. These are similar because
many things we do close together in time are also done in sequence, but that's not always the case.
There's not much to say here. You want to keep the definitions of functions that do things together structurally
close together in your code, assuming they have a reason to be close to begin with. For instance,
you wouldn't put two modules of code together if they're not at least logically cohesive to begin with. Ideally,
as in every other type of cohesion, you'll strive for functional cohesion first.
Communicational cohesion
typically looks like this:
some lines of code;
data = new Data();
function1(Data d) {...};
function2(Data d) {...};
some more lines of code;
In other words, you're keeping functions together that work on the same data.
Sequential cohesion
is much like procedural and temporal cohesion, except the reasoning behind it is that functions would
chain together where the output of one feeds the input of another.
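A minimal sketch of that chaining in Ruby, using three made-up functions where each one consumes what the previous one produced:

```ruby
# Hypothetical three-step pipeline: each function's output feeds the next.
def parse(line)
  line.split(",").map(&:strip)
end

def to_numbers(fields)
  fields.map { |field| Integer(field) }
end

def total(numbers)
  numbers.sum
end

result = total(to_numbers(parse("1, 2, 3")))
```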
Functional cohesion is the ultimate goal.
It's The Single Responsibility Principle [PDF] in
action. Your methods are short and to the point. Ones that are related are grouped together locally in a file.
Even files or classes contribute to one purpose and do it well. Using the IO example from above, you might have
a directory structure for each device, and within it, a class for Input and one for Output. Those would be children
of abstract I/O classes that implemented all but the device-specific pieces of code.
(I use inheritance terminology here only
as a subject that I believe communicates the idea to others. Of course, you don't have to even have inheritance
available to you to achieve the goal of keeping device agnostic code in one locale while keeping the device
specific code apart from it).
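As a sketch of the input half of that layout in Ruby: the Input and KeyboardInput names are hypothetical, and a StringIO stands in for a real device so the example is self-contained.

```ruby
require 'stringio'

# Device-agnostic input logic lives here; subclasses supply only read_raw.
class Input
  def read
    normalize(read_raw)
  end

  private

  def normalize(text)
    text.strip.downcase
  end

  def read_raw
    raise NotImplementedError, "#{self.class} must implement read_raw"
  end
end

# The only device-specific piece: where the raw text comes from.
class KeyboardInput < Input
  def initialize(source)
    @source = source
  end

  private

  def read_raw
    @source.gets.to_s
  end
end

typed = KeyboardInput.new(StringIO.new("  HELLO\n")).read
```

An Output class per device would follow the same shape, each one contributing to a single purpose.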
Examples from each category of coupling
Content coupling is horrific. You see it all over the place. It's probably in a lot of your code, and
you don't realize it. It's often referred to as a violation of encapsulation in OO-speak, and it looks like one
piece of code reaching into another, without regard to any specified interfaces or respecting privacy. The problem
with it is that when you rely on an internal implementation as opposed to an explicit interface, any time that
module you rely on changes, you have to change too:
module A
data_member = 10
end
module B
10 * A->data_member
end
What if data_member was really called num_times_accessed? Well, now you're screwed since you're
not calculating it.
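The same problem can be sketched in runnable Ruby. The Counter class is hypothetical, and instance_variable_get is used here only to show what reaching past the interface looks like.

```ruby
class Counter
  def initialize
    @num_times_accessed = 0
  end

  def access
    @num_times_accessed += 1
  end

  # Explicit interface: callers depend on this name, not on the variable.
  def times_accessed
    @num_times_accessed
  end
end

counter = Counter.new
3.times { counter.access }

# Content coupling: reaches past the interface into Counter's internals.
# Renaming @num_times_accessed silently breaks this line.
coupled = counter.instance_variable_get(:@num_times_accessed) * 10

# Looser: depends only on the published method.
decoupled = counter.times_accessed * 10
```

Both lines compute the same value today. Only the second one survives a change to how Counter stores its count.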
Common coupling
occurs all the time too. The Wikipedia article mentions global variables, but this could be just a member in a class
where two or more functions rely on it if you consider it. It's not as bad when it's encapsulated behind an interface,
where instead of accessing the resource directly, you do so indirectly, which allows you to change internal
behavior behind the wall, and keeps your other units of code from having to change every time the shared resource
changes.
An example of external coupling is a program where one part of the code reads a specific file format that
another part of the code wrote. Both pieces need to know the format so when one changes, the other must as well.
unit A
write_csv_format();
end
unit B // in another file, probably
read_csv_format();
end
Control coupling
might look like:
// unit A
function do(what){
if(what == 1) do_wop;
else if (what == 2) ba_ba_da_da_da_do_wop;
}
// unit B
A.do(1);
Stamp coupling (Data-structured coupling)
involves disparate pieces of code touching the same data structure in different ways. For example:
employee = { :age => 24, :compensation=> 2000 }
def age_range(employee)
range = 1 if employee[:age] < 10
range = 2 if employee[:age] >= 10 && employee[:age] < 20
...
return range
end
def compensation_range(employee)
... only relies on employee[:compensation] ...
end
The two functions don't need the employee structure, but they rely on it and if it changes, those two functions
have to change. It's much better to just pass the values and let them operate on that.
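A sketch of the "just pass the values" version in Ruby. The bucket boundaries are made up for the example and are not meant to match the elided parts of the functions above.

```ruby
# Each function takes only the value it needs, not the whole employee record.
# The thresholds are hypothetical.
def age_range(age)
  return 1 if age < 10
  return 2 if age < 20
  3
end

def compensation_range(compensation)
  compensation < 1000 ? 1 : 2
end

employee = { :age => 24, :compensation => 2000 }

# Only the caller knows about the structure of employee.
age_bucket = age_range(employee[:age])
pay_bucket = compensation_range(employee[:compensation])
```

If the employee structure changes, only the two call sites at the bottom need to know about it.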
Data coupling
is starting to get to where we need to be. One module depends on another for data. It's a typical function call with parameters:
// in module A
B.add(2, 4)
Message coupling
looks like data coupling, but it's even looser because two modules communicate indirectly without ever passing
each other data. Method calls have no parameters, in other words.
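One way to picture it, as a sketch in Ruby with hypothetical Door and Light classes: the door announces that it opened with a parameter-less message, and it has no idea who is listening.

```ruby
class Door
  def initialize
    @listeners = []
  end

  # Anyone may register interest; Door never learns what the listener is.
  def on_open(&listener)
    @listeners << listener
  end

  def open
    @listeners.each(&:call) # parameter-less notification
  end
end

class Light
  attr_reader :on

  def initialize
    @on = false
  end

  def switch_on
    @on = true
  end
end

door = Door.new
light = Light.new
door.on_open { light.switch_on }
door.open
```

Door and Light never reference each other. The only thing tying them together is the wiring code on the last three lines.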
No coupling, like Wikipedia says, is when "modules do not communicate at all with one another."
There is no dependency from code A to code B.
Concluding Remarks
So how do we reconcile the thought that "if I separate code to increase functional cohesion, I introduce dependencies
which is going to increase coupling" with the assertion that low coupling and high cohesion go hand in hand? To do that,
you must recognize that the dependencies already exist. Perhaps not at the class level, but they do at the lines of
code level. By pulling them out into related units, you remove the spaghetti structure (if you can call it that)
and turn it into something much more manageable.
A system of code can never be completely de-coupled unless it does nothing. Cohesion is a different story.
I can't claim that your code cannot be perfectly cohesive, but I can't claim that it can. My belief is it
can be very close, but at some point you'll encounter diminishing returns on your quest to make it so.
The key takeaway is to start looking at your code and think about what you can do to improve it as you notice
the relationships between each line you write start to take shape.
Comments and corrections(!) are encouraged. What are your thoughts?
Posted by Sam on Jun 18, 2009 at 12:00 AM UTC - 5 hrs
Don't encode information into a string like "AAHD09102008BSHC813" and give that gibberish to people. Don't name your project that, don't give that to me as a value or way to identify something, and don't make humans see or interact with that in any form. (If you are generating something similar and parse it with a program in automated fashion, I don't care what you call it.)
Give it a name we can use while communicating with each other and keep the rest of the information in a database. I can look it up if I need to know it.
Do not use file names, folder names, or project names as your database. I don't want to be required to scan each item in whatever set you chose and translate it using a lookup table to find what I'm looking for. I don't want to memorize the lookup table either.
Posted by Sam on Jul 21, 2009 at 12:00 AM UTC - 5 hrs
The other day I went to a major pizza chain's website to order online. I had to create an account first, of course.
No big deal.
As I was choosing my password, I was pleased to see a password strength indicator to the right. Excellent,
it's telling me my password is "too short" -- let me add some more characters. "Warning: Too Simple" it said.
Great - now I'll add some numbers in there. My password strength was now "good," but since they were going
to be storing my personal details, I wanted a "great" password. I like to throw characters in there that
aren't letters or numbers, so I did. And it told me my password strength was "great."
Even better that they gave a color indication as well - going from red to green as my password got
sufficiently strong.
You can imagine my disappointment when I hit the "Go" button, only to be presented with this message:
Please enter a valid password. Valid passwords must be at least 8 characters in length and contain letters and numbers only.
Look, I understand the allure of arbitrarily limiting these things. You secretly want someone to put in those special characters just so they can
see how good you were to have the foresight that someone might try to use one.
When I first started programming for money I did the same thing. Even though we were on a modern OS with
very few limitations on filenames, I wrote the application such that it would enforce MS DOS filename restrictions,
and only allowed letters and numbers. I spent extra effort to make sure the application would fail
on completely valid input.
As a consequence, strange bugs pop up because that app is not the only interface, and the system is less usable for the customer.
I know most of you in this audience already know you shouldn't design "features" like this. But for the
newbies that haven't yet had enough experience to know: If you don't have a valid reason for
constraining the data, don't do it just to show off what you can do.
It's an annoyance at best. It adds complexity where none is needed,
making your application harder to maintain over time. At worst, it results in defects that your customer
paid you to insert into the application. And that's a tragedy of ethics.
Posted by Sam on Aug 11, 2009 at 12:00 AM UTC - 5 hrs
The other day on twitter I asked about large projects, and I wanted to
bring the question to the larger audience here.
We hear vague descriptions about project size tossed about all the time:
Ruby on Rails is better for small projects. Java is for large projects. Skip the framework for
small projects, but I recommend using one for large projects.
What factors go into determining the "size" of a project? Certainly lines of code and the size of the team are
considerations. Perhaps we should include the number of users as well. What would you add?
I suspect that to developers who tend to work alone on projects, a large one might be dozens of thousands of lines
of code. For those who work in moderate size teams, say with half a dozen members, we might hear a few hundred
thousand lines of code. For large teams in the teen-range, I'd expect millions of lines. What about teams with
50-100 developers?
I think it makes a difference when you're giving advice on various aspects as to what constitutes a large project (or, if
you believe the advice is relative in those aspects, say why), so I'm interested to hear your thoughts.
So I ask, what is a "large project" to you? What do you think it means to others?
Those who've been reading My Secret Life as a Spaghetti Coder for a long time will know I totally love IDEA.
I haven't written software in Java for quite some time, and I don't normally do "news" posts, but I know enough of you are into Java, Groovy, and Scala to make this worth your while if the $249 was pricey enough to force you into using lesser Java IDEs. Now you don't have to.
Posted by Sam on Feb 20, 2008 at 07:11 AM UTC - 5 hrs
A little while ago I was trying to think of ways to have a program compare strings
and discover patterns among them (as opposed to knowing patterns and looking for particular ones).
Over the years, I've learned about a lot of algorithms, but there's no way I could recall
all of them. I knew I'd probably need to look at artificial intelligence, or more specifically,
machine learning. But that's all I had to go on.
At the time, I decided it would be helpful to have a list of algorithms and
data structures with short descriptions to browse and jog my memory.
Most of the problems you'll solve in your programming career don't require a lot
of thought to arrive at a correct solution. But algorithms, data structures, and
approaches to problems aren't just limited to the realm of programming. Reg Braithwaite
reminds us of another reason to have these things at your disposal -
even the problem of determining
who to hire can be reduced to Naïve Bayes Classification.
And when you have those problems where
there is no human solution (how can I discover patterns in several strings
which may have hundreds of characters?), or the computer solution takes too long to
find the optimal one where good enough will do, or there just isn't necessarily a
right answer -- those are the hard ones where you aren't likely to stumble upon an
answer -- where do you turn?
It turns out, a lot of problems can be reduced to others we already know how to solve.
In fact, proofs about the complexity class of a problem rely on exactly that: reducing one
problem to another shows that if you can solve the second one, you can also solve the first,
so the first is no harder than the second. A famous example is
SAT.
I haven't yet compiled the list I spoke of above, but luckily for all of us,
Wikipedia has a good starting point.
It's missing a couple that stand out in my mind
(or that have a different name I didn't look for, or multiple classifications and it
didn't make it to the one I looked at), but that's just
something I can put on my to-do list to improve. The Machine Learning category, for instance,
seems fairly light.
So just browsing a list and short description of algorithms may enlighten you as to how you
can reduce your problem to one that's already been solved. If you can do that, you've got a solution
from someone who's probably much smarter than you are. It's as if you have
Donald Knuth and the rest of computer
science academia on your team, and you don't even have to pay them (except, perhaps by buying
their book, or subscribing to a journal that will disseminate their knowledge).
I find it interesting that lots of people write about how to produce clean code,
how to do good design, taking care about language choice, interfaces, etc, but few people
write about the cases where there isn't time... So, I need to know what are the forces that tell you
to use a jolly good bodge?
I suspect we don't hear much about it because these other problems are often caused by that excuse.
And, in the long run, taking on that technical debt will likely cause you to go so slow that that's the
more interesting problem. In other words, by ignoring the need for good code, you are jumping into
a downward spiral where you are giving yourself even less time (or, making it take so long to do anything
that you may as well have less time).
I think the solution is to start under-promising and over-delivering, as opposed to how most of us do it
now: giving lowball estimates because we think that's what they want to hear. But why lie to them?
If you're using iterative and incremental development, then if you've over-promised one iteration, you
are supposed to dial down your estimates for what you can accomplish in subsequent iterations, until
you finally get good at estimating. And estimates should include what it takes to do it right.
That's the party-line answer to the question. In short: it's never OK to write sloppy code, and
you should take precautions against ever putting yourself in a situation where those
viscous forces pull you in that direction.
In those cases where you've already painted yourself into a corner, what then? That's the interesting
question here. How do you know the best
places to hack crapcode together and ignore those things that may take a little longer in the short run, but
whose value shows up in the long run?
The easy answer is the obvious one: cut corners in the code that is least likely to need to change or
be touched again. That's because (assuming your hack works) if we don't have to look at the code again,
who really cares that it was a nasty hack? The question whose answer is not so easy or
obvious is "what does such a place in the code look like?"
By the definition above, it would be the lower levels of your code. But if you do that, and inject a bug, then
many other parts of your application would be affected. So maybe that's not the right place to do it.
Instead, it would be better to do it in the higher levels, on which very little (if any) other code
depends. That way, you limit the effects of it. More importantly, if there are no outgoing dependencies
on it, it is easier to change than if other code were highly dependent on it. [1]
Maybe the crapcode can be isolated: if a class is already awful, can you derive a new class from it and
make any new additions with higher quality? If a class is of high quality and you need to hack something together,
can you make a child class and put the hack there? [2]
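As a minimal sketch of that second idea, here is what quarantining a hack in a child class might look like in Python. The class names and the "negative totals in parentheses" requirement are hypothetical, invented only for illustration:

```python
class ReportFormatter:
    """Stands in for an existing, high-quality class (hypothetical)."""

    def format_total(self, amount: float) -> str:
        return f"{amount:.2f}"


class ParenthesizedReportFormatter(ReportFormatter):
    """HACK, kept in its own subclass so the parent stays clean.

    Assumed scenario: one report wants negative totals shown in parentheses.
    """

    def format_total(self, amount: float) -> str:
        text = super().format_total(abs(amount))
        return f"({text})" if amount < 0 else text
```

The subclass still takes a number and returns a string, so callers can use it wherever the parent is expected, and deleting the hack later means deleting one class.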
In the end, there is no easy answer that I can find where I would definitively say, "that's the place for a bodging."
But I suspect there are some patterns we can look for, and I tried to identify a couple of those above.
Do you have any candidates you'd like to share?
Notes: [1] A passing thought for which I have no answers:
The problem with even identifying those places is that by hacking together solutions, you are more likely
to inject defects into the code, which makes it more likely you'll need to touch it again.
[2] I use inheritance here because the new classes should be
able to be used without violating LSP.
However, you may very well be able to make those changes by favoring composition.
If you can, I'd advocate doing so.
Posted by Sam on Feb 13, 2008 at 08:44 AM UTC - 5 hrs
One step back from greatness lies the very definition of the impossible leadership situation:
a president affiliated with a set of established commitments that have in the course of
events been called into question as failed or irrelevant responses to the problems of the day...
The instinctive political stance of the establishment affiliate -- to affirm and continue the
work of the past -- becomes at these moments a threat to the vitality, if not survival,
of the nation, and leadership collapses upon a dismal choice. To affirm established
commitments is to stigmatize oneself as a symptom of the nation's problems and the premier
symbol of systemic political failure; to repudiate them is to become isolated from one's most
natural political allies and to be rendered impotent.
A little while ago Obie asked "What's this crap about a Ruby backlash?" The whole situation has reminded me of Skowronek's work, so I dug a couple of passages up.
We're at a crossroads right now between two regimes - one represented by Java, and the other represented by Ruby (although it is quite a bit more nuanced than that). My belief right now is that Java The Language is in a position where it can't win. People are fed up with the sameoldcrap, and a change is happening (see also: Why Do I Have To Tell The Compiler Twice?, or Adventures in Talking To a Compiler That Doesn't Listen.)
What these [reconstructive] presidents did, and what their predecessors could not do, was to
reformulate the nation's political agenda altogether, ... and to move the nation past the old
problems, eyeing a different set of possibilities... (Skowronek, pg. 38)
When the new regime starts gaining momentum, in the old regime there will be wailing and gnashing of teeth. We can see some of this in the dogma repeated by Ruby's detractors alluded to (but not sourced) by Daniel Spiewak. We hear it in the fear in people's comments when they fail to criticize the ideas, relying instead on ad hominem attacks that have little to nothing to do with the issues at hand.
(Unlike Obie, I don't have any reason to call attention to anyone by name. If you honestly haven't seen this, let's try i don't like ruby, ruby sucks, and ruby is slow and see if we can weed through the sarcasm, apologists who parrot the line so as not to offend people, or just those exact words with no other substance. )
Neal Gafter quotes himself and Joshua Bloch in Is Java Dying? (where he concludes that it isn't):
Neal Gafter: "If you don't want to change the meaning of anything ever, you have no choice but to not do anything. The trick is to minimize the effect of the changes while enabling as much as possible. I think there's still a lot of room for adding functionality without breaking existing stuff..."
Josh Bloch: "My view of what really happens is a little bit morbid. I think that languages and platforms age by getting larger and clunkier until they fall over of their own weight and die very very slowly, like over ... well, they're all still alive (though not many are programming Cobol anymore). I think it's a great thing, I really love it. I think it's marvelous. It's the cycle of birth, and growth, and death. I remember James saying to me [...] eight years ago 'It's really great when you get to hit the reset button every once and a while.'"
To me, the debate is starting to look a lot like the regime change Skowronek's work predicts: a vulnerable establishment regime gives way as an outsider reconstructs a new one.
I'm not saying Ruby itself will supplant Java. But it certainly could be a piece of the polyglotprogramming puzzle that will do it. It's more of an overall paradigm shift than a language one, so although I say one part is represented by Java and another by Ruby, I hope you won't take me literally.
Franklin Roosevelt was the candidate with "clean hands" at a moment when failed policies,
broken promises, and embarrassed clients were indicting a long-established political order.
Agitating for a rout in 1932, he inveighed against the entire "Republican leadership." He
denounced them as false prophets of prosperity, charged them with incompetence in dealing with
economic calamity, and convicted them of intransigence in the face of social desperation.
Declaring their regime morally bankrupt, he campaigned to cut the knot, to raise a new standard,
to restore to American government the ancient truths that had first inspired it.
(Skowronek, pg 288)
Hoover's inability to take the final step in innovation and
repudiate the system he was transforming served his critics well... Hoover would later
lament the people's failure to appreciate the significance of his policies, and yet he was
the first to deny it. The crosscurrents of change in the politics of leadership left him with
an impressive string of policy successes, all of which added up to one colossal political
failure... Hoover sought to defend a system that he had already dispensed with...
What do you find hard about TDD? When you're developing and you see yourself
not writing tests but jamming out code, what causes those moments for you?
And have you really, in all honesty, ever reaped significant benefits either in
productivity or quality from unit testing? Because there's a pretty large contingent
of folks who don't get much mileage out of TDD, and I can see where they're coming from.
My TDD Stumbling Blocks
I'll address the first bit in one word: viscosity. When it's easier to do the wrong thing
than the right thing, that's when I "see myself not writing tests but jamming out code."
But what causes the viscosity for me? Several things, really:
When I'm working with a new framework or technology and I don't know how to test it: I'm trying
to avoid this now by learning languages by unit testing.
However, it's still tough. I started writing tests in C# .NET recently, but moving things to
ASP.NET has made me stumble a bit. That's mostly because I didn't take the time to understand
how it all worked before I started using it, and now I'm in the process of rewriting that code before it becomes too
entrenched.
UIs: I still don't understand how to test them effectively. I like Selenium for the web,
but most tests I write with it are brittle. Because of that, I write them flippantly. It's a
vicious cycle too: without learning what works, I won't get better at identifying strategies to
remove the viscosity, so I won't write the tests.
That last one is a killer for me. When I'm working on new projects, it's incredibly easy to write
tests as I develop. So much so that I don't bother thinking about not doing it. Unfortunately, most
of my work is not in new code bases.
I should also note that I often don't bother unit testing one-off throwaway scripts, but there
are times when I do.
On top of that, my unit tests rarely stay unit-sized. I generally just
let them turn into integration tests (stubbing objects as I need them when they are still
unit-sized). The only time I bother with mocks is when the integration piece makes the tests
take too long to run.
For example, I might let the tests hit a testing database for a while, but as the tests get unbearable
to run, I'll write a different class to use that just returns some pre-done queries, or runs
all the logic except for save().
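To make that concrete, here is a rough sketch in Python of swapping in a class that returns pre-done query results. `UserStore`, `CannedUserStore`, and `active_names` are hypothetical names made up for this example:

```python
class UserStore:
    """Hypothetical data-access class that would normally hit the test database."""

    def find_active(self):
        raise RuntimeError("would run a slow query against the test database")


class CannedUserStore(UserStore):
    """Test-only stand-in: returns a pre-done query result, no database needed."""

    def find_active(self):
        return [
            {"name": "bob", "active": True},
            {"name": "ann", "active": True},
        ]


def active_names(store):
    """The logic under test: it only cares that the store can find_active()."""
    return sorted(user["name"] for user in store.find_active())
```

A test can then call `active_names(CannedUserStore())` and run in milliseconds, while the slower integration tests keep using the real store.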
What about rewards?
In Code Complete 2, Steve McConnell talks about why it's important to measure experiments when
tuning your code:
Experience doesn't help much with optimization either. A person's experience might have
come from an old machine, language, or compiler - when any of those things changes, all
bets are off. You can never be sure about the effect of an optimization until you
measure the effect. (McConnell, 603)
I bring that up because I think of TDD (and any other practice we might do while
developing) as an optimization, and to be sure about its effects, I'd have to measure it.
I haven't measured myself with TDD and without, so you can take what follows as anecdotal
evidence only. (Just because I say that, don't think you can try TDD for a couple of days
and decide it's slowing you down so it doesn't bring any benefit - it takes a while to
realize many of the benefits.)
So what rewards have I noticed? Like the problems I've had, there are a few:
Better design: My design without TDD has been a train wreck (much of that due to my
past ignorance of design principles), but has (still) improved as a result of TDD.
After all, TDD is a design activity. When writing a test, or determining what test to write next, you
are actively involved in thinking about how you want your code to behave, and how you want to
be able to reuse it.
As a byproduct of writing the tests, you get a very modular design - it becomes harder to do
the wrong thing (bad design), and easier to keep methods short and cohesive.
Less fear: Do you have any code that you just hate to touch because of the horror it sends
down your spine? I do. I've had code that is so complex and wrapped up within itself that I've
literally counseled not changing it for fear of it breaking and not being able to fix it. My
bet is that you've probably seen similar code.
The improved design TDD leads to helps that to some extent obviously. But there may be times
when even though you've got a test for something, it's still ugly code that could break easily.
The upside though, is you don't need to fear it breaking. In fact, if you think about it,
the fear isn't so much that you'll break the code - you fear you won't know you've broken it.
With good tests, you know when you've broken something and you can fix it before you deploy.
Time savings: It does take some time to write tests, but not as much as you might think.
As far as thinking about what you want your code to do, and how you want to reuse it, my
belief is that you are doing those things anyway. If not, you probably should be, and your
code likely looks much the same as some of that which I have to deal with
(for a description, see the title of this weblog).
It saves time as an executable specification - I don't have to trace through a big code base
to find out what a method does or how it's supposed to do it. I just look up the unit tests
and see it within a few clean lines.
Most of your tests will be 5-7 lines long, and you might have five tests per method. Even
if you just test the expected path through the code, ignoring exceptions and negative tests,
you'll be a lot better off and you'll only be writing one or two tests per method.
How long does that take? Maybe five minutes per test? (Which would put you at one minute per line!)
Maybe you won't achieve that velocity as you're learning the style of development, but certainly you could
be there (or better) after a month or two.
And you're testing anyway, right? I mean, you don't write code and check it in to development
without at least running it, do you? So, if you're programming from the bottom up, you've
already written a test runner of some sort to verify the results. What would it cost to
put that code into a test? Perhaps a minute or three, I would guess.
And now when you need to change that code, how long does it take you to log in to the application,
find the page you need to run, fill out the form, and wait for a response to see if you were right?
If you're storing the result in the session, do you need to log out and go through the same process,
just to verify a simple calculation?
How much time would it save if you had written automated tests? Let's say it takes you two
minutes on average to verify a change each time you make one. If it took you half-an-hour
of thinking and writing five tests, then within 15 changes you've broken even and the rest is gravy.
How many times do you change the same piece of code? Once a year? Oh, but we didn't include all the
changes that occur during initial development. What if you got it wrong the first
time you made the fix? Certainly a piece of code changes 15 times even before you've got it
working in many cases.
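The break-even arithmetic above is simple enough to write down. This sketch assumes, as the numbers above implicitly do, that running the automated tests costs roughly nothing compared to a manual check; the function name is made up for illustration:

```python
import math


def break_even_changes(test_writing_minutes, manual_check_minutes):
    """Number of changes after which writing the tests has paid for itself.

    Assumes each automated run is effectively free next to a manual check.
    """
    return math.ceil(test_writing_minutes / manual_check_minutes)


# Half an hour of writing tests vs. two minutes per manual verification.
changes_needed = break_even_changes(30, 2)
```

With those inputs `changes_needed` comes out to 15, matching the figure above. If manual verification is slower than two minutes, the break-even point arrives sooner.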
Overall, I believe it does save time, but again, I haven't measured it. It's just all those little
things you do that take a few seconds at a time - you don't notice them. Instead, you think
of them as little tasks to get you from one place to another. That's what TDD is like: but
you don't see it that way if you haven't been using it for a while. You see it as an
extra task - one thing added to do. Instead, it replaces a lot of tasks.
And wouldn't it be better if you could push a button and verify results?
That's been my experience with troubles and benefits. What's yours been like? If you haven't
tried it, or are new, I'm happy to entertain questions below (or privately if you prefer) as
well.
It's something to be expected, but as I thought about it, I wondered why.
It's not my intent here to draw negative attention by questioning the conventional wisdom of the status quo,
but I fear that may happen. I simply want to ask the obvious:
How many projects have you participated in where Unicode was an explicit or implicit requirement?
What percentage of the total do those make up? In the remainder of cases, would something
like Arc have been useful to you?
For the vast majority of projects I've worked on, having support for 9+ bit character sets or curly
quotes was not a requirement, and Arc would have been useful on the ones that didn't have a specific
language or platform requirement. (I understand if your work takes you there, but also understand
many of ours don't.)
Keep it civil and topical please. It's nothing but an observation and a question, not a statement of
religious belief spread with the fervor of a crusader.
Posted by Sam on Jan 30, 2008 at 07:34 AM UTC - 5 hrs
Because when you don't, how do you know your change to the code had any effect?
When a customer calls with a trouble ticket, do you just fix the problem, or do you
reproduce it first (a red test), make the fix, and test again (a green test, if you fixed the
problem)?
Likewise, if you write automated tests, but don't run them first to ensure
they fail, it defeats the purpose of having the test. Most of the time you won't
run into problems, but when you do, it's not fun trying to solve them. Who would
think to look at a test that's passing?
The solution, of course, is to forget about testing altogether. Then we won't be lulled into
a false sense of security. Right?
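For the skeptics, a small sketch in Python of why seeing red first matters. The `apply_discount` function and its bug are hypothetical, invented for this example:

```python
def apply_discount(price, percent):
    # Bug under investigation (hypothetical): the discount is never applied.
    return price


def check_vacuous():
    # Passes even against the buggy code, because a 0% discount changes nothing.
    # Had we run this before the fix, it would have been green already: a warning sign.
    assert apply_discount(100, 0) == 100


def check_red_first():
    # Fails against the buggy code (red), so going green after the fix means something.
    assert apply_discount(100, 10) == 90
```

Run both before touching the code: the first is already green, which tells you it can't detect the bug; the second is red, which tells you it can.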
Posted by Sam on Jan 28, 2008 at 06:46 AM UTC - 5 hrs
I don't like to have too many microposts on this blog, so I've decided to save them up and start
a Programming Quotables series. The idea is that I'll post quotes about programming that have one or more of the
following attributes:
I find funny
I find asinine
I find insightfully true
And stand on their own, with little to no comment needed
Here's the second in that series. I hope you enjoy them as much as I did:
Now we can see that although each language provides abstraction mechanisms, and lets you build new abstractions with the mechanism, new kinds of abstractions give us new ways to express relationships. These things can be abused, of course, but nothing can save you from this: If you don't let your Architect play with Domain-Specific Languages, what is to stop them from configuring everything in your application with XML?
The earlier Ages were times when people were unsure how best to deliver business value through software development. They piled process on top of process, hoping that by finding and perfecting the right process they could ensure timely delivery of valuable software in a predictable way. When it didn't work, they logically assumed the best approach would be to do even more of the stuff that wasn't working, and to monitor the stuff very tightly with high-ceremony process controls.
Then again, all one has to do to out-perform a traditional development team is Not Suck. It doesn't even require agile development methods. Obviously, to Not Suck is a great step forward. But surely it isn't your ultimate professional goal!
Language expertise is fine, but it isn't the most valuable thing out there. If someone programs conscientiously, I can work with them. I have a lot of respect for people who write solid code despite not having completely up to date language knowledge. Language knowledge is easy: you read, you think, you try. And, you can catch up. Conscientiousness, though, is the thing that really matters. Next to it, language expertise is easy.
It's obvious now, though, isn't it? A design pattern isn't a feature. A Factory isn't a feature, nor is a Delegate nor a Proxy nor a Bridge. They "enable" features in a very loose sense, by providing nice boxes to hold the features in. But boxes and bags and shelves take space. And design patterns - at least most of the patterns in the "Gang of Four" book - make code bases get bigger. Tragically, the only GoF pattern that can help code get smaller (Interpreter) is utterly ignored by programmers who otherwise have the names of Design Patterns tattooed on their various body parts.
Steve Yegge, Code's Worst Enemy (Really, you should read the whole thing. It's important.)
Posted by Sam on Jan 23, 2008 at 10:35 AM UTC - 5 hrs
Suppose you want to share some data that one object produces with another object as the consumer. How would you go about doing that?
If you took a straightforward approach, you might have Producer call consumer.method("data") and pass it the data that way. On the other hand, you could have Consumer get the data it needs by requesting it from Producer with something like this: myData = producer.getData().
However, perhaps Producer and Consumer shouldn't know anything about each other. Then you might introduce an Intermediary that gets the data from Producer and passes it to Consumer with something like consumer.myData = producer.getData()
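The three options so far, sketched side by side in Python. The method names `push_to`, `pull_from`, and `intermediary` are assumptions chosen for this example, not part of any real API:

```python
class Producer:
    def get_data(self) -> str:
        return "data"

    def push_to(self, consumer: "Consumer") -> None:
        # Option 1: the producer knows the consumer and hands it the data.
        consumer.my_data = self.get_data()


class Consumer:
    def __init__(self) -> None:
        self.my_data = None

    def pull_from(self, producer: Producer) -> None:
        # Option 2: the consumer knows the producer and asks for the data.
        self.my_data = producer.get_data()


def intermediary(producer: Producer, consumer: Consumer) -> None:
    # Option 3: neither knows the other; a third party moves the data.
    consumer.my_data = producer.get_data()
```

All three end with the same value in `my_data`; the only difference is which side has to know about the other.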
Now if you want to get really creative, you could make Producer write its data to an XML file, and then have Consumer read the data from there.
But why?
Disagreements and horror stories are encouraged below.
I think you have a great concept going. I really would like to find out HOW you became passionate about programming? I just graduated with a BS in CIS and am looking for an entry level IT job, HOWEVER I am not a bit excited about computers anymore. Like you I was just planning on continuing my education -get my MBA. But I know an IT job is what I went to school for. HELP! How do I get excited about an IT job when I can't even figure out what title to put on a job search? just degree in CIS?!
I started to comment, but as it became longer, I decided it might benefit others as a standalone post.
I think you just have to make the decision to be passionate. Wake up in the morning and think about how lucky you are. Decide to enjoy the day, and to enjoy what you do. But you can do better.
Think about what drew you to the profession in the first place, and try to get a job doing that. If you can't get one in a timely fashion, try to get a job doing something similar and spend your free time working on side projects that interest you. That's a lot of what I'm doing.
For me, it is learning new things and gaming that I enjoy most. So, I spend a lot of time doing that. I still have to do grunt work, but I get equal doses of fun stuff too - all the while I am expanding my skill set and enjoying most of it.
As far as the job search goes, I'd recommend networking with people. Visit the local User Groups, get involved in forums and mailing lists. Learn things and share them, and people will eventually come to you with jobs. Even though I like the computers, I've found that I really enjoy the relationships with people who also like computers. Before, I stayed locked in a room thinking and working by myself. Now, I venture out from time to time, and in addition, I have the online relationships I enjoy immensely.
Since you probably can't afford to live that long without a job, perhaps in the meantime you can go to a career fair at a local university or just use the search term "programming" and browse jobs until you find one that interests you.
I'd also read many different weblogs about programming to stay up to date on trends in the industry, as well as to receive solid advice that stands the test of time.
Just as importantly, take a look back at your own situation. Can you identify anything that may be causing your malaise? If so, can you remove it? It may be as simple as that.
To the rest of you: how do you maintain and find passion in your work when you seem to have lost it?
Posted by Sam on Jan 18, 2008 at 09:05 AM UTC - 5 hrs
You feel, look, and do better when you are accomplishing goals and showing progress. That's one
reason you'll find it in this week's advice from MJWTI.
The chapter is called "Daily Hit" and in it Chad recommends "setting a goal (daily, weekly, or
whatever you're capable of) and tracking this type of accomplishment." Make sure it's known to
your manager as well - don't let the underpants gnomes take credit for your
success. Also, the shorter the distance between hits the better, since "if you're supposed to produce a
hit per day, you can't spend two weeks tracking the perfect task."
I work in an environment where it wouldn't benefit me to "tell the manager" about my daily hits.
They know already. But you might want to make yours known. Obviously you don't want to
be braggadocious about it, but don't keep it to yourself either.
I like to do more than one hit per day. One is the absolute minimum. I try to get an overview
of what needs to be done during the week, and create a high level plan in my mind over the weekend.
I'll set each day's tasks and the time I'll be working on them on the day before. Then, I stick
to the plan.
Even when I don't finish a task in the time allotted, I can reschedule and finish it the next day. In this way, I'm always
having small successes which keeps me motivated and moving along towards my goals.
We're not talking about just what has to be done - it's about going above and beyond that. Chad ends the chapter with the advice to make a list of the "nitpicky problems" you and your team face that waste a little time each day, and starting to do some work on those things. I've done some of that, but there's plenty left to do, and re-reading this chapter reminded me that I need to start scheduling those things as well.
Posted by Sam on Jan 14, 2008 at 06:42 AM UTC - 5 hrs
This is a story about my journey as a programmer, the major highs and lows I've had along the way, and
how this post came to be. It's not about how ecstasy made me a better programmer, so I apologize if that's why you came.
In any case, we'll start at the end, jump to
the beginning, and move along back to today. It's long, but I hope the read is as rewarding as the write.
The experiences discussed herein were valuable in their own right, but the challenge itself is rewarding
as well. How often do we pause to reflect on what we've learned, and more importantly, how it has changed
us? Because of that, I recommend you perform the exercise as well.
I freely admit that some of this isn't necessarily caused by my experiences with the language alone - but
instead shaped by the languages and my experiences surrounding the times.
One last bit of administrata: Some of these memories are over a decade old, and therefore may bleed together
and/or be unfactual. Please forgive the minor errors due to memory loss.
Before I was 10, I had a notepad with designs for my as-yet-unreleased blockbuster of a side-scrolling game that would run on
my very own Super Sanola game console (I had the shell designed, not the electronics).
It was that intense interest in how to make a game that led me to inspect some of the source code Microsoft
provided with QBASIC. After learning PRINT, INPUT,
IF..THEN, and GOTO (and of course SomeLabel: to go to)
I was ready to take a shot at my first text-based adventure game.
The game wasn't all that big - consisting of a few rooms, the NEWS
directions, swinging of a sword against a few monsters, and keeping track of treasure and stats for everything -
but it was a complete mess.
The experience with QBASIC taught me that, for any given program of sufficient complexity, you really only
need three to four language constructs:
Input
Output
Conditional statements
Control structures
Even the control structures may not be necessary there. Why? Suppose you know a set of operations will
be performed an unknown but arbitrary number of times. Suppose also that it will
be performed fewer than X times, where X is a known quantity smaller than infinity. Then you
can simply write out X conditionals to cover all the cases. Not efficient, but efficiency is not
a requirement either.
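Here is a small sketch of that idea in Python (not QBASIC), with an assumed bound of at most three repetitions. The function names and the "swing" strings are made up for illustration:

```python
def swings_with_loop(times):
    """The usual way: a control structure repeats the operation."""
    hits = []
    for i in range(times):
        hits.append(f"swing {i + 1}")
    return hits


def swings_unrolled(times):
    """Same result with conditionals only, valid while times <= 3 (the known bound)."""
    hits = []
    if times >= 1:
        hits.append("swing 1")
    if times >= 2:
        hits.append("swing 2")
    if times >= 3:
        hits.append("swing 3")
    return hits
```

Both functions agree for every count from zero up to the bound. Raising the bound means pasting in another conditional, which is exactly the kind of code this weblog is named after.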
Unfortunately, that experience and its lesson stuck with me for a while. (Hence, the title of this weblog.)
Side Note: The number of language constructs I mentioned that are necessary is not from a scientific
source - just from the top of my head at the time I wrote it. If I'm wrong on the number (be it too high or too low), I always appreciate corrections in the comments.
What ANSI Art taught me about programming
When I started making ANSI art, I was unaware
of TheDraw. Instead, I opened up those .ans files I
enjoyed looking at so much in MS-DOS Editor to
see how it was done. A bunch of escape codes and blocks
came together to produce a thing of visual beauty.
Since all I knew about were the escape codes and the blocks (alt-177, 178, 219-223 mostly), naturally
I used the MS-DOS Editor to create my own art. The limitations of the medium were
strangling, but that was what made it fun.
And I'm sure you can imagine the pain - worse than programming in an assembly language (at least for relatively
small programs).
Nevertheless, the experience taught me some valuable lessons:
Even though we value people over tools, don't underestimate
the value of a good tool. In fact, when attempting anything new to you, see if there's a tool that can
help you. Back then, I was on local BBSs, and not
the 1337 ones when I first started out. Now, the Internet is ubiquitous. We don't have an excuse anymore.
I can now navigate through really bad code (and code that is limited by the language)
a bit easier than I might otherwise have been able to do. I might have to do some experimenting to see what the symbols mean,
but I imagine everyone would.
And to be fair, I'm sure years of personally producing such crapcode also has
something to do with my navigation abilities.
Perhaps most importantly, it taught me the value of working in small chunks and
taking baby steps.
When you can't see the result of what you're doing, you've got to constantly check the results
of the latest change, and most software systems are like that. Moreover, when you encounter
something unexpected, an effective approach is to isolate the problem by isolating the
code. In doing so, you can reproduce the problem and problem area, making the fix much
easier.
The Middle Years (included for completeness' sake)
The middle years included exposure to Turbo Pascal,
MASM, C, and C++, and some small experiences in other places as well. Although I learned many lessons,
there are far too many to list here, and most are so small as to not be significant on their own.
Therefore, they are uninteresting for the purposes of this post.
However, there were two lessons I learned from this time (but not during) that are significant:
As you can tell, I was quite the cowboy coding young buck. I've tried to change that in recent years.
How ColdFusion made me a better programmer when I use Java
Although I've written a ton of bad code in ColdFusion, I've also written a couple of good lines
here and there. I came into ColdFusion with the experiences I've related above, and my early times
with it definitely illustrate that fact. I cared nothing for small files, knew nothing of abstraction,
and horrendous god-files were created as a result.
If you're a fan of Italian food, looking through my code would make your mouth water.
DRY principle?
Forget about it. I still thought code reuse meant copy and paste.
Still, ColdFusion taught me one important aspect that got me started on the path to
Object Oriented Enlightenment:
Database access shouldn't require several lines of boilerplate code to execute one line of SQL.
Because of my experience with ColdFusion, I wrote my first reusable class in Java that took the boilerplating away, let me instantiate a single object,
and use it for queries.
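The original class was in Java and isn't reproduced here. As a rough sketch of the same idea in Python, using the standard library's `sqlite3` module, a hypothetical `Db` wrapper might look like this:

```python
import sqlite3


class Db:
    """Hypothetical helper: hides connection and cursor boilerplate behind query()."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        # Rows that can be converted to dicts keyed by column name.
        self._conn.row_factory = sqlite3.Row

    def query(self, sql, params=()):
        # Using the connection as a context manager commits on success
        # and rolls back if the statement raises.
        with self._conn:
            cursor = self._conn.execute(sql, params)
            return [dict(row) for row in cursor.fetchall()]
```

Usage is then one line per statement, for example `Db().query("SELECT 1 AS one")`, with parameters passed separately rather than pasted into the SQL string.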
How Java taught me to write better programs in Ruby, C#, CF and others
It was around the time I started using Java quite a bit that I discovered Uncle Bob's Principles of OOD,
so much of the improvement here is only indirectly related to Java.
Sure, I had heard about object oriented programming, but either I shrugged it off ("who needs that?") or
didn't "get" it (or more likely, a combination of both).
Whatever it was, it took a couple of years of revisiting my own crapcode in ColdFusion and Java as a "professional"
to tip me over the edge. I had to find a better way: Grad school here I come!
The better way was to find a new career. I was going to enter as a Political Scientist
and drop programming altogether. I had seemingly lost all passion for the subject.
Fortunately for me now, the political science department wasn't accepting Spring entrance, so I decided to
at least get started in computer science. Even more luckily, that first semester
Venkat introduced me to the solution to many of my problems,
and got me excited about programming again.
I was using Java fairly heavily during all this time, so learning the principles behind OO in depth and
in Java allowed me to extrapolate that for use in other languages.
I focused on principles, not recipes.
On top of it all, Java taught me about unit testing with
JUnit. Now, the first thing I look for when evaluating a language
is a unit testing framework.
What Ruby taught me that the others didn't
My experience with Ruby over the last year or so has been invaluable. In particular, there are four
lessons I've taken (or am in the process of taking):
The importance of code as data, or higher-order functions, or first-class functions, or blocks or
closures: After learning how to appropriately use yield, I really miss it when I'm
using a language where it's lacking.
Metaprogramming is OK. Before Ruby, I used metaprogramming very sparingly. Part of that is because
I didn't understand it, and the other part is I didn't take the time to understand it because I
had heard how slow it can make your programs.
Needless to say, after seeing it in action in Ruby, I started using those features more extensively
everywhere else. After seeing Rails, I very rarely write queries in ColdFusion - instead, I've
got a component that takes care of it for me.
Because of my interests in Java and Ruby, I've recently started browsing JRuby's source code
and issue tracker.
I'm not yet able to put into words what I'm learning, but that time will come with
some more experience. In any case, I can't imagine that I'll learn nothing from the likes of
Charlie Nutter, Ola Bini,
Thomas Enebo, and others. Can you?
What's next?
Missing from my experience has been a functional language. Sure, I had a tiny bit of Lisp in college, but
not enough to say I got anything out of it. So this year, I'm going to do something useful and not useful
in Erlang. Perhaps next I'll go for Lisp. We'll see where time takes me after that.
That's been my journey. What's yours been like?
Now that I've written that post, I have a request for a post I'd like to see:
What have you learned from a non-programming-related discipline that's made you a better programmer?
Posted by Sam on Jan 07, 2008 at 06:42 AM UTC - 5 hrs
A note to myself (a .NET neophyte) and others who may not know how ASP.NET works:
I was writing a user control (we'll call it ContentBoxVariation) in ASP.NET which composes another (ContentBox). Both have a public property Title, with getters and setters.
You might call ContentBoxVariation in an .aspx page like this:
<aNamespace:ContentBoxVariation ID="ContentBoxVariation1" Title="Welcome to Sam's" runat="server"/>
Then ContentBoxVariation includes ContentBox like this:
You might think that the ContentBoxVariation would pass its Title to the ContentBox, and that the result would be "Welcome to Sam's."
Unfortunately, it does pass its title, but the title hasn't been changed from its default of "" at the time it happens. As far as I can tell, the instantiation happens in this order:
Instantiate ContentBoxVariation
Instantiate the composed ContentBox
Set ContentBox's Title to the variation's title, which is currently "". Even if setting it to a different variable, it seems to stay blank as if that variable doesn't exist, even though it does and no error occurs.
Set the variation's title.
The order makes sense if you are looking at it like that, but being in templates and looking at it from that point of view, it is surprising.
The solution is to explicitly set the title for ContentBox in code, like this:
Posted by Sam on Jan 04, 2008 at 07:01 AM UTC - 5 hrs
At the beginning of this week's chapter from My Job Went To India,
Chad Fowler relates a story about Rao, a "mind-reading" programmer who could pick up on the
subtleties in conversations and implement code before you realized you had asked for it.
Both when I first read this, and the second time around, alarms went off in my head: "What about
YAGNI?" Why would you implement something
that wasn't asked for? If you are wrong, you could be relieved of your position for wasting
time and money.
Thankfully, Chad addressed my concerns. It turns out,
We might be standing around waiting for a pot of coffee to brew, and I would talk about how
great it would be if we had some new flexibility in our code that didn't exist before. If I
said it often enough or with enough conviction, even though I hadn't really put it on the
team's TO-DO list, Rao might fill the gaps between "real work" looking at the feasibility of
implementing one of these things. If it was easy (and cheap) to implement, he'd whip it
out and check it in.
(emphasis mine)
Chad also mentions the potential pitfalls in such an approach:
You waste time and money if the functionality was not needed
You increase complexity of the code base and make "it less resilient to change" if your
code forces "the system down a particular architectural path."
You could unintentionally make the application "less functional or desirable to the customer."
Honestly, I'd caution against using this advice unless you are in one of the following situations:
You've known the feature-requester long enough that you can pick up on things he wants but hasn't
yet asked for. I think you should be really close in this situation. How can you predict otherwise?
There is obviously something missing from the spec, and you have enough experience with
similar systems to know it is missing. This might be something like "We need a login system."
You can probably safely assume they'll need a way to log out as well, and perhaps even
"I forgot my password" functionality.
The logout functionality I'd almost always toss in (unless requested otherwise). However, even
on the "forgot password" feature, I'd consider a couple of things. First, do I know the customer
well enough that we've done another application for them and they wanted it? Second, is the
budget big enough that I know they won't be upset if I implement it?
There could be more, but that's what my brain thought of this morning. Of course, in many
cases it's just better to ask first.
Posted by Sam on Jan 02, 2008 at 08:40 PM UTC - 5 hrs
It's the new year, and it's time to get back in the swing of things after the hectic holiday season. I had planned on taking the rest of this week off from posting as well, but I'm starting to feel behind on things, so this will let me set down my goals and focus on them in the coming year.
I had actually planned a different post for today, but Dan Vega inspired me with his list of goals so that's why you're reading this instead. Like Dan, I'm going to try to keep mine positive and specific, with an emphasis on SMART objectives.
With no further ado, here are my top professional goals for 2008:
Start regular practice sessions: When writing about my experiences with MJWTI, I resolved to practice programming. Therefore, that's on my list and I want to devote
between 1-3 hours per week to doing it. However, I also want to:
Devote more time to open source software: I already try to spend some time doing this, but I'd like to devote a day to it each week. My original
goal was to contribute 1 patch each week, but I didn't think that would be achievable as I plan to switch projects on occasion (for practice, above), and when I do, it is tough to come up to speed, much less provide a patch. So my goal will be to devote 4-6 hours of one day a week to open source, and a patch (however small) every three weeks.
This also means I'll be getting more practice, so I may confine the practice sessions to 1 hour and make those be the constraint-driven problems I mentioned in the article linked above. I have no goal for the number of accepted patches, just that I offer them to the project.
Do something not useful (and something useful) in Erlang: With multi-core processors becoming the norm, concurrent programming is going to get big. Since Erlang was designed with concurrency as a goal, it also has the potential to be big. This is a no-brainer to try and stay ahead of the curve. Obviously whatever useless and useful things I do in Erlang will need to have concurrent aspects to them.
Speak at a conference: I don't see this happening until 2009. The first half of this year I will be too busy to practice, much less attend and speak, so I'll plan on using the second half of 2008 as practice towards the ultimate goal of speaking at a conference (or multiple) in 2009.
Attend a conference (or multiple): First on my list is a ColdFusion conference, just to meet a lot of you that I exchange thoughts with on a regular basis. Second on my list is to attend a conference focused on a language or platform I don't use or that I am quite new to using. Third would be a Ruby conference. Finally, I'd like to re-attend a No Fluff Just Stuff symposium, since it was so rewarding the first time.
Realistically, the money situation right now (being a student until May) is too tight to even consider four conferences. Further, in June I'll be getting married, buying a house (or before then), and having a family to take care of, so even going to one conference may be financially irresponsible. Because of that, I'll have to reconsider this as time goes along, or see if I can get someone to pay for me to go. That leads me to my final, and most important goal:
Increase my earnings: I have several goals at the company I work for, but aside from those, I'd like to:
Get another regular source of income as a contractor. I'm also willing to take non-regular work, but ideally I'd find
a cool project to work on with a regular source of income, and with growth potential. I do have one in mind if it's still available, but I might like to take some more work as the year progresses.
Start implementing some of the things on my Ideas List instead of just continuing to add to it. Some of these might require
renting the ideas to companies, so forgive me for not sharing at the moment.
Posted by Sam on Dec 24, 2007 at 04:52 PM UTC - 5 hrs
Suppose for the purposes of our example we have string the_string of length n, and we're trying to determine if string the_substring of length m is found within the_string.
The straightforward approach in many languages would be to use a find() or indexOf() function on the string. It might look like this:
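In Ruby, for example, the built-in String#index plays that role. Here's a minimal sketch; the sample values are made up for illustration:

```ruby
the_string = "abracadabra"
the_substring = "ac"

# String#index returns the position of the first match, or nil if there is none
position = the_string.index(the_substring)
missing = the_string.index("zz")

puts position.inspect
puts missing.inspect
```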
However, if no such method exists, the straightforward approach would be to just scan all the substrings and compare them against the_substring until you find a match. Even if the aforementioned function exists, it likely uses the same strategy:
def find_substring_pos(the_string, the_substring)
  (0..(the_string.length - 1)).each do |i|
    this_sub = the_string[i, the_substring.length]
    return i if this_sub == the_substring
  end
  return nil
end
That is roughly an O(n) function when the_substring is short (O(m*n) in the worst case), which is normally fast enough.
Even though I'm one of the guys who camps out in line so I can be one of the first to say "don't prematurely optimize your code," there are situations where the most straightforward way to program something just doesn't work. One of those situations is where you have a long string (or set of data), and you will need to do many comparisons over it looking for substrings. Although you'll find it in many cases, an example of the need for this I've seen relatively recently occurs in bioinformatics, when searching through an organism's genome for specific subsequences. (Can you think of any other examples you've seen?)
In that case, with m much smaller than a very large n, O(m * log n) represents a significant improvement over O(n) (or worst case m*n). We can get there with a suffix array.
Of course building the suffix array takes some time - so much so that if we had to build it for each comparison, we're better off with the straightforward approach. But the idea is that we'll build it once, and reuse it many times, amortizing the cost out to "negligible" over time.
The idea of the suffix array is that you store every suffix of a string (and its position) in a sorted array. This way, you can do a binary search for the substring in log n time. After that, you just need to compare to see if the_substring is there, and if so, return the associated index.
The Wikipedia page linked above uses the example of "abracadabra." The suffix array would store each of these suffixes, in order:
a
abra
abracadabra
acadabra
adabra
bra
bracadabra
cadabra
dabra
ra
racadabra
Below is an implementation of a suffix array in Ruby. You might want to write a more efficient sort algorithm, as I'm not sure what approach Enumerable#sort takes. Also, you might want to take into account the
ability to get all substrings, not just the first one to be found.
class SuffixArray
  def initialize(the_string)
    @the_string = the_string
    @suffix_array = Array.new
    # build the suffixes
    last_index = the_string.length - 1
    (0..last_index).each do |i|
      the_suffix = the_string[i..last_index]
      the_position = i
      # << is the append (or push) operator for arrays in Ruby
      @suffix_array << { :suffix => the_suffix, :position => the_position }
    end
    # sort the suffix array
    @suffix_array.sort! { |a, b| a[:suffix] <=> b[:suffix] }
  end

  def find_substring(the_substring)
    # uses typical binary search
    high = @suffix_array.length - 1
    low = 0
    while (low <= high)
      mid = (high + low) / 2
      this_suffix = @suffix_array[mid][:suffix]
      compare_len = the_substring.length - 1
      comparison = this_suffix[0..compare_len]
      if comparison > the_substring
        high = mid - 1
      elsif comparison < the_substring
        low = mid + 1
      else
        return @suffix_array[mid][:position]
      end
    end
    return nil
  end
end

sa = SuffixArray.new("abracadabra")
puts sa.find_substring("ac") # outputs 3
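As for getting all occurrences rather than just the first one found, here is one possible sketch. It is not part of the class above: the helper names build_suffix_positions and find_all_positions are made up for illustration, and it stores only the start positions instead of the suffix strings. It relies on the fact that every suffix starting with the_substring ends up next to the others in sorted order.

```ruby
# Hypothetical helpers (not from the post): build the sorted suffix
# positions once, then reuse them for every lookup.
def build_suffix_positions(the_string)
  # Store only the start index of each suffix, sorted by the suffix text.
  (0...the_string.length).sort_by { |i| the_string[i..-1] }
end

def find_all_positions(the_string, positions, the_substring)
  m = the_substring.length

  # Binary search for the first suffix whose first m characters
  # are not less than the_substring (a lower bound).
  low = 0
  high = positions.length
  while low < high
    mid = (low + high) / 2
    if the_string[positions[mid], m] < the_substring
      low = mid + 1
    else
      high = mid
    end
  end

  # Matching suffixes are adjacent in sorted order, so walk forward
  # from the lower bound while the prefix still matches.
  matches = []
  while low < positions.length && the_string[positions[low], m] == the_substring
    matches << positions[low]
    low += 1
  end
  matches.sort
end

the_string = "abracadabra"
positions = build_suffix_positions(the_string)
p find_all_positions(the_string, positions, "abra") # prints [0, 7]
```

The binary search costs about m * log n character comparisons, and collecting the matches adds one comparison per occurrence.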
Thoughts, corrections, and improvements are always appreciated.
Update: Thanks to Walter's comment below, the return statement above has been corrected.
Posted by Sam on Dec 21, 2007 at 12:38 PM UTC - 5 hrs
This seems to be becoming a theme here lately: DIFN.
That's the advice in MJWTI for this week, although Chad Fowler doesn't put it so bluntly.
In the chapter, Chad describes a race where the first team to complete a project over the weekend wins $100 thousand. Could you do it?
How is it that an application
of similar scope to those we spend weeks working on in the office is going to get
finished in a single weekend?
We've all seen projects take weeks when they could be measured in days. So what gives?
The answer, of course, is that we aren't accustomed to doing it right now. Stop putting off tasks. Just do them.
To help meet that goal and create race conditions, I like to timebox my daily tasks.
From 5:30 to 6:15 I read my email and do my morning blog reading. Then I take 15 minutes and enjoy a cold Red Bull. After that, I might work on Project A for 3 hours, then read email for 15 minutes. I've got half a day's work done before most people get to the office. After that, I might switch to Project B for three more hours, and so on.
To keep track of what I should be working on and give myself pop-up reminders that it's time to change tasks, I've been using Apple's iCal, and it works pretty well.
My only problem is that as I need to work more often in Windows, I'm not using it as much, and particularly this week my productivity has been way down. (I admit, the impending holiday may have something to do with that as well.) However, FedEx just dropped off VMWare Fusion, so hopefully I won't need to boot into Windows anymore and the problem will be solved.
My only complaint against iCal itself is that I wish I didn't have to set up an email address in the mail client for it to send me an email - that's just annoying.
If you're not on a Mac, Google Calendar would work (except you're not getting the popup reminders). Even just spending 15 minutes before you leave work to plan the next day, and writing it on the whiteboard or some sticky-notes would likely be a major improvement for your work-day, and might even be better than a technology-based solution.
How have you made it easier on yourself to do it right now?
Posted by Sam on Dec 19, 2007 at 09:26 AM UTC - 5 hrs
A while back I started a Twitter account with the idea of using it as a tumblelog
for quotes about software that I wanted to highlight. Unfortunately, the small limit on the number of
characters Twitter enforces didn't allow me to post entire quotes, much less attribute them.
Likewise, I don't like to have too many microposts on this blog, so I've decided to save them up and start
a Quotables series. The idea is that I'll post quotes about programming that have one or more of the
following attributes:
I find funny
I find asinine
I find insightfully true
And stand on their own, with little to no comment needed
Here's the first in that series. I hope you enjoy them as much as I did:
The odds of finding truly beautiful code in most production systems seem to be on par with
the odds of finding a well-read copy of IEEE Transactions on Software Engineering in Paris Hilton's
apartment.
If you can't make a decent web page in .NET you shouldn't be in the business.
-Commenter at Reddit or Digg (I can't seem to find it anymore, and I just copied the quote a couple of months ago)
You have to take responsibility for teaching yourself, and that is a far greater responsibility than skimming a book and fooling around copying and pasting code from web pages. You can't just take basic or even sketchy knowledge of how to program in one language and "transfer" it to another language. You think you can just "pick it up," but in reality you can't, and neither can I.
But the fatal flaw in the GoF book was that they included recipes.
And many people thought they were the best part. Even now, you see books on Java design patterns
that blindly mimic the structure of the examples in the GoF book (even though Java has some better
mechanisms, like interfaces vs. pure virtual classes). Recipes bad. Because they suggest more than just a way to name common things. They imply (and put in you face) implementation details.
Because of meta-programming, many of the design patterns in the GoF book (especially the structural ones) have much simpler, cleaner implementations. Yet if you come from a weaker language, your first impulse is to implement solutions just as you would from the recipe.
At some point, you have to have the guts to go against the grain. Just because a "best practice" works for someone else at some other company doesn't necessarily make it a "best practice" for you and your company. A "proven methodology" isn't necessarily going to be a "proven methodology" for you. Have the guts to challenge the status quo. If it's not making you more efficient, it's likely hindering you. Refactor it out.
Posted by Sam on Dec 14, 2007 at 03:33 PM UTC - 5 hrs
This week I return to following the advice in Chad's book. It's something I've been doing now for a while: automation.
I'm really big into automation - one of the things I really like to do is create developer tools, or even just small throwaway scripts that get
me through the day.
One paragraph that stuck with me was this one:
So, imagine your company is in the business of creating websites for small
businesses. You basically need to create the same site over and over again,
with contacts, surveys, shopping carts, the works. You could either hire
a small number of really fast programmers to build the sites for you, hire
an army of low-cost programmers to do the whole thing manually and
repetitively, or create a system for generating the sites.
Sound like anyone you know? (Or any of the other people writing generators, automated testers, and the like?)
It was after reading that paragraph that I decided we needed to change things at work. Forget about code repetition, there was plenty of effort repetition as well. The first part of that process was getting cfrails together, the remaining part is to build a WYSIWYG editor for building sites - if I ever get around to it.
There are other things to automate besides frameworks that generate code. Neal Ford has a pair of talks (both links there are PDFs found via his past conferences page) he gives that illustrate a bunch of tips and "patterns" along these lines. I enjoyed both of them and will eventually get around to reviewing them. He also
mentioned that a book covering the topic is coming soon.
Getting back to MJWTI, Chad lists a "simple (minded) formula" to calculate productivity:
productivity = # projects or features or amount of work / (# programmers * average hourly rate)
At the end of the chapter he shows it in action: 5 units of work with 3 fast programmers at $80 per hour would be as productive as 20 programmers at $12 per hour on the same project (obviously ignoring the communication deficiencies and other pitfalls of a group that large). But if you are able to automate enough of your work, you can be the single programmer at $80 per hour on the same 5 units of work.
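To make the arithmetic concrete, here is a small Ruby sketch of that formula. The method name productivity and the use of Rational (to avoid floating-point rounding) are my own choices for illustration, not something from the book.

```ruby
# productivity = work / (programmers * average hourly rate)
# Rational keeps the fractions exact so the comparisons below are reliable.
def productivity(work_units, programmers, hourly_rate)
  Rational(work_units, programmers * hourly_rate)
end

fast_team   = productivity(5, 3, 80)   # 5 / 240
cheap_army  = productivity(5, 20, 12)  # 5 / 240
automated   = productivity(5, 1, 80)   # 5 / 80

puts fast_team == cheap_army   # the two teams come out equal
puts automated / fast_team     # how many times more productive the lone automator is
```

By this (simple-minded) measure, both teams cost $240 per hour for the same 5 units, while the lone programmer who automates costs $80 per hour, which works out to three times the productivity.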
The exact math isn't what's important - the fact that you are more productive by automating as much as possible is.
In what ways do you automate your workday, no matter how big or small?
It's easy to say, "yes, we can publish software in the same manner." Every time we offer a
download, it's done just in time. This post was copied and downloaded (published) at the
moment you requested it.
What my question covers is this: Can we think of an idea that would be repeatable, sell
it to customers to fund the project, and then deliver it when it's done? (It should be sold to
many customers, as opposed to custom software, which is, for the most part, already developed in that manner.)
In essence, can we pre-sell vaporware?
We already pre-sell all types of software - but that software is (presumably) nearing a
releasable state (I've had my doubts about some of it). Can we take it to the next level
and sell something which doesn't yet exist?
If such a thing is possible, there are at least three things you'll need to
be successful (and I bet there are more):
A solid reputation for excellence in the domain you're selling to, or a salesperson with
such a reputation, and the trust that goes with it.
A small enough idea such that it can be implemented in a relatively short time-frame. This, I
gather, would be related to the industry in which you're selling the software.
A strong history of delivering products on time.
What do you think? Is it possible? If so, what other qualities do you need to possess to
be successful? If not, what makes you skeptical?
Posted by Sam on Dec 10, 2007 at 11:46 AM UTC - 5 hrs
Here's some pseudocode that got added to a production system that might just be the very definition of a simple change:
Add a link from one page to cancel_order.cfm?orderID=12345
In that new page, add the following two queries:
update orders set canceled = 1, canceledOn=getDate() where orderID=#url.orderID#
delete from orderItems
Now, upload those changes to the production server, and run it real quick to be sure it does what you meant it to do.
Then you say to yourself, "Wait, why is the page taking several seconds to load?"
"Holy $%^@," you think aloud, "I just deleted every item from every order in the system!"
It's easy enough for you to recover the data from the backups. It isn't quite as easy to recover from the heart attack.
Steve McConnell (among others) says that the easiest changes are the most important ones to test, as you aren't thinking quite as hard about it when you make them.
Posted by Sam on Dec 07, 2007 at 03:06 PM UTC - 5 hrs
When someone starts complaining about customers who are making silly requests, I normally say something like,
"I know! If it weren't for those damn customers, we'd have a perfect program!"
There'd be no one using it, but hey - the application would be sweeeeet.
This week I'm going to diverge from Chad's book on how to save your job. That's mostly
because I don't have the book with me, but this has been on my mind the last couple of days
anyway: the fear of success.
I've noticed it in myself and others from time to time - inexplicably sabotaging opportunities to succeed.
I try not to listen to that voice now if I can help it.
More recently, I've started to notice it in companies and customers as well - groups as opposed to individuals.
I've started wondering if reluctance to "go live" until the product is a symbol of perfection
fits in with this phenomenon.
What can we do to help them get over this irrational behavior? If they continuously request those trivial changes
and never go live, the project has failed. Do you think they will blame themselves, their ideas, and their actions?
No, they will blame you, and find someone else to work with next time.
So you may have been paid for your time, but it still impacts you negatively.
Don't get me wrong - sometimes there are good reasons to wait to release a product or service. Sometimes,
you don't need to DIFN.
However, the fear
that your customers won't know to look under "output devices" to find a subcategory of "printers" is
probably not on that list of reasons. Someone has been using a product to great advantage
for many years
and you want to "wait until you finish the last bit" to sell it as a whole to others - also probably not
on that list. You want the login on the left hand side instead of the right?
After a week of such changes, it's one thing. Six months? GMAFB.
Perhaps you'd have been better off letting your customers use it to see if they got confused, preferred
blue links to red ones, or even happened upon an idea to make the application flow better.
So what does make the "OK to wait"-list? The fear
of underwhelming an audience with your unfinished product would, especially if you
get to show them exactly one time. I can't think of much else that does. Can you?
So the point is that you need to get over the fear of success. Stop snatching defeat from the jaws of
victory. Let a good thing or two happen. Help your customers get past their fears.
Changing ourselves
to recognize that fear and ignore it is something we can all do. Looking at our customers' excuses to
keep the product in the warehouse from a fear-of-success angle might provide a way to relate to them
instead of scoffing at their incessant requests for frivolity.
Success is staring you in the face. All you have to do is stick your hand out and embrace hers. Why do
you turn and run away?
I'm exploring this space for the first time.
Obviously, I have a lot of questions and very few answers. If you've got either of them, let me know
in the comments - it's always appreciated.
Posted by Sam on Dec 05, 2007 at 07:02 AM UTC - 5 hrs
I'd like a codometer to count all the lines of code I write during the day. It should keep track of lines that get kept and lines that get removed. I don't know what that information would tell me, but I'm curious about it. It should probably work independent of the IDE, since I often use several during the day.
I'd like it if not only you would stop stealing my focus, but also provide updates in the corner of the screen. When I've put you in the background, you should let me know when you're done processing so I can come and click the "next" button. On top of that, give me an option to have you click next automatically for me.
Someone found a set of bicycle pedals that fit under the desk for me. Excellent to be able to get a little exercise while I do my morning blog-reading. I couldn't find one the last time I looked, but I did this time. I'm not sure if mine are the same, or how it will work, but I will let you know when I do.
Posted by Sam on Dec 03, 2007 at 06:32 AM UTC - 5 hrs
It's not a hard thing to come up with, but it's incredibly useful. Suppose you need to
iterate over each pair of values or indices in an array. Do you really want to
duplicate those nested loops in several places in your code? Of course not. Yet
another example of why code as data is such a powerful concept:
class Array
  # define an iterator over each pair of indexes in an array
  def each_pair_index
    (0..(self.length - 1)).each do |i|
      ((i + 1)..(self.length - 1)).each do |j|
        yield i, j
      end
    end
  end

  # define an iterator over each pair of values in an array for easy reuse
  def each_pair
    self.each_pair_index do |i, j|
      yield self[i], self[j]
    end
  end
end
Now you can just call array.each_pair { |a,b| do_something_with(a, b) }.
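If you'd rather not reopen Array, the same idea works as a free-standing method. The sketch below uses a hypothetical name, each_pair_of, and compares its output with Ruby's built-in Array#combination(2), which enumerates the same unordered pairs.

```ruby
# Standalone version of the pair iterator (a hypothetical free function,
# rather than a method added to Array).
def each_pair_of(array)
  (0...array.length).each do |i|
    ((i + 1)...array.length).each do |j|
      yield array[i], array[j]
    end
  end
end

pairs = []
each_pair_of([1, 2, 3, 4]) { |a, b| pairs << [a, b] }

# Array#combination(2) yields every 2-element combination of the array.
built_in = [1, 2, 3, 4].combination(2).to_a

p pairs
p pairs.sort == built_in.sort
```

Four elements give six pairs, and each pair is visited exactly once with the earlier element first.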
Posted by Sam on Nov 30, 2007 at 06:46 AM UTC - 5 hrs
Although computer science is a young field of study, it is rife with examples of good and bad
ways to do things. This week's advice from MJWTI
instructs us to focus on the past - learn from the successes and failures of those who came before
us.
Chad includes a quote from a Jazz musician that illustrates the concept of how we
learn to become masters quite well:
My advice is to do it by the book, get good at the practices, then do as you will. Many people want to skip to step three. How do they know?
Similarly, part of code practice
involves studying other people's code.
Doing so is good to teach you new tricks and things to avoid. But as
Chad also mentions, it also exposes you to projects you might not have otherwise known about -
giving you the option in the future to reuse it instead of writing your own version if your
application requires it.
But not having those books won't stop me from reading source code - I plan to start that as
part of my weekly practice sessions. It fits so well with one of the things I'm most
interested in - improving the design of my own applications.
You can say you don't run into trouble and therefore, your design is good, but how would you know until
you've gone back to it after not looking at it for a year? You need something to compare it
to, and I'm not convinced a UML diagram will suffice.
In the end, Chad gives two action items to follow up on:
Pick a project to read, make notes, and "outline the good and the bad." Use that
experience to publish a critique of
the project and its code.
Start a study group to dissect and learn from code.
I'd like to start reading other source code, but I'm not sure when I'll publish a critique of
it. On top of that, one of the things I'd like to do in the code dojo is dissect other people's code, even
if I already find it helpful to analyze our own.
When you look at code, do you do it with a critical eye?
Posted by Sam on Nov 28, 2007 at 08:07 AM UTC - 5 hrs
The other day I was working on a crossover function to be used by a genetic algorithm.
The idea is that you select two individuals in your population of possible solutions
as parents (or more - bits aren't predisposed to monogamy or
bisexual reproduction) with the idea
that you'll combine their "DNA" in hopes of producing a more fit individual.
The idea of the
crossover for my case was a single point, and it goes somewhat like this
(a slightly simplified version, for the sake of discussion):
The parents look like this: "-X--XX-X--X", where X is some character besides "-"
Left_side_of_child = select a random number of characters from the left side of one of the parents
Count the number of non-dashes in Left_side_of_child - that is the number of characters you need to
skip on the other parent.
Figure out where to start on the other parent. If that index is a dash, you can randomly select
any adjacent dash until you reach any other character.
Get the right side of the second parent and append it to the left_side_of_child to give birth to
your new baby string.
So the idea is that the X's in each parent are meaningful and they need to remain the same in number
and relative position to each other - but dashes
can be inserted between them.
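The steps above can be sketched in Ruby roughly like this. This is not the function I ended up with, just an illustration under some assumptions: both parents carry the same number of non-dash characters, the method name crossover and the rng parameter are invented for the example, and "any adjacent dash" is read as "any position in the run of dashes, up to and including the next non-dash character".

```ruby
# Single-point crossover sketch. Assumes both parents contain the same
# number of non-dash characters; names and rng handling are illustrative.
def crossover(parent_a, parent_b, rng = Random.new)
  # Take a random-length prefix of the first parent.
  cut = rng.rand(0..parent_a.length)
  left = parent_a[0, cut]

  # Count the meaningful (non-dash) characters already used.
  skip = left.delete("-").length

  # Walk the other parent until that many non-dash characters are behind us.
  start = 0
  seen = 0
  while seen < skip
    seen += 1 if parent_b[start] != "-"
    start += 1
  end

  # If a run of dashes follows, any position in that run (up to the next
  # non-dash character) is an equally valid place to start.
  run_end = start
  run_end += 1 while run_end < parent_b.length && parent_b[run_end] == "-"
  start = rng.rand(start..run_end)

  # Append the rest of the second parent to give birth to the child.
  left + parent_b[start..-1]
end

child = crossover("-A--BC-D--E", "A-B-C--D-E-", Random.new(42))
puts child
puts child.delete("-") # the letters keep their number and relative order
```

Passing in a seeded Random makes a run repeatable, which helps when checking values after every couple of lines.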
It's not the most complicated algorithm in the world, so why did I spend several hours getting it to work?
Two reasons:
Most of the time, I try to write (literally) just a couple of lines of code before testing it to ensure
what I just wrote is doing what I expected it to do.
If you write too many lines of code before exercising it, when you notice a problem you'll
have a harder time diagnosing which line caused it than if you had frequent feedback. Just check
that the variables have
the values you are expecting at each point in the algorithm as often as possible.
Like many programmers, I sometimes have this insane fear of throwing away code - even if it's crapcode.
The function that seemed so simple to begin with had ballooned to over 40 lines, and included
several attempts at fixing the data instead of fixing the logic. Magic numbers were sprinkled like
pixie dust throughout the algorithm. Each patch brought further complexity, making it harder and harder
to find the problem.
I kept telling myself, "This is retarded. Why is this seemingly simple function causing you so much
pain? Just throw it out and start over." I simply waited too long to do that. Eventually I came to
my senses, started from scratch, and rewrote it to be 15 lines in about 10 minutes.
Lesson: Don't be afraid
to throw out your dirty diapers.
It's good to relearn these lessons from time to time. I expect I won't be forgetting these two now for quite
a while.
My goal is to review everything before the end of the weekend, and send the iPod out on Monday (along with an announcement here of the winner, and recognition of the other participants - so if you want to be excluded for some reason, let me know that as well).
If you haven't started, there's still enough time to come up with a solution: it needn't be long or difficult - just demonstrate something new in a language you haven't had much experience in.
If you've got a blog, post the solution there and let me know about it. If not, send it to me directly - first get in touch with me via my contact page and then send it via email.
Posted by Sam on Nov 26, 2007 at 06:08 AM UTC - 5 hrs
Last week I posted about why software developers should care about process, and
how they can improve themselves by doing so. In the comments, I promised to give a review of
what I'm doing that seems to be working for me. So here they are - the bits and pieces that work for me.
Also included are new things we've decided to try, along with some notes about what I'd like to
attempt in the future.
Preproject Considerations
Most of our business comes through referrals or new projects from existing customers.
Out of those, we try only to accept referrals or repeat business from
the "good clients," believing
their friends will be similarly low maintenance, high value, and most importantly, great to work with.
We have tried the RFP circuit in the past, and recently considered
going at it again. However, after a review of our experiences with it, we felt that unless you are the cause of the RFP
being initiated, you have a subatomically small chance of being selected for the project (we've been on both
ends of that one).
Since it typically takes incredible effort to craft a response, it just seems like a waste of hours
to pursue.
On the other hand, we are considering creating a default template and using minimal
customization to put out for future RFPs, and even then, only considering ones that have a very
detailed scope, to minimize our effort on the proposal even further.
We're also trying to move ourselves into the repeatable solutions space - something that really takes the
cheap manufacturing ability we have in software - copying bits from one piece of hardware storage to another -
and puts it to good use.
Finally, I'm very interested to hear how some of you in the small software business world bring in business.
I know we're technically competitors and all, but really, how can you compete with
this?
The Software Development Life Cycle
I won't bother you by giving a "phase" by phase analysis here. Part of that is because I'm not sure
if we do all the phases, or if we're just so flexible and have such short iterations the phases seem to bleed
together. (Nor do I want to spend the time to figure out which category each thing belongs in.)
Depending on the project, it could be either. Instead, I'll bore you with what we do pretty
much every time:
At the start of a project, we sit down with the client and take requirements. There's nothing fancy here.
I'm the coder and I get involved - we've found that it's a ridiculous waste of time to pass
my questions through a mediator and wait two weeks to get an answer. Instead, we take some paper or
cards and pen, and dry erase markers for the whiteboard. We talk through what the system should do at a high level,
and make notes of it.
We try to list every feature in terms of the users who will perform it and its reason for existence.
If that's unknown, at least we know the feature, even if we don't know who will get to use it or why
it's needed. All of this basically gives us our "use cases,"
without a lot of the formality.
I should also note that we do the formal bit if the need is there, or if the client wants to
work that way. But those meetings can easily get boring, and when no one wants to be there, it's not
an incredibly productive environment. If we're talking about doing the project in Rails or ColdFusion,
it often takes me longer to write a use case than it would to implement
the feature and show it to the client for feedback, so you can see why it might be
more productive to skip the formality in cases that don't require it.
After we get a list of all the features we can think of, I'll get some rough estimates of points
(not hours) of each feature to the client, to give them an idea of the relative costs for each feature.
If there is a feature which is something fairly unrelated to anything we've had experience with, we give
it the maximum score, or change it to an "investigate point cost," which would be the points we'd need
to expend to do some research to get a better estimate of relative effort.
Armed with that knowledge, they can then give me a prioritized list of the features they'd like to see
by next Friday when I ask them to pick X number of points for us to work on in the next week. Then
we'll discuss in more detail those features they've chosen, to get a better idea of exactly what it is
they're asking for.
We repeat that each iteration, adjusting the X number of points the client
gets to choose based on what was actually accomplished the previous iteration - if there was spare time,
they get a few more points. If we didn't finish, those go on the backlog and the client has fewer points
to spend. Normally, we don't have the need for face to face meetings after the initial one, but I prefer
to have them if we can. We're just not religious about it.
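As a rough illustration of that adjustment, here is a small Python sketch. It assumes the simplest possible rule - next iteration's budget is whatever was actually completed, plus any points of spare time - which is closer to XP's "yesterday's weather" than to anything we've written down. The function name, the spare_points parameter, and the feature names are all made up for the example.

```python
def plan_next_iteration(chosen, finished, spare_points=0):
    """Return (next_budget, carried_over) after an iteration ends.

    chosen:       dict of feature name -> points picked for the iteration
    finished:     set of feature names that were actually completed
    spare_points: hypothetical allowance for time left over at the end
    """
    # Unfinished features go back on the backlog.
    carried_over = {name: pts for name, pts in chosen.items()
                    if name not in finished}

    # The client gets to spend what we actually got done, plus a few more
    # points if there was spare time.
    completed = sum(pts for name, pts in chosen.items() if name in finished)
    next_budget = completed + spare_points
    return next_budget, carried_over


if __name__ == "__main__":
    chosen = {"login": 3, "reports": 5, "export": 2}
    budget, backlog = plan_next_iteration(chosen, {"login", "export"})
    print(budget, backlog)
```

In the example, only "login" and "export" were finished, so the client has fewer points to spend next week and "reports" returns to the backlog.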
Whiteboards at this meeting are particularly useful, as most ideas can be illustrated quite quickly, have
their picture taken, and be erased when no longer needed. Plus, it lets everyone get involved when we start
prioritizing. Notecards are also nice as they swap places with each other with incredible ease.
Within each iteration,
we start working immediately. Most of the time, we have one week iterations, unless there are a couple of projects going on -
then we'll go on two week iterations, alternating between clients. If the project is relatively stable,
we might even do daily releases. On top of that,
we'll interface with the client daily if they are available that frequently, and if there is something to show.
If the project size warrants it, we (or I) track our progress in consuming points on a burndown chart.
This would typically be for anything a month or longer. If you'll be mostly done with a project in a week,
I don't see the point in coming up with one of these. You can set up a spreadsheet to do all the calculations
and graphing for you, and in doing so you can get a good idea of when the project will actually
be finished, not just some random date you pull out of the air.
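The spreadsheet does nothing magical. Here is a sketch of the same arithmetic in Python, assuming the projection is simply the remaining points divided by the average points completed per iteration so far; the function names are hypothetical and the numbers are invented for the example.

```python
import math


def burndown(total_points, completed_per_iteration):
    """Points remaining at the start and after each iteration."""
    remaining = [total_points]
    for done in completed_per_iteration:
        remaining.append(max(remaining[-1] - done, 0))
    return remaining


def iterations_remaining(total_points, completed_per_iteration):
    """Estimate how many more iterations are needed at the average pace.

    Returns None when nothing has been completed yet, since there is no
    pace to project from.
    """
    done = sum(completed_per_iteration)
    remaining = max(total_points - done, 0)
    if remaining == 0:
        return 0
    if done == 0:
        return None
    velocity = done / len(completed_per_iteration)
    return math.ceil(remaining / velocity)


if __name__ == "__main__":
    print(burndown(100, [10, 12, 8]))
    print(iterations_remaining(100, [10, 12, 8]))
```

Plot the burndown list against the iteration number and you have the chart; multiply the estimate by your iteration length and you have a finish date grounded in what has actually happened so far.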
Another thing I try to be adamant about is insisting the client start using the product as soon as it
provides some value. This is better for everyone involved. The client can realize ROI
sooner and feedback is richer. Without it, the code is not flexed as much. Nor do you get to see what
parts work to ease the workload and which go against it as early in the product's life, and that makes changes more difficult.
For us, the typical client has been willing
to do this, and projects seem to devolve into disaster more readily when they don't.
Finally, every morning we have our daily stand-up meeting. Our company is small enough so that we can
talk about company-wide stuff, not just individual projects. Each attendee answers three questions:
What did you do yesterday?
What are you going to do today?
What is holding you back?
The meeting is a time-conscious way (15 minutes - you stand so you don't get comfortable) to keep
us communicating. Just as importantly, it keeps us accountable to each other, focused on setting
goals and getting things
done, and removing obstacles that get in our way.
On the code side of things, I try to have unit tests and integration tests for mostly everything.
I don't have automated tests for things like games and user interfaces. I haven't seen much detriment
from doing it this way, and the tradeoff for learning how to do it doesn't seem worth it at the moment.
I would like to learn how to do it properly and make a more informed decision though. That
will likely come when time is not so rare for me. Perhaps when I'm finished with school
I'll spend that free time learning the strategies for testing such elements.
Luckily, when I'm working on a ColdFusion project, cfrails is pretty well tested so I get to skip a lot
of tests I might otherwise need to write.
By the same token, I don't normally unit test one-off scripts, unless there are obvious test cases I can
meet or before doing a final version that would actually change something.
I don't know how to do it in CF, but when I've used continuous integration tools for Java projects it has been
helpful. If you have good tests, the CI server will
report when someone checks in code that breaks the tests. This means bad code gets checked in less often.
If you don't have the tests to back it up, at least you'll feel comfortable knowing the project builds
successfully.
For maintenance, we normally don't worry about using a project management tool to track issues.
Bugs are fixed as they are reported - show stoppers immediately, less important within the day, and things deemed
slight annoyances might take a couple of days. I'd like to formalize our response into an actual policy, though.
Similarly, new requests are typically handled within a couple of days if they are small and I'm not
too busy - otherwise I'll give
an estimate as to when I can have it done.
With bugs in particular, they are so rare and few in number
that I could probably track them in my head. Nevertheless, I mark an email with my "Action Required" tag,
and try my best to keep that folder very small. Right now I've overcommitted myself and the folder isn't
empty, but there was a time recently that it remained empty on most nights.
In any event, I normally only use project management tools for very large projects or those I inherited
for some reason or another.
Summary
If you're a practitioner, you can tell the ideas above are heavily influenced by (when not directly part of)
Scrum and Extreme Programming. I wouldn't call what we're doing by either of their names. If you're not familiar
with the ideas and they interest you, now you know where to look.
Where would we like to go from here?
One thing that sticks out immediately is client-driven automated testing with Selenium or FIT.
I'd also like to work for several months on a team that does it all and does it right,
mostly to learn how I might better apply things I've learned, heard of, or have yet to be exposed to.
What else? That will have to be the subject of another post, as this one's turned into a book.
Thoughts, questions, comments, and criticisms are always welcome below.
Posted by Sam on Nov 23, 2007 at 11:11 AM UTC - 5 hrs
In this week's advice from MJWTI,
"The Way That You Do It," Chad Fowler talks about process and methodology in software development. One quote
I liked a lot was:
It's much easier to find someone who can make software work than it is to find someone who can make the
making of software work.
Therefore, it would behoove us to learn a bit about the process of software development.
I never used to have any sort of process. We might do a little requirements gathering, then code everything
up, and show it to the customer a couple of months later. They'd complain about some things and offer
more suggestions, then whoever talked to them would try to translate that to me, probably a couple of
weeks after they first heard it. I'd implement my understanding of the new requirements or fixes, then
we'd show it to the customer and repeat.
It was roughly iterative and incremental, but highly dysfunctional.
I can't recall if it was before or after reading this advice, but it was around the time nevertheless that I
started reading and asking questions on several of the agile development mailing lists.
Doing that has given me a much better understanding of how to deliver higher quality, working software on a timely
basis. We took a little bit from various methodologies and now have a better idea of when software will
be done, and we interface with the customer quite a bit more - and that communication is richer than ever
now that I involve myself with them (most of the time, anyway). We're rolling out more things as time goes
along and as I learn them.
I'd suggest doing the same, or even picking up the canonical books on different methodologies and reading
through several of them. I haven't done the latter quite yet, but it's definitely on my list of things to do.
In particular, I want to expose myself to some non-Agile methods, since most of my knowledge comes from
the Agile camp.
Without exposing yourself to these ideas, it would be hard to learn something useful from them.
And you don't have to succumb to the dogma - Chad mentions (and I agree) that it would be sufficient to
take a pragmatic approach - that "the best process to follow is the one that makes your team the most
productive and results in the best products." But it is unlikely you will have a "revelationary epiphany"
about how to mix and match the pieces that fit your team. You've got to try them out, "and continuously refine
them based on real experience."
I don't think it would be a bad idea to hire a coach either (if you can afford one - or maybe you
have a friend you can go to for help?), so you've got someone to tell you if you're doing
it the wrong way. If you have a successful experiment, you probably did it the right way. But you won't
likely know if you could get more out of it. The same is said of doing it the wrong way - you may be
discarding an idea that could work wonders for you, if only you'd done it how it was meant to be.
In the end, I like a bit of advice both Venkat and Ron Jeffries
have given: You need to learn it by doing it how it was meant to be done. It's hard to pick and choose different practices without
having tried them. To quote Ron,
My advice is to do it by the book, get good at the practices, then do as you will. Many people want to skip to step three. How do they know?
Do you have any methodological horror stories or success stories? I'd love to hear them!
Update: Did I really spell "dysfunctional" as "disfunctional" ? Yup. So I fixed that and another spelling change.
Posted by Sam on Nov 22, 2007 at 12:04 PM UTC - 5 hrs
Since the gift buying season is officially upon us, I thought I'd pitch in to the rampant consumerism and list some of the toys I've had a chance to play with this year that would mean fun and learning for the programmer in your life. Plus, the thought of it sounded fun.
Here they are, in no particular order other than the one in which I thought of them this morning:
JetBrains' IntelliJ IDEA: An awesome IDE for Java. So great, I don't mind spending the $249 (US) and using it over the free Eclipse. The Ruby plugin is not too shabby either, the license for your copy is good for your OSX and Windows installations, and you can try it free for 30 days. Martin Fowler thinks