My Secret Life as a Spaghetti Coder
Something I haven't thought much about, but am beginning to (want to) get into, is the auto-generation of reports. Autogenerating forms, validation, and operations for CRUD is quite simple - the approach I've used simply gets what metadata it can from the database, and dynamically builds around it. So, NOT NULL fields become required, datetimes build different form fields from nvarchars and validate as such (as do other types), and field maxlengths are honored on the HTML side based on the DB metadata (among other things). That is incredibly useful by itself, but not useful enough for a real, production-quality application (should the user really be required to remember the categoryID?).

Of course not -- some fields should map to a foreign key which should show up as a select form element, which lists the human readable descriptions of those fields (as opposed to their database identifiers). Further, some fields should be hidden, and some fields should be compositions of others. Values for some fields should be in a range of legitimate values, and so the list of possible customizations goes on. In those special cases (which occur frequently enough), you just need to provide a means to override the default behaviors (through extra, user-defined metadata, a DSL, or some other tactic or a combination of them).
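
To make that concrete, here is a minimal sketch in Python of the defaults-plus-overrides idea. Everything in it is made up for illustration: `Column`, `Field`, `build_fields`, the `WIDGETS` table, and the override format are assumptions, not the API of any particular framework.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for what the database reports about one column."""
    name: str
    sql_type: str                      # e.g. "nvarchar", "datetime", "int"
    nullable: bool = True
    max_length: Optional[int] = None


@dataclass(frozen=True)
class Field:
    """What a form generator would render for one column."""
    name: str
    widget: str
    required: bool
    max_length: Optional[int] = None
    options_from: Optional[str] = None  # "table.label_column" for select lists


# Default widget per SQL type; anything unlisted falls back to a text box.
WIDGETS = {"datetime": "date", "int": "number", "bit": "checkbox"}


def build_fields(columns, overrides=None):
    """Derive fields from DB metadata, then apply per-column overrides."""
    overrides = overrides or {}
    fields = []
    for col in columns:
        field = Field(
            name=col.name,
            widget=WIDGETS.get(col.sql_type, "text"),
            required=not col.nullable,   # NOT NULL becomes required
            max_length=col.max_length,   # honored as the HTML maxlength
        )
        # An override is a plain dict of Field attributes to replace.
        fields.append(replace(field, **overrides.get(col.name, {})))
    return fields


columns = [
    Column("title", "nvarchar", nullable=False, max_length=100),
    Column("published_on", "datetime"),
    Column("categoryID", "int", nullable=False),
]
# The foreign key becomes a select listing human-readable category names.
overrides = {"categoryID": {"widget": "select", "options_from": "categories.name"}}
fields = build_fields(columns, overrides)
```

The point of the design is that the override dictionary only mentions the columns that differ from the defaults; everything else still comes from the database metadata.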

Then, in the really rare edge cases, you can always drop out of the framework and code it yourself (or have it generate a template for you to modify). Even for search forms (which I think are more often in need of customization than simple CRUD stuff), you can provide the basics and allow the programmer to customize as needed (and drop out of the framework altogether if need be).

But is there a similar strategy for generating reports? My initial thought is that there isn't. Reports just seem to need too much customization to have any useful automagic generation. But, I'm trying to look past that initial impression, because I don't have as many years of experience in hand-writing the same crap code over and over again in reporting as I do in writing CRUD operations (now, I have the same crap code for CRUD in one place and it just figures out what to do, but at least I don't ever need to write it again =)).

In any case, I just don't see it. But, I can see using metadata/DSL, where you can define the tables and columns you want to report on, and generally it will still be much faster for developing reports (can you tell I'm trying to work on this at the moment?). Taking the idea further, what about bringing convention over configuration into this realm? Once you get out of the world of 8.3 filenames and realize you can name variables (or columns) pretty much anything you want, it is easy to see the benefit of having names that describe what you want the user to see. For instance, instead of orderGTot, just go ahead and name the column order_grand_total or orderGrandTotal. Who cares about the extra characters? It's not like you have to type them that many times, since the framework (presumably) takes care of most of it for you, and you don't have to define what the user should see in more than one place.
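
As a small illustration of the naming convention, here is a sketch in Python of deriving the user-facing label straight from the column name. The function name `label_for` is made up for this example.

```python
import re


def label_for(column_name: str) -> str:
    """Turn a descriptive column name into a human-readable label."""
    # snake_case: underscores become spaces.
    spaced = column_name.replace("_", " ")
    # camelCase: insert a space at each lower-to-upper boundary.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", spaced)
    # Uppercase only the first letter of each word, leaving the rest alone.
    return " ".join(word[:1].upper() + word[1:] for word in spaced.split())


print(label_for("order_grand_total"))  # Order Grand Total
print(label_for("orderGrandTotal"))    # Order Grand Total
print(label_for("orderGTot"))          # Order GTot
```

The abbreviated name is the only one that would need a label defined somewhere else, which is exactly the duplicated metadata the convention avoids.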

Of course, this is also just metadata. But, instead of having what amounts to the same metadata in different places, you've just got it in one - your column name. So if we take this idea to reporting, what if we just tag certain columns with something like _reportable? I haven't figured out how we might generate reports on that amount of information alone, but I see it as a useful start.
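
Here is about as far as the tag alone gets you, sketched in Python: pick out the tagged columns and build a flat listing query. The `_reportable` suffix handling, the function names, and the example table are all assumptions for illustration, and the identifiers are interpolated directly, so this presumes names that come from your own schema rather than from user input.

```python
SUFFIX = "_reportable"


def reportable_columns(columns):
    """Return only the columns tagged with the (hypothetical) suffix."""
    return [c for c in columns if c.endswith(SUFFIX)]


def listing_query(table, columns):
    """Build a flat SELECT over the tagged columns, aliasing away the tag."""
    picked = reportable_columns(columns)
    if not picked:
        raise ValueError(f"no {SUFFIX} columns on {table}")
    select_list = ", ".join(f"{c} AS {c[:-len(SUFFIX)]}" for c in picked)
    return f"SELECT {select_list} FROM {table}"


sql = listing_query(
    "orders",
    ["id", "customer_name_reportable", "order_grand_total_reportable", "internal_notes"],
)
print(sql)
```

That gives a plain listing and nothing more: no grouping, totals, or filters, which is where the extra metadata or a DSL would have to come in.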

When your classes average between zero and twenty lines of code (which amounts to just customizing interfaces and performing validations, along with perhaps a couple of calculations), it's time to stop worrying about DAOs, DTOs, PPOs, and Cheerios. Skip to the IPO and let the frameworks worry about the rest. I want one to do my reporting for me, because that seems to be one of the last major hurdles that I see (at least, as far as getting rid of repetition goes).

What do you think?


Comments
I'm not an expert on code generation but I do a ton of report creation at work so here's my take on the issue.

Typically (and I know I'll get arguments on this) code generators are built to solve very specific problems. Take CRUD for example - a very repetitive and quite frankly boring issue - but very specific. You can model the methods easily and you can create the code based on a template. Reports are very prescriptive - they need to solve a different problem just about every time you write one. Widgets/Hours = productivity, (unitPrice*tax)+shipping = totalCost, sum(hoursWorked), count(widgets) - all very specific and varying in their application. So I don't really see how they could possibly be simplified/mass produced/generated for you. Sure you could mess with complex naming conventions (if column name like '%price%' then do this...) and build a framework around that - but what would you be gaining? Now you have a difficult naming syntax to follow - and what happens if you break that syntax? I'm exaggerating the whole concept here a bit but I think you get my point.

Trust me I've thought about this before - and I've yet to come up with anything in my mind that would possibly be worth doing.

I do, however, think that there is an opportunity for some sort of reporting framework to be created. A standard template for outputting/organizing the necessary reports. But I just don't see a way around good old-fashioned query writing to get the data.

Posted by todd sharp on May 23, 2007 at 01:07 PM UTC - 5 hrs

It is all about the domain specific language. I wouldn't start from the db, but from the requirements. When people specify reports, what kind of words do they use? If you can come up with a reporting language (and I'm guessing looking at interface for crystal reports and the like would be another source) you could probably come up with concepts to describe a certain collection of reports. That would then allow you to gen the SQL, service methods, controllers and views to display it.

Todd - you generate lots of reports - any way you could put together some samples of the kinds of requests? I'm pretty sure that with enough data points it'd be possible to start to identify some patterns for certain types or classes of reports.

Posted by Peter Bell on May 23, 2007 at 05:02 PM UTC - 5 hrs

I suppose I could do that. I'll see what I can put together and email you in a few days.

Posted by todd sharp on May 23, 2007 at 05:35 PM UTC - 5 hrs

Cool - thanks!

Posted by Peter Bell on May 24, 2007 at 02:31 PM UTC - 5 hrs

Todd- I see your point, and that was my thought as well (that "they need to solve a different problem just about every time you write one")

Peter - I think I agree with the DSL. That would be a good place to finish up, and I think I agree with Todd when he says that to put enough metadata within column names and the like for reporting purposes would probably just be insane to try and keep track of.

I've written quite a few reports, but they are few and far between, and mostly always simple. I like the idea of looking into Crystal and other reporting applications for ideas on where/how to base the syntax.

Anyway, Todd if you don't mind sending that email to Peter to me as well, I'd appreciate it. This is something I want to explore further, though I don't expect to do anything with it very soon (too much on my plate right now) but I will think over it some more and eventually get to it.

I guess I was hoping there was enough in common for all reports that we could get to a decent starting point with defaults and customize it from there - but I certainly wasn't too hopeful about it, as even from my limited experience there didn't appear to be one that I could identify.

With that said, it still seems to me like I repeat similar processes every time I create a report (I /know/ I have intra-report repetition, but I feel like I also have inter-report repetition). It's that which I'm trying to identify.

In any case, thanks for the comments guys!

Posted by Sam on May 25, 2007 at 08:34 AM UTC - 5 hrs

I think that this is a solvable problem (as always, for 80% to start with). Most reports are comprised of one or more properties (which may be simple columns but are usually some kind of simple aggregate or aggregation plus calculation). You then need the concept of levels of grouping with roll-ups (so you can see city, regional, and national sales, or the number of jobs by division, region and company), options for allowing users to set common filters like start and end dates, and perhaps support for arbitrary filters (which could have their own DSL for describing title, field, form element, whether required, and optional value list) - I think it would be a manageable problem. Main issue I have is that I also don't ever really write reports, so hopefully Todd'll be able to share some domain knowledge with us.

Posted by Peter Bell on May 25, 2007 at 08:45 AM UTC - 5 hrs

A good book on OLAP may work as well. Trouble is, I don't know how much more compact/abstracted it can get, seeing as SQL is already a DSL for this, but I think it is worth exploration.

Posted by Sam on May 25, 2007 at 09:12 AM UTC - 5 hrs

Hey boys - believe it or not I did start on a document, but then the whole Scorpio thing went public and well I just got distracted :)

I did just stumble across this - and I did not read too far into it (sis is in hospital) but could there be something here?

http://www.scottpinkston.org/blog/index.cfm/2007/2...

Posted by todd sharp on Jun 02, 2007 at 01:07 PM UTC - 5 hrs

Todd - thanks. One of these days I'll check out the OpenBiblio implementation and see how generalizable their solution is to the generic issue of reports.

And, I hope your sister is alright.

Posted by Sam on Jun 02, 2007 at 01:13 PM UTC - 5 hrs

Oh, and for easy reference, the project Scott references is at http://sourceforge.net/projects/obiblio/

Posted by Sam on Jun 02, 2007 at 01:14 PM UTC - 5 hrs
