An aching head with bioinformatiC


Posted by Sam on Sep 02, 2007 at 03:48 PM UTC - 5 hrs

Bioinformatics is one area of computing where you'll still want to pay special attention to performance. With the human genome consisting of about 3 billion bases, using one byte per base gives you roughly three gigabytes of data to work with. At that scale, even a constant-factor improvement in running time or memory use can result in huge savings.

Because of that concern for performance, I expect to be working in C++ regularly this semester. In fact, the first day of class was a nice review of it, and I welcome the change since it's been many years since I've done much of anything in the language.

One thing that struck me as particularly painful was memory management and pointers. When was the last time you had to remember to delete [] p;? The power of being able to do such low-level manipulation may be inebriating, but you'd better not get too drunk. How ever would you be able to keep the entire program in your head? (Paul Graham's timing was amazing: I saw that he posted that article about 10 minutes before my re-introduction to C++.)

C++ works against that goal on so many levels, particularly with the indirection pointers provide. Something like this simple program is relatively easy to understand and remember:

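#include <cstdlib>   // system, EXIT_SUCCESS
#include <iostream>  // cout

using namespace std;

int main(int argc, char *argv[])
{
    int *i = new int(1);  // allocate a single int on the heap
    *i = 1;               // set the value through the pointer
    cout << *i;           // read the value through the pointer
    delete i;             // a single object takes plain delete, not delete []
    system("PAUSE");      // Windows-only: wait for a keypress
    return EXIT_SUCCESS;
}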

It is easy to see that i is a pointer to a location in heap memory that's holding data to be interpreted as an integer. To set or get that value you need to dereference the pointer, using the unary * operator.

But what happens when you increase the complexity a little? Here we'll take a reference to a pointer to int.

void printn(int *&n)  // n is a reference to the caller's pointer to int
{
    cout << *n;       // dereference the pointer to print the int
}

The idea stays the same, and is still relatively simple. But you can tell it is starting to get harder to decide what's going on. This program sets a variable and prints it. Can you imagine working with pointers to pointers or just a couple of hundred lines of this? Three cheers for the people that do.

What if we change it a bit?

void printn(int *n)
{
    cout << *n;
}

Are we passing a pointer by value, an int by reference, or is something else going on?

It makes me wonder how many times people try adding or removing a * when trying to fix broken code, as opposed to actually tracing through it and understanding what is going on. I recall doing a lot of that as an undergrad.

I'm not convinced mapping everything out would have been quicker. (I'm not convinced throwing asterisks around like hira shuriken was either.) One thing is for sure though - getting back into C++ will make my head hurt, probably more than trying to understand the real bioinformatics subject matter.


Last modified on Sep 02, 2007 at 03:49 PM UTC - 5 hrs

Comments


as my C++ teacher used to say:

"there are only two types of programmers in this world: the ones that understand pointers in C and the ones that don't"

Posted by barry.b on Sep 02, 2007 at 06:11 PM UTC - 5 hrs

Certainly there are a lot of people who don't understand pointers. But I still think among those of us who do, their existence can lead to long chains of code that become hard to follow.

That's true in general without pointers, but I think they exacerbate the problem. You have to keep the code incredibly simple to follow them without impedance.

Posted by Sam on Sep 02, 2007 at 06:23 PM UTC - 5 hrs

Here's a non-pointer-oriented tidbit one of my professors mentioned:

Friend classes:
"The thing to remember about friends is that they can touch your private parts. (From a design standpoint) The question is, should friends be touching your private parts?"

Just thought you might enjoy that.

Mike.

Posted by Mike Kelp on Sep 02, 2007 at 08:46 PM UTC - 5 hrs

Hey buddy,

Wow -- sounds so exciting!

One thing you could do is use one of these nice C++ garbage collection systems that works behind the scenes. I've heard good things about these two:

Boehm: http://www.hpl.hp.com/personal/Hans_Boehm/gc/

Giggle: http://giggle.sourceforge.net/

Coming from the LISP world, I love not having to worry about such things. If you are into using LISP, you could use GNU Common Lisp (GCL), and once you're ready to roll, just issue the form `(comp t)' and all of your functions will be compiled down to machine code, and will run at near C-speed. I find it wonderful both for the fact that I need only worry about the real functionality, and at the same time, I can use tools like ACL2 (http://www.cs.utexas.edu/users/moore/acl2/) to help me prove properties of my code.

-brother grant

Posted by grant on Sep 04, 2007 at 10:47 AM UTC - 5 hrs

@Mike - Maybe they should (at least, given the way they are designed, that seems to be the purpose), but do you want them to? I've got some hot friends, so ... =)

@grant - Awesome. Thanks for the links. I wasn't aware of any of that; in particular, I had never considered the possibility that Lisp could be compiled down to machine code. As you know, I've used it very sparingly in the past, but I plan at some point to learn it well. I'm especially interested in Paul Graham's Arc if he ever finishes the thing.

I'm not particularly worried about it at this point, but I do remember the pain of C++ compared to the lack thereof in the higher-level languages I've been using since then.

Posted by Sam on Sep 07, 2007 at 11:04 AM UTC - 5 hrs
