Saturday, June 30, 2007

A Developer's Life Stages

I ended up writing a novel as a comment on Jamie Fristrom's blog post on Speculative Generality. Read his entry. I'm reposting my comment here.

Developing The Simplest Thing That Could Possibly Work

In Forrester Research's 10 Mistakes in Software Development, #3 is overscoping a solution and #9 is jumping into development without enough research, and we've all done it. :(

A friend consulted at a large transportation company in the Kansas City area. They'd been gearing up for a massive conversion from their existing AS/400 systems to a completely Java-based system. Remarkably, in the early stages of the project, they decided to wrap the entire JDK in their own custom framework and decreed that all developers use these wrappers exclusively. As a result, they got:
  • little or no added functionality,
  • more limited abstractions than the raw JDK,
  • minimally tested code underlying everything,
  • the requirement to train all incoming Java developers on their framework, and
  • worst of all, the obligation to maintain many thousands more lines of code.

I've observed that developers tend to progress through a number of stages in their careers. These are like the stages of grief: inescapable. This is because each stage produces a set of skills and philosophies that are eventually synthesized in the seasoned software developer.

The Underengineer

The underengineer is elated at his ability to get things done quickly and be productive, disdaining others (usually the overengineer) for what he sees as excessive rumination and for producing an excessive amount of code for a given task. You can count on these guys to get stuff done. Will it be done 'the right way?' Only by accident. Underengineers are often unable to discern 'the right way,' often because they've spent little time reading and maintaining other people's code.

A developer may begin to come out of this stage when he tires of fixing or writing the same code (in different places) over and over again. Or perhaps he'll be propelled into the second stage by an enthusiastic reading of Design Patterns or a sudden grasp of UML.

The developer at this stage is fascinated by accomplishing the task at hand, and it is in this stage that he learns how to code and debug.

The Overengineer

The overengineer knows that he can solve any single development task and has become fascinated by possibility. Abstraction is magical. How can any piece of code be used to solve multiple tasks? Paradoxically, this usually ends up producing more code for any one task. While the underengineer shakes his head at this, the overengineer knows that his approach will produce much less code for an entire set of similar problems. Meanwhile, the overengineer looks at the underengineer's hackery with contempt. The question the overengineer rarely asks himself, though, is: will there be a large enough set of similar problems to justify this effort? And so he abstracts everything in the name of flexibility.

The overengineer is sometimes shocked to realize he's become much less productive in terms of functionality than before. He can't seem to get anything done! Still, he produces as much (if not more) code and suddenly becomes capable of building much larger systems than he could as an underengineer.

This is the stage in which the developer learns architecture and the value (and painfully, the costs) of abstraction. This is the stage of the architects of the transportation system conversion discussed above.

What kills the overengineer? Maintenance. Dependencies. Broken abstraction. Realizing that the flexibility he built in is unused and that real systems change in ways he could never have anticipated. At some point, and probably many, the overengineer will get a phone call at 2am asking him to come in and fix an issue that breaks his whole architecture. After a few of these heartbreaking moments, his world view begins to change again.

The Seasoned Developer

The seasoned developer is skilled in coding and debugging from his time as an underengineer and is a skilled architect from his time as an overengineer. What's more, he's come to understand the following:

The primary reason for abstraction is to simplify implementation for a given set of requirements.

If abstraction doesn't simplify, ditch it.
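
A hypothetical illustration of that rule, in Python rather than the Java of the transportation example. `WrappedList` and its method names are invented for this sketch. The wrapper forwards every call to the built-in list and adds nothing, so the code that uses it is longer than the direct version and behaves identically.

```python
# Hypothetical pass-through wrapper: every method forwards to the built-in
# list and adds no behavior of its own.
class WrappedList:
    def __init__(self):
        self._items = []

    def add_item(self, item):
        self._items.append(item)

    def item_count(self):
        return len(self._items)

    def item_at(self, index):
        return self._items[index]


def total_with_wrapper(values):
    """Sum values through the wrapper API."""
    wrapped = WrappedList()
    for value in values:
        wrapped.add_item(value)
    total = 0
    for i in range(wrapped.item_count()):
        total += wrapped.item_at(i)
    return total


def total_direct(values):
    """Sum values using the standard library as-is."""
    return sum(values)
```

Both functions return the same result. The wrapper version simply carries more code to test, teach and maintain.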

The seasoned developer also realizes that the solution domain for a given problem may extend beyond technology. People often try to use technology to solve social or organizational problems. Conversely, people frequently attempt social or organizational solutions for essentially technological problems. Recognizing these incongruities before they become a software design and implementation can help to avoid doomed projects.

The organizational problem in the transportation example above is that software architects were assigned to do a job before anyone knew what job they were to do. The proper response would be to do nothing - or better, to work on a different project until the problem domain was fully understood.

Instead they chose speculatively general busy work, incurring a much higher cost.

Friday, June 29, 2007

The Future of Processor Hardware

Many-Core Processors, Languages for Multiple Processors and Reconfigurable Computing


Many-Core Processors

Multicore processors are giving way to many-core processors, with Intel estimating that the number of cores will double with each processor generation. The paradigms of the most commonly used programming languages are ill-suited to these types of processors. So what will happen?

Languages for Multiple Processors

Modified C++/Java/C# with futures and promises (Flow Java)?

Functional languages are inherently more parallelizable than imperative languages. Will we see a resurgence of these types of languages?
- Concurrent Haskell/Concurrent Clean
- F#
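
To make the futures idea concrete, here is a minimal sketch using Python's standard `concurrent.futures` module. The function names `square` and `parallel_squares` are assumptions made up for the example. It shows the programming model only: each task is a pure function with no shared state, so the tasks can be scheduled in any order. CPython threads will not actually speed up CPU-bound work like this.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """A pure function: no shared state, so calls can run in any order."""
    return n * n


def parallel_squares(numbers, workers=4):
    """Submit one task per input and collect the results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each submit() returns immediately with a future for a pending result.
        futures = [pool.submit(square, n) for n in numbers]
        # result() blocks until that particular future has completed.
        return [f.result() for f in futures]
```

Because `square` touches nothing outside its argument, no locks are needed. That is the property that makes functional code easier to spread across cores.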

High-Performance Reconfigurable Computing

Hybrid systems combining conventional (Von Neumann) architectures with FPGAs and other forms of reconfigurable hardware are starting to mature, but with radically different models of "software" development.

Is High-Performance Reconfigurable Computing the Next Supercomputing Paradigm?

From Wikipedia

Paradigm Shift

It is called the Reconfigurable Computing Paradox, that by software to configware migration (software to FPGA migration) speed-up factors by up to almost 4 orders of magnitude have been reported, as well as the reduction of electricity consumption by more than 1 order of magnitude - although the technological parameters of FPGAs are behind the Gordon Moore curve by about 4 orders of magnitude, and the clock frequency is substantially lower than that of microprocessors. The reasons of this paradox are due to a paradigm shift, and are also partly explained by the Von Neumann syndrome.

Tuesday, June 26, 2007

EASTL

EA genius and colleague Paul Pedriana submitted a paper to the C++ Standards Committee detailing EASTL, EA's version of the Standard Template Library. It provides a number of efficiencies designed with game development in mind that are nonetheless applicable across other software domains.

I've always been a fan of the STL in general, and since I've been at Electronic Arts, of EASTL. It's nice to see some of these innovations get out into the world.

Thursday, June 21, 2007

Spread This Number?

AACS encryption key controversy

This bad boy?

13,256,278,887,989,457,651,018,865,901,401,704,640

So where was I when all the noise about this was going on in April? Ah, illegal numbers. What will they think of next? Anything that can be represented digitally can be represented as a single (usually very large) number. Software, movies, music, etc.
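
A minimal sketch of that last point, assuming nothing beyond Python's built-in `int.from_bytes` and `int.to_bytes`. The helper names are made up for the example. Any byte string maps to one integer, and the integer plus the original length maps back to the same bytes.

```python
def bytes_to_number(data):
    """Interpret a byte string as one big-endian unsigned integer."""
    return int.from_bytes(data, "big")


def number_to_bytes(number, length):
    """Recover the bytes; the length is needed to restore leading zero bytes."""
    return number.to_bytes(length, "big")


print(bytes_to_number(b"Hi"))     # 0x4869, i.e. 18537
print(number_to_bytes(18537, 2))  # b'Hi'
```

The same conversion applies to a file of any size. The number just gets longer.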

That reminds me of a story I read once about a civilization that encoded all its existing knowledge as a single mark on a stick. The position of the mark on the stick divided by the length of the stick yielded a decimal number less than one. The digits of the fractional part of that number represented the totality of the civilization's knowledge in an encoded form.

Could this work?

Let's think about this. A proton has a diameter of about 1.5E−15 m. That means we'd get only about 16 decimal digits before we're measuring sub-subatomic dimensions. And Shannon tells us it would take lots of digits to represent that knowledge in any reasonable form. Actually, it would take about 2.4 million decimal digits per megabyte of data. And we're talking at least trillions of megabytes of data.

So no, there is no way - now or ever - that this could work.
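
The arithmetic behind that conclusion can be checked with a short sketch. The function names are invented here, the one-metre stick length is an assumption (the story gives none), and a megabyte is taken to be 10^6 bytes.

```python
import math


def decimal_digits_for_bytes(num_bytes):
    """Decimal digits needed to write num_bytes of data as a single number."""
    bits = num_bytes * 8
    # Each bit is worth log10(2) decimal digits (about 0.301).
    return math.ceil(bits * math.log10(2))


def resolvable_digits(stick_length_m, smallest_step_m):
    """Decimal digits readable from a mark whose position is known to
    within smallest_step_m on a stick of the given length."""
    return math.log10(stick_length_m / smallest_step_m)


print(decimal_digits_for_bytes(10**6))   # roughly 2.4 million digits per megabyte
print(resolvable_digits(1.0, 1.5e-15))   # roughly 15 digits at proton scale
```

A few more digits are available from a longer stick, but the gap between tens of digits and millions of digits per megabyte does not close.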

Thursday, June 14, 2007

Horrific Default Behavior/Failure Modes

(Updated 7/10/07)

Horrific Default Behavior

Today I wrote a batch file called clean.bat and put it in a directory on a different machine. It contained -

del /f /s /q *.obj
del /f /s /q *.ilk
del /f /s /q *.pdb

I had the directory up with Windows Explorer and '\\machine_name\c$\dev\..' was in the address bar. I saved the file into the directory of the other machine. Pretty straightforward, right?

So, thoughtlessly, I double click on clean.bat. Without ado, the batch file starts executing with this message -

'\\machine_name\c$\dev\..'
CMD.EXE was started with the above path as the current directory. UNC paths are not supported. Defaulting to Windows directory.

C:\WINDOWS>del /f /s /q *.obj


Holy crap! Fortunately, I saw it immediately and broke out of it. Even more fortunate, I don't think C:\WINDOWS has any .obj files.

But, if I had wanted to clean .exe files instead (which I frequently do), my machine would have been utterly hosed.

That, my friends, is horrific default behavior.
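
One defensive approach is to make a cleanup script take its target directory explicitly and refuse to run if that directory is not usable, rather than trusting whatever the current directory turns out to be. Here is a rough sketch in Python, not a drop-in replacement for the batch file. `clean` and `BUILD_PATTERNS` are names made up for the example.

```python
from pathlib import Path

# The same three patterns as the batch file above.
BUILD_PATTERNS = ("*.obj", "*.ilk", "*.pdb")


def clean(root, patterns=BUILD_PATTERNS):
    """Delete files matching patterns under root; return the deleted paths."""
    root = Path(root)
    if not root.is_dir():
        # Refuse to guess: never fall back to some other directory.
        raise NotADirectoryError(f"not a directory: {root}")
    removed = []
    for pattern in patterns:
        # Collect matches first, then delete, so the walk is not disturbed.
        for path in list(root.rglob(pattern)):
            if path.is_file():
                path.unlink()
                removed.append(path)
    return removed
```

A script could call `clean(Path(__file__).resolve().parent)` to clean the directory it lives in, whatever the current directory happens to be when it is launched.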

Failure Modes

Last year, our water heater died. It was pretty old, I think. The problem was the way that it died - by catching fire. That was pretty disturbing.

So the next day, while I was at work, the repairman came out to replace it. Rebekka asked him why it caught fire. The repairman explained, "That's how you know it's broken." Really? The next time my water heater fails, I'll know by following the fire trucks home to see my subdivision in cinders. "Time for a new water heater," I'll say to myself.

And how will I know when my nail clippers need replacing? Oh that's right, from the mushroom cloud over Orlando.

Wednesday, June 06, 2007

Compiling Haskell

I've played with the Glasgow Haskell Compiler only a bit. Compiling Haskell to, say, native code is quite a chore. GHC compiles through a number of special intermediate languages between source and machine code.

Haskell > Core (a typed lambda calculus based on System F) > STG > C/C-- > native machine code

That's a lot of work.

Small Tools

(Updated 7/10/07)

Part of the purpose of this blog is to remind myself of best practices I've discovered through the years I've been developing software. One of these best practices was brought to mind again in a recent conversation with a colleague.

Small, Combinable Tools

Unix, of course, is built around this philosophy. In essence, it is an extension of the practice of modular programming to applications. It allows us to combine the functions of applications in an automatic or semiautomatic way. These are generally command-line tools, which may make them inconvenient in certain circumstances. Nevertheless, with a judicious layering approach, it is frequently possible to capture the best of both worlds - command-line combinable tools underneath, user-friendly GUI on top.
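
As a rough sketch of what "small and combinable" means in practice, here is a tiny filter in Python. The names `matching_lines` and `main` are made up for the example. It does one thing: read lines from standard input and write out the ones that contain a given string. Other tools can then sit on either side of it in a pipeline.

```python
import sys


def matching_lines(lines, needle):
    """Yield only the lines that contain needle."""
    for line in lines:
        if needle in line:
            yield line


def main(argv, stdin, stdout):
    """Entry point: the text to look for is the single argument."""
    if len(argv) != 2:
        print("usage: match.py TEXT", file=sys.stderr)
        return 2
    for line in matching_lines(stdin, argv[1]):
        stdout.write(line)
    return 0
```

Saved as `match.py` (a hypothetical file name) with `sys.exit(main(sys.argv, sys.stdin, sys.stdout))` as its last line, it could be used as `dir /b | python match.py .obj`. Keeping the logic in a plain function also lets a GUI layer call it directly.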

UnxUtils

If you use Windows and you develop software, it is absolutely imperative that you download and use UnxUtils. They are native Win32 ports of many standard Unix utilities, and they can be extraordinarily useful. Add them to your PATH. Oh, and if you're not comfortable in Unix and, by extension, with these utilities, let me encourage you strongly to learn to use them well. You'll thank me later!

Windows SysInternals

Windows SysInternals. You may not need these beauties often, but when you need them, you need them badly. I believe that few (or none) of these are command-line operable (I may be mistaken).

Windows Utilities

Junction for symbolic directory links in Windows. Get it.
PathMan for path management, and a ton of other good stuff here.
Unlocker, a fabulous (GUI) utility for unlocking files, by Cedrick Collomb.

Build Utilities

Ant and NAnt for builds.

Freeing pov2mesh

My pov2mesh utility is currently distributed under the shareware model. I've pretty much decided to release it as freeware. If I do this, I will refund the registered users and remove the registration code from the application. Refunding will be easy, because there are so few registered users! I will not release the full source now, although I may in the future. I've also developed a number of other small tools that I will slowly be polishing (a little, not much) and releasing here sometime.

ASPack

($29) No, it's not a fanny pack. This executable packer is phenomenal.

Not So Small Tools

UltraEdit

($50) My primary text editor. It loads fast and I love its Find/Replace In Files. TextPad's a close second.

LTProf

($50) An easy-to-use, very lightweight profiler.

Araxis Merge

($129) I don't know how I'd live without it.

MilkShape

($35) Basic, inexpensive low-polygon 3D modeler that can import/export almost anything.

Ultimate Unwrap 3D

($50) Unwrap is great for texture mapping.

Paint Shop Pro

($90) Of course, as every game and web developer who prefers Paint Shop Pro to Adobe's heavyweight (and expensive!) Photoshop knows. Now that Corel owns it, hopefully it will fare well and not become as bloated (or expensive!) as Photoshop.

GNU Prolog

Need a good Prolog implementation for expert system development or curiosity or anything else?

In the rare moments when I use Prolog, I use GNU Prolog.