h1

Can’t type, will program

February 2, 2009

I just read this post by Jeff Atwood about how important keyboards are to your average programmer. It’s true that “programming” tends to mean “writing code”, which will involve a lot of typing (unless you’ve got a copy of Dasher and a lot of free time). Whatever language you program in, the amount you have to type will fall somewhere between Basic (lots of words) and J, a sample program of which is:

+/ i.@x:&.(p:^:_1) 1e6

Lovely. However, how do programmers work? Hopefully they think about the problem before typing, and then when they’ve settled on a solution, they start to implement it. If I ask a question such as “what is the product of all the numbers from 10 to 10000?”, a programmer might think of writing something like “ans = 1; for i in 10 to 10000, ans = ans * i; print ans”. A smart programmer might then think “hang on, that might overflow – I need an arbitrary-length integer”, at which point they might go away and download a library with this functionality, implement it themselves, or relax smugly, safe in the knowledge that their language provides this by default. Either way, the point is that the idea for the program might be a lot easier to come up with than the implementation.
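To show how far the implementation can drift from the one-line idea, here is a minimal C++ sketch of the “implement it themselves” option. It assumes nothing beyond the standard library; product_range is a made-up helper name, and storing the number as base-10^9 chunks is just one simple way to do it, not the approach any particular library takes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: multiplies the integers lo..hi (inclusive) and returns
// the exact product as a decimal string. The running product is stored as
// little-endian "limbs" in base 10^9, so it can grow without overflowing.
std::string product_range(std::uint32_t lo, std::uint32_t hi)
{
    assert(lo <= hi);
    const std::uint64_t base = 1000000000ULL;
    std::vector<std::uint32_t> limbs(1, 1); // the answer so far, starting at 1

    for (std::uint64_t i = lo; i <= hi; ++i)
    {
        // Schoolbook multiplication of the whole number by i, limb by limb.
        std::uint64_t carry = 0;
        for (std::size_t k = 0; k < limbs.size(); ++k)
        {
            const std::uint64_t cur = limbs[k] * i + carry;
            limbs[k] = static_cast<std::uint32_t>(cur % base);
            carry = cur / base;
        }
        while (carry != 0)
        {
            limbs.push_back(static_cast<std::uint32_t>(carry % base));
            carry /= base;
        }
    }

    // Most significant limb first; every later limb is padded to nine digits.
    std::string out = std::to_string(limbs.back());
    for (std::size_t k = limbs.size() - 1; k-- > 0; )
    {
        const std::string part = std::to_string(limbs[k]);
        out += std::string(9 - part.size(), '0') + part;
    }
    return out;
}
```

Calling product_range(10, 10000) answers the question above. The “idea” was one line of pseudocode; the arithmetic to make it safe is most of the listing.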

Many years ago, after writing some horrendous games in Basic (QBasic, to be precise), I came across something called Klik and Play (forerunner of The Games Factory and Click and Create). This tool let me make much better looking and more complicated games in much less time, and without writing any code at all! Since then, I’ve gone back to “proper” programming (in C++) for my day job, but even there it’s clear that there is a need for a simpler way of doing things.

Lots of “enterprise software” programmers write apps for internal use, and these tasks fall to the programmers because they’re essentially automation and process tasks that the other employees (the “customers”) can’t do themselves. I’ve found myself “writing code” for things that really shouldn’t need code at all – for example, maintaining legacy MFC apps that have hand-written code that puts a check mark next to a menu item when it’s clicked. This is the kind of thing that should be achieved through a visual editor, without needing to know which particular API calls are needed to get a handle to a menu item. Systems like Windows Forms improve this situation in some areas, but still fail to support useful work-reducing concepts like data binding without significant coding input. Adobe Flex is great for GUI work, but typically needs to be mated to Web services to provide backend functionality, due to its client-side focus.

One tool that allows non-programmers to program is Automate. This allows the graphical build-up of process flows, combined with integration into common business packages like Excel, to keep the programmer either out of the loop or at maximum efficiency (i.e. not writing code). It’s basically scripting for people who don’t know any scripting languages. You still need a lot of the fundamental concepts of process engineering that are used in computing (such as a good logical analysis of workflow, comprehensive error handling, data transformations, and an understanding of what the process is trying to achieve), but without having to know any APIs or languages.

Some things will always need to be written in a low-level language (such as the underlying data structures, algorithms, and operating systems), but “end-user” programmers will hopefully one day be freed from the need to learn a language. Originally, programs were written in machine code; assembler was then used to make a useful abstraction from the hardware itself; higher-level programming languages like C were then employed to build larger systems more efficiently; and modern scripting languages like Ruby build upon these languages to provide an even more concise method of performing general-purpose computations. All of these are still languages, however, and require specialists (programmers) to learn them. Today, specialists in areas other than programming need to use computers for their day-to-day work in their specialist field – the next level of abstraction must be no language at all.

Or maybe JavaScript?

h1

[the horrors of] Database testing

January 8, 2009

I’ve just come across this post by Alex Scordellis, via some Coding Horror comments. It brings up some interesting issues I’m looking at, because I’m trying to figure out how best to test our database layer (at the company I work for). Alex says at the top of his post:

On my current project, we’re successfully using SQLite for in-memory testing with NHibernate / Castle.ActiveRecord. This makes tests that run against the database a lot faster than if we ran them against SQL Server.

He then goes on to say at the end…

What do we do about this? On the technical side, we can work to make our development and testing environments as close to production as possible.

By using the faster SQLite, you can make your development lives easier (in that tests are faster) but you have to add code to compensate for the dev/production impedance mismatch, and you could also be hiding performance problems. I’d be tempted to use the exact same system as production (SQL Server) for the dev/test setup, and try something like automating the unit tests (set up a machine to run them after code commits / nightly builds) to solve any speed issues.

Some of the databases I work with are fairly large (>100GB metadata, most of the actual data stored in a separate filesystem) and maintaining a dev database this size has its own problems, but on the other hand, meaningful performance tests require the full-size database to be available. I can’t think of any way around a requirement of a dev database closely mirroring the production system if testing of real-world performance is required.

When setting up tests, it has been noted that you need to reset the database to a known state (i.e. rebuild it). For a 100GB database, restoring from backup, this can take a long time. You can solve this problem by throwing hardware at it (if you need to run more tests in a given period, or your database build times increase, add more automated test runner machines) but this can be expensive. I’d be interested to hear from anyone with experience of testing large databases, to know if they’ve found some approaches better than others.

Of course, I’d love to be able to ditch SQL databases altogether, but it looks like we’re stuck with them… even Amazon’s SimpleDB has started to succumb… maybe these new-fangled XPath-enabled databases will change things.

Maybe not

h1

awesome…

December 4, 2008

This is just too good to be true:

http://failblog.org/2008/10/07/cow-curiosity-fail/

h1

Some C++ trivia

December 4, 2008

So it looks like I forgot all about this blog for the last few weeks. Oh dear. But now I’ve remembered that it exists, so here’s some C++ stuff that I discovered recently, based on some interview questions I’ve been doing.

I thought that I had a fairly good idea about how C++ worked, but of course I didn’t, and a question that looked pretty simple on the surface can have lots of subtleties. These usually make themselves known during an interview, when a candidate writes a little bit of code that you hadn’t anticipated, and you’re not sure whether it would compile. Fortunately, the candidate usually has bigger problems than pointing out that the interviewer is a simpleton, but it’s always good to go away and figure out exactly what’s going on with the problem you thought you knew inside out.

To make it easier to see what’s going on, I’ll use the format Code Listing / Question / Solution / Discussion for each problem. (Note that these aren’t the actual interview questions I use, but are small test cases that illustrate what it was that I wasn’t sure about. They are “complete” C++ programs.)

Problem #1: local static variable lifetime

Code

#include <iostream>
using namespace std;

class chatty
{
public:
    chatty() { cerr << "chatty()" << endl; }
    ~chatty() { cerr << "~chatty()" << endl; }
};

int main()
{
    cerr << "main()" << endl;
    static chatty c;
    cerr << "exit main()" << endl;
}

Problem
The program above uses a local static variable in its main() function. When does the constructor get called for this variable? When does the destructor get called?

Solution
First, the line “main()” is printed, then the line “chatty()”, then the line “exit main()”. “~chatty()” is printed last, after main() has returned, which makes it easy to miss if you only watch the output up to the end of main().

Discussion
The construction order is what I expected, which is always nice. Basically, local static variables are initialized the first time that the program gets to their declaration. They are destroyed during normal program exit (after main() returns, or when exit() is called), in the reverse order of their construction. They are “better” than standard global variables and static class members in that the construction time is deterministic, but their destructors still run at the very end of the program, when other static objects may already have been destroyed. They’re great for little conveniences such as counters and timers, but I’d stay away from using non-integral types in static variables in “real” production systems. Store your state in class members instead, and make the function containing the local static variable a class method.
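The “initialized the first time that the program gets to their declaration” part can be checked in code rather than by reading the console. This is a minimal sketch: counted and get_counted are made-up names standing in for the chatty class above, with a counter added purely for illustration.

```cpp
#include <iostream>

// Hypothetical stand-in for chatty, with a counter added so that construction
// can be observed from code as well as from the console.
class counted
{
public:
    static int constructed;
    counted() { ++constructed; std::cerr << "counted()" << std::endl; }
};

int counted::constructed = 0;

// The local static is constructed the first time control passes through its
// declaration; every later call returns the same object.
counted & get_counted()
{
    static counted c;
    return c;
}
```

Before the first call to get_counted(), counted::constructed is 0. After any number of calls it is 1, and every call returns a reference to the same object.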

Problem #2: Auto-generated class methods

Code

#include <iostream>
using namespace std;

class bad_singleton
{
    bad_singleton() { cerr << "bad_singleton()" << endl; }
    ~bad_singleton() { cerr << "~bad_singleton()" << endl; }

public:
    static bad_singleton & instance()
    {
        static bad_singleton inst;
        return inst;
    }
};

Problem
In this listing, a rather lazily-designed singleton class has been defined. What can go wrong when you write code that uses this class?

Solution
It is possible to create more than one instance of this class. While the constructor is private, the copy constructor is not declared, and will therefore be generated for us in the usual way – i.e. with public visibility.
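Another subtlety is that the inst variable, being a local static, is only destroyed during program exit (after main() returns), so any code that runs during shutdown must not assume that it is still alive.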

Discussion
An example of some code that you might write, if you’re not careful, is the following:

int main()
{
    bad_singleton bs = bad_singleton::instance();
}

In this case, we’re trying to get the instance (which is returned by reference), but we’ve declared bs as a bad_singleton, not a reference to a bad_singleton (we forgot the ampersand). All is not lost: while our auto-generated copy constructor can perform the copy implied with this line of code, the private destructor will lead to a compile error, since bs will go out of scope at the end of main(), and the compiler will attempt to put in a call to the destructor, which fails.
We can get around this with the following (obviously bad) code:

int main()
{
    bad_singleton* bs = new bad_singleton(bad_singleton::instance());
}

This code now compiles, and we can create as many of our singleton as we like. Not so single after all.
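One way to close the hole is to tell the compiler not to generate the copy operations at all. The sketch below assumes a C++11 compiler (for “= delete”); on older compilers the usual trick is to declare the copy constructor and assignment operator private and leave them undefined. The name better_singleton is made up for this example.

```cpp
#include <iostream>
#include <type_traits>

class better_singleton
{
    better_singleton() { std::cerr << "better_singleton()" << std::endl; }
    ~better_singleton() { std::cerr << "~better_singleton()" << std::endl; }

public:
    // Deleting the copy operations stops the compiler from generating them.
    better_singleton(const better_singleton &) = delete;
    better_singleton & operator=(const better_singleton &) = delete;

    static better_singleton & instance()
    {
        static better_singleton inst;
        return inst;
    }
};

// Neither copy construction nor copy assignment is available any more.
static_assert(!std::is_copy_constructible<better_singleton>::value,
              "better_singleton must not be copy-constructible");
static_assert(!std::is_copy_assignable<better_singleton>::value,
              "better_singleton must not be copy-assignable");
```

With this version, both the forgotten-ampersand line and the “new bad_singleton(...)” trick are rejected at compile time, and the only way to get at the object is by reference through instance().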

Problem #3: Template recursion

Code

template<int N>
class factorial
{
public:
    enum { value = N*factorial<N-1>::value };
};

template<>
class factorial<1>
{
public:
    enum { value = 1 };
};

int main()
{
    int a = factorial<4>::value;
    int b = factorial<400>::value;
}

Problem
The code listing above gives us a templated (i.e. compile-time) factorial function (!). What happens when we hit “Compile” on this file, and then run it?

Solution
The factorial function given above is perfectly valid, and the computation of 4! (4 factorial) should work as expected. However, when we pass 400 as a template parameter, we have two problems:

  • Our int will overflow: 13! is already too big for a 32-bit int, and 400! is a very very big number, far beyond anything 32 bits can represent in integer form
  • Our compiler has to recurse 400 times to generate the result of this computation. This is a pretty deep parse-tree and there’s a lot of generated code lurking in there.

Discussion
The actual results will be compiler-dependent. On Visual C++ 2008, the compiler tries its best, and issues lots of warnings about integer overflow (for all the overflowing recursions, which is most of them), and then either succeeds, or gives a recursion error, or crashes out, depending on how much memory you have, the day of the week, and the colour of your bedroom walls. Your compiler may enforce a hard limit on template recursion depth to avoid this slightly unstable situation (the C++03 standard only suggests supporting at least 17 nested instantiations, although most compilers allow far more), and the moral of the story is: don’t take template metaprogramming too far! Think of it like a normal run-time recursive function, but with a very small stack.
The other thing about this particular bit of templatery is that it’s completely insane. Computing factorials with your compiler is weird.
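If you really do want factorials at compile time, a constexpr function is a calmer way to get them than recursive templates. This is only a sketch, assuming a C++14 compiler; checked_factorial is a made-up name, and widening the result to 64 bits is my choice here, not something the listing above does.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical constexpr alternative to the factorial template (C++14).
// It returns n!, and refuses to continue if the next multiplication would
// not fit in 64 bits.
constexpr std::uint64_t checked_factorial(unsigned n)
{
    std::uint64_t result = 1;
    for (unsigned i = 2; i <= n; ++i)
    {
        if (result > std::numeric_limits<std::uint64_t>::max() / i)
            throw std::overflow_error("factorial does not fit in 64 bits");
        result *= i;
    }
    return result;
}

// Evaluated by the compiler, with no template recursion involved.
static_assert(checked_factorial(4) == 24, "4! should be 24");
static_assert(checked_factorial(20) == 2432902008176640000ULL,
              "20! is the largest factorial that fits in 64 bits");
```

Asking for checked_factorial(400) inside a static_assert is rejected by the compiler, because evaluation reaches the throw. Asking for it at run time throws std::overflow_error instead of silently wrapping around.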

h1

When in Rome…

August 12, 2008

use dependency injection.

Recently, when working on a Flex project, and being the numpty that I am, I was trying to work out how to design the app for the usual stuff (extensibility, maintainability, etc). It’s a GUI app with several components, and it needs to interact with various web services, so I went for an engine / GUI split as a starting point. All the code that directly deals with user actions, UI controls, and UI layout lives in a “ui” package, and all of the code that modifies application state, controls reading / writing data, and does the “business logic” (such a hateful term – is there a better one?) lives in an “engine” package. Application actions are executed following the Command pattern, and there are some data objects that live in a “data” package that talk to the web services and store data on disk, providing methods for reading and writing data that the engine code can use.

All seems well and good so far (unless I’ve already made some horrible mistake – <fear/>), but I now need a way of linking the engine components (called things like UserPreferencesManager) and the UI components (called things like UserPreferencesDialog) together. Typically, there will be one UserPreferencesManager (or whatever) for the application, and the preferences will need to be accessed all over the shop. Several options spring to mind:

  • Instantiate a UserPreferencesManager object as a member of the Application object, then access it from everywhere else using the handy ActionScript object “Application.application”
  • Make the UserPreferencesManager a singleton (there’s only going to be one of them, right?)
  • Make the UserPreferencesManager a global static object
  • Instantiate a UserPreferencesManager object as a member of the Application object, then expose a property on all components requiring access to a UserPreferencesManager, and follow the Dependency Injection pattern to set each property to our application’s UserPreferencesManager (directly in the Application definition for children of Application, then from those objects to their children that require access to the UserPreferencesManager, and so on)
  • Use the Model Locator pattern

To be honest, these options didn’t spring to mind the first time I wrote a Flex app. I found out about Application.application and thought “great, that makes all this design pattern rubbish totally irrelevant – I can bind straight to the object wherever I am! Woohoo!”. Naturally, when I wanted to change one of these objects in my Application that was coupled to everything else, things started to look less rosy, so I asked my old friend Google for help.

Google sent me to the Adobe Developer Centre page linked to above, which made me very happy indeed. Some advice on the very problem I was struggling with, straight from the horse’s mouth! I read about their Model Locator pattern and it all made sense. I already knew what a Singleton was, and while I’d avoided that because I had some vague notion that it might behave very much like the Application.application method, Model Locator was a rubber-stamped, ready-made solution. Right?

Fortunately I found the Architectural Atrocities blog series, which has quite a lot to say about Flex, and Cairngorm in particular (which is what the Model Locator pattern comes from – I’ve never used the framework myself). The relevant part:

Translation:

We thought all the patterns in the J2EE catalog looked too complicated so we came up with this idea of using global variables instead, it’s much easier.

Indeed the Model Locator is just another, more complicated way of writing a singleton (or global variable, if you like your bad programming to be more obvious). Strangely, the Adobe authors do mention the correct solution in their article:

This brings up an important best practice. When you create components that rely upon client-side data, it is all too easy to create a direct reference to a Singleton model, such as ShopModelLocator. Indeed this is true throughout the application. We discourage this approach. Instead, consider passing the model and/or its properties down through a hierarchy of your view components, for a cleaner and more thoughtful solution.

That’s dependency injection! Why bother with the singleton at all? If you just want one ShoppingCart object, just create one instance. There’s no reason to make the shopping cart object a Singleton – that just makes your life more difficult if you decide in the future that actually, you do want more than one shopping cart.

So I’ve been using dependency injection, and it’s great. It makes it obvious which classes depend on which objects, even if they’re nested several levels deep (because their parents, and their parents’ parents, etc, all need to have the dependencies injected into them). If you change the type of an injected object, you’ll get compile errors wherever the old type was expected (if you’re using type annotations in Flex, or otherwise statically typing your variables in other languages). You can even have more than one object of a given type if you like!
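The real app is ActionScript, but the shape of the idea is the same in any language, so here is a minimal sketch in C++. The two class names come from the post. Their members (theme, title and so on) are invented for illustration, and the sketch uses constructor injection rather than the property injection described above.

```cpp
#include <string>
#include <utility>

// Hypothetical C++ stand-in for the engine-side class; the members are
// invented for this example.
class UserPreferencesManager
{
public:
    explicit UserPreferencesManager(std::string theme) : theme_(std::move(theme)) {}
    const std::string & theme() const { return theme_; }
    void set_theme(const std::string & theme) { theme_ = theme; }

private:
    std::string theme_;
};

// The dialog never reaches for a global: whoever creates it must hand over
// the manager it should use, so the dependency is visible in the signature.
class UserPreferencesDialog
{
public:
    explicit UserPreferencesDialog(UserPreferencesManager & prefs) : prefs_(prefs) {}
    std::string title() const { return "Preferences (" + prefs_.theme() + ")"; }
    void choose_theme(const std::string & theme) { prefs_.set_theme(theme); }

private:
    UserPreferencesManager & prefs_;
};
```

Nothing stops you creating two managers and giving each dialog its own, which is exactly the flexibility a singleton takes away.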

To summarise the methods listed above, then:

  • Global variable
  • Global variable
  • Global variable
  • Single instance that is passed around where it is needed
  • Global variable

It’s impressive how many different names there are for the same thing (and that’s not even mentioning Monostate). Yes, I know there are subtle differences and they’re designed to do slightly different things… but really.

My Command invoker is still a singleton though.

h1

What I really wanted to say

August 11, 2008

So my post today was meant to be about something else (that’s why I set up this account). Can’t remember what it should be yet…

Not lock-free coding

Not goating… although the Clippit thing is genius.

Maybe it was meant to be on writing unit tests for database-dependent code. What’s the best-practice way to do things? Should you rely on there being a genuine DBMS that you can mess with, and run some scripts to set up and tear down your test dataset? Or should you try to do all your tests without any database at all, relying on your SQL to be valid / tested independently (surely there must be a better way than this)? How should you set up test data for anything but in-memory datasets (file-based, Web-based, etc)? I need to look into this.

I’m not sure that the unit-testing thing was the idea for the post now. Maybe I’ll remember tomorrow.

This blog is already becoming a to-do list, hopefully I’ll be able to answer some of these questions soon!

h1

Content before design

August 11, 2008

I nearly went straight for the “design” button as soon as I logged in here at WordPress for the first time, but managed to resist the temptation. Design is more fun than writing content (probably because you can’t do it as often, unless you’re a graphic designer or whatever) so it’s what I’d normally do. However, getting a design right without content is difficult, and you’d probably have to invent some content anyway if you wanted to prototype it properly (and who wants to see Lorem Ipsum again?).

I need to get used to this “link to everything” malarkey, it probably makes blog posts “better” in some way. There’s probably a way to quantitatively measure that, actually: might be worth a little test at some point in the future.

I also need to devote more time to searching for good links: I went for the good old Google Number One there. Great. Does anyone archive top ten Google results…? Somebody must be doing that for a “popularity versus time” graph.