Story Driven Development with DSL and REST – Part 2 December 12, 2018

Part 2 - The power of DSL and REST

Placing a script in the page with the HTML is fine for simple websites, but it makes the source code visible for the world to see and to copy. Of course, this is true for any embedded code, including JavaScript in the page header, but DSL scripts are far easier to read. More importantly, the tendency now is to go for single-page designs, which result in much larger scripts that combine the features of what would otherwise have been several pages. Whether JavaScript or EasyCoder, the larger the script the longer it takes to download, and in the case of EasyCoder, to compile prior to running.

Both of these issues are dealt with by using REST. As supplied, EasyCoder comes with its own REST server in the form of a PHP script in the plugins folder. This manages content in any number of MySQL tables added to those already used by WordPress. The tables all share the same structure; an id, a name and a value, and one of these tables can be used to hold your scripts. Another can hold chunks of HTML for use by your pages, and a third your CSS definitions. Other tables hold data specific to the needs of your website.

An EasyCoder web page can comprise just 3 commands:

variable Script
rest get Script from
   `https://mysite.com/wp-content/plugins/`
   cat `easycoder/rest.php/`
   cat `ec_scripts/name/main`
run Script

In this script the word 'cat' performs string concatenation, keeping the lines short enough to fit on a mobile screen. You would normally just put the whole thing into one long string.

The script declares a variable, performs a GET request on the REST server (asking for the item named 'main' in the 'ec_scripts' table), fills the variable with what it gets back, then runs the script. Everything from this point on, including loading the HTML, is handled by the script. This is the ultimate in hiding your code from curious eyes, yet the script and the HTML remain easily accessible to anyone with the right credentials.
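To make the mechanics concrete, here's a rough JavaScript sketch of what those three commands amount to under the hood. The URL layout simply follows the concatenation in the script above; the `EasyCoder.run` entry point is invented for illustration and isn't the plugin's actual API.

```javascript
// Illustrative sketch only, not EasyCoder internals. The REST path follows
// the pattern in the script above: <plugin folder>/rest.php/<table>/name/<item>.
function restUrl(base, table, name) {
  return base + 'rest.php/' + table + '/name/' + name;
}

const url = restUrl(
  'https://mysite.com/wp-content/plugins/easycoder/',
  'ec_scripts', 'main');

// In the browser, 'rest get' then fetches this URL and 'run' hands the
// returned text to the compiler, conceptually:
//   const script = await (await fetch(url)).text();  // 'rest get'
//   EasyCoder.run(script);                           // 'run Script' (hypothetical name)
console.log(url);
// → https://mysite.com/wp-content/plugins/easycoder/rest.php/ec_scripts/name/main
```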

Cute as this is it's not actually a very good way of organizing things. You should always deliver a web page as quickly as possible so the user sees something, in case they get impatient and click away from your page. If it's not the whole page then some part of it, just to let the user know it's on its way. So the initial page should contain just enough HTML to give an outline of what's coming; the title and the major content areas, initially empty. It should also contain enough of a script to get things moving. Much of this will be REST calls to load other scripts and HTML, so you're not giving much away by having it visible. There's probably a masthead logo, which takes time to download, so there's plenty of time for the script to compile and start work. The EasyCoder compiler is pretty quick; it's unlikely that any initial page of this kind will take as long as 100ms to compile. In most cases compilation will take under half that, so there's little to be gained by loading the initial script from the REST server as shown above.

Once the script is running it will start to display content; items that are either already in the initial page or called from the REST server. Multiple requests can take place simultaneously, so the pipeline is kept pretty full. The user will see the page load quickly then fill in with the detail.

The initial script will load more HTML, CSS and other scripts whose job is to handle interactivity. There's no need for these to be in place right at the start as few users start clicking before the page has loaded, and if they do it's OK for nothing to happen. The scripts can load after everything else; the chances are that images will still be arriving at this time. If you leave the browser's developer console open you can see the newly-arrived scripts compiling while you watch the page develop in its own window.

When you finish with a piece of HTML you can remove it by setting the content of its parent to empty. Similarly, when you finish with a script you can call its exit command, which removes it completely from the browser's JavaScript space. This makes it very easy to build a single-page website of unlimited complexity that pulls in resources as they are needed and disposes of them afterwards.

Development implications

With JavaScript running a single-page design the entire codebase must be loaded up front. Whatever programming framework is used, the code will be bulky and complex for all but the simplest web pages. Load-on-demand, as described above, is possible to do in JavaScript but makes the code even more complex and harder to understand. By comparison, EasyCoder only requires its own JavaScript modules to be loaded in the head of the document; the total size of this is something under 150Kb depending on which plug-in extensions are required. The initial script - the only one that has to be present from the start - is probably only a few kilobytes as it's only dealing with the first things the user will see. Other scripts may be larger, but a good rule of thumb is to keep them to 300 lines or less in the interests of easy maintenance. A script of this size will be around 10Kb and compile in under 100ms.

By loading code and HTML on demand and releasing it afterwards there is no limit to the size of a single-page design. New components can be added with little risk of their interfering with what's already there. You can even operate your own plugin mechanism that loads new content without knowing anything about its internal workings.

The key features of a DSL/REST combination are:

  1. It can give a lower startup time than conventional practices.
  2. Increasing the size of the site has no effect at all on the load time.
  3. The software is largely a formal version of the customer's own stories.
  4. Part or all of the development can be done by teams without extensive JavaScript skills.
  5. Ongoing maintenance and updates are far easier and less expensive to manage.

Other Tools

When you store scripts and HTML in a database you must have some means to edit them. EasyCoder includes a set of editor scripts for this purpose. Scripts are just plain text but for HTML we use the CKEditor rich text editor; you will need to download and install this yourself. You are of course free to choose any other tools you like and use them to edit your scripts and HTML fragments.


Story Driven Development with DSL and REST December 8, 2018

This article describes how to implement large-scale, high-performance, interactive single-page and other designs without using any JavaScript and without having to be a professional programmer.

Part 1 - Story Driven Development

Every new website project starts with a "story"; a narrative that describes the appearance of the site and the way it responds to user interactions and other events. Stories can be visual (picture-based), they can be expressed as text, or they may simply exist as conceptual entities in the shared awareness of the development team. The last of these leaves no permanent record other than some emails and Slack messages, but in most cases the website itself displays its purposes clearly enough.

Story Driven Development (SDD) may or may not be a recognized term, but I use it here to mean a formal, documented process with just 3 steps:

1 Write the stories
2 Do the development
3 Test the result against the stories

SDD doesn't care how you do the development, only that it's following the stories. We start with the expressed requirements of the client, formalized as design documents and used to verify the final product. Along the way there may be many steps bundled together as "Do the development", and for all but the most basic websites a program of some kind will run in the browser to control aspects of the design. JavaScript is now the only game in town as far as the core programming language is concerned, but it is increasingly layered with tools and frameworks like Angular, React and Vue. These are all tools for professional programmers that leave little space for the client to be involved in the development process and little opportunity to verify progress until the final product is delivered.

For over 50 years the software industry has had a poor record of delivering what is wanted. Projects overrun on both time and budget and frequently fail to work as requested. Sometimes they collapse in unrecoverable heaps, but the response of the industry is usually to throw more complexity at the problem in an often futile attempt to squash bugs before they occur. The consequence, even for initially successful projects, is an ever-growing dependency on a coterie of highly-paid, highly trained software engineers who will not be around years later to maintain the products they have built. The result is a downward spiral of reliability and a loss of fitness for purpose, as many customers are unwilling or unable to afford the equally high cost of maintenance.

But it doesn't have to be like this. With more emphasis on the stories themselves we can build code that is understandable by product owners as well as software engineers. It's a fundamental principle of Open Source that "with enough eyeballs, all bugs are shallow", and it goes without saying that the easier your code is to understand the easier it will be years down the road to find people who can be safely trusted to maintain it.

Let's look at an example.

Suppose we have a web page that displays some paragraphs of information. If there is too little content the message may not be properly conveyed, but too much may overwhelm users and cause them to give up. So we decide to compromise by using a 'concertina' design pattern, putting a "more..." link at the end of each topic, which when clicked causes a popout block - further text and/or images - to be added to the topic. A corresponding "...less" link in the new text restores things to their previous state.

The basic story for invoking this feature looks something like this:

"When the user clicks the More button...
1 Close any popout blocks that are open
2 Hide the More button
3 Make the popout block visible
"

With the help of our HTML designer let's now rewrite step 3 in terms that relate to the content of the web page:

Set the display style of the div whose id is "popout-1" to "block"

This is fairly clear and unambiguous, though the syntax is a little clumsy and could be improved. More importantly, you may need to do more than one thing to the popout div, such as giving it a background color or a border. To avoid repeating the central clause (the div whose id is "popout-1") each time, let's create a named entity that we can use whenever we want to refer to the div. We can call it Popout1 so it's easy to see which id it relates to. The action we want to perform is to attach this entity to the div element in the DOM:

attach Popout1 to the div whose id is "popout-1"

Now when we want to show the text, all we need to say is

set the display style of Popout1 to "block".

When a story like this is translated into JavaScript it leaves the normal, everyday world and enters the domain of the programmer. From here on, every addition, every change, however simple, needs a programmer to cast the necessary magic spells.
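To see the contrast, here's roughly what those three steps look like once hand-translated into JavaScript. The tiny `document` stub below stands in for the browser DOM so the sketch can run on its own; in a real page you would use the global document, and the element ids are just the hypothetical ones from the example.

```javascript
// Stand-in for the browser DOM, for illustration only.
const document = {
  elements: {
    'popout-1': { style: { display: 'none' } },
    'more-1':   { style: { display: 'inline' } }
  },
  getElementById(id) { return this.elements[id]; }
};

function showPopout(n) {
  // 1. Close any popout blocks that are open (only one exists in this sketch)
  // 2. Hide the More button
  document.getElementById('more-' + n).style.display = 'none';
  // 3. Make the popout block visible
  document.getElementById('popout-' + n).style.display = 'block';
}

showPopout(1);
```

Clear enough to a programmer, but it's no longer the story the client wrote.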

However, although the story is expressed in ordinary English it's actually an unambiguous description of some actions. This is really what defines a computer language, so in an ideal world there would be a compiler able to handle instructions like these. The good news is that languages of this kind do exist - they're called DSLs - Domain-Specific Languages. Rather than catering for general-purpose programming concepts, a DSL uses a syntax and vocabulary that have particular meaning in the domain for which the DSL is written. Here in the domain of the web page it's all things relating to browsers and what they do.

DSLs can be used by programmers and non-programmers alike, though the former can be expected to produce better code more quickly and with fewer bugs. The code may not be simple - a DSL script may be asked to do complex things and if complexity is really there it's not possible to hide it just by writing it differently. The difference is that no matter how complex a DSL script becomes it can still be read and understood by anyone having sufficient domain knowledge, without also having to be a JavaScript expert. Every command in a DSL script has a real-world meaning and it takes only minutes to learn the essential features of the scripting language.

A good DSL has other advantages. It can offer a built-in debugger that allows you to step through the script, examining variables as you go. Most bugs are quickly revealed by this technique. And in a WordPress context, much of the coding can be done by people with ordinary Editor rights on the website. This is because DSL scripts live in the page with the HTML, which makes a DSL very convenient for simple web pages that only require a small amount of coding to handle their features. No need to find a place for custom JS files; everything is done using the WordPress editor.

You may have realized that the DSL I've been talking about all this time is EasyCoder. Others may exist of course, and I would encourage you to look out for them, but I'll continue to focus on what EasyCoder offers to website builders.

You may also have gained the impression that because the examples we present are simple, EasyCoder is only for simple websites. This is far from true; most examples have to be simple because that's the nature of examples. In fact, EasyCoder has some industrial-strength features that I'll describe in part 2 of this article.



The Titchmarsh Effect November 18, 2018

Alan Titchmarsh is the presenter of a number of similarly themed programmes in which instant transformations are done on people's gardens without their knowledge. A team of experts creates beautiful designs, lots of problems are overcome along the way, usually to a very tight timescale, and the owner - or should I say victim? - is always delighted, or at least pretends to be. It’s all great TV.

But consider: If the owner were a keen gardener the makeover wouldn’t have been needed in the first place. And if they wanted a beautiful garden and could afford to pay for it they’d have done so. Who is going to maintain Alan’s loving creations?

Something these programmes rarely seem to do is return a few years later and see how well it stood the test of time. I fear that were they to do so it would make for some rather more depressing TV, with the inspiration long gone, original concepts overgrown and weeds rampant. Entropy rules.

It's rather like that with software projects. Everything is hunky-dory on the day of delivery but the system has to be maintained, usually by people with a lower level of expertise than the original team. The result is a gradual deterioration in the quality and reliability of the product. There's also an ever-escalating cost of repairs because each time work needs to be done a new team has to be brought up to speed and before they can make progress they have to first unpick the mistakes of the previous team.

For both gardens and software, the root cause is usually excessive complexity and an inappropriate choice of technology. The principle too often ignored is

"If you can't maintain it, don't build it."

If your client can afford a full-time gardener, then go ahead with the elaborate design. If not, give him a lawn and some shrubs. Similarly with software; only use tools and techniques that are readily understood by the people who will be tasked, years from now, with the maintenance of the system.

The problem with complexity

Every new technology is touted by its evangelists as the answer to all problems, but it ain't necessarily so. Just because it works well in a highly professional environment with long-term support guaranteed, you shouldn't assume it will automatically meet your needs now and into the future. The languages and frameworks that are favored today by the software industry all have a lifetime, after which it will be hard to find anyone experienced in their use. That translates to high costs and uncertain results.

The response of the software industry to complexity is to add more complexity. That appears bizarre but actually it’s not as daft as it sounds if we compartmentalize the complexity properly to keep it well hidden from all but those who really need and are able to handle it. The trouble is we tend to proudly display it as a sign of machismo, placing it in full view where it often becomes a barrier to understanding.

Languages like JavaScript and the frameworks like Angular and React are raw, naked software; built to work well in the hands of experts. You should never expose this nakedness to your customers, most of whom will find it upsetting. They would prefer to see it decently clothed by the domain in which they operate.

Story-Driven Design

Customer needs and wishes start life expressed as stories, written in English by domain experts, and if you want to fully benefit from the experience of those experts you should depart as little as possible from that format. SQL is the leading example of this principle - nobody writes database code in C or JavaScript. SQL is probably the best advert there is for languages that match the domain and keep the complexity hidden. And it’s not alone; although it may be stretching the definition of a language just a little, Excel macros can also be described as a DSL.

Where are the DSLs?

You may argue that for other domains, such languages don't readily exist and will have to be written before any useful progress can be made on the project itself. You might also believe that special skills will have to be brought in and that the resulting effort cannot possibly be cost-effective.

Well it may be true that the DSL you need probably doesn’t exist and will have to be written, but let me assure you that a DSL is no harder to write than the complex nakedness it is required to cover. Yes, there’s an overhead, but once the initial preparation has been done, development and maintenance of the end product get easier, so in the longer term the savings are significant. It’s also a fact that once you’ve written the core of a DSL, adding new features to it becomes a simple task.

A well-designed DSL is highly modular, with each language feature handled by its own private code section. Much of the work is done by calls to core functions, which get massive use that flushes out bugs, leaving them little room to hide in. As the language develops the core gets less and less attention, so earlier features are rarely disrupted by new additions. Functions outside the core are rarely called from anywhere outside their own modules, which minimizes the tendency for spaghetti code to develop.

Writing a good DSL requires a positive attitude towards encapsulation. The starting point for any new functionality is the syntax you require it to provide to its users, so it works best if the functions being implemented have a tight, well-defined API. Google Maps is a good example of a highly complex product with an elegant, lean API that allows it to be readily encapsulated in DSL code. With experience you find that almost anything can be encapsulated; in most cases the hardest part is in finding a source syntax that reads well and is unambiguous, that is, unable to be confused with other functionality using the same language keywords. An example would be the need for care when implementing words like set and create, both of which are likely to be heavily overloaded in a typical DSL.
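The modular structure described here can be sketched in a few lines of JavaScript: each language feature registers its own compile handler with a small core, so a new keyword can be added without touching the handlers that already exist. All the names here are invented for illustration; this is not any real DSL's code.

```javascript
// A tiny core: a registry mapping each keyword to its own compile handler.
const core = { keywords: {} };

function register(keyword, compile) {
  core.keywords[keyword] = compile;
}

// Each feature module registers its own keywords; 'set' and 'create' are the
// overload-prone words mentioned above, so their handlers own all the work
// of disambiguating what follows them.
register('set', tokens => ({ command: 'set', rest: tokens.slice(1) }));
register('create', tokens => ({ command: 'create', rest: tokens.slice(1) }));

function compileLine(line) {
  const tokens = line.split(' ');
  const handler = core.keywords[tokens[0]];
  if (!handler) { throw new Error('Unknown keyword: ' + tokens[0]); }
  return handler(tokens);
}
```

Adding a new language feature is then just another call to register, with no changes to the core or to existing handlers.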

For WordPress

If your project is based on WordPress then much if not all of the work has already been done. The EasyCoder plugin is a DSL that implements a set of language features relevant to the needs of web page builders. Its comprehensive feature set includes support for managing DOM elements and also things like JSON and vector graphics. The package comes with a REST server that can quickly be leveraged to provide support for load-on-demand code, HTML and CSS, facilitating the construction of large single-page websites.

EasyCoder is built using its own plugin architecture that lets new syntax be added without any need to change existing scripts. New functionality can be provided by anyone; not just the EasyCoder team itself. No conventional compiler techniques are used; script commands are handled by simple text processors that analyze the incoming stream and create a description of what they find, in the form of JavaScript objects. These are handled at runtime by code that is essentially just a thin wrapper around the underlying functionality that is being exposed as script. Any competent JavaScript programmer will take only a few hours to get to grips with the structure of both the compiler and the runtime.
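As a sketch of that compile-then-run scheme (illustrative only, not the real EasyCoder source): a text processor turns a command into a plain JavaScript object, and a thin runtime wrapper executes the object later.

```javascript
// Text processor for one hypothetical command shape:
//   set the <property> of <Name> to <value>
function compileSet(tokens) {
  return { command: 'set', property: tokens[2], target: tokens[4], value: tokens[6] };
}

// A thin runtime wrapper that executes the compiled description.
const runtime = {
  variables: {},
  execute(item) {
    if (item.command === 'set') {
      this.variables[item.target] = JSON.parse(item.value); // strip the quotes
    }
  }
};

const compiled = compileSet('set the content of Message to "Hello"'.split(' '));
runtime.execute(compiled);
```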

For the rest

If the functionality offered by the standard product is inadequate, even after adding your own plugins, it should be noted that EasyCoder is the name for both the browser language provided to WordPress users and for the core engine itself. What this means is that the engine is largely independent of the language features so it can be the basis of an entirely new DSL where the only basic requirement is that it should all be written in JavaScript. The entire source code is available on GitHub - see the About page - and anyone is welcome to adapt it to their own needs.


Mutual Understanding June 5, 2018

In the previous article I asked why it is that when we speak to computers we use a kind of "baby talk" that makes it easy for them but very hard for us. We've had personal computers for over 40 years now and their power has grown immensely, but we still twist our minds into knots to describe things to them rather than expect them to understand our language. It's often claimed by the software industry that there's no alternative, but there are reasons to be suspicious of this claim.

Surely the existence of SQL gives the lie to the claim that computers don't and can't understand human language. Here we have a scripting system that is readily understood by people of all kinds; not just programmers but anyone who needs to store and retrieve data for whatever purpose. So what's special about SQL that sets it apart from C, Java and the rest?

Well, firstly SQL is highly domain-specific. Everything in the language relates to databases; it doesn't try to do graphics, music, web pages or navigation. So although it's not a large language, every term it contains is related to the domain it covers.

Secondly, SQL makes somewhat less use of symbols than other computer languages, in proportion to the overall character count, at least. The higher the density of symbols the more difficulty people have in quickly absorbing the meaning of a sentence.

And SQL sentences tend to look like English sentences, providing a further aid to understanding.

Now I cannot cite a definite proof for this, but I am fairly convinced that major software disasters are relatively rare in the world of SQL. Where things go badly wrong they tend to do so in other areas of a project. If I am right it's because of one thing; that the ownership of the code has not been delegated. The owner of the code is a database user, not a programmer. The project may well have employed a programmer to build the database code but the individual SQL commands still remain accessible to the owner. There's a saying that "given enough eyeballs, all bugs are shallow". The more that people outside the programming team are able to monitor development the less the likelihood that serious mistakes will happen.

Whether or not my crazy theory has any basis in fact, it's worth asking if the principles of SQL might be applied to other domains. Because in general they aren't. SQL is an example of a computer program that understands English-like statements if they are carefully written in a standard syntax and use known keywords.

What if we could do the same for other domains? Take web pages, for example. One of the fundamental uses of the React framework is to describe web pages, building them on the fly in the browser instead of loading them from the server. There are two ways to do this; one involves constructing JSX (XML that looks very much like HTML) and passing it through a JSX precompiler running in the browser to convert it to regular JavaScript, which is then run along with the rest of the JavaScript code that deals with event handling and other things. The other way is to dispense with JSX and just write the functions that build the DOM objects.
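The second route can be sketched without any framework at all. This little helper mimics the shape of React.createElement but just returns plain objects, which is enough to show what "writing the functions that build the DOM objects" means in practice:

```javascript
// A minimal element builder: tag, properties, then any number of children.
function h(tag, props, ...children) {
  return { tag, props: props || {}, children };
}

// Describing a page fragment as nested function calls instead of JSX.
const page = h('div', { id: 'app' },
  h('h1', null, 'Hello'),
  h('p', null, 'Built on the fly in the browser'));
```

Readable to a programmer, certainly, but still a long way from plain English.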

But whichever route is taken it still looks like JavaScript, with or without extra blobs of XML. This is not something that most web designers can handle comfortably, let alone the thousands of amateur and semi-professional WordPress users building their own websites.

None of the other available frameworks get any closer to handling a project expressed in plain English, the test being that it makes sense to a non-programmer when read out loud. That's a particular challenge but I won't settle for any less. After all, I can speak to my phone and perform a variety of useful tasks, so don't tell me it's not possible to write computer software that can process a constrained version of natural language.

I'm not trying to say that traditional computer languages should be thrown out and replaced with natural language. There are some things so complex they cannot be expressed easily in a spoken form and require a formal implementation such as modern computer languages offer. SQL itself has to be written in something, after all. But if a problem can be talked about and explored in natural human language then we should do our best to maintain that advantage all the way through to the code we provide to the computer.


The power of language

All human activity requires communication. Our species only became successful and able to dominate all others when we learned to communicate using language, and human languages are immensely complex entities. Nobody is quite sure how many words we use in everyday speech but it runs from hundreds into thousands - maybe tens of thousands. Words differ in the way they sound and the way they are written, resulting in bizarre misunderstandings of the "four candles"/"fork handles" variety. Syntax rules layer on top of the words themselves and apply an agreed way in which to combine them to express meaning, adding further opportunities for things to go dramatically wrong.

Given all this potential for misunderstanding you'd wonder how we ever get anything right, but we do, even when the parties concerned are from different countries. Part of the way we achieve this is by setting up "domains" in which words have specific meanings that they may not have in general speech, and are used in tightly specified ways.

The word "language" itself has two distinct meanings. One is concrete, as in "I speak a foreign language" and the other is more abstract, as in "bad language" or "the language of commerce". The latter case doesn't mean that we swear in Greek or that business people only speak French while the rest of us speak English, only that they have a particular way of describing their work that might be unfamiliar to ordinary people.

There are thousands of domains covering all areas of human activity from bee-keeping to fine art, from car maintenance to cookery and from music to weather forecasting. Each uses a particular subset of the host language (e.g. English) to convey concepts specific to the domain.

If I want to convey instructions to another person - say how to boil an egg - they go something like this:

take a small pan
fill it with water
place the pan on the hob
bring to the boil
lower an egg into the pan
wait 5 minutes and 20 seconds
remove the egg

The domain here is cookery so although the instructions are in English we see a number of keywords - pan, water, hob, boil, egg etc. - that have specific meanings in that domain that they might not have in general speech. The instructions are all words and are imperative - that is, each line is a command. This example is close to being unambiguous and could be made so by a further tightening of the syntax, without affecting the ability of a cook to read and understand it.
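To underline the point, the recipe above is already close to machine-readable. As a sketch: give each leading word a handler in the cookery domain and the whole thing runs as written. The handlers are invented stand-ins, of course; the point is only that a small fixed vocabulary plus a tightened syntax is all a computer needs.

```javascript
// Each handler interprets one imperative verb from the cookery domain.
const log = [];
const handlers = {
  take:   words => log.push('acquired ' + words.slice(1).join(' ')),
  fill:   words => log.push('filled with ' + words[words.length - 1]),
  place:  words => log.push('placed on ' + words[words.length - 1]),
  bring:  ()    => log.push('heating to the boil'),
  lower:  words => log.push('lowered ' + words[2]),
  wait:   words => log.push('waiting ' + words.slice(1).join(' ')),
  remove: words => log.push('removed ' + words[2])
};

const recipe = [
  'take a small pan',
  'fill it with water',
  'place the pan on the hob',
  'bring to the boil',
  'lower an egg into the pan',
  'wait 5 minutes and 20 seconds',
  'remove the egg'
];

// Dispatch each line on its first word - the whole "compiler".
for (const line of recipe) {
  const words = line.split(' ');
  handlers[words[0]](words);
}
```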

So it's interesting to see how different things are when we communicate with computers. What if I want to tell a computer to boil an egg? Let's assume the computer is wired to a fully automated kitchen so physically it's capable of performing the required tasks. But then things get tricky because computers don't understand text like that given above, even though the only real requirement is that the instructions have to be unambiguous. Nothing more.

To get a computer to boil our egg we have to translate the instructions into a computer language such as C, Java or JavaScript, but the result is something that a cook is unlikely to understand. In fact, a human being isn't even able to "speak" a computer program without difficulty as it's full of symbols, not just words.

When we use the term "computer language" we are in fact invoking a third meaning of the word "language", since computer languages don't conform in any meaningful sense to either of the previous meanings. They aren't in general pronounceable in speech, they don't "belong" to specific tribes, countries or traditions and they don't apply to domains. Instead they are general purpose symbolic vocabularies designed for the convenience of computers, not people.

Something else is different, too. It's a rare computer language that has as many as a hundred different keywords. Imagine that in order to communicate in English you had to use 100 words or fewer and build your sentences by combining those words in different ways, mixing upper and lower case and larding them with a profusion of symbols to reduce wordiness. Can you imagine how hard it would be to construct - or read - the Gettysburg Address or War and Peace?

Human communication has for millennia relied on languages with thousands of distinct words and meanings, able to be used for all purposes and equally good at handling the spoken word as the written. What on earth leads us to believe that when we want to give precise instructions to a computer we'll get the best results by throwing away all this linguistic heritage in favour of something our brains are demonstrably poor at handling?

The answer to this question can wait for the next article in this series.


What software crisis? June 3, 2018

You may question the assertion that we have a software crisis and that it's been ongoing for 50 years. After all, universities turn out computer science graduates by the thousand and there are more programming languages and frameworks than ever before. So here's a longish discussion about why I feel things are not as they should be.

There are a number of rather questionable assumptions made in software engineering that lead to a tendency to make technical choices before really considering what the overall aim of the project is. Strategic decisions are made by engineers without intimate knowledge of the problem domain and without adequate input from users. Rules are laid down and slavishly followed even when it becomes obvious they aren't working. And little consideration is given to future-proofing, for the day when the original team is long gone but the project needs updating and maintaining.

I take issue in particular with the dogmatic insistence that only by applying ever more complex software frameworks and practices can we guarantee a satisfactory outcome. Someone (not Albert Einstein, though he's frequently credited with it) defined insanity as "repeating the same mistakes and expecting different results", yet that is exactly what we do every time. And the first law of holes - famously quoted by Denis Healey, the former British Chancellor - says "When you're in a hole, stop digging".

Complexity and denial

The problem is how we deal with complexity. It's true that in some cases complexity is more apparent than real, but we rarely seek to question if this might be the case with our own projects. Instead, we apply the same standard-sized hammer to all problems, ignoring the possibility that parts or all of a project might work better with a different approach.

In my last employment, a great deal of care was expended on laying down rules and formats for the various processes involved. The project - a large website - had started life in Java, with strong object orientation, domain separation, modularity, subclassing and Spring injection, yet after less than 5 years the code was becoming unmaintainable. Refactoring was always a huge exercise that broke as much as it fixed. Most of the code simply shovelled data from one domain to another, and the prevailing mantra was that in order to understand the system all you had to do was "read the code" - a phrase that always fills me with the greatest of misgivings, as it's generally used as an excuse not to write documentation. The problem with Spring as the underlying framework is that although you can read a single file, it's often nigh-on impossible to see where it fits into the system. Not even the debugger can take you through the dozens of proxied software layers between one module and another, so only the very best engineers can really understand how the system works. Isn't this what frameworks are supposed to avoid?

When - inevitably - you lose the people with that knowledge, it becomes hard to replace them. Newcomers do their best, but they lack an accurate mental picture, and without good documentation to provide insight into the minds of the original developers they have little chance of gaining one. If they are not at least the intellectual equals of their predecessors they may never do so. Yet the pressure is on to deliver results, so results they deliver, often using a subtly or fundamentally different approach from the original, making it even harder for others to figure out what's happening and further adding to the sclerotic nature of the code.

This particular project, undergoing endless, rapid and substantial feature growth, would probably have ground to a halt before much longer, but during the couple of years I was there a persuasive case was made for new components to be written using Node.js as it was more flexible and potentially cheaper to deploy in a large cloud hosting environment. It was quickly apparent that leaving half the system in Java while all the new parts were JavaScript didn't make effective use of the engineers involved, so a decision was made to retrain the Java people and migrate the whole thing to Node.

Out of the frying pan...

The technical rules changed overnight. Everything now had to be done with pure functions, and the domain structure was defined by example in the first modules created, leaving the programmers little to do except churn out code that could be slotted into the structure. However, where complexity really exists you can't get rid of it just by reorganising things. It's like playing Whack-a-mole: push it down in one place and it pops up in another. Instead of the hierarchies of Java we now had composition, the downside of which is repetition. Because of the continuing lack of an effective documentation policy this repetition tended to be done in slightly different ways each time, so although modules were similar, the details tended to trip up the unwary. Complexity increased inexorably, step by step.

Another problem is in the way data is passed between modules. In Java this is always done with beans, each having a well-defined contract. Much of the coding work goes into creating beans, filling them with data from other beans using a host of mapper classes, then passing them to another part of the system where much the same happens again, resulting in a lot of bulk that's doing very little real work. On top of that is a correspondingly huge pile of unit test classes. It's hellishly clunky but it works - mostly.

In the new company culture, classes are regarded as subversive left-overs from the world of object orientation. Data is passed using composition, with objects comprising collections of attributes and no explicit contract ever expressed. The order and nature of these attributes vary from one file to another in a completely arbitrary and undocumented manner, so the need to read the code becomes inescapable. This can be quite hard when objects are destructured, spread and reconstructed with new attributes at each step of the way. By the time you've worked through the flow you've quite often forgotten why you started and what you were looking for.
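The two styles can be contrasted in a few lines of JavaScript. This is a sketch only: the names (OrderSummary, withDiscount and so on) are invented for illustration and don't come from the project described above.

```javascript
// Style 1: an explicit contract. The shape of an OrderSummary is declared
// once, so every reader knows exactly which fields exist.
class OrderSummary {
  constructor(id, customerName, total) {
    this.id = id;
    this.customerName = customerName;
    this.total = total;
  }
}

function toOrderSummary(order, customer) {
  // Mapper boilerplate: bulk doing little real work,
  // but the output shape is documented by the class itself.
  return new OrderSummary(order.id, customer.name, order.total);
}

// Style 2: anonymous composition. Each step spreads the previous object
// and adds attributes, so the final shape exists only in the reader's head.
function withDiscount(order) {
  return { ...order, discount: order.total > 100 ? 10 : 0 };
}

function withShipping(order) {
  return { ...order, shipping: 5 };
}

const raw = { id: 1, total: 120 };
const result = withShipping(withDiscount(raw));
// Which attributes does `result` have? Only reading every step tells you.
console.log(result); // { id: 1, total: 120, discount: 10, shipping: 5 }
```

The first style pays for its clarity in boilerplate; the second avoids the boilerplate but leaves the contract implicit in the code that builds the object.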

I left the company a few months after the start of this exercise, but I heard later that the pressure was unrelenting, which doesn't surprise me. Such an environment allows no time to step back and review progress, is unlikely to encourage the production of good documentation, and leads to an unhealthy reliance on the skills of a few exceptional individuals to hold everything together.

The odd thing about all this is that the job being done is really quite simple. The user clicks a button on a web page; the system fetches the relevant data and writes bits of it to database tables, schedules actions and returns a response. The UI does most of the presentational work so the server is just handling a few JSON structures, combining and recombining them according to business rules. This is all done on a massively parallel scale, of course, but it's the job of the cloud service to meet capacity demands. Why is the server code so hard to develop and maintain? In spite of new frameworks, technologies and programming techniques coming on-stream every few months, the same things keep happening in project after project, year after year for at least the last 50 years. The way things are usually done results in a dangerously small number of people who really know and understand what's going on. We take a huge risk allowing massive systems to be so dependent on such a small pool of expertise.

The power of language

Here I depart from mainstream thinking and make a fundamental assertion: It's all about language.

The human brain is wired to process language, and the printed word is a very efficient input mechanism. When we read a novel, a newspaper or a paper on a subject about which we know something, we process the incoming information very rapidly and retain quite a lot of it because it slots into what we already know. There's already a place in the brain to park the new information.

Computer software is different, as are mathematical formulae. True, there are some whose brains are wired to take in formulae as readily as the rest of us absorb John Grisham, but they are a minority in a world that demands more and more people to have computer expertise. For the majority, processing a page of JavaScript is no easier than handling Government regulations written in a foreign language. However, we can learn foreign languages and eventually have the doubtful pleasure of experiencing the full beauty of those regulations, but the same doesn't apply to Java, JavaScript and the rest.

This is because computer "languages" are not really languages. Not in the same sense as English, French, German or Italian, at least. Imagine that in order to communicate in English you had to use just 100 words (or fewer), building your sentences by combining those words in different ways, mixing upper and lower case and larding them with a profusion of symbols to reduce wordiness. Can you imagine how hard it would be to construct - or read - the Gettysburg Address or War and Peace?

But that's what programmers are expected to do.

The vision and the reality

Projects are conceived in people's minds, and before they can be implemented they have to be conveyed to a development team, which means writing them down. (Well, in some cases they aren't, but then we're asking for some pretty spectacular disasters.) The project requirements are written in English (I'll use the word to refer generically to any human language), initially in a narrative form that describes in the most general of terms what is wanted.

Next, the requirements are defined more formally, but still in English. Use cases (stories) and other defining documents get written. These are then passed to the programming team, who convert them to computer code.

You may never have thought about it, but there's a huge gap in there. I first became aware of it nearly 30 years ago when automating a section of a factory production line. The factory engineers had produced a clear description of what the line should do but were completely baffled by the computer code needed to implement their requirements. One of them asked this memorable question:

"If computers are so smart, why can't they understand what we want?"

This question struck home at the time and has stayed with me ever since. In that time computers have become vastly more powerful than those around when the question was asked, which makes it even more pertinent today. Why can't they understand what we want? Why do we have to translate our needs into a low-level form we find hard to understand, just so the computer can?

The answer lies in language. Computer "languages", not really being languages at all, have no way to express things at a level human beings feel comfortable with.

Actually, that's not quite true. If you use relational databases you'll know all about SQL. Here's a real language; real in that every word in its syntax relates in some way to databases. And to nothing else; it doesn't try to be a Swiss Army Knife in the way Java or JavaScript do. The generic term for this kind of language is Domain Specific Language, or DSL for short. And there aren't enough of them.

Tell it like it is

The gap I mentioned is between the use cases and the code we end up with, and it can often be filled by a DSL. For most domains it's possible to devise a "ubiquitous language" that can be understood both by domain experts and by computers. Like SQL, such a language is dedicated to its own domain (and has little use outside it), is strongly typed and typically rich with real-world objects such as Users, Products or Records, each of which corresponds to a Java bean or a JSON object in a conventional implementation. With such a DSL it's possible to convert the user specifications to an unambiguous description of the entities and processes involved in the domain and run this directly.

There are several advantages to this approach. Firstly there's speed of development, since a great part of the usual development stage is avoided. Secondly there's reliability. DSLs comprise a core that's heavily used - and after a while completely bug-free - plus a set of keyword handlers that follow standard principles in their implementation. In addition, domain experts as well as programmers will have exposure to the code, and as Eric Raymond's "Linus's Law" puts it, "given enough eyeballs, all bugs are shallow". And finally there's long-term maintainability, assured by making it far easier to recruit engineers who can understand the code well enough to take over the project.
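That core-plus-handlers structure can be illustrated with a toy interpreter in JavaScript. This is a sketch under invented assumptions: the three keywords and the sample script are made up for the example, and a real DSL such as EasyCoder is far richer.

```javascript
// One handler per domain keyword. Each follows the same standard shape:
// it receives the remaining words of the statement and the domain state.
const handlers = {
  add:    (words, state) => { state.items.push(words.join(' ')); },
  remove: (words, state) => {
    const name = words.join(' ');
    state.items = state.items.filter(item => item !== name);
  },
  count:  (words, state) => { state.lastCount = state.items.length; }
};

// The core: split the script into statements, look up the first word
// and hand the rest to its keyword handler. This part is written once
// and, being heavily exercised, soon settles down to being bug-free.
function run(script, state = { items: [] }) {
  for (const line of script.split('\n')) {
    const [keyword, ...rest] = line.trim().split(/\s+/);
    if (!keyword) continue; // skip blank lines
    const handler = handlers[keyword];
    if (!handler) throw new Error(`Unknown keyword: ${keyword}`);
    handler(rest, state);
  }
  return state;
}

// A script a domain expert could read and verify directly:
const state = run(`
  add Blue Widget
  add Red Widget
  remove Blue Widget
  count
`);
console.log(state.items);     // ['Red Widget']
console.log(state.lastCount); // 1
```

Adding a feature to such a language means adding one more handler of the same standard shape, which is why later engineers don't need to be domain experts as well as programmers.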

Against this, the DSL has to be built and equipped with sufficient features to do its job properly. This is the job of a specialist programming team, but the techniques are well established and not hard to learn. A fairly comprehensive DSL can be constructed by one programmer in a few months. If enhancements are needed later it's relatively easy for new people to take over because they don't have to be domain experts as well as programmers.

Breaking the cycle of failure

So what I'm saying is this: if you can express the domain requirements in some form of unambiguous English script, don't waste time and resources translating it into "standard" computer code; instead, build a DSL to run the script directly. Some direct gains are:

- the business logic, as expressed by the scripts, is 'owned' by domain experts at all stages of the programme.
- all code is verifiable by anyone having a good understanding of the business requirements.
- programmers are mainly responsible for maintaining the language itself - a small part of the total codebase.
- maintenance and bug fixing become easier and more reliable.
- the long-term integrity of the system is not dependent on maintaining a pool of key engineers.

Of course, the success or otherwise of a project is not due solely to code quality, but we have to start somewhere and fix what can be fixed. Continuing to repeat the mistakes of the past 50 years is not a sensible option.

In the articles that follow I'll go into more detail on some of the topics and issues I've highlighted above.


50 years of the Software Crisis June 1, 2018

The term "software crisis" was coined at the first NATO Software Engineering Conference in 1968 at Garmisch, Germany. At that time, software projects frequently ran over time and budget while failing to work properly, if at all. Maintenance was hard and expensive, so many projects fell into unrecoverable heaps.

50 years on, little has changed in spite of the huge advances in the power of both computer hardware and the software that drives it. Edsger Dijkstra put it succinctly in 1972:

"The major cause of the software crisis is that the machines have become several orders of magnitude more powerful! To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem."

This series of articles examines the problem from the perspective of this author, advances a quite simple theory as to why things are as they are and suggests a means to deal with it.
