Saturday, April 9, 2011

Rebuilding Radio NZ - Part 2: The Birth of ELF

In this second part I'll be talking about the birth of ELF.

Warning

Even though it's a drag, I'm repeating this disclaimer.

There is no such thing as instant pudding. You cannot copy what someone else does and get the same result.

This post is about a specific site with its own special functional requirements and traffic loads. radionz.co.nz is a public broadcaster's website that includes news, audio and content related to on-air programmes. Traffic loads are very peaky (and high).

This series of posts should NOT be taken as advice for or against any particular system. It deals with our specific pain-points and how we are solving them.

You should do your own research and assessment before choosing any CMS or development framework. A good starting point is Strategic Content Management on A List Apart.

Some Management Theory

The manager is responsible for the system in which his staff work. By system, I mean all aspects of the job that contribute to whatever you are producing. The system includes workspaces, office layout, tools, technology, processes and procedures to name a few components.

It is the manager's responsibility to improve the system. In doing so he must understand the difference between problems which are part of the system (built in), and those that are outliers (from outside).

For example, for knowledge workers their computer is part of the system. No one can be productive if their computer keeps failing, is underpowered or does not have the software they need to do their job.

A one-off power cut that stops people working for a day is probably an outlier that needs special attention. (Or it may need no attention at all.)

The system itself (and everything in it) needs to be designed and maintained. There is nothing worse than a free-running system where components essentially design themselves, become sub-optimised, and fail to work together as a whole. It is very common for processes to become run-down over time and no longer be fit-for-purpose.

The aim is to have stable, predictable processes where you can be sure that content moving through the system meets quality expectations when it is finally published. Efficiency and replicability are just two aspects of the equation.

The tools that are used to produce and manage web content play a critical role in the system, and one of my roles is to make sure the tools do not get in the way of creating our content.

It is from that base that we considered the suitability of our current CMS tools.

Cracks in the walls

The Radio NZ website was built from scratch - when we started we had no existing processes to support publishing large amounts of web content, and no web infrastructure. We designed new publishing processes and chose our tools (Matrix and a number of custom scripts) based on those processes (I'll be documenting these in later posts).

These processes have been improved iteratively over time. Some of these changes were facilitated by new features in Matrix, others from internal rearrangement. As well as process improvement, we continued to add new content and functionality to the site.

But from late 2009 we found it increasingly difficult to innovate. The modular approach to building sites in Matrix - the very paradigm that got us off the ground so fast - was slowing us down.

Matrix makes The Hard Things simple. Start a new site, set up a home page, a 404 page; all done in 5 minutes. Change content on an About Us page; 1 minute. Set up a form for people to submit queries; done in 10. Display the same content in three different places, auto-generate menu structures; more complex, but still relatively fast to implement.

But for us, some Simple Things were getting harder to do. We were having to create increasingly complex configurations to optimise the display of our content, and create new ways of viewing it. (Examples of this in subsequent posts).

This was largely because our content was stored as individual assets, rather than as modelled data. Each asset knows nothing about any other asset. For example, an audio asset does not know what programme it was broadcast on. A programme asset (a programme's home page) does not know who the hosts of the programme are. And so on.
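To make the distinction concrete, here is a minimal sketch of what "modelled data" could look like. It is not our actual schema and it is not how Matrix stores assets; the class and field names (Host, Programme, AudioItem) are hypothetical and chosen only to mirror the examples above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str


@dataclass
class Programme:
    title: str
    # A programme knows who its hosts are.
    hosts: List[Host] = field(default_factory=list)
    # Back-reference to audio; left out of repr/compare to avoid cycles.
    audio_items: List["AudioItem"] = field(
        default_factory=list, repr=False, compare=False
    )


@dataclass
class AudioItem:
    title: str
    # An audio item knows which programme it was broadcast on.
    programme: Programme

    def __post_init__(self) -> None:
        # Register with the parent so the link works in both directions.
        self.programme.audio_items.append(self)


# Hypothetical example data.
programme = Programme("Example Programme", hosts=[Host("Example Host")])
clip = AudioItem("Example interview", programme)
host_names = [host.name for host in clip.programme.hosts]
```

With relationships like these, a question such as "who hosts the programme this audio came from?" is a simple lookup rather than a custom configuration.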

Some of the asset structures required to support certain features require huge amounts of work to implement.

On top of this was system performance. We are a media site with fast-changing content and high performance demands and, I think, the only Matrix customer using the system in this way.

Many pages (like our old home page) were built from many pieces, putting a high load on the system when they had to be rebuilt and cached. With frequent publishing we had to expire the cache as often as 10 times an hour.

In order to deal with our high traffic load it was suggested that a custom caching regime be considered. This would allow us to publish updates 5-10 times an hour, and for the content to be recached more efficiently.

We had already made changes to the operation of the cache (see this old post), and they'd been running for several years, so I had a very good understanding of how this part of the system worked. These new changes were unlikely to be of use to other Matrix users, so they would not become part of the core product; if implemented, they would be our responsibility to maintain.
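As a rough illustration of why pages built from many pieces are expensive to recache, here is a small sketch. It is not Matrix's cache, and it is not the regime that was proposed to us; the FragmentCache class and the piece names are hypothetical. It only compares expiring everything with expiring the one piece that changed.

```python
from typing import Callable, Dict


class FragmentCache:
    """Caches rendered page pieces and counts how often pieces are rebuilt."""

    def __init__(self, builders: Dict[str, Callable[[], str]]) -> None:
        self.builders = builders  # piece name -> function that renders it
        self.cache: Dict[str, str] = {}
        self.rebuilds = 0

    def render_page(self) -> str:
        parts = []
        for name, build in self.builders.items():
            if name not in self.cache:
                # Cache miss: this is the expensive step.
                self.cache[name] = build()
                self.rebuilds += 1
            parts.append(self.cache[name])
        return "\n".join(parts)

    def expire_all(self) -> None:
        # Blunt approach: every piece must be rebuilt on the next request.
        self.cache.clear()

    def expire(self, name: str) -> None:
        # Targeted approach: only the named piece is rebuilt.
        self.cache.pop(name, None)


# Hypothetical home page made of three pieces.
page = FragmentCache({
    "news": lambda: "news block",
    "audio": lambda: "audio block",
    "schedule": lambda: "schedule block",
})

page.render_page()
after_first_render = page.rebuilds

page.expire_all()
page.render_page()
after_full_expiry = page.rebuilds

page.expire("news")
page.render_page()
after_targeted_expiry = page.rebuilds
```

In this toy version a full expiry costs one rebuild per piece, while a targeted expiry costs one rebuild in total. Multiply the difference by 10 expiries an hour and a page with many pieces, and the load adds up quickly.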

The cost of working around these two problems (asset modelling and caching) - problems that may not exist with other systems - was deemed too high. Sadly, Matrix was no longer a good fit for our content or our traffic profile. It was time to consider alternatives.

The decision to change was entirely pragmatic and based on changing business requirements. It was a difficult decision to make, especially after a long history with one product.

ELF is born

Looking at our content, and the sort of features we wanted, it was pretty obvious that a lot of custom code would have to be written.

Very few of our pages are the standard 'edit, upload a photo, update the title' type of content. With this in mind I thought it better to have complete control over all the software, rather than bolt 95% of what we wanted onto an existing product.

Rails looked like a good platform to model and deliver content like ours, and had an excellent local (Wellington) community. There are many development houses and government agencies working with Rails.

So Ruby on Rails it was.

An additional factor was the use of the framework on our company intranet. We had developed a number of powerful modules that could be leveraged for the public website. (In practice, I think we saved about 6 weeks' time by recycling existing code.)

The name ELF was chosen after a brain-storming session. ELF stands for Eight Legged Freak (i.e. a spider). It was chosen because a spider lives on the web, and because an Elf has 'magical powers' that benefit its users.

In my next post I'll talk about planning the migration of content and the first section we built and made public: Recipes.
