Content modelling and the future of content delivery

By Richard Jones

In a world filled with billions of internet-connected, content-consuming devices, it’s never been tougher to get your content in front of the right audiences – at the right time and in the right context. 

In this guide we’ll be looking at the critical role that content modelling and structured content will play in rising to this challenge (and why addressing this challenge is so important).  

First though, to understand why content models matter so much, let’s look at how content delivery is changing and the challenges and opportunities this presents to content owners.

The future of content delivery

We’re hitting the mainstream of a new phase of web development where content takes precedence, and where the devices consuming it are not always known to us. 

We’ve already seen the huge adoption of voice-activated speakers, with almost a quarter of UK households now home to a smart assistant device. Open APIs from the likes of Amazon and Google mean that it’s relatively easy to make your content accessible for these new use cases.

Of course, smart speakers are not the only new destinations for our content, and there are many categories of devices that constitute the Internet of Things. But many of these, such as fitness trackers, are data collection devices and so are not content consumers in their own right. 

For the purpose of this guide, we’re not interested in this type of device. Instead, we’ll focus on devices that can make use of the type of data we’d traditionally store in a content management system (CMS). 

I refer to these devices collectively as ‘the unknown consumer’ – a catch-all term for new breeds of content-consuming, internet-connected consumer devices. 

What connects these devices is that:

  • They can’t ‘read’ unstructured content, so information that’s ‘locked’ in a web page geared at web browsers is of no use to them.
  • The context of the information we offer them is important e.g. a smart fridge might be looking to use our content differently to a fitness tracker, so we need to provide content that suits these different types of use cases.
  • We don’t know the capabilities of the unknown consumer i.e. it’s not always possible to know which devices are trying to access our content and why.

Let’s now do some future-gazing to understand the current and future use cases for some of these devices.

The real-time video game 

The simulation genre of video games is capable of utilising very detailed content sources to give the player the most realistic experience possible. In the past, games were locked into the data on their release disc, but today every gaming experience is preceded with some kind of update from a central server. 

To take an example from a fairly recent football game, the game utilises many key metrics about players that could change over time (or could be disputable). But what if we could allow our game to consume data from a third-party source? How would that change the player experience?

At a simple level this would allow us to play a full season whilst the actual results of other matches were kept up-to-date. 

In a more detailed scenario, imagine if a player’s real-life injury on a Saturday meant you could no longer have that player on your team when you came to play the game on the Monday. And imagine a scenario where the real-time fitness and fatigue of a player could be factored into gameplay. Smart, huh?

Could content save a life? 

It goes without saying that medical devices are relied upon to work accurately – but they’re not always without fault. At the moment, the US Food and Drug Administration (FDA), part of the Department of Health and Human Services, and its equivalents around the world maintain a database of devices and their approval status. If a device needs to be recalled, a notification is sent out to the end-user of the device and recorded in the FDA database. 

In this scenario, the purchaser of the device (possibly a buying department in a hospital) and the end user of the device (possibly a front-line paramedic) are not necessarily in daily communication. 

Whilst there are processes in place to ensure the information reaches the right person, what if the device itself could consume data from the FDA database and check if it has been recalled? What if it could (in an extreme situation) shut itself down as a result to prevent harm? Of course there are ethics to consider here, just as there are with the likes of the driverless car, but these scenarios of content consumption could be happening right now.

The driverless car 

It seems very likely that autonomous vehicles will create the same cultural shift that the original automobile was responsible for in the last century. Putting aside the myriad of questions they raise about ethics, employment, and politics (to name a few), driverless vehicles represent new territory from a content perspective. 

Because if a car is driving itself, and you genuinely don’t need to be part of the process, there’ll be new and intriguing opportunities for content consumption. Driverless cars could be designed so that content is presented onboard in a variety of ways. Remember, current assumptions around an individual's relationship with their phone or tablet may not be set in stone.

Defining the business need 

At Inviqa, we believe all technology decisions should be centred around a company’s strategic goals and its users’ needs. For the purposes of this guide, that objective (albeit very wide-ranging) could be the following: 

‘I need to be able to serve content now and in the future to devices whose capabilities and intent may be unknown to me – without relinquishing publishing control’. 

In the rest of this guide, we'll look at how we can meet this goal and we’ll explore the role that content models and structured content will play in that journey.

Why content models matter

Now that we’ve established the opportunity of serving content to unknown, internet-connected devices, we need to think about how we can structure our content in the most flexible way. Because many of tomorrow’s content-consuming devices won’t be able to read the contents of a typical web page.

To understand how best to structure our content we need to consider the following:

Software is not good at interpreting meaning 

Software can struggle to interpret the meaning of your content. Take the example of delivering route guidance to a navigation system. If you try to deliver that information via a human-readable format, it’s not going to work well. 

Sure, it would be doable with smart enough software, but using structured data will be more precise and efficient, and will achieve better results.
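
As a rough illustration of the difference, the sketch below sets a human-readable instruction next to a structured equivalent. The field names (`action`, `road`, `distance_m`) and the roads are assumptions made up for this example, not taken from any real navigation system.

```python
from dataclasses import dataclass

# Human-readable guidance: easy for a person, hard for software to act on.
prose = "After about 300 metres on Station Road, turn left onto the High Street."


@dataclass(frozen=True)
class RouteStep:
    """One instruction in a route. All field names are hypothetical."""
    action: str      # machine-readable manoeuvre, e.g. "continue" or "turn_left"
    road: str        # road the step applies to
    distance_m: int  # distance to travel on that road, in metres


# A similar journey as structured data: nothing needs to be interpreted.
steps = [
    RouteStep(action="continue", road="Station Road", distance_m=300),
    RouteStep(action="turn_left", road="High Street", distance_m=1200),
]

# A consumer can compute with the data directly.
total_distance_m = sum(step.distance_m for step in steps)
turns = [step.road for step in steps if step.action.startswith("turn_")]
```

To get the same answers from the prose version, the consumer would first have to work out what ‘about 300 metres’ and ‘turn left onto’ mean.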

Likewise, if we take the example of a diet tracking app, how well is that app likely to understand a recipe on a web page? Without understanding the context of the recipe information, it’s going to struggle to consume that content effectively.

Web page thinking is limiting 

We’ve already explored some limitations of the traditional web page format in helping us serve the needs of our unknown consumers. But many people still think of the web using the page model. 

With modern HTML, pages may have semantic markup elements that go some way to making sense of the content on the page. But at best these are still signals for interpretation. 

In order to meet our goals we need to move beyond page thinking and start to think of our content in terms of structured data. 

In this way of thinking, the content creator needs to think beyond pages and relinquish control of the design and layout of their content.

Content needs to be liberated from design 

Content needs to be modelled in such a way that the task becomes creating the most-structured, clearest-possible data whilst delegating the design to the content consumer – whether that be the theme layer of a content management system (CMS) or the voice user interface (VUI) deciding which available information should be ‘spoken’ back to the user. 

The blobs vs. chunks debate has been raging for years, but at the beginning it was more a conceptual CMS theory that was important for maintaining consistency in design when using a CMS. The premise of the debate is whether a CMS should allow the user to create pages using a WYSIWYG (‘what you see is what you get’) editor, giving them creative freedom at the expense of structure. 

But with the challenges posed by new breeds of content consumers, it’s become essential to remove or minimise the use of WYSIWYG so that we can offer clean data to these unknown devices – otherwise we are expecting them to parse and interpret irrelevant information. 

Adaptive vs. responsive design

There is also a long-running debate about the pros and cons of adaptive vs. responsive design. The premise here is whether every device requesting a page is presented with the same things and the device itself decides what to do – as opposed to a system that changes what it sends based on the device requesting the information.

This is a fascinating debate covered in depth by Karen McGrane and well worth reading for context to this article. When we consider the unknown consumer we have to assume that, since we don’t know what the consumer will be doing with our content, we have to provide it with everything (but in a way that does not dictate design).

An example of a content model 

Let’s move beyond the theoretical and use a real example. Consider the content type of a news article. At its very lowest level this would be a title, body, and associated metadata to do with publication date and author. 

There’s not much structure here, so we could potentially add an alternative, short-form summary of the body field as well. Once we start adding some context of a real scenario we see that this can quickly escalate.

Now consider the news article as part of a football team’s website. The article itself could be related to a fixture, a player, or a specific team or event. Each of these is a content type in its own right. 

Figure: Illustration of a potential content model for a football news article, with ‘article’ at the centre connected to ‘team’, ‘fixture’ and so on

This type of model means the content consumer can use only the parts of the data that it really needs, and can understand the relationships between different pieces of content. 
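
One way to picture this model is as a set of small, linked types. The sketch below is only an illustration: the class and field names, and the sample teams and people, are assumptions invented for the example rather than the schema of any particular CMS.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Team:
    name: str


@dataclass
class Player:
    name: str
    team: Team


@dataclass
class Fixture:
    home: Team
    away: Team
    kickoff: date


@dataclass
class Article:
    title: str
    summary: str   # short-form alternative to the body
    body: str
    author: str
    published: date
    # Relationships to other content types, each a type in its own right.
    teams: list[Team] = field(default_factory=list)
    players: list[Player] = field(default_factory=list)
    fixtures: list[Fixture] = field(default_factory=list)


united = Team(name="Example United")
rovers = Team(name="Sample Rovers")
striker = Player(name="A. Striker", team=united)
derby = Fixture(home=united, away=rovers, kickoff=date(2030, 1, 5))

article = Article(
    title="Striker fit for the derby",
    summary="A. Striker returns to the squad.",
    body="Full match preview text goes here.",
    author="A. Writer",
    published=date(2030, 1, 3),
    teams=[united, rovers],
    players=[striker],
    fixtures=[derby],
)

# A consumer can take only what it needs and follow the relationships.
related_teams = [team.name for team in article.teams]
```

A consumer that only cares about fixtures can ignore the body text entirely, while one that wants to show a player profile can follow the link from the article to the player and on to the team.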

Defining the content model 

The definition of the content model is an important part of a web development project. Whilst the relationships between content can seem simple at first, we quickly discover complexities in the way content fits together and how we can break down structures into finite elements that link together. 

This needs to be done in a workshop style with stakeholders that truly understand the complexity of their content alongside CMS specialists who understand the possible implications of the decisions being made in terms of complexity and performance.

Content modelling is the foundation of content strategy

Getting the content model right before even starting the process of building a CMS is essential as the content model is the foundation of everything that comes next.

Getting buy-in and a full understanding of the reasons for approaching content this way is essential for the success of a project – and also for the creation of content that will work. 

In the next part of this guide we’ll look at some more practical ideas around content modelling and some of the common pitfalls that we see in CMS implementations. 

Content best practices and anti-patterns

This guide has already explored approaches to content modelling and the relationships between types of content. Now let’s look at content management best practice in a world of unknown consumers.

Single responsibility 

Modern content management systems (CMS) allow us to easily create as many content types as we need and add to these any number of fields. With this flexibility there is no need to force a content type to do more than one thing. 

When creating content models, we like to consider the single responsibility principle: that any content type or field has one purpose and one purpose only. Taking a simple example, if we have an article content type, we would not try to make this behave differently by adding a date field and using it for an event. 

Sometimes, during planning sessions, as the number of possible content types starts to grow, there’s a temptation to map different physical content to content types we’ve already defined in a bid to reduce complexity. 

In reality, we end up doing the opposite, because things get considerably more complicated when you attempt to use a single content type for different purposes. Instead, it’s important to keep content types discrete, specific, and well-named so it’s obvious to an end user which one to use in any scenario. 

Let’s take another example where we are working on a content model where the organisation offers different types of event – webinars and multi-day, onsite workshops, for example.

Here there might be a temptation to create a single event content type as they sound similar on paper. In reality, the content requirements are completely different – one will require a venue, for example, while the other needs a dial-in URL. That’s why two content types are required. 

We also have the same scenario for fields within a content type. If we take the example above, it would be tempting to have a single URL field which has two purposes depending on the type of event: 1) to give us the link to access the webinar and 2) to give us a map to the venue. 

These are exaggerated but not uncommon examples that lead to overly complex content models. The moment we introduce any ambiguity into the system, we are effectively restricting the unknown consumer as it will not understand the context of our content. 
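
To make the event example concrete, here is a minimal sketch of two separate content types, each with fields that mean exactly one thing. The type names, field names, and sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Union


@dataclass
class Webinar:
    title: str
    starts_at: datetime
    dial_in_url: str  # only ever the link used to join the webinar


@dataclass
class Workshop:
    title: str
    start_date: date
    end_date: date
    venue_name: str
    venue_map_url: str  # only ever a link to a map of the venue


def joining_instructions(event: Union[Webinar, Workshop]) -> str:
    """Each type has unambiguous fields, so the consumer never has to guess."""
    if isinstance(event, Webinar):
        return f"Join online: {event.dial_in_url}"
    if isinstance(event, Workshop):
        return f"Attend in person at {event.venue_name}: {event.venue_map_url}"
    raise TypeError(f"Unsupported event type: {type(event).__name__}")


webinar = Webinar(
    title="Content modelling basics",
    starts_at=datetime(2030, 3, 1, 14, 0),
    dial_in_url="https://example.com/join/123",
)
workshop = Workshop(
    title="Content modelling in practice",
    start_date=date(2030, 4, 10),
    end_date=date(2030, 4, 11),
    venue_name="Example Hall",
    venue_map_url="https://example.com/maps/example-hall",
)
```

With a single shared URL field, the consumer would need to know out-of-band whether the link opens a webinar or a map. With two types, the field name carries that context.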

Non-content content 

‘Non-content’ content is a common anti-pattern that’s often driven by design requirements. I’ve heard this type of content referred to as a jump, link, signpost and others, but ultimately it’s always the same: a piece of content that’s simply text, an image, and a link. It has no purpose other than to link to something else. An example is a ‘jump’ that provides a teaser and link to a blog post.

The problem with this type of content is that, while it’s not meant to be treated as real content in its own right, it still gets indexed by search engines. For this reason, it’s important to think about our content in a number of view modes. If we want to link to a blog post, for example, we can render it as a teaser view that provides exactly the same content as our ‘jump’, but displays it in a different format.

The example here shows a news article in teaser mode and full content mode: the same content in different contexts. 

It’s essential to take this into consideration for the unknown consumer for whom context is important. We need to present the content in a form where the device or platform can decide which parts are appropriate to display. For example, we don’t expect a voice user interface (VUI) like Amazon Echo to read the entire contents of an article, but it might be possible for it to read a shortened summary. 
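
A simple way to think about view modes is as named selections of fields over one piece of content. The sketch below assumes three hypothetical modes (`teaser`, `full`, and `voice`) and made-up field names; a real CMS will have its own way of defining these.

```python
# One piece of content, stored once.
article = {
    "title": "Striker fit for the derby",
    "summary": "A. Striker returns to the squad.",
    "body": "Full match preview text goes here.",
    "url": "https://example.com/news/striker-fit",
}

# Each view mode is just a named selection of fields (names are assumptions).
VIEW_MODES = {
    "teaser": ("title", "summary", "url"),
    "full": ("title", "summary", "body", "url"),
    "voice": ("title", "summary"),
}


def view(content: dict, mode: str) -> dict:
    """Return only the fields that the requested view mode exposes."""
    return {name: content[name] for name in VIEW_MODES[mode]}


teaser = view(article, "teaser")
spoken = view(article, "voice")
```

No separate ‘jump’ item is needed here: the teaser is derived from the article itself, so there is only one piece of content to maintain.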

Field ambiguity 

A good content model should not require training for a user to understand it. As with the single responsibility principle I mentioned, all fields within a content type should be obvious to anyone with working knowledge of the content domain they are working within. 

Fields should be clearly labelled with sufficient contextual help text that no one needs to question what the field is for. 

Content for logic 

Another commonly-witnessed anti-pattern is the use of fields in a content type to control some other part of the system. There could be a drop-down field, for example, which allows the content editor to switch between a number of possible layouts. 

Encapsulating logic in this way is confusing as it leads to ‘magic’ happening within the CMS. Our unknown consumer will have no way of knowing what these tricks are and this could lead to the context of the content being changed unintentionally. 

The Godzilla content type 

If the single responsibility principle is followed, this one should never happen, but we often see the monster content type which is all things to all people. It has potentially hundreds of fields and can act as many different things depending on the combination of settings filled in. 

This often happens when a system evolves over time. Fields are easy to add in a CMS and often this might happen as a result of a support request without referring back to the original content model and system design. 

Our unknown consumer will not be able to correctly interpret this complex kind of content and so we must treat the addition of a new field to content as a non-trivial change, no matter how easy it is to do. 

Welcoming in the unknown consumer

We’ve explored some of the anti-patterns we see in CMS-based website builds. Now let’s look at how we can expose our newly-created content model to the outside world so that we can invite others to make use of our content.

Expose your content

The premise we’re exploring in this guide is that there will be (and may already be) new devices that consume your content in ways you don’t know about – for use cases that are still being conceived. 

In the traditional content model, you expose your content as HTML with the assumption that it will be consumed by a web browser. You make certain decisions about layout and presentation depending on the capabilities of the rendering engine. 

But in the new scenario we are discussing, we are not concerned with layout but with pure, structured data, because many of these new devices won’t be able to read the web pages we traditionally generate from a content management system (CMS).

The preferred approach is to expose an application programming interface (API) that can be queried. The dictionary definition of API feels a little outdated for the purposes of today but the principles are exactly the same as they have been for many years:

API (noun): a set of functions and procedures that allow the creation of applications which access the features or data of an operating system, application, or other service.

Design an API

Designing an API for the purposes of exposing your content is very much like designing the content model in the first place. You need to provide methods for systems to quickly and efficiently search for content, and then to extract the specific parts of the content model required.
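
The two capabilities described here, searching and then extracting specific parts, can be sketched with an in-memory stand-in for the content store. The function names `search` and `fetch`, the ids, and the field names are all assumptions for the sake of the example; a real API would sit behind HTTP and a proper data store.

```python
from typing import Optional

# A stand-in for the content store; ids and fields are hypothetical.
CONTENT = {
    "article-1": {"type": "article", "title": "Cup final preview",
                  "summary": "What to expect.", "body": "Full preview text."},
    "article-2": {"type": "article", "title": "Transfer round-up",
                  "summary": "Who moved where.", "body": "Full round-up text."},
    "fixture-1": {"type": "fixture", "title": "Cup final",
                  "summary": "Kick-off details."},
}


def search(content_type: str, text: str) -> list[str]:
    """Return ids of items of the given type whose title contains the text."""
    needle = text.lower()
    return [
        item_id
        for item_id, item in CONTENT.items()
        if item["type"] == content_type and needle in item["title"].lower()
    ]


def fetch(item_id: str, fields: Optional[list[str]] = None) -> dict:
    """Return one item, optionally restricted to the requested fields."""
    item = CONTENT[item_id]
    if fields is None:
        return dict(item)
    return {name: item[name] for name in fields if name in item}


matches = search("article", "cup")
headline_only = fetch(matches[0], fields=["title"])
```

Letting the client name the fields it wants means a voice interface can ask for a title and summary while a web client asks for everything, without either needing a bespoke endpoint.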

Since you won’t necessarily be in communication with those who are using your content, it’s important to provide clear, concise documentation so that they understand exactly what you offer.

One of my favourite tools for this process is Apiary which allows you to create an API model in a prototype form and to provide mock responses for testing. This can become your blueprint and documentation so that anyone who wants to make use of your content is fully aware of what the capabilities are.

Once the blueprint is agreed, you can build the API in your preferred CMS making sure it adheres to the contract the API implies.

You also need to respect that other people are building a dependency on your content, and therefore you should be careful about versioning of the API in the future to ensure backwards compatibility. 
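
One common way to protect existing clients is to keep each published version of a response frozen and add new shapes under a new version. The sketch below is one possible approach under that assumption; the version labels and field names are invented for the example.

```python
article = {
    "title": "Striker fit for the derby",
    "summary": "A. Striker returns to the squad.",
    "body": "Full match preview text goes here.",
    "author": {"name": "A. Writer", "role": "Reporter"},
}


def serialise_v1(item: dict) -> dict:
    """The original contract: author is a plain string. Kept unchanged."""
    return {
        "title": item["title"],
        "body": item["body"],
        "author": item["author"]["name"],
    }


def serialise_v2(item: dict) -> dict:
    """A later contract: adds the summary and a structured author."""
    return {
        "title": item["title"],
        "summary": item["summary"],
        "body": item["body"],
        "author": dict(item["author"]),
    }


SERIALISERS = {"v1": serialise_v1, "v2": serialise_v2}


def respond(item: dict, version: str = "v1") -> dict:
    """Serve the requested version; unknown versions are rejected explicitly."""
    if version not in SERIALISERS:
        raise ValueError(f"Unsupported API version: {version}")
    return SERIALISERS[version](item)
```

A client built against the first version keeps receiving the shape it was written for, even after the content model behind it has grown.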

Use new business models

When thinking of a traditional website there are a number of tried-and-tested methods of monetising content. That might be a paywall block or an alternative, free version of the article.

An API approach provides you with a more controlled way of metering your content. A few examples might be:

  • Metered access: allowing API clients to access a certain number of articles, or have full access for a specific period of time.
  • Differential access: allowing API clients to access some fields in the content model and others at a premium access level – free teasers, for example, with full content at a premium.
  • Pay per view: allowing API clients to trade credits for paid access, one article at a time.
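
Two of these models are simple to sketch once content is structured. The example below shows differential access as a filter over fields and metered access as a per-client counter. The tier names, field names, and the `Meter` class are hypothetical, and a real implementation would also need authentication and persistent storage.

```python
article = {
    "title": "Striker fit for the derby",
    "summary": "A. Striker returns to the squad.",
    "body": "Full match preview text goes here.",
}

# Differential access: which fields each tier may see (names are assumptions).
FREE_FIELDS = {"title", "summary"}
PREMIUM_FIELDS = FREE_FIELDS | {"body"}


def visible_fields(item: dict, tier: str) -> dict:
    """Return the teaser fields for free clients, everything for premium."""
    allowed = PREMIUM_FIELDS if tier == "premium" else FREE_FIELDS
    return {name: value for name, value in item.items() if name in allowed}


class Meter:
    """Metered access: each client may fetch a fixed number of articles."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used: dict[str, int] = {}

    def allow(self, client_id: str) -> bool:
        """Record one request and report whether it is within the limit."""
        count = self.used.get(client_id, 0)
        if count >= self.limit:
            return False
        self.used[client_id] = count + 1
        return True


free_view = visible_fields(article, "free")
premium_view = visible_fields(article, "premium")
```

Because the teaser and the full body are separate fields, the paywall becomes a decision about which fields to return rather than a block laid over a page.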

Being flexible with the content models you provide will encourage the development of consumers for your content. With growing pressure on the advertising revenue associated with content, providing value via enhanced, ‘naked’ content could be an effective alternative business model. It also allows you to try different models with less complexity than you might face with a traditional website. 

Create a content legacy

This guide has been concerned with dealing with the unknown – but it’s also important that you are responsible with your content and the parts of the system that are under your control.

In the space of the last two decades we have created and made obsolete a huge number of data storage formats. 

Physical formats are one thing, but data formats are also a concern. Can we still read a graphics file from an obsolete application last used in 2003?

Even worse is the current tendency to replatform and rebuild your web presence every few years. In my experience, websites have a lifespan of 2-5 years due to changes in technique, design, and technology. 

However, the content we produce should last forever. We should not be relying on the heroic efforts of the Internet Archive to preserve the content legacy of the 21st century. It’s far too easy to consider content as disposable, and if we do, we will suddenly find a decades-wide gap in our collective knowledge.

We can’t predict the future, but we can prepare for it. By preserving your content in a pure form – that can be accessed easily by future consumers – you’re creating a legacy of content that can be presented in whatever formats the future holds.