Semantic markup, page titles and language: accessibility for developers

  • Author: Access iQ ®
  • Date: 1 Feb 2013

Quick facts

Semantic markup means structuring markup so that it adds to and supports the meaning of the text, complies with web standards and helps communicate more than the text could on its own.

  • Support meaning through markup
  • Use the right language for the job
  • Provide descriptive and unique page titles
  • Use the lang and xml:lang attributes to identify the language of the page
  • Include WAI-ARIA where practicable

The W3C has developed a series of technical specifications and guidelines it has adopted as Standards, which together define an Open Web Platform 'that has the unprecedented potential to enable developers to build rich interactive experiences, powered by vast data stores, that are available on any device'.

On the internet there are two basic types of content:

  • Documents: primarily focussed on presenting information.
  • Applications: focussed on functions, transactions, and interactions.

There is often some overlap, for instance when an application has a large information page or a document includes an interactive form, but on the whole most pages fall clearly on one side or the other. Markup languages like HTML and XML give structure and meaning to a document. Semantic markup means structuring markup so that it adds to and supports the meaning of the text, complies with web standards and helps communicate more than the text could on its own.

Web standards encourage the separation of presentation from meaning, and of structure from content. The standards state that presentation, meaning and functionality should remain as separate as possible, partly through the different languages used for each: CSS, HTML and JavaScript respectively. This is why, for example, <em> (for emphasis, which denotes a change in meaning or status) is more semantic than <i> (for italic, which denotes a change in presentation or style). Semantic markup prefers that instructions related to presentation come from cascading style sheets (CSS), either in an external style sheet or in a <style> element in the <head> of the document, and not from the HTML itself.
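
As a minimal sketch of this separation, the page below keeps meaning in the HTML and styling in a <style> element in the <head>. The class name, text and styles are illustrative assumptions, not taken from any particular site.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Separating meaning from presentation</title>
  <style>
    /* Presentation lives in CSS: how emphasis looks is a design decision */
    em { font-style: italic; }
    /* Hypothetical class for purely visual styling with no change in meaning */
    .tagline { font-style: italic; }
  </style>
</head>
<body>
  <!-- <em> changes the meaning: the word is stressed -->
  <p>You <em>must</em> save the form before closing the window.</p>
  <!-- Styling only: the meaning of the text is unchanged, so no <em> or <i> -->
  <p class="tagline">Accessible by design.</p>
</body>
</html>
```

Both paragraphs may look similar on screen, but only the first tells user agents that a word carries emphasis.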

Semantic markup also requires appropriate and correctly formatted meta information. Because it is machine-friendly, this information helps web content to be found, understood, indexed and inter-related, and so contributes to a semantic web. Some meta information, like page titles and document language, provides meaning that helps not only understanding but also usefulness. Some of these requirements are very easy to meet, taking only a few minutes to apply to an entire large website, yet have lasting benefits for the site's place on the internet.

Following web standards and using semantic markup will, by and large, enhance accessibility. This is not because accessibility is built into web standards but because standards-compliant semantic markup is easier for machines to read, including browsers, search engines and the assistive technologies used by people with disabilities. Good semantic markup also allows better keyboard navigation, making the site easier for everyone to use.

Because WCAG 2.0 is published by the W3C through its Web Accessibility Initiative (WAI), web standards (which are also managed by the W3C) support the four Principles of Accessibility that require web content to be:

  • Perceivable
  • Operable
  • Understandable
  • Robust

Guideline 1.3 Adaptable states:

Create content that can be presented in different ways (for example simpler layout) without losing information or structure.

The way to follow this guideline as a developer is to ensure markup is standards-compliant and semantically correct. WCAG 2.0 suggests several specific techniques for doing just that.

Using semantic markup

Semantic markup requires web content to be marked up according to its meaning rather than to manage visual presentation. It also requires content elements to be structured in order to support and enhance meaning.

This allows assistive technology to interpret the content and convey its meaning to, for example, someone using a screen reader.

It may be that in your organisation, responsibility for this rests with the people entering the content, who ensure text content is marked up semantically, or with the web designers, who may create the semantic page structures that support the content. Not all web teams rigidly separate development from content authoring.

Part of your job may be to set up the authoring, management and design systems to bring the content to the webpage, in which case you will also have some level of responsibility for ensuring semantic markup is used, through templates and content editors. Exactly what markup is used for each situation is a matter of professional judgement, but the following techniques are available to you.

Support meaning through markup

Check the content for elements that have a semantic function, where the meaning of the content can be supported by HTML. Then check that the markup used is semantically appropriate, that is, it supports and enhances the meaning rather than only affecting the visual presentation.

For page structure, use appropriate block elements that can be understood and rendered by user agents and assistive technology. This requires marking up paragraphs with <p>, lists with <ol> and <ul>, headings with <h1> through to <h6> (following the numeric hierarchy), short quotations with <q>, long quotations with <blockquote>, and so on.

This should be done in combination with appropriate elements that affect text. HTML specifications give you a list of elements affecting text and how to use them semantically, such as the phrase elements <em>, <strong>, <cite>, <dfn>, <code>, <samp>, <kbd>, <var>, <abbr> and <acronym> (note that <acronym> is obsolete in HTML5, where <abbr> is used instead). Each of these applies semantic meaning to the content it contains, and can be given the preferred visual presentation using cascading style sheets (CSS).

<p>Our <cite>Style Manual</cite> said to insert this string: <code class="jscript"> document.write(a + ' ' + b);</code>, but doesn't make clear exactly <em>where</em> to insert it.</p>

User agents have default settings for these elements but you can choose to manage the presentation of them through CSS, applying specific styles appropriate to the design.
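
The block and phrase elements described above can be combined as in the following sketch. The headings, text and style rules are made-up examples, shown only to illustrate how the elements fit together.

```html
<style>
  /* Optional: override the user agent defaults for phrase elements */
  kbd, samp, code { font-family: monospace; }
  cite { font-style: italic; }
</style>

<!-- Headings follow the numeric hierarchy: h1, then h2, with no skipped levels -->
<h1>Keyboard shortcuts</h1>
<p>Shortcuts let you work <em>without</em> a mouse.</p>

<h2>Saving your work</h2>
<p>Press <kbd>Ctrl</kbd> + <kbd>S</kbd> to save. The editor replies with
  <samp>Document saved</samp>.</p>

<h2>What the manual says</h2>
<!-- Long quotation as a block, short quotation inline -->
<blockquote>
  <p>Save early and save often.</p>
</blockquote>
<p>The <cite>Style Manual</cite> calls this <q>the first rule of editing</q>.</p>
```

With the <style> element removed, the fragment should still read correctly, because the structure is carried by the HTML alone.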

There will be occasions where variations in the way text is presented should be supported and reinforced by explicit additions to the text.

For example, if a list of products in an online store displays new entries in a bold font through the application of the <strong> tag, then that status should also be indicated through text information, such as placing the word "new" in parentheses after the title.

This ensures that information about which products are new is not conveyed by visual presentation only.
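
A minimal sketch of the product list described above might look like this. The product names are invented for illustration.

```html
<ul>
  <!-- New products are bold AND labelled in text, so the status
       does not depend on visual presentation alone -->
  <li><strong>Trail running shoes</strong> (new)</li>
  <li>Canvas sneakers</li>
  <li><strong>Leather boots</strong> (new)</li>
</ul>
```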

Supporting meaning by applying semantic markup allows browsers to display information appropriately, while allowing screen readers to convey the information in meaningful ways to people with vision impairment. For example, structuring lists correctly using <ul>, <ol> and <dl> ensures screen reader users will receive correctly structured and described information and understand the difference between these list types.
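
As an illustration of the three list types, the following sketch uses made-up content. The choice of element depends on whether order matters or whether the items are terms with descriptions.

```html
<!-- Unordered list: the order of items carries no meaning -->
<ul>
  <li>Flour</li>
  <li>Eggs</li>
  <li>Milk</li>
</ul>

<!-- Ordered list: the sequence matters -->
<ol>
  <li>Whisk the eggs and milk.</li>
  <li>Fold in the flour.</li>
  <li>Cook in a hot pan.</li>
</ol>

<!-- Definition list: terms paired with their descriptions -->
<dl>
  <dt>Whisk</dt>
  <dd>To beat ingredients quickly to add air.</dd>
  <dt>Fold</dt>
  <dd>To combine ingredients gently without losing air.</dd>
</dl>
```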

Use the right language for the job

You should not use presentational markup to convey semantic meaning. An example is using a combination of text size and font weight to make text look like a heading, rather than applying a heading element (<h1> to <h6>). Without the heading element, a screen reader will treat the text as decorated text only, and not as a heading that forms part of the page structure.
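
The contrast can be sketched as follows. The heading text and inline styles are illustrative only.

```html
<!-- Avoid: looks like a heading, but is only styled paragraph text -->
<p style="font-size: 1.5em; font-weight: bold;">Opening hours</p>

<!-- Prefer: a real heading, which can still be styled through CSS -->
<h2>Opening hours</h2>
```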

Make sure you also use semantic markup when colour is used to give text a particular status or function. Colouring emphasised text red may work for sighted users, but screen readers will not convey that the text has any special status. It is far better to use semantic elements to convey what the red text conveys to sighted users. To convey meaning to people using assistive technology, you cannot rely on visual presentation alone, and must therefore use semantic markup.
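
One way to sketch this, assuming a hypothetical warning message and class name, is to layer the colour on top of a semantic element and to state the status in the text as well:

```html
<style>
  /* Colour is added on top of the semantic element, not instead of it */
  strong.warning { color: #b30000; }
</style>

<!-- Avoid: the status is conveyed by colour alone -->
<p><span style="color: red;">Your session is about to expire.</span></p>

<!-- Prefer: semantic element plus explicit text, with colour as reinforcement -->
<p><strong class="warning">Warning: your session is about to expire.</strong></p>
```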

Additionally, the days of using table layouts to visually arrange the elements on a page are long behind us. Modern HTML, XHTML and HTML5, combined with effective CSS, are capable of doing more than was previously possible with <table> elements. Using tables for layout is an abuse of semantic markup; tables should be reserved for tabular information only.
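
For comparison, a table used for genuinely tabular information might be marked up as in this sketch. The data is invented for illustration.

```html
<table>
  <caption>Opening hours</caption>
  <thead>
    <tr>
      <!-- scope tells assistive technology which cells each header describes -->
      <th scope="col">Day</th>
      <th scope="col">Opens</th>
      <th scope="col">Closes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">Monday</th>
      <td>9:00</td>
      <td>17:00</td>
    </tr>
    <tr>
      <th scope="row">Saturday</th>
      <td>10:00</td>
      <td>14:00</td>
    </tr>
  </tbody>
</table>
```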

A good way to test a page is to review it with CSS and JavaScript off, and even without images, to see if the meaning of the content is well supported. You should be able to understand the document, including its reading order and meaningful landmarks, from the HTML alone.

Page titles

Page titles are an important navigational element for all users. When you visit a webpage, the page title should accurately and concisely describe the content you will find.

Page titles are used in many different contexts. A page title appears in the title bar at the top of the browser window and, in tabbed browsing, identifies the webpage that is open within each tab. Page titles are also used in browsing history, bookmarking, sharing via social media and search result listings. They also play an important role in search engine rankings.

Clear page titles benefit everyone. A sighted user may glance across a number of tabs and read the page titles to find the webpage they want to return to. A vision-impaired person can use their screen reader to read out each page title to do the same.

Although the responsibility for creating page titles will primarily lie with the content manager, there are certain considerations for developers when building a website.

Descriptive page titles

When developing a website, you may be required to create page titles for webpages within the website or, more likely, to ensure that the content management system (CMS) or other web software that generates page titles, either partially or fully, meets the accessibility requirements.

Page titles must describe the topic or purpose of a webpage to meet Success Criterion 2.4.2. They should give the user sufficient context about what is on the webpage and where they are in relation to the website.

Website context

For large websites, it can be helpful to include information about the website or sub-site to which a webpage belongs, to give added context for the user. This is described in Technique G88: Providing descriptive titles for webpages.

If page titles are automatically generated, either fully or partially, consider automatically including the name of the website or sub-site in the page title.
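
A title that carries website context might be sketched as follows. The site name, section name, ordering and separator are assumptions for illustration, not requirements of the technique.

```html
<!-- Page topic first, then the sub-site, then the website -->
<title>Returns policy - Customer help - Example Store</title>
```

A second page in the same section would keep the shared context but change the leading part, for example "Delivery times - Customer help - Example Store", so that the two titles remain distinguishable.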

Unique page titles

The same technique also notes that it is useful if page titles are unique. If two pages with different content share a title, it will be difficult for a person to tell them apart, and chances are the title does not accurately reflect the topic or purpose of either page.

Be mindful that content management systems or other software that creates dynamic content for the web may automatically generate page titles that aren't unique. This should be corrected within the web environment so that automatically generated page titles are unique and descriptive.

Document language

Identifying the primary language of a webpage enables assistive technologies to correctly present the language to the user. For example, screen readers or text-to-speech engines that support multiple languages will use this information to voice the text with the appropriate accent and pronunciation.

You can easily and quickly set the default language of the page using the lang or xml:lang attribute for HTML and XHTML respectively.

Using the lang and xml:lang attributes

Success Criterion 3.1.1 Language of Page states:

The default human language of each webpage can be programmatically determined. (Level A)

The default language of the webpage can be specified using the lang attribute for HTML documents, or xml:lang if you are using XHTML 1.1. The value of the lang or xml:lang attribute must conform to BCP 47: Tags for the identification of languages; refer to the commonly used language codes for typical values. If the webpage includes content in multiple languages, then the default language should be the language that is most used on the webpage.

Any content within the webpage that is in a different language to the default language must also be identified appropriately. If you are responsible for uploading content, refer to the topic on using foreign words or phrases within a webpage in our guide to web accessibility for content authors for more information.
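
As an illustration of identifying both the default language and a passage in another language, the sketch below uses an HTML5 document. The language tags and page content are assumptions chosen for the example.

```html
<!DOCTYPE html>
<!-- Default language of the page: Australian English -->
<html lang="en-AU">
<head>
  <meta charset="utf-8">
  <title>Lunch menu - Example Bistro</title>
</head>
<body>
  <!-- The French phrase is marked so a screen reader can switch pronunciation -->
  <p>Today's special is <span lang="fr">coq au vin</span>.</p>
</body>
</html>
```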

The following example defines the content of an XHTML 1.0 document served with a content type of text/html to be in US English. Both the lang and xml:lang attributes are specified in order to meet the requirements of XHTML and provide backward compatibility with HTML.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US"> … </html>

Include WAI-ARIA where practicable

Accessible Rich Internet Applications (ARIA) is a set of attributes that can be added to elements to convey meaning, functionality and structure to assistive technologies like screen readers, which in turn can help people with disabilities better understand the content on the page. A great benefit of using ARIA landmark roles is that browsers that do not understand them simply disregard them.

Of particular interest for semantic markup are the landmark roles used to label specific locations on the page that a user of assistive technology can navigate to and interact with.

For the purpose of semantic markup, you should be familiar with the ARIA landmark roles, which include banner, navigation, main, complementary, contentinfo and search.

Adding a role is simple and uses the same syntax as adding any other attribute to an element:

<div role="main"> ... </div>

In the case of HTML5, if you use <nav> or other new elements, you can still add the ARIA role with no interference:

<nav role="navigation"> ... </nav>
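
Putting several landmark roles together, a page skeleton might be sketched as below. The choice of elements and the placeholder comments are assumptions for illustration; the role values are standard ARIA landmark roles.

```html
<body>
  <!-- Site-wide header: logo, site name -->
  <header role="banner"> ... </header>

  <!-- Main navigation links -->
  <nav role="navigation"> ... </nav>

  <!-- Site search form -->
  <div role="search"> ... </div>

  <!-- The primary content of the page; use only one main landmark -->
  <div role="main"> ... </div>

  <!-- Supporting content related to the main content -->
  <aside role="complementary"> ... </aside>

  <!-- Footer information such as copyright and contact links -->
  <footer role="contentinfo"> ... </footer>
</body>
```

Assistive technology that supports landmarks can then let a user jump directly to any of these regions.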