Website accessibility testing: automated testing vs. human evaluation

  • Author: Access iQ®
  • Date: 28 Aug 2013

A research paper investigating the use of evaluation tools has shown the importance of combining testing methodologies – automated testing, manual expert evaluation and user testing – in order to accurately assess the accessibility of a website.

Access iQ™ spoke to co-author of the research paper and director of Web Key IT, Vivienne Conway, about the research findings.

Access iQ™: In your research paper Benchmarking Web Accessibility Evaluation Tools: Measuring the Harm of Sole Reliance on Automated Tests, you explain the ‘harm’ of bypassing human evaluation in favour of evaluation tools when assessing the accessibility of a website, quantifying it in terms of how many Success Criteria (SC) would be missed. However, how does reliance on evaluation tools alone affect the user?

VC: Actually, I think it is the user who is inconvenienced most when website owners rely on automated tools. Recently I came across a website where automated tools missed keyboard traps that were critical to the use of the website. In fact, the keyboard-only or screen reader user could only tab from the URI to the search field, and the next tab went back to the URI, rendering the website completely useless for them. The automated tools missed this one entirely. As a result, a website owner relying on automated testing only could think that the web page was compliant. In this case the website owner will also lose business, as this was a critical booking page and the user would go to a competing service that was accessible. On top of this, the user could not fulfil the required task and had to go to another service.
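To make this concrete, here is a minimal, hypothetical sketch – in TypeScript, using only standard DOM APIs – of the kind of script that creates such a trap. The element ID and handler are our illustration, not the actual website’s code:

```typescript
// Hypothetical reconstruction of a keyboard trap (illustrative only).
// A script 'manages' focus for a search widget but never lets Tab move on.
const searchField = document.querySelector<HTMLInputElement>('#search');

if (searchField) {
  searchField.addEventListener('keydown', (event) => {
    if (event.key === 'Tab') {
      // Intended to keep focus 'inside' the widget; the effect is a
      // WCAG SC 2.1.2 (No Keyboard Trap) failure that exists only at runtime.
      event.preventDefault(); // swallow the Tab key...
      searchField.focus();    // ...and pin focus to the search field
    }
  });
}
// A static scan of the markup sees an ordinary input element; only tabbing
// through the page (or simulating it) reveals that focus can never move on.
```

Because nothing in the static HTML is wrong, a tool that never simulates keystrokes reports a clean page.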

We need to look after our users, and that means testing as rigorously as possible – automated testing, manual expert evaluation, and testing with users with disabilities and senior citizens.

Access iQ™: One of your findings is that automating as many SCs as possible leads to an increased number of false positives. What are the most common examples of false positives that you found on the websites tested for this research, and why do you think they are the most common?

VC: Where there are large numbers of false positives, it means that the tool mistakenly reports large numbers of problems which are not true violations. The tool says it is an error, but it isn’t. This often happens when a tool, in order to catch as many problems as possible, reports items that are not actually violations as if they were. Taken to the extreme, if everything on the web page were reported as a violation, the tool would catch all of the errors on the page, but there would be so much ‘noise’ that it would be virtually impossible to ascertain whether any given issue was really a violation.

This is one area that is often overlooked by users of automated tools. While website owners are usually aware that automated tools miss many errors or are unable to check certain Success Criteria, they often assume that if an error is reported, it must be true. This is quite often not the case. In our research we found that, on average, 1 out of 3 errors reported by the tools assessed was not a true violation, while the tools also missed large numbers of actual violations.

For instance, let’s look at keyboard traps. Many automated tools report a keyboard trap, but you can’t find out whether one occurs unless you work through the page with a keyboard-only approach. I often find this reported, yet can navigate via the keyboard quite easily. Sometimes it relates to the use of JavaScript, which if implemented correctly can still be accessible.

Language specification is another issue that can be mistakenly reported. Every markup language iteration seems to have different language specification requirements, and many automated tools are not equipped to handle this – often mistakenly reporting a missing language specification that is actually present. Language specification is important for screen reader users, people with cognitive and language disabilities, and people who rely on captions for synchronised media.

Yet another area where we find false positives is the provision of form labels. There are quite a number of SC that relate to form labels and their programmatic association. While forms are notoriously poorly implemented, quite often automated tools report errors where there are none. For instance, you are permitted to use the TITLE attribute where a label cannot reasonably be implemented. While this is somewhat controversial, it most often happens with a search field. Most automated tools will flag this as an error because there isn’t an associated label.

Yet another is ‘Bypass Blocks’ (SC 2.4.1). Technically, if the page is well-structured with headings (Technique H69), that is considered a ‘sufficient technique’, yet most automated tools would say that the page failed this criterion if there were not actually any ‘skip’ links.
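To make the form-label example concrete, here is a sketch – in TypeScript, against standard DOM APIs – of the kind of check an automated tool might run; the logic is our illustration, not any specific tool’s. A naive tool stops after the label test and so flags conforming, title-labelled search fields:

```typescript
// Sketch of an automated form-label check (illustrative, not a real tool's rules).
function hasAccessibleName(input: HTMLInputElement): boolean {
  if (input.labels && input.labels.length > 0) return true; // <label for> or wrapping <label>
  if (input.getAttribute('aria-label')) return true;        // ARIA label
  if (input.getAttribute('aria-labelledby')) return true;   // ARIA reference
  // WCAG Technique H65: a title attribute is acceptable where a visible
  // label cannot reasonably be provided, e.g. a lone search field.
  if (input.getAttribute('title')) return true;
  return false;
}

for (const input of document.querySelectorAll<HTMLInputElement>('input:not([type=hidden])')) {
  // A naive checker that tests only for <label> elements would already
  // have reported an error here – the false positive described above.
  if (!hasAccessibleName(input)) {
    console.warn('Possible unlabelled form control:', input);
  }
}
```

A human evaluator, by contrast, can see that a search field whose title attribute describes its purpose conforms, and move on.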

Access iQ™: How do experts assess which web pages of a website they test for accessibility?

VC: The W3C has a task force working on an evaluation methodology for assessing the accessibility of a website: Website Accessibility Conformance Evaluation Methodology (WCAG-EM) 1.0. This is currently a working draft, but it describes methods for sampling the website to ensure that you assess all of the important pages. It is publicly available from the W3C, and we would like to see people trialling it and reporting back to the Task Force.

We suggest that you first define the scope of your evaluation and understand the nature and purpose of the website. You need to explore the website carefully, and then start by identifying the common pages of the website – pages that are relevant to the whole website, such as the home page, login page, contact page, site map, help and so on. Then look for pages that provide common functionality for the website – a page whose removal from the sample would change the use or purpose of the website, such as a catalogue or shopping cart. Next we identify the variety of web page types – looking for different styles, layout structures, forms, tables, lists, multimedia, scripted pages, pages that rely on templates, and pages that change appearance or behaviour depending on an event or dynamic content. We also look for pages that use different technology to the rest of the website.

So, you need pages that include the following (one way to record such a sample is sketched in code after this list):

  • Common web pages for the website
  • Examples of pages with different templates
  • Instances of distinct types of pages and technologies
  • Pages that are relevant for people with disabilities
  • Pages that form part of complete processes – such as all pages that would be required for a shopping cart function – catalogue, shopping cart, payment, preview of order etc.
  • An optional component is to include a random sample of pages (sometimes 25 per cent of the sample) that aren’t included in any of the above.
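One pragmatic way to keep track of these categories during an audit is to record the sample explicitly. The sketch below, in TypeScript, is our own illustration of that bookkeeping – the structure and example URLs are not part of WCAG-EM itself:

```typescript
// Illustrative structure for a WCAG-EM-style page sample
// (our naming and URLs, not part of the WCAG-EM specification).
interface SamplePlan {
  commonPages: string[];          // home, login, contact, site map, help...
  templateExamples: string[];     // one page per distinct template
  distinctTypes: string[];        // forms, tables, multimedia, dynamic content...
  disabilityRelevant: string[];   // e.g. the accessibility statement
  completeProcesses: string[][];  // every page in a flow, kept together
  randomSample: string[];         // optional, sometimes ~25% of the sample
}

const plan: SamplePlan = {
  commonPages: ['/', '/login', '/contact', '/sitemap', '/help'],
  templateExamples: ['/news/example-article', '/products/example-product'],
  distinctTypes: ['/search', '/events-calendar', '/video-library'],
  disabilityRelevant: ['/accessibility'],
  completeProcesses: [['/catalogue', '/cart', '/payment', '/order-preview']],
  randomSample: ['/about/history', '/news/archive/2013'],
};

// The pages actually audited are the union of all categories.
const pagesToAudit = new Set<string>([
  ...plan.commonPages,
  ...plan.templateExamples,
  ...plan.distinctTypes,
  ...plan.disabilityRelevant,
  ...plan.completeProcesses.flat(),
  ...plan.randomSample,
]);
```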

Choosing your sample isn’t a quick process and needs to be done very carefully, as the validity of the results depends on the sample being representative of the website. AGIMO recommends 10 per cent of the website, but this is not within the financial means of many organisations. For example, if your website is 50,000 pages, that would mean assessing 5,000 pages, which would be astronomically expensive as a manual evaluation. However, a very carefully selected 30-page sample of a moderately sized website could reflect the website’s accessibility and be manageable in terms of time and cost.

Access iQ™: A lot of organisations would argue that it is more financially viable to assess the accessibility of their websites using evaluation tools. Do you think this is the case and why?

VC: You can purchase automated tools for a very reasonable cost – around US$600 for one of the most popular. However, if you rely on the results of the automated test alone, you are not receiving a true reflection of your website’s accessibility and are unable to test many of the SC. When WCAG says that the SC are ‘testable statements’, it is not saying ‘automatically testable statements’. You can check with an automated tool to see if all images have alternative text, but you cannot test with that tool whether it is the right text for the image.

I’ve seen a web page with over 50 images, all with ‘thumbnail’ as the alternative text. This is obviously a ‘fail’, but an automated tool will usually pass this page. You could have alternative text that says ‘Mt. Magnet Gold Mine’ when the photo was taken at another location altogether and the alternative text wasn’t updated when the image changed. I recently saw a federal government department’s page with nine images of different locations, every one labelled ‘front entrance’ – one of them showed a tent where a public meeting was being held; the image had been swapped but the alternative text had not. Links often have the same problem and don’t adequately describe what the link does and where it goes.
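The gap described here – presence versus appropriateness – is easy to see in code. In this TypeScript sketch, the list of ‘suspect’ values is our own assumption, not a rule from WCAG or any particular tool:

```typescript
// What a machine can and cannot verify about alternative text.
for (const img of document.querySelectorAll<HTMLImageElement>('img')) {
  const alt = img.getAttribute('alt');

  // Machine-checkable: the attribute exists.
  if (alt === null) {
    console.warn('Missing alt attribute:', img.src);
    continue;
  }

  // At best, a tool can apply heuristics for obviously suspect values
  // (this list is illustrative only)...
  const suspect = ['thumbnail', 'image', 'photo', 'picture', 'graphic'];
  if (suspect.includes(alt.trim().toLowerCase())) {
    console.warn('Suspicious alt text:', alt, 'on', img.src);
  }

  // ...but whether 'Mt. Magnet Gold Mine' actually describes *this* image
  // is a judgement only a human reviewer can make.
}
```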

However, in saying that, automated tools have a real place in accessibility testing. They can help testers locate trends in errors, as they are often able to crawl an entire website and may find something deeper in the site that a human tester would not reach in their own exploration. They can also flag things the tester needs to explore, such as the use of pointer-specific handlers (usually mouse handlers that aren’t keyboard-accessible).
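A rough sketch of such a flag – again TypeScript over standard DOM APIs, with a heuristic of our own devising – might look like this:

```typescript
// Flag elements that react to the mouse but offer no obvious keyboard path.
// Each hit is a prompt for a human tester, not a verdict.
const nativelyInteractive = new Set(['A', 'BUTTON', 'INPUT', 'SELECT', 'TEXTAREA']);

for (const el of document.querySelectorAll<HTMLElement>('[onclick], [onmousedown], [onmouseover]')) {
  if (nativelyInteractive.has(el.tagName)) continue; // native controls handle the keyboard already

  const focusable = el.hasAttribute('tabindex');
  const hasKeyHandler =
    el.hasAttribute('onkeydown') || el.hasAttribute('onkeyup') || el.hasAttribute('onkeypress');

  if (!focusable || !hasKeyHandler) {
    console.warn('Pointer-specific handler to review:', el);
  }
}
// Note the flag's own blind spot: handlers attached with addEventListener
// are invisible to attribute checks – which is exactly why these results
// need a human to confirm them.
```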

The problem is sole reliance on automated tools, usually justified by the cost and time they save. However, it isn’t a true cost or time saving, as you are still open to complaints and possible litigation. There is also the social responsibility to ensure our websites are accessible for people with disabilities, something that automated tools on their own cannot address. A carefully chosen sample tested manually, together with automated tools used to scan the whole website and the assistance of users with disabilities, is the most cost-effective and reliable method to check your website’s accessibility, in my opinion.

Access iQ™: Do you think using experts alone to test the accessibility of a website is the answer?

VC: I firmly believe that a number of approaches are necessary to test a website, but they all involve knowledgeable testers – whether they are 'experts' or not. You need a good base knowledge of website testing. In WCAG-EM, we state:

Users of the methodology defined by this document are assumed to be knowledgeable of WCAG 2.0, accessible web design, assistive technologies, and of how people with different disabilities use the Web. This includes understanding of the relevant web technologies, barriers that people with disabilities experience, assistive technologies and approaches that people with disabilities use, and evaluation techniques and tools to identify potential barriers for people with disabilities. In particular, it is assumed that users of this methodology are deeply familiar with the resources listed in section Background Reading.

That being said, it doesn’t mean you have to be an ‘expert’ but you do need to be knowledgeable. It is accepted in the accessibility community that you should employ a combination of manual evaluation, automated tools, and testing by users with disabilities and senior citizens.

I’ve heard it said that testing your own website is rather like a university student taking a course, composing the exam, writing the exam, and then marking it themselves. The result would not be at all objective or reliable. If I told the tax department that I didn’t employ a certified auditor as I felt that I could do just as good a job myself, that wouldn’t convince them of the reliability of my findings. External auditing provides added protection and an objective third party, and the testing agency should provide you with recommendations and ‘best practice’ suggestions as to how you can improve troublesome areas.

The best method is to employ a combination of methods, and that way you have the best chance of locating issues that pose real accessibility problems for your users and finding the best solution for fixing the issues.

Vivienne Conway is the Director of Web Key IT Pty Ltd, which provides accessibility services including testing, accreditation, training and consulting. She is an invited expert on a number of W3C working groups, including the Research and Development Working Group and the Evaluation Methodology Task Force.