Information Society Technology

Aggregation model for test results

Introduction

For automatic and/or expert evaluation it is important that the individual assessments can be aggregated into an end result that is easy to understand and has a clear interpretation. Test results can be aggregated at different levels, such as the checkpoint level or the test level.

One metric that can also be of interest to policy makers is the accessibility barrier probability. The European Commission regulation 808/2004 concerning community statistics for the information society explicitly states that one characteristic to be provided is barriers to the use of ICT, Internet and other electronic networks, e-commerce and e-business processes. This section describes a model for calculating the accessibility barrier probability for single Web pages and Web sites.

Approach

Web accessibility evaluation can be viewed as a three-stage process. Figure 2 and Figure 3 summarise this process. The notation used in the figures and throughout this section is introduced in: Definitions and mathematical background.

In the first stage, W3C's Evaluation and Report Language (EARL) is used as a standardised format for collecting and conveying test results from accessibility assessment tools according to any given standard. The model also supports the weighting of the test reports according to their error probability such that the contribution of tests with lower confidence to the overall result is reduced.

The second stage aggregates the individual test results into one comprehensive figure for a Web page. The calculation is based on a statistical model (the User Centric Accessibility Barrier model, UCAB, introduced in: Mathematical background: The UCAB model). The underlying assumption is that the accessibility barriers within a Web page accumulate.

To present the results to the public, e.g. people who experience barriers (such as people with disabilities) or users of the data for planning and development purposes (such as policy makers and stakeholders), the third stage of Web accessibility evaluation needs to provide means for interpreting the results. This includes statistical analyses of the findings and the presentation of average values for a Web site or for groups of Web sites by geographical region or business sector. This stage can be viewed as the “business logic” for estimating the accessibility barriers. The reporting of findings is covered in Reporting of test results and Scorecard report.

Definitions and mathematical background

In this section we explain the main concepts involved in the modelling of Web accessibility barriers. Subsequently we show how they can be transferred into a statistical model and introduce the UWEM User Centric Accessibility Barrier Model (UCAB).

Definitions

Web page

A resource on the web as defined by section 4.

Barrier

An accessibility barrier is modelled as a product failure caused by an incompatibility between the needs of a disabled user and product functionality. The incompatibility is caused by the web page, i.e., it is not the user's fault.

Barrier type

A barrier type is related to a test procedure (as described in: Tests for conformance evaluation) and provides a unique interpretation of its result. For example, a non-text element can constitute a barrier in several ways. One way is described by barrier type 1.1_HTML_01 ("alt attribute is missing").

Barrier types are an integral part of barrier modelling because they have two important properties:

  • Barrier types can be measured objectively with reproducible results.
  • The results of the measurements can be aggregated.

Additionally, it is assumed that a test gives the same result whenever the same element is checked more than once, so that the same barrier type is reported every time and the same barrier probability applies every time the element is tested.

Accessibility

Under the scope of this section, accessibility is defined as the absence of barriers within the Sampled Resource List.

Notation

The following notation is used to refer to the quantities that are involved in the calculations.

The results from each test in stage one are given by a report $R_{pb} \in \{0, 1\}$, where p is a page and b a barrier type. $R_{pb} = 1$ means that the test for barrier type b failed, whereas $R_{pb} = 0$ means that the test for barrier type b passed. For expert evaluation these results will usually be given in a tabular test report.

Automatic evaluation has the capacity to assess all elements within the page p. In this case the result can also be given as the ratio of the number of failed tests $B_{pb}$ to the number of all relevant elements $N_{pb}$ for barrier type b:

$$ R_{pb}^{automatic} = \frac{B_{pb}}{N_{pb}} $$

If the confidence level of a test procedure is known, the reports can be adapted to reflect this. Let $P_b^{fp}$ denote the probability that the test for b yields a false positive and $P_b^{fn}$ the probability that the test for b yields a false negative. The following report includes confidence weighting:

$$ R_{pb}^{expert} = \begin{cases} P_b^{fp}, & \text{test } b \text{ passed} \\ 1 - P_b^{fn}, & \text{test } b \text{ failed} \end{cases} $$

If ratio reports are used, the calculation changes to

$$ R_{pb}^{automatic} = \frac{B_{pb}}{N_{pb}} (1 - P_b^{fn}) + \frac{N_{pb} - B_{pb}}{N_{pb}} P_b^{fp} $$

The barrier probability of barrier type b is given by $F_b \in [0, 1]$. The barrier probabilities constitute a fixed set of parameters. A small value of $F_b$ means that a disabled user who encounters a barrier of type b is unlikely to experience an accessibility problem. As a starting point, all parameters are set to the same value, $F_b = 0.05$ for all b. The parameters are being validated and will be tuned within EIAO with regard to automatic evaluation for future versions of UWEM.

The result of the aggregation is interpreted as the accessibility barrier probability: $F_p$ is the probability that the Web page p constitutes an accessibility barrier for a disabled user.

Example (Confidence weighting)

The goal is to estimate the proportion of images that don't have an appropriate alternative text. The inspected web page contains ten image elements with alternative text. The (stage one) evaluation reports that for three of the images the alternative text is not appropriate.

The barrier ratio is:

$$ R_{pb} = \frac{3}{10} = 0.3 $$

Suppose that the probability of false negatives (i.e. false "fail" results) is small because it is relatively easy to recognise suspicious image descriptions like file names or placeholder text: $P_b^{fn} = 0.01$. On the other hand, the probability of false positives (i.e. false "pass" results) is higher because sometimes the whole page context needs to be taken into account to determine whether the description is appropriate: $P_b^{fp} = 0.1$.

The estimate is adapted accordingly:

$$ R_{pb} = \frac{3}{10} \cdot (1 - P_b^{fn}) + \frac{10 - 3}{10} \cdot P_b^{fp} = \frac{3}{10} \cdot 0.99 + \frac{7}{10} \cdot 0.1 = 0.367 $$
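The two confidence-weighted report formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function and parameter names (`expert_report`, `automatic_report`, `p_fp`, `p_fn`) are assumptions made for this example and are not defined by UWEM.

```python
def expert_report(test_failed: bool, p_fp: float, p_fn: float) -> float:
    """Confidence-weighted report for a single pass/fail result.

    p_fp: probability of a false "pass"; p_fn: probability of a false "fail".
    """
    # A "fail" is kept with weight 1 - p_fn; a "pass" still contributes p_fp.
    return 1.0 - p_fn if test_failed else p_fp


def automatic_report(failed: int, total: int,
                     p_fp: float = 0.0, p_fn: float = 0.0) -> float:
    """Confidence-weighted ratio report: B/N * (1 - p_fn) + (N - B)/N * p_fp."""
    if total <= 0:
        raise ValueError("total must be a positive number of elements")
    if not 0 <= failed <= total:
        raise ValueError("failed must lie between 0 and total")
    fail_ratio = failed / total
    return fail_ratio * (1.0 - p_fn) + (1.0 - fail_ratio) * p_fp


# Values from the confidence-weighting example: 3 of 10 images fail.
r_pb = automatic_report(failed=3, total=10, p_fp=0.1, p_fn=0.01)
print(round(r_pb, 3))  # 0.367
```

With both error probabilities left at zero, `automatic_report` reduces to the plain ratio of failed tests to relevant elements.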

Mathematical background: The UCAB model

The main purpose of aggregating the reports from stage one is to model the experience of a disabled user trying to access a Web site. The goal is to determine the probability that the user cannot complete a task because of the accessibility barriers they encounter. Depending on their severity, the barriers might either stop the disabled user right away or prevent the completion of the task because too much time and effort are required. To address this goal we introduce the User Centric Accessibility Barrier Model (UCAB).

There are two main statistical assumptions underlying the UCAB model:

Independence of barrier occurrences
Each test passes or fails independently of every other test (i.e. the reports $R_{pb}$, treated as random variables, are mutually independent).
Barriers within a Web page accumulate
Each barrier that a disabled user encounters within a Web page reduces the overall accessibility, i.e. increases the accessibility barrier probability $F_p$. This is modelled as the probability that the user encounters any barrier within the Web page.

Let A and B be two independent events. Then the probability that A or B occurs is given by:

$$ P(A \cup B) = P\bigl(\overline{\overline{A} \cap \overline{B}}\bigr) = 1 - P(\overline{A})\,P(\overline{B}) = 1 - (1 - P(A))(1 - P(B)) $$

where $P(A)$ denotes the probability of A and $\overline{A}$ is the complementary event of A.
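The same argument extends to any finite number of mutually independent events, which is the form needed when more than two barrier types are aggregated. The following is a standard probability identity, written out here for convenience; the event names $A_1, \dots, A_n$ are introduced only for this illustration.

```latex
\[
P\Bigl(\bigcup_{i=1}^{n} A_i\Bigr)
  = 1 - P\Bigl(\bigcap_{i=1}^{n} \overline{A_i}\Bigr)
  = 1 - \prod_{i=1}^{n} P(\overline{A_i})
  = 1 - \prod_{i=1}^{n} \bigl(1 - P(A_i)\bigr)
\]
```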

Stage 2: Accessibility Barrier Probability Fp

The accessibility barrier probability $F_p$ models a product failure caused by an incompatibility between a disabled user's needs and product functionality. It is assumed that a failure mode reported by a test procedure will introduce an accessibility barrier with some known probability $F_b$.

Application of the UCAB model yields the following formula (notation as described in the Notation section):

$$ F_p = 1 - \prod_{\text{all barrier types } b} (1 - R_{pb} F_b) $$

Example (Fp for single Web page):

The (stage one) evaluation of a Web page yielded reports that were generated by three different tests:

In detail, the reports might look like this: $R_{p,b0} = 1$, $R_{p,b1} = 0$ and $R_{p,b2} = 1$.

Assuming that the barrier probabilities of the failure modes have the values $F_{b0} = F_{b1} = F_{b2} = 0.05$, the accessibility barrier probability of the Web page calculated from the UCAB model is:

$$ F_p = 1 - (1 - R_{p,b0} F_{b0})(1 - R_{p,b1} F_{b1})(1 - R_{p,b2} F_{b2}) $$
$$ F_p = 1 - (1 - 0.05)(1 - 0)(1 - 0.05) $$
$$ F_p = 1 - 0.95^2 $$
$$ F_p = 0.0975 $$
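The stage 2 aggregation can be sketched in Python as follows. The function name `page_barrier_probability`, the dictionary layout and the barrier type labels `b0` to `b2` are assumptions made for this illustration; the report values 1, 0 and 1 are those implied by the substitution in the worked example, and 0.05 is the default barrier probability stated in the Notation section.

```python
import math
from typing import Mapping, Optional

DEFAULT_F_B = 0.05  # default barrier probability used for every barrier type


def page_barrier_probability(
    reports: Mapping[str, float],
    barrier_probs: Optional[Mapping[str, float]] = None,
) -> float:
    """Return F_p = 1 - product over b of (1 - R_pb * F_b).

    reports maps a barrier type label to its report R_pb in [0, 1];
    barrier_probs optionally overrides F_b for individual barrier types.
    """
    barrier_probs = barrier_probs or {}
    no_barrier = math.prod(
        1.0 - r_pb * barrier_probs.get(b, DEFAULT_F_B)
        for b, r_pb in reports.items()
    )
    return 1.0 - no_barrier


# Reports implied by the worked example: tests b0 and b2 failed, b1 passed.
reports = {"b0": 1.0, "b1": 0.0, "b2": 1.0}
print(round(page_barrier_probability(reports), 4))  # 0.0975
```

Because the function accepts any report value between 0 and 1, the same sketch also works with ratio reports or confidence-weighted reports from stage one.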

Combining results from different testing procedures

It is possible to combine the results from evaluations performed with different tools or by different experts if the following conditions apply:

  1. No double reports (No barrier should be included more than once in the aggregation. To meet this the evaluation has to observe a division of the tests that are performed, e.g. into different sets of Web pages or into automatic and expert evaluation).

  2. Same sample (The reports have to cover the same data, i.e. the same version and selection of Web resources).
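As a rough illustration of the first condition, the sketch below merges two sets of reports and refuses to do so if any (page, barrier type) pair appears in both. The data layout, the function name `combine_reports` and the example URL are hypothetical and not part of UWEM. The second condition (same sample) cannot be verified from the reports alone and is left to the caller.

```python
from typing import Dict, Tuple

ReportKey = Tuple[str, str]  # (page URL, barrier type), a hypothetical layout


def combine_reports(
    first: Dict[ReportKey, float], second: Dict[ReportKey, float]
) -> Dict[ReportKey, float]:
    """Merge two report sets, rejecting double reports for the same key."""
    duplicates = first.keys() & second.keys()
    if duplicates:
        raise ValueError(f"double reports for: {sorted(duplicates)}")
    combined = dict(first)
    combined.update(second)
    return combined


# Hypothetical split: one test run automatically, one by an expert.
automatic = {("http://example.org/", "b0"): 1.0}
expert = {("http://example.org/", "b1"): 0.0}
merged = combine_reports(automatic, expert)
print(len(merged))  # 2
```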

Stage 3: Accessibility Barrier Probability Fs

The Web page accessibility barrier probabilities naturally lend themselves to aggregation, so that the average barrier probability and variance for a Web site can be calculated from the sampled Web pages for a Web site. Similarly, aggregation can be further performed over several Web sites, regions or countries.

The barrier probability $F_s$ for a Web site s is calculated as the mean of the barrier probabilities of the Web pages that have been sampled from the Web site:

$$ F_s = \frac{1}{n} \sum_{j=1}^{n} F_{pj} $$

where n is the number of Web pages sampled from Web site s and $F_{pj}$ is the barrier probability for Web page pj, calculated as described in section 6.4.

The standard deviation $S_s$ is given by

$$ S_s = \sqrt{\frac{1}{n} \sum_{j=1}^{n} (F_{pj} - F_s)^2} $$

Example (Fs for single Web site)

Two pages have been sampled from Web site s. The barrier probabilities have been calculated in stage 2: $F_{p1} = 0.0975$ and $F_{p2} = 0.1425$.

Then the average barrier probability of s is

$$ F_s = \frac{1}{2} (F_{p1} + F_{p2}) = \frac{1}{2} (0.0975 + 0.1425) = \frac{1}{2} \cdot 0.24 = 0.12 $$

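The variance is

$$ S_s^2 = \frac{1}{2} \bigl( (F_{p1} - F_s)^2 + (F_{p2} - F_s)^2 \bigr) = \frac{1}{2} \bigl( (0.0975 - 0.12)^2 + (0.1425 - 0.12)^2 \bigr) = 0.00050625 $$

so the standard deviation is $S_s = \sqrt{0.00050625} = 0.0225$.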
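The stage 3 calculation can be sketched as follows. The function name `site_barrier_statistics` is an assumption made for this illustration. Following the formula for $S_s$, the sketch divides by n (population standard deviation), not by n - 1.

```python
import math
from typing import Sequence, Tuple


def site_barrier_statistics(page_probs: Sequence[float]) -> Tuple[float, float]:
    """Return (F_s, S_s): mean and population standard deviation of F_p values."""
    n = len(page_probs)
    if n == 0:
        raise ValueError("at least one sampled page is required")
    mean = sum(page_probs) / n
    variance = sum((f_p - mean) ** 2 for f_p in page_probs) / n
    return mean, math.sqrt(variance)


# Page barrier probabilities from the worked example.
f_s, s_s = site_barrier_statistics([0.0975, 0.1425])
print(round(f_s, 4), round(s_s, 4))  # 0.12 0.0225
```

The printed values correspond to the average barrier probability 0.12 and to the square root of the variance 0.00050625 from the example. The same function could be applied to a list of $F_s$ values to aggregate over several Web sites, although the source does not prescribe how such groups are weighted.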

Limitations of the UCAB model

Clearly we are assuming an idealised Web in this model, and some of the assumptions may not be true for a real Web site. The evaluation phase of UWEM will be used to verify if the model is usable, and also to improve the model where necessary.

Aspects not covered by the model

Underlying assumptions

  1. This probabilistic model for accessibility barriers is work in progress. It will be evaluated and refined if necessary during the evaluation phases of UWEM. Aggregation of accessibility barriers will probably be based around these general ideas, but the exact model is subject to change during the evaluation phase.
  2. This version of UWEM does not specify how the parameters should be selected. Instead, the default values will be used. Note that with this parameter selection the results are the same as without confidence weighting.
  3. The accessibility barrier probability $F_b$ for a barrier type b can be estimated via user testing with a representative set of users. It may also be estimated to some extent with expert testing or semi-automatic testing, but the level of precision in detecting real barriers will be lower than for user testing. In this simple first approach the barrier probability is set to a fixed value. In later versions this parameter could be used to provide a finer grading of the severity of the barrier types, e.g. by introducing different values for different disability groups.
www.wabcluster.org