Tuesday, 13 November 2012

Interpreting Diploma Exam Results


Are we assessing fairly? 
The Diploma Exams provide an objective, reliable scale.


I attended the session “Interpreting Diploma Exam Results” presented by Exam Managers from Alberta Education.  Here is some of what I learned:

Factors that Affect Student Performance on Diploma Exams are Difficult to Control

I like the following statement from one of the documents that were distributed at the session because it encapsulates just how difficult it is to help students learn complex academic material:
See how many factors there are that affect student achievement, how many of them are in your (administrators, curricular leaders) control?  As a leader in your district, attending to those things that you can control is very important, but we all know that all of these other things, you cannot control.  You might be able to influence some of those things, but you cannot control them.
It is really important that we all realize this and we focus our attention on what we can control, not what we can’t.  This session today is not about blaming, comparing, crying (though there might be some of that), it is about learning from our results to affect [sic] positive action for the future – for the students.  The results are what they are and we take them and interpret them and then go on with action to change (or keep) policies, procedures, development, etc.  (Alberta Education, 2011).

Proportions of Students Who Achieve A, B, C or F on the School Mark and Exam Mark

Alberta Education is concerned when the proportion of students who receive a grade of F on the diploma exam is at least 10 percentage points higher than the proportion of students who receive a grade of F on their school mark.  This is of significant concern because the most important distinction to draw is the one between the acceptable standard and the unacceptable standard: whether the student passes and earns credits, or fails and earns none.  We want to be sure we are awarding credits that are appropriately merited.  If a significant number of students earn a passing class mark but then receive a failing diploma exam mark, our assessment instruments are likely too easy (Edwards, 2011).
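
This check is easy to automate.  Here is a minimal sketch in Python; only the 10-percentage-point threshold and the 50% passing cutscore come from the rule above, while the marks and function names are purely illustrative:

```python
def f_rate(marks, passing=50):
    """Proportion of marks below the passing cutscore, i.e., grades of F."""
    return sum(1 for m in marks if m < passing) / len(marks)

def f_gap_concern(school_marks, exam_marks, threshold=0.10):
    """Flag a significant concern when the diploma exam F rate exceeds
    the school-mark F rate by at least 10 percentage points."""
    gap = f_rate(exam_marks) - f_rate(school_marks)
    return gap, gap >= threshold

# Illustrative marks for one class: school-awarded, then diploma exam
school = [65, 72, 58, 49, 81, 54, 63, 77, 50, 44]
exam   = [55, 60, 47, 41, 78, 45, 49, 70, 42, 38]
gap, concern = f_gap_concern(school, exam)
print(f"F-rate gap: {gap:.1%}, significant concern: {concern}")
```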

Individual Students’ School-Awarded Marks Should Be Within 15% of Their Diploma Examination Marks Just Two-Thirds of the Time, but the Average School-Awarded Mark and Average Diploma Examination Mark Should Be Very Similar

A main function of the diploma exams is to ensure that schools are evaluating students against the appropriate standards.  Alberta Education uses two main standards categories from K to 12: the Acceptable Standard and the Standard of Excellence.  Note that performance standards must be distinguished from cutscores.  A performance standard is a description of what a student needs to be able to do in order to be classified at that performance level; these are clarified in the Information Bulletins.  A cutscore is the boundary on the score scale that separates performance levels.  In Alberta, the Acceptable Standard corresponds to grades from 50% to 79%, and the Standard of Excellence to grades from 80% to 100%.
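
As a small illustration (a sketch in Python; the function name is mine), the cutscore ranges above translate directly into a classification rule:

```python
def standard(mark):
    """Classify a mark (0-100) using the Alberta cutscore ranges above."""
    if mark >= 80:
        return "Standard of Excellence"
    if mark >= 50:
        return "Acceptable Standard"
    return "Below the Acceptable Standard"

print(standard(83))  # Standard of Excellence
print(standard(62))  # Acceptable Standard
print(standard(41))  # Below the Acceptable Standard
```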

A student’s school-awarded mark should be within 15% of their diploma exam mark 67% of the time.  Differences greater than 15% between the school-awarded mark and the diploma exam mark for a particular student arise from individual student differences and from the fact that teachers assess several outcomes in class that cannot be assessed on a machine-scored examination.  However, although individual students’ school marks may vary significantly from their diploma exam marks, the average mark awarded by the school should be almost equal to the average diploma exam mark achieved by the school.  The diploma exams thus give teachers and schools feedback about whether their courses are set at an appropriate difficulty level relative to the rest of the Province (Edwards, 2011).
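
To see how a class measures up against this expectation, here is a quick sketch in Python; the 67% benchmark comes from the paragraph above, I am interpreting the 15% as 15 percentage points, and the data and names are illustrative:

```python
def mark_agreement(school_marks, exam_marks, tolerance=15):
    """Return the share of students whose school-awarded mark is within
    `tolerance` percentage points of their diploma exam mark, along with
    the two group averages, which should be nearly equal."""
    pairs = list(zip(school_marks, exam_marks))
    within = sum(1 for s, e in pairs if abs(s - e) <= tolerance) / len(pairs)
    avg_school = sum(school_marks) / len(school_marks)
    avg_exam = sum(exam_marks) / len(exam_marks)
    return within, avg_school, avg_exam

within, avg_s, avg_e = mark_agreement([65, 72, 58, 49, 81], [55, 60, 47, 68, 78])
print(f"within 15 points: {within:.0%} (expect roughly 67%)")
print(f"school average: {avg_s:.1f}, exam average: {avg_e:.1f}")
```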


Use z-Scores to Compare Our Student Performance to Provincial Performance

“Subtests and reporting categories are of different lengths and difficulties; z-scores make all subtests and reporting categories of equivalent lengths and difficulties” (Edwards, 2011).

The formulas recommended by Alberta Education compare a group average to the provincial average in units of the provincial standard deviation:

z = (group average − provincial average) ÷ provincial standard deviation

One z-score is calculated from the school-awarded marks and another from the diploma exam marks, each using the corresponding provincial average and standard deviation.
For example, let us say that on a certain diploma exam our students’ average school-awarded mark was 64.1% and the average provincial school-awarded mark was 63.7%.  Dividing the 0.4-point difference by the provincial standard deviation gives a z-score of +0.03.  This indicates that our average school-awarded mark is very close to the average school-awarded mark of the province.

Let us say that on this same exam, our students’ average diploma exam mark was 58.5% and the provincial average was 63.8%.  Dividing the −5.3-point difference by the provincial standard deviation gives a z-score of −0.44.

According to the session presenter, z-scores between −0.25 and −0.50 are of some concern, and z-scores below −0.50 are of significant concern.  Consequently, a z-score of −0.44 is of some concern; however, it is not unexpected that upgrading students would score lower than the province, given the many challenges they must deal with.  What is of more concern is that on this particular imagined exam, we awarded an average class mark 0.03 standard deviations above the provincial school-awarded average, but our students then scored 0.44 standard deviations below the provincial diploma exam average.

I suggest we also look at the difference between the two z-scores obtained from the formulas above.  A difference greater than 0.50 would indicate that our exams and other assessments are too easy and do not match the standards set by Alberta Education in the Information Bulletins.  In this imagined case, the difference is (+0.03) − (−0.44) = 0.47.
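
Putting the whole calculation together (a Python sketch; the two provincial standard deviations are assumed values chosen to reproduce the z-scores above, since the figures in this post do not include them):

```python
def z_score(group_mean, prov_mean, prov_sd):
    """z = (group average - provincial average) / provincial standard deviation."""
    return (group_mean - prov_mean) / prov_sd

# Averages from the example above; the provincial standard deviations
# (13.3 and 12.0) are assumed for illustration only.
z_school = z_score(64.1, 63.7, 13.3)  # about +0.03
z_exam   = z_score(58.5, 63.8, 12.0)  # about -0.44
diff = z_school - z_exam              # about  0.47
print(f"school z: {z_school:+.2f}, exam z: {z_exam:+.2f}, difference: {diff:.2f}")
if diff > 0.50:
    print("Our assessments may be easier than the provincial standard.")
```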

For these calculations to be valid we need a sufficient sample size.  A sample size of 10 or fewer provides essentially meaningless data, whereas a sample size of 80 yields statistics we can have much more confidence in.  It is also possible to calculate confidence intervals for various statistics from these data, but I will not do so here.

Causes of Class Averages Being Significantly Higher than Diploma Exam Averages

A common problem is that the average school-awarded mark is significantly above the average diploma examination mark.  This is often caused by the following factors:
  • Exams used in class have too few questions at the Standard of Excellence
  • Too many marks are given for participation and completeness of work, rather than for quality of work
  • Lack of awareness of the provincial standard for the examination, whether at acceptable standard or at standard of excellence
(Alberta Education, 2011)

Hypothetical Analysis of Diploma Exam Results Completed by an Alberta Education Statistician

For an analysis of a hypothetical school, see P:\STAFF FOLDERS\Michael Gaschnitz\Case_Study_03.pdf.  The annotations are those of an Alberta Education statistician.  Note that in this group of 81 Chemistry 30 students, 22.2% received a failing class mark, but on the diploma exam 34.6% received a failing mark, a margin of 12.4 percentage points, which exceeds the 10-point limit.  The statistician highlighted this category in pink to indicate a significant concern.


References

Edwards, J.  (2011, November).  Interpreting Diploma Exam Results.  Poster session presented at Calgary Regional Consortium, Calgary, AB.
