An Educator's Guide to

How is student achievement measured?

In education, the familiar metrics used to report student performance on standardized achievement tests include percentile ranks, grade equivalents, normal curve equivalents, and scale scores.

Normal curve equivalent scores and scale scores are the only types of measures that should be used to measure change in individual student, classroom, or school performance. The other measures are either inappropriate for use in statistical analyses or inappropriate for measuring change because of the way they are scaled.
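To illustrate why scaling matters, the sketch below converts percentile ranks to normal curve equivalents (NCEs) using the standard definition NCE = 50 + 21.06 × z, where z is the normal-curve z-score for the percentile rank. The function name and the example percentile ranks are hypothetical and chosen only for illustration; this is a minimal sketch, not a scoring procedure from any particular test publisher.

```python
from statistics import NormalDist


def percentile_to_nce(percentile_rank: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to an NCE.

    Uses the standard definition NCE = 50 + 21.06 * z, where z is the
    standard normal quantile for the percentile rank.
    """
    if not 0 < percentile_rank < 100:
        raise ValueError("percentile rank must be strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile_rank / 100)
    return 50 + 21.06 * z


# Hypothetical example: two students who each gain 10 percentile points.
for before, after in [(50, 60), (89, 99)]:
    nce_gain = percentile_to_nce(after) - percentile_to_nce(before)
    print(f"percentile {before} -> {after}: NCE gain of {nce_gain:.1f}")
```

The same 10-point gain in percentile rank corresponds to a much larger NCE gain near the top of the distribution than near the middle. Percentile ranks are not equal-interval units, which is why they are poorly suited to measuring change.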

Also take note of whether the study reports the size of the improvement in achievement as an effect size. Because different studies tend to use different measures of achievement, it is often difficult to compare the relative effectiveness of different software packages, or of the same software package across different studies. Once achievement gains are converted into effect sizes, they can be compared across software packages and across studies. Effect sizes also give us a sense of whether a gain in achievement is important (i.e., whether it is big or small). As a quick rule of thumb, an effect size of 0.30 or greater is considered to be important in studies of educational programs. It is also important to know how an effect size compares with more familiar metrics of learning. For example, an effect size of 0.1 is equivalent to about one month of learning gain. Effect size also needs to be interpreted in practical terms. A small effect size may be of practical importance if the intervention is relatively inexpensive compared to competing options, if the effect occurs among all groups of students, and if the effect accumulates over time.
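As a rough illustration of these rules of thumb, the sketch below computes an effect size and translates it into the terms used above. It assumes the effect size is a standardized mean difference (Cohen's d with a pooled standard deviation); a given study may define its effect size differently. The function names and the scores are hypothetical and invented only for the example.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment: list[float], comparison: list[float]) -> float:
    """Standardized mean difference using a pooled standard deviation.

    Assumption: the study's effect size is Cohen's d. Each group needs at
    least two scores, and the scores must not all be identical.
    """
    n_t, n_c = len(treatment), len(comparison)
    pooled_variance = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(comparison)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(comparison)) / sqrt(pooled_variance)


def approx_months_of_learning(effect_size: float) -> float:
    """Apply the rule of thumb that 0.1 is about one month of learning gain."""
    return effect_size / 0.1


def meets_importance_threshold(effect_size: float) -> bool:
    """Apply the rule of thumb that 0.30 or greater is considered important."""
    return effect_size >= 0.30


# Hypothetical scale scores for students who did and did not use the software.
software_group = [52, 58, 61, 47, 66, 55]
comparison_group = [50, 54, 57, 45, 62, 50]

d = cohens_d(software_group, comparison_group)
print(f"effect size: {d:.2f}")
print(f"approximate months of learning: {approx_months_of_learning(d):.1f}")
print(f"meets 0.30 threshold: {meets_importance_threshold(d)}")
```

The thresholds are only the quick rules of thumb given above. Cost, consistency across student groups, and accumulation over time still need to be weighed separately.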

The test that researchers use to measure student performance in a study may affect the magnitude of the estimated effect size. Researchers have found that the use of "local" tests, specifically developed to measure how students perform on tasks closely aligned with the content of the software, results in larger effect sizes than the use of more common standardized tests. Sometimes researchers measure technology's effectiveness by using tests that fit the specific technology program's goals so narrowly that they do not reflect more common and familiar academic outcomes. Ideally, researchers use tests that have been validated for use across more than one program but that are also sensitive to the kinds of things students might be expected to learn, given the software's design.



This site was created by the Center for Technology in Learning at SRI International under a task order from the Planning and Evaluation Service, U.S. Department of Education (DHHS Contract # 282-00-008-Task 3).

Last updated on: 11/04/02