Uses an established set of questions, procedures, and scoring methods for all takers

Validation of test scores involves collecting evidence and developing an argument that supports a particular use (i.e., an inference or decision) of the test scores. For a validity argument to be correct, it must be supported by evidence and be logical and coherent. There are various types of evidence that can be used to support a validity argument, including content-related validity evidence, criterion-related validity evidence, and evidence related to reliability and dimensional structure. The type of evidence needed to support the use of the test scores depends on the type of inference or decision being made. As such, test scores can only be said to be valid for a particular use. If multiple inferences or decisions are to be made based on a set of ...

                CODE OF FAIR TESTING PRACTICES IN EDUCATION

           PREPARED BY THE JOINT COMMITTEE ON TESTING PRACTICES


The Code of Fair Testing Practices in Education states the major
obligations to test takers of professionals who develop or use
educational tests.  The Code is meant to apply broadly to the use of
tests in education (admissions, educational assessment, educational
diagnosis, and student placement).  The Code is not designed to cover
employment testing, licensure or certification testing, or other types
of testing.  Although the Code has relevance to many types of
educational tests, it is directed primarily at professionally developed
tests such as those sold by commercial test publishers or used in
formally administered testing programs.  The Code is not intended to
cover tests made by individual teachers for use in their own classrooms.

The Code addresses the roles of test developers and test users
separately.  Test users are people who select tests, commission test
development services, or make decisions on the basis of test scores. 
Test developers are people who actually construct tests as well as those
who set policies for particular testing programs.  The roles may, of
course, overlap as when a state education agency commissions test
development services, sets policies that control the test development
process, and makes decisions on the basis of the test scores.

The Code has been developed by the Joint Committee on Testing Practices,
a cooperative effort of several professional organizations, that has as
its aim the advancement, in the public interest, of the quality of
testing practices.  The Joint Committee was initiated by the American
Educational Research Association, the American Psychological
Association, and the National Council on Measurement in Education.  In
addition to these three groups the American Association for Counseling
and Development/Association for Measurement and Evaluation in Counseling
and Development, and the American Speech-Language-Hearing Association
are now also sponsors of the Joint Committee.

This is not copyrighted material.  Reproduction and dissemination are
encouraged.  Please cite this document as follows:

     Code of Fair Testing Practices in Education. (1988)
           Washington, D.C.: Joint Committee on Testing Practices.
   
    (Mailing Address: Joint Committee on Testing Practices,
     American Psychological Association, 1200 17th Street, NW,
     Washington, D.C. 20036.)

The Code presents standards for educational test developers and users in
four areas:
                 A.  Developing/Selecting Tests
                 B.  Interpreting Scores
                 C.  Striving for Fairness
                 D.  Informing Test Takers

Organizations, institutions, and individual professionals who endorse
the Code commit themselves to safeguarding the rights of test takers by
following the principles listed.  The Code is intended to be consistent
with the relevant parts of the Standards for Educational and
Psychological Testing (AERA, APA, NCME, 1985).  However, the Code
differs from the Standards in both audience and purpose.  The Code is
meant to be understood by the general public; it is limited to
educational tests; and the primary focus is on those issues that affect
the proper use of tests.  The Code is not meant to add new principles
over and above those in the Standards or to change the meaning of the
Standards.  The goal is rather to represent the spirit of a selected
portion of the Standards in a way that is meaningful to test takers
and/or their parents or guardians.  It is the hope of the Joint
Committee that the Code will also be judged to be consistent with
existing codes of conduct and standards of other professional groups who
use educational tests.

                   A.  DEVELOPING/SELECTING APPROPRIATE TESTS*

Test developers should provide            Test users should select tests 
the information that test users           that meet the purpose for which 
need to select appropriate                they are to be used and that are
tests.                                    appropriate for the intended test
                                          taking populations.

        TEST DEVELOPERS SHOULD:                   TEST USERS SHOULD:


1.  Define what each test measures        1.  First define the purpose for
and what the test should be used for.     testing and the population to be
Describe the population(s) for which      tested.  Then, select a test for
the test is appropriate.                  that purpose and that population
                                          based on a thorough review of the
                                          available information.

2.  Accurately represent the              2.  Investigate potentially useful
characteristics, usefulness, and          sources of information, in
limitations of tests for their intended   addition to test scores, to
purposes.                                 corroborate the information
                                          provided by tests.

3.  Explain relevant measurement concepts 3.  Read the materials provided by
as necessary for clarity at the level     test developers and avoid using
of detail that is appropriate for the     tests for which unclear or
intended audience(s).                     incomplete information is
                                          provided.

4.  Describe the process of test          4.  Become familiar with how and
development.  Explain how the content     when the test was developed and
and skills to be tested were selected.    tried out.

5.  Provide evidence that the test meets  5.  Read independent evaluations 
its intended purpose(s).                  of a test and of possible
                                          alternative measures.  Look for
                                          evidence required to support the
                                          claims of test developers.

6.  Provide either representative samples 6.  Examine specimen sets, 
or complete copies of test questions,     disclosed tests or samples of 
directions, answer sheets, manuals, and   questions, directions, answer 
score reports to qualified users.         sheets, manuals, and score reports
                                          before selecting a test.


*Many of the statements in the Code refer to the selection of existing tests.
However, in customized testing programs test developers are engaged to
construct new tests.  In those situations, the test development process should
be designed to help ensure that the completed tests will be in compliance with
the Code.

        TEST DEVELOPERS SHOULD:                         TEST USERS SHOULD:

7.  Indicate the nature of the evidence   7. Ascertain whether the test 
obtained concerning the appropriateness   content and norm group(s) or 
of each test for groups of different      comparison group(s) are
racial, ethnic, or linguistic backgrounds appropriate for the intended test
who are likely to be tested.              takers.

8.  Identify and publish any specialized  8. Select and use only those 
skills needed to administer each test     tests for which the skills needed 
and to interpret scores correctly.        to administer the test and
                                          interpret scores correctly are
                                          available.
----------------------------------------------------------------------------
                             B.  INTERPRETING SCORES

Test developers should help users          Test users should interpret scores
interpret scores correctly.                correctly.

        TEST DEVELOPERS SHOULD:                     TEST USERS SHOULD:

9.  Provide timely and easily understood    9.  Obtain information about the 
score reports that describe test            scale used for reporting scores, 
performance clearly and accurately.         the characteristics of any norms 
Also, explain the meaning and               or comparison group(s), and the 
limitations of reported scores.             limitations of the scores.

10.  Describe the population(s) represented 10.  Interpret scores taking into
by any norms or comparison group(s), the    account any major differences 
dates the data were gathered, and the       between the norms or comparison 
process used to select the samples of       groups and the actual test takers.
test takers.                                Also take into account any
                                            differences in test administration
                                            practices or familiarity with the
                                            specific questions in the test.

11.  Warn users to avoid specific,          11.  Avoid using tests for 
reasonably anticipated misuses of test      purposes not specifically
scores.                                     recommended by the test developer
                                            unless evidence is obtained to
                                            support the intended use.

12.  Provide information that will help     12.  Explain how any passing 
users follow reasonable procedures for      scores were set and gather
setting passing scores when it is           evidence to support the
appropriate to use such scores with the     appropriateness of the scores.
test.

13.  Provide information that will help     13.  Obtain evidence to help show
users gather evidence to show that the      that the test is meeting its
test is meeting its intended                intended purpose(s).
purpose(s).
-----------------------------------------------------------------------------
                            C.  STRIVING FOR FAIRNESS

Test developers should strive to           Test users should select tests 
make tests that are as fair as possible    that have been developed in ways
for test takers of different races,        that attempt to make them as fair
gender, ethnic backgrounds, or different   as possible for test takers of
handicapping conditions.                   different races, gender, ethnic
                                           backgrounds, or handicapping
                                           conditions.

        TEST DEVELOPERS SHOULD:                    TEST USERS SHOULD:

14.  Review and revise test questions      14.  Evaluate the procedures used 
and related materials to avoid potentially by test developers to avoid
insensitive content or language.           potentially insensitive
                                           content or language.

15.  Investigate the performance of        15.  Review the performance of
test takers of different races, gender,    test takers of different races,
and ethnic backgrounds when samples of     gender, and ethnic backgrounds
sufficient size are available.  Enact      when samples of sufficient size
procedures that help to ensure that        are available.  Evaluate the
differences in performance are related     extent to which performance
primarily to the skills under assessment   differences may have been caused
rather than to irrelevant factors.         by inappropriate characteristics
                                           of the test.

16.  When feasible, make appropriately     16.  When necessary and
modified forms of tests or administration  feasible, use appropriately
procedures available for test takers with  modified forms or administration 
handicapping conditions.  Warn test users  procedures for test takers with 
of potential problems in using standard    handicapping conditions. 
norms with modified tests or               Interpret standard norms with care
administration procedures that result in   in the light of the modifications
non-comparable scores.                     that were made.
-----------------------------------------------------------------------------
                            D.  INFORMING TEST TAKERS

Under some circumstances, test developers have direct communication with test
takers.  Under other circumstances, test users communicate directly with test
takers.  Whichever group communicates directly with test takers should provide
the information described below.


                      TEST DEVELOPERS OR TEST USERS SHOULD:


17.  When a test is optional, provide test takers or their parents/guardians
with information to help them judge whether the test should be taken, or if
an available alternative to the test should be used.

18.  Provide test takers the information they need to be familiar with the
coverage of the test, the types of question formats, the directions, and
appropriate test-taking strategies.  Strive to make such information equally
available to all test takers.


Under some circumstances, test developers have direct control of tests and 
test scores.  Under other circumstances, test users have such control. 
Whichever group has direct control of tests and test scores should take the
steps described below.


                      TEST DEVELOPERS OR TEST USERS SHOULD:


19.  Provide test takers or their parents/guardians with information about
rights test takers may have to obtain copies of tests and completed answer
sheets, retake tests, have tests rescored, or cancel scores.

20.  Tell test takers or their parents/guardians how long scores will be kept
on file and indicate to whom and under what circumstances test scores will or
will not be released.

21.  Describe the procedures that test takers or their parents/guardians may
use to register complaints and have problems resolved.


Note:  The membership of the Working Group that developed the Code of Fair
Testing Practices in Education and of the Joint Committee on Testing Practices
that guided the Working Group was as follows:


Theodore P. Bartell  John J. Fremer     George F. Madaus    Nicholas A. Vacc
John R. Bergan       (Co-chair, JCTP    (Co-chair, JCTP)    Michael J. Zieky
Esther E. Diamond    and Chair, Code    Kevin L. Moreland
Richard P. Duran     Working Group)     Jo-Ellen V. Perez   (Debra Boltas and
Lorraine D. Eyde     Edmund W. Gordon   Robert J. Solomon   Wayne Camara of
Raymond D. Fowler    Jo-Ida C. Hansen   John T. Stewart     the American
                     James B. Lingwall  Carol Kehr Tittle   Psychological
                                        (Co-chair, JCTP)    Association served
                                                            as staff liaisons)

Additional copies of the Code may be obtained from the National Council on
Measurement in Education, 1230 Seventeenth Street, NW, Washington, D.C.  20036.
Single copies are free.

What term refers to established criteria for comparing individual scores on a standardized test?

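The term is norms: the score distribution of a reference group against which an individual score is compared. Norm-referenced tests are standardized tests that are designed to compare and rank test takers in relation to one another.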
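
As a rough illustration of a norm-referenced comparison, the sketch below converts one raw score into a percentile rank within a norm group. The function name `percentile_rank`, the mid-rank treatment of ties, and the norm-group scores are all assumptions made for this example; they are not taken from any particular test or publisher.

```python
def percentile_rank(score, norm_scores):
    """Percent of norm-group scores below `score`, counting ties as half.

    This mid-rank convention is one common choice, assumed here for
    illustration; testing programs may define percentile ranks differently.
    """
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    ties = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * ties) / len(norm_scores)


# Hypothetical raw scores from a norm group of ten test takers.
norm_group = [40, 45, 50, 50, 55, 60, 65, 70, 75, 80]

# Rank a raw score of 60 against that group.
rank = percentile_rank(60, norm_group)
print(f"Percentile rank of 60: {rank}")
```

The result is only as meaningful as the norm group is appropriate for the person being tested, which is why the description of the norm group matters when interpreting such a rank.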

What is the term used to describe the consistency of test scores?

Reliability is the degree to which an assessment tool produces stable and consistent results. Test-retest reliability is a measure of reliability obtained by administering the same test twice over a period of time to a group of individuals.
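
Test-retest reliability is often summarized as the correlation between scores from the two administrations. The following is a minimal sketch under that assumption, using a hand-written Pearson correlation; the helper name `pearson_r` and the scores are hypothetical and chosen only to show the calculation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five test takers on two occasions.
first_administration = [10, 20, 30, 40, 50]
second_administration = [12, 18, 33, 41, 49]

test_retest_r = pearson_r(first_administration, second_administration)
print(f"Test-retest correlation: {test_retest_r:.3f}")
```

A value near 1 suggests that test takers keep roughly the same relative standing across the two occasions; how high a value is acceptable depends on the intended use of the scores.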

What is test validity and reliability?

Reliability and validity are concepts used to evaluate the quality of research. They indicate how well a method, technique or test measures something. Reliability is about the consistency of a measure, and validity is about the accuracy of a measure.

What is a reliability test?

Reliability refers to how dependably or consistently a test measures a characteristic. If a person takes the test again, will he or she get a similar test score, or a much different score? A test that yields similar scores for a person who repeats the test is said to measure a characteristic reliably.