Whether used for high-stakes decisions or classroom-based formative decisions, the most fundamental element of any educational assessment is validity (Russell & Kavanaugh, 2011). IAR validity focuses on the accuracy of information collected about students, the accuracy of inferences made based on that information and the appropriateness of decisions about students, schools, teaching and learning that are based on those inferences (Russell & Kavanaugh, 2011). On the other hand, large-scale assessment programs such as NAEP have traditionally attempted to assess students using single instruments administered under the same conditions. As a result, it has been acknowledged that this practice does not yield valid information (Russell & Kavanaugh, 2011).
Evaluating these two programs is important because it uncovers problems and additional requirements early, which prevents more serious issues from arising later in their use. Through evaluation, stakeholders can determine which assessment programs to continue and which ones to end (IAR, 2011). Critical evaluation of these programs will ensure quality and build client confidence about purchasing or using the assessment procedure. Effective assessment will help users prioritize resources by identifying the program components that are most effective or critical (IAR, 2011).
The evaluation of Instructional Assessment Resources (IAR) should ensure that students attain self-assessment skills, which will in turn help them develop critical reflection as they evaluate their own and other students’ work. IAR does not provide the opportunity for students to learn responsibility towards others via assessment and to learn to make critical judgments. Frankland (2007) says that an assessment process such as IAR should play a fundamental role in all aspects of learning where individuals and groups have to be accountable for their work. IAR should enable students to develop the ability to be realistic judges of their own performance and to monitor their own learning.
To be more effective, IAR should involve learners in the assessment of their own development and learning. This will motivate them to think about what has been learnt up to a certain point, where the gaps are, and how to fill or minimize those gaps (Frankland, 2007). The assessment process should introduce the concept of individual judgment to students and ensure effective communication among teachers and peers. Carless, Joughin & Li (2007) noted that assessment processes should both lead to the award of a reliable grade and contribute to productive student learning.
The major challenge of IAR is that students spend too much time being tested and lack enough time for productive learning. The key stakeholders in these two learning processes should note that learning is much more than the accumulation of grades. On the other hand, NAEP does not give students motivating assessment activities which involve them in productive learning experiences. Therefore, the NAEP assessment process would be more likely to lead to better student grades if it involved students actively in the assessment, engaged them with standards and encouraged them to monitor and improve their own work.
The NAEP assessment process presents a major challenge to tutors because they may carry a share of the workload. Carless, Joughin & Li (2007) argue that tutors are forced to share ideas, strategies, worksheets, model answers, quality exemplars or follow-up tasks as part of the process of streamlining assessment. Ehlers & Schneckenberg (2010) observed that the main purpose of student assessment has been to serve as a mechanism for comparing individual learners as part of a screening or selection process. The NAEP assessment process has not provided individuals with access to continued education and progression to higher degrees, to entry into the civil service and to progression in jobs and careers (Ehlers & Schneckenberg, 2010). Instead, this process is predicated on measuring the differences between individual students based on race, ethnicity and gender. The NAEP assessment process has led to pressure for objectivity through standardized tests, which are set so that the bright or hard-working students get it right and the dull or lazy ones get it wrong (Ehlers & Schneckenberg, 2010).
Both NAEP and IAR use technologies in their assessment processes. Ehlers & Schneckenberg (2010) say that “technology has been focused on the development of simple multiple choice question and answers designed both for easy marking and for standardized test provision” (p. 441). More emphasis should be put on the trends being assessed. Traditionally, university assessment focused on mastery of a body of scholarly knowledge defined by experts in a subject and on the ability to research that body of knowledge (Ehlers & Schneckenberg, 2010). Nowadays, assessment is more focused on professional standards as defined by external bodies.
The main aim of NAEP has been to compare institutions and to provide a comparison of the attainment of individual learners rather than as a measurement of quality in terms of learning processes (Ehlers & Schneckenberg, 2010). The stakeholders in IAR should be cautioned that the process is central because quality is seen to be congruent with the move from teaching to learning and to new processes of knowledge development outlined in the curriculum. Ehlers & Schneckenberg (2010) mentioned that critiques of both IAR and NAEP should also be “based on the effect on student motivation and coming from those seeking to develop assessment based on reflection on learning, particularly through the introduction of e-portfolios” (p. 441).
Both IAR and NAEP assessment processes should allow students to garner rewards from their participation. Palomba & Banta (2001) say that students should be given small incentives for participating in surveys or testing projects. This is because the real payoff to students for their involvement in assessment should be an opportunity to learn more than they would have learned without it (Palomba & Banta, 2001). Both NAEP and IAR programs should review their strategies so that they meet their full potential to engage faculty, students and other stakeholders in a systematic effort to improve the quality of education.
Instructional Assessment Resources (IAR)
Goals and strengths
IAR conducts regular assessment of teaching in order to improve the instruction itself. IAR (2011) says that “teaching assessment allows the instructors to monitor student learning and make necessary adjustments to improve their teaching”. One of the strengths of IAR is that its teaching assessments help teachers to avoid surprises in end-of-course evaluations (IAR, 2011). As IAR (2011) indicated, “the assessment process embraces evaluation of student satisfaction with course organization, exams, assignments, and instruction throughout the term giving room for mid-course adjustments and the improvement of student satisfaction”. According to IAR (2011), IAR student assessment entails “the assessment of student learning through assignments, tests, and portfolios”.
Recommendations for improvement
The IAR assessment process should consider individual student interests, abilities, cognitive styles, rates of learning, patterns of developed abilities, ethnicity, sex, social class and motivations (Clark & Zimmerman, 2004). The IAR measures should be sensitive to pluralistic issues and reinforce the achievements of all students, not just those from a particular background or social class. Clark & Zimmerman (2004) mentioned that “students achievements should be focused on and celebrated so they are motivated to learn, and teachers will be provided tools that allow them to deliver quality instruction to all students including those who are talented in given fields” (p. 145).
Consequently, IAR assessment processes should use accurate scoring keys for selected-response assessment and good scoring guides for extended written response and performance assessment (Stiggins, Arter & Chappuis, 2004). Regardless of how carefully the whole process has been planned, things can still go wrong and result in inaccurate estimates of achievement. Stiggins, Arter & Chappuis (2004) say that “some of the problems that may arise from the process can be solved by adhering to the test development process” (p. 114).
Evaluation measures used
IAR uses three types of evaluation: needs, process and outcome evaluation. Needs evaluation is typically used in program planning. IAR (2011) indicated that “this evaluation helps determine which program aspects or activities are most needed and for which population”. Generally, this method is used to help build new programs or justify existing program components. Process evaluation investigates the implementation process of a program (IAR, 2011). This approach is useful for appraising program activities and identifying any essential improvements or changes (IAR, 2011). Outcome evaluation helps determine the overall effects or outcomes of the program in relation to program intentions. IAR (2011) says that this method may indicate whether the program objectives were met and may also include recommendations for improvement.
The IAR assessment process employs scenario-based higher-order thinking evaluation. Nilson (2010) says that scenario-based higher-order thinking evaluation is a series of multiple choice items based on a new, realistic stimulus such as a table, graph, diagram or data set. Schuh (2011) says that the assessment process may face the challenge of the availability of resources to support the initiative. In addition, a structured assessment instrument such as IAR should not be limited to needs, process and outcome evaluation. Schuh (2011) noted that institutions using this process are not limited by the specific questions within the assessment tool.
Additional measures needed
Teachers and students need to construct the way forward for classroom assessment practice together. McInerney, Brown & Liem (2009) say that “student’s comments suggest their actions are shaped by their past experiences and the future they anticipate, so it is important to realize that changes to teacher and students roles and responsibilities will take time” (p. 101). The main criticisms of IAR are based on the desire of empirical scholars to have a stable population from which to acquire hard data that can be treated statistically and reported as predictive (Dorn, 1999). This means that the student and the subject matter content to be measured need to be stable and predictable. IAR faces challenges in its effort to assess performances in real-life situations because both the student and the school undergo change (Dorn, 1999).
The IAR assessment process should embrace a deductive process in which critical theories are applied to the student’s outcomes. Christ (1997) says that students should be urged to apply the same theories to their own work after learning them in a formal critique. McInerney, Brown & Liem (2009) articulated that a good assessment process should allow us to review aspects of the learning context and teacher assessment actions to consider how they might better support the formative function of classroom assessment. The IAR process should provide students with multiple opportunities to demonstrate what they know and can do. McInerney, Brown & Liem (2009) cautioned that the use of IAR should accommodate student goals as part of fostering a climate where assessment is viewed, at a minimum, as a joint teacher-student responsibility. The entire process must value student suggestions as feedback, which signals respect for students and their ideas and prompts students to actively engage in the learning assessment process (McInerney, Brown & Liem, 2009).
How this supplementary data can enhance the results and conclusions
The team involved in the design and implementation of IAR student assessment should have insider information about challenges in students’ lives, which may have undue influence over their final report on the student’s progress (Orrell, Cooper & Bowden, 2010). The IAR assessment process should consider that students and peers gain much from playing an active role in the assessment process. Orrell, Cooper & Bowden (2010) further say that self-assessment is a powerful tool for learning when adequately supported and developed. They also noted that when students are required by the IAR process to review and critique their behavior, they will, if guided appropriately, develop their meta-cognitive ability.
The supplementary data should be used to ensure recognition, accurate interpretation and respect for diversity. Evaluators should ensure that the members of the evaluation team collectively demonstrate competence (AEA, 2004). Competence would be reflected in evaluators seeking awareness of their own assumptions and an understanding of the worldviews of different participants and stakeholders in the evaluation. AEA (2004) stated that “the use of appropriate evaluation strategies and skills in working with different student groups will enhance the results and conclusions”. The supplementary data will enhance diversity in terms of race, ethnicity, gender, religion, socio-economics, or other factors pertinent to the IAR evaluation context (AEA, 2004).
National Assessment of Educational Progress (NAEP)
Goals and strengths
NAEP administers assessments in twelve subject areas, which include Reading, Mathematics, Science, Writing, Arts, Civics, Economics, Geography and U.S. History. Pellegrino, Jones & Mitchell (1999) say that NAEP has chronicled academic achievement for over a quarter century. Over that time it has been a valued source of information about the academic proficiency of students in the United States (Pellegrino, Jones & Mitchell, 1999). It provides some of the best available trend data on the academic performance of elementary, middle and secondary students in key subject areas (Pellegrino, Jones & Mitchell, 1999).
NAEP measures national and state progress toward the third National Education Goal and provides timely, fair, and accurate data about student achievement at the national level, among states and in comparison to other nations (Pellegrino, Jones & Mitchell, 1999). It also develops sound assessments to measure what students know and can do as well as what they should know and be able to do. Pellegrino, Jones & Mitchell (1999) noted that “NAEP helps states and other link their assessments to the National Assessment and use National Assessment data to improve education performance” (p. 68). Pellegrino, Jones & Mitchell (1999) say that one of the strengths of NAEP assessment is that it has set an innovative agenda for conventional and performance-based testing, and in doing so it has become a leader in American achievement testing.
NAEP’s prominence has made it a victim of its own success. Pellegrino, Jones & Mitchell (1999) indicated that “recent demands for accountability at many levels of the educational system, the increasing diversity of America’s school age population, policy concerns about equal educational opportunity, and the emergence of standard’s based reform have had demonstrable effects on the program” (p. 56). Regulators in the education sector and others with a legitimate interest in the status of United States education have asked NAEP to do more than its central purpose requires. The main strength of NAEP is that, without changing its basic design, structural features have been added to it in response to the growing constituency for assessment in schools. Pellegrino, Jones & Mitchell (1999) say that “the state testing program, the introduction of performance standards and the increased numbers of hands-on and other open-response tasks have made NAEP more complex” (p. 57).
Recommendations for improvement
Pellegrino, Jones & Mitchell (1999) say that NAEP cannot and should not attempt to meet all the diverse needs of the program’s multiple constituencies. From the NAEP assessment program, it is clear that the nation needs a new definition of educational progress. The program should thus provide a more comprehensive picture of education in America. Pellegrino, Jones & Mitchell (1999) accordingly mentioned that NAEP should be only one component of a more comprehensive, integrated system on teaching and learning in America’s schools. The current NAEP achievement survey fails to capitalize on contemporary research, theory, and practice in the disciplines in ways that support in-depth interpretations of student knowledge and understanding (Pellegrino, Jones & Mitchell, 1999).
Full student involvement is not evident in the NAEP assessment process. Palomba & Banta (2001) say that an “important challenge with this approach is whether or not students will take seriously their responsibilities to participate in assessment activities” (p. 24). In this context, the assessment teams should promote course-embedded assessment so as to draw on the natural motivation of students to do well in their studies. These assessment programs should also encourage students to serve on program assessment committees, participate in the design of portfolio or other assessment projects, critique existing techniques through focus groups and assist in conducting assessment projects (Palomba & Banta, 2001).
Evaluation measures used
The NAEP assessment process comprises two elements: the long-term trend assessments and the main assessments (NAEP, 2004). The long-term trend assessment process uses substantially the same assessments each time a subject is evaluated, in order to determine student progress in that subject over time (Kitmitto & Mello, 2008). On the other hand, the main assessments are regularly adapted to reflect current curriculum policies, content currently in use in the nation’s schools, and enhancements in techniques of educational measurement (NAEP, 2004).
NAEP staff administers the long-term trend assessment to students who are part of the NAEP long-term trend sample (NAEP, 2011). Students are asked to complete a questionnaire to provide context for the results. NAEP (2011) indicated that the questionnaires are presented to students as an intact form in the revised assessment. However, in the original assessment given in previous years, these questions were interspersed with the cognitive items (NAEP, 2011).
NAEP’s student evaluation measures should reach beyond the capacities of large-scale survey methods. Pellegrino, Jones & Mitchell (1999) indicated that “the current assessments do not test portions of the current NAEP frameworks well and are ill-suited to conceptions of achievement that address more complex skills” (p. 65). In this context, student achievement should be more broadly defined by NAEP frameworks and measured using methods that are matched to the subjects, skills and populations of interest (Kitmitto & Mello, 2008).
In the NAEP assessment process, the evaluation plan should be shared with students ahead of time to make the learning targets clearer. This implies that students should be involved throughout a unit, by establishing where each day’s instruction fits into the plan or by writing practice test questions periodically for each cell of the plan as a form of review (Stiggins, Arter & Chappuis, 2004). Data collected in the assessment process should provide quality comments on which areas require significant improvement. Mann (2006) says that the teams involved should conduct further analysis to examine patterns in the quality of assessment comments from the data. According to Mann (2006), students’ ability to critique, assess and evaluate improves with practice.
Additional measures needed
Critics of the NAEP assessment program, especially groups representing specific vested interests, have raised several complaints about the nature and implications of writing prompts developed by test makers (Flood, 2003). Flood (2003) noted that concerns about privacy invasion, racial, ethnic, religious and regional biases, and prompts exceeding the background limits of writers found voice in these complaints. Preskill & Catsambas (2006) say that there is always a need to allow for more in-depth responses regarding in-person training while conducting the assessment. This will increase the effectiveness, usefulness, and appeal of current training and support.
According to Evans, Economy & Forney (2009), legitimate concerns should be expressed about the lack of inclusiveness in the whole assessment process. The process should not be based on gender, race, ethnicity or the dominant groups within institutions. Rather, the process should measure students’ knowledge of basic facts in specific areas. A further measure should be the inclusion of many constructs rather than one assessment scheme. Evans, Economy & Forney (2009) found that combining all notions can result in confounding notions, especially in terms of what is perceived as being more developed. With these additional measures in place, the end result of NAEP assessment should demonstrate to what extent teachers and students meet their objectives (Clark & Zimmerman, 2004). The process should inform students about what they need to do to improve and at the same time provide teachers with information that can help them recognize their successes and make revisions when required (Clark & Zimmerman, 2004).
How this supplementary data can enhance the results and conclusions
The differences observed among racial/ethnic subgroups can almost certainly be associated with a broad range of socioeconomic and educational factors and may not be fully addressed by the NAEP assessment program (Williams, 1995). As a result, the NAEP assessment program should bring out the differences between public and non-public schools. The process should also articulate differences in reading performance and its effectiveness. In addition, educational programs within the states, the challenges posed by economic constraints and student demographic demands should be addressed by the program (Williams, 1995). Davis (2007) says that the NAEP average scores should depict progress over a certain period of time and show where declines occur in the average scores of all graders. The supplementary data will help the evaluators to close the gaps among ethnic groups and promote equity in the nation’s educational systems (Davis, 2007).
The supplementary data will enhance results and conclusions because the measures will reduce the program’s high sampling and administration costs. Teddlie & Reynolds (2000) stated that “the data will ensure that substantial modifications are made to test the design itself, so that the later assessment resembles traditional standardized achievement tests” (p. 275). They further say that despite the compromises made in the overall program, the data will promote a concerted effort over the years to maintain special bridge samples whose testing conditions are comparable to the earlier NAEP assessments.
The recommendations and results evident in these two assessment processes are meant to portray some aspects of the condition of education in the country. They are best viewed as suggesting ideas to be examined further in light of other assessment processes. They also elaborate on the many issues contributing to educational achievement among teachers and students. A critical review of both IAR and NAEP has provided a dependable measure of their achievement and of where improvement is required at the national and state levels in various grades and subjects. The motivation behind the IAR and NAEP assessment processes should be the success and improvement of student performance at the state and national levels, not distinctions based on race, disability, ethnicity and gender.