Validity is the degree to which a measurement accurately captures the concept it is intended to measure in a study. It concerns the appropriateness and meaningfulness of the inferences drawn from test scores. Validity has conventionally been viewed as taking several forms: content validity, criterion validity, and construct validity, among others. However, current paradigms treat validity as a unitary concept (Cook, 2005). As a result, validity evidence should be gathered from multiple sources to support an interpretation. Greater attention to the systematic collection of validity evidence will improve psychometric testing and evaluation.
Validity is achieved when a test score corresponds to the attribute being measured. A test score is useful only when definite evidence of that correspondence has been obtained through definite means (Marakas, Johnson, & Clay, 2007). Moreover, test scores should be valid for the purpose intended. Psychologists follow rules of evidence to establish that a test score carries meaning for a specific purpose. To attain validity, therefore, the inferences drawn from test findings must be reasonable, appropriate, and meaningful (Saccuzzo & Kaplan, 2008).
At the initial stages of data collection in any study, there is no reason to consider a test score valid. Evidence for the validity of a test score is obtained through correlations between the test values and other variables in the study. Just as evidence of guilt in a courtroom is expected to be convincing, evidence is required to demonstrate the validity of a test and to establish a connection between the variables under consideration (Cook, 2005).
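As a minimal sketch of this idea, the snippet below computes the Pearson correlation between scores from a new test and an established criterion measure. The data and the helper name `pearson_r` are hypothetical, chosen only for illustration; a strong correlation would count as one piece of validity evidence, never as proof on its own.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical data: six examinees' scores on a new test
# and on an established criterion measure.
new_test = [12, 15, 9, 20, 17, 11]
criterion = [48, 55, 40, 70, 62, 45]

r = pearson_r(new_test, criterion)
print(round(r, 3))
```

A correlation near 1.0 here would support interpreting the new test's scores in terms of the criterion construct, while a correlation near zero would undermine that interpretation.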
It is worth mentioning that validity is not a property of the instrument used in a study, but of the instrument's results and their interpretations. This means the same instrument can be used to measure different variables with different interpretations. Since validity concerns inferences, not the instrument itself, it must be established separately for every intended interpretation (Cohen, Fu, & Fu, 2008).
It would be wrong to treat validity as an all-or-nothing property, since that implies it can exist only at two extremes. Validity attaches to the inferences drawn from an instrument's measurements and is reflected as a degree of accuracy, not a perfect value. The idea of validity resembles a hypothesis (Cook, 2005): researchers first state their hypothesis and then gather enough evidence to support the appropriateness of their argument. Similarly, validity requires information that links the interpreted test values to the theory behind a specific study and to the preferred interpretation. Nevertheless, validity can never be proven outright, since it is only expressed as a degree of appropriateness (Cook, 2005).
The new unitary approach has reframed validity. All validity is grouped under the single framework of construct validity. This approach holds that test scores obtained from an instrument are useful when they represent a construct, and that sufficient evidence is needed to support the link between the two. The other forms of validity are then regarded as sources of validity evidence in support of this general framework (Cook, 2005). There is no sharp distinction between the categories of validity, since they overlap; collectively they support the inferences drawn from test values about the construct.
Content evidence entails analyzing the correspondence between the content of a test and the construct a researcher aims to measure (Cook, 2005). The content must faithfully represent the construct in order to yield the intended information. The process involves scrutiny of the instrument to be used, development of items chosen to represent the needed information, and verification of the qualifications of the personnel involved (Marakas, Johnson, & Clay, 2007). Detailed documentation of these steps is broadly grouped as content evidence, and it helps to represent the construct.
Response process evidence is an important source of review, shedding light on the relationship between the construct and the nature of the responses given (Cook, 2005). For instance, a tutor might ask students taking a test intended to measure diagnostic reasoning whether they actually invoked higher-order thinking. Additionally, if an instrument requires students to assess one another's performance, response evidence would establish that the raters were well trained. Lastly, this category also incorporates data collection and security procedures (Cohen, Fu, & Fu, 2008).
The presence or absence of a relationship between test scores and scores from another instrument supports or undermines an interpretation tied to the underlying construct. For instance, in a quality-of-life assessment, variation in scores among patients with differing health states would support validity (Cook, 2005). Analysis of the results of an evaluation can also reveal possible sources of invalidity. For instance, a test might be carried out to determine teachers' ratings in a certain school.
If the assessment indicates that male teachers are rated lower than their female counterparts, a number of inferences are possible (Cook, 2005). First, the results may be ambiguous. Second, the data may imply that male teachers are less effective than their female colleagues. In such a scenario, corroborating evidence is needed to link the test scores to the original construct before the finding can influence the validity of the inferences (Saccuzzo & Kaplan, 2008).
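One way such a group difference could be probed is a two-sample comparison. The sketch below uses made-up rating data and a Welch's t statistic purely for illustration, not as the study's actual analysis; even a large t value would not, by itself, resolve which of the competing interpretations above is correct.

```python
import math

def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))

# Hypothetical rating data for two groups of teachers.
male_ratings = [3.1, 3.4, 2.9, 3.0, 3.3]
female_ratings = [3.8, 4.0, 3.6, 3.9, 3.7]

t = welch_t(female_ratings, male_ratings)
print(round(t, 2))
```

A statistically notable difference would only flag that something needs explaining; whether it reflects genuine differences in effectiveness or a flaw in the rating instrument is exactly the validity question the text describes.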
Moreover, consequences evidence can be evaluated by determining whether the assessment achieved its intended effects and avoided unintended ones. If a faculty authority dismissed the teachers with lower test scores, the unintended negative consequences would certainly affect the meaning of the scores and hence their validity (Saccuzzo & Kaplan, 2008). Conversely, if remediation was provided and the low performance improved, the evidence would support the validity of the interpretations. This illustrates the controversial role of consequences evidence in judgments of validity.
If the available evidence does not support the original validity argument, the argument can be rejected or improved by fine-tuning the measurement process, and the new argument is then re-evaluated. Validity is thus a cyclic process of testing and revision. The amount of evidence needed depends on the anticipated uses of the instrument (Cook, 2005): situations requiring high confidence in the accuracy of interpretations demand more evidence than situations requiring lower confidence in similar interpretations.
In addition, integrating the various sources of evidence means that some measurement instruments will rely more heavily on certain categories of validity evidence than on others. Unifying the categories of validity into a single framework means that gathering evidence requires insight and careful planning. Researchers should use the sources of validity evidence as a basic framework when assessing instruments (Saccuzzo & Kaplan, 2008).
Validity concerns the degree to which test values reflect the underlying construct, and the interpretation of results according to the evidence provided; it does not attach to the measurement instrument itself (Cook, 2005). A precise understanding of validity in psychometric tests and other evaluations greatly benefits professionals in these fields. Consequently, greater attention to the systematic gathering and appraisal of validity evidence will translate into improvements in research, education, and health care.