Task 2. Fool the assistant


GGZ2030. Psychodiagnostics. Task 2 elaborated: Fool the assistant. The notes from the tutorial are marked in green.


June 30, 2020 • 20 pages • 2019/2020
Understand it on a conceptual level (understand the formula; you don’t really have to
calculate it).

What is reliability?

Kaplan, R. M., & Saccuzzo, D. P. (2018). Psychological testing: principles, applications, &
issues. Chapter 4: Reliability

Tests that are relatively free of measurement error are considered to be reliable.

History of reliability
Serious measurement error occurs in most physical, social, and biological sciences.
The advanced development of reliability assessment is owed to the work of Spearman.
Reliability theory puts the two concepts of sampling error and product moment correlation
together in the context of measurement. Cronbach et al. later made a major advance by
developing methods for evaluating many sources of error in behavioural research. Reliability
theory continues to evolve.

Theory of reliability
Classical test score theory assumes that each person has a true score that would be
obtained if there were no errors in measurement. However, because measuring instruments
are imperfect, the score observed for each person almost always differs from the person’s
true ability or characteristic. The difference between the true score and the observed score
results from measurement error. So, the observed score (X) has two components: a true
score (T) and an error component (E): $X = T + E$




An assumption in this theory is that errors of measurement are random. Although systematic
errors are common in most measurement problems, they are less likely than other errors to
lead to wrong conclusions. If you always misread a ruler by 2 cm (a systematic error), you
would still cut equal pieces.
Basic sampling theory tells us that the distribution of random errors is bell shaped. Thus, the
center of the distribution should represent the true score, and the dispersion around the
mean of the distribution should display the distribution of sampling errors. The true score
can be estimated by finding the mean of the observations from repeated applications.
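A small simulation can make this concrete. The sketch below assumes a hypothetical true score of 100 and normally distributed random error with a standard deviation of 5 (both values are illustrative and not taken from the source). Under those assumptions, the mean of many repeated observations should land close to the true score, and the spread of the observations should be close to the spread of the errors.

```python
import random
import statistics

def simulate_observed_scores(true_score, error_sd, n_applications, seed=0):
    """Return observed scores X = T + E for repeated applications of a test."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_applications)]

# Illustrative values only: T = 100, random error with SD = 5.
observed = simulate_observed_scores(true_score=100, error_sd=5, n_applications=10_000)

estimated_true_score = statistics.mean(observed)  # centre of the distribution
error_spread = statistics.stdev(observed)         # dispersion around the centre

print(round(estimated_true_score, 1), round(error_spread, 1))
```

A constant (systematic) error would shift every observation by the same amount, so it would move the mean but leave the spread unchanged.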

The dispersion around the true score tells us how much error there is in the measure.
Classical test theory assumes that the true score for an individual will not change with
repeated applications of the same test. Because of random error, however, repeated
applications of the same test can produce different scores. Theoretically, the standard
deviation of the distribution of errors for each person tells us about the magnitude of
measurement error. Because we usually assume that the distribution of random errors will
be the same for all people, classical test theory uses the standard deviation of errors as the
basic measure of error. This is called the standard error of measurement.
The mean of the distribution of scores from repeated applications will be an estimate of the
true score. The standard deviation will be the standard error of measurement. It tells us, on
average, how much a score varies from the true score. In practice, the standard deviation of
the observed score and the reliability of the test are used to estimate the standard error of
measurement.
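The notes state this last step only in words. The usual classical test theory estimate is the observed standard deviation multiplied by the square root of one minus the reliability. The sketch below applies that formula; the function name and the example values (a standard deviation of 15 and a reliability of 0.91) are illustrative assumptions.

```python
import math

def standard_error_of_measurement(observed_sd, reliability):
    """Estimate the standard error of measurement as S * sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return observed_sd * math.sqrt(1.0 - reliability)

# Illustrative values: observed-score SD of 15, reliability of 0.91.
sem = standard_error_of_measurement(observed_sd=15, reliability=0.91)
print(round(sem, 2))  # about 4.5
```

A perfectly reliable test (r = 1) gives a standard error of zero, and a test with no reliability (r = 0) gives a standard error equal to the observed standard deviation.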

The domain sampling model
The domain sampling model is another central concept in classical test theory. It considers
the problems created by using a limited number of items to represent a larger and more
complicated construct. When evaluating something, there is rarely time to go systematically
through every possible item, so a sample of items is used instead.
The model conceptualizes reliability as the ratio of the variance of the observed score on the
shorter test and the variance of the long-run true score. The error is introduced by using a
sample of items rather than the entire domain. As the sample gets larger, it represents the
domain more accurately. So, the greater the number of items, the higher the reliability.
In a test, each item is a sample of the ability or behavior to be measured.
Reliability can be estimated from the correlation of the observed test score with the true
score. However, finding the true scores is not practical and is rarely possible. As they are not
available, the alternative is to estimate what they would be. Because of sampling error,
different random samples of items might give different estimates of the true score. The
distribution of these estimates should be random and normally distributed. If many tests are
created by sampling from the same domain, a normal distribution of unbiased estimates of
the true score should be found.
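The claim that more items give higher reliability can be illustrated with the Spearman-Brown formula. This formula is not named in these notes and is brought in here only as a standard way to project the reliability of a lengthened test. The starting reliability of 0.50 is a made-up value.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is made length_factor times as long.

    Assumes the added items are comparable to the existing ones.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Illustrative starting point: a short test with reliability 0.50.
for factor in (1, 2, 4, 8):
    print(factor, round(spearman_brown(0.50, factor), 2))
```

Each doubling of test length raises the projected reliability, but by a smaller amount each time.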

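Theorists have demonstrated that an unbiased estimate of a test’s reliability is given by the
square root of the average correlation between that test and all other randomly parallel tests
from the domain: $r_{1t} = \sqrt{\bar{r}_{1j}}$, where $r_{1t}$ is the correlation between
test 1 and the true score and $\bar{r}_{1j}$ is the average correlation between test 1 and
the other tests $j$. According to this formula, the correlation between two randomly parallel
tests will be smaller than the estimated correlation between one of the tests and the true
score.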
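The square-root relationship described here can be checked with a few numbers. The sketch below uses three made-up correlations between one test and other randomly parallel tests. The function and variable names are hypothetical.

```python
import math
import statistics

def correlation_with_true_score(correlations_with_parallel_tests):
    """Square root of the average correlation with the other parallel tests."""
    return math.sqrt(statistics.mean(correlations_with_parallel_tests))

# Illustrative correlations between test 1 and three other parallel tests.
correlations_with_other_tests = [0.64, 0.60, 0.68]

average_r = statistics.mean(correlations_with_other_tests)
estimate = correlation_with_true_score(correlations_with_other_tests)

print(round(average_r, 2), round(estimate, 2))
```

Because the average correlation lies between 0 and 1, its square root is larger than the correlation itself, which is why the correlation between two parallel tests is smaller than the estimated correlation with the true score.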

Item response theory
Most methods for assessing reliability depend on classical test theory. However, a growing
movement is turning away from this theory. This is because classical test theory requires
that exactly the same test items be administered to each person.
A newer approach, the item response theory (IRT), has been developed. Using IRT, the
computer is used to focus on the range of item difficulty that helps assess an individual’s
ability level. For example, if the person gets several easy items correct, the computer might
move to more difficult items. Then, this level of ability is intensely sampled. The overall
result is that a more reliable estimate of ability is obtained using a shorter test with fewer
items. However, the method requires a bank of items that have been systematically
evaluated for level of difficulty. Complex computer software is also required.
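The adaptive idea can be sketched in a few lines. What follows is a deliberately simplified staircase procedure, not a real IRT estimator: it assumes a hypothetical item bank with known difficulties and a test taker who answers an item correctly whenever its difficulty is at or below their ability. All names and values are illustrative.

```python
def adaptive_test(item_bank, answers_correctly, n_items=6, start=0.0, step=2.0):
    """Pick the unused item closest to the current estimate, then adjust it.

    The estimate moves up after a correct answer and down after an error,
    and the step is halved each time so the sampling narrows in on one level.
    """
    estimate = start
    remaining = sorted(item_bank)
    administered = []
    for _ in range(min(n_items, len(remaining))):
        item = min(remaining, key=lambda difficulty: abs(difficulty - estimate))
        remaining.remove(item)
        correct = answers_correctly(item)
        administered.append((item, correct))
        estimate += step if correct else -step
        step /= 2
    return estimate, administered

# Hypothetical bank of 25 items with difficulties from -3.0 to 3.0.
item_bank = [i * 0.25 for i in range(-12, 13)]

TRUE_ABILITY = 1.3  # unknown in practice; fixed here to drive the simulation
final_estimate, administered = adaptive_test(
    item_bank, answers_correctly=lambda difficulty: difficulty <= TRUE_ABILITY
)

print(final_estimate, len(administered), "of", len(item_bank), "items used")
```

In this toy run only 6 of the 25 items are administered, which mirrors the point that a shorter test can still give a usable ability estimate when items are chosen near the person's level.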

Models of reliability
Guidelines require that a test be reliable before one can use it. Most reliability coefficients
are correlations. The reliability coefficient is the ratio of the variance of the true scores on a
test to the variance of the observed scores: $r = \sigma_T^2 / \sigma_X^2$, where
T = true scores
X = observed scores

The ratio of true score variance to observed score variance can be thought of as a
percentage. A reliability of 0.40 means that 40% of the variation or difference among the
people will be explained by real differences among people, and 60% must be ascribed to
random or chance factors.
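The percentage reading can be reproduced with a short calculation. The sketch below assumes, as classical test theory does, that errors are random and unrelated to true scores, so the observed variance is the sum of true-score variance and error variance. The variance values are made up to match the 0.40 example.

```python
def reliability_coefficient(true_variance, error_variance):
    """Ratio of true-score variance to observed-score variance."""
    observed_variance = true_variance + error_variance  # X = T + E, independent E
    if observed_variance <= 0:
        raise ValueError("observed variance must be positive")
    return true_variance / observed_variance

# Illustrative variances chosen to reproduce the 0.40 example.
r = reliability_coefficient(true_variance=40.0, error_variance=60.0)
print(f"{r:.0%} true differences, {1 - r:.0%} random error")
```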

Sources of error
An observed score may differ from a true score. There may be situational factors (e.g. loud
noises, temperature). Also, the items on the test might not be representative of the domain
(e.g. you can spell 96% of English words correctly, but the test included exactly those items
you could not spell).
Test reliability is usually estimated in one of three ways. In the test-retest method, we
consider the consistency of the test results when the test is administered on different
occasions. Using the parallel forms method, we evaluate the test across different forms of
the test. With the internal consistency method, we examine how people perform on similar
subsets of items selected from the same form of the measure. Each approach is based on a
different source of variability.

Time sampling: the test-retest method
Test-retest reliability estimates are used to evaluate the error associated with administering
a test at two different times. This type of analysis is of value only when we measure "traits"
or characteristics that do not change over time. If a test of such a characteristic is
administered at two points in time and produces different scores, we might conclude that the
difference is a result of random measurement error. Tests that measure some constantly
changing characteristic are not appropriate for test-retest evaluation.
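In practice, a test-retest estimate is the correlation between scores from the two administrations. The sketch below computes a Pearson correlation by hand for five hypothetical test takers. The scores are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(ss_x * ss_y)

# Hypothetical scores of the same five people at time 1 and time 2.
time_1 = [10, 12, 14, 16, 18]
time_2 = [11, 12, 15, 15, 19]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 2))  # roughly 0.96
```

A value close to 1 means people keep nearly the same relative position across the two occasions. The function divides by zero if everyone has the same score on one occasion, so it is only a sketch.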
Test-retest reliability is relatively easy to evaluate. However, you need to consider many
other details besides the methods for calculating the test-retest reliability coefficient. A
carryover effect should always be considered. This occurs when the first testing session
influences scores from the second session (e.g. test takers sometimes remember their
answers from the first time they took the test). When there are carryover effects, the test-
