PSYCHOLOGICAL ASSESSMENT
PSY 109
KRISTOFFER SOTOZA, RPm | 2026

CHAPTER 1
PSYCHOLOGICAL TESTING & ASSESSMENT
Testing and Assessment
Psychological Testing and Assessment Defined
  • Testing is the process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior.
  • Psychological assessment is the gathering and integration of psychology-related data for the purpose of making a psychological evaluation, accomplished through the use of tools such as tests, interviews, case studies, behavioral observation, and specially designed apparatuses and measurement procedures.
  • Psychological testing is defined as the process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior.
  • In contrast to the process of administering, scoring, and interpreting psychological tests (psychological testing), psychological assessment may be conceived as a problem-solving process that can take many different forms.
Variety of Assessment
  1. Educational Assessment - the use of tests and other tools to evaluate abilities and skills relevant to success or failure in a school or preschool context. Eg. intelligence tests, achievement tests, and reading comprehension tests.
  2. Retrospective Assessment - the use of evaluative tools to draw conclusions about psychological aspects of a person as they existed at some point in time prior to the assessment.
  3. Remote Assessment - the use of tools of psychological evaluation to gather data and draw conclusions about a subject who is not in physical proximity to the person or people conducting the evaluation.
  4. Ecological Momentary Assessment (EMA) - the "in the moment" evaluation of specific problems and related cognitive and behavioral variables at the very time and place that they occur.
The Process of Assessment
  1. Begins with a referral for assessment from a source. Eg. teacher, school psychologist, counselor, judge, clinician, or corporate human resources specialist.
  2. One or more referral questions are put to the assessor about the assessee.
  3. The assessor may meet with the assessee or others before the formal assessment in order to clarify aspects of the reason for referral.
  4. The assessor prepares for the assessment by selecting the tools of assessment to be used.
  5. The formal assessment begins.
  6. After the assessment, the assessor writes a report of the findings.
  7. More feedback sessions with the assessee and/or interested third parties may follow.
  • Collaborative Psychological Assessment - the assessor and assessee may work as "partners" from initial contact through final feedback.
  ֍ Therapeutic Psychological Assessment - therapeutic self-discovery and new understandings are encouraged throughout the assessment process.
  • Dynamic Assessment - refers to an interactive approach to psychological assessment that usually follows a model of (1) evaluation, (2) intervention of some sort, and (3) evaluation. Most typically employed in educational settings.

  ֍ Dynamic: used to describe the interactive, changing, or varying nature of the assessment.

The Tools of Psychological Assessment
I. The Test

  • Test- a measuring device or procedure.
  • Psychological test- a device or procedure designed to measure variables related to psychology (such as intelligence, personality, aptitude, interests, attitudes, or values).
  • Psychological tests and other tools of assessment may differ with respect to a number of variables, such as:
    1. Content (Subject matter) - different test developers come to the test development process with different theoretical orientations.
    2. Format- pertains to the form, plan, structure, arrangement, and layout of test items as well as to related considerations such as time limits. Format is also used to refer to the form in which a test is administered: computerized, pencil-and- paper, or some other form.
    3. Administration Procedures
      • One-to-One Basis - require an active and knowledgeable test administrator.
      • Group Administration - may not even require the test administrator to be present while the test takers independently complete the required tasks.
    4. Scoring and Interpretation and Procedures
      • Score - a code or summary statement that reflects an evaluation of performance on a test, task, interview, or some other sample of behavior. o Can be self-scored, scored by computer, or require scoring by trained examiners.
      • Scoring - the process of assigning such evaluative codes or statements to performance on tests, tasks, interviews, or other behavior samples.
      • Cut Score (Cutoff Score) - a reference point, usually numerical, derived by judgment and used to divide a set of data into two or more classifications.
    5. Psychometric Soundness (Technical Quality)
      • Technical quality pertains to reliability, validity, and utility.
      • Psychometrics - the science of psychological measurement.
      • Psychometrist and Psychometrician - a professional who uses, analyzes, and interprets psychological test data.
II. Interview - a method of gathering information through direct communication involving a reciprocal exchange. Can involve both verbal and nonverbal behavior: language, movements, facial expressions, the extent of eye contact, apparent willingness to cooperate, and general reaction and appearance.
  • Panel Interview (Board Interview) - more than one interviewer participates in the assessment.
    ֍ Advantage: any idiosyncratic biases of a lone interviewer will be minimized.
    ֍ Disadvantage: the cost of using multiple interviewers.
  • Motivational Interview - therapeutic dialogue that combines person-centered listening skills with the use of cognition-altering techniques designed to positively affect motivation and effect therapeutic change.
III. Portfolio - samples of one's ability and accomplishment; files of work products, whether retained on paper, canvas, film, video, audio, or some other medium.
IV. Case History Data - refers to records, transcripts, and other accounts in written, pictorial, or other form that preserve archival information, official and informal accounts, and other data and items relevant to an assessee.
    o Files or excerpts from files maintained at institutions and agencies (schools, hospitals, employers).
    o Letters and written correspondence, photos and family albums, newspaper and magazine clippings, home videos, movies, audiotapes, work samples, artwork, doodlings, and accounts and pictures.
    o Postings on social media.
  • Case Study (Case History) - a report or illustrative account concerning a person or an event, compiled on the basis of case history data.
V. Behavioral Observation - defined as monitoring the actions of others or oneself by visual or electronic means while recording quantitative and/or qualitative information regarding those actions.
  • Naturalistic Observation - observing behaviors in real-world settings, or in the setting in which the behavior would typically be expected to occur.
VI. Role Play Tests

  • Diagnostic Test - a tool of assessment used to help narrow down and identify areas of deficit to be targeted for intervention.
  • Informal Evaluation - a typically nonsystematic assessment that leads to the formation of an opinion or attitude.
II. Clinical Settings - tests are used to help screen for or diagnose behavior problems and are employed with only one individual at a time. Eg. intelligence tests, personality tests, neuropsychological tests, or other specialized instruments.
III. Counseling Settings - schools, prisons, and governmental or privately owned institutions.
    ֍ Goal: improvement of the assessee in terms of adjustment, productivity, or some related variable.
    ֍ Measures of social and academic skills and measures of personality, interest, attitudes, and values.
IV. Geriatric Settings
  • Older people may require psychological assessment to evaluate cognitive, psychological, adaptive, or other functioning.
    ֍ At issue is the extent to which assessees are enjoying as good a quality of life as possible.
    ֍ Quality of Life - variables related to perceived stress, loneliness, sources of satisfaction, personal values, quality of living conditions, and quality of friendships and other social support.
  • Screening for cognitive decline and dementia.
    ֍ Dementia - loss of cognitive functioning that occurs as the result of damage to or loss of brain cells.
    ֍ Alzheimer's Disease - a form of dementia.
    ֍ Pseudodementia - severe depression in the elderly can contribute to cognitive functioning that mimics dementia.
V. Business and Military Settings
  • Achievement, aptitude, interest, motivational, and other tests.
  • Decisions regarding promotions, transfer, job satisfaction, and eligibility for further training.
  • Engineering Psychologists - employ a variety of existing and specially devised tests in research designed to help people at home, in the workplace, and in the military.
  • Marketing and Sale of Products - tests help corporations predict the public's receptivity to a new product, a new brand, or a new advertising or marketing campaign.
VI. Governmental and Organizational Credentialing - governmental licensing, certification, or general credentialing of professionals.
VII. Academic Research Settings - those who publish research should ideally have a sound knowledge of measurement principles and tools of assessment.
VIII. Other Settings
  • Court trials, program evaluations.
  • Health Psychology - discipline that focuses on understanding the role of psychological variables in the onset, course, treatment, and prevention of illness, disease, and disability.

How Are Assessments Conducted?
I. Before the Test
  1. Test users have discretion with regard to the tests administered, they should select and use only the test or tests that are most appropriate for the individual being tested.
  2. Test should be stored in a way that reasonably ensures that its specific contents will not be made known to the test taker in advance.
  3. The test administrator must be familiar with the test materials and procedures and must have at the test site all the materials needed.
  4. Ensuring the room is suitable and conducive.
  • Protocol - the form, sheet, or booklet on which a testtaker's responses are entered; also, a description of a set of test- or assessment-related procedures.
II. During the Administration
  • Rapport - the working relationship between the examiner and the examinee.
III. After the Administration
  1. Responsibilities range from safeguarding the test protocols to conveying the test results in a clearly understandable fashion.
  2. Make a note of any out-of-the-ordinary events in the report of the testing.
  3. Scoring needs to conform to pre-established scoring criteria.
  4. Scores must be interpreted responsibly, in accordance with established procedures and ethical guidelines.
IV. Assessment of People with Disabilities
  • Alternative Assessment - an evaluative or diagnostic procedure or process that varies from the usual, customary, or standardized way a measurement is derived, either by virtue of some special accommodation made to the assessee or by means of alternative methods designed to measure the same variable(s).
  • Accommodation - the adaptation of a test, procedure, or situation, or the substitution of one test for another, to make the assessment more suitable for an assessee with exceptional needs.
  • Four variables to consider:

  1. The capabilities of the assessee.
  2. The purpose of the assessment.
  3. The meaning attached to test scores.
  4. The capabilities of the assessor.
Where to Go for Authoritative Information: Reference Sources
  • Test Catalogues - contain only a brief description of the test and seldom contain the kind of detailed technical information that a prospective user might require.
  • Test Manuals - detailed information concerning the development of a particular test and technical information relating to it.
  • Professional Books - many books written for an audience of assessment professionals are available to supplement, reorganize, or enhance the information typically found in the manual of a very widely used psychological test.
  • Reference Volumes
    ֍ Mental Measurements Yearbook
      o Compiled by Oscar Buros in 1938.
      o An authoritative compilation of test reviews.
    ֍ Tests in Print
      o Published by the Buros Center.
      o Lists all commercially available English-language tests in print.
      o Provides detailed information for each test listed, including test publisher, test author, test purpose, intended test population, and test administration time.
  • Journal Articles - contain reviews of the test, updated or independent studies of its psychometric soundness, or examples of how the instrument was used in either research or an applied context.
  • Online Databases
    ֍ Educational Resources Information Center (ERIC)
      o One of the most widely used bibliographic databases for test-related publications.
    ֍ ERIC website at www.eric.ed.gov
      o Contains a wealth of resources and news about tests, testing, and assessment.

CHAPTER 2
HISTORICAL, CULTURAL, & LEGAL/ETHICAL CONSIDERATIONS

Historical Perspective
Antiquity to the Nineteenth Century
I. China (as early as 2200 BCE)
  • Testing was used as a means of selecting who would obtain government jobs.
    ֍ Only open to men.
    ֍ Examined proficiency in subjects like music, archery, horsemanship, writing, and arithmetic, as well as agriculture, geography, civil law, and military strategy.
    ֍ Merits of passing the examination:
      o Wearing special garb and being accorded special courtesies by anyone.
      o Exemption from taxes.
      o Exemption from government-sponsored interrogation by torture.
  • Song (or Sung) Dynasty (960 to 1279 CE)
    ֍ Emphasized knowledge of classical literature.
      o Those who passed were deemed to have acquired the wisdom of the past and were therefore entitled to a government position.
II. Ancient Greco-Roman Writings - categorized people in terms of personality types.
    ֍ Types were attributed to an overabundance or deficiency in some bodily fluid.
III. Charles Darwin (1809–1882)
  • On the Origin of Species by Means of Natural Selection
  • Chance variation in species would be selected or rejected by nature according to adaptivity and survival value.
    ֍ Spurred scientific interest in individual differences.
IV. Francis Galton
  • Charles Darwin’s half cousin.
  • Sought to explore and quantify individual differences between people.
    ֍ Aimed to classify people "according to their natural gifts" and to ascertain their "deviation from an average."
    ֍ Contributed to the development of many contemporary tools of psychological assessment, including questionnaires, rating scales, and self-report inventories.


  • Belgian psychologist (Ovide Decroly) informed Goddard of Binet’s work and gave him a copy of the Binet-Simon Scale.
  • Psychological Testing at Ellis Island - Goddard reported most immigrants from various nationalities to be mentally deficient when tested.
  • Eugenics - science of improving the qualities of a breed through intervention with factors related to heredity. Mentally deficient individuals should be segregated or institutionalized and not be permitted to reproduce.
  • The Kallikak Family: A Study in the Heredity of Feeble-Mindedness - traced the lineage of one of his students at the Vineland school back five generations.
  • Culture-Specific Tests - attempt to "isolate" the cultural variable in the test; tests designed for use with people from one culture or group but not from another.

Some Issues Regarding Culture and Assessment
I. Verbal Communication
  1. The examiner and the examinee must speak the same language.
  2. If the test is in written form and includes written instructions, the testtaker must be able to read and comprehend what is written.
  3. When assessment is conducted with the aid of a translator, subtle nuances of meaning may be lost in translation, or unintentional hints to the correct or more desirable response may be conveyed.
  4. A trained examiner may detect that an examinee's grasp of the language or dialect is too deficient to proceed.
II. Nonverbal Communication and Behavior
  • Messages conveyed by body language may differ from culture to culture.
III. Standards of Evaluation - judgments related to certain psychological traits can also be culturally relative.
  • Individualist Culture - characterized by value being placed on traits such as self-reliance, autonomy, independence, uniqueness, and competitiveness.
  • Collectivist Culture - value is placed on traits such as conformity, cooperation, interdependence, and striving toward group goals.
  • Cultural Formulation Interview (CFI) - consists of 16 questions, a way to collect information on patients' illness experience, social and cultural context, help-seeking, and treatment expectations relevant to psychiatric diagnosis and assessment. o Our own professional cultures—systems of knowledge, concepts, rules, and practices that are learned and transmitted across generations—mold our scientific interpretations that may not reflect the realities of health and illness in our patients’ lives.
Tests and Group Membership
  • If a test is used to evaluate a candidate's ability to do a job, the test should do just that, regardless of the group membership of the testtaker. Scores should not be affected by variables such as group membership, hair length, eye color, or any other variable.
  • Affirmative Action - voluntary and mandatory efforts undertaken by federal, state, and local governments, private employers, and schools to combat discrimination and to promote equal opportunity for all in education and employment.
Legal and Ethical Considerations
  • Laws - rules that individuals must obey for the good of the society as a whole.
  • Ethics - a body of principles of right, proper, or good conduct.
  • Code of Professional Ethics - defines the standard of care expected of members of that profession.
    ֍ Standard of Care - the level at which the average, reasonable, and prudent professional would provide diagnostic or therapeutic services under the same or similar conditions.
The Concerns of the Public
  • Common Core State Standards - the product of a state-led effort to bring greater interstate uniformity to what constituted proficiency in various academic subjects.
  • Sputnik - the launching of a satellite into space by the USSR.
    o In response, U.S. resources were allocated toward identifying the gifted children who would one day equip the United States to successfully compete with the Soviets.
  • National Defense Education Act - provided federal money to local schools for the purpose of testing ability and aptitude to identify gifted and academically talented students.
I. Legislation

  • Minimum Competency Testing Programs - formal testing programs designed to be used in decisions regarding various aspects of students’ education. USED in decision making about grade promotions, awarding of diplomas, and identification of areas for remedial instruction.
  • Truth-In-Testing Legislation - give testtakers a way to learn the criteria by which they are being judged. o Disclosure of answers to postsecondary and professional school admissions tests. o The test’s purpose and its subject matter. o The knowledge and skills the test purports to measure. o Procedures for ensuring accuracy in scoring. o Procedures for notifying testtakers of errors in scoring. o Procedures for ensuring the testtaker’s confidentiality.
  • Equal Employment Opportunity Commission (EEOC)- published sets of guidelines concerning standards to be met in constructing and using employment tests.
  • Quota System - a selection procedure whereby a fixed number or percentage of applicants from certain backgrounds were selected.
  • Discrimination - the practice of making distinctions in hiring, promotion, or other selection decisions that tend to systematically favor members of a majority group regardless of actual qualifications for positions.
  • Reverse Discrimination - practice of making distinctions in hiring, promotion, or other selection decisions that systematically tend to favor members of a minority group regardless of actual qualifications for positions.
  • Disparate Treatment - the consequence of an employer’s hiring or promotion practice that was intentionally devised to yield some discriminatory result or outcome.
  • Disparate Impact - the consequence of an employer's hiring or promotion practice that unintentionally resulted in a discriminatory result or outcome.
II. Litigation - the process of taking legal action; the court-mediated resolution of legal matters of a civil, criminal, or administrative nature.
  • PARC v. Commonwealth of Pennsylvania (1971)- PARC brought suit because children with intellectual disability in that state had been denied access to public education.
  • Mills v. Board of Education of District of Columbia (1972)- a similar lawsuit was filed on behalf of children with behavioral, emotional, and learning impairments.
  • Daubert v. Merrell Dow Pharmaceuticals - arose from Mrs. Daubert's use of the prescription drug Bendectin to relieve nausea during pregnancy, after which children were born with birth defects. The ruling held that expert testimony could be admissible whether or not such testimony had won general acceptance in the scientific community.
  • Frye v. the United States- scientific research is admissible as evidence when the research study or method enjoys general acceptance.
  • Rule 702 - Allowing more experts to testify regarding the admissibility of the original expert testimony. Enacted to assist juries in their fact- finding by helping them to understand the issues involved.
  • General Electric Co. v. Joiner (1997)- the Court emphasized that the trial court had a duty to exclude unreliable expert testimony as evidence.
  • Kumho Tire Company Ltd. v. Carmichael (1999)- the Supreme Court expanded the principles expounded in Daubert to include the testimony of all experts, whether or not the experts claimed scientific research as a basis for their testimony.
  • Missouri case of Zink v. State (2009) - after David Zink rear-ended a woman's car in traffic, Zink kidnapped, raped, mutilated, and murdered her. Zink argued that his death penalty should be set aside because of his mental disease, and that his defense attorney had failed to present "hard" evidence of a mental disorder, as indicated by a PET scan.

The Concerns of the Profession
I. Test-User Qualifications
  • Ethical Standards for the Distribution of Psychological Tests and Diagnostic Aids
  1. Level A - tests or aids that can adequately be administered, scored, and interpreted with the aid of the manual and a general orientation to the kind of institution or organization in which one is working. Eg., achievement or proficiency tests.
  2. Level B - tests or aids that require some technical knowledge of test construction and use and of supporting psychological and educational fields such as statistics, individual differences, psychology of

IV. The Right to the Least Stigmatizing Label - the least stigmatizing labels should always be assigned when reporting test results.

  • The Case of Jo Ann Iverson
    o She was administered a Stanford-Binet Intelligence Test.
    o Arden Frandsen reported her to be "feeble-minded, at the high-grade moron level of general mental ability."
    o Carmel Iverson brought a libel (defamation) suit.
CHAPTER 3
A STATISTICS REFRESHER

Scales of Measurement
  • Measurement - the act of assigning numbers or symbols to characteristics of things according to rules.
  • Scale - a set of numbers (or other symbols) whose properties model empirical properties of the objects to which the numbers are assigned.
    1. Continuous Scale - uncountable set of values or infinite set of values. An approximation of the “real” number.
    2. Discrete Scale - countable in a finite amount of time.
  • Error - the collective influence of all of the factors on a test score or measurement beyond those specifically measured by the test or measurement.
I. Nominal Scales - classification or categorization based on one or more distinguishing characteristics.
    ֍ Used exclusively for classification purposes; values cannot be meaningfully added, subtracted, ranked, or averaged.
    ֍ Eg. a yes or no response results in placement into one of a set of mutually exclusive groups: suicidal or not.
    ֍ Permissible arithmetic operations: counting how many cases fall into each category and determining the resulting proportions or percentages.
II. Ordinal Scales - classification and rank ordering.
    ֍ No absolute zero point, and the numbers do not indicate units of measurement.
      o Absolute Zero - the zero point represents the absence of the property being measured.
    ֍ Individuals are compared with others and assigned a rank.
    ֍ Alfred Binet - believed strongly that the data derived from an intelligence test are ordinal in nature; his test was not meant to measure people but merely to classify (and rank) them on the basis of their performance on the tasks.
    ֍ Rokeach Value Survey - a list of personal values is put in order according to their perceived importance to the testtaker.
    ֍ Eg. intelligence, aptitude, and personality test scores are, basically and strictly speaking, ordinal.
III. Interval Scales - contain equal intervals between numbers.
    ֍ No absolute zero point. Eg. if an individual were to achieve an IQ of 0, that would not indicate zero (the total absence of) intelligence.
    ֍ It is possible to average a set of measurements and obtain a meaningful result.
IV. Ratio Scales - have a true zero point. All mathematical operations can meaningfully be performed because there exist equal intervals between the numbers on the scale as well as a true or absolute zero point.
Describing Data
  • Distribution - a set of test scores arrayed for recording or study.
  • Raw Score - a straightforward, unmodified accounting of performance, usually numerical.
    ֍ Simple Tally - eg. the number of items responded to correctly on an achievement test. The data from the test could be organized into a distribution of the raw scores; one way the scores could be distributed is by the frequency with which they occur. In a frequency distribution, all scores are listed alongside the number of times each score occurred.
Frequency Distributions
  • Simple Frequency Distribution - individual scores are used and the data have not been grouped.
  • Grouped Frequency Distribution - test-score intervals (class intervals) replace the actual test scores.
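The two kinds of distribution can be illustrated with a minimal Python sketch (the scores below are hypothetical and not part of the original reviewer):

    # Simple vs. grouped frequency distribution for a set of hypothetical raw scores.
    from collections import Counter

    scores = [96, 88, 88, 85, 79, 79, 79, 72, 67, 62, 60]

    # Simple frequency distribution: each individual score listed with its frequency.
    simple = Counter(scores)
    for score in sorted(simple, reverse=True):
        print(score, simple[score])

    # Grouped frequency distribution: class intervals (width 10) replace the actual scores.
    grouped = Counter((score // 10) * 10 for score in scores)
    for lower in sorted(grouped, reverse=True):
        print(f"{lower}-{lower + 9}", grouped[lower])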

I. Graph - a diagram or chart composed of lines, points, bars, or other symbols that describe and illustrate data.

  1. Histogram - a graph with vertical lines drawn at the true limits of each test score (or class interval), forming a series of contiguous rectangles.
  2. Bar Graph - numbers indicative of frequency also appear on the Y-axis, and reference to some categorization appears on the X-axis.
  3. Frequency Polygon - expressed by a continuous line connecting the points where test scores or class intervals (X-axis) meet frequencies (Y-axis).

Measures of Central Tendency
A measure of central tendency is a statistic that indicates the average or midmost score between the extreme scores in a distribution.
I. Mean
  • Interval-level statistic.
  • Most stable and useful measure of central tendency.
  • The average; symbolized X̄ and pronounced "X bar".
  • Equal to the sum of the observations divided by the number of observations.
    ֍ X̄ = ΣX/n
    ֍ Summation Notation - denoted by Σ (meaning "the sum of").
    ֍ X - a test score.
    ֍ n - the number of test scores or observations.
II. Median
  • Middle score.
  • Ordinal in nature.
  • Useful when relatively few scores fall at the high or low end of the distribution.
    o Obtained by ordering the scores in a list by magnitude, in either ascending or descending order.
    o Odd number of scores - the median is the score exactly in the middle.
    o Even number of scores - the median is the arithmetic mean of the two middle scores.
III. Mode
  • The most frequently occurring score in a distribution of scores. o Tie - more than one mode. o Bimodal Distribution - there are two scores that occur with the highest frequency.
  • Least used measure of central tendency.
    o Two modes may each fall at the high or the low end of the distribution, thus violating the expectation that a measure of central tendency points to the middle of the distribution.
    o It is a nominal statistic and cannot legitimately be used in further calculations.
  • Useful in analyses of a qualitative or verbal nature.

Measures of Variability
A measure of variability is a statistic that describes the amount of variation in a distribution. Variability is an indication of how scores in a distribution are scattered or dispersed.
I. Range - equal to the difference between the highest and the lowest scores. The simplest measure of variability to calculate, but its potential use is limited. Eg. if the lowest score was 0 and the highest score was 100, the range would be equal to 100 − 0, or 100.
II. Interquartile and Semi-Interquartile Ranges - a distribution of test scores can be divided into four parts.
  • Quartiles - the dividing points between the four quarters in the distribution. Refers to a specific point.
  • Quarter - refers to an interval. ֍ Q2 and the median are the same. ֍ Q1 and Q3 the quarter-points.
  • Interquartile Range - a measure of variability equal to the difference between Q3 and Q1.
  • Semi-Interquartile Range - equal to the interquartile range divided by 2.
  • Perfectly Symmetrical Distribution- Q1 and Q3 will be exactly the same distance from the median.
  • Skewness - the distances are unequal and the distribution lacks symmetry.
III. Average Deviation
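A minimal Python sketch of the measures of central tendency and variability described above (the scores are hypothetical and not part of the original reviewer):

    # Mean, median, mode, range, and (semi-)interquartile range for hypothetical scores.
    import statistics

    scores = [85, 70, 70, 60, 90, 75, 80, 70, 65, 95]

    mean = sum(scores) / len(scores)              # X-bar = sum of X divided by n
    median = statistics.median(scores)            # middle score once the list is ordered
    mode = statistics.mode(scores)                # most frequently occurring score
    score_range = max(scores) - min(scores)       # highest minus lowest score

    q1, q2, q3 = statistics.quantiles(scores, n=4)   # quartile points Q1, Q2 (median), Q3
    interquartile = q3 - q1
    semi_interquartile = interquartile / 2

    print(mean, median, mode, score_range, interquartile, semi_interquartile)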

  • Approximately 68% of all scores fall within ±1 standard deviation of the mean.
  • Approximately 95% of all scores fall within ±2 standard deviations of the mean.
  • Tail - the area on the normal curve between 2 and 3 standard deviations above the mean (or between −2 and −3 standard deviations below the mean).
  • A normal curve has two tails.
  • Mentally Retarded or Gifted - approximately two standard deviations from the mean (IQ of 70–75 or lower, or IQ of 125–130 or higher).

A standard score is a raw score that has been converted from one scale to another scale, where the latter scale has some arbitrarily set mean and standard deviation.
  ֍ Why is it important? The position of a testtaker's performance relative to other testtakers is readily apparent.
I. z Scores - result from the conversion of a raw score into a number indicating how many standard deviation units the raw score is below or above the mean of the distribution.
  • Equal to the difference between a particular raw score and the mean divided by the standard deviation.
  • Zero Plus or Minus One Scale ֍ Mean : 0, Standard Deviation : 1
  • Why is it important?
    ֍ Provides a convenient context for comparing scores on the same test or on different tests.
II. T Scores - devised by W. A. McCall and named after his professor E. L. Thorndike.
  • Fifty Plus or Minus Ten Scale
    ֍ Mean: 50, Standard Deviation: 10
      ▫ Ranges from 5 standard deviations below the mean to 5 standard deviations above the mean.
      ▫ T score of 0 - 5 standard deviations below the mean.
      ▫ T of 50 - a raw score that fell at the mean.
      ▫ T of 100 - 5 standard deviations above the mean.
    ֍ Advantage - none of the scores is negative.
    ֍ T Score = 10(z score) + 50
III. Other Standard Scores
  • Stanine - developed during World War II; a contraction of the words "standard" and "nine".
    ֍ Mean: 5, Standard Deviation: approximately 2
    ֍ 5th stanine - performance in the average range.
  • Converting raw scores to standard scores may involve either of the following:
    ֍ Linear Transformation - retains a direct numerical relationship to the original raw score.
    ֍ Nonlinear Transformation - may be required when the data under consideration are not normally distributed yet comparisons with normal distributions need to be made.
  • Normalized Standard Scores
    ֍ Normalizing a Distribution
      o Involves "stretching" the skewed curve into the shape of a normal curve and creating a corresponding scale of standard scores (a normalized standard score scale).
      o Scores can then readily be compared with standard scores on another test.
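A minimal Python sketch of the linear conversions above (the raw scores are hypothetical; the stanine line is only a rough linear approximation, since actual stanines are assigned from fixed percentage bands):

    # Convert hypothetical raw scores to z scores, T scores, and approximate stanines.
    import statistics

    raw_scores = [52, 48, 61, 45, 57, 50, 43, 55]
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)                 # population standard deviation

    for x in raw_scores:
        z = (x - mean) / sd                            # zero plus or minus one scale
        t = 10 * z + 50                                # fifty plus or minus ten scale
        stanine = min(9, max(1, round(2 * z + 5)))     # mean 5, SD about 2, clipped to 1-9
        print(x, round(z, 2), round(t, 1), stanine)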

Correlation and Inference
Concept of Correlation
A coefficient of correlation (or correlation coefficient) is a number that provides us with an index of the strength of the relationship between two things. An understanding of the concept of correlation and an ability to compute a coefficient of correlation are therefore central to the study of tests and measurement.

  • Correlation - the degree and direction of correspondence between two things.
  • Coefficient of Correlation (r) - expresses a linear relationship between two (and only two) variables, usually continuous in nature.
    ֍ Interpreted by its sign and magnitude.
    ֍ Magnitude - a number anywhere between −1 and +1.
    ֍ Perfect Correlation - either +1 or −1.
    ֍ Sign - positively or negatively correlated.
    ֍ Positive Correlation - the two variables simultaneously increase or simultaneously decrease.
    ֍ Negative Correlation - one variable increases while the other variable decreases.
    ֍ Zero Correlation - no relationship exists between the two variables.
Pearson r
  • Devised by Karl Pearson.
  • Also known as the "Pearson Correlation Coefficient" and the "Pearson Product-Moment Coefficient of Correlation".
  • Used when the relationship between the variables is linear and when the two variables being correlated are continuous.
    ֍ How to compute?
      o Convert each raw score to a standard score and then multiply each pair of standard scores. A mean for the sum of the products is calculated, and that mean is the value of the Pearson r.
      o N - number of paired scores.
      o ΣXY - sum of the products of the paired X and Y scores.
      o ΣX - sum of the X scores.
      o ΣY - sum of the Y scores.
      o ΣX² - sum of the squared X scores.
      o ΣY² - sum of the squared Y scores.
    ֍ Coefficient of Determination (r²)
      o An indication of how much variance is shared by the X- and Y-variables.
      o Square the correlation coefficient and multiply by 100 to obtain the percentage of shared variance.
I. Product-Moment Coefficient of Correlation
  • Moment - describes a deviation about a mean of a distribution.
  • Deviates - individual deviations about the mean of a distribution. First moments of the distribution.
  • Second Moments - the moments squared.
  • Third Moments - the moments cubed.

The Spearman Rho
One commonly used alternative statistic is variously called a rank-order correlation coefficient, a rank-difference correlation coefficient, or simply Spearman's rho. Developed by Charles Spearman, a British psychologist, this coefficient of correlation is frequently used when the sample size is small (fewer than 30 pairs of measurements) and especially when both sets of measurements are in ordinal (or rank-order) form. Special tables are used to determine whether an obtained rho coefficient is or is not significant.
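A minimal Python sketch of both coefficients (the paired scores are hypothetical and not from the reviewer): Pearson r is computed by averaging the products of paired z scores, and Spearman's rho is computed here as a Pearson r on the rank orders, a shortcut that assumes no tied scores.

    import statistics

    x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
    y = [8.0, 6.9, 7.6, 8.8, 8.3, 9.9, 7.2, 4.3, 10.8, 4.8, 5.7]

    def pearson_r(a, b):
        # Convert each raw score to a z score, multiply the pairs, and average the products.
        za = [(v - statistics.mean(a)) / statistics.pstdev(a) for v in a]
        zb = [(v - statistics.mean(b)) / statistics.pstdev(b) for v in b]
        return sum(p * q for p, q in zip(za, zb)) / len(a)

    def spearman_rho(a, b):
        # Correlate the rank orders of the two sets of measurements (no ties assumed).
        rank_a = {v: i + 1 for i, v in enumerate(sorted(a))}
        rank_b = {v: i + 1 for i, v in enumerate(sorted(b))}
        return pearson_r([rank_a[v] for v in a], [rank_b[v] for v in b])

    r = pearson_r(x, y)
    print(round(r, 2), round(r ** 2 * 100, 1))   # r, and the % of shared variance (r squared x 100)
    print(round(spearman_rho(x, y), 2))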
Graphic Representation of Correlation
  • Also called a bivariate distribution, a scatter diagram, a scattergram, or a scatterplot.
  • Scatterplot - a simple graphing of the coordinate points for values of the X-variable and the Y-variable.

֍ Relatively Enduring - a trait is not expected to be manifested in behavior 100% of the time.

  • States - distinguish one person from another but are relatively less enduring. Both traits and states can be measured by observing a sample of behavior. Eg. direct observation, analysis of self-report statements, or pencil-and-paper test answers.
  • A Psychological Trait Exists Only as a Construct - an informed, scientific concept developed or constructed to describe or explain behavior. We cannot see, hear, or touch constructs, but we can infer their existence from overt behavior.
    ֍ Overt Behavior - an observable action or the product of an observable action.
II. Assumption 2: Psychological Traits and States Can Be Quantified and Measured
  • Test developers and researchers, much like people in general, have many different ways of looking at and defining the same phenomenon.
    o The test developer provides test users with a clear operational definition of the construct under study.
    o A test developer considers the types of item content that would provide insight into it.
    o Weights are given to items.
    o Appropriate ways to score the test and interpret the results are determined.
    ֍ Cumulative Scoring - the assumption that the more the testtaker responds in a particular direction, as keyed by the test manual as correct or consistent with a particular trait, the higher that testtaker is presumed to be on the targeted ability or trait.
    ֍ Domain Sampling - a sample of behaviors from all possible behaviors that could conceivably be indicative of a particular construct; also, a sample of test items from all possible items that could conceivably be used to measure a particular construct.
III. Assumption 3: Test-Related Behavior Predicts Non-Test-Related Behavior
  • Blackening little grids with a number 2 pencil or simply pressing keys on a computer keyboard has little to do with predicting future grid-blackening or key-pressing behavior; the obtained sample of behavior is used to predict other, non-test behavior.
  • The tasks in some tests mimic the actual behaviors that the test user is attempting to understand.
IV. Assumption 4: Tests and Other Measurement Techniques Have Strengths and Weaknesses
  • Understand how a test was developed, the circumstances under which it is appropriate to administer the test, how the test should be administered and to whom, and how the test results should be interpreted.
  • Understand and appreciate the limitations of the tests.
V. Assumption 5: Various Sources of Error Are Part of the Assessment Process
  • Error - a long-standing assumption that factors other than what a test attempts to measure will influence performance on the test.
  • Error Variance - the component of a test score attributable to sources other than the trait or ability measured. Potential sources of error variance include the assessee (eg. having the flu on testing day) and the assessor.
  • Classical Test Theory (True Score Theory) - each testtaker has a true score on a test that would be obtained but for the action of measurement error.
VI. Assumption 6: Testing and Assessment Can Be Conducted in a Fair and Unbiased Manner
  • Sensitized test developers and users to the societal demand for fair tests used in a fair manner.
  • Source of Fairness-Related Problems
    o A test user who attempts to use a particular test with people whose background and experience are different from the background and experience of the people for whom the test was intended.
VII. Assumption 7: Testing and Assessment Benefit Society
  • Need for instruments to diagnose educational difficulties.
  • Instruments to diagnose neuropsychological impairments.
  • Used in military to screen thousands of recruits.
A Good Test
Psychometric Soundness
  • Reliability - consistency; a necessary but not sufficient element of a good test.
  • Validity - the test measures what it purports to measure; the focus is on the items that collectively make up the test.

  • Other Considerations
    ֍ Trained examiners can administer, score, and interpret the test with a minimum of difficulty.
    ֍ Useful Test (Utility)
      o Yields actionable results that will ultimately benefit individual testtakers or society at large.
    ֍ A "good test" is one that contains adequate norms.
      o Needed if the purpose of a test is to compare the performance of the testtaker with the performance of other testtakers.
      o Norms provide a standard with which the results of measurement can be compared.
      o Norm-referenced testing and assessment may be defined as a method of evaluation and a way of deriving meaning from test scores by evaluating an individual testtaker's score and comparing it to the scores of a group of testtakers.
      o Common Goal: yield information on a testtaker's standing or ranking relative to some comparison group of testtakers.
Norms
  • Norms - the test performance data of a particular group of testtakers that are designed for use as a reference when evaluating or interpreting individual test scores.
  • Normative Sample - a group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual testtakers.
  • Norming - the process of deriving norms.
  • Race Norming - the controversial practice of norming on the basis of race or ethnic background.

Sampling to Develop Norms
I. Standardization - the process of administering a test to a representative sample of testtakers for the purpose of establishing norms. A standardized test has clearly specified procedures for administration and scoring, typically including normative data.
II. Sampling - the process of selecting the portion of the universe deemed to be representative of the whole population.
  • Population - a set of individuals with at least one common, observable characteristic.
  • Sample - a portion of the universe of people deemed to be representative of the whole population.
    ֍ Stratified Sampling - include in your sample people representing different subgroups (or strata) of the population. Eg. Blacks, Whites, Asians.
      o Stratified-Random Sampling - every member of the population has the same chance of being included in the sample.
    ֍ Purposive Sampling - arbitrarily selecting some sample because we believe it to be representative of the population.
    ֍ Incidental Sample (Convenience Sample) - a sample that is convenient or available for use; used due to budgetary limitations or other constraints.
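A minimal Python sketch of stratified-random sampling (the population, strata, and sizes are hypothetical and not from the reviewer):

    # Draw a stratified-random sample: each stratum contributes in proportion to its share
    # of the population, and within a stratum every member has the same chance of selection.
    import random

    population = [("P%03d" % i, random.choice(["urban", "rural"])) for i in range(1000)]

    def stratified_random_sample(population, sample_size):
        strata = {}
        for person, stratum in population:
            strata.setdefault(stratum, []).append(person)
        sample = []
        for members in strata.values():
            k = round(sample_size * len(members) / len(population))   # proportional share
            sample.extend(random.sample(members, k))                  # random within stratum
        return sample   # rounding may make the total differ slightly from sample_size

    print(len(stratified_random_sample(population, 100)))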
III. Developing Norms for a Standardized Test
  • The test developer administers the test according to the standard set of instructions that will be used with the test.
  • The norms are then developed from data derived from a group of people who are presumed to be representative of the people who will take the test in the future.
Types of Norms
  1. Percentile Norms - the raw data from a test's standardization sample converted to percentile form; percentiles divide a distribution into 100 equal parts.
  • Percentile - an expression of the percentage of people whose score on a test or measure falls below a particular raw score.
    ֍ Percentage Correct - refers to the distribution of raw scores; specifically, the number of items that were answered correctly, multiplied by 100 and divided by the total number of items.
  2. Age Norms (Age-Equivalent Scores) - indicate the average performance of different samples of testtakers who were at various ages at the time the test was administered.
  3. Grade Norms - designed to indicate the average test performance of testtakers in a given school grade.
  • Developmental Norms - both grade norms and age norms. o Norms developed on the basis of any trait, ability, skill, or other characteristic that is presumed to develop, deteriorate, or otherwise be affected by chronological age, school grade, or stage of life.
  4. National Norms - derived from a normative sample that was nationally representative of the population at the time the norming study was conducted.
    o Obtained by testing large numbers of people representative of different variables of interest such as age, gender, racial/ethnic background, socioeconomic strata,

The Concept of Reliability
Classical Test Theory holds that a score on an ability test reflects not only the testtaker's true score on the ability being measured but also error.
I. Error - the component of the observed test score that does not have to do with the testtaker's ability.
  • X - observed score
  • T - true score
  • E - error
  • X = T + E
II. Measurement Error - all of the factors associated with the process of measuring some variable, other than the variable being measured. Eg. an English-language test on the subject of 12th-grade algebra administered, in English, to a sample of 12th-grade students newly arrived in the United States from China.
  1. Random Error - a source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process. Eg., lightning strike or sudden surge in the testtaker’s blood pressure.
  2. Systematic Error - a source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured. Does not affect score consistency.
III. Variance (σ²) - the standard deviation squared, which describes the sources of test-score variability.
  • True Variance - variance from true differences.
  • Error Variance - variance from irrelevant, random sources.
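A minimal Python simulation of the model above (all values are hypothetical and not from the reviewer), showing that observed-score variance is approximately true variance plus error variance when error is random and independent:

    # X = T + E: simulate true scores and random error, then compare variance components.
    import random
    import statistics

    random.seed(1)
    true_scores = [random.gauss(100, 15) for _ in range(10000)]   # T
    errors = [random.gauss(0, 5) for _ in range(10000)]           # E, with a mean of zero
    observed = [t + e for t, e in zip(true_scores, errors)]       # X = T + E

    true_var = statistics.pvariance(true_scores)
    error_var = statistics.pvariance(errors)
    observed_var = statistics.pvariance(observed)

    print(round(observed_var, 1), round(true_var + error_var, 1))   # approximately equal
    print(round(true_var / observed_var, 2))   # share of observed variance that is "true"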
Sources of Error Variance
I. Test Construction
  • Item Sampling (Content Sampling) - the variation among items within a test as well as variation among items between tests. Eg. test content differs in the way it is worded or in which items are included.
II. Test Administration
  • Test Environment - eg. room temperature, level of lighting, and amount of ventilation and noise, the events of the day.
  • Testtaker Variables - eg. emotional problems, physical discomfort, lack of sleep, and the effects of drugs or medication, formal learning experiences, casual life experiences, therapy, illness, and changes in mood or mental state.
  • Examiner-Related Variables - the examiner's physical appearance and demeanor, professionalism, and nonverbal behavior.
III. Test Scoring and Interpretation
  • Computer Scoring : Technical glitch might contaminate the data.
  • Bias and subjectivity of scorers.
I. Test-Retest Reliability Estimates - an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test. Useful if you are measuring something that is relatively stable over time (eg. personality traits).
  • Coefficient of Stability - the estimate of test-retest reliability. The longer the time that passes between administrations, the greater the likelihood that the reliability coefficient will be lower.
II. Parallel-Forms and Alternate-Forms Reliability Estimates - obtained by administering different forms of a test to the same group.
  • Coefficient of Equivalence - the degree of the relationship between various forms of a test.
    a. Parallel Forms - the means and the variances of observed test scores are equal for each form of the test.
      ֍ Parallel-Forms Reliability - an estimate of the extent to which item sampling and other errors have affected test scores on versions of the same test when, for each form of the test, the means and variances of observed test scores are equal.
    b. Alternate Forms - simply different versions of a test that have been constructed so as to be parallel.
      ֍ Alternate-Forms Reliability - an estimate of the extent to which these different forms of the same test have been affected by item sampling error, or other error.
III. Split-Half Reliability Estimates - obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once.
    a. Three-Step Process
  1. Step 1: Divide the test into equivalent halves.
  2. Step 2: Calculate a Pearson r between scores on the two halves of the test.
  3. Step 3: Adjust the half-test reliability using the Spearman–Brown formula.
    b. Splitting the Test
  1. Randomly assign items to one or the other half of the test.
  2. Odd-Even Reliability
o Assign odd-numbered items to one half of the test and even-numbered items to the other half.

  3. Divide the test by content so that each half contains items equivalent with respect to content and difficulty.
    c. Spearman–Brown Formula - estimates internal consistency reliability from a correlation of two halves of a test.
    ֍ rSB = (n × rxy) / [1 + (n − 1) × rxy]
  • rSB - the reliability adjusted by the Spearman–Brown formula.
  • rxy - the Pearson r in the original-length test.
  • n - the number of items in the revised version divided by the number of items in the original version.
  • The Spearman–Brown formula can be used to estimate the reliability of a whole test from the reliability of one half.
    ֍ rhh - the Pearson r of scores on the two half-tests.
    ֍ Because a whole test is two times longer than half a test, n becomes 2.
  • Usually, but not always, reliability increases as test length increases.
  • May also be used to:
    o Estimate the effect of shortening a test on its reliability.
    o Determine the number of items needed to attain a desired level of reliability.
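A minimal Python sketch of the Spearman-Brown adjustment (the reliability values below are hypothetical and not from the reviewer):

    # rSB = (n * rxy) / (1 + (n - 1) * rxy), where n = revised length / original length.
    def spearman_brown(r_xy, n):
        return (n * r_xy) / (1 + (n - 1) * r_xy)

    r_half = 0.70                                   # correlation between two half-tests
    print(round(spearman_brown(r_half, n=2), 2))    # whole test is twice as long: about 0.82

    # Estimating the effect of shortening a 100-item test to 50 items (n = 50/100 = 0.5):
    print(round(spearman_brown(0.90, n=0.5), 2))    # about 0.82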
IV. Other Methods of Estimating Internal Consistency
  • Inter-Item Consistency - the degree of correlation among all the items on a scale.
  • Homogeneity - greek words homos , meaning “same,” and genos , meaning “kind”. The degree to which a test measures a single factor.
  • Heterogeneity - the degree to which a test measures different factors; a heterogeneous test is composed of items that measure more than one trait.
    a. Kuder–Richardson Formula 20 (KR-20)
  • G. Frederic Kuder and M. W. Richardson , named because it was the 20th formula developed in a series.
  • Used for determining the inter-item consistency of dichotomous items.
    ֍ Items that can be scored right or wrong (eg. multiple-choice items).
    ֍ KR-20 = [k / (k − 1)] × [1 − (Σpq / σ²)]
    ֍ k - the number of test items.
    ֍ σ² - the variance of total test scores.
    ֍ p - the proportion of testtakers who pass the item.
    ֍ q - the proportion of people who fail the item.
    ֍ Σpq - the sum of the pq products over all items.
    b. Coefficient Alpha
  • Developed by Cronbach (1951)
  • The mean of all possible split-half correlations, corrected by the Spearman–Brown formula.
    ֍ Used on tests containing non-dichotomous items.
    ֍ rα = [k / (k − 1)] × [1 − (Σσᵢ² / σ²)]
    ֍ rα - coefficient alpha.
    ֍ k - the number of items.
    ֍ σᵢ² - the variance of one item.
    ֍ Σσᵢ² - the sum of the variances of each item.
    ֍ σ² - the variance of the total test scores.
  • Ranges in value from 0 to 1.
    o Helps answer questions about how similar sets of data are.
    o 0 - absolutely no similarity; 1 - perfectly identical.
    o A value of alpha above .90 may be "too high" and indicate redundancy in the items.
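A minimal Python sketch of KR-20 and coefficient alpha (the 0/1 item-score matrix is hypothetical and not from the reviewer); for dichotomous items computed with population variances the two coincide:

    # Rows = testtakers, columns = items scored 1 (pass) or 0 (fail).
    import statistics

    items = [
        [1, 1, 0, 1, 1],
        [1, 0, 0, 1, 1],
        [0, 1, 0, 0, 1],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 1, 0],
    ]

    k = len(items[0])                                        # number of items
    totals = [sum(row) for row in items]
    total_var = statistics.pvariance(totals)                 # variance of total test scores

    def kr20(items):
        n = len(items)
        pq_sum = 0.0
        for j in range(k):
            p = sum(row[j] for row in items) / n             # proportion passing item j
            pq_sum += p * (1 - p)                            # q = 1 - p
        return (k / (k - 1)) * (1 - pq_sum / total_var)

    def coefficient_alpha(items):
        item_var_sum = sum(statistics.pvariance([row[j] for row in items]) for j in range(k))
        return (k / (k - 1)) * (1 - item_var_sum / total_var)

    print(round(kr20(items), 2), round(coefficient_alpha(items), 2))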
    c. Average Proportional Distance (APD) - a measure that focuses on the degree of difference that exists between item scores.
      • .2 or lower - excellent internal consistency.
      • .25 to .2 - acceptable internal consistency.
V. Inter-Scorer Reliability - the degree of agreement or consistency between two or more scorers with regard to a particular measure. Often used when coding nonverbal behavior. Eg. a checklist of behaviors indicative of depressed mood.
  • Coefficient of Inter-Scorer Reliability - a way of determining the degree of consistency among scorers in the scoring of a test.

Using and Interpreting a Coefficient of Reliability
I. The Purpose of the Reliability Coefficient