Collecting Evidence

Good assessment plans use multiple measures. Different types of measures can produce different estimates of student ability, and each measurement method has its own inherent errors and biases. That is why it is important to look at student learning and curriculum design through different lenses.

By “measuring” we mean the systematic collection of information used to compare measured performance to an intended outcome. Quantifying is measuring how much you have. Qualifying is measuring how good it is (its nature, quality, ability, extent or significance). Does the measurement need to be perfect in student learning outcomes assessment? Assessment measures do not necessarily have to prove anything "beyond a reasonable doubt." The "preponderance of evidence" standard will do in most cases. Don't let the perfect become the enemy of the good. Consider the implications of your measurement and decide what degree of precision is needed in a given situation. You simply need sufficient information to provide a reasonable understanding of student achievement or to guide action.

 

Assessment "Action Research" Questions


Before you start measuring, ask yourself, “what am I trying to measure?” What is the key information that will help you maintain the course or program at the highest quality possible? What evidence will tell you whether and how well students are learning? How much evidence is needed to show convincingly that students are or are not achieving an intended outcome? What useful information, data or artifacts already exist? Is your question examining the individual student, a particular course, a selection of program learning outcomes or the overall curriculum? Measure what matters. To you. 

Too often, assessment activity fixates on executing an assessment process or approach to document student attainment rather than focusing on shedding light on a vexing issue and using evidence to address student and institutional needs and questions... Rather than taking account of genuine academic concerns and deploying assessment to inform change in pedagogy, the activity is preoccupied with “doing” assessment rather than using assessment results. -George Kuh, et al. (2015)1 

 

We can frame assessment as "inquiry in support of learning."2 Assessment involves asking questions and designing experiments aimed at making innovations and improvements in teaching and learning.

1. George D. Kuh, Stanley O. Ikenberry, Natasha Jankowski, Timothy Reese Cain, Peter T. Ewell, Pat Hutchings, and Jillian Kinzie, Using Evidence of Student Learning to Improve Higher Education, San Francisco, CA: Jossey-Bass, 2015.

2. Natasha A. Jankowski, St. Olaf: Utilization-Focused Assessment, NILOA, April 2012.

 

Examples of Assessment Research Questions

  • Have the revisions made to this course led to improved student learning?
  • What prerequisite courses need to be updated (and how) in order to better prepare students for this upper level course?
  • Do my seniors have the level of proficiency in this program competency that they need to enter the workplace?
  • How do students in our program compare to students in similar programs at other institutions?
  • What skills do transfer students need in order to catch up?
  • What concepts or practices are posing the greatest challenge to my students?
  • Do students from different demographic groups perform differently?
  • Are students able to transfer learning from one situation to another?

 

Types of Measures

Once you have determined the assessment questions, choose assessment measures that are best suited to answer those questions. A useful table at this Cal Poly website lays out types of assessment measures and what sorts of information they provide. The outline below lists many types of assessment measures:

Direct Measures

  • Examination of student work (performance-based assessment)
    • Capstone projects
    • Course-embedded assignments (essays, reflections, papers, oral presentations, computer code, artwork)
    • Scholarly presentations or publications, white papers
    • Portfolios
    • Performances (films, concerts, speeches, theater productions, performance art, debates)
    • Discussion threads, mind maps
    • Case studies, simulations, activity logs, critical incident reports
    • Blogs, websites, journals
  • Standardized Testing
    • Locally developed examinations
    • Major field or licensure tests
    • Pre-test/post-test
  • Measures of professional activity 
    • Performance at internship, placement sites, clinical setting, student teaching
    • Supervisor evaluations
    • Licensure exams or certifications

Indirect Measures

  • Student self-reflection, self-assessment
  • Placement analysis (graduate or professional school, employment), graduation rates
  • IDEA student rating of instruction survey results
  • Satisfaction or engagement surveys (Noel-Levitz SSI, NSSE)
  • Locally developed surveys/questionnaires
  • Focus groups, exit interviews

External Measures

  • Juried review of performance, portfolio or product
  • Accreditation review of learning outcomes assessment
  • Advisory board review of program and student work
  • Consultant review of program
  • Internship supervisor evaluation
  • Employer survey; Alumni survey
  • Benchmarked standardized testing

A publication by Jo Beld provides in-depth explanations of several useful assessment measures: Beld_BuildingYourAssessmentToolkit_2015.pdf

 

This list of assessment activities comes from Stirling et al. (2016):

AssessmentActivities.png  

 

Factors for Choosing Assessment Measures

Start with the assessment question(s); be sure that the measures you use address the questions you are trying to answer. 

The measures you use should be reliable and valid. Validity is the degree to which the measure aligns with your course or program learning outcomes, while reliability refers to how consistent the measurement results are. If you are examining a student work product (a written paper, performance or project), ensure reliability of the measure by providing students with unambiguous instructions (rubrics can help both students and raters); ensure validity by designing an assignment that clearly gives students the opportunity to display proficiency. Scoring rubrics or, preferably, developmental rubrics can be used to assess student work artifacts. The collaborative construction of rubrics helps a group of faculty ensure the validity of a measure, and the use of a rubric for scoring (measuring) helps ensure reliability.

ValidReliable.png

Image from Validity and Reliability on Explorability.com
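If you want a quick, concrete check on the reliability of rubric scoring, one common approach is to have two raters score the same artifacts and compute their rate of agreement. The short Python sketch below illustrates the idea; the scores are invented for the example, and percent agreement is only one of several possible consistency statistics.

```python
# A minimal sketch (hypothetical data): checking the consistency of
# rubric-based scoring by comparing two raters who scored the same
# set of student artifacts on a four-level rubric.

rater_a = [3, 4, 2, 3, 1, 4, 3, 2]  # rubric levels assigned by rater A
rater_b = [3, 4, 3, 3, 1, 4, 2, 2]  # rubric levels assigned by rater B

n = len(rater_a)

# Exact agreement: both raters assigned the same level.
exact = sum(a == b for a, b in zip(rater_a, rater_b))

# Adjacent agreement: levels within one step of each other, a more
# forgiving consistency check often used with rubric scoring.
adjacent = sum(abs(a - b) <= 1 for a, b in zip(rater_a, rater_b))

print(f"Exact agreement:    {exact / n:.0%}")
print(f"Adjacent agreement: {adjacent / n:.0%}")
```

Low agreement usually signals that the rubric language is ambiguous or that raters need a norming session before the scores are used.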

A standardized test or survey may be a highly reliable assessment instrument, but questions of validity arise. Does the standardized instrument measure your learning objectives or answer your assessment research questions? Performance-based measures may be highly valid, but assessing student work can be time consuming, whereas with a standardized test someone else (or a machine) does the scoring for you. Standardized tests, however, can be expensive.

Measurement standards indicate that there is a trade-off between reliability and validity. The complexity of a task may increase validity but at the same time will decrease reliability due to a lack of standardization.  -Grant Wiggins (1993)

 

When and Where to Collect Evidence

Your most valuable tool for analyzing where to look for assessment evidence is your CURRICULUM MAP. If you have used an intensity scale to create your map, look on the map for a course rated high for the learning outcome you wish to measure, and determine if that course includes a major assignment, project, research paper, performance or exam through which students can demonstrate that learning outcome. If you have used a developmental scale, look in courses in which the learning outcome is being practiced or mastered for similar evidence of student learning.
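As a rough illustration of the lookup this paragraph describes, the sketch below represents a developmental-scale curriculum map as a simple Python dictionary (the course numbers, outcomes and ratings are hypothetical) and pulls out the courses where a given outcome is practiced or mastered, i.e., the places to look for harvestable evidence.

```python
# Minimal sketch of a developmental-scale curriculum map.
# Levels: "I" = introduced, "P" = practiced, "M" = mastered.
# Course numbers and ratings below are hypothetical.

curriculum_map = {
    "Course 120": {"LO 1": "I", "LO 2": "I"},
    "Course 201": {"LO 1": "P", "LO 2": "P"},
    "Course 305": {"LO 2": "P", "LO 3": "I"},
    "Course 410": {"LO 2": "M", "LO 3": "M"},  # capstone
}

def evidence_sites(outcome: str, levels=("P", "M")) -> list[str]:
    """Courses where the outcome is practiced or mastered --
    candidate places to collect student work for assessment."""
    return [
        course
        for course, ratings in curriculum_map.items()
        if ratings.get(outcome) in levels
    ]

print(evidence_sites("LO 2"))  # ['Course 201', 'Course 305', 'Course 410']
```

The same filter applied to an intensity-scale map would simply select the courses rated high for the outcome in question.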

 

CurricMap2-1.png

 

For example, based on the figure above (which uses a developmental scale), to measure student achievement of Learning Outcome 2, you might examine the results of an exam in Course 120 to see what percentage of students have the necessary prerequisite knowledge to succeed in their next course, Course 201. And you might examine students' Capstone projects in Course 410 against a rubric for LO 2 to see how many met or exceeded the standard expected for graduating seniors.
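To make the capstone half of that example concrete, here is a small sketch (with made-up scores and an assumed "meets standard" level, not actual program data) that tallies rubric ratings for LO 2 and reports the share of seniors who met or exceeded the expected level.

```python
# Minimal sketch: summarizing capstone rubric scores for one learning outcome.
# The scores and the "meets standard" level are hypothetical.

lo2_scores = [4, 3, 2, 4, 3, 3, 1, 4, 3, 2]  # one rubric level per student
meets_standard = 3                           # level expected of graduating seniors

met = sum(score >= meets_standard for score in lo2_scores)
share = met / len(lo2_scores)

print(f"{met} of {len(lo2_scores)} students ({share:.0%}) "
      f"met or exceeded the standard for LO 2")
```

A similar tally of the Course 120 exam results (the share of students answering the prerequisite items correctly) would answer the first half of the question.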

 

A Bit of Advice

There are three simple principles that should guide your decision about what, where and when to collect evidence of student learning:

  1. Keep it real.
  2. Keep it simple.
  3. Keep it focused.

These aphorisms connect back to the values that underlie Champlain's approach to assessment. Keep it real to conduct meaningful and authentic assessment. Evaluate student work products that result from assignments that are explicitly designed to give students the opportunity to demonstrate the learning outcomes you wish to assess.

Keep it simple (approachable) by harvesting artifacts that students are already creating in their coursework. Your program's curriculum map will guide you to potential assignments and class projects that can be "harvested" as artifacts for assessment.

To keep it focused, choose an artifact that can help you answer a few specific questions. Constructive and well-designed assessment is relevant to inquiry about your courses, your program, your students, and your teaching. Is your assessment question about individual student proficiency, course design, pedagogical method, program effectiveness or overall institutional competency achievement? Knowing which level(s) your inquiry is directed toward will help you determine the appropriate measure(s).

 

Performances of Understanding

The most powerful way to share with students a vision of what they are supposed to be learning is to make sure your instructional activities and formative assessments (and, later, your summative assessments) are performances of understanding. A performance of understanding embodies the learning target in what you ask students to actually do. 

Performances of understanding show students, by what they ask of them, what it is they are supposed to be learning. Performances of understanding develop that learning through the students' experience doing the work. Finally, performances of understanding give evidence of students' learning by providing work that is available for inspection by both teacher and student. Not every performance of understanding uses rubrics. For those that do, however, rubrics support all three functions (showing, developing, and giving evidence of learning).

-Susan Brookhart, 2013

 

Assessment Methods Resources

Susan M. Brookhart, How to Create and Use Rubrics for Formative Assessment and Grading, ASCD, Alexandria, VA, 2013.

Dwayne Chism, Excavating the Artifacts of Student Learning, Educational Leadership, February 2018, Volume 75, Number 5.

Community College of Allegheny County Assessment Tool Kit

Jay McTighe, Three Key Questions on Measuring Learning, Educational Leadership, February 2018, Volume 75, Number 5.

Stirling, A., Kerr, G., Banwell, J., MacPherson, E., & Heron, A. (2016). A practical guide for work-integrated learning: Effective practices to enhance the educational quality of the structured work experience offered through Colleges and Universities. Toronto, ON: Higher Education Quality Council of Ontario/Education @ Work Ontario.

University of Hawaii, Manoa: Methods for Collecting Evidence

Western Association of Schools and Colleges Evidence Guide