Collaborative Research: Developing and Evaluating Assessments of Problem-Solving in Computer Adaptive Testing Environments (DEAP-CAT)
Supported by the National Science Foundation
Project led by:
Toni Sondergeld, PhD and Kristin Koskey, PhD
Problem solving has been a priority within K-12 mathematics education for over four decades and is reflected throughout the Common Core State Standards for Mathematics (CCSSM) initiative, which has been adopted in some form by 41 states. Broadly defined, problem solving refers to “mathematical tasks that have potential to provide intellectual challenges for enhancing students’ mathematical understanding and development” (NCTM, 2020, par. 1). In prior NSF-funded research (NSF#1720646, #1720661), problem-solving measures (PSMs) aligned to CCSSM for grades 3-5 were developed and validated to supplement previously established PSMs in grades 6-8. PSMs fill a gap in assessments addressing the depth in the learning standards. The current DRK-12 Level III project, entitled Developing and Evaluating Assessments of Problem Solving in Computer Adaptive Testing Environments (DEAP-CAT), expands the scope of PSMs’ use and score interpretation. DEAP-CAT advances mathematical problem-solving assessments into computer adaptive testing (CAT). CAT allows for more precise and efficient targeting of student ability than static tests, yet few measures designed to assess students’ mathematical problem-solving ability leverage such technology. Shorter tests that require less in-class time than current paper-pencil PSMs, while retaining sufficient reliability and strong validity evidence, are desirable. Such tests may limit test-taker fatigue and increase classroom instruction time. Additionally, DEAP-CAT will benchmark current PSM6-8 instruments using an objective standard-setting method, which allows for improved score interpretations with content-related feedback. The CAT system will produce student- and class-level reports immediately, allowing teachers to modify instruction to improve students’ learning.
This 5-year project aims to advance the use of CAT and assessment information in the mathematics classroom by applying an iterative and stakeholder-informed Design Science-Based Methodology to: (a) benchmark the previously established PSM6, PSM7, and PSM8 (Year 1); (b) develop, calibrate, and validate a criterion-referenced CAT for each PSM (Years 1-5); (c) construct student- and class-level score reports for integration into the CAT system (Years 1-4); and (d) investigate teachers’ capacity for implementing, interpreting, and using the PSM-CAT assessments and results in STEM learning settings (Year 5). Accordingly, DEAP-CAT addresses two sets of research questions related to test development and DRK-12 STEM assessment specified in the program solicitation. Test Development Research Questions: (RQ1) What benchmark performance standards define different proficiency levels on PSMs 6-8 for each grade level? (RQ2) What are the psychometric properties of new PSM items developed for the CAT item bank? (RQ3) Is there significant item drift across student populations on the new PSM items? (RQ4) To what extent are PSM item calibrations stable within the CAT system? (RQ5) What recommendations for improvements, if any, do teachers and students have for the new PSM items, CAT platform, and reporting system? Objective Standard Setting will be used for benchmarking, and Rasch (1-PL) measurement will be employed to address all psychometric questions. DRK-12 STEM Assessment Research Questions: (RQ6) To what extent do teachers interact with, perceive, and make sense of the assessment information generated for use in practice? (RQ7) Does an online learning module build teacher capacity for PSM-CAT implementation, interpretation, and use of student assessment outcomes in STEM learning settings? An experimental design will be used to investigate teachers’ capacity for implementing, interpreting, and using PSMs in a CAT system.
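To illustrate how Rasch (1-PL) measurement can drive adaptive item selection, the following is a minimal sketch in Python and not the project's actual CAT engine. The function names, the item identifiers, the difficulty values, the maximum-information selection rule, and the Newton-Raphson ability update are all assumptions chosen for illustration; the project description does not specify them.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the Rasch (1-PL) model,
    given student ability theta and item difficulty b (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = rasch_probability(theta, b)
    return p * (1.0 - p)


def select_next_item(theta, item_bank, administered):
    """Pick the unadministered item with maximum information at theta.

    item_bank is a hypothetical dict mapping item id -> difficulty.
    Returns None when every item has already been administered.
    """
    candidates = [i for i in item_bank if i not in administered]
    if not candidates:
        return None
    return max(candidates, key=lambda i: item_information(theta, item_bank[i]))


def update_ability(theta, responses, item_bank, n_iter=20):
    """Maximum-likelihood ability estimate via Newton-Raphson.

    responses maps item id -> 0 (incorrect) or 1 (correct). The estimate
    is clamped to [-4, 4] because the likelihood has no finite maximum
    when all responses are correct or all are incorrect.
    """
    for _ in range(n_iter):
        probs = {i: rasch_probability(theta, item_bank[i]) for i in responses}
        gradient = sum(responses[i] - probs[i] for i in responses)
        information = sum(p * (1.0 - p) for p in probs.values())
        if information < 1e-9:
            break
        step = max(-1.0, min(1.0, gradient / information))  # damp large steps
        theta = max(-4.0, min(4.0, theta + step))
        if abs(step) < 1e-6:
            break
    return theta


# Hypothetical three-item bank with illustrative difficulties (logits).
bank = {"a": -1.0, "b": 0.5, "c": 2.0}
first_item = select_next_item(0.4, bank, administered=set())
```

Under the Rasch model an item is most informative when its difficulty is closest to the current ability estimate, which is why an adaptive test can reach a given precision with fewer items than a fixed-form test.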
DEAP-CAT has the potential to impact the field by providing school districts and researchers with a means to efficiently and effectively assess students’ mathematical problem-solving performance at a single time point or growth over time; addressing future online learning needs; and improving classroom teaching through more precise information about students’ strengths with less class time spent on assessment.