Jan. 7th ~What’s Hot in 2021: Beyond the Science of Reading; Tier 2 oral language and early reading interventions for preschool to grade 2 children: a restricted systematic review
The Weekly Email That Keeps You Informed of The Latest Reading Research!
Welcome! This is Volume 2, Issue No. 32
Welcome to the Reading Research Recap, a weekly newsletter featuring the latest reading research published in peer-reviewed scientific journals. The goal of the Recap is to share recent scientific findings and foster an appreciation of science as a way to navigate the world. I try to make this one of the most informative emails you get each week.
Happy New Year!
Hope it is a year filled with lots of reading!
As many of you know, back in September, my company (Elemeno) was acquired by MetaMetrics (creators of the Lexile Framework for Reading). MetaMetrics is committed to expanding their offerings and services in the foundational reading space. It has been incredibly exciting to join their team and help shape their foundational reading initiatives and I’m excited by what we will soon be able to offer EdTech companies, states, school districts as well as parents and teachers.
While no changes have occurred to the Recap in the past few months, we are thinking about how to spread it to a wider audience and make it more accessible. To better inform these changes, we would like to hear from you via this short Google survey. Please only answer if you receive the Recap monthly (i.e., you don’t pay for it). We will have a separate survey for paying subscribers (weekly subscribers) next week. There are only a few questions, asking who you are, why you read the Recap, what you like/dislike, and any comments you have for us.
Your feedback is really valuable!
Thank you in advance,
-Neena
What’s Hot in 2021: Beyond the Science of Reading
“Literacy topics fluctuate each year in how much attention they receive in research and practice. The What’s Hot in Literacy annual survey asks twenty-five leading experts what literacy topics are currently receiving attention, or are hot, as well as which topics should be hot in the field. The results of these interviews are tallied to identify consensus among the participants. The following three levels are used to report the findings: a) “extremely hot” or “extremely cold” (100% consensus), b) “very hot” or “very cold” (75% consensus), and c) “hot” or “cold” (50% consensus). Items are identified as “should be hot” or “should not be hot” if at least 50% of the respondents agree. The four “very hot” topics for 2021 are digital literacy, dyslexia, phonics/phonemic awareness, and social justice/equity/anti-racism in literacy. Discussion of these topics (and others that were deemed should be hot) and why they may be currently receiving more attention than others in the field is included. Findings can be utilized by both K-12 and higher education professionals alike.”
Tier 2 oral language and early reading interventions for preschool to grade 2 children: a restricted systematic review (open access!)
“This systematic review investigated small-group Tier 2 interventions to improve oral language or reading outcomes for children during preschool and early primary school years. Literature published from 2008 was searched and 152 papers selected for full-text review; 55 studies were included. Three strength of evidence assessment tools identified a shortlist of six interventions with relatively strong evidence: (a) Early Reading Intervention; (b) Lonigan and Philips (2016) Unnamed needs-aligned intervention; (c) PHAB+WIST (PHAST)/PHAB+RAVE-O; (d) Read Well-Aligned intervention; (e) Ryder and colleagues’ (2008) Unnamed Phonological Awareness and Phonics intervention; and (f) Story Friends. Investigation of intervention componentry found common characteristics included 3–5 students, 4–5 sessions per week, minimum 11-week duration, content covering a combination of skills, modelling and explicit instruction, and trained personnel. Shortlisted interventions provide a useful foundation to guide further interventions and inform educators and policymakers seeking to implement effective evidence-based interventions in the early years of schooling.”
Development and Initial Validation of the Early Elementary Writing Rubric to Inform Instruction for Kindergarten and First-Grade Students (open access)
“This article describes the development of the Early Elementary Writing Rubric (EEWR), an analytic assessment designed to measure kindergarten and first-grade writing and inform educators’ instruction. Crocker and Algina’s (1986) approach to instrument development and validation was used as a guide to create and refine the writing measure. Study 1 describes the development of the 10-item measure (response scale ranges from 0 = Beginning of Kindergarten to 5 = End of First Grade). Educators participated in focus groups, expert panel review, cognitive interviews, and pretesting as part of the instrument development process. Study 2 evaluates measurement quality in terms of score reliability and validity. Data from writing samples produced by 634 students in kindergarten and first-grade classrooms were collected during pilot testing. An exploratory factor analysis was conducted to evaluate the psychometric properties of the EEWR. A one-factor model fit the data for all writing genres and all scoring elements were retained with loadings ranging from 0.49 to 0.92. Internal consistency reliability was high and ranged from .89 to .91. Interrater reliability between the researcher and participants varied from poor to good and means ranged from 52% to 72%. First-grade students received higher scores than kindergartners on all 10 scoring elements. The EEWR holds promise as an acceptable, useful, and psychometrically sound measure of early writing. Further iterative development is needed to fully investigate its ability to accurately identify the present level of student performance and to determine sensitivity to developmental and instruction gains.”
What are Teachers Reading and Why?: An Analysis of Elementary Read Aloud Titles and the Rationales Underlying Teachers’ Selections
“Although reading aloud to elementary students is a common practice, few studies have focused on the actual texts read, beyond considerations of fiction versus nonfiction, and few studies have included a line of inquiry exploring teachers’ rationales for text selection. In this mixed-methods study, we pair a content analysis of the reported read aloud titles of over 1000 teachers with interviews of a subset of teachers to understand the rationales behind their choices. For the content analysis, we analyzed the titles for multiple features (e.g., text type, publication year, inclusion in a series, etc.). Results suggest teachers still prefer fiction for read aloud events and the titles read are, on average, 25 years old. Our interviews with 14 teachers revealed that a myriad of factors inform their decisions for selecting the texts that they read in their respective classrooms. Overall, teachers’ reasons tended to focus on instructional, affective, or contextual rationales. Although teachers acknowledged the importance of context and representation, there is an apparent disconnect between what teachers said mattered and what were represented in the analysis of titles. Implications for future research and classroom practices are included.”
Insights into Dyslexia Genetics Research from the Last Two Decades (open access)
“Abstract: Dyslexia, a specific reading disability, is a common (up to 10% of children) and highly heritable (~70%) neurodevelopmental disorder. Behavioral and molecular genetic approaches are aimed towards dissecting its significant genetic component. In the proposed review, we will summarize advances in twin and molecular genetic research from the past 20 years. First, we will briefly outline the clinical and educational presentation and epidemiology of dyslexia. Next, we will summarize results from twin studies, followed by molecular genetic research (e.g., genome-wide association studies (GWASs)). In particular, we will highlight converging key insights from genetic research. (1) Dyslexia is a highly polygenic neurodevelopmental disorder with a complex genetic architecture. (2) Dyslexia categories share a large proportion of genetics with continuously distributed measures of reading skills, with shared genetic risks also seen across development. (3) Dyslexia genetic risks are shared with those implicated in many other neurodevelopmental disorders (e.g., developmental language disorder and dyscalculia). Finally, we will discuss the implications and future directions. As the diversity of genetic studies continues to increase through international collaborate efforts, we will highlight the challenges in advances of genetics discoveries in this field.”
Comparing Slope Stability and Validity for General Outcome and Specific Subskill Mastery Measurement
How to access the paper and a note from the author:
Hi Neena,
Thank you very much for your interest in this research. I’ve attached a link to 50 free copies. If those run out, interested readers can email me and I will send them a copy! Feel free to include my contact information for that purpose. Please let me know if you have any questions at all.
https://www.tandfonline.com/eprint/FNNZ6ZYMXXIAZHRNRBFW/full?target=10.1080/15377903.2021.2012863
Dr. Filderman’s email: mjfilderman@ua.edu
Background
Curriculum-based measurement of oral reading fluency (aka CBM-R or R-CBM) is a popular form of general outcome measurement (GOM) that is predictive of standardized reading test results and is often used within a response to intervention framework at various levels. Specifically, all students in Tier 1 take R-CBMs about 3 times a year to see if they are meeting benchmark goals. Students requiring extra help (i.e., those in Tiers 2 and 3) take R-CBMs more frequently as part of ongoing progress monitoring and the data-based decision making (DBDM) process, whereby teachers and interventionists determine if the intervention needs to be qualitatively or quantitatively changed.
Specific subskill mastery measurement (SSMM), in contrast, is not considered a general outcome measure (GOM); rather, it is a more proximal measure of what a student has learned. But there is very little research on using SSMM for this kind of progress monitoring.
General outcome measurement (GOM) such as R-CBM is used to see whether children are retaining and generalizing what they have learned, whereas SSMM is to see whether students have acquired a specific skill. One is not better than the other. They are used for different purposes, and in fact, both should be used, since they assess different things.
Rationale
While R-CBMs are widely used, they have been critiqued for being highly variable and requiring too long a time frame to yield good, stable data. This puts them at odds with the DBDM ethos of frequent assessment in order to make intervention responsive to students’ needs. SSMM might be a viable alternative to R-CBM, but there is a lack of research on using slopes of words read correctly from SSMM word lists.
Research Questions
Directly quoted from the paper:
What is the slope for SSMM versus GOM (i.e., CBM-R) across a various number of sessions (e.g., 5 weeks, 7 weeks, and all sessions)?
What is the relationship of SSMM compared to CBM-R with standardized achievement scores while controlling for previous standardized achievement scores and demographic variables?
Some technical notes:
The slope refers to the rate of improvement by a student in the number of words read correctly per minute (wcpm). Research has found “…slopes between 1.48 and 1.67 wcpm per week for students with disabilities in 2nd through 6th grade when provided with evidence-based intervention.”
You can use different methods to estimate the slopes: ordinary least squares (aka “the line of best fit”; this popular method is your “typical” regression estimation, the one built into Microsoft Excel) and Bayesian modeling (too complicated to get into here, but see this paper).
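To make the slope idea concrete, here is a minimal sketch of estimating a student’s weekly growth rate in wcpm with ordinary least squares. The weekly scores below are made up for illustration; they are not data from the study.

```python
# Minimal sketch: estimating a student's weekly growth slope in words
# correct per minute (wcpm) with ordinary least squares (OLS).
# The weekly scores are hypothetical, not taken from the study.
import numpy as np

weeks = np.arange(1, 11)  # 10 weekly progress-monitoring probes
wcpm = np.array([42, 44, 43, 47, 49, 50, 53, 52, 56, 57])

# np.polyfit with degree 1 fits the least-squares line: wcpm = slope*week + intercept
slope, intercept = np.polyfit(weeks, wcpm, 1)
print(f"growth rate: {slope:.2f} wcpm per week")
```

A slope of about 1.5 wcpm per week would fall in the range the authors cite as typical for students with disabilities receiving evidence-based intervention.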
Sample
57 4th and 5th grade students with word-reading difficulties (as nominated by their teacher and confirmed with TOWRE scores) across 4 charter schools in the southeastern US
56% were male; all received free or reduced lunch;
racial demographics: “9% (n=5) were African American, 5% (n=3) were Asian American, 2% (n=1) were Native American, and 84% (n=48) were White; 70% (n=40) identified as Hispanic or Latinx”
“18% (n=10) had a disability; and 67% (n=38) had limited English proficiency.”
Measures
Given at pre- and post-test:
3 WRMT subtests: Word Attack, Word Identification, and Passage Comprehension
Weekly Progress Monitoring Assessments:
The CBM-R was FastBridge
The SSMM was a modified experimenter-created measure
“The second subtest of the Big Word Reading Test (BWRT; Toste et al., 2019) was modified and used as the SSMM. The subtest consists of 94 multisyllabic words with affixes learned in the intervention…In addition, unlike the original test, a subset of 50 words, controlled for number of syllables in the words, were randomly selected and ordered to create a unique word list each week. Thus, both the CBM-R and SSMM measured words read correctly per minute (wcpm) per week…”
Methods
All students received an intervention focused on multisyllabic word reading 4 days a week across 10 weeks. See above for the measures that were administered at pre- and post-test and weekly. Since this study is about R-CBM and SSMM, not about the intervention, I won’t report on the intervention specifics here.
Analysis
They used Mplus to estimate models for both R-CBM and SSMM. Each measure’s slope was estimated using three different methods: ordinary least squares, uninformed Bayesian, and informed Bayesian (where priors were informed via Monte Carlo simulations).
Results
The SSMM slope was significant; the R-CBM slope was not
“The slope for SSMM was significant across all methods, and consistent across the various times using the most robust method of estimation (i.e., informed Bayesian: β=1.43 wcpm per week at 5 weeks to β=1.46 wcpm per week at 12 weeks), while the standard error decreased over time as expected (SE=.05 at 5 weeks to SE=.01 at 12 weeks)…The slope for the CBM-R was not significant using any of the statistical methods over any of the time points. This could be because of the small sample size; however, Bayesian methods are designed to be more robust with smaller sample sizes (McNeish, 2016), especially for time-series data (Price, 2012).”
SSMM was also a better predictor of WRMT scores
“Findings from the present study indicate that using SSMM more accurately predicted standardized assessment scores compared to the CBM-R or the combination of SSMM and CBM-R. Additionally, the SSMM was more highly correlated with the WRMT than the GOM, which may indicate it tapped the construct of general reading proficiency more effectively over the study time frame. These findings are in contrast to prior research which suggests that SSMM may be over- aligned with the intervention and thus not indicative of growth over time (Fuchs & Deno, 1991).”
Implications/Take-Home Message
I think these two quotes summarize the implications of this study the best:
“Thus, our findings support the development of intervention aligned SSMM that lend themselves to tracking growth over time. Moreover, our findings suggest that the use of SSMM alone may be more valid for making informed decisions over a shorter time frame.”
“…educators can make instructional decisions using well-designed SSMM after 5 weeks of data collection if more advanced statistical approaches are made available to them, which would have substantial implications for the education of students with or at risk for disabilities.”
Limitations
A relatively small sample size (which could leave the study underpowered). That said, the authors argue that the lack of significant effects for R-CBM slopes is more likely due to the “inherent variability” of the measure than to the small sample, especially because Bayesian methods are particularly well suited to small samples with time-series data.
Given the demographic information listed above under “sample,” the results of this study might not generalize to other populations
The reliability of the WRMT passage comprehension subtest was on the low end of acceptable reliability (.71)
Autocorrelation of data (which is present in all human-subjects time-series studies). The authors modeled autocorrelation such that the immediately preceding session had the most impact on the next session, but with word-reading acquisition, a lag of more than one session might be more appropriate; this needs further investigation.