Dr. Taylor's latest set of assertions concerning the perceived problems with the Houston study again reflects her need to actually read the paper that was published. Nonetheless, we are gratified to learn that Dr. Taylor does not reject science, and we refer people to our original response, which addresses many of the issues raised in this latest attempt to denigrate our research. In responding, we note that it is common to denigrate and attack when results challenge predominant belief systems. However, the Houston study is only one of many studies that support the role of explicit instruction in the alphabetic principle for children at risk for reading failure. We remain mystified as to the amount of attention that this study receives from the media, politicians, activist groups, and reading professionals, and would encourage a fuller engagement with the research literature that supports the importance of the alphabetic principle in learning to read in general, and the need to provide explicit instruction in the alphabetic principle for children at risk for reading disability.
Many of Dr. Taylor's assertions concerning the research reflect her access to the paper that was under review. It is typical that such reviews occur with an understanding of confidentiality. In order for Dr. Taylor to have the paper under review, someone violated a confidentiality agreement. We repeat that we did not release the paper and have not refused to answer critical questions about the reading studies, other than Dr. Taylor's initial post to us. We have no idea what Dr. Taylor means by "serious ethical issues" and would simply note that any of the policy agendas that legislative bodies have set forth in the reading area could and would have been undertaken had our study never been published. In the remainder of our response, we will restrict comments to those aspects of Dr. Taylor's post that involved the data. We would note, however, that the coverage provided in the National Academy of Sciences report was at the discretion of the committee. Dr. Foorman is only one member. The impact of which Dr. Taylor is so critical seems to reflect air rushing in to fill a vacuum and would have occurred had the study never been conducted. We are not responsible for implementations of Open Court in different school districts, have not recommended Open Court to different school districts, and certainly have no involvement in decisions in California about curriculums that are adopted. At the same time, there are data that support the efficacy of Open Court, and we are pleased to have provided an independent evaluation of this and other curriculums.
Dr. Taylor makes the following assertions, to which we will respond:
There is no difference in the end of first grade and the beginning of second grade scores for children in the explicit instruction condition.
The means are 12.68 (end of grade 1) versus 5.73 (beginning of second grade). This difference is statistically significant at p<.005 using the separate variance t-test. Contrast this difference with that seen in the embedded phonics group (5.0 for April Grade 1 vs 4.75 for October Grade 2) and implicit instruction group (5.23 for April of Grade 1 vs 5.12 for October Grade 2). Neither of these latter differences is significant in any sense of the word. However, we would point out that the comparison Dr. Taylor wants us to make here is not fair to any of the experimental groups insofar as it attempts to equate April of Grade 1 with October of Grade 2, a span of 6 months. In fact, all three groups, not just the explicit instruction group, are expected to outperform their Grade 2 counterparts who did not receive the programs in Grade 1 once these children also reach October of Grade 2. Six months is a long time at this age. Interestingly enough, this criticism was raised by one of the reviewers of our original manuscript, and we addressed it (apparently to the reviewer's and editor's satisfaction) in our first revision. We believe that Dr. Taylor's first assertion is not correct.
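For readers unfamiliar with the separate variance t-test mentioned above, the following is a minimal sketch of how the statistic and its approximate degrees of freedom (the Welch-Satterthwaite approximation) can be computed from group means, standard deviations, and sample sizes. The function name welch_t is hypothetical, and the numbers in the example call are illustrative placeholders only; they are not values from the study, and the sketch is not the analysis code used for the paper.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Separate variance (Welch) t statistic and approximate degrees of freedom.

    Each group contributes its own variance estimate, so the two groups
    are not assumed to share a common variance.
    """
    v1 = sd1 ** 2 / n1  # squared standard error of the first group mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second group mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics, chosen only to illustrate the call.
t, df = welch_t(10.0, 2.0, 10, 8.0, 2.0, 10)
print(f"t = {t:.3f}, df = {df:.1f}")
```

The resulting t and df would then be referred to a t distribution to obtain a p-value. Because the degrees of freedom depend on each group's sample size, a small group (such as a Grade 2 sample) reduces the power of the comparison.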
There is no significant difference between phonological processing scores obtained by children at the end of first grade versus the beginning of second grade.
Again, by suggesting this comparison, Dr. Taylor makes the tacit assumption of zero growth in phonemic awareness from April to October, which is not consistent with data we have collected in several studies and cohorts of children. Nevertheless, the means for phonological analysis are 2.6 (April grade one) versus 1.74 (October grade two). This difference is only significant at p<.11 and would have required a sample size of 24 instead of 14 in Grade 2 to be significant at p<.05, all other things being equal. This may seem like a large increase in sample size, but note that the grade 2 samples in the embedded code and implicit code groups were 36 and 28, respectively. As we described in our previous post, the sample size was differentially lowered in the explicit instruction group by the decision to restrict comparisons to children receiving tutorial, because one school involved in explicit instruction decided not to provide tutorial services at this grade. We would also add that the difference is statistically significant at p < .05 for the synthesis factor of phonemic awareness, but we did not report results on synthesis in the paper because of the high correlation between analysis and synthesis. Again, contrast the difference in the explicit instruction group with the corresponding differences in the embedded code (1.59 in April of Grade 1 vs. 1.38 in October of Grade 2) and implicit instruction (1.53 in April of Grade 1 vs. 1.58 in October of Grade 2) groups. We feel that Dr. Taylor's description of the effects is quite misleading. However, the issue is not particularly relevant because of the false assumption of zero growth from April to October, and more importantly, because phonological processing was not strictly an endpoint, but a mediator of change. The relevance of phonological processing as a mediator of the instructional effects of explicit instruction is clearly demonstrated in the article.
Dr. Taylor asserts that in the other three groups, children beginning the program in the second grade scored significantly higher on the word reading test than children at the end of the first grade.
We have already shown this to be an inaccurate statement in our response to the first assertion above, but again caution about the implicit, and incorrect, assumption of zero growth from April to October. There are no significant differences in the April means for children who had received a full year of instruction (grade one) versus the October means for those children beginning instruction in grade two. The means are as follows: embedded code: 5.0 (end of grade one) versus 4.75 (beginning of grade two); implicit code - research: 5.23 (end of grade one) versus 5.12 (beginning of grade two); and implicit code - standard: 1.91 (end of grade one) versus 3.17 (beginning of grade two). It is true that this latter group has a higher mean score in October of Grade 2 than in April of Grade 1, albeit not statistically significant. However, this is the group in which we did not provide instruction, so that this difference contradicts Dr. Taylor's assertions. We reiterate that these are not really optimal comparisons for addressing Dr. Taylor's question because of the implicit assumption of zero growth from April to October, and note that none of these three differences is statistically significant. The contrast with the explicit instruction condition is striking: 12.68 (end of grade one) versus 5.73 (beginning of grade two) and statistically significant at p<.005.
Dr. Taylor asserts that "it is clear from the data that it was Foorman's study which had a negative impact on word reading scores of the first graders..."
There is no evidence in the data for negative impacts of any of the interventions that were provided. All the conditions showed growth. The difference was in the rate of growth, and the end of year outcomes.
"There is no comparison of the first and second grade data"
In order to publish the paper, we were asked specifically to justify the combining of first and second graders. We demonstrated to the reviewers that combining first and second grade data was not a major factor in the outcomes. Age was included in all analyses, to account for grade level differences. This was especially important because children in the second grade began with the first grade curriculum.
Dr. Taylor again raises the issue of the "sample bias" that presumably favored the "Open Court/Direct Instruction" condition.
We simply reiterate our previous response and note that decisions about tutoring were based on resources, not on literacy survey scores. Schools did not assign resources based on any components of our data. Finally, dropping the non-tutored subjects, as in the recently published paper, had no effect on the outcomes or the conclusions.
Dr. Taylor asserts that "both the first and second grade children in the Open Court/Direct Instruction groups scored higher on the word reading test than children in the other three treatment groups at the beginning of the year..."
This difference was not statistically significant in the sample that included the non-tutored children, and is not statistically significant in the sample of children who received tutoring, which is the sample reported on in the published article. Prior to dropping the non-tutored children from the analysis, we addressed this issue at some length. The bottom line is that dropping the non-tutored children does not alter the pattern of results or the conclusions reached in the study.
Dr. Taylor again raises the issue of "a double dose of Open Court."
This assertion is not correct and was addressed in our previous response. As we stated, there was no difference in the amount of instructional time provided in the three research conditions. The doubling of lessons was done because the second graders couldn't read and had to start in the first grade Open Court lessons.
"Foorman and her colleagues make no attempt to falsify their hypotheses or refute their theory that training of phonological awareness and direct instruction… is effective in improving beginning reading instruction."
Falsification is a characteristic of the experimental design. There was every opportunity for children in the other conditions to outperform children in the explicit instruction condition. The claims that we disregarded "clear sample bias", the "significant difference in test scores between children at the end of the first grade... and beginning of second grade..." and "...the negative effects their study apparently had on test performance of the children in the other three groups..." are all inaccurate assertions with no basis in the data from the study. Not only do they reflect an inaccurate representation of the data in the study, they ignore the fact that the article was reviewed and published in a major peer-reviewed, archival publication. None of these differences were statistically significant in the original data set. Moreover, these potential problems were identified by the reviewers, and resulted in the re-analysis of the data including only the tutored children. There is no evidence for "negative effects" - all conditions showed growth on average in reading skills, but the rate of change and outcomes were quite different because of the impact of the explicit condition on children with phonological awareness difficulties. We feel that we have provided a thorough analysis of the data and have examined numerous possible explanations for the effects other than the curriculum. These include, but are not limited to, gender, ethnicity, and social class differences between the groups, possible problems created by differential placement of children into tutorial, teacher effects, and, to the extent that we could address them, school effects by including more than a handful of schools and placing more than one curriculum in a school.
We used multiple measures, more than two time points so that we could assess change at the individual level and examine correlates of change free from regression artifacts, and separate end-of-year outcomes from standardized assessments with outstanding psychometric properties. In the Discussion section of the manuscript, we readily acknowledge several limitations of the study that warrant careful consideration and highlight the need for replication. We will not repeat those here other than to say that falsification is part of a process of systematic research in an area. It is not always possible to falsify a hypothesis in a single study, or to rule out all possible alternative explanations for an effect. We feel that there is a growing weight of evidence that points to the beneficial effects of explicit instruction in the alphabetic principle for children who are at risk for reading problems because of a variety of factors that result in them beginning school with poor phonemic awareness skills. We are convinced, and not a little dismayed, that no effort on our part, no matter how diligent, well intentioned, or comprehensive, will lead Dr. Taylor to acknowledge that any result favoring explicit instruction in this study merits further consideration as a potentially real effect, whose limits deserve to be explored in independent, systematic replication research. At the same time, we wonder how many of Dr. Taylor's criticisms of our design, methods, analysis, and sample would have been leveled by her if the outcomes of the study had shown a superiority for the implicit instruction group of comparable magnitude to the one actually found for the explicit instruction group.
"The contention that phonemic awareness must be taught directly and that children need explicit systematic instruction of phonics is less of a scientific fact than an exercise in political persuasion."
What constitutes a scientific fact is not at issue. The Title 1 study is not a "scientific fact" - it is one piece of a large body of data on reading development and instruction. The issue at hand is the body of research, not a single study. We welcome citations of research that refute the conclusions that we and others have drawn on the basis of these studies. Political persuasion occurs at all levels, and by all sides. We would hope that any "political persuasion" is done on the basis of the body of evidence, and not on belief systems that seem impervious to the research findings of this and other studies.