LDOE’s statistical methods and reports are beyond useless according to Harvard professor

(Provided by Mary K. Bellisario)


Question: Have press releases issued by the LA Dept. of Education been accurate, especially regarding specific analysis of results of student AP tests, possibly ACT tests, PARCC preparation, and the VAM (Value-Added Model) for teacher evaluations?


Further: Should legislators base their decisions, and vote their support, on information contained in state press releases about public education which could be inaccurate?


An independent, nationally recognized data analyst, Dr. A. J. Guarino, is doubtful that data from the LA Dept. of Education is being analyzed or presented accurately. He gives his reasons below. He calls into serious question analysis procedures and “selective” reporting in three specific areas: AP test results (and possibly ACT results), PARCC preparation, and the VAM method of teacher evaluation.


Attached above and below is an unsolicited critical analysis of information that has recently come from the LA Dept. of Education, describing Louisiana’s purported progress in public education. The analyst, Dr. A. J. Guarino, is a professor of biostatistics at the Massachusetts General Hospital Institute of Health Professions, part of the Harvard University teaching hospital. He and his graduate students routinely collect and analyze media reports from around the U.S. for their accuracy. Dr. Guarino has ties to Louisiana, where he formerly served as president of the Louisiana Education Research Association (LERA).


If you have further questions, or want to verify his credentials, Dr. Guarino can be reached at:  ajguarino@mgh.harvard.edu


Among Dr. Guarino’s conclusions are: 

“When looked at collectively, there is a troubling trend of either the misrepresentation of statistical outcomes or the incorrect application of statistical principles.”


Further, Dr. Guarino concludes: 

“Based upon my professional experience, I believe the Department of Education and Superintendent White have created both of these scenarios.  My intent is to bring attention to these problems so that legislators and policy makers can make informed decisions moving forward.”


Teaching, learning, and assessment: The closed circle

A. J. Guarino*

I certify that there is no conflict of interest with any financial organization regarding the content of this letter.

The intent of this posting is to report on findings and policies that are based upon incomplete information or misapplication of statistical principles.

        I am a professor of biostatistics at the Massachusetts General Hospital Institute of Health Professions in Boston (teaching hospital for Harvard) where I teach basic and advanced quantitative methods to graduate students in the health sciences. To document the need to be “statistically literate,” I present reports from media sources that purport major findings while failing to provide essential information, information which often contradicts those “major findings.” As the former president of the Louisiana Education Research Association (LERA), I continue to follow the educational developments in Louisiana. Over the past weeks and months, I noticed “press releases” and other media containing only selective information, which made the reported findings suspicious. After I secured the full information, my analyses did not replicate those of the press releases.

        Decisions must never be derived from findings that fail to provide complete information. At this stage of my career, my community-service is to report possible data misrepresentations that appear in the media. I would never seek nor accept compensation to provide statistical consultation.   

1.      Reporting of AP Test Scores

A recent press release by Superintendent White noted that AP passing rates increased 24.6%, the highest in the nation, from 5,144 in 2013 to 6,407 in 2014. These figures are counts of passing tests, not rates. What is conspicuously missing is the ratio between successful tests and unsuccessful tests for each year. The first number (5,144) represents a reported 5.3% passing rate, while the second number (6,407) represents a 4.1% passing rate; in other words, the AP passing rate actually decreased 22.64% from 2013 to 2014.

(5.3 – 4.1) /5.3 = 1.2/5.3 = .2264 (22.64%)

The failure to provide the essential information grossly distorts the interpretation. In order to review the actual passing rates for the latest round of tests, I had to obtain that information from a newspaper article the next day.  Here is the summary:  Of the 55,000 or so additional tests in 2014, only 1,263 (approximately 2.2%) passed. These data were never mentioned in the press release. 

Calculating the percent increase from the raw data without taking into consideration the context of those numbers is complicity. This is like saying more people finished the Boston Marathon this year than last year only because more people entered the race. Given the ever-increasing monies expended by the state to provide for students to take AP exams, it would be appropriate for those holding the data to inform those who need the data to make informed decisions that, last year, approximately 156,000 total AP tests were given and around 6,407 passed (i.e., 4.1%).

6407/156000 = .0410 (4.1%)
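The arithmetic in this section can be checked with a short script. This is a minimal sketch rather than part of the original analysis: the helper name `percent_change` and the variable names are illustrative, and the inputs are simply the figures quoted above (the 156,000 total is approximate).

```python
def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100


# Figures quoted in the letter (illustrative variable names).
passed_2013 = 5144        # passing AP tests, 2013
passed_2014 = 6407        # passing AP tests, 2014
rate_2013 = 5.3           # reported passing rate in percent, 2013
rate_2014 = 4.1           # reported passing rate in percent, 2014
total_2014 = 156_000      # approximate total AP tests given, 2014

# Change in the raw count of passing tests (the press-release figure).
count_change = percent_change(passed_2013, passed_2014)

# Change in the passing rate itself (the figure the letter argues matters).
rate_change = percent_change(rate_2013, rate_2014)

# Passing rate implied by the 2014 count and the approximate total.
implied_rate_2014 = passed_2014 / total_2014 * 100

print(f"Change in number of passing tests: {count_change:.1f}%")   # 24.6%
print(f"Change in passing rate: {rate_change:.2f}%")               # -22.64%
print(f"Implied 2014 passing rate: {implied_rate_2014:.1f}%")      # 4.1%
```

The same two years yield an increase or a decrease depending on whether the count or the rate is compared, which is the distinction the press release leaves out.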

One final note on this matter: It appears that the ACT results have been reported in a similar fashion to the AP scores. If that is the case, and the increases in the number of tests that were scored at 18 or higher are due to increased numbers of students taking the tests, then those results are equally distorted.

2.      Preparation for PARCC Exams

A recent press release from the Superintendent of the State Department of Education states: Since 2010, teachers and state assessment staff in more than a dozen states and the District of Columbia have developed questions, accommodations, and policies that make for an improved blueprint for standardized testing.

At first glance, the statement appears to be quite informative. However, my graduate students were quick to notice the missing essential information, i.e., curriculum. It is significant to note that a “test” is a type of assessment that evaluates academic achievement. Downing & Yudkowsky (2009) explain, “…assessment and instruction are intimately related. Teaching, learning and assessment form a closed circle, with each entity tightly bound to the other” (p. 9).

Superintendent White broke the “closed circle” by suggesting that neither Eureka Math nor any other curriculum is favored for state testing.  In a note to a high level government official who was expressing concerns about Eureka Math, Superintendent White replied:  “We will also continue to insist, as you urge, that state tests be aligned to high expectations for our students’ skills and not to any particular curriculum (emphasis added).” 

In other words, Superintendent White is suggesting that Louisiana should give high-stakes exams to students irrespective of the fact that students, teachers, and stakeholders have not been given a clear curriculum on the first day of school. How do you test what you are supposed to know at the end of the year if you haven’t communicated those expectations at the beginning of the year? Lack of clarity for curriculum guarantees that profit-driven entities will provide their own curriculum, all of which purportedly provide the best means of preparing students to meet expectations laid out by high-stakes exams. This guarantees a process that is costly, inefficient, and ill-defined.

The DOE discreetly recognizes this unfortunate reality by designating certain curricula derived by outside sources as “Tier 1.” However, no outside entity can provide to parents and teachers on the first day of school what students should know and how they will be expected to demonstrate that knowledge; this responsibility falls squarely to the DOE, and to this point in time it is a responsibility that has not been met. The DOE tacitly acknowledges its failure to generate curriculum when it continues to insist that school systems are free to use whatever curriculum they choose.

The lack of a state-generated curriculum that is available on the first day of school short-circuits end-of-course exams. Results are tainted since there has been no defined path presented to articulate the criteria, much less communicate how the criteria will be met.

3.  The Evaluation of Teachers via a Value-Added Model (VAM)

The Louisiana Department of Education has implemented the Value Added Measures (VAM) as part of a teacher’s performance evaluation. However, implementation of any educational intervention must be supported by the best available research results (evidence), more formally known as Evidence-Based Practice (EBP). For example, the Individuals with Disabilities Education Act (IDEA) and Elementary and Secondary Education Act (ESEA) require that schools use programs, curricula, and practices based on “scientifically-based research” “to the extent practicable.” The best available research from the American Educational Research Association (AERA) reports the following: “…weak to nonexistent relationships between state-administered value-added model (VAM) measures of teacher performance and the content or quality of teachers’ instruction.” These results fail to support the utility of VAM data for teacher evaluations. Concurring with the AERA findings is the American Statistical Association, the largest organization in America that represents statisticians and related fields. The Louisiana Department of Education’s implementation of the Value Added Measures (VAM) as part of a teacher’s performance evaluation is in conflict with Evidence-Based Practice (EBP).


When looked at collectively, there is a troubling trend of either the misrepresentation of statistical outcomes or the incorrect application of statistical principles. If the results are misrepresented, you do not have all the necessary data to make an informed decision (#1 above). If the statistical principles are incorrectly applied, you have created a fundamentally flawed process that provides no meaningful interpretations (#2 and 3 above).

Based upon my professional experience, I believe the Department of Education and Superintendent White have created both of these scenarios. 

My intent is to bring attention to these problems so that legislators and policy makers can make informed decisions moving forward.


Bio:  Dr. A.J. Guarino

*A. J. Guarino presently teaches biostatistics at the Massachusetts General Hospital Institute of Health Professions in Boston. In 2011, Dr. Guarino was awarded the Nancy T. Watts Award for Excellence in Teaching, the highest prize given to a faculty member at Boston’s health sciences graduate school. He received his bachelor’s degree from UC Berkeley and his doctorate in Educational Psychology with an emphasis in statistics and psychometrics. He has published over 50 refereed research articles in a variety of fields in health, education, psychology, assessment, and statistics and presented nearly one hundred papers at national and regional conferences. Dr. Guarino has also coauthored five graduate-level statistics textbooks. A partial list of recent scholarship is provided below. Dr. Guarino has also served as president of the Louisiana Education Research Association (LERA).

He can be reached at: ajguarino@mgh.harvard.edu 

Recent Scholarship


Meyers, L., Gamst, G. & Guarino, A. J. (2013). Performing Data Analysis: Using IBM SPSS. Hoboken, NJ: Wiley.

Meyers, L., Gamst, G. & Guarino, A. J. (2013). Applied Multivariate Research Design and Interpretation Second Ed. Newbury Park, CA: Sage.

Meyers, L., Gamst, G. & Guarino, A. J. (2006). Applied Multivariate Research Design and Interpretation. Newbury Park, CA: Sage.

Gamst, G., Meyers, L., & Guarino, A. J. (2008). Analysis of Variance Designs. New York: Cambridge Press.

Meyers, L., Gamst, G. & Guarino, A. J. (2009). Data Analysis Using SAS Enterprise Guide. Hoboken, NJ: Wiley Publication.

Refereed Articles and Presentations

Matthews, L. T., Ghebremichael, M., Giddy, J., Hampton, J., Guarino, A. J., Ewusi, A., Carver, E., Axten, K., Geary, M., Gandhi, R. T., Bangsberg, D. R. (2011). A Risk Factor-Based Approach to Reducing the Likelihood of Lactic Acidosis and Hyperlactatemia in Patients on Antiretroviral Therapy. PLoS ONE 6(4), 1-7.

Nahas, S. J., Young, W. B., Terry, R., Kim, A., Van, D. T., Guarino, A. J., & Silberstein, S. D. (2010). Right-to-left shunt is common in chronic migraine. Cephalalgia 30(5), 535-42.

Chesser, S., Forbes, S. A., & Guarino, A. J. (November, 2011).  Investigation of intra-individual response of the stress hormone cortisol to varying educational environments (single vs. mixed sex groupings). Poster presentation at the annual meeting of the Society for Neuroscience, Washington DC.

Lopez, R. P., & Guarino, A. J. (2013). Psychometric Evaluation of the Surrogate Decision Making Self-Efficacy Scale. Research in Gerontological Nursing 6(1), 71-76. DOI: 10.3928/19404921-20121203-02

Certain, L., Guarino, A. J., & Greenwald, J. (2011). Effective Multilevel Teaching Techniques on Attending Rounds: a Pilot Survey and Systematic Review of the Literature. Medical Teacher, 33, 644-650.

Hastie, P., & Guarino, A.J. (2013). The Development of Skill and Knowledge during a Season of Track and Field Athletics. Research Quarterly for Exercise and Sports.

Lopez, R.P., & Guarino, A.J. (May, 2011). Confirmatory factor analysis of the Surrogate Decision Making Self-Efficacy Scale (SDM-SES). Poster presented at the annual meeting of the American Psychological Society, Washington DC.

Eaves, R.C. & Guarino, A. J. (2006). Dunn’s multiple comparison test. Encyclopedia of Measurement and Statistics, 293-296.