LDOE’s statistical methods and reports are beyond useless according to Harvard professor

(Provided by Mary K. Bellisario)


Question: Have press releases issued by the LA Dept. of Education been accurate, especially regarding specific analysis of results of student AP tests, possibly ACT tests, PARCC preparation, and the VAM – Value-added Method for teacher evaluations?


Further:  Should legislators base their decisions, and vote their support, on information contained in state press releases about public education which could be inaccurate?


An independent, nationally-recognized data analyst, Dr. A. J. Guarino, is doubtful that data from the LA Dept. of Education is being analyzed or presented accurately.  He gives his reasons below. He calls into serious question analysis procedures and “selective” reporting in three specific areas:  AP test results (and possibly ACT results), PARCC preparation, and the VAM method of teacher evaluation.


Attached above and below is an unsolicited critical analysis of information that has recently come from the LA Dept. of Education, describing Louisiana’s purported progress in public education.  The analyst, Dr. A. J. Guarino, is a professor of biostatistics at the Harvard University teaching hospital, Massachusetts General Hospital Institute of Health Professions. He and his graduate students routinely collect and analyze media reports from around the U.S. for their accuracy.  Dr. Guarino has ties to Louisiana, where he formerly served as president of the Louisiana Education Research Association (LERA).


If you have further questions, or want to verify his credentials, Dr. Guarino can be reached at:  ajguarino@mgh.harvard.edu


Among Dr. Guarino’s conclusions are: 

“When looked at collectively, there is a troubling trend of either the misrepresentation of statistical outcomes or the incorrect application of statistical principles.”


Further, Dr. Guarino concludes: 

“Based upon my professional experience, I believe the Department of Education and Superintendent White have created both of these scenarios.  My intent is to bring attention to these problems so that legislators and policy makers can make informed decisions moving forward.”


Teaching, learning, and assessment: The closed circle

A. J. Guarino*

I certify that there is no conflict of interest with any financial organization regarding the content of this letter.

The intent of this posting is to report on findings and policies that are based upon incomplete information or misapplication of statistical principles.

        I am a professor of biostatistics at the Massachusetts General Hospital Institute of Health Professions in Boston (teaching hospital for Harvard) where I teach basic and advanced quantitative methods to graduate students in the health sciences. To document the need to be “statistically literate,” I present reports from media sources purporting major findings while failing to provide essential information, which often contradicts their “major findings.” As the former president of the Louisiana Education Research Association (LERA), I continue to follow the educational developments in Louisiana.  Over the past weeks and months, I noticed “press releases” and other media containing only selective information, which made the reported findings suspicious. After securing the full information, my analyses did not replicate those of the press-releases.

        Decisions must never be derived from findings that fail to provide complete information. At this stage of my career, my community-service is to report possible data misrepresentations that appear in the media. I would never seek nor accept compensation to provide statistical consultation.   

1.      Reporting of AP Test Scores

A recent press-release by Superintendent White noted that AP passing rates increased 24.6%, the highest in the nation, from 5,144 in 2013 to 6,407 in 2014. What is conspicuously missing is the ratio between successful tests and unsuccessful tests for each year.  The first number (5,144) represents a reported 5.3% passing rate while the second number (6,407) represents a 4.1% passing rate; in other words, the AP passing rate actually decreased 22.64% from 2013 to 2014.

(5.3 – 4.1) /5.3 = 1.2/5.3 = .2264 (22.64%)

The failure to provide the essential information grossly distorts the interpretation. In order to review the actual passing rates for the latest round of tests, I had to obtain that information from a newspaper article the next day.  Here is the summary:  Of the 55,000 or so additional tests in 2014, only 1,263 (approximately 2.2%) passed. These data were never mentioned in the press release. 

Calculating the percent increase from the raw data without taking into consideration the contexts of those numbers is complicity. This is like saying more people finished the Boston Marathon this year than last year only because more people entered the race.  Given the ever increasing monies expended by the state to provide for students to take AP exams, it would be appropriate for those holding the data to inform those who need the data to make informed decisions that, last year, approximately 156,000 total AP tests were given and around 6,407 passed (i.e., 4.1%).

6407/156000 = .0410 (4.1%)
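The arithmetic in this section can be checked with a short script. This is only a minimal sketch that uses the figures quoted in this letter; the function name pct_change and the variable names are illustrative assumptions and do not come from any LDOE report.

```python
def pct_change(old, new):
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100.0

# Figures quoted in the letter above.
passed_2013, passed_2014 = 5144, 6407   # number of passing AP tests
rate_2013, rate_2014 = 5.3, 4.1         # reported passing rates, in percent
tests_2014 = 156_000                    # approximate total AP tests in 2014

# Change in the raw COUNT of passing tests (the press-release framing).
count_change = pct_change(passed_2013, passed_2014)

# Change in the passing RATE (the framing the letter argues for).
rate_change = pct_change(rate_2013, rate_2014)

# Cross-check of the 2014 passing rate from the raw totals.
rate_2014_check = passed_2014 / tests_2014 * 100.0

print(f"Change in number of passing tests: {count_change:+.1f}%")
print(f"Change in passing rate:            {rate_change:+.2f}%")
print(f"2014 passing rate from totals:     {rate_2014_check:.1f}%")
```

The same two years of data give an increase when the count of passing tests is compared and a decrease when the passing rate is compared, which is why the denominator (tests taken) needs to be reported alongside the numerator.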

One final note on this matter: It appears that the ACT results have been reported in a similar fashion to the AP scores.  If that is the case, and the increases in the number of tests that were scored at 18 or higher are due to increased numbers of students taking the tests then those results are equally distorted.

2.      Preparation for PARCC Exams

A recent press release from the Superintendent of the State Department of Education states: Since 2010, teachers and state assessment staff in more than a dozen states and the District of Columbia have developed questions, accommodations, and policies that make for an improved blueprint for standardized testing.

At first glance, the statement appears to be quite informative. However, my graduate students were quick to notice the missing essential information, i.e., curriculum. It is significant to note that a “Test” is a type of assessment that evaluates academic achievement. Downing & Yudkowsky (2009) explain, “…assessment and instruction are intimately related. Teaching, learning and assessment form a closed circle, with each entity tightly bound to the other” (p. 9).

Superintendent White broke the “closed circle” by suggesting that neither Eureka Math nor any other curriculum is favored for state testing.  In a note to a high level government official who was expressing concerns about Eureka Math, Superintendent White replied:  “We will also continue to insist, as you urge, that state tests be aligned to high expectations for our students’ skills and not to any particular curriculum (emphasis added).” 

In other words, Superintendent White is suggesting that Louisiana should give high-stakes exams to students irrespective of the fact that students, teachers, and stakeholders have not been given a clear curriculum on the first day of school.  How do you test what you are supposed to know at the end of the year if you haven’t communicated those expectations at the beginning of the year?  Lack of clarity for curriculum guarantees that profit-driven entities will provide their own curriculum, all of which purportedly provide the best means of preparing students to meet expectations laid out by high-stakes exams.  This guarantees a process that is costly, inefficient, and ill-defined.

The DOE discreetly recognizes this unfortunate reality by designating certain curriculum derived by outside sources as “Tier 1.”  However, no outside entity can provide to parents and teachers on the first day of school what students should know and how they will be expected to demonstrate that knowledge; this responsibility falls squarely to the DOE, and to this point in time it is a responsibility that has not been met. The DOE tacitly acknowledges its failure to generate curriculum when it continues to insist that school systems are free to use whatever curriculum they choose.

The lack of a state-generated curriculum that is available on the first day of school short-circuits end-of-course exams. Results are tainted since there has been no defined path presented to articulate the criteria, much less communicate how the criteria will be met.

3.  The Evaluation of Teachers via a Value-Added Model (VAM)

The Louisiana Department of Education has implemented the Value Added Measures (VAM) as part of a teacher’s performance evaluation. However, implementation of any educational intervention must be supported by the best available research results (evidence), more formally known as Evidence-Based Practice (EBP). For example, the Individuals with Disabilities Education Act (IDEA) and Elementary and Secondary Education Act (ESEA) require that schools use programs, curricula, and practices based on “scientifically-based research” “to the extent practicable.” The best available research from the American Educational Research Association (AERA) reports the following: “…weak to nonexistent relationships between state-administered value-added model (VAM) measures of teacher performance and the content or quality of teachers’ instruction.” These results fail to support the utility of VAM data for teacher evaluations. Concurring with the AERA findings is the American Statistical Association, the largest organization in America that represents statisticians and related fields.  The Louisiana Department of Education’s implementation of the Value Added Measures (VAM) as part of a teacher’s performance evaluation is in conflict with Evidence-Based Practice (EBP).


When looked at collectively, there is a troubling trend of either the misrepresentation of statistical outcomes or the incorrect application of statistical principles.  If the results are misrepresented, you do not have all the necessary data to make an informed decision (#1 above).  If the statistical principles are incorrectly applied, you have created a fundamentally flawed process that provides no meaningful interpretations (#2 and 3 above).

Based upon my professional experience, I believe the Department of Education and Superintendent White have created both of these scenarios. 

My intent is to bring attention to these problems so that legislators and policy makers can make informed decisions moving forward.


Bio:  Dr. A.J. Guarino

*A. J. Guarino presently teaches biostatistics at the Massachusetts General Hospital Institute of Health Professions in Boston. In 2011, Dr. Guarino was awarded the Nancy T. Watts Award for Excellence in Teaching – the highest prize given to a faculty member at Boston’s health sciences graduate school.  He received his bachelor’s degree from UC Berkeley and his doctorate in Educational Psychology with an emphasis in statistics and psychometrics. He has published over 50 refereed research articles in a variety of fields in health, education, psychology, assessment, and statistics and presented nearly one hundred papers at national and regional conferences. Dr. Guarino has also coauthored five graduate level statistics textbooks.  A partial list of recent scholarship is provided below.  Dr. Guarino has also served as president of the Louisiana Education Research Association (LERA).

He can be reached at: ajguarino@mgh.harvard.edu 

Recent Scholarship


Meyers, L., Gamst, G. & Guarino, A. J. (2013). Performing Data Analysis: Using IBM SPSS. Hoboken, NJ: Wiley.

Meyers, L., Gamst, G. & Guarino, A. J. (2013). Applied Multivariate Research Design and Interpretation Second Ed. Newbury Park, CA: Sage.

Meyers, L., Gamst, G. & Guarino, A. J. (2006). Applied Multivariate Research Design and Interpretation. Newbury Park, CA: Sage.

Gamst, G., Meyers, L., & Guarino, A. J. (2008). Analysis of Variance Designs. New York: Cambridge Press.

Meyers, L., Gamst, G. & Guarino, A. J. (2009). Data Analysis Using SAS Enterprise Guide. Hoboken, NJ: Wiley Publication.

Refereed Articles and Presentations

Matthews, L. T., Ghebremichael, M., Giddy, J., Hampton, J., Guarino, A. J., Ewusi, A., Carver, E., Axten, K., Geary, M., Gandhi, R. T., Bangsberg, D. R. (2011). A Risk Factor-Based Approach to Reducing the Likelihood of Lactic Acidosis and Hyperlactatemia in Patients on Antiretroviral Therapy. PLoS ONE 6(4), 1-7.

Nahas, S. J., Young, W. B., Terry, R., Kim, A., Van, D. T., Guarino, A. J., & Silberstein, S. D. (2010). Right-to-left shunt is common in chronic migraine. Cephalalgia 30(5), 535-42.

Chesser, S., Forbes, S. A., & Guarino, A. J. (November, 2011).  Investigation of intra-individual response of the stress hormone cortisol to varying educational environments (single vs. mixed sex groupings). Poster presentation at the annual meeting of the Society for Neuroscience, Washington DC.

Lopez, R. P., & Guarino, A. J. (2013). Psychometric Evaluation of the Surrogate Decision Making Self-Efficacy Scale. Research in Gerontological Nursing 6(1), 71-76. DOI: 10.3928/19404921-20121203-02

Certain, L., Guarino, A. J., & Greenwald, J. (2011). Effective Multilevel Teaching Techniques on Attending Rounds: a Pilot Survey and Systematic Review of the Literature. Medical Teacher, 33, 644-650.

Hastie, P., & Guarino, A.J. (2013). The Development of Skill and Knowledge during a Season of Track and Field Athletics. Research Quarterly for Exercise and Sports.

Lopez, R.P., & Guarino, A.J. (May, 2011). Confirmatory factor analysis of the Surrogate Decision Making Self-Efficacy Scale (SDM-SES). Poster presented at the annual meeting of the American Psychological Society, Washington DC.

Eaves, R.C. & Guarino, A. J. (2006). Dunn’s multiple comparison test. Encyclopedia of Measurement and Statistics, 293-296.

My release of the John White, Alan Seabaugh taped conversation

I have been getting requests to release the taped conversation between Alan Seabaugh and John White discussing the manipulation of VAM, or the Value Added Modeling system devised by the state to evaluate teacher performance to serve a few of Seabaugh’s constituents. I did not receive the full 14 minutes that Tom Aswell received, just 3 or so minutes of some of the more interesting points, but without the full context. I think I have mined my piece pretty well for interesting nuggets but feel free to review it yourselves to see if you spot anything else interesting.

Audio removed by request as of 2/12/15  (I will need to create a revised one.)

I think it’s kinda cute how John White is using his big boy voice to talk to one of our legislators. You can tell he’s really straining his diaphragm to summon up a deeper voice, and at points I worry he will let out a little squeak or perhaps exhaust himself before the conversation is over. Boy, that would be embarrassing. As it is, he just calmly explains how you can’t always rely on the data, especially when the data doesn’t show what you want it to show. No problem, and no need to bring this point up to BESE. It’s much easier to randomly assign “bonus points” to favored teachers or teacher groups to get them to have the scores you want them to have. What’s nice about this approach is you can simultaneously sully the reputations of teachers teaching more challenging, poor, or black students and drive them out of the profession. This makes it easier to bring in charter school operators, who don’t require certified teachers at all! It also makes voucher schools look like better options by comparison. If you drive down the quality of public schools enough, well, eventually those voucher schools, which score 30 points lower on the standardized (LEAP) tests public schools are given, will start to look like a good deal, or at least muddy the narrative a little by making them look not so obviously, dreadfully, worse.

I don’t have the full 14 minutes of the conversation.  For that you will have to contact Tom Aswell at Louisiana Voice, but this might still provide some interesting insight into how things really work between BESE, Jindal, John White and the legislature.  And for future reference, if you are a constituent of Alan Seabaugh up in Shreveport, you might try contacting him for personal favors since that appears to work.  Perhaps he can get DOTD to repave your driveway, or DEQ to take out your trash?  Hey, it’s worth a shot, right?

Cleaning Up John White’s Mess

John White is likely to be gone by the end of June but Louisiana will still have its work cut out cleaning up the messes he will leave behind.  Some of those messes off the top of my head are:

but what I’m going to tackle now is the fatally flawed COMPASS and VAM system that even John White’s own staff agree is racially and socioeconomically biased – as you can see from this internal e-mail below that circulated before the Seabaugh Solution was reaffirmed by White.

I want you to read the passages I highlighted and let that sink in before I explain.  COMPASS is a teacher evaluation system designed for Louisiana.  It was initially developed with the help of an out-of-state researcher named Charlotte Danielson, who is considered one of the pre-eminent authorities in this field.  However, Ms. Danielson has done more than simply distance herself from our evaluation system.

Danielson was surprised to hear the state was launching a teacher observation tool without first trying it out in a few districts. Before Tennessee made its evaluation system a state requirement last year, for example, it experimented for a year with various observation models in schools across the state.

“It’s never a good idea to use something for high stakes without working out the bugs,” Danielson said. “The thing I worry about from a purely selfish standpoint is that my name gets associated with something people hate, and I’m not happy about that.”

Besides making people unhappy, mistakes could also end up costing the state, Danielson warned. “I worry a lot [that] if we have systems that are high stakes and low rigor, we’re going to end up with court cases,” she said.

You see, we only took a few of the simplest metrics she developed: 5 of 22.

Louisiana has adopted part, but not all, of her framework for use in classroom observations, which will factor into a teacher’s annual score and which will ultimately determine whether educators can keep their jobs.

Although Danielson helped the state create a shortened version of her system at its request, she’s worried her truncated observation checklist could create problems for teachers and evaluators.

“I think it decreases accuracy. I think that’s an almost certain consequence,” she said.

Louisiana adopted the new system to comply with Act 54, a law passed in 2010 aimed at improving teacher quality in the state with more intensive, annual teacher evaluations. Half of a teacher’s rating will be calculated based on how he or she scores in the observation, and half will be determined by how students perform on standardized tests. Teachers who perform poorly on the evaluations could lose their certification.

But more than that, teachers could be fired as well, based on a model that its own creator says is quite likely flawed because of its simplicity.  However, what many of you might not realize is that teacher effectiveness is also determined by the VAM, or Value Added Modeling, score.  In fact, when there is a difference between VAM and the COMPASS evaluation, VAM is the score a teacher gets, which means the COMPASS evaluation is essentially useless for the one-third of all teachers who have a VAM score because they teach a test-evaluated subject.  The VAM system was built on a questionable premise to begin with, but what little credibility it might have gained was completely annihilated by John White and Alan Seabaugh’s tinkering with the system for personal reasons.

However, even more alarming is that the solution adopted seems to punish teachers who teach our neediest students, students from the poorest backgrounds.  The way it does this is by giving “bonus points” to teachers teaching more advanced students, who tend to be the more affluent ones.  VAM is based on a curve.  Everyone can’t get an A.  Effectiveness ratings are based on where teachers fall in the curve, where the top 10-20% are the most effective, and the lowest 10-20% are the least effective.  In this type of scheme, both success and failure are guaranteed, and your success is entirely dependent on the success of those around you.  When some teachers are given bonus points to lift their scores, this causes teachers without these points to drop into lower categories.  The Seabaugh Solution involves giving bonus points to teachers teaching advanced students, which means they will never be found ineffective, and are thus immune to most of the negative implications of COMPASS and VAM and more likely to earn financial incentives.  Teachers in schools with poorly performing students, who are mostly poor and black, will be that much more likely to be found lacking. . .  and subject to being stripped of tenure, or even dismissed.
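A small sketch may make the curve effect concrete. Everything in it is hypothetical: the teacher labels, the scores, the 5-point bonus, and the 10% cutoff are made up for illustration and are not the state's actual VAM formula. The only point is that on a ranked curve, a bonus for one teacher can push another teacher into the bottom group even though that teacher's own score never changed.

```python
def bottom_group(scores, fraction=0.10):
    """Return the teachers ranked in the lowest `fraction` of the curve."""
    ranked = sorted(scores, key=scores.get)  # lowest score first
    cutoff = max(1, round(len(ranked) * fraction))
    return set(ranked[:cutoff])

# Hypothetical raw scores for ten teachers (invented for illustration).
raw = {
    "advanced_A": 40, "regular_B": 42, "regular_C": 47, "regular_D": 50,
    "regular_E": 53, "regular_F": 55, "regular_G": 58, "regular_H": 60,
    "regular_I": 63, "regular_J": 66,
}

BONUS = 5  # hypothetical bonus for teaching advanced students

# Only the teacher of advanced students receives the bonus.
adjusted = {
    name: score + BONUS if name.startswith("advanced") else score
    for name, score in raw.items()
}

before = bottom_group(raw)
after = bottom_group(adjusted)

print("Lowest group before bonus:", before)
print("Lowest group after bonus: ", after)
```

In this made-up example the teacher labeled regular_B has the same score in both runs, yet lands in the lowest group once the bonus is applied to someone else.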

The COMPASS system and VAM must be abandoned.  John White has failed at everything he tried to do in Louisiana, and everything he has done has failed.  Now it’s time to clean up the rest of his mess.  We can start by eliminating VAM and COMPASS and the people he brought in from out of state like Hannah Dietsch and Molly Horstman to oversee a system that was known to be racially biased, politically tampered with and so poorly designed and implemented that the person who helped create it no longer wants her name associated with it, because she thinks it’s so bad and so unfair it could expose us to lawsuits that would be easily won.

Time to start eliminating the mess. . .

John White’s crumbling house of cards – From VAM to worse

These days John White might be needing a hug. It appears all the ill will and poor decision making he has sown over the past year are coming home to roost on his doorstep, all at the same time.

Support the Louisiana Student Privacy Act

Recently we called out White for lying to everyone in the state of Louisiana over carelessly and needlessly exposing all of our school children to unnecessary risks by handing over some of their most private data to companies hoping to make a profit off the information by direct marketing products to children, charging the state millions of dollars for the privilege of using their storage services, and by marketing and providing this information to unlimited third party vendors. I encourage the legislature to craft and pass the Louisiana Student Privacy Act, which will go further than FERPA (the Family Educational Rights and Privacy Act, the federal student privacy law) and actually protect our children’s privacy instead of selling it out to anyone who asks.

Reject House Bill 650 – DOE reorg

White is trying to push through a bill supporting the reorganization of the Louisiana Department of Education, about a year too late according to this DOE employee: (Link to testimony from April 11th)

Last week Beth Scioneaux indicated in House Education that the LDOE is going to lose another 34 positions. She didn’t say how many actual bodies were going to be lost but the numbers are getting pretty low with the exception of the TFA types. Erin Bendily lied through her teeth when she said that the department wasn’t working under the proposed reorg. White started putting everything in place the moment he came in and had pretty much finished by early fall. He even had a draft of the org chart that he handed out to some staff around September or early October. He also wants to make the Deputy Supt. position optional because his hand-picked second in command (Kunjan) couldn’t possibly be confirmed. He changed the title to Chief of Staff to get around it…

While Representative Chris Broadwater stated he has questions about how the reorg would work before he approves it (which already seems a little redundant since it doesn’t appear he cares what the answers are), he might as well ask John White how the reorganization is already working out. White just needs rubber stamps, as BESE has learned. Unless you think White should be able to thumb his nose at the legislature and oversight, you should probably vote down House Bill 650.

This information jibes with everything I saw before I left. Erin actually sent an e-mail to us long before John White was confirmed, when he was still the Superintendent of RSD, instructing us not to make any decisions until Superintendent White was “officially” confirmed. I’m told he was making personnel decisions long before BESE voted to confirm him, so it makes sense he would reorganize the department illegally and then ask for permission afterwards. White even drafts and then approves BESE agendas before BESE meets or reviews them.

Reject 2013 MFP

(don’t defund Special Education Programs, don’t use a DOE known flawed VAM for funding calculations (SCR 23))

John White has earned the ire of many advocates for Special Education Students of the disabled, gifted and talented varieties. Despite what he has claimed, this formula is not neutral, and actually reduces funding for some school districts even by DOE’s own reports they produced to try to dispute this fact based on the rosiest forecasts I’m sure they could come up with. The new MFP takes money away from talented programs, reduces gifted program funding, and reduces funding for disabled students who can’t exceed their value added scores by larger and larger amounts every year (which is statistically impossible).


Pass House Bill 160 – VAM delay and oversight

Because VAM or Value Added Modeling has been introduced into funding for school districts I would be remiss if I did not spend a little time explaining some of the flaws with this system that have come to my attention recently. If you don’t know what VAM is offhand, you are not alone. Even the Louisiana Department of Education, which uses VAM for everything from teacher evaluations, to school grades, to funding, to soybean substitute, has no idea how this system works according to this former employee.

Most of all, he [White] knows the Value Added Model, the all-important VAM, is broken. And he knows this because he broke it.

What was most interesting to read, however, was Johnny’s opposition to the ‘legislative oversight’ aspect to HB 160. He’s already allowing legislators (well, at least one northwestern one in particular) to dictate what the model does; why not let an entire committee or two have input? Is it because he knows that what he’s doing is wrong, if not illegal? Does he not want our elected BESE members to know that he bypassed them once again, by skipping policy and instead screwing with the math? Would he prefer that our esteemed, elected, representatives not be aware that he is playing around with citizens’ lives and careers because the governor tells him to ‘trust me, you gotta do this’? Perhaps he would not like the courts to know that he continues to flaunt the law by ignoring specific mandates that don’t suit him? He may be afraid that the public will learn that ‘Louisiana Make Believes’ data is being used to determine teachers’ futures, school takeovers and charters, voucher eligibility and the Course Choice crap, along with the future privatization of education. Or, maybe, it’s just that he’s concerned that everyone will finally discover that he’s nothing but White Lies.”

VAM was a flawed metric to use for all the things DOE tried to apply it to, but made more so by all the frenetic gyrations John White put the system through trying to please a select few. I strongly urge our legislators to reject the MFP formula put forth by BESE and ask that BESE restore the previous year’s formula that does not contain a tragically flawed VAM component. The folks that crafted the original VAM formula are gone, and John White almost exclusively hires TFA (Teach for America) folks that lack expertise in the areas he assigns them. Just this past week or so I learned that as many as 60 DOE personnel were given walking papers as part of a RIF, Reduction in Force plan. However I’ve also learned that John White already has new TFA recruits waiting in the wings to fill these spots he is “temporarily” eliminating – at much higher salaries and much fewer years of experience than much of the current staff he is releasing.

I’ve heard from several sources that John White is worried (freaked out) by some Freedom of Information requests that were made about VAM. Well there is a lot to be frightened of apparently:

It’s interesting to read that John White appears willing to hold off implementing Compass http://theadvocate.com/news/education/5756154-123/white-looks-at-teacher-evaluations (that new teacher evaluation system) for one year…just about exactly one year after educators around the state begged him to do exactly that. He knows the entire program is, in a nutshell, a clusterf**k. He knows the two data systems involved may not work together; one of them is not even completed. He knows his district support office (the one that was just approved…months after being created….http://theadvocate.com/home/4127328-125/la-superintendent-reorganizes-staff and after the guy running it for five months left. http://louisianavoice.com/2013/04/05/five-months-and-out-was-that-enough-time-for-doe-deputy-superintendent-to-obtain-louisiana-license-plates-for-his-car/) provides little to no support, while ‘toolboxing’ around the state in rental cars, collecting mileage and per diems. He knows that schools have been given precious few resources and little guidance, and no definite answers. He knows the ‘new and improved’ LouisianaBelieves website leads teachers on an endless loop of 404-file not found error messages, dead links, outdated or “coming soon” information.

In case some of you were not keeping up, John White scrapped the former department of education website in favor of what has been called a childish, useless crayon inspired endeavor that contains nothing useful except pictures of John White handing out big checks to school districts with his sleeves rolled up. As impressive as that sight is to the John White fan club, most of us are not members, and what he has done offends us.

When I was at DOE we used the website as an archive of useful information and reports which we could direct internal staff, school district personnel, parents and other stakeholders to.  Now the site is universally understood to be a useless mess.  I suppose in a way that makes sense. John White has a god complex and he’s crafted the new department website in his own image.

John White doesn’t want people keeping track of what’s going on, so he scrapped the old site, but “we in the know” are doing our level best to keep you informed. White can’t continue to lie to us, sell our children, humiliate our teachers, defund our Special Education students, dazzle us with VAM BS, and ignore both the legislature and BESE. . .  unless you let him. My brave colleagues and I have done our part; now it’s time for some of you to do yours.

No matter what John White names his website, Louisiana doesn’t have to believe anymore.  Reject John White and his bankrupt policies and send him packing.  We deserve better.  Our children deserve better than having to put up with this.

VAMtastic – what bicycles and teachers have in common

This is a discussion about VAM (value added modeling): the idea that teacher effectiveness can be accurately predicted based mostly on student test scores (or entirely, in Louisiana) and that teachers’ fortunes should be tied to those test scores.

That sounds boring so I’ll start off with a really exciting topic, like global warming. :)  It appears that manmade activities are causing the earth to warm.  Even people the Koch brothers hired to say otherwise were unable to deny this.  Recently a “creative” state Republican legislator named Ted Orcutt proposed an extra tax on bicyclists claiming that bicyclists exude carbon dioxide at higher rates and are harming the environment.

“Also, you claim that it is environmentally friendly to ride a bike. But if I am not mistaken, a cyclist has an increased heart rate and respiration. That means that the act of riding a bike results in greater emissions of carbon dioxide from the rider.  Since CO2 is deemed to be a greenhouse gas and a pollutant, bicyclists are actually polluting when they ride.” Ted Orcutt

Early Global Warming Device – from Ted Orcutt’s history book

He wanted to discourage bicycle use with his tax, claiming this was in the interest of climate control. A helpful, knowledgeable reader of the blog I pulled this from did the calculation for us, showing the absurdity of a remark most of you probably already realized intuitively was inane and insane.

“I did a back-of-envelope calculation:
CO2 per mile for car getting 20 mpg = 446 grams
CO2 per mile for average person riding bike at 15 mph = 17 grams
But carbon source is different.  The gasoline carbon is newly introduced to the atmosphere while the carbohydrate carbon the cyclist burned came from plants which obtained it from the atmosphere (of course the plants had to be farmed and the food transported both of which take fuel).” Gary

While it is true that carbon dioxide contributes to the warming of the climate, it is not the sole contributor.  His argument also fails to take into account, in the very example given, cars.  (Nor does it account for the fact that these people biking to work would otherwise have to use cars, which produce CO2 at more than 25 times the rate of a bicyclist, or that by his logic we should also tax people in gyms and people just breathing while sitting on their sofas.)  It does not take into account factories, gas-powered appliances and heating, deforestation, nuclear reactor meltdowns, warfare, airplanes, launching satellites into orbit, and freon and refrigerants, to name just a few of the more common man-made contributors.
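For anyone who wants to check the arithmetic, here is a minimal sketch of the comparison. The 446 and 17 gram figures come straight from Gary's comment above. The 8,887 grams of CO2 per gallon of gasoline is an outside figure (the commonly cited EPA estimate) and is an assumption of this sketch, as is reading the car figure as a fuel economy of 20 miles per gallon.

```python
# Figures quoted in the comment above (grams of CO2 per mile).
CAR_G_PER_MILE_QUOTED = 446
BIKE_G_PER_MILE_QUOTED = 17

# Assumptions NOT in the original comment:
# - roughly 8,887 g of CO2 per gallon of gasoline burned (commonly cited EPA estimate)
# - the car's "20" is a fuel economy in miles per gallon
GASOLINE_CO2_G_PER_GALLON = 8887
CAR_MILES_PER_GALLON = 20

# Re-derive the car figure from the assumptions to see if it matches the quote.
car_g_per_mile_derived = GASOLINE_CO2_G_PER_GALLON / CAR_MILES_PER_GALLON

# How many times more CO2 per mile the car emits than the cyclist.
ratio = CAR_G_PER_MILE_QUOTED / BIKE_G_PER_MILE_QUOTED

print(f"derived car figure: {car_g_per_mile_derived:.0f} g/mile")
print(f"car emits about {ratio:.0f} times the cyclist's CO2 per mile")
```

The derived car figure lands within a couple of grams of the quoted one, and the ratio supports the "more than 25 times" claim. None of this accounts for the point Gary makes about where the carbon comes from.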

Terrance Shuman, a fellow blogger and commenter, identified a similar problem with something VAMvateers take for granted, in a comment on Dr. Mercedes Schneider’s blog this morning.

“The planted axiom in all of this, of course, is that the test scores we’re using actually convey something meaningful and important. This has not, in my opinion, been definitively established. And if we don’t know that, the rest of the house of cards comes tumbling down, doesn’t it?”

I pointed out this:

“As a corollary to your axiom, while I think it’s fair to say they convey something, and something that may even prove to be meaningful in a limited context, what is not proven is whether the “something” that might be conveyed is meaningful in the context in which it is being applied.  These are student test scores, not teacher test scores.  Sure, teachers have an impact on test scores; the absence of a teacher would probably yield a much lower one, for instance.  That does not take into account all factors influencing a test score.  In fact, what the study results do prove is that teachers are not the sole determining factor, and possibly not even the most important one.  Scores remain consistent only about 20-30% of the time year over year with no change in composition of students or teaching methods.  That implies other factors not accounted for in the “model” impact 70-80% of the score.”
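To make that last point concrete, here is a toy simulation. It is a hypothetical illustration, not anyone's actual VAM model: it assumes each teacher's yearly score is a stable teacher effect plus everything else (treated here as random noise), and that the teacher effect accounts for 25% of the variance. The names `simulate_two_years` and `teacher_share` are made up for this sketch.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_two_years(n_teachers=500, teacher_share=0.25, seed=0):
    """Score = stable teacher effect + other factors that are redrawn every year.

    teacher_share is the assumed fraction of score variance due to the teacher.
    """
    rng = random.Random(seed)
    teacher_sd = teacher_share ** 0.5
    other_sd = (1 - teacher_share) ** 0.5
    year1, year2 = [], []
    for _ in range(n_teachers):
        effect = rng.gauss(0, teacher_sd)  # same teacher both years
        year1.append(effect + rng.gauss(0, other_sd))
        year2.append(effect + rng.gauss(0, other_sd))
    return year1, year2


year1, year2 = simulate_two_years()
r = pearson(year1, year2)
print(f"year-over-year correlation of scores: {r:.2f}")
```

Under these assumptions the same teachers, with the same underlying quality, get scores that correlate only weakly from one year to the next, because most of what moves the score is not the teacher. Real data is messier than this, which is rather the point.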

The logic of VAM, if you can call it that, is as inane as Ted Orcutt’s reasoning that if we would discourage people from bicycling we would reduce global warming.  VAMvateers point out that test scores increase over time as kids age, which is hard to argue with.  They claim a “good” teacher is better than a bad teacher for increasing a test score.  This is probably true, and also hard to argue with at face value, although how to quantify it is not a given.  However, at that point they jump the data tracks.  They explain that teachers are the primary influence on student test scores, and therefore one can use test scores to sort out “good” and “bad” teachers.  However, none of the studies, even their own studies, bear this out.  Even by the most benevolent interpretations I’ve seen, teachers can’t be credited with even half of a student’s change in test scores or test score outcome.  Factors that are much harder to measure, like environment, curriculum, school facilities, parental involvement, learning disabilities (known or unknown), illness, psychological trauma, poverty, safety, and probably astrological sign (to name a few factors), add up to more than what bankers and billionaires would have you believe.

That’s not to say teachers are unimportant.  That’s also not to say bad teaching or bad teachers are not a problem, merely not “the” problem.  Teachers are a part of the equation we can control, but the measurement mechanism we are using is fallacious.  Just because people like Ted Orcutt are only able to apply what they learned in a first grade science class doesn’t mean you need to.  Global warming is not adversely impacted by bicyclists simply because they exhale carbon dioxide, any more than some of our sketchy educational outcomes are solely the responsibility of our teachers.  There is no amount of “good” teaching that can singlehandedly overcome the stacked deck of generational poverty, health and safety issues and emotional trauma.  However, claiming this to be the case has its advantages for the ones claiming it.

Swelling a class size from 20 to 40 or 60, or as many as 500 for virtual schools, will not address those issues, and without looking at any data, I can guarantee it will make things worse.  You will have politicians in the pockets of billionaires and bankers trying to sell you on this idea with cryptic data you won’t understand, designed to dazzle you, just like they did with the dotcoms, the Enrons, the subprime mortgage-backed securities that brought down the housing market, and the derivatives trading that almost wiped out AIG and then recently almost wiped out JP Morgan Chase in weeks (having failed to learn the lesson of AIG).

Don’t be fooled by their “data.”  Look at their motivations, and use your brain, your heart and your experience.  Think back to your own classes (which I assume didn’t have 60+ children in them like Reformers are pushing for now.)

Bicyclists are not causing global warming any more than teachers are causing our population, with the highest childhood poverty rate in the industrialized world, to do worse on standardized tests.  Look in the mirror the next time you vote down a millage tax for school improvements, or you let your children watch TV before doing their homework, or elect a governor that tells you the teachers are at fault, not you, not poverty, not schools with roach-infested halls and leaking walls and faulty air conditioning.  When you elect a megalomaniacal governor like Bobby Jindal (who never saw an illegal contribution he felt he needed to return) and allow him to hire a Superintendent like John White that empties music rooms, art studios, libraries, guidance counselors, school psychologists, GT programs and cuts funding for special education programs. . . well, look in the mirror and you will have your answer.

After all these bankers and billionaires have done to fool you with their “data” in the past, you would really have to be a fool to “Believe” them now.  They even use this concept to mock you, you know. . .  The “Louisiana Believes” slogan John White and his sadistic cronies dreamed up is an inside joke among the top brass at LDE, but if you don’t see it for what it is, well then the joke really is on you.

Actually, I really hope you are being fooled by their numbers, no matter how many times they come to you with ridiculous claims backed by infantile reasoning of the Ted Orcutt kind.  Because if not, well, then you are just allowing your kids’ teachers to take the fall for your laziness, greed and sloth – glad that they are taking the fall so you can continue to watch your reality TV while your kids play their video games and 25% of other people’s kids struggle to complete their homework – with empty stomachs and gunshots in the background, or even the foreground in schools like Sandy Hook.

I wonder how their teachers’ VAM scores will look this year.  I’m pretty sure the VAMvateers haven’t added a mass murder adjustment to their equation, but when all the students at that school do worse than expected, at least we can find someone to blame other than the shooter, or the gun, or us for allowing those guns to be so accessible.

We can blame the teachers.

Problem solved.

Jersey Shore anyone?

easy as shooting sharks in a barrel


Hi Mercedes! Mind if I steal your blog and add your blog thoughts to my own? (not that you have a choice. :) )

I think Peter Orszag makes a valid point. VAM is slightly better than nothing, depending on how you look at it, assuming perfect data and that children were simple computer models confined to a lab. It might be correct 5% of the time or so. If we only knew which of the results we got comprised that 5%, perhaps we could act on it in a constructive and responsible manner? What does not make sense is using unproved, disproven and destructive free-market inspired economic principles to replicate 5% of the “correct results” (which we can’t identify) as well as the 95% inaccurate ones.

Let’s say I want to kill sharks because I’ve decided they are “bad” fish. To a VAMite, the best way to do this is to create a giant barrel and dredge it along the ocean floor, scooping up the schools of tuna the sharks like to hang out with.

Now we must eliminate the shark. To do that I could figure out an accurate way to identify a shark, or maybe just hire a good fisherman (principal) that can identify them and remove them from the school. Not much money to be made doing that, and how will I sell all my canned tuna with all this live fresh tuna swimming around?

So I decide the best way for my tuna cannery and gun smithing business interests is to take a shotgun and shoot all the fish in the barrel.

Success! I can now claim I killed a shark and wave it around for all to see. Tada! I killed a “bad” fish with a bullet. Now our tuna supply is saved!

Hmm. . . I also killed all the other fish by letting out the water or riddling them with bullets…

But wait! Double kaching! More cheap fodder for my factory! Now I must sell this idea to the masses. . .

Attention Masses: All we have to do is put all our fish in barrels and shoot them; then we will kill all the “bad” fish! (Sure, a few good ones have to be sacrificed, well, all of them, but we did get rid of the “bad” fish and we can always can (computerize) the casualties.) Bad fish problem solved and canned tuna all around!

NOTE TO SELF: Hire Pearson to capture all the fish and put them in barrels. Tell Murdoch to supply the guns and get Gates to supply the bullets.





Originally posted on deutsch29:

In order to truly understand value added modeling (VAM), forget the likes of me and of others who hold degrees in mathematics, or statistics, or measurement. Forget that we offer solid, detailed discussions of the problems of VAM. Forget also that those who formerly promoted VAM, like Louisiana’s George Noell, are mysteriously “no longer associated with the project.”

According to Michael Bloomberg, just ask a banker.

That’s right.  Banker and former director of the Office of Management and Budget for the Obama administration Peter Orszag has written an enlightening piece for Bloomberg.com explaining that VAM really does work.  According to Orszag, VAM can determine “which teachers are best.” Now, mind you, I’m no banker, but I would like to offer my thoughts on Orszag’s very positive article on the value of the value added.

First, let me begin with Orszag’s statement regarding “promoting the most talented teachers.” What, exactly, is a “most talented…


Excellent essay from edweek on why Value Added is junk science

Probing the Science of Value-Added Evaluation

by R. Barker Bausell, edweek.org January 16th 2013

Value-added teacher evaluation has been extensively criticized and strongly defended, but less frequently examined from a dispassionate scientific perspective. Among the value-added movement’s most fervent advocates is a respected scientific school of thought that believes reliable causal conclusions can be teased out of huge data sets by economists or statisticians using sophisticated statistical models that control for extraneous factors.

Another scientific school of thought, especially prevalent in medical research, holds that the most reliable method for arriving at defensible causal conclusions involves conducting randomized controlled trials, or RCTs, in which (a) individuals are premeasured on an outcome, (b) randomly assigned to receive different treatments, and (c) measured again to ascertain if changes in the outcome differed based upon the treatments received.

The purpose of this brief essay is not to argue the pros and cons of the two approaches, but to frame value-added teacher evaluation from the latter, experimental perspective. For conceptually, what else is an evaluation of perhaps 500 4th grade teachers in a moderate-size urban school district but 500 high-stakes individual experiments? Are not students premeasured, assigned to receive a particular intervention (the teacher), and measured again to see which teachers were the more (or less) efficacious?

Granted, a number of structural differences exist between a medical randomized controlled trial and a districtwide value-added teacher evaluation. Medical trials normally employ only one intervention instead of 500, but the basic logic is the same. Each medical RCT is also privy to its own comparison group, while individual teachers share a common one (consisting of the entire district’s average 4th grade results).

From a methodological perspective, however, both medical and teacher-evaluation trials are designed to generate causal conclusions: namely, that the intervention was statistically superior to the comparison group, statistically inferior, or just the same. But a degree in statistics shouldn’t be required to recognize that an individual medical experiment is designed to produce a more defensible causal conclusion than the collected assortment of 500 teacher-evaluation experiments.

How? Let us count the ways:

• Random assignment is considered the gold standard in medical research because it helps to ensure that the participants in different experimental groups are initially equivalent and therefore have the same propensity to change relative to a specified variable. In controlled clinical trials, the process involves a rigidly prescribed computerized procedure whereby every participant is afforded an equal chance of receiving any given treatment. Public school students cannot be randomly assigned to teachers between schools for logistical reasons and are seldom if ever truly randomly assigned within schools because of (a) individual parent requests for a given teacher; (b) professional judgments regarding which teachers might benefit certain types of students; (c) grouping of classrooms by ability level; and (d) other, often unknown, possibly idiosyncratic reasons. Suffice it to say that no medical trial would ever be published in any reputable journal (or reputable newspaper) which assigned its patients in the haphazard manner in which students are assigned to teachers at the beginning of a school year.

• Medical experiments are designed to purposefully minimize the occurrence of extraneous events that might potentially influence changes on the outcome variable. (In drug trials, for example, it is customary to ensure that only the experimental drug is received by the intervention group, only the placebo is received by the comparison group, and no auxiliary treatments are received by either.) However, no comparable procedural control is attempted in a value-added teacher-evaluation experiment (either for the current year or for prior student performance) so any student assigned to any teacher can receive auxiliary tutoring, be helped at home, team-taught, or subjected to any number of naturally occurring positive or disruptive learning experiences.

• When medical trials are reported in the scientific literature, their statistical analysis involves only the patients assigned to an intervention and its comparison group (which could quite conceivably constitute a comparison between two groups of 30 individuals). This means that statistical significance is computed to facilitate a single causal conclusion based upon a total of 60 observations. The statistical analyses reported for a teacher evaluation, on the other hand, would be reported in terms of all 500 combined experiments, which in this example would constitute a total of 15,000 observations (or 30 students times 500 teachers). The 500 causal conclusions published in the newspaper (or on a school district website), on the other hand, are based upon separate contrasts of 500 “treatment groups” (each composed of changes in outcomes for a single teacher’s 30 students) versus essentially the same “comparison group.”

• Explicit guidelines exist for the reporting of medical experiments, such as the (a) specification of how many observations were lost between the beginning and the end of the experiment (which is seldom done in value-added experiments, but would entail reporting student transfers, dropouts, missing test data, scoring errors, improperly marked test sheets, clerical errors resulting in incorrect class lists, and so forth for each teacher); and (b) whether statistical significance was obtained—which is impractical for each teacher in a value-added experiment since the reporting of so many individual results would violate multiple statistical principles.
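The multiple-comparisons concern in the last point can be sketched with a small simulation. This is a hypothetical illustration and not part of the essay's own analysis: it assumes 500 teachers who are all exactly equally effective, 30 students each, student gains drawn from a standard normal distribution, and a conventional two-sided 5% significance cutoff. The function name `count_false_flags` and its parameters are invented for the sketch.

```python
import random


def count_false_flags(n_teachers=500, class_size=30, z_cutoff=1.96, seed=0):
    """Count teachers flagged as 'significantly' different from the district
    average when, by construction, every teacher is identical."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(n_teachers):
        # Every class is drawn from the same distribution: mean 0, sd 1.
        gains = [rng.gauss(0, 1) for _ in range(class_size)]
        class_mean = sum(gains) / class_size
        # z-score of the class mean against the known district mean of 0.
        z = class_mean * class_size ** 0.5
        if abs(z) > z_cutoff:
            flagged += 1
    return flagged


flagged = count_false_flags()
expected_false_flags = 500 * 0.05        # about 25 teachers by chance alone
p_at_least_one = 1 - 0.95 ** 500         # chance that at least one is flagged

print(f"teachers flagged with no real differences: {flagged}")
print(f"expected by chance: {expected_false_flags:.0f}")
print(f"probability of at least one false flag: {p_at_least_one:.6f}")
```

Under these assumptions roughly two dozen teachers would be labeled statistically better or worse than their peers purely by chance, which is why reporting 500 individual results against one comparison group needs some correction for multiple testing.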

Of course, a value-added economist or statistician would claim that these problems can be mitigated through sophisticated analyses that control for extraneous variables such as (a) poverty; (b) school resources; (c) class size; (d) supplemental assistance provided to some students by remedial and special educators (not to mention parents); and (e) a plethora of other confounding factors.

Such assurances do not change the fact, however, that a value-added analysis constitutes a series of personal, high-stakes experiments conducted under extremely uncontrolled conditions and reported quite cavalierly.

Hopefully, most experimentally oriented professionals would consequently argue that experiments such as these (the results of which could potentially result in loss of individual livelihoods) should meet certain methodological standards and be reported with a scientifically acceptable degree of transparency.

And some groups (perhaps even teachers or their representatives) might suggest that the individual objects of these experiments have an absolute right to demand a full accounting of the extent to which these standards were met by insisting that students at least be randomly assigned to teachers within schools. Or that detailed data on extraneous events clearly related to student achievement (such as extra instruction received from all sources other than the classroom teacher, individual mitigating circumstances like student illnesses or disruptive family events, and the number of student test scores available for each teacher) be collected for each student, entered into all resulting value-added analyses, and reported in a transparent manner.

Vol. 32, Issue 17, Pages 22-23, 25