Saturday, July 4, 2015

American Enterprise Institute Scholars Question Methodology and Conclusions of Wells Report

The NFL-sponsored Wells Report accused New England Patriots quarterback Tom Brady, Jim McNally (the Officials Locker Room attendant for the Patriots) and John Jastremski (an equipment assistant for the Patriots) of participating in a plan to intentionally deflate NFL footballs in contravention of league rules. In Putting the Wells Report in Proper Perspective, I described the evidence used to punish the New England Patriots and Tom Brady as "flimsy even regarding McNally and Jastremski" and "almost nonexistent regarding Brady."

A friend of mine who has math and law degrees directed my attention to Deflating "Deflategate," which describes the American Enterprise Institute's thorough debunking of the methodology and conclusions of the Wells report. Authors Kevin Hassett and Stan Veuger conclude:

Our study, written with our colleague Joseph Sullivan, examines the evidence and methodology of the Wells report and concludes that it is deeply flawed. (We have no financial stake in the outcome of Deflategate.)

The Wells report’s main finding is that the Patriots balls declined in pressure more than the Colts balls did in the first half of their game, and that the decline is highly statistically significant. For the sake of argument, let’s grant this finding for now. Even still, it alone does not prove misconduct. There are, after all, two possibilities. The first is that the Patriots balls declined too much. The second--overlooked by the Wells report--is that the Colts balls declined too little.
 
The latter possibility appears to be more likely.

The entire AEI report can be found here. My friend points to this passage as particularly relevant:

The problem here is that ideally, measurements would have been taken simultaneously for all balls, outdoors, at the end of the half, and with the same gauge that was used before the game. Instead, the balls were taken inside and measured there, but not measured simultaneously. The pressure was checked twice for the Patriots balls (once with each gauge), after which the Patriots balls were reinflated and the Colts ball pressure was measured. Only 4 of the Colts balls (instead of all 12) were measured because halftime ended and the officials ran out of time. The fact that the officials ran out of time is highly material: it implies that the Colts balls were inside a warm room for almost the entire halftime before they were measured and thus had a chance to warm up.

Using his mathematical training to evaluate AEI's scientific analysis and his legal training to identify the flaws in the Wells report's logic, my friend reached the following conclusions, which he permitted me to quote as long as I did not reveal his name (the quoted remarks have been edited slightly for length and clarity, but without changing the analysis and conclusions):

As I view things, there are several (potential) problems to consider here: 

i) From a legal standpoint is there adequate "foundation" to admit the original air pressure results into evidence?; in particular: 

ii) Were the original notations/recordings of the individual ball pressures, both before the game and at halftime, accurate? (Note: I think all the commentary I've seen on this point assumes the values are, each and all, a correct "recording" of the various values, but "in court," this would almost certainly be "tested.")

iii) The AEI report, quoting/referring to the Wells report, says there were two pressure tester devices (gauges) used, one with an NFL logo on it, one without. The NFL on-site official (Anderson) remembers using the "Logo" gauge on the balls before the game. The AEI report says that the Wells report concluded, notwithstanding Anderson's (stated, according to AEI) memory, that Anderson used the non-Logo gauge to run the tests before the game. The AEI report (correctly) notes the significance of which gauge was used to test the air pressure before the game, because the "Logo" gauge "reports" pressure (apparently consistently, if I'm understanding the AEI report) 0.4 psi higher per test than the "non-Logo" gauge does. For example, if a particular football had been tested using the "Logo" gauge as having 12.0 psi, then presumptively, had the same ball been tested instead using the "non-Logo" gauge, the pressure would have been reported at 11.6 psi.
 

iv) Were all the NFL personnel involved in testing the footballs at the game equally skilled in each relevant aspect of using the gauges? [For instance--and here I'm confessing a specific lack of knowledge--if the gauges "report" psi either by displacing a "ruler" (like a car tire gauge) or by causing a thin pointer to spin on a clock-style display ... (I'm not being argumentative here... I have not seen a photo or video of the device(s) used. Here's a link to images turned up via a Yahoo search using the search terms "NFL football pressure gauge":
https://images.search.yahoo.com/search/images;_ylt=A0LEV0vt6H5V5j0AcXJXNyoA;_ylu=X3oDMTE0dGo2bWE1BGNvbG8DYmYxBHBvcwMxBHZ0aWQDQTAxMDRfMQRzZWMDcGl2cw--?p=nfl+football+pressure+gauge&fr=yfp-t-252&fr2=piv-web )

 .... then whether a particular "tester" accurately reads the gauge's reported result is potentially an issue. Not so much (or not at all) if the psi is "reported" via...for instance...an LCD device.]

v) Not trivially, it seems all the Patriots footballs tested at halftime were tested twice. An article by a team from Purdue at Columbus (an initial engineering report [yahoo search terms: "NFL football pressure purdue columbus" will yield a link to "Deflate Gate Examined"]) noted that "The football pump needle made a very poor seal when inflating the football. A noticeable amount of air was felt coming around the side of the needle when inflating the football. The amount of air leaving the football when removing the pump needle from the ball also released a small amount of air each time. The team estimated that with each removal of the valve 0.1 psi was released as we closely observed the behavior of the interaction between the needle and the football. Thus, it could be plausible that if a ball is inspected several times in succession without being inflated could be subjected to a significant loss in pressure and should be noted for further testing."
 

In other words, when the Patriots’ footballs were each tested twice at halftime, it is possible that the second test on most or all the balls reflected a lower value than the first test. Depending on how the “halftime/Patriots” psi results were reported (i.e., did the refs take the average of the two tests, and report this?) this has the potential to affect the overall likelihood that a reviewer might conclude that the Patriots footballs were improperly deflated.

vi) Something I frequently hear engineers say when trying to get a point across: "Do the math." That is relevant here because the AEI report (though rather obliquely and politely) seems to attack the accuracy of the simple math/statistics calculations contained in the Wells report.
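The arithmetic behind points iii) and v) can be made concrete with a minimal sketch. It assumes the two figures quoted above hold exactly: a constant 0.4 psi offset between the gauges, and the Purdue team's rough estimate of 0.1 psi lost per needle removal. The function names are hypothetical and appear in neither report.

```python
# Illustrative constants taken from the figures quoted above; the function
# names are hypothetical and exist only for this sketch.
LOGO_GAUGE_OFFSET_PSI = 0.4   # Logo gauge reads 0.4 psi higher (per AEI, as quoted)
LOSS_PER_INSERTION_PSI = 0.1  # Purdue team's rough estimate per needle removal


def logo_to_non_logo(logo_reading_psi):
    """Convert a Logo-gauge reading to the equivalent non-Logo reading."""
    return logo_reading_psi - LOGO_GAUGE_OFFSET_PSI


def successive_readings(first_reading_psi, n_tests):
    """Readings from n back-to-back tests, assuming each needle removal
    leaks a fixed amount of air before the next test."""
    return [first_reading_psi - LOSS_PER_INSERTION_PSI * i for i in range(n_tests)]


if __name__ == "__main__":
    # The 12.0 psi example from point iii).
    print(round(logo_to_non_logo(12.0), 2))
    # A ball tested twice in a row, as the Patriots' footballs were at halftime.
    first, second = successive_readings(11.5, 2)
    print(round(first, 2), round(second, 2), round((first + second) / 2, 2))
```

Under these assumptions, the second reading on a twice-tested ball comes out 0.1 psi below the first, so any average of the two understates the first reading by 0.05 psi. That is small next to the 0.4 psi gauge offset, but both effects push in the same direction.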

OK, all that said, what does this all mean?

In front of a competent judge, the NFL could expect to have significant difficulty in getting the halftime psi results admitted into evidence.

Even if the results were admitted into evidence, I think it unlikely that they would be accorded much evidentiary weight.

In particular, I think a judicial fact-finding might look something like this (what follows is intentionally compressed):

Acting as finder of fact, the Court noted Defendant Brady’s objection to admitting into evidence the reported values of the psi testing of the Patriots’ and Colts’ footballs as measured before the game, and at halftime. The objection was a lack of foundation. The Court admitted the evidence, under the proviso that the Court would later evaluate them “for what they are worth.”

As it turns out, the Court finds they are not worth much.

In particular, the evidence shows that Referee Anderson tested all the footballs before the game using a gauge that reported values--each time, each test--0.4 psi higher than the other gauge used, in part, to measure the footballs during halftime. The Court finds the evidence does not demonstrate precisely which gauge or gauges were used to test each football during halftime. However, it is clear that both gauges were used. Also, it is clear that each “Patriots” football was tested for pressure twice during halftime, while only four “Colts” footballs were tested for pressure at halftime. Clearly, the Patriots’ footballs were tested first, and there was insufficient time to finish testing all twelve Colts’ footballs.

A significant difficulty in assessing the evidentiary weight of the test results arises from the lack of clarity as to which “gauge” tested each football during halftime. If all of the Patriots’ footballs were tested with the same gauge that Referee Anderson used to test them before the game, then the comparative results between the pre-game and halftime tests are sensible in a “math-physics-engineering” sense. However, if some or all of the Patriots’ footballs were tested at halftime using the second gauge, then the halftime tests would generate results that would unfairly and incorrectly support a conclusion that the Patriots’ footballs had been improperly deflated after the pregame testing, but before the game started.

Moreover, while the evidence and simple high school chemistry teach that application of the Ideal Gas Law (PV = nRT) mandates a conclusion that both teams’ footballs would have been found to have “lost pressure” between the pregame testing and the testing at halftime, another problem for assessing the weight and meaning of this evidence arises because of the specific circumstances of the halftime testing. All the game balls were taken into the warm “testing” room at halftime for the required testing. Testing on the Patriots’ footballs began immediately. Testing on the Colts’ footballs began after all the Patriots’ footballs had been tested. The evidence establishes that all the footballs that “waited” to be tested underwent some increase in pressure due to being in a significantly warmer environment than that existing outside, during the game. However, this “re-inflation” was not uniform.

The end result is that the halftime testing was improperly skewed to support a conclusion that the Patriots’ footballs had been improperly deflated after the pregame testing. The Court finds that the total impact of this improper skewing is contained within a range of 0.6 to 1.0 psi as to each Patriots football. Adjusting the reported results to remove this improper skewing means the Patriots’ footballs did not demonstrate evidence of having been improperly deflated when they were tested at halftime.

To conclude this discussion of the pressure testing and the conclusions that might be drawn from it, the Court notes the criticism the AEI report (and Defendant Brady) level against the accuracy of the statistical calculations contained in the Wells report (and asserted in these proceedings by Plaintiff NFL). While the Court finds the AEI report’s conclusions in this regard to be well founded and accurate, the Court does not rely upon this point to reach its final decision. In other words, the Court assumes, arguendo, that the Wells report’s statistical calculations are accurate.
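As a rough illustration of the Ideal Gas Law point in the hypothetical finding above, the sketch below estimates how much a football's gauge reading would fall from temperature alone. It assumes constant volume and no leakage, so absolute pressure scales with absolute temperature. The 12.5 psi starting pressure, the 71°F room and the 48°F field are illustrative assumptions, not figures taken from this post, and the function names are hypothetical.

```python
ATMOSPHERIC_PSI = 14.7  # standard sea-level atmospheric pressure


def fahrenheit_to_rankine(temp_f):
    """Convert Fahrenheit to the absolute Rankine scale."""
    return temp_f + 459.67


def gauge_pressure_after_temp_change(gauge_psi, start_temp_f, end_temp_f):
    """Gauge pressure after a temperature change at constant volume.

    A gauge reads pressure above atmospheric, so convert to absolute
    pressure, scale by the ratio of absolute temperatures, then convert back.
    """
    start_abs_psi = gauge_psi + ATMOSPHERIC_PSI
    ratio = fahrenheit_to_rankine(end_temp_f) / fahrenheit_to_rankine(start_temp_f)
    return start_abs_psi * ratio - ATMOSPHERIC_PSI


if __name__ == "__main__":
    # Assumed values for illustration only (not from the post).
    on_field = gauge_pressure_after_temp_change(12.5, 71.0, 48.0)
    print(f"Estimated on-field gauge pressure: {on_field:.2f} psi")
```

With these assumed inputs the estimated drop is a little over 1 psi, and it reverses as a ball warms back up indoors. That is why the order and timing of the halftime measurements matter so much to the comparison between the two teams' footballs.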


I have no advanced mathematical training and I am a second-year law student, not a lawyer, but my friend's take on the AEI report's numbers and how that evidence would be used in a hypothetical court setting comports quite well with my understanding of math and law. If NFL Commissioner Roger Goodell does not rescind the four-game suspension that he levied against Brady, it seems that a competent judge would rule in Brady's favor if Brady decided to sue the NFL for suspending him based on evidence that lacks proper foundation.

The mathematical and legal issues in this case are interesting but there are some deeper questions here as well pertaining to the NFL as a whole and to media coverage of the NFL:


1) Who "tipped" the NFL with (apparently false) information that the Patriots were deflating footballs and why did this person do so?

2) A false allegation of cheating is a punishable offense under FIDE (International Chess Federation) rules. Since there is no credible evidence that the Patriots cheated, will the "tipper" be punished in some fashion by the NFL?

3) Who holds media members like Bob Kravitz, Mike Wilbon and others accountable for their baseless speculations and for their overwrought calls for punishing the Patriots in general and Bill Belichick in particular? Even the Wells Report, which appears to be incomplete and flawed at best, exonerates Belichick from any wrongdoing. Freedom of the press is a cherished and essential American right but accountability of the press should exist in some fashion as well. If media organizations, editors and writers/commentators will not hold themselves accountable then the public should be aware that anything written or stated by media organizations is highly suspect, particularly if that information is delivered by someone who has a track record of being more interested in self-promotion (or the promotion of some other personal agenda) than in seeking out the truth.
