Statistical ranking of defense lawyers? Maybe, but not this way.
It’s an intriguing notion: that one can objectively assess the relative effectiveness of a given lawyer. With hard data, and sound analysis. In the real world, it’s nigh impossible to tell how good a lawyer really is. You can look on Avvo and see what people here and there may have subjectively thought about him, but that doesn’t tell you whether any other lawyer would have done as well (or been just as dissatisfying). You can ask around and get a sense of what other lawyers generally think of him, but that’s just as subjective. There’s really nothing out there to tell you for sure whether that lawyer gets better-than-average results or not.
So Wake Forest professors Ronald Wright and Ralph Peeples — to their great credit — tried to see if it could be done. In their recent paper, “Criminal Defense Lawyer Moneyball: A Demonstration Project,” they conclude that it can be done. They may even be right about that. But not from the data they gathered, sadly.
[Warning: The internet's gonna try to do it anyway. Granted, the authors are concerned more with providing a useful tool for managers of institutional legal-aid providers than with helping potential clients assess a lawyer's abilities. But you can bet that Avvo or some outfit like it would love to develop an algorithm that ranks lawyers in a way that potential clients are willing to pay for. It may or may not be all that objectively useful, but consider yourself warned.]
Wright and Peeples don’t claim that their methodology is the be-all and end-all of statistical analysis; it’s just a test to see if it’s doable here. So it would be just as foolish to draw any conclusions from their results (beyond the fact that they were able to get results, which was the point), as it would be to jump all over them for leaving out important considerations. Still, with that caveat, let’s do both!
The methodology was very simple. First, they needed a base number they could use to compare people. They chose the difference between the most commonly imposed guideline sentence for a given arraignment offense and the actual sentence achieved in a particular case (looking at statewide North Carolina data). So someone who could negotiate eighty days off a case would be considered more effective than someone who only got the prosecutor to come down thirty days. The guideline took the defendant’s criminal history into account, so that variable was already factored in. And because pleas vary from one criminal offense to another (you might get a lot of months off in a plea to a nonviolent offense, while a violent offense at the same level might not have as much wiggle room), they figured out the mean and standard deviation of that difference for each offense, so that each result could be judged against others of its kind.
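For the curious, here is a rough sketch of that arithmetic in Python. It is not the authors' code; the case records, names and numbers are all made up for illustration, and using the population standard deviation (rather than the sample one) is an assumption on my part. Each case's discount is simply compared to the mean and standard deviation for its own offense.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: (lawyer, offense, guideline_days, actual_days).
# None of these numbers come from the paper.
cases = [
    ("A", "larceny", 120, 40),   # 80 days off
    ("B", "larceny", 120, 90),   # 30 days off
    ("C", "larceny", 120, 65),
    ("A", "assault", 300, 280),
    ("B", "assault", 300, 290),
    ("C", "assault", 300, 270),
]

def standardized_discounts(cases):
    """Return (lawyer, offense, z) for each case, where z is how many
    standard deviations the discount sits from the mean for that offense."""
    by_offense = defaultdict(list)
    for _, offense, guideline, actual in cases:
        by_offense[offense].append(guideline - actual)

    # Mean and (population) standard deviation of the discount, per offense.
    stats = {o: (mean(d), pstdev(d)) for o, d in by_offense.items()}

    scored = []
    for lawyer, offense, guideline, actual in cases:
        mu, sigma = stats[offense]
        # Guard against an offense where every case got the same discount.
        z = (guideline - actual - mu) / sigma if sigma else 0.0
        scored.append((lawyer, offense, z))
    return scored

for lawyer, offense, z in standardized_discounts(cases):
    print(f"{lawyer} {offense:8s} {z:+.2f}")
```

The point of dividing by the per-offense standard deviation is that twenty extra days off an assault plea and fifty extra days off a larceny plea can end up counting about the same, if that is how much those offenses typically vary.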
With those basic numbers, it was easy to see how much an individual lawyer varied from the statewide average in getting pleas for specific charges. They were also able to collect a lot of other data about the lawyers and the cases, which enabled them to run a regression analysis to see which variables actually had any effect on those numbers.
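A toy version of those two steps might look like the following. To be clear, this is a sketch under heavy assumptions: the scores and years of practice are invented, and the one-variable least-squares fit is a stand-in for the authors' regression, which used many variables at once.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical standardized scores, one per case: (lawyer, z).
scores = [("A", 1.2), ("A", 0.4), ("B", -1.2), ("B", -0.6), ("C", 0.0), ("C", 0.2)]

# Hypothetical years in practice for each lawyer.
years = {"A": 3, "B": 12, "C": 7}

def lawyer_averages(scores):
    """Average each lawyer's standardized scores across all of their cases."""
    per_lawyer = defaultdict(list)
    for lawyer, z in scores:
        per_lawyer[lawyer].append(z)
    return {lawyer: mean(zs) for lawyer, zs in per_lawyer.items()}

def fit_line(xs, ys):
    """Ordinary least squares for a single explanatory variable.
    Returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

averages = lawyer_averages(scores)
ranking = sorted(averages, key=averages.get, reverse=True)  # best average first

slope, intercept = fit_line(
    [years[lawyer] for lawyer in ranking],
    [averages[lawyer] for lawyer in ranking],
)
print(ranking, round(slope, 3))
```

A straight line is the crudest possible way to ask whether a variable matters. It would miss any pattern that rises and falls over a career, which is one reason to group lawyers into experience bands instead.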
The biggest variable had nothing to do with the lawyer, the defendant or the offense: it was the local prosecutor’s office. Some offices are hardasses, some aren’t. Hardly a surprise there, but also not exactly something within the lawyer’s control. So not a useful variable.
The next biggest variable was the defendant’s criminal history — something that was supposed to have been controlled for by using the guideline as the reference point. Apparently, repeat offenders are more likely to get more time off their sentence. The authors speculate that it might be because everyone in the system thinks the guidelines are too strict for predicates, but it could also be a flaw in the methodology. Whatever the reason, it’s got nothing to do with the lawyer, and so it’s still not a useful variable.
The only statistically significant variable that had anything to do with the defense lawyer himself was the amount of time he’d been practicing. Shiny new lawyers in their first 4 years performed slightly above average. They peaked during years 5 to 9. And then it was mostly downhill from there: a second peak of effectiveness during years 15 to 19, then a drop after 20 and a plummet after 25.
That’s a counter-intuitive observation if ever there was one. And the authors suggest reasons for it, such as becoming used to the way things are done and so less willing to fight for more. Other reasons might be that older, more experienced attorneys were more likely to handle cases with lower point spreads. Or that they got more results that weren’t quantified here, such as acquittals.
Might as well mention some problems with the methodology at this point. At least those that were apparent from the paper (apologies to the authors for any misunderstanding here).
A big one of course was, as always, the sample. But they did their best with what was publicly available, and frankly they could have done a lot worse.
More critically, they focused on a single metric that is probably not as good an indicator of relative merits as they suspect. The data itself shows that pretty much nothing within the lawyers’ control had any significant effect on the plea results. As a way of distinguishing two lawyers in an organization (or online) who graduated from law school at roughly the same time, the plea differential is kinda useless.
This could end right there. The study was to determine whether statistical analysis could be used to meaningfully compare attorneys. The authors say yes, but the data really says no. Certainly, the data demonstrates that there is variation among the outcomes that attorneys get, but the causes appear to have nothing to do with the relative merits of the attorneys. That does not mean that all attorneys are equally good; it simply means that the measurement they chose isn’t useful for comparing them.
And that itself doesn’t mean that statistics can’t be used to measure those differences. It only means that this statistic can’t.
But there are other problems that ought to be addressed by anyone trying to come up with a similar study (or algorithm).
A huge one was that the study ignored wins. Getting an acquittal at trial, getting the evidence suppressed and the case dismissed, persuading the prosecutor not to file charges in the first place. There are lawyers who plead out every case, but many don’t. Some are really good at making a case go away before it ever happens, but a study like this cannot measure that. Some are really good at winning trials, but ditto.
Maybe “ignored” is too strong a word. Because how do you count what didn’t happen? Still, it is an important consideration for any such approach.
Another problem was in the weighting of cases. The authors did acknowledge that different kinds of offenses get treated differently, and finding the mean plea deal and the standard deviation for each offense was certainly the way to go, but they shouldn’t have stopped there. No two cases are alike. Two may involve the exact same offense, but one is still… worse. The lawyer on the worse case isn’t likely to get as great a deal as the one on the less-serious one. That doesn’t make him a worse lawyer.
Maybe that’s why the young tyros were getting better deals than their elders — they handled the easier cases. It may well be that the more experienced lawyers (in addition to getting more acquittals, dismissals and non-prosecutions) were getting better deals than the younger ones could have gotten in those cases. Or not. We can’t tell from this data, is the point.
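To see how case mix alone could produce that pattern, here is a deliberately contrived example. Every number is invented, and the "easy" and "hard" labels are hypothetical (the study had no such field). The veteran gets the better deal in both kinds of case, yet looks worse overall because most of his cases are the hard ones.

```python
from statistics import mean

# Made-up discounts (days below guideline) for one offense, split by a
# hypothetical case-difficulty label that the real data did not record.
discounts = {
    "rookie":  {"easy": [60, 60, 60], "hard": [20]},
    "veteran": {"easy": [70],         "hard": [30, 30, 30]},
}

def overall_mean(by_tier):
    """Average discount across all of a lawyer's cases, ignoring difficulty."""
    return mean(d for tier in by_tier.values() for d in tier)

def tier_means(by_tier):
    """Average discount within each difficulty tier."""
    return {tier: mean(ds) for tier, ds in by_tier.items()}

for lawyer, by_tier in discounts.items():
    print(lawyer, overall_mean(by_tier), tier_means(by_tier))
```

Here the rookie averages 50 days off and the veteran 40, even though the veteran beats him by ten days on easy and hard cases alike. Whether anything like this is going on in the real data is exactly what we can't tell.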
Anyway, this post is already longer than it was supposed to be. Statistical ranking of lawyers may well be doable, and it may well be useful. But this study doesn’t really support that conclusion. And those who would try to imitate it for publication or for profit need to understand why.