The Cost of Strict Discovery: A Comparison of Manhattan and Brooklyn Criminal Cases

Dan Svirsky


Practitioners agree that criminal discovery rules have important effects both on how cases develop and on how they get resolved. However, there has been little empirical work done to measure the nature and breadth of these effects. This paper seeks to fill that gap by presenting a statistical analysis of the criminal discovery rules and case outcomes in two similarly situated jurisdictions. First, this paper analyzes theoretical predictions about the effects of criminal discovery rules and initially agrees with the views of many practitioners: increased disclosure should lead to shorter cases. Then, to analyze this hypothesis, the paper presents a new dataset of 200 criminal cases that tracks the charged crime, attorney experience and background, attorney pay structure, bail status, and other variables that could impact case outcomes. The dataset measures and compares these variables across two jurisdictions that use different criminal discovery rules but are otherwise quite similar. Finally, the paper presents the results of this analysis, concluding that strict disclosure rules impose significant costs on legal systems. Specifically, the statistical analysis suggests that the Manhattan court system could save resources by mandating more liberal discovery practices. This policy shift would have the important added benefit of increasing fairness for criminal defendants.

I. Introduction

II. The Criminal Discovery Process: Goals and Constitutional Background

A. The Purpose of Discovery: Balancing Fairness to Defendants and Effective Prosecutions

B. Constitutional Discovery Rules

III. How Discovery Works

A. Theory: Shavell’s Model of Discovery (1989)

B. Adaptation of Shavell’s Model to the Criminal Context

1. Brady’s Constitutional Guarantees of Minimum Disclosure

2. Other Relevant Differences Between the Civil and Criminal Contexts

3. The New Theoretical Model of Criminal Discovery

C. Other Models and Predictions About Discovery

D. How Discovery Works in Practice: New York City

IV. The Study

A. Data

B. Methodology

C. Results

D. Rethinking the Model: Taking Defendants’ Strategic Responses into Account

E. Other Critiques of the Model and Data

1. Omitted Variables Bias

2. Shortcomings in the Variables

V. Conclusion


If a criminal defendant is prosecuted in Manhattan, the prosecutor will engage in criminal discovery practice that is among the most restrictive in the country, sharing relatively little information with the defendant. If the defendant had crossed the Brooklyn Bridge before getting arrested, the prosecutor would have engaged in perhaps the most liberal criminal discovery practice nationwide. This paper exploits these contrasting practices in neighboring jurisdictions to test how different criminal discovery rules affect case outcomes. The paper has two goals: to craft a theoretical narrative about how criminal discovery works by adapting past work in the civil context and to test the empirical predictions this theoretical approach yields.

The analysis suggests that Manhattan is wasting significant resources by employing strict discovery practice; defendants in Manhattan seem to be working around strict discovery rules by using imperfect, inefficient substitutes for discovery, such as suppression hearings. Most significantly, the data revealed that during 2005 and 2006, Manhattan felony cases included suppression hearings roughly 50% of the time, while Brooklyn felony cases resulted in suppression hearings only 5% of the time. The model and data suggest that Manhattan’s far higher rate of suppression hearings wastes the legal system’s resources, because defendants are using an unnecessary, expensive court mechanism as a substitute for discovery.

Several scholars have used models to predict how discovery will affect outcomes in civil litigation,[1] and empirical studies have used statistical approaches to assess how civil discovery works in practice.[2] However, there has been little research, either theoretical or empirical, on how discovery affects case outcomes in the criminal context. The research in the civil context is not directly applicable to the criminal context because criminal defendants enjoy constitutional due process guarantees that are absent in civil cases, so there is no guarantee that the predictions that civil-based models yield will apply to criminal procedure policy. From an empirical standpoint, the effects of different discovery processes on criminal case outcomes remain largely uninvestigated. This paper seeks to fill this academic gap by comparing the strict and liberal discovery practices employed in Manhattan and Brooklyn.

New York State’s criminal discovery statute imposes lax requirements on prosecutors: prosecutors need only share basic information about a case.[3] However, nothing in the statute prevents local prosecutors’ offices from sharing more information.[4] While the Manhattan office has adopted a policy requiring prosecutors to hew closely to the state statute, Brooklyn’s policy requires prosecutors to divulge much more information during discovery.[5] In other words, depending on which side of the Brooklyn Bridge an individual is arrested on, she will get either severely restricted or near-total access to the prosecutor’s information about the case.[6] Because the two boroughs share so many similar characteristics, this variation provides an effective way to empirically test theoretical models of the effects of different discovery rules.[7]

Section II of this paper introduces background literature on criminal discovery and discusses the statutory and constitutional frameworks that govern how criminal discovery operates. Section III develops a theoretical model of criminal discovery that yields predictions about how discovery rules will affect costs and outcomes in cases. Section IV presents empirical findings to test these predictions. The empirical tests show that the model fails because it overlooks the fact that defendants can use imperfect discovery substitutes to get around strict discovery rules in Manhattan. If we account for this strategic behavior, the model’s predictions become accurate. Section V discusses these findings and the shortcomings of the empirical approach. Finally, Section V concludes that Manhattan is wasting significant resources on unnecessary suppression motions and hearings.

II. The Criminal Discovery Process: Goals and Constitutional Background

Discovery statutes play a dual role in criminal cases. First, they seek to ensure balance and fairness in an adversarial system: discovery statutes counteract prosecutors’ investigatory advantages without giving defendants the ability to win cases using perjury or intimidation.[8] Second, they promote efficiency: discovery rules grease the wheels of criminal cases by standardizing how information is shared and encouraging mutually agreeable settlements where possible.[9] Several authors who have written about criminal procedure have focused on the first goal of ensuring fair outcomes and fair practices rather than efficiency.[10] While valuable, such scholarship leaves open the question of how we can design criminal discovery rules that promote both fairness and efficiency within the criminal justice system.

  A. The Purpose of Discovery: Balancing Fairness to Defendants and Effective Prosecutions

Discovery rules in criminal law formalize how the government and defendants share information. In the weeks after a prosecutor brings charges against a defendant, discovery rules direct the government to share evidence the prosecution has gathered, such as scientific tests performed by the state or past statements made by the defendant.[11] In turn, the rules might mandate that a defendant who intends to bring an alibi defense must share relevant information regarding the time and place of the alleged alibi.[12] Early access to this information helps both sides prepare for trial or negotiate a plea agreement.[13]

Discovery promotes efficiency, both by forcing each side to reveal its relative strength early on and by formalizing the mechanism by which information is shared. If only one side has a strong case, sharing this information can encourage quick settlements.[14] For instance, a defendant might be more willing to accept a plea bargain if she is aware that the prosecutor has particularly strong evidence against her. Discovery also promotes efficiency by creating a standardized process for information sharing. For example, if discovery rules mandate that all information be shared two weeks before trial, then attorneys need not spend time negotiating whether to share information one week or three weeks before trial. In this way, discovery helps parties avoid ad hoc arrangements and ensures that attorneys need not waste valuable time and resources negotiating minor procedural issues.[15] Discovery rules benefit all parties in a case while minimizing the burden on court systems.[16]

In addition, discovery rules help parties achieve accurate outcomes by avoiding “trial by surprise.” For instance, absent discovery rules, a party with a weak case can attain a better outcome by withholding information early on and then revealing the information at the last possible moment to put her opponent at a disadvantage.[17] As Judge Traynor stated, “[t]he truth is most likely to emerge when each side seeks to take the other by reason rather than by surprise.”[18]

Discovery rules can promote fairness to defendants by allowing defendants to assess the strength of the case against them before making life-altering decisions.[19] The decision to accept a plea bargain or go to trial is momentous. An individual has to engage in quantitative gymnastics, balancing her tolerance for risk with the potential loss of liberty. Fairness considerations suggest that a defendant should not have to make this decision without reliable information about the likelihood of losing her case, and discovery allows defendants to make informed decisions.[20] Similarly, discovery promotes fairness because it counteracts the government’s financial and investigative advantages. The government often has significant advantages over a defendant; it often has more resources to test evidence, a dedicated branch of investigators, and the opportunity to investigate crimes first.[21] Given these advantages, discovery levels the playing field because it helps defendants start and direct an investigation that might be difficult without some knowledge of the government’s case.[22]

While discovery rules promote fairness for defendants, some scholars and practitioners warn that liberal discovery rules may undermine the effectiveness of prosecutions.[23] A primary concern is that discovery may expose government witnesses to threats and coercion.[24] If prosecutors have to share the identity of witnesses early on, this might invite defendants to intimidate the witnesses to persuade them not to participate.[25] In addition, scholars and judges have worried that liberal discovery will encourage either more perjury or more effective perjury.[26] A defendant who is willing to commit perjury can craft a more effective lie if she knows the details of the prosecutor’s case ahead of time. Thus, increased discovery can promote fairness, but these benefits must be balanced against the potential for witness coercion and more informed acts of perjury.

Discovery rules balance these competing interests by regulating how much information prosecutors must share with the defense. The most liberal policy is open-file discovery, under which a prosecutor automatically shares all the information she collects on a case with the defendant.[27] Meanwhile, in a stricter discovery regime, the defendant might only gain access to basic information about the government witnesses, and the prosecutor might not have to share impeachment evidence for these witnesses until shortly before trial.[28]

  B. Constitutional Discovery Rules

A model of criminal discovery must take into account the interplay between discovery statutes and the Constitutional guarantees that govern discovery practices. Constitutional principles place mandatory bounds on discovery laws. In Brady v. Maryland, the Supreme Court held that under the Due Process Clause of the Fourteenth Amendment, the government must share all information that is material and exculpatory with the defense.[29] Brady involved a criminal case in which the prosecutor suppressed a codefendant’s confession despite the defendant’s request to access this evidence.[30] The Court ruled, “the suppression by the prosecution of evidence favorable to an accused upon request violates due process where the evidence is material either to guilt or to punishment, irrespective of the good faith . . . of the prosecution.”[31] Subsequent cases have clarified that evidence is “material” if “its suppression undermines confidence in the outcome of the trial.”[32] Prosecutors must also disclose evidence that could be used to impeach government witnesses.[33]

However, these Constitutional rules leave gaps, and criminal discovery statutes regulate within these gaps. For instance, consider information about a key government witness’s testimony. The Constitution does not compel the government to share this information if it does not have the potential to exculpate the defendant.[34] Further, a prosecutor might not voluntarily share this information if she feels that the benefits of doing so are outweighed by the drawbacks. Because Constitutional requirements mandate sharing exculpatory evidence, state discovery statutes generally regulate how much inculpatory evidence a prosecutor must share even when she would prefer to keep it secret.[35]

Given the apparent tension among the goals of discovery statutes, the key remaining question is whether lawmakers can design discovery policies that are more efficient without sacrificing the ends of fairness and witness safety. Such an assessment requires both theoretical and empirical approaches. A comprehensive theoretical model of how criminal discovery works will yield hypotheses about which types of reforms can save money without sacrificing fairness. The theory should be coupled with empirical data to test the predictions that the theory yields. Once we verify or reject these predictions, we can move closer to finding more efficient criminal discovery policies.


III. How Discovery Works

This section presents a model of the effects of discovery statutes on case outcomes and case lengths. The model draws heavily on similar work in the civil litigation context by Steven Shavell.[36] The model presented in section III.A adapts Shavell’s work by translating it into the criminal context, and it predicts that more liberal discovery would lead to fewer trials and shorter cases. This section describes the Shavell model as it was developed in the civil context, the model’s advantages, and the predictions it yields.

  A. Theory: Shavell’s Model of Discovery (1989)

Shavell’s model of civil litigation is simple and compelling, and it yields reasonable predictions about the effects of discovery. The model strips down the civil litigation process to its essential parts. From a plaintiff’s perspective, the only considerations that motivate the decision to bring a lawsuit are the potential damages and the likelihood of winning.[37] In deciding how to respond to a complaint, defendants should also be primarily motivated by the likelihood of losing and the cost of losing.[38] While other factors may also play into the parties’ decision-making, such as a sense of justice, a desire to prove that a defendant was in the wrong, or concerns about the expense and delay of litigation, Shavell’s model can account for these factors by considering them as part of the expected payoff or cost.

Shavell’s model determines the damages a plaintiff can expect in a civil trial as a function of the probability she will prevail, x, and the amount of damages she will receive if she prevails, y. Let e1, e2, e3 . . . en represent the pieces of evidence in the case. Each e equals either 1, if the piece of evidence is inculpatory, or 0, if it is exculpatory. There is no limit on the amount of evidence, n. We can compute x by finding the average of all of the pieces of evidence (e1 + e2 + e3 + . . . + en, divided by n). For example, if a given case involves five pieces of evidence, four inculpatory and one exculpatory, and the plaintiff will receive $100 in damages if she prevails, the expected damages award is calculated by multiplying x (1+1+1+1+0, divided by 5, which is 0.8) by y ($100),[39] which yields $80.[40]
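The calculation in this paragraph can be restated compactly. The notation below only rewrites the definitions given above; the label E[D] for the expected damages award is introduced here for convenience and is an assumption, not Shavell’s own notation.

```latex
\[
x = \frac{1}{n}\sum_{i=1}^{n} e_i, \qquad e_i \in \{0, 1\}, \qquad \mathrm{E}[D] = x \cdot y
\]
% Worked example from the text: five pieces of evidence, four inculpatory, y = $100.
\[
x = \frac{1 + 1 + 1 + 1 + 0}{5} = 0.8, \qquad \mathrm{E}[D] = 0.8 \times \$100 = \$80
\]
```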

To understand how discovery policies affect this model, begin by assuming that a plaintiff can credibly share every piece of evidence, e, in a case. Assume that if both parties are made aware of the value of each piece of evidence, they will agree on the probability that the plaintiff will prevail because they will conduct the same calculation. Assume, too, that the defendant seeks to minimize the expected level of damages, while the plaintiff seeks to maximize the expected damages and minimize the costs of pursuing the litigation.[41] So long as both parties will accept mutually advantageous settlements, the model predicts that cases will always end in settlement.[42]

The following example helps illustrate this prediction. If a defendant and plaintiff see all the evidence and agree on the likely outcome of a case, then both sides can agree that if the case goes to trial, the plaintiff will receive, for example, $60 on average. In this case, the plaintiff can save her resources by settling the case, because she avoids trial preparation. As a result, the plaintiff would be happy offering a settlement of just under $60. The defendant, meanwhile, knows that if he goes to trial, he will have to pay $60 on average. Any settlement below $60 is a better outcome than he would likely get by going to trial.
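The example above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Shavell’s formal model: the function names and the trial-cost parameters are hypothetical, and the $10 plaintiff cost is an invented figure chosen only to show how avoided trial preparation opens a range of acceptable settlements just below the expected award.

```python
def settlement_range(x, damages, plaintiff_trial_cost, defendant_trial_cost=0.0):
    """Return (lowest offer the plaintiff accepts, highest offer the defendant accepts).

    Both parties are assumed to see all the evidence and therefore agree on x.
    """
    expected_award = x * damages
    # The plaintiff nets the expected award minus her own (hypothetical) trial costs.
    plaintiff_floor = expected_award - plaintiff_trial_cost
    # The defendant expects to pay the award plus any trial costs of his own.
    defendant_ceiling = expected_award + defendant_trial_cost
    return plaintiff_floor, defendant_ceiling


def settles(x, damages, plaintiff_trial_cost, defendant_trial_cost=0.0):
    """A settlement exists whenever the plaintiff's floor does not exceed the ceiling."""
    floor, ceiling = settlement_range(
        x, damages, plaintiff_trial_cost, defendant_trial_cost
    )
    return floor <= ceiling


# The text's example: both sides agree the plaintiff would win $60 on average.
floor, ceiling = settlement_range(x=0.6, damages=100, plaintiff_trial_cost=10)
print(floor, ceiling)
print(settles(x=0.6, damages=100, plaintiff_trial_cost=10))
```

Under full information and non-negative trial costs, the floor can never exceed the ceiling in this sketch, which is the sense in which the model predicts that every case settles.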

According to this logic, every case should end in settlement because a mutually advantageous settlement will always be available to the parties. Obviously, many cases do go to trial, so, at first glance, the model seems problematic. However, the model only predicts universal settlement in a hypothetical world of perfect information sharing. So long as the parties cannot access and evaluate every piece of evidence, e, that defines the probability of the plaintiff winning, x, at least some cases will go to trial. For example, Shavell argues that trials occur because some plaintiffs might be unable to put all their cards on the table.[43] Imagine that evidence will not become available until the trial commences because, for example, a medical examination cannot be done quickly enough.[44] Or imagine that a plaintiff has inculpatory information, but she believes that sharing it will allow a defendant to intimidate the plaintiff’s witnesses. In these situations, the case might have a high expected damages reward but the plaintiff will not be able to share all the pieces of evidence to prove this to the defendant. As such, the defendant will have to make an uninformed guess about the true value of x in cases where the plaintiff is silent.

Shavell’s model assumes that when faced with a silent plaintiff, the defendant’s guess about the expected probability that the plaintiff will win the trial, x, will be the average strength of all other silent plaintiffs’ cases.[45] Put another way, when a defendant has to guess how strong the case against him is, he bases this guess on how often plaintiffs win cases in general. Thus, imagine a universe where there are twenty silent plaintiffs. A defendant will find the average x for these twenty cases and will refuse to accept a settlement above this value. If the twenty silent plaintiff cases have an average x of .75 (yielding an expected damages award of $75), then in any case where the plaintiff is silent, the expected damages are $75. If a defendant believes the expected outcome is $75, then he will be unwilling to accept an offer of $80.

The existence of silent plaintiffs who cannot share certain evidence for legitimate reasons creates a problem. Once these legitimate silent plaintiffs exist, other plaintiffs with weak cases will find it advantageous to pass themselves off as legitimate silent plaintiffs. These illegitimate silent plaintiffs will act as though their case includes strong inculpatory information that is inaccessible, thus acting like a legitimate silent plaintiff. A defendant facing an illegitimate silent plaintiff will then be forced to evaluate the case using only the silent plaintiff average, as if it were a case involving a legitimate silent plaintiff with a strong case who cannot share evidence, because the defendant has no way to distinguish between the two types of silent plaintiffs. Thus, imagine that the pool of all silent plaintiffs—both legitimate and illegitimate—has an average x of .75 for their cases. A plaintiff with a case where x is .3 will be better off hiding evidence. By hiding pieces of evidence and claiming that she has no access to them yet, the plaintiff can pass herself off as a legitimate silent plaintiff. Because the defendant has no way to verify the plaintiff’s case, the defendant must assume that this silent plaintiff is essentially like all the legitimate silent plaintiffs, who on average have cases with x equal to .75. By keeping evidence secret, the plaintiff can thus offer a settlement of up to $75 and a defendant will accept it even though her actual case strength would have yielded $30 on average.
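The incentive described in this paragraph can be made concrete with a short Python sketch. The function names are hypothetical, the $100 award follows the running example, and the comparison deliberately ignores trial costs and the possibility that a plaintiff is genuinely unable to reveal her evidence.

```python
from statistics import mean

DAMAGES = 100  # award if the plaintiff prevails, as in the running example


def silent_pool_value(silent_pool_strengths, damages=DAMAGES):
    """Expected award a defendant assigns to any silent plaintiff: pool average x times y."""
    return mean(silent_pool_strengths) * damages


def best_disclosure_choice(own_x, silent_pool_strengths, damages=DAMAGES):
    """Compare what a plaintiff can obtain by revealing her evidence versus staying silent."""
    reveal_value = own_x * damages
    silent_value = silent_pool_value(silent_pool_strengths, damages)
    choice = "stay silent" if silent_value > reveal_value else "reveal"
    return choice, reveal_value, silent_value


# Hypothetical pool of twenty silent plaintiffs whose cases average x = 0.75.
pool = [0.75] * 20

# A weak plaintiff (x = 0.3) compares a settlement near $30 with one near $75.
print(best_disclosure_choice(0.3, pool))
# A strong plaintiff (x = 0.9) does better by showing her evidence.
print(best_disclosure_choice(0.9, pool))
```

The sketch holds the pool average fixed; in the model, each weak plaintiff who joins the pool also pulls that average down.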

As more illegitimate silent plaintiffs enter the pool of silent plaintiffs, the average x for this pool drops, leading to more trials. To see why, note that a legitimate silent plaintiff with a case where x equals .74 would have initially reached a settlement: she would offer $75 and the defendant would accept. Now, as illegitimate silent plaintiffs enter the pool, the average x for all silent plaintiffs might drop to .5. In this case, the legitimate silent plaintiff with an x of .74 will not be able to convince the defendant to accept a settlement higher than $50. Furthermore, although the legitimate silent plaintiff knows the strength of her evidence, she is unable to access all of it, so she is unable to present the real strength of her case credibly. Depending on the costs, this plaintiff would be better off simply going to trial, where she expects to get a $74 payout by the time trial occurs, at which point she will have access to all her evidence.
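A minimal Python sketch of this dilution effect follows. The pool compositions and the $10 trial cost are hypothetical values chosen to reproduce the paragraph’s averages of .75 and .5; the function names are likewise illustrative.

```python
from statistics import mean


def pool_average(legitimate, illegitimate):
    """Average case strength x across every silent plaintiff, whatever her reason for silence."""
    return mean(list(legitimate) + list(illegitimate))


def prefers_trial(own_x, pool_avg, damages, trial_cost):
    """A silent plaintiff goes to trial when her net trial payoff beats the best settlement."""
    trial_payoff = own_x * damages - trial_cost
    best_settlement = pool_avg * damages
    return trial_payoff > best_settlement


# Hypothetical numbers: ten legitimate silent plaintiffs averaging 0.75,
# then ten weak imitators at 0.25, which pulls the pool average down to 0.5.
legitimate = [0.74, 0.76] * 5
imitators = [0.25] * 10

before = pool_average(legitimate, [])
after = pool_average(legitimate, imitators)
print(before, after)

# The legitimate plaintiff with x = 0.74 settles before dilution and litigates after it.
print(prefers_trial(0.74, before, damages=100, trial_cost=10))
print(prefers_trial(0.74, after, damages=100, trial_cost=10))
```

Whether the plaintiff litigates depends on the assumed trial cost: at a cost above $24 she would still settle for $50, which is why the text conditions the result on costs.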

Shavell’s model predicts that more open discovery practices will cause more settlements in civil litigation because they force illegitimate plaintiffs with weak cases to put their cards on the table. “[W]ith discovery, the group of silent plaintiffs no longer includes any plaintiffs with unfavorable information who can comply with a discovery request. Thus the group of silent plaintiffs is one that, on the average, would obtain more at trial.”[46] In other words, if legitimate silent plaintiffs with strong cases are unable to reveal their information, discovery will not make a difference for them because they still cannot reveal the information. Rather, discovery will only affect illegitimate silent plaintiffs, who have weak cases and want to hide their information strategically. With increased discovery, defendants will be willing to accept higher settlement amounts even when plaintiffs do not present all evidence during negotiations because the defendants know the pool of silent plaintiffs is stronger on average and is not merely concealing plaintiffs with weak cases.[47] As a result, Shavell predicts that more settlements will be reached, the legal system will save significant expense by avoiding costly trials, and defendants and plaintiffs will reach mutually beneficial settlements more often.[48]
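The filtering effect of discovery can be illustrated with the following Python sketch. The `Plaintiff` class, its field names, and the example pool are assumptions made for illustration; the sketch also assumes that, absent discovery, every plaintiff in the list stays silent.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Plaintiff:
    x: float            # case strength: probability of prevailing at trial
    can_disclose: bool  # True if she could comply with a discovery request


def silent_pool_average(plaintiffs, discovery_available):
    """Average x among plaintiffs who remain silent under the given regime."""
    if discovery_available:
        # Anyone who can comply with a discovery request must show her evidence,
        # so only plaintiffs who cannot disclose remain silent.
        silent = [p for p in plaintiffs if not p.can_disclose]
    else:
        # Simplifying assumption: without discovery, everyone in this list stays silent.
        silent = list(plaintiffs)
    # Raises statistics.StatisticsError if no silent plaintiffs remain.
    return mean(p.x for p in silent)


# Hypothetical pool: ten strong plaintiffs who cannot yet disclose,
# and ten weak plaintiffs who are hiding evidence they could share.
plaintiffs = [Plaintiff(0.75, False)] * 10 + [Plaintiff(0.25, True)] * 10

without_discovery = silent_pool_average(plaintiffs, discovery_available=False)
with_discovery = silent_pool_average(plaintiffs, discovery_available=True)
print(without_discovery, with_discovery)
```

In this example the defendant’s estimate for a silent plaintiff rises once discovery removes the weak imitators, so he becomes willing to accept a higher settlement from the plaintiffs who remain silent.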

  B. Adaptation of Shavell’s Model to the Criminal Context

Shavell’s model can be adapted to yield predictions about discovery rules in the criminal context because civil and criminal cases have roughly similar structures. Though they have some important differences—which are discussed below—the overall architecture is similar. An accusing party files a complaint against a defendant. Both parties have the opportunity to negotiate a settlement, or plea bargain, in which the accusing party settles for less than she desires in exchange for something of value, such as decreased uncertainty or expense. If a settlement is not reached, then a neutral third party will dispose of the case either through dismissal or trial. At this level of abstraction, the actors and relevant stages in the process are identical in the civil and criminal contexts. Both contexts include a prosecuting party, a defense attorney, and a neutral judge. Both contexts include a pre-trial period where settlements can be reached and a trial where a fact-finder will settle the dispute. Thus, the model presented in the previous section can be applied to criminal cases. Nonetheless, it is important to take into account the differences between civil and criminal procedure, as these differences could change the predictions that the model yields.

  1. Brady’s Constitutional Guarantees of Minimum Disclosure

As noted in section II.B, one significant difference between the civil and criminal contexts is that constitutional requirements mandate a certain minimum amount of discovery in criminal cases that does not apply in civil cases. Under Brady, the prosecutor must share material exculpatory information.[49] The Brady requirement serves the role that discovery does in Shavell’s model. Just as discovery in Shavell’s model forces illegitimate silent plaintiffs with weak cases to share exculpatory information when possible, the Brady requirement forces prosecutors to share exculpatory evidence that, if suppressed, would make the prosecutor’s case seem stronger. Now that prosecutors cannot hide exculpatory information, “the group of silent plaintiffs” (or, in criminal cases, prosecutors) “no longer includes any [prosecutors] with unfavorable information who can comply with a discovery request.”[50] Because the Brady requirement takes the place of civil discovery in Shavell’s model, criminal discovery statutes that impose different requirements than the Brady rule might have very different effects than civil discovery statutes. Liberal discovery might therefore work differently in the criminal context, where constitutional principles change the effects that discovery statutes have on cases.

Since Brady does the heavy lifting in the criminal context, the primary effect of liberal criminal discovery should be to force prosecutors with strong cases to share inculpatory information they would rather keep to themselves, either for witness protection or to prevent the defendant from preparing for trial. To illustrate this assertion, imagine two pools of silent prosecutors. Neither pool can hide any exculpatory evidence under Brady, so both are withholding inculpatory evidence. The prosecutors in Pool 1 are withholding inculpatory evidence because they cannot credibly reveal the information or do not have access to the information. The prosecutors in Pool 2 have access to the inculpatory evidence, but are withholding the evidence for strategic reasons. Liberal discovery will lead to more plea bargains, because it forces the prosecutors in Pool 2 to reveal the information they can access, which will prevent any cases where a defendant goes to trial because she makes a mistaken prediction about the strength of the case. Thus, liberalizing criminal discovery will result in more and better-informed plea bargains.[51]

  2. Other Relevant Differences Between the Civil and Criminal Contexts

Among other key differences between the civil and criminal contexts, the most conspicuous is what is at stake within each system. A civil defendant faces a pecuniary loss if forced to pay damages, whereas a criminal defendant faces jail time. Thus, with rare exceptions like commitment or immigration proceedings, the defendant risks a much harsher outcome in criminal proceedings.[52] Though this difference has great substantive importance, for the purposes of this model, it can be set aside or easily incorporated. All that matters for the model is that the defendant faces an outcome she prefers to avoid. How much she wishes to avoid the punishment is irrelevant for the model. It does not matter whether a punishment is $100 or $1,000 or a year in prison. In all three cases, the defendant faces an outcome she wishes to avoid or minimize as much as possible. In this sense, the civil model can be applied to the criminal context.

Other significant differences include the fact that criminal cases use a much more complex procedure to assess punishments and the fact that criminal defendants have greater access to jury fact-finding.[53] These substantive differences may affect the expected outcomes of a case, but they do not change the predictions of the model. Imagine that a plaintiff in a civil case has an 80% chance of victory, using the “preponderance of the evidence” standard.[54] In a criminal case, which uses the much stricter “reasonable doubt” standard, a prosecutor with the same strength of evidence might have only a 40% chance of success.[55] While this difference is incredibly important from the parties’ perspectives, it does not affect the model’s predictions about decision-making so long as both parties agree on the chance of the claim’s success.

Another key distinction between the civil and criminal context involves the difference between ordinary plaintiffs and prosecutors. This difference implicates one of the basic assumptions in Shavell’s model. Shavell assumes that plaintiffs seek to maximize the expected payoff, but prosecutors, who also have the duty of ensuring justice, do not (in theory) seek to maximize punishment.[56] In Shavell’s model, a plaintiff who comes to believe that she is in the wrong would still like to win a payoff.[57] In the criminal context, a prosecutor who comes to believe that a defendant is innocent is ethically bound not to try the case.[58] Nonetheless, the fact that a prosecutor does not always want to maximize prison sentences should only make a difference in exceptional cases. If we assume that in the vast majority of criminal cases, the prosecutor believes that the defendant is guilty of the crime charged, then we may also assume that the prosecutor wants to win the case. Thus, absent a case where a prosecutor comes to believe that a defendant is innocent, the key assumption in Shavell, that plaintiffs want to achieve the opposite goal of civil defendants, continues to hold true in the criminal context.

  3. The New Theoretical Model of Criminal Discovery

The civil model of discovery yields two predictions that should also hold true for the criminal context. First, liberal discovery will increase the number of plea bargains because it increases the inculpatory information that a prosecutor must share.[59] As a result, defendants will no longer have to guess the strength of a prosecutor’s case and fewer defendants will reject a plea bargain because they make a mistaken guess about their likelihood of winning at trial. Second, case lengths will be generally shorter. Although a case that leads to settlement could, in theory, last as long as or longer than a case that ends in trial, it seems unlikely that this will be true on average. Trials take much more preparation: juries must be selected, evidence must be gathered, and cases must be presented. Further, a trial will only occur after the breakdown of settlement discussions. While cases that end in plea bargains could conceivably end very quickly, even before discovery occurs, cases that end in trial will only occur after discovery and the resolution of any collateral issues, like the admissibility of evidence and the defendant’s ability to stand trial. As such, if liberal discovery leads to more settlements, then cases will be shorter on average.

  1. Other Models and Predictions About Discovery

The predictions based on the criminal discovery model developed above match predictions based on other models of civil discovery as well as predictions made by New York City practitioners. Cooter and Rubinfeld present a model of civil litigation that predicts trials will occur when opposing parties disagree about the strength of a case.[60] In this model, there are no silent prosecutors.[61] Rather, parties go to trial because a defendant believes she has a high chance of going free while the prosecutor believes the government’s case is extremely strong.[62] Even when faced with the same evidence, parties might be too optimistic about their chance of prevailing, in which case they will not reach a mutually agreeable settlement and will go to trial.[63] Under this model, discovery will have an ambiguous effect in civil trials.[64] If it forces the sharing of information that lowers a party’s expectations of victory, then discovery will result in more settlements, because it counteracts optimism.[65] If discovery promotes the sharing of information that improves a party’s outlook, then settlements become less likely.[66] In the criminal context, however, where Brady already mandates the sharing of all material exculpatory evidence, additional criminal discovery will only have the first effect. By mandating the sharing of inculpatory information, discovery will lower a defendant’s expectations, leading to more settlements.

When New York City legal organizations call for more liberal discovery, they use arguments nearly identical to what the model predicts. The Legal Aid Society, the largest public defender organization in New York, argues that liberal discovery leads to more pleas and quicker case resolution.[67] Defense attorneys and prosecutors both in New York City and in other jurisdictions that have experimented with different discovery methods echo this view.[68] Thus, the abstract economic models of discovery match the intuition of many practitioners who understand how discovery works in practice.

  1. How Discovery Works in Practice: New York City

The New York State criminal discovery statute is among the strictest in the country;[69] it limits initial discovery to a few kinds of evidence.[70] Other jurisdictions require prosecutors to share nearly everything in their files, such as police reports, witness lists, and statements from witnesses.[71] Some even allow for pre-trial depositions of witnesses.[72] New York, by contrast, requires prosecutors to share a more limited scope of evidence.[73]

Although New York’s strict discovery statute sets a baseline for minimum discovery, individual offices develop their own discovery standards and practices.[74] Thus, the Manhattan and Brooklyn District Attorney’s Offices have widely divergent discovery practices. While the Manhattan District Attorney (“DA”) follows the state discovery statute closely, its counterpart in Brooklyn has for years followed a policy more akin to open-file discovery.[75] In Manhattan, prosecutors only offer discovery upon a request from the defendant, and discovery consists of a “Voluntary Disclosure Form,” which is usually a few pages in length.[76] Conversely, a Brooklyn prosecutor will typically hand over her entire file to the defense attorney and make new evidence available on an ongoing basis.[77] One Brooklyn judge remarked that the open-file discovery policy implemented in the late 1980s was adopted in response to budget constraints, further confirming that the model’s predictions jibe with practitioners’ views that more liberal discovery rules are cost efficient.[78]

The different practices in Brooklyn and Manhattan offer a unique opportunity to test the predictions from theoretical models of the effects of discovery. Because the two boroughs operate under the same state law, are geographically proximate, and share the same primary public defender organization,[79] many confounding variables disappear. This paper exploits these similarities to test the model presented above. If the model holds true, Brooklyn, which has more liberal discovery, should have fewer trials and shorter case lengths than its stricter counterpart, Manhattan.


The Study

This section describes the dataset as well as the results of the multiple regression analysis used to test the model presented above. The regression tests whether liberal discovery is associated with fewer trials and shorter case lengths.

  1. Data

All data were compiled at the county clerk’s offices in Brooklyn and Manhattan, where public criminal case files are kept. The clerks’ offices in question only kept files for felony cases, so this analysis is limited to felonies. A pseudorandom number generator was employed to pull over 200 case files from the years 2005 and 2006.[80] Two hundred and three usable observations were collected from these case files.[81] Each observation includes data on the two dependent variables (case length and outcome) as well as any other variables that could have an impact on these dependent variables. Case length is measured from the day of arraignment until the day of a disposition, whether by plea, dismissal, or trial. Case outcome is a categorical variable: cases can end in a plea bargain, a guilty verdict, a not-guilty verdict, or a pre-trial dismissal.

The case files allowed for the collection of several independent variables as well. First, the severity and type of crime were recorded for each case, since crimes classified as more serious might take longer to get resolved. The crime-severity variable tracks New York State’s classification of felonies. Under New York law, felonies are classified on a range between class E and class A.[82] Only homicide crimes receive the class A status, which is the most serious.[83] The crime-severity variable in this model is an index variable where 1 corresponds to the lowest class, 2 to the second-lowest class, and so forth. The crime-type variable tracks the nature of the crime charged, such as whether it is an assault, a robbery, or a murder. The dataset also includes information on bail status, since a defendant who is in jail might seek a quicker resolution of her case. This is a binary variable, with 1 representing a client awaiting a disposition in jail and 0 representing a client out on bail. In addition, the data includes information on the number of evidentiary hearings and motions, which could prolong cases. Case files show whether a motion for an evidentiary hearing was filed and whether any hearing was held.[84] These are both binary variables.
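The coding scheme described above can be sketched as a small record type. This is only an illustration: the field names, the intermediate felony-class letters, and the days-to-months conversion are assumptions introduced here, not details taken from the dataset itself.

```python
from dataclasses import dataclass
from datetime import date

# Assumed mapping: the text states only that 1 is the lowest felony class and
# that classes run from E up to A; the intermediate letters are filled in here.
SEVERITY_INDEX = {"E": 1, "D": 2, "C": 3, "B": 4, "A": 5}

# Assumption: average month length used to express case length in months.
DAYS_PER_MONTH = 30.44


@dataclass
class CaseRecord:
    """One hypothetical case file, using illustrative field names."""
    borough: str          # "Manhattan" or "Brooklyn"
    felony_class: str     # "E" through "A"
    arraigned: date
    disposed: date
    in_jail: bool         # True if the defendant awaited disposition in jail
    motion_filed: bool    # motion for an evidentiary hearing filed
    hearing_held: bool    # evidentiary hearing actually held
    paid_hourly: bool     # True for 18-B attorneys, False for salaried defenders

    def to_row(self) -> dict:
        """Convert the record into the numeric variables used in a regression."""
        days = (self.disposed - self.arraigned).days
        return {
            "strict_discovery": 1 if self.borough == "Manhattan" else 0,
            "crime_severity": SEVERITY_INDEX[self.felony_class],
            "in_jail": int(self.in_jail),
            "motion_filed": int(self.motion_filed),
            "hearing_held": int(self.hearing_held),
            "paid_hourly": int(self.paid_hourly),
            "case_length_months": round(days / DAYS_PER_MONTH, 2),
        }


# A made-up example case, not drawn from the study's data.
example = CaseRecord("Manhattan", "C", date(2005, 3, 1), date(2005, 9, 1),
                     in_jail=True, motion_filed=True, hearing_held=False,
                     paid_hourly=False)
row = example.to_row()
```

Coding every characteristic as a number in this way is what allows each one to enter a regression as a separate control.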

Attorney characteristics are also potentially significant factors. In An Analysis of the Performance of Federal Indigent Defense Counsel, Radha Iyengar asserts that a defense attorney’s law school background and experience impact the development of cases.[85] In keeping with this work, the dataset includes information on defense attorney characteristics, such as law school attended, years of experience,[86] and compensation structure. For compensation structure, New York public defenders are split into two categories. Some are part of a public defender organization, such as the Legal Aid Society. These public defenders are paid an annual salary.[87] Others are known as 18-B attorneys (named because their compensation is defined by Article 18-B of New York County Law) and are paid hourly wages.[88] Literature and common sense suggest that the way an attorney is paid will have an impact on how much she decides to work,[89] so using this data, the statistical models can take these concerns into account. The compensation structure variable is a binary variable that simply measures whether an attorney is paid by the hour or annually.

There are shortcomings in this dataset stemming from the inability to measure certain variables. Omitted variables might distort the results. For instance, Brooklyn might have faster judges than Manhattan, but the case files would not reflect it. In this case, the results would suggest quicker case lengths in Brooklyn for reasons having nothing to do with the model. Further, the variables gathered are imperfect. For instance, the distinction between Manhattan and Brooklyn discovery policies is premised on the assumption that all Manhattan cases adhere to stricter discovery practices. However, it is possible that in some cases the prosecutor, in her discretion, shares all available information with the defense. If this situation is common, then even if the model is correct, the statistical analysis will suggest that Brooklyn’s discovery regime has no impact. Many variables exist that are impossible to measure by looking through case files to construct the dataset. These shortcomings are addressed in the discussion of the results.

  1. Methodology

In order to test the hypothesis that more liberal discovery rules are associated with fewer trials and shorter cases, multiple regression tests and t-tests were run to determine whether there is a significant difference between Brooklyn and Manhattan cases for two variables: 1) number of trials; 2) length of time between arraignment and final disposition.[90] In addition to testing the key independent variable—the borough in which the case occurs—the regression also takes into account other variables that might have an effect on the case length or number of trials, as discussed above.

Multiple regression techniques are useful in this case because they allow us to isolate the effect of one independent variable while keeping the others constant.[91] The ability to isolate the effects of different variables is vital when studying a complex situation where multiple factors can affect an outcome such as the length of a case.[92] For example, consider two hypothetical criminal cases. The first case involves a petty theft charge with a plethora of evidentiary issues that present novel legal questions in Fourth Amendment jurisprudence. In this case, the prosecution engages in open-file discovery, and the case lasts fourteen months. Now consider a different petty theft case in which the admissibility of evidence is not at issue. In this second case, the prosecutor engages in strict discovery, and the case lasts twelve months. It would be fallacious to compare the case lengths to argue that strict discovery caused the second case to end two months sooner. Rather, it is likely that another factor—the existence of evidentiary issues—also had an impact on the case length. Multiple regression analysis helps to counter this issue. In effect, multiple regression analysis allows scholars to group cases that are similar along certain defined dimensions in order to isolate the impact of one factor that varies among these cases.[93]
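The intuition in the two-case example can be sketched with a handful of made-up cases. The sketch below uses simple stratification (comparing cases within groups that share the same evidentiary status) rather than an actual regression, and every number in it is hypothetical: case length is constructed as 10 months, plus 4 months if there are evidentiary issues, plus 2 months under strict discovery.

```python
from statistics import mean

# Hypothetical cases: (strict_discovery, evidentiary_issue, length_months).
# Open-file cases here happen to involve evidentiary issues more often.
cases = (
    [(0, 1, 14.0)] * 8 + [(0, 0, 10.0)] * 2    # open-file discovery
    + [(1, 1, 16.0)] * 2 + [(1, 0, 12.0)] * 8  # strict discovery
)


def mean_length(rows, strict, issue=None):
    """Average case length for one discovery regime, optionally within one stratum."""
    return mean(length for s, i, length in rows
                if s == strict and (issue is None or i == issue))


# Naive comparison: ignores evidentiary issues entirely.
naive_gap = mean_length(cases, 1) - mean_length(cases, 0)

# Adjusted comparison: compare like with like, then average across strata.
within_gaps = [mean_length(cases, 1, issue) - mean_length(cases, 0, issue)
               for issue in (0, 1)]
adjusted_gap = mean(within_gaps)

print(f"naive gap (strict - open-file):    {naive_gap:+.1f} months")
print(f"adjusted gap (strict - open-file): {adjusted_gap:+.1f} months")
```

In this constructed example the naive comparison makes strict discovery look slightly faster, while the within-group comparison recovers the two-month delay that was built into the data. Multiple regression performs the same kind of adjustment across several variables at once.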

In addition, multiple regression tests allow for the assessment of statistical significance. Not all differences in outcome are meaningful: the data could show that having a strict discovery policy is associated with cases lasting two days longer on average, but it is unclear whether such a small finding is random variation one would expect from statistical sampling or a meaningful relationship. Tests of statistical significance distinguish between variations so mild that they can be explained by randomness and variations that are more extreme and thus unlikely to be caused by randomness.

  1. Results

The table below presents the findings for all cases in the dataset:

Table 1: Effect of Strict Discovery on Case Length (months)

Model Variable                          β       B       SE B    t
Strict Discovery Policy (Manhattan)     -.009   -.09    .64     -.15
Defense attorney paid hourly            .22     2.35    .67     3.50***
Crime Severity                          .29     .53     .10     5.18***
Motion for suppression hearing filed    .19     2.26    .83     2.71***

Note: β = standardized regression coefficients, B = unstandardized regression coefficients, SE B = the standard error of the regression coefficient (heteroskedasticity-robust), t = t-statistic. R-squared = .23.

*** p < .01   ** p < .05   * p < .10

As Table 1 illustrates, the model failed to produce accurate predictions. Cases in Manhattan are not longer than cases in Brooklyn. The model predicts that liberal discovery will result in fewer trials in Brooklyn than in Manhattan, but this prediction is not borne out with any statistical significance. Note, however, that trials were extremely rare across both boroughs. In the dataset as a whole, Manhattan had only four trials and Brooklyn had one. It is possible that liberal discovery yields fewer trials, but there simply were not enough trials in this dataset to test this prediction.

The remaining independent variables are all statistically and practically significant in the expected directions. Cases where a suppression motion was filed lasted 2.26 months longer on average. As previous literature predicts, cases where defense attorneys were paid by the hour tend to last longer. These cases take 2.35 more months to resolve compared to cases where the defense attorney is paid an annual salary.[94] Unsurprisingly, the severity of the crime was associated with longer case lengths.[95]
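As a rough check on how the significance markers in Table 1 arise, the t-statistic is the unstandardized coefficient divided by its standard error. The sketch below recomputes that ratio from the rounded table entries and converts the reported t-values to two-sided p-values. It uses a normal approximation to the t distribution, which is an assumption made here for simplicity; because the table entries are rounded, the recomputed ratios differ slightly from the reported ones.

```python
import math

# (B, SE B, reported t) copied from Table 1.
table1 = {
    "Strict Discovery Policy (Manhattan)": (-0.09, 0.64, -0.15),
    "Defense attorney paid hourly": (2.35, 0.67, 3.50),
    "Crime Severity": (0.53, 0.10, 5.18),
    "Motion for suppression hearing filed": (2.26, 0.83, 2.71),
}


def two_sided_p(t):
    """Two-sided p-value under a standard normal approximation (an assumption)."""
    return math.erfc(abs(t) / math.sqrt(2))


results = {}
for name, (b, se, t_reported) in table1.items():
    t_recomputed = b / se  # differs slightly from the table because inputs are rounded
    p = two_sided_p(t_reported)
    results[name] = (t_recomputed, p)
    print(f"{name:38s} t = {t_recomputed:6.2f}  p = {p:.4f}")
```

Under this approximation the borough coefficient is nowhere near conventional significance thresholds, while the other three variables fall below the .01 level, consistent with the markers in the table.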

  1. Rethinking the Model: Taking Defendants’ Strategic Responses into Account

The model predicts that stricter discovery will lead to longer case lengths and more trials, but these predictions are not borne out. Though surprising at first, the model’s failure begins to make more sense after considering the underlying legal environment and how parties strategically react to rules. As legal historian Edward Purcell, Jr. noted in describing the response to Justice Brandeis’ Erie Railroad v. Tompkins opinion, “[w]ithout constant reference to changing social dynamics and consequences, students of procedure can scarcely know what they are talking about.”[96] Theoretical work that models legal procedure and its effects will fail if it ignores “the reality that the effects of legal rules are more important to clients’ goals and their lawyers’ strategies than are the purposes of legal rules.”[97]

Many scholars have noted that in the absence of statutory discovery, defendants will use imperfect discovery substitutes, such as hearings to assess probable cause or suppression hearings.[98] These pre-trial hearings allow defendants to interrogate government witnesses, get impeachment information on those witnesses, and learn more about the government’s theory of the case.[99] For example, consider a case where the primary evidence against the defendant is the testimony of a police officer who observed the alleged crime firsthand. In a suppression hearing, the defendant could potentially cross-examine this police officer. Though the purpose of the suppression hearing is to determine whether a Fourth Amendment violation occurred, the defense could nonetheless learn the police officer’s account of the events, assess the officer’s potential credibility in front of a jury, and learn about other witnesses or evidence that would be relevant to the case. In other words, the defense could collect the types of information it would have otherwise received through more liberal discovery.

The Legal Aid Society of New York recently pushed for a liberalization of Manhattan’s discovery practices, arguing that strict discovery causes inefficiency because it forces defense lawyers to employ such time-consuming strategic responses.[100] Specifically, the Legal Aid Society notes that in response to strict discovery, lawyers “turn to other systems to obtain pre-trial information, such as . . . pretrial hearings.”[101] A state-appointed commission points out that such pretrial hearings allow defendants to access evidence that normally would not be available until trial, absent open-file discovery.[102] Judges have expressed worry about the strategic use of pre-trial hearings as discovery substitutes.[103] The data gathered in this study support the concern expressed by practitioners, judges, and academics. The table below compares the number of suppression motions and hearings in each borough:

Table 2: Number of suppression motions and hearings by borough

                                  Brooklyn (n=87)            Manhattan (n=116)
                                  Liberal Discovery Policy   Stricter Discovery Policy
Number of suppression hearings    6                          57
Number of suppression motions     7                          92

*Chi-square test p-value < .01 for both number of motions and hearings

Table 2 shows huge disparities in the number of suppression motions filed and pre-trial hearings held between the two boroughs. Manhattan defense attorneys filed suppression motions in roughly four out of five cases (92 of 116), while Brooklyn attorneys filed such motions in fewer than one out of ten cases (7 of 87). Ostensibly, defense attorneys in Manhattan courts filed these motions because they felt that evidence was illegally obtained, but if the suspicions of practitioners, judges, and academics are correct, suppression hearings also substitute as discovery vehicles. Furthermore, Manhattan judges held suppression hearings in nearly half of all cases (57 of 116), a much higher rate than in Brooklyn, where hearings were held in roughly one out of fifteen cases (6 of 87). Out of the 63 suppression hearings held across both boroughs, the defendant prevailed only once.[104]
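The chi-square comparison reported under Table 2 can be reproduced approximately from the published counts. The sketch below is a plain Pearson chi-square test on a two-by-two table, without a continuity correction, and it assumes that each motion or hearing corresponds to one distinct case; the study's exact test specification is not stated, so this is an illustration rather than a replication.

```python
import math


def chi_square_2x2(yes_a, n_a, yes_b, n_b):
    """Pearson chi-square statistic and p-value (1 degree of freedom, no correction)."""
    observed = [[yes_a, n_a - yes_a], [yes_b, n_b - yes_b]]
    total = n_a + n_b
    col_totals = [yes_a + yes_b, total - (yes_a + yes_b)]
    row_totals = [n_a, n_b]
    stat = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / total
            stat += (observed[r][c] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts from Table 2: Brooklyn (n=87) versus Manhattan (n=116).
chi2_motions, p_motions = chi_square_2x2(7, 87, 92, 116)
chi2_hearings, p_hearings = chi_square_2x2(6, 87, 57, 116)

print(f"motions:  chi2 = {chi2_motions:.1f}, p = {p_motions:.2e}")
print(f"hearings: chi2 = {chi2_hearings:.1f}, p = {p_hearings:.2e}")
```

Both p-values fall far below the .01 threshold noted under the table, so the gap between boroughs is very unlikely to reflect sampling variation alone.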

Given this procedural context, the failure of the original model makes more sense. Strict discovery might cause there to be more trials and slower dispositions, all other things being equal. However, in trying to measure this phenomenon, an econometric model taking evidentiary hearings into account will mask the effect that these discovery rules have. That is, if strict discovery causes more suppression hearings, then controlling for whether a motion to suppress is filed, as done in the analysis above, will make cases in strict discovery jurisdictions seem quicker than they actually are. Indeed, as shown in Table 1, case lengths are extended by more than two months when defense attorneys file motions for suppression.

Taking strategic use of suppression hearings into account, the model more accurately predicts case length. If suppression hearings help defendants get around strict discovery limits, then any increase in case lengths due to strict discovery in Manhattan would be masked, since the econometric model would instead ascribe this delay to the suppression hearing. If we take this into account, then the dataset matches the model’s predictions.
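The masking argument can be illustrated with a small constructed example. All numbers below are hypothetical: every case lasts 6 months, plus 2 months if a suppression motion is filed, and strict discovery affects case length only by making motions more common. The comparison again uses simple stratification in place of a regression.

```python
from statistics import mean

# Hypothetical cases: (strict_discovery, motion_filed, length_months).
cases = (
    [(1, 1, 8.0)] * 8 + [(1, 0, 6.0)] * 2    # strict: motions in 8 of 10 cases
    + [(0, 1, 8.0)] * 1 + [(0, 0, 6.0)] * 9  # open-file: motions in 1 of 10 cases
)


def avg_length(strict, motion=None):
    """Average length for one regime, optionally restricted to one motion status."""
    return mean(length for s, m, length in cases
                if s == strict and (motion is None or m == motion))


# Total effect: strict versus open-file, letting motions vary as they naturally do.
total_gap = avg_length(1) - avg_length(0)

# Effect after holding motion status fixed, as a regression with a motion control does.
gap_holding_motion_fixed = mean(avg_length(1, m) - avg_length(0, m) for m in (0, 1))

print(f"total gap:                {total_gap:.1f} months")
print(f"gap holding motion fixed: {gap_holding_motion_fixed:.1f} months")
```

In this toy dataset strict discovery adds 1.4 months on average, yet the gap disappears once motion status is held fixed, because the entire delay runs through the motions themselves. The borough coefficient in Table 1 could be near zero for the same reason.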

  1. Other Critiques of the Model and Data
  2. Omitted Variables Bias

The nature of the study rules out many potential confounding variables. For instance, prosecutorial caseloads could have a huge impact on how long cases last, but this variable is consistent across Manhattan and Brooklyn.[105] Different demographic characteristics, crime rates, and local cultures could play a role in explaining some of the differences in case lengths. Variations in criminal procedure might also confound results. A jurisdiction with extensive probable cause hearings might rely less on suppression hearings later on, but again, Brooklyn and Manhattan have consistent criminal procedure law.

The low adjusted R-Squared scores (mostly clustering in the .25 range across models) show that the econometric model used here explains little of the variation in case lengths and hearings, suggesting that omitted variables may play a significant role in explaining any differences between boroughs. For example, the judge involved might have an impact, as different judges work at different speeds. Future research could try to control for judge effects. Similarly, the complexity of the case should also have an impact. A case that relies on DNA testing, expert witnesses, and thousands of documents would take longer for both sides to wade through and argue. This model does not take this type of complexity into account. One imperfect way to account for case complexity would be just to measure the size of the case file—thicker files might mean more complex cases and a longer time until a disposition. It is also possible that case length is the result of a lot of randomness, and any model will have a low R-Squared. The worst-case scenario for my model is that one of these omitted variables has an impact on case length and is correlated with the borough where the case occurs. For example, faster judges could be causing a lot of the variation in case length, and if Manhattan judges tend to be faster for some unrelated reason, the shorter case lengths in Manhattan, as compared to Brooklyn, might have nothing to do with discovery rules.

There is no obvious reason why these omitted variables would be correlated with the borough involved. Manhattan might have less complex cases or faster judges, leading us to potentially attribute to the discovery regime what might be caused by something else entirely. However, Brooklyn could just as easily have less complex cases or faster judges. In that case, the effects of open-file discovery would be overstated by this model: Brooklyn cases would move faster because the cases are simpler or the judges are faster, but the statistical approach used here would attribute this speed to the discovery regime. Nonetheless, even though the study’s design helps counter many potentially confounding variables, future analyses should gather more data to try to explain more of the variance in case lengths. Specifically, a more robust study should try to control for judge effects and complexity.

Another critique is that, in spite of geographic proximity, Brooklyn and Manhattan are still too different for the statistical approach in this study to be effective. For example, the 2010 census shows sizable differences in median income and demographics. Manhattan’s median income is $68,706, whereas Brooklyn’s is $43,166.[106] The income disparity could translate to differences in jury pools and the types of crimes committed. Further, because the court systems for each borough are separate, bail might be handled differently in each borough. Again, these differences limit the effectiveness of the statistical approach used in Section IV. The econometric model presented cannot identify the impact of different discovery rules if this impact is masked by differences between the boroughs; in that case, proper identification is impossible. The criticism that differences between Manhattan and Brooklyn undermine the conclusions of this paper undoubtedly has some merit. The empirical approach presented here is an observational one, not an experimental one, so such problems are inescapable. Future research could apply the same observational approach to other pairs of jurisdictions. If such research replicates the findings presented here, then that suggests that omitted variables related to Manhattan and Brooklyn are not driving the results of this paper. Similarly, scholars could replicate the methodology of this study using an event study approach, which looks at a single jurisdiction to measure case outcomes before and after discovery rules were changed.[107] In sum, the observational approach has inherent limits in terms of causal inference. Further study is warranted and caution is necessary when interpreting the findings of this paper.

  1. Shortcomings in the Variables

Furthermore, the independent variables used in this study are subject to critique. The crime charged is an imperfect measure of the seriousness of the crime because a defendant could face significantly more jail time for the same crime depending on her criminal history.[108] Further, the crime variable only considers the top charge. Some defendants had multiple charges, meaning that, again, the crime variable will not capture the true seriousness of the case. Finally, for strategic purposes, a prosecutor might try to overcharge at the outset of a case, even if she knows the case is much less serious. Unless Brooklyn and Manhattan practice different strategies, these imperfections should on average be equal across boroughs, but the model will do a worse job of measuring how the independent variables affect the dependent variable than it would if such strategic behavior were absent.[109]

As mentioned above, the borough variable is also an imperfect measure of whether the prosecutor employed open-file discovery. Discovery motions were filed in every Manhattan case, but a prosecutor might have shared her file very early on. Similarly, a Brooklyn prosecutor might drag her heels or hide some evidence, but the case file would not show whether the prosecutor actually followed her office’s discovery procedure. Thus, in addition to capturing omitted non-discovery differences between the two boroughs, the dummy variable which codes the borough where a case occurred is not a totally accurate measure of whether open-file discovery was used. A more complete model might try to take specific prosecutors’ practices into account.

In spite of these methodological problems, the study provides the most robust, effective look at discovery statutes yet presented in the literature. No study has collected data to assess how differing discovery regimes affect case outcomes. While empirical analysis can never be perfect, this study contains detailed information taken from case files that even allows for the measurement of attorney characteristics. Thus, this paper presents the best empirical assessment of different theories on criminal discovery that exists.



This paper used theoretical and empirical tools to assess the effects of criminal discovery rules. The results have practical implications for researchers and policy-makers. From a methodological standpoint, this work demonstrates the importance of verifying theoretical work with real data. All of the models presented yield inaccurate predictions for Brooklyn and Manhattan. The failure of the predictions to explain the data likely occurred because the models ignored important factors driving case length and outcomes. This paper’s statistical analysis demonstrates the danger in using theoretical work to guide policy before testing the theory against reality with data. Of course, theory is extremely useful in guiding empirical work: it structures the data analysis and helps us know which data to collect in the first place. The results of the statistical analysis in this paper merely suggest that theory and empirical analysis must work together if we are to use economic modeling approaches to guide legal policy.

This paper has important implications for policymakers in Manhattan, and possibly for other jurisdictions. The statistical analysis suggests at least two possible conclusions that can be drawn from the higher frequency of suppression hearings in Manhattan than in Brooklyn. The optimistic conclusion for Manhattan is that these suppression hearings are serving a valuable function in protecting defendants’ constitutional rights. Just as the prevalence of plea bargains could be evidence of an overburdened system that railroads defendants into waiving their rights,[110] it could be that Manhattan’s higher number of suppression hearings is a positive feature of the system: defendants are taking advantage of their constitutional rights. Yet this optimistic portrait of suppression hearings in Manhattan is belied by two findings of this paper. First, defendants lost virtually all of the suppression hearings held, and second, Brooklyn defense attorneys almost never requested such hearings. If suppression hearings are valuable, why do Brooklyn defense lawyers so rarely request them? And if suppression hearings are effective vehicles to safeguard defendants’ Fourth Amendment rights, why did defendants win these hearings only once in sixty-three tries? For the optimistic story to be true, Brooklyn defense attorneys would have to be relatively less competent than their Manhattan counterparts. Likewise, the optimistic story would have to explain why defense attorneys lose 98% of suppression hearings.

The more pessimistic interpretation of the data—and the more likely interpretation—suggests that Manhattan is throwing money away on frivolous suppression hearings and wasting valuable time on unnecessary motions. If suppression motions and hearings are used primarily as imperfect discovery devices, then Manhattan should reform its policies to make discovery easier and let the court system spend time on more important matters. An argument can be made that the quality of Fourth Amendment case law will even improve within the jurisdiction, since it will be easier for courts to identify meritorious suppression motions when the motions are not lost in a crowd of questionable suppression motions.[111]

If the data collected is representative, Manhattan pays for roughly 3,200 suppression hearings per year compared to Brooklyn’s 800.[112] A rough estimate of the cost of these hearings can be constructed using data from the Bureau of Labor Statistics on median hourly wages. Imagine that each hearing consumes three hours of a judge’s time ($68.06 per hour[113]), a defense lawyer’s time ($47.50 per hour[114]), and a prosecutor’s time ($20 per hour[115]). Add in one hour for a court reporter ($46.91[116]). Summing these figures yields a rough and conservative estimate of over $400 per suppression hearing. Note that the estimate does not take any other employees into account or the cost of suppression motions that must be processed but do not necessarily lead to actual hearings. Based on this rough estimate, however, Manhattan is spending nearly $1 million per year on arguably unnecessary hearings.[117] The $1 million figure amounts to between 1% and 2% of the entire budget of the Manhattan District Attorney.[118]
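The per-hearing estimate above is simple arithmetic and can be laid out step by step. The wage figures and hearing counts come from the paragraph itself; treating the "unnecessary" hearings as Manhattan's excess over Brooklyn's annual count is one possible reading adopted here as an assumption, not something the text states outright.

```python
# Hourly wage figures quoted in the text (three hours each per hearing).
HOURS_PER_HEARING = 3
hourly_wages = {"judge": 68.06, "defense lawyer": 47.50, "prosecutor": 20.00}
COURT_REPORTER_ONE_HOUR = 46.91

cost_per_hearing = (HOURS_PER_HEARING * sum(hourly_wages.values())
                    + COURT_REPORTER_ONE_HOUR)

# Annual hearing counts quoted in the text.
manhattan_hearings = 3200
brooklyn_hearings = 800

# Assumption: the "unnecessary" hearings are Manhattan's excess over Brooklyn.
excess_hearings = manhattan_hearings - brooklyn_hearings

# Conservative figure using the rounded-down $400 floor from the text.
conservative_excess_cost = excess_hearings * 400
# Figure using the full per-hearing sum computed above.
full_excess_cost = excess_hearings * cost_per_hearing

print(f"cost per hearing:          ${cost_per_hearing:,.2f}")
print(f"excess cost at $400 floor: ${conservative_excess_cost:,.0f}")
print(f"excess cost at full sum:   ${full_excess_cost:,.0f}")
```

Under these assumptions the excess cost lands in the neighborhood of $1 million per year, in line with the estimate in the text.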

Recently, there were indications that the Manhattan District Attorney’s office would begin to shift towards Brooklyn-style discovery practices.[119] Before he became District Attorney, Cyrus Vance campaigned on a platform that included the liberalization of Manhattan’s discovery policies. Since then, however, he has failed to make good on his promise.[120] In explaining the refusal to shift towards liberal discovery, the D.A.’s office claimed that “there is no empirical evidence that open-file discovery leads to more efficiency.”[121] Whatever Mr. Vance’s reasons for continuing to maintain a strict discovery policy, this paper discredits at least one: empirical evidence now suggests that open-file discovery does lead to more efficiency.

