A document collection used in litigation is not defensible simply because it is large, thorough, or well-organized. It is defensible when the decisions behind it can be explained and justified.
What Makes a Collection Defensible
Defensibility is about documented decision-making. A defensible collection is one where the legal team can explain: where they looked, why those sources mattered, what was reviewed, what was collected, how it was coded, what gaps exist, and how the final collection supports the file.
Without that documentation, the collection may be challenged, not because the records are wrong, but because the process behind them cannot be explained.
The Research Plan
A defensible collection begins with a research plan. The plan defines scope, identifies relevant repositories and archives, establishes source priorities, and documents the strategy for collection. It creates a record of intent: what the team set out to find, where they planned to look, and why those choices were made.
Without a research plan, the collection is vulnerable to the argument that it was ad hoc, incomplete, or biased in its selection of sources.
The Files Database and Provenance Tracking
Every source in the collection should be tracked through a files database: a structured record of where each file came from, when it was accessed, who handled it, and how it entered the collection. This is the provenance record: the documented chain of custody from source to collection.
Provenance tracking matters because the reliability of a record depends partly on where it came from and how it was handled. A record without a clear source trail is harder to authenticate and easier to challenge.
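As a rough illustration, one entry in a files database might be modeled as below. This is a minimal sketch, not a prescribed schema: the class names, field names, and the `log` and `has_source_trail` helpers are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class CustodyEvent:
    """One step in a file's chain of custody (hypothetical structure)."""
    when: date
    handler: str   # who handled the file
    action: str    # e.g. "photographed at archive", "entered into database"


@dataclass
class FileRecord:
    """One entry in the files database; every field name is illustrative."""
    file_id: str
    repository: str   # where the file came from
    reference: str    # the repository's own citation for the file
    accessed: date    # when it was accessed
    custody: list[CustodyEvent] = field(default_factory=list)

    def log(self, when: date, handler: str, action: str) -> None:
        """Append a custody event, refusing entries dated before the last one."""
        if self.custody and when < self.custody[-1].when:
            raise ValueError("custody events must be logged in date order")
        self.custody.append(CustodyEvent(when, handler, action))

    def has_source_trail(self) -> bool:
        """True when the source is named and at least one custody event exists."""
        return bool(self.repository and self.reference and self.custody)
```

A record that names its repository but has no custody events would fail `has_source_trail()`, which is the kind of entry worth flagging before the collection is relied on.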
Reliable Coding
A coding protocol defines how records in the collection are classified, described, dated, and tagged. Consistent coding makes the collection searchable, auditable, and reliable. Inconsistent coding (different coders applying different standards, undocumented naming conventions, inconsistent date formats) creates a database that cannot be trusted.
Reliable coding requires a documented coding manual, clear conventions for dates, names, document types, and issue categories, and a quality assurance process that catches errors before they compound.
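Part of a quality assurance pass can be automated. The sketch below checks a single coded record against a few conventions. The required fields, the document-type vocabulary, and the choice of YYYY-MM-DD dates are assumptions for illustration; a real coding manual would define its own.

```python
import re
from datetime import date

# Hypothetical controlled vocabulary and required fields.
DOCUMENT_TYPES = {"letter", "report", "memorandum", "map", "minutes"}
REQUIRED_FIELDS = ("doc_id", "doc_type", "doc_date", "author")
ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_iso_date(value) -> bool:
    """True only for a real calendar date written as YYYY-MM-DD."""
    if not isinstance(value, str) or not ISO_DATE.fullmatch(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:  # right shape, impossible date (e.g. 1923-02-30)
        return False
    return True


def check_coding(entry: dict) -> list:
    """Return the coding problems found in one record (empty list if clean)."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not str(entry.get(name) or "").strip():
            problems.append(f"missing {name}")
    doc_type = entry.get("doc_type")
    if doc_type and doc_type not in DOCUMENT_TYPES:
        problems.append(f"unknown doc_type: {doc_type!r}")
    doc_date = entry.get("doc_date")
    if doc_date and not is_iso_date(doc_date):
        problems.append(f"doc_date not in YYYY-MM-DD form: {doc_date!r}")
    return problems
```

Run across the whole database, a check like this surfaces drift between coders (for example, "Letter" versus "letter", or day-first dates) while it is still cheap to correct.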
Explained Gaps
No historical records collection is complete. Archives have gaps. Records are lost, destroyed, restricted, or never created. What makes a collection defensible is not the absence of gaps, but the ability to explain them.
Gaps that are identified, explained, and documented in the collection methodology show that the team understood the limits of the record and accounted for them. Unexplained gaps invite the argument that the collection is incomplete, biased, or unreliable.
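One way to keep gaps from going unexplained is to log each one with a reason and a supporting note. The sketch below assumes a simple gap register. The reason categories mirror the ones named above (lost, destroyed, restricted, never created), while the class names, fields, and the `unexplained` helper are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class GapReason(Enum):
    LOST = "lost"
    DESTROYED = "destroyed"
    RESTRICTED = "restricted"
    NEVER_CREATED = "never created"
    UNEXPLAINED = "unexplained"   # placeholder until the gap is researched


@dataclass
class Gap:
    """One known gap in the record; field names are illustrative."""
    series: str               # the record series or source affected
    years: tuple[int, int]    # first and last year of the gap
    reason: GapReason
    note: str = ""            # the basis for the stated reason


def unexplained(gaps: list[Gap]) -> list[Gap]:
    """Return gaps that have no reason yet or no note supporting the reason."""
    return [
        g for g in gaps
        if g.reason is GapReason.UNEXPLAINED or not g.note.strip()
    ]
```

An empty result from `unexplained` does not mean the record is complete. It means every known gap has a documented explanation, which is the standard the methodology needs to meet.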
Production Readiness
A defensible collection must be production-ready: searchable, documented, and prepared so that the methodology behind it can be explained under scrutiny. Production readiness is the point where the collection, its coding, its provenance records, and its methodology documentation come together as a system that can be relied on, produced, and defended.
A collection is not defensible because it is large. It is defensible because the decisions behind it (where the team looked, what was reviewed, how it was coded, and where gaps exist) can be explained.