A Tutorial on Sampling in Predictive Coding

Published in 2013 by Herbert L. Roitblat, Ph.D.

Paper Extract

From an information-science perspective, eDiscovery is about separating the responsive documents from the non-responsive ones. For the attorneys, of course, eDiscovery involves a lot more, but this information perspective is a natural precursor to the legal analysis. An effective predictive coding system will classify most of the responsive documents as putatively responsive and will classify few of the non-responsive documents as putatively responsive. Many parties prefer to use predictive coding because it generally requires substantially less effort and cost than manual review and it typically returns more accurate results.
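The two properties of an effective system described above can be made concrete with a small calculation. The sketch below is illustrative only: the labels are invented, and the names `recall` and `false_positive_rate` are the standard information-retrieval terms for these two quantities, not terms taken from the paper extract.

```python
def classification_rates(truth, predicted):
    """Return (recall, false_positive_rate) for two parallel lists of booleans.

    True means "responsive". `truth` holds the authoritative reviewer's calls
    and `predicted` holds the system's putative calls (both hypothetical here).
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # responsive, called responsive
    fn = sum(1 for t, p in pairs if t and not p)      # responsive, missed
    fp = sum(1 for t, p in pairs if not t and p)      # non-responsive, called responsive
    tn = sum(1 for t, p in pairs if not t and not p)  # non-responsive, correctly excluded

    # Share of responsive documents the system found ("most" should be high).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Share of non-responsive documents wrongly called responsive ("few" should be low).
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, false_positive_rate


# Invented example: 4 responsive and 6 non-responsive documents.
truth = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

recall, false_positive_rate = classification_rates(truth, predicted)
print(f"recall={recall:.2f}, false positive rate={false_positive_rate:.2f}")
```

In this made-up case the system finds three of the four responsive documents and wrongly flags one of the six non-responsive ones.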

Predictive coding systems learn from examples. Systems differ in how they get their examples, but all of them rely on a set of examples provided by one or more knowledgeable, authoritative reviewers. The examples can be selected at random and then categorized, provided by expert reviewers, chosen by the computer, or drawn from some combination of these sources. These training documents are a sample of all of the documents in the collection, though for many predictive coding systems they may not be a random or representative sample.
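As one illustration of the first option, a simple random sample can be drawn so that every document in the collection has the same chance of being selected for review. This is a minimal sketch under stated assumptions: the document identifiers, the collection size, the sample size, and the helper name `draw_random_sample` are all hypothetical, and it does not represent how any particular predictive coding product selects its training examples.

```python
import random


def draw_random_sample(doc_ids, sample_size, seed=None):
    """Pick `sample_size` distinct documents uniformly at random.

    A fixed `seed` makes the draw repeatable, which can help when the
    selection needs to be documented or reproduced later.
    """
    rng = random.Random(seed)
    return rng.sample(doc_ids, sample_size)


# Hypothetical collection of 1,000 document identifiers.
collection = [f"DOC-{i:05d}" for i in range(1, 1001)]

# Draw 50 documents for a reviewer to categorize.
training_sample = draw_random_sample(collection, 50, seed=42)
print(len(training_sample), "documents selected for review")
```

Examples hand-picked by an expert or chosen by the computer would not pass through a step like this, which is why such training sets may not be representative of the collection as a whole.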

About The Author

Herbert L. Roitblat, Ph.D., is Chief Scientist, Chief Technology Officer, and co-founder of OrcaTec LLC (CA). Before starting OrcaTec, Dr. Roitblat was Chief Scientist and a co-founder of DolphinSearch, as well as an award-winning professor of psychology at the University of Hawaii. He has been awarded four patents on conceptual search technologies. Dr. Roitblat is widely recognized as an expert in search and retrieval technology, particularly in the area of eDiscovery. He was the technology expert in the recent Global Aerospace case, in which the court approved the use of predictive coding over the objections of opposing counsel.

About OrcaTec

OrcaTec helps clients address the business and legal challenges of discovering and managing unstructured data. It delivers advanced analytics and predictive coding technologies as products and services to law firms, corporations, and governments. OrcaTec offers a complete suite of textual analytics tools, including concept search, visual clustering, and predictive coding, as part of the OrcaTec Document Decisioning Suite™. The suite provides legal professionals with an all-in-one offering for the analysis and review phases of the electronic discovery process. It includes OrcaPredict for predictive coding, early case assessment, and first-pass document review; OrcaSearch for concept searching; OrcaCluster for visual clustering; and OrcaReview for second-pass document review.

To learn more about OrcaTec and the OrcaTec Document Decisioning Suite, visit OrcaTec.com.


