As case complexity increases, outsourcing (arguably) enables law firms
to improve efficiency, decrease costs, and utilize specialized skills
that may not be available on smaller cases. On the other hand, firms
lose the craftsman's control they exercise on smaller, less complex matters.
To maintain control of these complex, often partially outsourced cases,
measures must be taken that allow suppliers and consumers of legal services
to measure the quality of their work. This enables firms to maximize outsourcing
efficiencies while maintaining the level of excellence that clients expect.
To this end, I am presenting one simple, tested method for determining
quality, available to both the managing attorney and the supplier of legal
work. It produces a fact-based, reproducible measurement used to determine
whether the outsourced product meets the given standard of excellence
or whether the firm is dealing with a substandard product that needs to
be returned for rework.
Before we start, I would like to address attitude and the desire to excel.
I doubt there is a provider of legal data who knowingly produces anything
less than perfect workmanship. Yet in this world of imperfection, we know
errors are inevitable. We know that humans fall asleep, that machines
misread text. Because errors are inevitable, practical quality measurement
techniques must be employed by the legal professional to understand the
true quality of the work they produce and use.
There are a number of ways to do this wrong, all generally listed under
the category of Quality Control, or "QC" as it is commonly
known. Under this banner, well-intentioned firms advocate and implement
expensive, inadequate, and occasionally frivolous document inspections,
data checks and overall hand wringing that achieve no defensible objective.
One very common method of nonproductive QC is to check "a bunch"
of the work to make sure it is right. Worse is the firm that employs a
specialized person to check all or some of the work, all the while having
no prescribed technique to determine what to inspect, how much to inspect,
or how to determine whether to accept or reject the work in front of
them. In both cases time, effort and money flow freely down a drain, while
no value is added to the product and no meaningful work is done.
Having taken this shot at my well-intentioned colleagues, it is time
to describe a system that data providers and consumers can use to prove
that they have produced the high quality work promised. Additionally where
conflict arises, it can be used to defend or refute work quality either
in court or at the bargaining table. Having promised to deliver, let me
introduce - or perhaps reintroduce - the legal industry to one of the oldest,
simplest methods to measure work quality, namely MIL-STD-105.
MIL-STD-105 (pronounced "mil standard 105") has been a manufacturing
standard since World War II. Moreover, since it is straightforward and
easy to implement, it maintains a stalwart position in modern MBA operations
courses, along with the more theoretical and statistically complex
models that make this simple system work.
MIL-STD-105 has two defining features:
1. It is easy to use.
2. It is repeatable and defensible.
Originally the standard was designed to create a non-arbitrary, meaningful
and repeatable method of determining whether products supplied to the
U.S. Armed Services met agreed quality standards. As in many situations,
the services reduced the complexity of quality measurement into two simple
tables that are both easy to use and technically concise. However, to
implement this standard, several terms need to be defined.
- ACCEPTABLE QUALITY LEVEL (AQL). This is the standard of quality to
which both the supplier and consumer agree. It originates as a percentage
of expected quality, such as 97 to 99.999 % compliant, and is quickly
translated into the number of defects (or errors) that will be allowed
in a batch of work before the whole batch is sent back to the producer
for rework (presumably at the producer's expense).
- ATTRIBUTES. This is probably the most misunderstood term in quality
measurement. Attributes are the measurable features that define the
product. In the case of a 12-foot long 2x4, measurable attributes would
include the height of the board (2 inches), the width of the board (4
inches) and its length. In the legal world, measurable attributes might
include coded fields, data extracted, or text legibility.
- DEFECTS: These are any deviation outside of the standard set
for each attribute. In the 2x4 example, the actual standard may state
that widths greater than 1.75 inches or less than 1.5 inches are not
acceptable (i.e., defects). In legal work, a defect could include a
misspelled name, a missing field entry, an incorrectly entered data
item or an image that is not accessible or useable.
- LOT SIZE: Lot size is the quantity of items produced. It might
be 130,000 pages imaged or 50,000 docs coded.
- SAMPLE SIZE: This is the quantity of items to be measured, and is
determined by the Lot size and the AQL.
- ACCEPT / REJECT CRITERIA. This is both the simple strength of MIL-STD-105
and the characteristic that allows this standard to withstand cross-examination.
Based on the AQL established during negotiations, it explicitly states
the number of defects allowable in an acceptable lot of data or images,
or such, without losing confidence that the job is as good as expected
or, perhaps, as good as humanly possible. Conversely, it describes the
point at which statistical confidence is lost and the job cannot be
accepted. Rigid, fact-based Accept/Reject criteria, along with the predetermined
sample sizes, differentiate MIL-STD-105 from the well-intentioned QC
program described earlier.
With this standard, there are no "redo's" or "maybe I
should inspect a few more." The test is statistically sound and designed
in such a way that a minimum number of samples (i.e., a known, minimum
cost) can provide an accurate and meaningful description of the overall
quality of the work on hand. This does not mean that attorneys will not
debate the issue. (Heaven forbid for those of us who support you.) Rather
it provides a concrete, reproducible test that both the consumer of data
and the provider of data can implement to ensure themselves that they
are producing the quality of work expected and advertised.
Implementing MIL-STD-105, A Case Study
Having reviewed the merits of using MIL-STD-105, I would like to create
a simple, realistic case study that we can use to learn how to implement
the tool. As an example, let's assume the following:
- This is a coding job with 5 fields to be coded per document.
- There are 15,232 documents to be coded.
- The agreed AQL is 99.85%. That is, there can be no more than 15 documents
with one or more incorrect entries per 10,000 documents. This
corresponds to a defect rate of no more than
0.15% (100.00% - 99.85%).
- Finally, let's assume that we are dealing with a supplier, or internal
department, that has a history of good, but not perfect, work.
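The AQL arithmetic in the assumptions above can be checked in a few lines. A minimal sketch in Python, using only the figures from the example:

```python
# AQL arithmetic from the example: an agreed quality level of 99.85%.
aql = 99.85                              # agreed quality level, in percent
defect_rate = round(100.00 - aql, 2)     # allowed defect rate, in percent

# Maximum defective documents allowed per 10,000 documents.
max_defects_per_10k = round(10_000 * defect_rate / 100)

print(defect_rate)          # 0.15
print(max_defects_per_10k)  # 15
```

The round() calls guard against floating-point residue in the subtraction; the result matches the article's figure of 15 defective documents per 10,000.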
Using our example, before we review the first item, we need to make a
tactical choice: do we count each field individually, or do we count documents?
In the first case, our batch size would be 5 fields times 15,232 documents,
or 76,160 items in the batch. In the second the batch size is 15,232 documents.
In our example, I am really interested in how well each document is coded,
so I am going to choose to view the document pool as the batch. Having
made this decision, I am now required to look at each sample document
in its entirety, meaning that we as the inspection team will need to verify
that each of the 5 required fields is coded correctly. If any field is
coded incorrectly, then I need to reject the document. If they are all
correct, then the document passes.
Similar reasoning could be used to view each field as a single entity.
In that case, one incorrect field would be an error out of a batch size
of 76,160 fields.
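The document-level counting rule chosen above - one bad field makes the whole document defective - can be sketched in a few lines of Python. The field names here are hypothetical, not taken from the article:

```python
# A coded document is defective if ANY of its 5 fields is wrong.
# `coded` holds the supplier's entries, `truth` the inspector's findings.
# The field names are illustrative assumptions, not from the article.
FIELDS = ["author", "recipient", "date", "doc_type", "title"]

def is_defective(coded: dict, truth: dict) -> bool:
    """Document-level rule: any single incorrect field fails the document."""
    return any(coded.get(f) != truth.get(f) for f in FIELDS)

coded = {"author": "Smith", "recipient": "Jones", "date": "2004-05-01",
         "doc_type": "memo", "title": "Q1 Budget"}
truth = dict(coded, author="Smyth")   # one misspelled field

print(is_defective(coded, truth))     # True: one bad field fails the document
print(is_defective(truth, truth))     # False: all five fields match
```

Counting fields instead of documents would simply mean tallying each mismatched field as its own defect against the 76,160-item batch.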
Determining the Correct Inspection Level
To determine how many documents to inspect, we need to establish our inspection
level. In Table I, you will notice there are a variety of inspection levels
available, including three general inspection levels and four special levels.
As in many cases, we will start in the middle and work outwards. According
to the American Society for Quality (ASQ), level II - called normal inspection
- is appropriate for unknown suppliers and for suppliers of modest quality.
Consequently, we will use this level in our example; however, the standard
is designed to be fluid, so inspection levels may change over time depending
on how the quality of the work changes over time. For example, the ASQ
states that if 10 lots are inspected with no errors, then sampling can
be reduced from normal level II to reduced level I. On the other hand,
if two of 5 jobs are rejected, then inspection should be tightened from
level II to level III until 5 consecutive jobs are accepted. Finally, if
any job is rejected at level I, inspection automatically returns to
level II. The other levels, the special levels S1 - S4, are for very small
jobs; for a job that small, we would probably do the work in-house and
review everything in its entirety rather than sample.
Knowing the batch size, in our case 15,232, and our inspection level, normal
level II, we use Table I to determine the number of samples we need to
inspect. We find this by following down the left hand column, until we
find that 15,232 falls between 15,001 and 500,000. Reading across the
top of the table we find general inspection level II. Locating the intersection
of our row and column, we determine that our sample-size code letter is "P."
So how many is "P"? To answer this, we need to make one more
lookup. With "P" written down on the back of our hand, we go
to Table II to determine the sample size as a real number. By following
down the left-hand column we see "P." Just to the right of P,
we find that we need to look at 800 documents.
At this point we can see why "choose a few" is totally inadequate
as a QC measure. From my experience, very few firms would actually review
800 documents to prove they are really 99.85% accurate. Most likely they
would review a few score and call it a day. Yet to have the kind of accuracy
demanded, 800 is the inspection size required. On the other hand, other
firms might attempt to inspect all 15,000-odd documents. Eight hundred is
far less than 15,000, yet it makes for just as reliable an inspection, for
a number of reasons - chiefly avoiding inspection fatigue.
Having learned our inspection lot size, and knowing that we are looking
for 99.85% accuracy (or a 0.15% error rate), we read across Table II-A to 0.15
and find that in those 800 documents, we can find as many as 3 with errors
and still accept that job; however if we find 4 or more errors, then we
cannot state with any certainty that this job is good enough, and the
whole job needs to be re-done and resubmitted. The lines pointing up and
down indicate that if we are reading across and do not find a number in
our row, we either skip up or down to the numbers provided. This allows
one simple table to handle the widest possible range of AQLs and
sample sizes and still remain uncluttered.
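The two table lookups just described can be sketched in code. Note this fragment transcribes only the single row this article uses - lot size 15,232 at level II maps to code letter "P," sample size 800, with accept/reject numbers 3 and 4 at an AQL of 0.15 - while the real Tables I and II-A cover many more ranges and AQLs:

```python
# Sketch of the two MIL-STD-105 lookups described above.  Only the one
# row quoted in this article is filled in; consult the full Tables I
# and II-A for any other lot size or AQL.

# Table I (fragment): lot-size range -> sample-size code letter (level II).
def code_letter(lot_size: int) -> str:
    if 15_001 <= lot_size <= 500_000:   # range as quoted in the article
        return "P"
    raise KeyError("range not transcribed here; consult Table I")

# Table II (fragment): code letter -> sample size.
SAMPLE_SIZE = {"P": 800}

# Table II-A (fragment): (code letter, AQL) -> (accept, reject) counts.
ACCEPT_REJECT = {("P", 0.15): (3, 4)}   # accept on <=3 defects, reject on >=4

letter = code_letter(15_232)
n = SAMPLE_SIZE[letter]
accept, reject = ACCEPT_REJECT[(letter, 0.15)]
print(letter, n, accept, reject)   # P 800 3 4
```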
Inspecting the Job
Knowing our sample size, the job of inspecting is very simple.
- Grab 800 documents at random. Review them to make sure that each document
is coded correctly.
- Record the number of error-free documents, and record the number of
documents with errors.
- Compare the results to the requirements of Table II-A.
- Accept or reject the job. That is, if 3 or fewer documents with errors
are found, then - correcting those errors, of course - accept the job;
otherwise, reject the job and return it for rework.
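The four inspection steps above can be sketched end to end. In this Python sketch, the set of truly defective documents is a hypothetical stand-in for what the inspectors would actually find:

```python
import random

# Sketch of the inspection steps above: draw 800 documents at random,
# count the defective ones, and accept or reject against the Table II-A
# criteria (accept on <=3 defects, reject on >=4).
LOT_SIZE, SAMPLE_SIZE, ACCEPT_NUMBER = 15_232, 800, 3

def inspect(defective_ids: set, seed: int = 0) -> bool:
    """Return True if the lot passes.  `defective_ids` is hypothetical
    ground truth standing in for the inspectors' findings."""
    rng = random.Random(seed)
    sample = rng.sample(range(1, LOT_SIZE + 1), SAMPLE_SIZE)
    defects_found = sum(1 for doc_id in sample if doc_id in defective_ids)
    return defects_found <= ACCEPT_NUMBER

# A lot with no defective documents always passes; a lot where every
# document is defective always fails.
print(inspect(set()))                          # True
print(inspect(set(range(1, LOT_SIZE + 1))))    # False
```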
At this point we have completed our review of MIL-STD-105; however, there
are two matters left: gathering a truly random sample and record keeping.
Our first impulse may be to simply grab 800 documents, but this almost
always favors some attribute or person (like sampling only file boxes
on top and in the aisle). To prevent this unintentional skewing of the
results, there are a couple of simple ways to get random numbers. Before
computers, random number tables were commonly available. (Perhaps they
still are.) But today, with computers on almost every desktop, generating
a custom random list is fairly simple task that we can do ourselves.
Creating a Random Number Table
There are a number of ways to create random number tables, but you can
use the description below to create a custom random number table in MS
Excel that, admittedly, is not perfect, but is certainly good enough.
USING MICROSOFT EXCEL:
- Type the random number function "=rand()" into the first
cell (A1). This generates a number between 0 and 1.
- In cell B1, multiply A1 by our lot size of 15,232 (=A1 * 15232)
to get a document or record number in the appropriate range.
- Copy A1 and B1, and highlight down 800 cells to get 800 random numbers
between 1 and 15,232.
We now have a valid random number table. The remaining steps are optional
but helpful in creating a functional spreadsheet that we can use as an
inspection record.
- Sort the list so that the numbers are in the same order as the documents.
This will make pulling the documents a lot easier. To do this we need
to turn the auto-calculate option off or else we will simply get another
unsorted list of new random numbers. (I did this a couple of times writing
this paper, so I know from experience.) To toggle auto-calculation look
under Tools>Options>Calculation and set the option to manual.
- After turning auto-calculation off, the list sorts correctly; however,
the list will still recreate itself each time we reopen the spreadsheet
or hit F9. This will completely destroy any traceability, which we will
need if we ever go back and review our inspect results. To freeze the
record numbers or DocID, one simple solution is to export the values
into the csv format and then re-import back into Excel or Access. This
erases all reference to the rand function and makes the numbers permanent.
(Optionally, we could have set the auto-calculate to not re-calculate
before save, but this strategy is too risky when dealing with records
we are going to retain for any period of time.)
- Either before or after we freeze the numbers, use the cell format
function to eliminate the decimal values. They don't mean anything in
our context, so they should be eliminated to avoid confusion. This is
done by selecting column B and then setting the decimal places to
0. You can find this under Format>Cells>Number and typing 0 into
the decimal places box.
- Finally, don't forget to turn auto-calculate back on or else the
rest of your spreadsheets will not update like you expect. (This will
cause a lot of head scratching the first few times the spreadsheet doesn't
update as expected.)
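For readers outside of Excel, the whole recipe above collapses to one call. A sketch in Python, which also sidesteps both the duplicate-number and recalculation problems, since random.sample draws without replacement and the resulting list never changes once written out:

```python
import random

# Draw 800 distinct document numbers from 1..15,232, sorted into pull
# order - the equivalent of the frozen, sorted Excel list above.
LOT_SIZE, SAMPLE_SIZE = 15_232, 800

picks = sorted(random.sample(range(1, LOT_SIZE + 1), SAMPLE_SIZE))

print(len(picks))                               # 800
print(len(set(picks)) == len(picks))            # True: no duplicates
print(1 <= picks[0] and picks[-1] <= LOT_SIZE)  # True: all in range
```

Writing `picks` to a file (one number per line) then serves the same traceability purpose as the frozen CSV export described above.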
Having gone to the trouble of creating and formatting a random list, we
might as well use it to record a few numbers. This will provide meaningful
traceability and a way to prove that we did what we said. In our case,
we already have the 800 randomly selected document numbers. In other cases
this might be a Bates number or a DocID. In all cases it should be the unique
identifier that tells us precisely which documents we reviewed. From here we
could add:
- Personal ID and client information: This would include the name (or
names) of the inspector(s), the date of the inspection and which
matter we are inspecting.
- Pass / Fail Results: Indicate any records that contain errors with
a check mark or perhaps with the inspector's initials. To avoid excessive
documentation, I would indicate accept as a blank.
- Error Description: This would include the reason for rejecting the
document, like "address misspelled" or "incorrect author."
This information is valuable for understanding typical errors and error
sources even when the project meets the AQL.
- Results summary: This would include number passed, number failed, and whether
the batch is accepted or rejected.
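The record-keeping fields listed above map naturally onto a flat log file. A minimal sketch in Python; the column names and sample rows are illustrative assumptions, not prescribed by the article:

```python
import csv
import io

# One row per inspected document: unique ID, inspector, date, matter,
# a fail mark (blank means "pass," per the convention above), and an
# error description when the document fails.
COLUMNS = ["doc_id", "inspector", "date", "matter", "fail", "error"]

rows = [
    {"doc_id": 19, "inspector": "", "date": "2004-05-01",
     "matter": "Acme v. Beta", "fail": "", "error": ""},
    {"doc_id": 57, "inspector": "RSG", "date": "2004-05-01",
     "matter": "Acme v. Beta", "fail": "X", "error": "address misspelled"},
]

buf = io.StringIO()          # stands in for the log file on disk
writer = csv.DictWriter(buf, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)

# The results summary falls out of the same log.
failed = sum(1 for r in rows if r["fail"])
print(failed)                # 1 document with errors
```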
Having taken all of these steps - pulling the applicable number of documents
at random, reviewing the documents and recording our results - we now
stand ready to state with authority that the work we are producing or
purchasing meets a known quality standard. If there is a question about
whether the job should be reworked, the consumer and the producer of the
data can review the test documents and see the exact results of the original
testing. Finally, and hopefully in most cases, both the consumer and producer
of the legal service will have a meaningful, repeatable measurement that
proves the work product meets the level of quality expected. This, in
turn, is a vital first step in understanding and improving overall quality
for our clients.
The inspection measurement technique reviewed in this article is not the
most complete inspection tool available to modern litigation support managers
and attorneys. Moreover, there are valid criticisms of using MIL-STD-105.
The primary complaint is that it does not in any way improve quality.
It simply measures each batch pushed through the process. Still, after
60 years of implementation, this standard remains a mainstay of quality
measurement for several reasons:
- It is simple enough that it can be implemented with very little training.
- The technique is valid on both big and small batches.
- The results are definitive rather than arbitrary, and
- The results are reproducible.
In summary, MIL-STD-105 provides a simple means of measuring quality on
a day-to-day basis without involving complex math or training, while at
the same time creating a just and reproducible quality measurement system
that employees and clients can understand.
*Quality Council of Indiana. Terre Haute, Indiana. The Quality Council
of Indiana provides a definitive and useful guide for serious students
of product and process quality who are interested in passing upcoming
ASQ exams. They are found on the web at www.qualitycouncil.com.
* American Society for Quality (ASQ) is a professional association "advancing
learning, quality improvement and knowledge exchange to improve business
results." Additionally, they provide examination and testing for nationally recognized
certifications such as Certified Quality Engineer, Certified Quality Manager
and several others. They may be found at
* MIL-STD-105E is available through the Navy Publishing and Printing
Service Office and is sold by any number of technical publication vendors.
You can also locate various editions of the standard in PDF by running
a search on http://assist.daps.dla.mil/quicksearch.
R. SAM GILCRIST is an independent litigation technical consultant who
does trial work through Litigation Tech and who does computer consulting
and e-discovery consulting through Gilcrist.com. He is completing a BS
in Computer Science and holds a BS and MBA in management from Georgia
State College, Augusta. You may reach Sam for trial work at
or for computer programming or further information on forensic production