


Written by Michael Arkfeld
This area addresses one specific issue: clarifying what the drafters of Federal Rule of Civil Procedure 34(b) and of state rules meant by stating that a requestor of electronically stored information (ESI) “may specify the form or forms” in which electronically stored information is to be produced. Set out below is a proposed set of terms to clarify the questions raised by the “form or forms” reference under Rule 26(f) and Rule 34 of the FRCP. Attorneys and other legal professionals may freely use these terms as a "common glossary" with opposing counsel to discuss the "form" of disclosure of ESI.

The specific definitions of the terms below, such as "native file," are not copyrighted and are intended to be freely distributed. However, we request that anyone who uses these terms disclose their source. For example, if you use these terms as an attachment to a letter to opposing counsel, the citation to the glossary would be: ELawExchange "Form or Forms", available at http://www.elawexchange.com.

Overview - "Form or Forms" of Electronic Discovery

Excerpt from Arkfeld on Electronic Discovery and Evidence (2nd Ed.), § 7.7(G), ESI Form(s):
The following are the common "form or forms" of "electronically stored information."
Native file
Computer application software programs generally maintain their data in proprietary file formats commonly referred to as “native file” format. A native file is the file format used by a specific software application. These files often have a specific file extension, which consists of the last three or more letters following the file name and period, such as “smith.PST” for a Microsoft Outlook file (PST) relating to a person named “Smith.” The proprietary data file format determines which software program can read the data file, and the file also contains its metadata. Unless otherwise stipulated, an agreement to exchange "native" files will include the entire computer file, including all metadata.
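As a rough illustration of how a file extension points to a native application, the sketch below looks up the extension of a file name in a small table. The function name `native_application` and the mapping `KNOWN_NATIVE_FORMATS` are hypothetical and exist only for this example. Only the PST entry comes from the text above; the other entries are common examples added for illustration, and a real list would be far longer.

```python
from pathlib import Path

# Illustrative mapping only (an assumption for this sketch, not a complete list).
KNOWN_NATIVE_FORMATS = {
    ".pst": "Microsoft Outlook",
    ".docx": "Microsoft Word",
    ".xlsx": "Microsoft Excel",
}


def native_application(file_name: str) -> str:
    """Return the application usually associated with a file's extension."""
    # The extension is the text after the final period, e.g. ".PST".
    extension = Path(file_name).suffix.lower()
    return KNOWN_NATIVE_FORMATS.get(extension, "unknown")


print(native_application("smith.PST"))
```

An extension is only a hint: a file can be renamed, so the extension alone does not prove which program created it.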
Metadata
The Court in Aguilar v. Immigration & Customs Enforcement Div., No. 07-8224, 2008 U.S. Dist. LEXIS 97018, at *11-15 (S.D.N.Y. Nov. 20, 2008) stated:

1. Types of Metadata

Metadata, frequently referred to as “data about data,” is electronically-stored evidence that describes the “history, tracking, or management of an electronic document.” . . . . It includes the “hidden text, formatting codes, formulae, and other information associated” with an electronic document. . . . . Autotech Techs. Ltd. P'Ship v. AutomationDirect.com, Inc., 248 F.R.D. 556, 557 n.1 (N.D. Ill. 2008) (Metadata includes “all of the contextual, processing, and use information needed to identify and certify the scope, authenticity, and integrity of active or archival electronic information or records”). Although metadata often is lumped into one generic category, there are at least several distinct types, including substantive (or application) metadata, system metadata, and embedded metadata. . . .

There are three types of “metadata.”

a. Substantive Metadata

Substantive metadata, also known as application metadata, is "created as a function of the application software used to create the document or file" and reflects substantive changes made by the user. (citation omitted). This category of metadata reflects modifications to a document, such as prior edits or editorial comments, and includes data that instructs the computer how to display the fonts and spacing in a document. (citation omitted) Substantive metadata is embedded in the document it describes and remains with the document when it is moved or copied. . . .

b. System Metadata

System metadata “reflects information created by the user or by the organization's information management system.” (citation omitted) This data may not be embedded within the file it describes, but can usually be easily retrieved from whatever operating system is in use. . . . Examples of system metadata include data concerning “the author, date and time of creation, and the date a document was modified.” . . . Courts have commented that most system (and substantive) metadata lacks evidentiary value because it is not relevant. (citations omitted) System metadata is relevant, however, if the authenticity of a document is questioned or if establishing “who received what information and when” is important to the claims or defenses of a party. (citation omitted) This type of metadata also makes electronic documents more functional because it significantly improves a party's ability to access, search, and sort large numbers of documents efficiently. . . . .

c. Embedded Metadata

Embedded metadata consists of “text, numbers, content, data, or other information that is directly or indirectly inputted into a [n]ative [f]ile by a user and which is not typically visible to the user viewing the output display” of the native file. . . . Examples include spreadsheet formulas, hidden columns, externally or internally linked files (such as sound files), hyperlinks, references and fields, and database information. . . . This type of metadata is often crucial to understanding an electronic document. For instance, a complicated spreadsheet may be difficult to comprehend without the ability to view the formulas underlying the output in each cell. . . . .
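The point that system metadata is held by the operating system rather than inside the file can be illustrated with a short sketch. The function name `system_metadata` and the dictionary keys are assumptions made for this example. It reads only the file size and last-modified time, because those are available on every platform through Python's standard `os.stat` call. It is not a forensic tool.

```python
import os
from datetime import datetime, timezone


def system_metadata(path):
    """Return a few operating-system-level facts about a file (illustrative only)."""
    info = os.stat(path)  # asks the operating system, not the file contents
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "size_bytes": info.st_size,
        "modified_utc": modified.isoformat(),
    }
```

Because these values live in the file system, copying or opening a file carelessly can change them, which is one reason parties often address metadata preservation when negotiating the form of production.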
Database
In computer terminology, a database is simply a collection of mutually related data or information stored in computer record fields. Databases are organized collections of information similar to index cards, phone books, or file cabinets of documents. In business, all kinds of data, from e-mail and contact information to financial data and records of sales, are stored in some form of a database. Databases track employee information, payrolls, job classifications, retirement benefits and a host of other business-related information.

The Fed. R. Civ. P. 34 Advisory Committee Note of 2006 recognized “dynamic databases” and how databases may store different forms of ESI: "[e]lectronically stored information may exist in dynamic databases and other forms far different from fixed expression on paper. . . . Using current technology, for example, a party might be called upon to produce word processing documents, email messages, electronic spreadsheets, different image or sound files, and material from databases."

A subset of a database can be produced by the use of queries and reports. The results can be exported into a comma-delimited, spreadsheet or other file format.
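Producing a subset of a database through a query and exporting it as comma-delimited text might look like the following minimal sketch. The `employees` table, its columns, its sample rows and the function name `export_query_to_csv` are all invented for illustration, and an in-memory SQLite database stands in for whatever database system a party actually uses.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, params=()):
    """Run a query and return its result as comma-delimited text with a header row."""
    cursor = conn.execute(query, params)
    headers = [column[0] for column in cursor.description]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)   # field names as the first line
    writer.writerows(cursor)   # one line per matching record
    return buffer.getvalue()


# Hypothetical sample data, used only to demonstrate the export.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, job_class TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Smith", "Analyst", 52000), ("Jones", "Manager", 78000)],
)

subset = export_query_to_csv(
    conn,
    "SELECT name, salary FROM employees WHERE job_class = ?",
    ("Manager",),
)
print(subset)
```

Only the fields and records selected by the query appear in the export, so the wording of the query effectively defines what is produced.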
Spreadsheet
Spreadsheet application programs perform simple and complex mathematical calculations automatically. Spreadsheets can be used in a variety of business functions, and oftentimes are used by individuals to keep their financial and other records. Spreadsheets have long been used in business and are of interest for electronic discovery because of their content. Calculations and mathematical analyses, mailing lists, to-do lists, attendance rosters, invoices, real estate closing statement calculations, truth in lending statements and mortgage payments, loan calculations and amortization schedules are a few of the uses of spreadsheets. Database information can be downloaded into a spreadsheet format.

"Native" spreadsheet computer files contain: Calculations or formulas that are not visible in a printout version (only the result of the calculation is visible); Hidden cells, columns, rows and post-it style comments; Hidden worksheets; Hidden formulas; and Display of all rows and columns.
Image
Imaging is a process where paper or ESI is scanned into a system and stored electronically as a “picture.” These digitized computer files of documents are known as “images.” Images may be searchable or non-searchable. Metadata from the original document is generally not carried over when the producing party converts or prints a file to a TIFF or PDF format. ESI can be converted to images for disclosure purposes, redaction or presentation in the courtroom. In addition, images may be linked to a database to assist in the retrieval process.

If an image is not searchable, it can be converted into searchable text by using optical character recognition (OCR). OCR is a process of converting letters or numbers that appear on an image or printed page to a bit mapped image and then into ASCII that can be searched. However, the process is not 100% accurate and it may be expensive to "clean up" the converted image data. An image that is created from ESI is generally searchable since the ESI is converted directly into an image with the accompanying “text.” This is usually accomplished by using the PDF image format. In addition, ESI can be converted to a nonsearchable TIFF image and be accompanied by searchable “text.”

The two most popular image forms for litigation are TIFF and PDF.

TIFF (Tagged Image File Format). A TIFF is a standard proprietary file format for storing images as bit maps. It is used especially for scanning documents because it can support any size, resolution and color depth. TIFF images are not searchable. If the document is OCR’ed, which is not 100% accurate, then it will become partially searchable.

PDF (Portable Document Format). PDF is also a proprietary file format (Adobe, Inc.) that preserves the fonts, images, graphics and layouts of any source document, such as a Microsoft Word document, regardless of the application and platform used to create it. After converting a file to a PDF format, anyone with the free Adobe Reader software can view the document as it originally appeared in the application program. This removes the need to obtain a licensed copy of the application program to view the document. There are two different types of PDF files: IMAGE ONLY format is an exact electronic picture of the paper document, and it cannot be word searched unless the image is subsequently OCR’ed; IMAGE FORMAT with SEARCHABLE TEXT FORMAT is an electronic picture or image of the document that also contains background “hidden” text that can be word searched using Adobe Reader software.

Documents, spreadsheets, e-mail and graphics can all be converted to TIFF or PDF.
Fed. R. Civ. P. 34(a) states: “(a) Scope. Any party may serve on any other party a request (1) to produce . . . documents or electronically stored information — including . . . images . . . stored in any medium from which information can be obtained (emphasis added).”
Text, ASCII, and Conversion Formats
Text or ASCII. “Text” or "ASCII" (American Standard Code for Information Interchange) documents are those documents that have the “text” of a document stored in a computer file. These documents can be word or phrase searched and one can instantly access the exact location of the words in the text documents. ASCII is a format that most computer programs recognize for transferring data between programs and to conduct “text” searches. Essentially any document produced is in a “text” format if it is in an ASCII format. Once in an ASCII format, it can be imported and searched in a search and retrieval software program. An example of a “text” document is the deposition of a witness. Other examples of “text” documents include business documents, trial transcripts, witness interviews, expert reports, etc.

Conversion Formats. Generally, in order to transfer data between different database, spreadsheet and automated litigation support (ALS) programs, you have to convert data into a format that is recognized by both programs. For example, one common format for transferring data from one application to another is “comma-delimited,” in which each piece of data is separated by a “comma.” Most database and spreadsheet programs are able to import and export “comma-delimited data.” Any character can be used to separate the data, but the common separators or delimiters are the comma (usually referred to as CSV (comma-separated value)), text, vertical bar, space and the tab key. Column headers are usually included as the first line and used as “field descriptors” for identification purposes.
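The role of the delimiter and of the header line can be sketched as follows. The function names `read_delimited` and `convert_delimiter` and the sample field names `doc_id` and `author` are hypothetical and used only for illustration. The sketch relies on Python's standard `csv` module.

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse delimited text, using the first line as field descriptors."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


def convert_delimiter(text, source, target):
    """Rewrite delimited text from one separator to another."""
    rows = csv.reader(io.StringIO(text), delimiter=source)
    output = io.StringIO()
    csv.writer(output, delimiter=target, lineterminator="\n").writerows(rows)
    return output.getvalue()


# Hypothetical vertical-bar-delimited export with a header line.
sample = "doc_id|author\n001|Smith\n"
print(read_delimited(sample, delimiter="|"))
print(convert_delimiter(sample, "|", ","))
```

When a value itself contains the target delimiter, the `csv` writer wraps that value in quotation marks so the receiving program can still split the fields correctly.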
Paper
The Fed. R. Civ. P. 34 Advisory Committee Note of 2006 recognizes that: “[t]he form of production is more important to the exchange of electronically stored information than of hard-copy materials, although a party might specify hard copy as the requested form.”
Automated Litigation Support (ALS) Form and Online ESI Depository
Technically, a disclosure of ESI in an automated litigation support (ALS) format is not a “form.” However, many litigants, courts and agencies will agree or order that the exchange of ESI be provided in an ALS format. For example, the Court may require that all ESI be provided in a Summation “load” file, which will enable the requesting party to immediately load, search, analyze and produce ESI reports in Summation. These “load” files may contain a database, images and text. Essentially, ESI is preprocessed to be used immediately with ALS systems.

Automated litigation support (ALS) generally refers to computer operations that support legal functions in litigation. Summation and Concordance are two of the leading litigation ALS systems. An ALS system can manage a large volume of documents, electronic data, transcripts and other data in a secure environment for quick retrieval and analysis during litigation. In addition, online ESI depositories, such as Lextranet or CaseVault, are web-based ALS systems and are increasingly being used for hosting and management of ESI.
Audio and Video
Audio has moved from analog recording with LPs (long-playing records) and tape cassettes as the playback medium to digital recording using computers with digital sound playback. Digital audio files are usually compressed to reduce storage requirements and speed up transmission. The most popular audio file format today is MP3. MP3 is a standard technology and format for compressing a sound sequence into a very small file while largely preserving the perceived sound quality when it is played.

Video generally refers to recording, manipulating and displaying moving images in a format that can be presented on a television or on a computer monitor. It is a recording produced with a video recorder (camcorder) or some other device that captures full motion.