Full Text Compared to Images
Images, such as TIFF files, are an “electronic snapshot” of a paper document. It is important to note that images CANNOT be searched using full text software. The words on an imaged document are not in “full text” or ASCII format; they are merely dots on a bitmap digital image. To locate a document image, the image must be linked to an index or database. The database is searched and, once the database record is located, the attached image can be viewed.

The cost of an image is approximately 10 cents or less per page. Converting a document to full text requires OCR software and is inexpensive. However, if “cleanup” of the converted text is desired, the cost generally increases to between $.50 and $2.00 per page. Images generally take approximately 50 K of storage for each page. By contrast, a 400-word, one-page letter in full text occupies only 3 K, since only the words are digitized and not the whole page.

Full text conversion of your material is not the complete answer to controlling factual information in your cases. One of the main limitations of full text searching is language: the words used to describe any event are too variable. An event, person, or concept can be described in a number of ways with different words. The same person may be referred to as John Smith, husband, manager, friend, owner, debtor, and so on. People use words loosely and without common standards. Another example is the use of medical terms by the lawyers, physicians, and others in a case. Was it a broken arm, a fractured arm, or a comminuted fracture? Dates also pose a problem, since a search for May 15, 1990 will miss the date if it is written as 5/15/90.

Limitations of Full Text
Full text software still lacks sorting and precision in managing text in a structured manner. Full text software references text but does not manage it. Full text lacks the structure and precision of a database.

Should You Convert Documents to Full Text?
In a typical lawsuit, you will obtain written discovery from opposing counsel. This can include answers to interrogatories, tax returns, corporate documents, and any other written documents produced pursuant to a Request for Production or other discovery mechanism. If you cannot receive the information in a digital format, you must decide whether to convert these written documents to ASCII full text so that you can search them with a full text software program. Besides the cost of conversion, which may be substantial, one issue that has to be addressed is whether full text conversion will meet your needs to control the document information in a case.

One school of thought holds that all of the documents in a case should be converted to full text, no matter the cost. Its proponents argue that this meets their needs for document control and that no database coding is needed. They contend that full text searches will locate the evidence they need in a case. However, as discussed above, there are severe limitations in locating information in a full-text-only environment. For example, will a search for "Mr. Kowoski" in 1000 documents that have been converted to full text find all the relevant documents? Probably not, because Mr. Kowoski may be referred to as a "manager," "Frank," "sales manager," or any number of references other than his last name.

To counter the problem, some attorneys resort to abstracting or coding the complete document collection, depending on its size. This approach has its own inherent limitations: someone has to decide the proper coding, certain codes may be missed, and the coder may be inexperienced or inattentive when valuable documents are being reviewed. Many of today’s experts recommend that you use a combination of full text and indexing. They suggest that you convert the important documents in your case to full text and code the other documents.
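The Kowoski example, along with the date problem noted earlier, can be made concrete with a small sketch. The Python below is illustrative only: the `Document` record, its `people` and `doc_date` fields, the sample documents, and the function names are all hypothetical and are not taken from any particular litigation support product. It compares a literal full text search with a search against fields assigned by a human coder.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class Document:
    doc_id: str
    text: str                                      # OCR'ed full text
    people: Set[str] = field(default_factory=set)  # names assigned by a coder
    doc_date: Optional[date] = None                # date assigned by a coder


def full_text_search(docs: List[Document], phrase: str) -> List[str]:
    """Return IDs of documents whose text contains the literal phrase."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return [d.doc_id for d in docs if pattern.search(d.text)]


def coded_person_search(docs: List[Document], person: str) -> List[str]:
    """Return IDs of documents a coder linked to the given person."""
    return [d.doc_id for d in docs if person in d.people]


def coded_date_search(docs: List[Document], when: date) -> List[str]:
    """Return IDs of documents a coder dated to the given day."""
    return [d.doc_id for d in docs if d.doc_date == when]


# Hypothetical sample collection; the wording varies the way real documents do.
DOCS = [
    Document("D001", "Mr. Kowoski approved the shipment on May 15, 1990.",
             {"Kowoski, Frank"}, date(1990, 5, 15)),
    Document("D002", "Frank, the sales manager, asked the customer to wait.",
             {"Kowoski, Frank"}, date(1990, 6, 1)),
    Document("D003", "The manager signed the invoice on 5/15/90.",
             {"Kowoski, Frank"}, date(1990, 5, 15)),
    Document("D004", "Corporate tax return for fiscal year 1989."),
]

if __name__ == "__main__":
    print(full_text_search(DOCS, "Kowoski"))             # ['D001']
    print(coded_person_search(DOCS, "Kowoski, Frank"))   # ['D001', 'D002', 'D003']
    print(full_text_search(DOCS, "May 15, 1990"))        # ['D001']
    print(coded_date_search(DOCS, date(1990, 5, 15)))    # ['D001', 'D003']
```

In this sample, the literal searches find only the one document that happens to use the exact wording, while the coded fields find every document the coder linked to the person or the date. The coded results are only as good as the coding, which is the limitation described above and the reason a combination of the two approaches is often recommended.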
If some of the abstracted documents become relevant later on, they can be OCR’ed and added to your document collection. Also, with decreasing scanning costs to “image” a document and the immense storage capabilities of CD-ROM, the actual image of the document can be attached to either the abstracted document database or the OCR version. The image itself can be converted to full text at any time.

Full Text Compared to Database Abstracts
Some factors to consider:
Chapter 7 provides further legal application techniques for full text documents and techniques on how to effectively manage these types of documents.
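The per-page figures quoted in this section (roughly 50 K per page image, 3 K for a one-page full text letter, about 10 cents or less per page for imaging, and $.50 to $2.00 per page for cleanup) can be turned into a rough planning estimate. The sketch below is only a back-of-the-envelope calculation: the 650 MB CD-ROM capacity, the 10,000-page collection, and the function names are assumptions made for illustration, and actual file sizes and vendor prices will vary.

```python
# Per-page figures taken from the text above.
IMAGE_KB_PER_PAGE = 50        # approximate size of one page image
TEXT_KB_PER_PAGE = 3          # approximate size of one page of full text
IMAGE_COST_CENTS = 10         # imaging cost per page (upper figure)
CLEANUP_COST_CENTS = (50, 200)  # low and high cleanup cost per page

# Assumption for illustration only: a 650 MB CD-ROM.
CDROM_CAPACITY_MB = 650


def storage_mb(pages: int, kb_per_page: int) -> float:
    """Storage needed for a collection, in megabytes (1 MB = 1024 K)."""
    return pages * kb_per_page / 1024


def pages_per_disc(kb_per_page: int, capacity_mb: int = CDROM_CAPACITY_MB) -> int:
    """Whole pages that fit on one disc of the given capacity."""
    return (capacity_mb * 1024) // kb_per_page


def cost_dollars(pages: int, cents_per_page: int) -> float:
    """Total cost in dollars for a per-page rate given in cents."""
    return pages * cents_per_page / 100


if __name__ == "__main__":
    pages = 10_000  # hypothetical collection size
    print(f"Images:    {storage_mb(pages, IMAGE_KB_PER_PAGE):.1f} MB")
    print(f"Full text: {storage_mb(pages, TEXT_KB_PER_PAGE):.1f} MB")
    print(f"Image pages per disc: {pages_per_disc(IMAGE_KB_PER_PAGE)}")
    print(f"Text pages per disc:  {pages_per_disc(TEXT_KB_PER_PAGE)}")
    print(f"Imaging cost: up to ${cost_dollars(pages, IMAGE_COST_CENTS):,.0f}")
    low, high = CLEANUP_COST_CENTS
    print(f"Cleanup cost: ${cost_dollars(pages, low):,.0f} "
          f"to ${cost_dollars(pages, high):,.0f}")
```

Under these assumptions, the cleanup cost dwarfs both the imaging cost and the storage cost, which is one reason to convert and clean up only the important documents rather than the whole collection.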