Chapter 6 - Computer Concepts and Legal Applications

Advanced Search Features

The following are advanced search features available with some software programs. As you will discover, the emphasis is on a sort of “artificial intelligence” built into the program.

  • Soundalikes - A method whereby the computer takes a word and produces a list of words that “sound” similar. For example, a soundalike search for the word “confident” would yield confidential, confidence, etc. The program matches on the consonants of the main word to retrieve soundalike words.
  • Phrase Search - The search phrase “General Motors” would locate text where the words are side by side.
  • Synonym Search - Many programs allow you to search for synonyms. For example, if car and automobile were used in the document, then when one searches for car, all occurrences of car and automobile would be retrieved.
  • Proximity Search - To retrieve a word only when it occurs within a specific number of lines or words of another word, use a proximity search. For instance, to retrieve all instances in which car occurs within 3 lines of the word accident, type in the appropriate search command such as car /3 accident.
  • Combined Word Searching - Combined word searches are used to combine synonym, proximity, and/or Boolean searches. For example, you can search and locate all words beginning with auto, but not the word automatic, and also get hits for the word car.
  • Similar Document Searching - Finds all documents that are similar to the primary document. For example, if you locate a particularly important document, then the computer would analyze the document for keywords, location of words and concepts within the document, the number of times they appear, and so on to compare with the other documents. Then other similar documents would be retrieved for your viewing.
  • Fuzzy Searching - Locates words that closely match the spelling of the primary word. Many popular word processor packages have this capability in their spell check, which retrieves similarly spelled words and suggests possible replacements. For example, if the primary word is liability, then occurrences of the words libility, liable, and lability should be brought to the reader’s attention. Fuzzy searching can be particularly useful when locating materials that have been OCR’d (converted by Optical Character Recognition). When documents are converted to full text using OCR software, typical errors include dropped or misrecognized letters in words. Fuzzy searching will retrieve close spellings of those words for review. The sophistication of fuzzy searching varies from product to product.
  • Statistical Searching - Retrieves other words that are statistically related to your primary word. The statistical association of words with other words is established at the time the document is “indexed”. When a document is imported into the full text software, the program produces an index of all the words and their relationships to other words. When you “expand” your search of the primary word, the program retrieves all documents containing “other” words that are statistically related to the primary word, even when the primary word itself is not in the document. This lets you locate documents that a search for the primary word alone would have missed.
  • Conceptual, Thesaurus, or Related Searching - Provides other words that are similar or close in meaning to the primary word. The similar words can carry different shades of meaning. For example, a search for the word “car” would automatically return hits for “auto” or “vehicle” as well. Some full text software comes with a thesaurus. Otherwise, you can purchase a thesaurus that is related to the subject matter of your case. Some programs permit you to add or create a thesaurus for your case. However, this can be quite time consuming.
  • Topical Searching - Enables you to search documents by topics and subtopics relevant to your case. You assign a weighted relevance to each topic in your outline, and the program retrieves only those documents that pertain to the topics; the precision of the results is determined by the weight given to each topic. This is especially useful if you can define the topic outline in advance, which implies that you know the contents of your document set. Topical searching offers an advantage if you specialize in a particular area of law and the terms applied to a group of documents remain the same. For example, Verity allows users to index, manage, and store corporate information without requiring them to store it according to any particular format, location, or process, and it automates the classification of information using concept-based tools. Verity (www.verity.com).
  • Weighted Relevance Searching - Sorts and retrieves documents according to the “weight” given to each document. The weight depends on the weighting criteria of the software program. Some programs give statistical weight to the number of times the primary search terms are found in a given document and to their proximity to each other. The purpose is to retrieve the most “relevant” documents first for your review. By comparison, a topical search returns documents according to the emphasis you give to each topic, whereas a weighted relevance search returns the documents in which the terms occur most frequently and in closest proximity to each other. Natural language programs use weighted relevance as they group words and rank results by how closely the terms appear together.
  • Adaptive Pattern Recognition - This system indexes every letter on every page. It learns and remembers binary patterns found in the text. When you conduct a search, it searches on discrete patterns in the text. For example, if you search for asbestos, then whether the word appears as asbest~~ or asbestos, the algorithm will find enough of the pattern to locate these hits. This technology has proven especially useful with documents that have been OCR’ed and have not been cleaned up after conversion. If one searches for the word “beer”, then any word that matches those characters will be retrieved. With adaptive pattern recognition, if the term appears as be~r because a character was left untranslated during OCR conversion, the word would still be found.
  • Traditional search engines work within the context of alphanumeric data and provide little help for locating information in visual data, such as an image or picture. Software is now available to match photographs, video clips, trademarks, fingerprint images, or any other type of visual image.
  • Associative Retrieval - Where certain terms appear frequently near the terms you are searching, these may provide clues for further searching using the new associative words.
  • Natural Language or Non-Boolean Retrieval - Instead of using and/or connectors, you prepare your search request in ordinary language and the computer automatically converts it into algorithms. Depending on the software, it can be extremely beneficial and enable you to retrieve exact information quickly and accurately.
  • Clusters of Related Phrases - One technique that may assist your case is to find all documents that contain clusters of related phrases. For example, SemioMap from Semio deconstructs sentences linguistically, indexing relevant phrases. It builds a table of terms occurring together and finds statistical clusters of these co-occurrences. The results are then delivered in a series of navigable concept maps, in which one can discover patterns in vast bodies of text. This is especially useful for the search and retrieval of all kinds of digital data - emails, word processing documents, etc. Semio (www.semio.com) and Cartia (www.cartia.com).

Full text search programs are a valuable tool if used with a clear understanding of their strengths and weaknesses. Full text search advances are important to the location of information, but the value of the sharp, persevering, inquisitive mind of a researcher cannot be overstated. For example, an individual familiar with the vocabulary of a case can increase the retrieval of key information tenfold.
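The proximity search described in the list above (for example, car /3 accident) can be illustrated with a short sketch. This is not how any particular legal search product implements it; the function name proximity_hits and the sample sentence are assumptions made for illustration, and distance is counted in words rather than lines.

```python
import re

def proximity_hits(text, first, second, max_distance):
    """Return (i, j) word-position pairs where `first` (at i) and
    `second` (at j) occur within `max_distance` words of each other."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z0-9']+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [j for j, w in enumerate(words) if w == second.lower()]
    return [(i, j)
            for i in first_positions
            for j in second_positions
            if abs(i - j) <= max_distance]

# Hypothetical sample text, not taken from any real case file.
sample = ("The car accident occurred on Main Street. "
          "Weeks later the same car was sold.")

near = proximity_hits(sample, "car", "accident", 3)
far = proximity_hits(sample, "car", "accident", 10)
print(near)  # only the first "car" is within 3 words of "accident"
print(far)   # widening the window also catches the second "car"
```

Widening the window trades precision for recall: a larger distance retrieves more passages, but more of them will be unrelated to the point you are researching.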
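The combined word search example from the list above (all words beginning with auto, but not automatic, plus hits for car) might be sketched as follows. The function combined_search and its parameters are hypothetical names chosen for this illustration; real products expose this through their own query syntax.

```python
import re

def combined_search(text, prefix, excluded, alternatives):
    """Return words that start with `prefix` (unless listed in `excluded`)
    or that appear in `alternatives`, in document order."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    excluded = {w.lower() for w in excluded}
    alternatives = {w.lower() for w in alternatives}
    return [w for w in words
            if (w.startswith(prefix.lower()) and w not in excluded)
            or w in alternatives]

# Hypothetical sample text for illustration only.
sample = ("The automobile has an automatic transmission; "
          "the car and the autos were sold.")

hits = combined_search(sample, "auto", ["automatic"], ["car"])
print(hits)
```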
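Fuzzy searching, as described in the list above, is often explained in terms of edit distance: the number of single-letter insertions, deletions, or substitutions needed to turn one word into another. The sketch below uses the classic Levenshtein distance as one possible measure; this is an assumption for illustration, since the products discussed here do not document their matching methods, and the names edit_distance and fuzzy_matches are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]

def fuzzy_matches(term, vocabulary, max_edits=1):
    """Return vocabulary words within `max_edits` edits of `term`."""
    return [w for w in vocabulary
            if edit_distance(term.lower(), w.lower()) <= max_edits]

# Words as they might appear in an OCR'd index (hypothetical list).
vocabulary = ["libility", "liable", "lability", "ability", "library"]

close = fuzzy_matches("liability", vocabulary, max_edits=1)
print(close)
```

With a limit of one edit, only the single-letter OCR slips (libility, lability) are returned. A related but differently spelled word such as liable is four edits away, so whether it surfaces depends on how loosely the threshold is set, which is one reason fuzzy search results vary from product to product.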
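The frequency component of weighted relevance searching, described in the list above, can be sketched as a simple ranking by how often the search terms occur in each document. This is a deliberately minimal illustration: it ignores the proximity weighting mentioned above, and the function name rank_by_term_frequency and the sample documents are assumptions, not any vendor's scoring formula.

```python
import re
from collections import Counter

def rank_by_term_frequency(documents, query_terms):
    """Rank documents by the total number of query-term occurrences.
    Documents containing none of the terms are left out."""
    terms = [t.lower() for t in query_terms]
    scored = []
    for name, text in documents.items():
        counts = Counter(re.findall(r"[a-z0-9']+", text.lower()))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scored.append((name, score))
    # Highest score first, so the most "relevant" documents lead the list.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical document set for illustration only.
documents = {
    "memo.txt": "The car struck the barrier. The car was totaled in the accident.",
    "letter.txt": "Our client reported the accident to the insurer.",
    "invoice.txt": "Payment is due within thirty days.",
}

ranking = rank_by_term_frequency(documents, ["car", "accident"])
print(ranking)
```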
 
