The Conceptual Search Game is Finally On!!

For the past 3 years, I have been writing (probably preaching, to some) about advanced search technology, the differences between conceptual search and keyword search, and the importance of advanced search technology in both the Early Case Assessment (ECA) and document review phases of eDiscovery.

Concept Search Case Law Emerging http://ediscoveryconsulting.blogspot.com/2008/06/concept-search-case-law-emerging.html
Concept Search vs. Keyword Search http://ediscoveryconsulting.blogspot.com/2008/12/concept-search-vs-keyword-search-in.html
Litigators Need ESI Analytics – Not Boolean Search Tools http://ediscoveryconsulting.blogspot.com/2010/05/litigators-need-esi-analytics-not.html

However, I have been somewhat disappointed with the level of adoption of true conceptual search technology by the leading Litigation Technology vendors. That appears to be changing. As an example, in a June 17, 2010 post by StoredIQ titled “Email Search: Nowhere to Hide”, the author provides an overview of StoredIQ’s search technology with a specific focus on Natural Language Processing (NLP). I point this out because 12 months ago, StoredIQ would not have been spending marketing dollars or Blog space on NLP because the market didn’t know what it was and didn’t care.

In this Blog post, StoredIQ now contends, “Probably of greatest interest to litigators during the discovery process is StoredIQ’s ability to perform natural language processing (NLP), which is the ability to extract linguistically derived natural language concepts from within email and user files including people, places and things. Legal teams can immediately search using over 250 out-of-the-box concepts and attributes including credit card accounts, social security numbers and stock symbols. NLP identifies word usage based upon context within a sentence. For example, NLP can identify if the word ‘will’ is used to identify a person’s name, a legal document or an auxiliary verb showing intent. StoredIQ has proprietary technology for adaptive sentence boundary disambiguation (ASBD) which substantially increases the precision of Natural Language Processing to address common grammatical deficiencies that are present in many business documents. No other information management technologies have this capability. NLP is a critical capability necessary to accurately perform eDiscovery, records management or risk management as full text indexes alone cannot provide the required level of precision.”
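To make the ‘will’ example in that quote a little more concrete, here is a toy sketch in Python. It is not StoredIQ's technology and not real NLP: it is a handful of hand-written context rules that I am assuming purely for illustration (the function name `classify_will`, the word list and the three labels are all hypothetical). A production system would use a trained part-of-speech tagger and named entity recognizer instead.

```python
import re

# Hypothetical word list: words that usually precede the noun "will" (the document).
POSSESSIVES_AND_ARTICLES = {"the", "a", "his", "her", "their", "my", "your", "our"}


def classify_will(sentence):
    """Label each occurrence of 'will' using only its neighbouring words.

    Returns a list with one label per occurrence:
    'document', 'name' or 'auxiliary'. Toy rules only.
    """
    tokens = re.findall(r"[A-Za-z']+", sentence)
    labels = []
    for i, tok in enumerate(tokens):
        if tok.lower() != "will":
            continue
        prev_tok = tokens[i - 1].lower() if i > 0 else ""
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        if prev_tok in POSSESSIVES_AND_ARTICLES or prev_tok.endswith("'s"):
            # "her will", "the will", "John's will" -> the legal document
            labels.append("document")
        elif tok == "Will" and (i > 0 or next_tok[:1].isupper()):
            # Capitalised mid-sentence, or followed by a capitalised word
            labels.append("name")
        else:
            # Everything else is treated as the auxiliary verb
            labels.append("auxiliary")
    return labels


if __name__ == "__main__":
    for s in [
        "I will send the contract tomorrow.",
        "She updated her will last year.",
        "Please forward this to Will Smith.",
    ]:
        print(s, "->", classify_will(s))
```

Rules this crude break quickly (a question such as "Will I see you?" would be labelled a name), which is exactly why context-aware language processing is harder than keyword matching: a keyword index sees the same four letters in all three sentences.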

Interestingly enough, in my discussions over the past 6 months with General Counsel and their litigation support teams from the Information Technology departments, I have found a new awareness of and appreciation for true conceptual search, semantic search or NLP. So, StoredIQ is on the right track with their current product offerings and I would bet that they have a product roadmap with more of the same.

So, I guess the conceptual search game is on, and the other litigation technology vendors had better take notice of what their clients are saying about the search technology they need.

The full text of the StoredIQ Blog post is as follows:

A recent article by Jacob Goldstein, 23 Things Not To Write In An Email, illustrates the type of granularity as well as breadth of keywords that can be used by litigators during the legal discovery process to search for relevant information. He points out some keywords that may raise a legal red flag and should be used carefully when constructing emails. However, today’s technology search capabilities provide such precise, complete and accurate results, that there just isn’t anywhere to hide.
For instance, StoredIQ’s advanced search capabilities can look within compressed files, email archives and email attachments, in addition to the text contained in the email message itself. It can also search non-printable text within a document or email and can search through comments and revisions. In addition to search using keywords, StoredIQ supports many advanced search capabilities including:

  • Single term search
  • Multiple term search
  • Concept-based search
  • Boolean operators
  • Logical grouping of terms
  • Wildcards within search terms or Boolean expressions
  • Proximity searches
  • Natural language entities
  • Regular expressions
  • Macro-based searches
  • Object level attributes
  • By hash value (digital signatures)

Probably of greatest interest to litigators during the discovery process is StoredIQ’s ability to perform natural language processing (NLP), which is the ability to extract linguistically derived natural language concepts from within email and user files including people, places and things. Legal teams can immediately search using over 250 out-of-the-box concepts and attributes including credit card accounts, social security numbers and stock symbols. NLP identifies word usage based upon context within a sentence. For example, NLP can identify if the word ‘will’ is used to identify a person’s name, a legal document or an auxiliary verb showing intent. StoredIQ has proprietary technology for adaptive sentence boundary disambiguation (ASBD) which substantially increases the precision of Natural Language Processing to address common grammatical deficiencies that are present in many business documents. No other information management technologies have this capability. NLP is a critical capability necessary to accurately perform eDiscovery, records management or risk management as full text indexes alone cannot provide the required level of precision.
I know a lot of these terms can be a mouthful, but the underlying take away is that legal teams have the technology to precisely and accurately search electronic data, including email, making it much easier for litigators to discover data that was at one time hidden from them.
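For readers who want to see what a few of the search types listed in the StoredIQ post look like in practice, here is a minimal sketch in Python. It is my own illustration over three made-up emails, not StoredIQ's implementation; the sample data and every function name (`term`, `wildcard`, `proximity`, `regex`, `by_hash`) are assumptions. It covers single term, Boolean, wildcard, proximity, regular expression and hash-value searches using only the Python standard library.

```python
import fnmatch
import hashlib
import re

# Made-up sample corpus (hypothetical data, including the placeholder SSN).
DOCS = {
    "email-1": "Please shred the draft contract before the audit on Friday.",
    "email-2": "The audit team will review the contract next week.",
    "email-3": "Lunch on Friday? My SSN is 123-45-6789 for the form.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build a simple inverted index: term -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for t in tokenize(text):
            index.setdefault(t, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def term(t):
    """Single term search. Combine results with & (AND), | (OR), - (NOT)."""
    return INDEX.get(t.lower(), set())


def wildcard(pattern):
    """Wildcard search over indexed terms, e.g. 'contra*'."""
    hits = set()
    for t, ids in INDEX.items():
        if fnmatch.fnmatchcase(t, pattern.lower()):
            hits |= ids
    return hits


def proximity(a, b, max_distance):
    """Documents where terms a and b occur within max_distance words."""
    hits = set()
    for doc_id, text in DOCS.items():
        tokens = tokenize(text)
        pos_a = [i for i, t in enumerate(tokens) if t == a]
        pos_b = [i for i, t in enumerate(tokens) if t == b]
        if any(abs(i - j) <= max_distance for i in pos_a for j in pos_b):
            hits.add(doc_id)
    return hits


def regex(pattern):
    """Regular expression search over the raw text."""
    return {d for d, text in DOCS.items() if re.search(pattern, text)}


def by_hash(sha256_hex):
    """Find documents whose content matches a known SHA-256 digest."""
    return {
        d for d, text in DOCS.items()
        if hashlib.sha256(text.encode("utf-8")).hexdigest() == sha256_hex
    }


if __name__ == "__main__":
    print(sorted(term("audit") & term("contract")))                    # Boolean AND
    print(sorted((term("audit") & term("contract")) - term("shred")))  # AND NOT
    print(sorted(wildcard("contra*")))
    print(sorted(proximity("shred", "contract", 3)))
    print(sorted(regex(r"\b\d{3}-\d{2}-\d{4}\b")))                     # SSN-like pattern
```

Note that every one of these techniques matches characters or positions, not meaning. That is the gap that concept-based search and natural language entities are meant to close.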

About Charles Skamser
Charles Skamser is an internationally recognized technology sales, marketing and product management leader with over 25 years of experience in Information Governance, eDiscovery, Machine Learning, Computer Assisted Analytics, Cloud Computing, Big Data Analytics, IT Automation and ITOA. Charles is the founder and Senior Analyst for eDiscovery Solutions Group, a global provider of information management consulting, market intelligence and advisory services specializing in information governance, eDiscovery, Big Data analytics and cloud computing solutions. Previously, Charles served in various executive roles with disruptive technology start-ups and well-known industry technology providers. Charles is a prolific author and a regular speaker on the technology that the Global 2000 require to manage the accelerating increase in Electronically Stored Information (ESI). Charles holds a BA in Political Science and Economics from Macalester College.