Wednesday, November 12, 2008

Pitfalls in Data Mapping and Relevant Information Collection

With the growth in the use of computers and content creation tools, the volume of electronically stored information has outstripped corporate IT administrators’ ability to keep up. Keeping all electronically generated materials is no longer viable from either a cost perspective or a litigation risk perspective. Many corporations have instituted electronic information policies that delete documents and emails after a specific period of time, e.g. three to six months after creation. However, many employees hoard information and have found ways around the information management policies, which increases litigation risk exposure for corporations.

How do employees get around the IT enforcement policies?

1. Rename Exchange Archive PST to another extension – most enforcement software does not cross-check the content of a file against its extension. Employees have learned to create an archive of their email on a regular basis and simply rename AAA.pst to AAA.doc to circumvent the email enforcement policy. Make sure that the eDiscovery software can not only find documents but also verify that each document’s content and its file type are not in conflict, so that all relevant information on the network is found.

2. Save PSTs to a USB thumb drive or hard drive – If the data is not on a company share or computer, the enforcement policy is circumvented. However, opposing counsel can depose senior officers and uncover “personal” copies of company information for discovery purposes.


3. Non-IT storage – With strict IT enforcement policies, some corporate divisions have deployed a “divisional” storage server which is outside the knowledge and purview of corporate IT. Although storing electronic information can be useful, it poses a significant litigation exposure risk. Defense and opposing counsel need to ensure that a complete organizational Data Map has been created, one that has searched the network for “Rogue Storage Sites,” to fully comply with the FRCP. Make sure that the eDiscovery software can identify all informational sources on the network; otherwise the Organizational Data Map is worthless.
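The renamed-PST trick in item 1 can be caught by comparing a file's leading bytes against its extension. What follows is a minimal sketch in Python, not a description of how any particular eDiscovery product works. The signature table is deliberately tiny, and the names MAGIC_SIGNATURES, EXPECTED_EXTENSIONS, detect_type and find_mismatches are made up for illustration.

```python
from pathlib import Path

# Leading "magic" bytes for a few common formats (illustrative, not exhaustive).
MAGIC_SIGNATURES = {
    b"!BDN": "pst",                              # Outlook personal folders file
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "ole",  # legacy Office (.doc, .xls, .ppt)
    b"%PDF": "pdf",
    b"PK\x03\x04": "zip",                        # also .docx, .xlsx, .pptx
}

# Hypothetical policy table: which extensions are acceptable for each detected type.
EXPECTED_EXTENSIONS = {
    "pst": {".pst"},
    "ole": {".doc", ".xls", ".ppt"},
    "pdf": {".pdf"},
    "zip": {".zip", ".docx", ".xlsx", ".pptx"},
}


def detect_type(path):
    """Return the detected type from the file header, or None if unrecognized."""
    with open(path, "rb") as fh:
        header = fh.read(8)
    for magic, kind in MAGIC_SIGNATURES.items():
        if header.startswith(magic):
            return kind
    return None


def find_mismatches(root):
    """List (path, detected_type) pairs whose extension conflicts with the content."""
    mismatches = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        kind = detect_type(path)
        if kind is None:
            continue  # unknown content: this sketch makes no claim about it
        if path.suffix.lower() not in EXPECTED_EXTENSIONS[kind]:
            mismatches.append((str(path), kind))
    return mismatches
```

Run against a share, find_mismatches would report AAA.doc as type "pst" if the file really is a renamed archive, while leaving genuine .doc files alone.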

With automated software tools reducing the cost of eDiscovery, corporations have the ability to deploy systems to manage their informational assets efficiently. Furthermore, with the dropping cost of eDiscovery, opposing counsel can now trust the defense’s Data Map but ask for verification with a robust eDiscovery suite, like Kazeon’s eDiscovery software, to ensure and verify compliance.

Tuesday, November 11, 2008

Info Management Technology

21st-century e-discovery technology clearly has value in that it facilitates discovery for a host of initiatives that require information access and classification. In short, it is a technology with broad information governance and management applicability, a claim that niche players can never make. This technology integrates directly with the most common data storage, email, database and data archival platforms in existence today to provide corporate clients with flexibility without re-tooling, an approach that not only extends the lifecycle, utility and value of any technology investment, but will be sure to put a smile on even the most austere CFO’s face.

Use case: IT storage management

While storage managers have any number of issues on their front burners at any given time, the one issue that may subsume them all is the management of storage growth. In an environment where one click of the send button can result in a file proliferating five hundred fold across enterprise and geographical boundaries, the concept of data deduplication has never been more relevant or important. In fact, storage managers should take heart: the pedigree of information access technology and the core competencies of its architects can be traced back to storage management. Data de-duplication is an important factor; file type mismatch reports help uncover MP3 music files cleverly renamed with other extensions, and help look into conduct that could adversely impact the organization.
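As a rough illustration of what de-duplication means at the file level, the sketch below groups files by a SHA-256 digest of their content, so that copies of the same attachment are reported together regardless of file name. It is an assumption-laden example: the function names are invented here, and real products may de-duplicate at the block or message level instead.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each content digest to the paths sharing it, keeping only true duplicates."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(str(path))
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Each value in the result is a list of paths with identical content; keeping one and referencing the rest is where the storage saving comes from.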

These tools have other uses too, particularly file size and aging reports which are of enormous benefit to the planning and budgeting for IT multi-tiered storage strategies and “green data center” initiatives, which in turn have a significant impact on disaster recovery and business continuity planning.
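A file size and aging report can be as simple as bucketing files by how long ago they were last modified and summing their sizes. The sketch below assumes three arbitrary age buckets and uses the modification time only; the bucket boundaries and the name aging_report are illustrative choices, not anything prescribed by a specific tool.

```python
import time
from pathlib import Path

# Upper bound in days and label for each bucket (assumed boundaries).
AGE_BUCKETS = [
    (90, "0-90 days"),
    (365, "91-365 days"),
    (float("inf"), "over 1 year"),
]


def aging_report(root, now=None):
    """Count files and bytes per age bucket, based on last-modified time."""
    now = time.time() if now is None else now
    report = {label: {"files": 0, "bytes": 0} for _, label in AGE_BUCKETS}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        age_days = (now - info.st_mtime) / 86400
        for limit, label in AGE_BUCKETS:
            if age_days <= limit:
                report[label]["files"] += 1
                report[label]["bytes"] += info.st_size
                break
    return report
```

The "over 1 year" bucket is the natural candidate for a cheaper storage tier, which is how a report like this feeds tiering and budgeting decisions.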

Use case: Records retention policy and management protocol

For many organizations, the perennial information management challenge is records retention. Some organizations may either have no cognizable records retention policy or have a state-of-the-art records management protocol they can never seem to get effectively implemented. Regardless of what state its records policy or management protocols are in, an organization that seeks to implement one should have a sense of the following fundamental characteristics of its data and organization:
a. File create, modify and access times;
b. File “owner”;
c. File content;
d. The rules that define which group within the organization’s internal data taxonomy the classified information belongs to, e.g. HR, finance, taxes, customers, operations, marketing, etc.;
e. The applicable regulatory framework for the organization – this consists of nondiscretionary externally mandated records retention requirements; and
f. Pending (to the extent they exist) litigation hold requirements.
1. Defensible data remediation; getting rid of files with little business value and high exposure.
Organizations that have implemented records retention programs without classification are likely over-capturing information. This means that while they get what they should, they may also capture superfluous information that has no business value and could well represent significant risk to the organization.
Records retention managers who leverage search and classification technology will be able to report on a host of information, including, but not limited to, the age of the files encountered during a network scan as well as file access and modify times. This allows records retention managers to begin data remediation based on file age and utility. Arguably, old files that are never modified or accessed have little or no business value. In fact, these files are the ones that constitute the greatest organizational threat in the form of dormant data liability: simply because they exist, they may be responsive to a litigation or regulatory request.
2. Data classification; defining that which is an organizational record.
The precursor step to defining what constitutes an organizational record is data classification. Prior to data classification, the data must be discovered and its contents accessed. In today’s world, file metadata and content become part of an enterprise index. Policies, driven by corporate stakeholder criteria, can then identify the items that qualify as “records” based on the organization’s rules and apply user-specific metadata to them.
In the records retention use case, records retention managers or policy makers can create and automate simple or complex classification rule sets that will classify and tag relevant documents with values such as “tax record,” “HR record,” “final contract,” etc. The resulting record can then be migrated to its archival resting point on secondary storage or read-only media. In short, powerful metadata tagging capabilities now allow records managers and policy makers to get their arms around vast amounts of data and apply flexible and elegant classification schemes that heretofore would have been inconceivable.
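To make the idea of a classification rule set concrete, here is a small sketch in which each rule pairs a tag with a condition on a document's path, owner or text. The Document fields, the three rules and the keywords they look for are all hypothetical; an actual deployment would draw its rules from the organization's own taxonomy.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    path: str
    owner: str
    text: str
    tags: set = field(default_factory=set)


# Each rule is (tag, predicate). The conditions are invented examples.
RULES = [
    ("tax record",
     lambda d: re.search(r"\b(form 1099|tax return)\b", d.text, re.I) is not None),
    ("HR record",
     lambda d: d.owner == "hr" or "performance review" in d.text.lower()),
    ("final contract",
     lambda d: "contract" in d.path.lower() and "final" in d.path.lower()),
]


def classify(doc, rules=RULES):
    """Apply every rule and attach the tag of each rule that matches."""
    for tag, matches in rules:
        if matches(doc):
            doc.tags.add(tag)
    return doc.tags
```

A document can carry several tags at once, which is what lets a single scan serve retention, legal and security policies together.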

Use case: Information security

In a recent study conducted by TIP, an independent IT research group founded by Gartner, EMC, Giga, and Bell Labs alumni, upwards of 70% of information security (infosec) professionals interviewed confirmed that their focus has shifted from external threats to internal threats to their information security. Some hot points for information storage managers include stemming intellectual property (IP) leakage, identifying network security gaps and managing information access. Giving infosec professionals insight into the nature of data at rest is a core foundation for infosec solutions. Questions such as, “who in the enterprise has data related to project X and where is it?” can easily be answered by leveraging regular expression content filter engines that identify and alert infosec professionals to the existence and network coordinates of certain types of information that meet the organization’s risk profiles. Infosec managers should be able to scan, locate and sequester information such as credit card data, social security information or any other pattern or keyword-based sensitive and proprietary material. Even more importantly, they should be able to conduct automated risk rankings that combine multiple metadata and content conditions. Information access technology delivers these types of solutions. Now infosec personnel can ascertain data ownership and access rights and be in a better position to help make policy decisions about avoiding future organizational exposure.
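A regular expression content filter of the kind described here can be sketched in a few lines. The example below looks for US social security number formats and for 13 to 16 digit runs that pass the Luhn checksum used by payment cards. The patterns are simplified assumptions that will miss some formats and flag some false positives; the function names are invented for this post.

```python
import re

# Simplified patterns, chosen for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        n = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0


def scan_text(text):
    """Return (kind, value) findings for SSN-like and card-like strings."""
    findings = []
    for match in SSN_PATTERN.finditer(text):
        findings.append(("ssn", match.group()))
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 16 and luhn_valid(digits):
            findings.append(("card", digits))
    return findings
```

Pairing each finding with the file path and owner gathered during a scan would give the "network coordinates" mentioned above; that bookkeeping is left out to keep the sketch short.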

Use case: Litigation discovery

1. Implementing a legal hold.
When the specter of litigation looms, counsel has a common law obligation to prevent spoliation. Legal hold implementation has been notoriously taxing on IT department operations and disruptive to custodian workflow. Effective litigation holds are predicated on the ability to identify responsive information and the ability to sequester the information in a defensible manner.

The implications and benefits of this functionality for in-house counsel and IT are:
a. Centralized and consistent collection methodology;

b. Uniformity of collection criteria;

c. Scalability – to collect from 10 or a thousand employees; from 10GB to 100TB;

d. Collection transparency; removal of the onus of collection from organizational custodians;

e. Centralized auditing of the collection steps, criteria and the data collected;

f. The ability to “tag” data at the point of collection by collector, matter, privilege, responsive term, issue or any other relevant criteria;

g. The ability to de-duplicate data prior to providing access to outside counsel;

h. The ability to use a single interface to search, “logically tag” and sequester email, loose electronic files and database records across multiple geographic locations; and

i. The ability to implement multiple “lights-out litigation holds,” a process whereby multiple automated rules (policy) engines can scan locations and identify, copy or move responsive information to litigation hold locations.
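The collection steps in the list above (uniform criteria, tagging at the point of collection, a central audit trail) can be sketched as one automated pass. The example assumes plain-text files, simple substring matching on hold terms, and a hold location that sits outside the scanned tree; apply_legal_hold and the audit record fields are hypothetical names, not any product's interface.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def apply_legal_hold(source_root, hold_root, terms, matter):
    """Copy files containing any hold term into the hold area and log each copy.

    Assumes hold_root is NOT inside source_root.
    """
    source_root, hold_root = Path(source_root), Path(hold_root)
    audit_log = []
    for path in sorted(source_root.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        text = data.decode("utf-8", errors="ignore").lower()
        hits = [t for t in terms if t.lower() in text]
        if not hits:
            continue
        target = hold_root / matter / path.relative_to(source_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 also preserves file timestamps
        audit_log.append({
            "matter": matter,
            "source": str(path),
            "copy": str(target),
            "terms": hits,
            "sha256": hashlib.sha256(data).hexdigest(),
            "collected_at": datetime.now(timezone.utc).isoformat(),
        })
    return audit_log
```

Because every custodian's data goes through the same function with the same terms, the criteria are uniform by construction, and the returned log records what was collected, why and when.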

2. Addressing the disclosure requirements of FRCP 26(a) – Identifying data storage points, quantifying data volumes and types to facilitate initial disclosures.
Simplify the entire litigation data mapping process by automatically creating a “data profile” of:

a. Network file servers;
b. User desktops and laptops;
c. Email (MS-Exchange); and
d. Databases (Oracle).

Products should automatically identify all data storage devices on a network and begin indexing the contents of the devices to which they have access, then automatically extract all metadata and file ownership information from the files they scan, including email and PST files. Depending upon the speed of the network, the configuration of the related target data source and the mix of data types, the indexing and classification should proceed at a pace of approximately one terabyte of data per day. Lastly, the solution should have built-in reporting that gives counsel summary or detailed reports on the data, which helps in providing outside counsel with the requisite information for disclosures.
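A very small version of such a “data profile” for a single file share might look like the sketch below, which counts files and bytes per file type and turns a total volume into an indexing estimate at the roughly one terabyte per day pace mentioned above. The function names are made up, the rate is just a default parameter, and a terabyte is taken to be 1024^4 bytes.

```python
from collections import Counter
from pathlib import Path

TB = 1024 ** 4  # bytes per terabyte, binary convention (an assumption)


def data_profile(root):
    """Summarize file counts and total bytes per extension under root."""
    counts, sizes = Counter(), Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: {"files": counts[ext], "bytes": sizes[ext]} for ext in sorted(counts)}


def estimated_indexing_days(total_bytes, tb_per_day=1.0):
    """Estimate indexing time at an assumed steady throughput."""
    return total_bytes / (tb_per_day * TB)
```

For example, a 5 TB file server would take about five days to index at the assumed rate, which is useful to know before committing to a disclosure timetable.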

3. Facilitating counsel effectiveness in FRCP Rule 26(b)(2)(B) and 26(b)(5)(B) meet and confer – Provide a substantive data landscape that facilitates keyword and form of production negotiations.

Now that counsel has unfettered access to data, s/he can gauge the responsiveness of electronically stored information (ESI) to various terms. This can greatly assist in developing and finalizing the production criteria, as well as in settling the form of production.

4. Reducing the likelihood of motions to compel from accessible sources.
The solution functionality and audit trail will substantiate the thoroughness of any discovery initiative to the extent the relevant data sources were accessible.

5. Saving upwards of 40% on litigation support services disbursements.
Today, corporations will pay up to $2,000 per gigabyte to “process” electronic files so they can be reviewed by outside counsel. The bulk of the processing fee is associated with: data de-duplication (getting rid of duplicate files during review), keyword searching and metadata extraction.

Newer products using information access technology perform all of these services in house at a cost of approximately $4 per gigabyte, representing a major evolution. They also generate output that can be used by common litigation support and document review applications like Concordance and Summation.

Information Management - The Big Picture

Given the explosive growth of electronic information in corporate America, managing electronic discovery is increasingly a challenge for corporate IT departments, in-house counsel and outside counsel, all of whom are stakeholders. In December of 2006, amendments to the Federal Rules of Civil Procedure (FRCP) took effect to clarify the roles, responsibilities and discovery obligations of the various parties to litigation. The amendments, for the first time, made specific reference to electronically stored information, or ESI, as it is now commonly known. The changes in attitudes toward e-discovery are noticeable and the amendments have, without question, helped create an unprecedented level of dialog and collaboration to understand how electronic information is created, used, managed and disposed of in the corporate environment.

Why, then, have the amendments, intended to reduce confusion, also introduced a level of complexity to the e-discovery process that has left a lot of people scratching their heads?

For example, corporate counsel in a defense posture is keyed in on everything from creating corporate data maps to handling multiple and complex litigation holds, as well as establishing repeatable and defensible guidelines for discovery. What happens the following week when the storage administrator retires a key server and implements his data consolidation strategy? How good is the data map then?
Records retention managers have also been significantly affected. For years, they have been seen as silent corporate operatives who had murky roles and dealt with boxes of old documents. Today, nothing could be further from the truth. They are on the front lines of protecting an organization from a data management policy perspective.

Another role that has seen significant evolution is that of the “storage administrator.” Corporate data storage administrators are IT personnel whose roles are largely characterized by their knowledge of an organization’s data growth and proliferation patterns – key factors that allow them to make recommendations as to how, when and if an organization’s data management hardware and associated software platforms need modification or change.

Another driver is the evolution of technology for e-discovery to serve both proactive and reactive use cases. The vast majority of matters today are addressed in a reactive fashion with a mind to quickly address pressing, active concerns that demand rapid retrieval of responsive ESI for early case assessments, meet and confer and other matter-specific requirements. However, the future is clear in that there is a need for consistent, repeatable and targeted e-discovery processes that can also be deployed across a company, creating an “e-discovery ready,” proactive environment.

Therefore, the answer may lie in the fact that while the amendments impose obligations on the parties, they don’t specifically state how one should go about fulfilling them. For corporations today, the old silo-based information management paradigms will not work for information discovery of any kind, for any reason. The bottom line is: litigation, storage management/data consolidation, records retention, regulatory responses, internal investigations, information security initiatives, personnel policy management, business intelligence, data mining, compliance and monitoring are all effectively subsets of what we call “e-discovery.” This new paradigm of e-discovery subsumes many previously compartmentalized departmental initiatives that are under the auspices of legal, IT, records management, HR and finance. It is predicated on the degree to which an organization has information access and the ability to perform effective data classification. In short, companies should be able to leverage enterprise data for multiple business needs from a common underlying information access and classification platform.