Information Governance Joins the Dark Side

Over the course of the last year or so, the buzz-phrase “dark data” has entered the common lexicon of data management, information management, technology and business analytics circles.  When I first heard the term my mind conjured up the image of the bat-cave wired with the technical capabilities to track Gotham City’s super villains.  Armed with my rapacious curiosity I set out on a deliberate quest to study the shadowed periphery of the information landscape.

This four-part blog series examines the emergent phenomenon known as “dark data”, with the objective of evaluating and contextualizing the trend’s influence on the discipline of information management.


My comic book daydreams aside… there is still no common definition of what constitutes dark data among those who seek to use it. For example, Gartner positions dark data as “information assets that are collected, processed and stored” over the regular course of business but that fail to be leveraged for additional business purposes (such as analytics, direct monetization or business relationships). The Gartner definition goes further, stating expressly that many organizations retain dark data only for compliance purposes, incurring additional operational expense and risk rather than gaining business value. Accordingly, Gartner’s definition seemingly pits compliance and legal risk against operational risk… vanquishing the potential for a balanced risk-based approach to data management for any organization.

On the other hand, I would define dark data as information that an organization is unable to efficiently identify, process and/or utilize but knows exists, based on the effect it has on the organization’s other information-based assets. Archivists, librarians and records managers alike are well acquainted with this reality… knowing that a large volume of an organization’s information assets has traditionally been paper-based and stored offsite, with little ability in the present to extract analytics from the collection beyond its initial indexing. Records managers know that these offsite records are there… somewhere… and work to implement organizational strategies that minimize the information asset risks associated with what they commonly call “orphaned” or even “unmanaged” record collections. This traditional information management model has taught us that there is an important difference between the information an organization can’t see and the information that hasn’t yet been discovered. Accordingly, these same patterns of practice should be considered within the framework of the digital age as many organizations begin to map out and implement their big data management strategies.

Stay Tuned for Part 2 of our Series…
Activating the Bat Signal – Shining a Light on Enterprise Information Governance.



Posted in Governance, Information Management

E-mail stronger than ever!

Every year, IM and IT specialists release their predictions for the upcoming year. John Mancini, President of AIIM, was one of them. One of his predictions caught our eye: the renaissance of email. As many of you already know, the death of email has been forecast for at least the last three years.

With the evolution of technology we can now better tame the overflow of email. To learn more about some of these new capabilities, AIIM has published a report presenting tools to help manage emails through Outlook and SharePoint. AIIM will also be presenting a free webinar on January 28th on this topic.

Posted in Governance, Information Management, Technology

Have you heard of the new bookbook?

Have you heard about the all new bookbook?

No? I’m sure you have. With more than 200 million copies printed around the world, it’s the all-new 2015 IKEA Catalogue… presented with a technological and humorous twist!

It’s a fun way to mock technology in this always highly anticipated catalogue.

Posted in Technology

EDRM Diagram: increased focus on Information Governance

EDRM recently published the third version of its Electronic Discovery Reference Model (EDRM) diagram. This version emphasizes what was previously called “information management” by renaming it “Information Governance” and by highlighting the importance of good information governance throughout the EDRM process. EDRM is also the creator of the Information Governance Reference Model. The link between the two models is now clearer than ever.

Posted in E-Discovery, EDRM, Governance, Information Management

Action required: Heartbleed vulnerability

What is Heartbleed?

Heartbleed is a vulnerability within OpenSSL, a popular software product used by many websites and network devices to provide secure connections. The vulnerability exists due to a logic error within the OpenSSL code. This flaw allows criminals to access parts of a web server’s memory that may contain sensitive information.
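To make the nature of the logic error more concrete, here is a minimal sketch in Python. It is a simplified model and not OpenSSL’s actual code: the function names, the `SECRET` value and the way memory is laid out are all assumptions made for illustration. The idea it shows is that a heartbeat request carries both a payload and a claimed payload length, and the flawed handler trusts the claimed length without checking it against the real payload.

```python
# Simplified model of a heartbeat handler. This is NOT OpenSSL's code;
# every name and value below is hypothetical and for illustration only.

SECRET = b"user=alice;password=hunter2"  # stands in for unrelated data in memory


def build_memory(payload: bytes) -> bytes:
    """Model server memory: the request payload followed by unrelated data."""
    return payload + SECRET


def heartbeat_vulnerable(payload: bytes, claimed_len: int) -> bytes:
    """Echo back `claimed_len` bytes, trusting the length sent by the peer."""
    memory = build_memory(payload)
    # Logic error: no check that claimed_len matches the real payload size,
    # so the reply can include bytes that sit after the payload.
    return memory[:claimed_len]


def heartbeat_fixed(payload: bytes, claimed_len: int) -> bytes:
    """Echo back the payload only if the claimed length is plausible."""
    if claimed_len > len(payload):
        # Discard the malformed request instead of answering it.
        return b""
    return payload[:claimed_len]


if __name__ == "__main__":
    print(heartbeat_vulnerable(b"ping", 4))   # honest request
    print(heartbeat_vulnerable(b"ping", 12))  # over-long claim leaks extra bytes
    print(heartbeat_fixed(b"ping", 12))       # malformed request is dropped
```

In this toy model, an honest request gets its own payload back, while a request that overstates its length receives part of the neighbouring data as well. The fixed handler simply refuses to answer such a request.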

How serious is this problem?

Very serious. The Heartbleed defect could expose information such as usernames and passwords, credit card information and other sensitive information sent by the user to the website, network device or mail server. Web technologies are present in devices that are not web servers, meaning you may have more at-risk technology than is immediately obvious. There is some indication that certain web browsers may be affected, although specifics are not yet known.

How KPMG can help

KPMG’s member firms have designed and implemented Cyber Security capabilities in some of the world’s largest corporations and assisted clients in handling complex security breaches. This insight provides our teams with a unique viewpoint on the building blocks for detecting and defending against cyber criminals. KPMG can assist in addressing the Heartbleed issue by:

  • Assessing systems and networks for the presence of the vulnerability
  • Performing forensic analysis of affected systems and supporting networks to identify indicators of abuse
  • Analyzing the risk associated with compromised systems


For more information on Heartbleed, including a quick decision tree, on our related services and on whom to contact across Canada, please consult our Heartbleed Slipsheet.

Posted in Cyber Security, Information Security

Roadmap for the Long Term Preservation of Digital Assets

One year after the release of the UNESCO/UBC Vancouver Declaration (PDF) on preservation of digital information, UNESCO has taken measures to achieve the objectives mentioned in the declaration.

It all started in September 2012 in Vancouver at the UNESCO Memory of the World conference, which we had the opportunity to attend. The declaration was released a few months later, in January 2013.

Following the declaration, an international conference was held in The Hague in December 2013 with the aim of bringing together different actors from ICT, government and heritage organizations. “The discussion between such diverging stakeholders showed that there was insufficient awareness between industry and heritage institutions about the relative concerns of the other and that there needed to be a platform to discuss digital preservation”.

To address this problem, UNESCO has set up PERSIST (Platform to Enhance the Sustainability of the Information Society Transglobally) with the aim of encouraging collaboration and discussion among industry, heritage institutions and government on a variety of long-term preservation issues, in order to build a proper roadmap within a year.

For more details, we suggest reading the executive summary (PDF) of the December 5-6 conference.

Posted in Governance, Information Management, Records Management, Technology

Organizational culture check

Information management initiatives focus in part on explicit, recorded information; in other words, systems are implemented that support the management of tangible assets embodied in such documents as reports, articles, meeting minutes, and training materials. In contrast, tacit knowledge, or the expertise and know-how residing in the heads of individuals, is an indispensable intellectual asset of organizations that can also be harnessed and utilized in the implementation of new systems and initiatives. Moreover, technology alone will likely be unsuccessful without a supportive organizational culture.

Research on organizational culture and values sheds light on effective strategies, favourable cultural characteristics, and best practices to enhance knowledge sharing and ultimately increase overall organizational effectiveness and competitiveness.

Proven practices:

  • Cooperation: Build cooperative relationships among employees by providing opportunities for collaboration and learning
  • Trust: Use language and behaviours that cultivate trust and demonstrate a high level of corporate commitment to employees to increase knowledge sharing
  • Organizational memory: Encourage mindfulness and respect for the past to develop a culture that improves the preservation of knowledge within the organization
  • Autonomy: Give employees freedom in their work (e.g. scheduling, approaches) to improve their willingness to collaborate
  • Avoid hierarchy and competition: A focus on rules, standardized procedures, and productivity hinders knowledge sharing

Knowledge management principles that focus on the transfer of tacit knowledge between individuals, as well as to the organization itself in its corporate memory, have significant potential for positive outcomes and should therefore be considered in the planning and implementation of information management initiatives.

Posted in Information Management, Knowledge Management, Uncategorized

Who said records and information management (RIM) were dull?

Did you know U2 has written songs about RIM? Or maybe it’s the other way around?

Jeffrey Lewis, a RIM expert, built his own Information Governance playlist of U2 songs in his latest post for AIIM’s Expert Blogs.

An article mixing U2 and RIM? Nothing better a few days before Christmas!

And you, what would your favorite RIM playlist be?

Posted in Governance, Information Management, Information Security, Records Management

KPMG Canada Helps Develop Technology that Uses Text Analytics for Priv Review

Technology-assisted review (“TAR”) is being more widely adopted as a way of saving time and money in large-scale document reviews. But virtually everyone assumes that the advanced text-analysis algorithms underlying TAR and embedded in tools like Relativity, Recommind, Clearwell and Equivio can really only be trusted to perform basic relevance coding, so-called first pass review or maybe issue-based classification.

Can you think of anyone who has suggested that these tools can perform the much more challenging and nuanced task of identifying privileged content? Most lawyers would scoff at the idea. And for very important reasons, their scepticism must be taken seriously. Entrusting priv review to a machine is a high-stakes proposition. Some work in this area as part of TREC 2010 (Interactive Task 304, see pp. 33-35) yielded poor results.

But what if a tool could be trusted to find the kind of text that lawyers should look at, to assess both privilege and waiver? And what if that tool did a better job of finding that kind of text than the average reviewer? Even a well-trained reviewer?

Working with Porfiau, a well-credentialed software development company run by a team of text-classification experts with connections to the University of Waterloo, KPMG Canada has both helped to enhance the computational tool built by Porfiau and developed a set of workflow protocols whereby a new generation of text-classification technology appears to have achieved this goal. The technology uses finite state machines and the workflow is much like those already adopted in TAR situations (start with a seed set, build the algorithm, get a first set of results, have a SME code a sample, retrain the algorithm, and so on). Initial results, based on several iterations and retrainings against a target population of over 300,000 documents, strongly suggest Porfiau’s technology, together with carefully designed processes, can identify potentially privileged content with a level of recall of 0.90 or higher.
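For readers less familiar with the metric, recall is the share of truly privileged documents that the tool manages to flag: true positives divided by the sum of true positives and false negatives. The sketch below shows how that figure could be computed from a sample coded by a subject-matter expert. The function names and the sample values are hypothetical and are not drawn from the project data.

```python
from typing import Sequence


def recall(true_positives: int, false_negatives: int) -> float:
    """Recall = TP / (TP + FN): the fraction of privileged documents found."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("recall is undefined when there are no privileged documents")
    return true_positives / total


def recall_from_sample(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Compute recall from tool predictions and expert coding of the same sample."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    # True positive: flagged by the tool and coded privileged by the expert.
    tp = sum(1 for p, a in pairs if p and a)
    # False negative: coded privileged by the expert but missed by the tool.
    fn = sum(1 for p, a in pairs if not p and a)
    return recall(tp, fn)


if __name__ == "__main__":
    # Hypothetical sample of five documents (illustrative values only).
    tool_flags = [True, True, False, True, False]
    expert_coding = [True, True, True, False, False]
    print(recall_from_sample(tool_flags, expert_coding))
```

Note that recall says nothing about how many non-privileged documents were flagged by mistake. That is measured by precision, which is why the two figures are usually reported together.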

And the tool has consistently found important potentially privileged material that the lawyers missed.

This work is described in the final section of a paper that I co-wrote with Chris Paskach and Manfred Gabriel of KPMG US, The Challenge and Promise of Predictive Coding for Privilege [PDF]. The paper is one of four selected by peer review for presentation at the DESI V Workshop in Rome this coming Friday.

Posted in Conference, Document Review, E-Discovery, Legal Technology, Predictive Coding, Technology