Room for Improvement

XCiPi has capabilities that can benefit a number of sectors. One example is Intelligent Data storage, which contrasts with Virtual Data Rooms (VDRs). VDRs are used to share information, typically during Merger and Acquisition (M&A) activities and processes such as due diligence. Generally an online cloud storage area is established, and the parties involved in the deal submit documents, in many formats, to the data store. A librarian is often appointed to keep track of the documents, and some information may be collected on who has accessed particular documents. In many cases a series of spreadsheets is used to index the documents and control the tasks that need to be performed on them, recording reviewers, review outcomes and so on. VDRs aim to provide security and global access to documents and the data within them.

The VDR market has grown at an estimated annualised rate of 16.7% and was valued at US$880 million in 2014*. The M&A market had estimated deal values of US$4.9 trillion at the end of 2015*. A look across the main VDR providers (iDeals Virtual Data Room, Intralinks Dealspace, Merrill Datasite, Brainloop Secure Dataroom, Watchdox by Blackberry, Box Virtual Data Room) shows that while four of the six provide full-text search, the offerings are all very similar in that they remain dumb storage areas. Files get mismanaged, put in the wrong places, ignored or overwritten, suffer versioning problems and demand a lot of curating. An intelligent data room would solve many of these problems. Data-aware storage for these documents would considerably reduce the time and workload of very expensive consultants.

capture-sort-explore

Capture

So how does Intelligent Data work, and how is it differentiated? The first stage is to capture the data, through file uploads and through API calls to file stores or data sources. Capturing the data held in these documents makes two vital processes possible with minimal effort. First, the data can be blended and, once combined, viewed as a single entity. Treating and storing it as data helps ensure consistency and completeness, and allows it to be queried. Second, the data can be sorted and organised. Machine learning routines can parse the documents to determine categories and suggest keywords. A mix of automated and manual processes can categorise sources and groups of information types, and track changes and access history. It can also pull out indexes, distinct sections, groups of sub data sets and metrics.
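As a rough illustration of the parsing step, the sketch below suggests keywords by word frequency and picks a category from trigger words. It is a deliberately simple stand-in for the machine learning routines mentioned above, not a description of how XCiPi does it. The stop-word list, the category rules and the function names are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for",
              "on", "with", "this", "that", "by", "be", "as", "are"}

# Hypothetical category rules: category name -> trigger words.
CATEGORY_RULES = {
    "supplier": {"supplier", "supply", "vendor"},
    "hr": {"salary", "employee", "payroll"},
}


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def suggest_keywords(text, top_n=3):
    """Return the top_n most frequent words that are not stop words."""
    counts = Counter(w for w in tokenize(text)
                     if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def suggest_category(text):
    """Pick the category whose trigger words appear most often in the text."""
    words = set(tokenize(text))
    scores = {cat: len(words & triggers)
              for cat, triggers in CATEGORY_RULES.items()}
    best = max(scores, key=scores.get)
    # Fall back to a manual-review bucket when nothing matches.
    return best if scores[best] > 0 else "uncategorised"
```

Documents that land in "uncategorised" are the ones a librarian would still need to classify by hand, which is where the mix of automated and manual processes comes in.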


Heat map, showing documents added against documents expected

Sort

Once these categorisations are made they can be stored as metadata describing the record sets that are available for searching. The metadata can also be used to derive metrics, such as the number of documents in a category or the completeness of sections dealing with a particular topic. It also makes it possible to have the same data in different places to suit different use cases, and to represent data or documents as trees, for example by setting up a hierarchy of data using Modified Preorder Tree Traversal. There is much to be gained from flexible representations of documents or data.
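For readers unfamiliar with Modified Preorder Tree Traversal, here is a minimal sketch of the idea: each node gets a left and a right number from a single depth-first walk, so that all descendants of a node are exactly the nodes whose numbers fall inside its own pair. The folder names and function names are invented for illustration.

```python
def assign_mptt(tree, root):
    """Number a tree in one depth-first walk.

    tree maps each parent to a list of its children.
    Returns {node: (left, right)}.
    """
    bounds = {}
    counter = 1

    def visit(node):
        nonlocal counter
        left = counter          # number assigned on the way down
        counter += 1
        for child in tree.get(node, []):
            visit(child)
        bounds[node] = (left, counter)  # number assigned on the way back up
        counter += 1

    visit(root)
    return bounds


def descendants(bounds, node):
    """All nodes nested inside node, in traversal order."""
    left, right = bounds[node]
    inside = [n for n, (l, r) in bounds.items() if left < l and r < right]
    return sorted(inside, key=lambda n: bounds[n][0])


def descendant_count(bounds, node):
    """Number of descendants, computed from the two numbers alone."""
    left, right = bounds[node]
    return (right - left - 1) // 2


# Hypothetical data room hierarchy.
folders = {
    "Data room": ["Legal", "Finance"],
    "Legal": ["Supplier agreements", "Employment contracts"],
}
bounds = assign_mptt(folders, "Data room")
```

In a database the two numbers would be stored as columns, so that "everything under Legal" becomes a single range query rather than a recursive walk.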


Explore

By exposing the data within the documents and making it searchable, it can be sorted into rational collections and sequences. Logic can be built into simple queries to confirm compliance, for example by matching supplier agreements with actual suppliers. Saved and reused, these simple queries can detect non-conforming states, for example contract documents with non-consecutive revision numbers and dates. Visualisations of missing document types or expected numbers can show hot spots and track when a process is near completion or requires more attention. Saved queries can be run to measure the whole data set against preconfigured templates of what the data set should comprise. Further analytic processes can be applied to reveal expected performance profiles. Machine learning procedures can be used to predict what a compliant data set should contain.
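The checks described above can be sketched in a few lines. The example below assumes, purely for illustration, that revision numbers are integers, that suppliers are identified by name, and that a template is a mapping from document category to the number of documents expected. None of these structures come from XCiPi itself.

```python
def find_revision_gaps(revisions):
    """Return the revision numbers missing between the lowest and highest seen."""
    present = set(revisions)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1)
            if n not in present]


def suppliers_without_agreement(suppliers, agreements):
    """Return suppliers that have no matching agreement on file, sorted by name."""
    return sorted(set(suppliers) - set(agreements))


def completeness(expected, actual):
    """Fraction of expected documents present per category, capped at 1.0.

    expected and actual both map category -> document count.
    """
    return {cat: min(actual.get(cat, 0) / n, 1.0)
            for cat, n in expected.items() if n > 0}
```

The per-category fractions returned by `completeness` are the kind of values a heat map of documents added against documents expected could be drawn from.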

in-use-encryption

Share

The role of a VDR is to allow data to be shared safely. Intelligent Data is a long way from the passive data stores that serve as VDRs, where the data is just plain dumb. Being dumb can be dangerous as well as inefficient and irritating. A dumb shared file space fills up with data that is no longer of use, or was never really of use in the first place. Because it is indiscriminate, it treats all data the same and considers all data equal. Some documents may contain highly sensitive data that poses a security risk. Dumb data stores are blissfully unaware of that risk and pile up documents that may contain security-sensitive information.

Before opening access to data from documents, Intelligent Data can extract the sections that are useful for analytics and discard the sections that are security sensitive but not essential for the shared context. Redacting sensitive data provides extra protection. As an example, this could involve extracting information from HR records to determine overall salary levels by job type while removing information about the individuals in those positions. Secure redaction of digital documents can be a complicated problem, and difficult and time-consuming to complete manually. Done programmatically it can be made quite simple.

Data ownership and control, together with functional access for the application, are achieved by 'encryption-in-use': processing of the encrypted data is permitted by giving the customer direct control of the process using their own encryption keys. Encryption secures data in transit, in use and at rest. Indexing, metadata and tagging allow non-sensitive generalisations and fast searches. Data transformed into textual and tabular reports and visualisations gives management overviews, and key metrics confirm compliance and uniformity.
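To make the HR redaction example concrete, the sketch below reduces individual records to average salary by job type, so that names never leave the function. The record fields (`name`, `job_type`, `salary`) are hypothetical. The `min_group_size` threshold is an added assumption rather than something described above: it suppresses job types with so few people that an average would reveal an individual's salary.

```python
from collections import defaultdict


def salary_by_job_type(records, min_group_size=1):
    """Return average salary per job type, dropping all individual detail.

    records is a list of dicts with 'name', 'job_type' and 'salary' keys
    (hypothetical field names). Groups smaller than min_group_size are
    left out so a single person's salary cannot be read off the result.
    """
    groups = defaultdict(list)
    for rec in records:
        # Only the job type and salary are carried forward; the name is ignored.
        groups[rec["job_type"]].append(rec["salary"])
    return {job: sum(salaries) / len(salaries)
            for job, salaries in groups.items()
            if len(salaries) >= min_group_size}


# Made-up records for illustration only.
hr_records = [
    {"name": "A. Example", "job_type": "engineer", "salary": 50000},
    {"name": "B. Example", "job_type": "engineer", "salary": 60000},
    {"name": "C. Example", "job_type": "analyst", "salary": 40000},
]
```

With `min_group_size=2` the single analyst is dropped from the shared result, while the engineer average is still available for analytics.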


So when we look at existing Virtual Data Rooms we find much room for improvement, and XCiPi Intelligent Data is where we seek to make it.

 