Metadata

Data Stovepipes Limit Data Value

CIOs and IT administrators rank managing data growth at the top of their concerns. Not only must they manage the cost of storing and protecting ever-increasing volumes of data, they must also determine which data is worth preserving and how to extract value from it.

The storage industry sees the rapid growth of data as a lucrative problem and positions itself as the solution. Whether it is I/O-optimized flash and disk arrays or capacity-optimized file, object, or cloud storage, there are plenty of options for storing data to meet the particular needs of the moment.

More importantly, the more data stovepipes emerge within an organization, the more difficult it becomes to manage them and to derive useful intelligence from the data scattered across them. Merely having the data does not mean it can be meaningfully exploited.

Simply building more and better storage containers does not solve the problem of managing data variety or extracting maximum value from the data. It would be like car manufacturers adding fuel tanks to vehicles in response to declining engine efficiency.

Metadata is the Key

Metadata is, literally, data about the data. Think of it as a roadmap that gives you a bird’s-eye view of everything you hold, without needing to access any of it directly. Traditional infrastructure-based approaches are like planning a trip by first driving every available route and only then deciding which is best; with a roadmap, the decision is simple, and you can choose the best possible route before you set out.

Every digital file carries multiple types of metadata. File-system metadata describes basic attributes such as size, location, name, and when the file was last modified. But files also contain much richer descriptive metadata that can enrich the roadmap and give you more information to work with, whether the file is a satellite image, the output of an MRI scanner, a genome sequence, or a medical record. The text-based contents of standard office files likewise provide greater insight when correlated with all of the other metadata available. Even the absence of metadata can be significant. Everything about the data leaves a digital fingerprint that can be analyzed to extract maximum value.
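
To make this concrete, here is a minimal Python sketch (standard library only) of the two layers just described: file-system metadata read directly from a file’s attributes, and a crude content classifier that sniffs a file’s leading bytes. Real descriptive-metadata extractors (EXIF, DICOM, and so on) go far deeper; the magic-byte table here is purely illustrative.

    import datetime
    from pathlib import Path

    def filesystem_metadata(path: str) -> dict:
        """Collect the basic file-system attributes described above:
        name, location, size, and last-modified time."""
        p = Path(path)
        st = p.stat()
        return {
            "name": p.name,
            "location": str(p.parent.resolve()),
            "size_bytes": st.st_size,
            "modified": datetime.datetime.fromtimestamp(st.st_mtime).isoformat(),
        }

    # Descriptive metadata needs a format-aware reader. As a stand-in,
    # sniff the leading "magic bytes" to classify the file; real
    # extractors go far deeper into format-specific structure.
    MAGIC = {
        b"\x89PNG": "image/png",
        b"\xff\xd8\xff": "image/jpeg",
        b"%PDF": "application/pdf",
    }

    def sniff_content_type(path: str) -> str:
        with Path(path).open("rb") as f:
            head = f.read(8)
        for magic, kind in MAGIC.items():
            if head.startswith(magic):
                return kind
        return "application/octet-stream"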

Storage-centric solutions to data management cannot provide intelligence about the data they store; they were never designed to. Metadata, by contrast, can be coalesced without moving the data at all. It provides an intelligent roadmap for data management and enables new insights, without fundamentally altering the underlying infrastructure.
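
As an illustration of coalescing metadata in place, the following Python sketch (not Mediaflux code) walks several storage roots, records each file’s file-system metadata in a single SQLite index, and then answers a discovery query against that index. The files themselves never move.

    import sqlite3
    from pathlib import Path

    def build_catalog(db_path: str, roots: list) -> None:
        """Coalesce file-system metadata from several storage roots
        into one queryable index; the underlying files never move."""
        con = sqlite3.connect(db_path)
        con.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            " path TEXT PRIMARY KEY, name TEXT,"
            " size_bytes INTEGER, modified REAL)"
        )
        con.execute("CREATE INDEX IF NOT EXISTS idx_size ON files(size_bytes)")
        for root in roots:
            for p in Path(root).rglob("*"):
                if p.is_file():
                    st = p.stat()
                    con.execute(
                        "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                        (str(p), p.name, st.st_size, st.st_mtime),
                    )
        con.commit()
        con.close()

    def find_large_files(db_path: str, min_bytes: int) -> list:
        """Discovery runs against the compact index, not the stores."""
        con = sqlite3.connect(db_path)
        rows = con.execute(
            "SELECT path, size_bytes FROM files"
            " WHERE size_bytes >= ? ORDER BY size_bytes DESC",
            (min_bytes,),
        ).fetchall()
        con.close()
        return rows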

Mediaflux: Leveraging the Power of Metadata

Mediaflux is a comprehensive data + metadata management software platform that addresses the need to classify and order data so it is readily available to those with the authority to access and analyze it, from sensor to field deployment, from data acquisition to decision. Metadata is the key to rapid data discovery: searching metadata is orders of magnitude faster than searching the data itself. Metadata can be:

  • Automatically extracted as data is ingested – for example, geospatial coordinates or bounding boxes, image types and resolutions, and text can all be extracted by plug-in content analyzers (a minimal sketch of such a plug-in follows this list).
  • Automatically generated – for example, revision histories and audit trails.
  • User generated – existing metadata may be updated or new metadata added manually at any time. Examples are annotations, labels, tags, comments, and workflow-specific actions.
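
The sketch below shows one way such a plug-in analyzer mechanism might look in Python. The registry, the analyzer decorator, and the ingest function are hypothetical illustrations of the idea, not the actual Mediaflux plug-in API.

    import datetime
    from pathlib import Path
    from typing import Callable, Dict

    # Hypothetical plug-in registry, loosely modelled on the content
    # analyzers described above -- not the actual Mediaflux API.
    ANALYZERS: Dict[str, Callable[[Path], dict]] = {}

    def analyzer(suffix: str):
        """Register a metadata extractor for files with this suffix."""
        def register(fn: Callable[[Path], dict]):
            ANALYZERS[suffix] = fn
            return fn
        return register

    @analyzer(".txt")
    def text_analyzer(path: Path) -> dict:
        # Extracted metadata: a simple word count from the text body.
        text = path.read_text(errors="replace")
        return {"word_count": len(text.split())}

    def ingest(path: Path) -> dict:
        """Run the matching analyzer (if any) on ingest, then merge its
        output with automatically generated metadata (a timestamp)."""
        extracted = ANALYZERS.get(path.suffix, lambda p: {})(path)
        generated = {
            "ingested_at": datetime.datetime.now(datetime.timezone.utc).isoformat()
        }
        return {**generated, **extracted}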

Metadata can conform to any standard or to your own customized schema.

All metadata is managed using XML to ensure maximum system and application interoperability.
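
As a small illustration, the following Python snippet builds one such XML metadata record with the standard library’s ElementTree. The element names (asset, modality, tags) are invented for this example and stand in for whatever standard or custom schema applies.

    import xml.etree.ElementTree as ET

    def build_record(name: str, modality: str, tags: list) -> str:
        """Serialize one metadata record as XML. The element names
        here are invented for the example."""
        record = ET.Element("asset")
        ET.SubElement(record, "name").text = name
        ET.SubElement(record, "modality").text = modality
        tag_el = ET.SubElement(record, "tags")
        for t in tags:
            ET.SubElement(tag_el, "tag").text = t
        return ET.tostring(record, encoding="unicode")

    print(build_record("scan-0042", "MRI", ["brain", "baseline"]))
    # <asset><name>scan-0042</name><modality>MRI</modality>
    # <tags><tag>brain</tag><tag>baseline</tag></tags></asset>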

Metadata, a core Mediaflux capability, is the key not only to selecting the best possible route, but to reaching the destination in the fastest possible time.