There are few topics that generate as much confusion as the Data Life Cycle, thanks to the flood of contradictory content online. The goal of this post is to help you understand how data actually matures over its existence, and how you can turn that knowledge into processes that grow your data from an infant obligation into a mature asset.
Let’s begin by sharing the most comprehensive visualization of the Information Life Cycle that you will find online:
Next, I will explain the seven phases in the picture above that your data can evolve through. Let me warn you: without a management plan and supporting actions, data does not progress automatically through the phases, and in many cases it never reaches its full potential. Instead, it limps through the cycle. As a data steward, it is your privilege and responsibility to see your data flourish. Here’s how:
Create / Collect
Creation or collection of data results in the first manifestation of your most precious asset: raw data! However, just like ore that contains traces of gold, it takes a lot of refinement before the true value of your data unfolds.
It is important to realize that data creation or collection needs to be a three-stage process. It begins with a PLAN that answers the big W’s (What, Why, Where, When) and outlines the all-important HOW. After the actual assembly of the data, either produced from observations through instruments or collected from an existing source, the data must be checked in a stage frequently referred to as quality assurance.
Let me state this plainly.
Only data that is the result of an executed plan, that was properly collected or created, and that has passed a previously defined quality-assurance check can be promoted to the next lifecycle phase. Much of the common data mess found in research labs and enterprises is the result of haphazard, unchecked bit-pollution. If you want to be FAIR to your data (more on this later), you will not allow it to be born into a torn patchwork family, but rather into a well-established household with plans, rules, and order. As Tolstoy said, “Happy families are all alike; every unhappy family is unhappy in its own way”; it’s the same with your data.
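The plan-then-check gate described above can be sketched in a few lines. This is a minimal illustration, not a prescribed schema: the field names and QA rules are assumptions chosen for the example.

```python
# Hypothetical QA gate: only records that satisfy a previously defined
# plan (required fields, valid types) are promoted to the next phase.
REQUIRED_FIELDS = {"sample_id", "value", "unit", "collected_at"}

def passes_qa(record: dict) -> bool:
    """Return True only if the record satisfies the predefined QA checks."""
    if not REQUIRED_FIELDS <= record.keys():
        return False  # incomplete records stay out of the lifecycle
    if not isinstance(record["value"], (int, float)):
        return False  # non-numeric measurements are rejected
    return True

def promote(records: list) -> list:
    """Promote only QA-checked records to the next lifecycle phase."""
    return [r for r in records if passes_qa(r)]

raw = [
    {"sample_id": "S1", "value": 4.2, "unit": "mg/L", "collected_at": "2021-03-01"},
    {"sample_id": "S2", "value": "n/a", "unit": "mg/L", "collected_at": "2021-03-01"},
]
print(len(promote(raw)))  # → 1
```

The point is not the specific checks but the gate itself: unchecked records never enter the lifecycle at all.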
Describe
Description of data is without a doubt the milestone in a data’s life with the greatest impact on its future career. A mishap here turns a promising future Nobel laureate into a misguided social annoyance. Describing data is the process of creating INFORMATION: the combination of some RAW DATA with a set of METADATA that captures what it actually is that graduated from phase one of the lifecycle.
Metadata, or data about your data, preserves for future users all the details that went into the planning, creation, and collection process. It provides a framework within which a user can position your data and derive value from it.
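The pairing of raw data with its metadata can be made concrete with a tiny sketch. The metadata keys below simply mirror the W’s from phase one; they are illustrative, not a standard.

```python
# Minimal sketch of "information = raw data + metadata".
from dataclasses import dataclass

@dataclass
class Information:
    raw_data: list   # the measurements themselves
    metadata: dict   # data about the data: the W's and the HOW

    def is_described(self) -> bool:
        """Raw data only graduates to information once key context exists."""
        required = {"what", "why", "where", "when", "how"}
        return required <= self.metadata.keys()

info = Information(
    raw_data=[4.2, 4.3, 4.1],
    metadata={"what": "dissolved oxygen (mg/L)", "why": "baseline study",
              "where": "site A", "when": "2021-03-01", "how": "probe XYZ-7"},
)
print(info.is_described())  # → True
```

Notice that the raw numbers alone are meaningless; only the bundle of both parts qualifies as information.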
This is where we find one major problem with most alternative representations of the “Data Lifecycle”. Once phases 1 and 2 are completed successfully, your DATA has undergone a metamorphosis and emerged as a beautiful piece of INFORMATION that can have a butterfly effect on your business. At this stage it no longer makes sense to speak about DATA. This is where the INFORMATION LIFECYCLE begins. Information is what needs to be passed on to the next stage; information brings power and value.
Ironically, we have all used DATA at some point in our careers. Those were the frustrating moments when someone handed over a file of numbers and statistics that simply made no sense when we looked at them. You might recall desperate attempts to open related reports or to talk to colleagues and peers to put some meaning to the symbols in front of you. What you were dealing with was the most frequently encountered type of bit-pollution in databases and data warehouses today: unevolved raw data that can only be understood by its creator.
Use
Most of our data-science problems originate in a simple mistake: people attempt to do science with data. It is a problem that could be remedied if they used information instead. A conscientious data steward will make sure that information (data + metadata) is the only object with which any productive work is performed.
Let’s assume that you are one of those good stewards and your information is ready for usage in phase 3. What exactly happens now? Well, this is the great thing about information: anything goes at this stage! Your innovation and abilities are the only limits to the value you generate from your information. Whatever your ingenuity decides to do at this stage, the result will always be some new kind of information: information about customers, information about products, information about nature, maybe even information about your previous information!
Beware: do not use information at this stage to produce data once more. That would be the equivalent of humans giving birth to in-vitro implanted monkeys, which is to say an evolutionary step down.
Share
In the previous phase your information begat more information! True business value and an expansion of our current knowledge frontier can only happen if your insights are shared. There are many distribution channels used to communicate results, from PowerPoint to peer-reviewed publications. Sharing is the graduation party of your information, the moment where it can shine and glory in its own importance. For many myopically minded members of mankind, this is the end of the lifecycle.
It is unfortunate if a young and promising football player is injured in his first NFL game, before he ever experiences the full benefits of a professional football career. Similarly, a piece of information that shone only once and was then discarded into the trash can of history has shortchanged its potential.
In order to let your information grow to its full capacity, it has to be delivered to an appropriate first employer, full of great examples, role models, processes, and opportunities to grow and improve. The necessary next phase requires the data steward to move the information into a long-term domicile.
Archive
“Archive” – a word with terrible connotations. A dusty data grave that will inflict deadly curses on you if you unravel its mummified contents. You might fear that sending information into an archive damns it to eternal hellfire; however, the contrary is true. A living archive, as required by the information lifecycle, is true exaltation for information. While the information was previously separated from the wide universe of its siblings and cousins by the narrow scope of its original use, it is now united with all the information your enterprise or lab has collected in the past. Completely new analytics and integration paths become available when a sufficiently large base of well-harmonized information can be queried.
Sometimes the term data preservation is employed to give this phase a more positive ring, and while it is not such a dusty term, it captures only part of the archive’s responsibilities. Preservation, or keeping the information alive, is necessary but not sufficient to reach the most desired last phase of the lifecycle, the REUSE phase, after which the next iteration begins.
Reuse
Reuse is the holy grail of data stewardship. If all your information produces value repeatedly, you will experience exponential growth in all your endeavors. Nevertheless, there is no free lunch. In order to reuse your information, the archive has to perform basic but absolutely mandatory tasks: it has to make your information FINDABLE and ACCESSIBLE, and it has to guarantee INTEROPERABILITY. This means the archive keeps the information at all times in a state where search operations can quickly identify it and give a user the ability to retrieve it. Thanks to well-managed information policies, such as strict adherence to data standards (like Allotrope) and preservation best practices (like OAIS), the archive maintains all information in a state where it can be compared and integrated with all other information.
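The FINDABLE and ACCESSIBLE requirements can be sketched as a tiny metadata index. A real archive, for instance one conforming to OAIS, is vastly richer; the structure below is purely illustrative.

```python
# Illustrative sketch: an archive that stays findable via an inverted
# index over metadata values, and accessible via direct retrieval.
from collections import defaultdict

class Archive:
    def __init__(self):
        self._store = {}                 # id -> (raw_data, metadata)
        self._index = defaultdict(set)   # metadata value -> matching ids

    def deposit(self, info_id, raw_data, metadata):
        self._store[info_id] = (raw_data, metadata)
        for value in metadata.values():
            self._index[value].add(info_id)

    def find(self, term):
        """Findable: search operations quickly identify matching items."""
        return sorted(self._index.get(term, set()))

    def retrieve(self, info_id):
        """Accessible: data and metadata are retrieved together."""
        return self._store[info_id]

archive = Archive()
archive.deposit("exp-001", [4.2, 4.3], {"what": "dissolved oxygen", "where": "site A"})
archive.deposit("exp-002", [7.1], {"what": "pH", "where": "site A"})
print(archive.find("site A"))  # → ['exp-001', 'exp-002']
```

Interoperability is the part a toy index cannot show: it comes from the shared standards and harmonized vocabularies the metadata itself adheres to.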
This illustrates how important an archive is in achieving the goal of truly FAIR data, which, you now realize, should actually be termed Findable, Accessible, Interoperable, and Reusable information.
Not to repeat myself too much, but just to make sure this message hits home: if reusing your data consistently requires enormous effort, you are not developing your most important company assets the way they deserve!
If you treat your data kindly, let it grow into information, and are consistently FAIR to it through information-lifecycle-management-compliant software (like the ZONTAL Space platform), you will reap ongoing reuse and competitive advantage from your core assets.
Delete
Even the best things have to come to an end, and sometimes you might decide it is better for certain information to be made unavailable. To do this, an exit from the ever-iterating lifecycle loop needs to be triggered from the archiving system. A delete can come in different shapes and forms: it could remove the data, the metadata, or both (the information). Based on your needs (perhaps due to regulatory requirements), an archiving system must provide this flexibility.
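The three delete scopes (data only, metadata only, or the whole information object) can be sketched as follows. The function and record layout are assumptions for illustration, not a real archiving API.

```python
# Hypothetical sketch of a flexible delete with three scopes.
def delete(record: dict, scope: str = "information") -> dict:
    """scope is one of 'data', 'metadata', or 'information' (both)."""
    result = dict(record)  # leave the original untouched
    if scope in ("data", "information"):
        result.pop("raw_data", None)
    if scope in ("metadata", "information"):
        result.pop("metadata", None)
    return result

record = {"id": "exp-001", "raw_data": [4.2], "metadata": {"what": "DO"}}
trimmed = delete(record, scope="data")   # raw data removed, metadata kept
print(sorted(trimmed))  # → ['id', 'metadata']
```

Keeping metadata while deleting data (or vice versa) is exactly the kind of distinction regulatory retention rules may demand.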
Well-groomed data is like a beloved pet. If you invest in its life early, tame it, and train it, you will have an abundance of joy with it. What’s more, it can produce offspring that revolutionize the breed and become far more valuable than you might expect today.
I hope this blog makes you rethink your relationship with your data and your responsibility to turn it into information. If you are looking for a companion on your journey who can assist you in creating a data management plan, or for a product that guides the necessary change processes in your team, contact us. We are here to enable you!
In the last three months, we have successfully completed our effort to deliver the ZONTAL Space 2.1 platform.
The result is a not-so-ordinary digital archive.
Unlike typical data management solutions, our unique approach to information lifecycle management equips any enterprise with the tools to achieve tangible, scalable results without sacrificing data integrity. ZONTAL Space 2.1 benefits both scientific and non-scientific organizations. That’s right: for the first time, even highly regulated enterprises can use and act on data across all departments, business units, and even divisions, fueling machine learning and AI with all the information available.
We owe great thanks to the early adopters of 2.0 for their feedback. Community input has been key to improving usability at every stage of the data value chain. Our 2.1 release introduces automated workflows, extended public web service APIs, electronic signatures, and a user interface for file submission.
Explore new and exciting ZONTAL Space applications as part of our next live demo Q&A.
High Integrity Automated Workflows in ZONTAL Space
From the beginning, we launched ZONTAL Space with the purpose of building a platform that equips the entire data value chain. With the introduction of automated workflows, we pioneered new ways to reduce the workload for enterprise stakeholders. The system automates the archiving process, saving both Business and IT time and money.
Minimizing lower-level administrative tasks means innovators can focus on innovating rather than record keeping. Moreover, any automated workflow can be saved as a standard reference template to be leveraged by any other department seeking to collaborate on a shared data initiative. This is key to the success of future information lifecycle management efforts, which will require more data governance and a greater variety of standard operating procedures over time.
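The idea of a workflow saved as a reusable template can be sketched generically. This is not the actual ZONTAL implementation; the step names and structure are invented for illustration.

```python
# Illustrative sketch: a workflow template that any department can apply
# to its own payloads, keeping the same governed sequence of steps.
from dataclasses import dataclass

@dataclass(frozen=True)
class WorkflowTemplate:
    name: str
    steps: tuple  # ordered step names, e.g. ingest -> qa-check -> archive

def run(template: WorkflowTemplate, payload: str) -> list:
    """Apply each step of the template to the payload (steps are recorded)."""
    return [f"{step}:{payload}" for step in template.steps]

archiving = WorkflowTemplate("standard-archiving", ("ingest", "qa-check", "archive"))
print(run(archiving, "exp-001"))
# → ['ingest:exp-001', 'qa-check:exp-001', 'archive:exp-001']
```

Because the template is immutable and shared, two departments running it produce comparably governed results.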
New Extended Public APIs
For any department in any enterprise, there is a way of doing things. Our public web service APIs give anyone using data the freedom to integrate with the applications that are most suitable for their unique business unit needs. Whether you are ingesting data into the system, searching for information in it, or retrieving data from it, ZONTAL Space 2.1 gives you the ability to work within your own desired context.
The result: your unique way of doing things has no impact on how others discover and use data across the organization. Because ZONTAL Space is application agnostic and integrates seamlessly into any business process, other business units may elect to view and analyze that same data using whatever app or dashboard serves their own use case.
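A thin client wrapper is one common way to consume such ingest/search/retrieve web services. The endpoint paths below are placeholders invented for this sketch, not documented ZONTAL Space routes; consult the actual API reference for the real ones.

```python
# Hypothetical sketch of wrapping REST-style ingest/search/retrieve
# endpoints behind one small client. Only URL construction is shown.
from urllib.parse import urlencode, urljoin

class ArchiveClient:
    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/") + "/"

    def ingest_url(self) -> str:
        return urljoin(self.base_url, "api/v1/submissions")

    def search_url(self, **query) -> str:
        return urljoin(self.base_url, "api/v1/search") + "?" + urlencode(query)

    def retrieve_url(self, item_id: str) -> str:
        return urljoin(self.base_url, f"api/v1/items/{item_id}")

client = ArchiveClient("https://archive.example.com")
print(client.search_url(q="dissolved oxygen"))
# → https://archive.example.com/api/v1/search?q=dissolved+oxygen
```

Each business unit can wrap the same endpoints in its own tooling, which is precisely the application-agnostic point made above.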
In the last six months, we have successfully completed our effort to make the ZONTAL Space platform regulatory compliant. The introduction of electronic signatures in release 2.1 was the final step to making ZONTAL Space fully compliant with laws, regulations, and corporate policies, introducing a highly secure, tamper-proof platform to the market. Most businesses still use paper records, because alternative data management solutions today often fail to meet the Code of Federal Regulations.
The future starts here:
- ZONTAL Space 2.1 is compliant with Title 21 CFR Part 11 of the Code of Federal Regulations. Our electronic signature is considered trustworthy and reliable, and makes our platform “equivalent to paper archives”.
- GxP compliance means users follow “best practice” guidelines and regulations that ensure a product is safe and fit for its intended use. GxP guides quality in regulated industries including food, drugs, medical devices, and cosmetics.
New Submission User Interface
ZONTAL Space provides a variety of ways to submit data: public web service APIs, file share monitoring, and even integration with laboratory instruments for near-real-time data aggregation. With release 2.1 we add another member to the family: a new submission user interface, a place where you drop your files as your business process progresses or completes.
This new “drop box” provides greater flexibility, especially for less automated processes, while maintaining full control over enterprise data access policies and the contextual metadata required to preserve your data’s value over time.
In short: Data Innovation at Enterprise Scale
- Share data and documents within or across your organization
- Discover 100% of the information available to make better, more informed decisions
- Ensure compliance with regulations, even in ad-hoc business processes
- Manage data access permissions and retention policies with ease
- Reduce risk of data loss while maximizing returns on highly discoverable data sets
- Compound the value of existing big data, machine learning and AI efforts