As Google chief Eric Schmidt said, “While it took from the dawn of civilization to 2003 to create 5 exabytes of information, we now create that same volume in just two days!” Unstructured data is growing faster than structured data. According to a 2011 IDC study, unstructured data will account for 90 percent of all data created in the next decade.
So what exactly is unstructured data?
Unstructured data is data that is variable in nature and comes in many different formats, e.g. text, video, images, documents, and speech. Examples include voice recordings of customer service interactions, customer emails, blog posts, and social media content.
So, why is unstructured data useful for growing a business and how should marketers approach the zettabytes of unstructured information being created every day? Below are a few key principles that can be used as guidelines:
Ensure you have the appropriate technology: If you are going to analyze unstructured data on your own, adopt non-relational technologies like MongoDB, which are well suited to storing unstructured data, are schema-less, and scale horizontally. You should also own or license technology that can analyze unstructured data, e.g. NLP engines for text, voice analytics technology, speech-to-text transcribers, social graph analytics software, and machine data analysis tools.
Prioritize unstructured data:
- Data rich in content: Ultimately, analyzing unstructured data is worth the effort only if the data is rich in content. For example, speech recordings of customer conversations with customer care/service representatives are far richer in content than typed or handwritten notes, which contain only high-level information and may miss critical customer sentiment.
- Data that can be linked to an individual: If this data can be linked to an individual customer, it can be combined with other pieces of information related to that customer to enable intelligent business decision making. For example, the sentiment expressed in a customer service call can be highly predictive of customer attrition.
Today, several kinds of unstructured data, e.g. social media data such as Twitter posts, can be difficult to link back to an individual for privacy reasons. However, there is plenty of opportunity to link voice, speech, and text data that is often available and can be used without violating privacy requirements.
Extract relevant signals from unstructured data and combine with structured data:
Unstructured data is noisy, so it is critical to extract high-value structured signals from it (e.g. sentiment scores and emotion categories) and combine them with other structured information, such as customer transactional and/or behavioral data, to gain a more holistic view of your customer. Follow-up testing to determine whether these sentiment/emotion signals provide incremental value is very important. It is also a good idea to save these structured signals over time for trend analysis, but beware of keeping the raw unstructured data for too long, as it becomes costly to store.
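As a rough illustration of this principle, the Python sketch below turns call transcripts into a sentiment score and joins that score to a structured customer record. Everything in it is an assumption made for illustration: the word lists, the `sentiment_score` and `combine` functions, the `CustomerView` record, and the sample data are all hypothetical, and a real project would likely replace the toy lexicon with a proper NLP engine.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon (an assumption for illustration only);
# a production system would use a trained sentiment model instead.
POSITIVE = {"great", "love", "helpful", "thanks", "happy"}
NEGATIVE = {"cancel", "frustrated", "angry", "terrible", "refund"}


def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1]; 0.0 when no lexicon word is found."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


@dataclass
class CustomerView:
    customer_id: str
    monthly_spend: float              # structured, transactional signal
    avg_sentiment: Optional[float]    # structured signal derived from text


def combine(
    spend_by_customer: Dict[str, float],
    transcripts: List[Tuple[str, str]],
) -> List[CustomerView]:
    """Join per-customer spend with the average sentiment of their transcripts."""
    scores: Dict[str, List[float]] = {}
    for customer_id, text in transcripts:
        scores.setdefault(customer_id, []).append(sentiment_score(text))

    views = []
    for customer_id, spend in spend_by_customer.items():
        customer_scores = scores.get(customer_id)
        # Customers with no transcripts get None rather than a made-up score.
        avg = sum(customer_scores) / len(customer_scores) if customer_scores else None
        views.append(CustomerView(customer_id, spend, avg))
    return views


if __name__ == "__main__":
    spend = {"c1": 120.0, "c2": 80.0}
    calls = [
        ("c1", "I am frustrated and want to cancel"),
        ("c1", "Thanks, the agent was helpful!"),
    ]
    for view in combine(spend, calls):
        print(view)
```

Only the small derived score needs to be kept over time for trend analysis; the raw transcripts can be discarded once the signal is extracted, which fits the storage caution above.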
Monitor brand health at the aggregate level: Even if unstructured data is not linkable to the individual level, it can be useful by providing an early read on how consumers are feeling and talking about your brand. Leveraging commercially available social media monitoring tools to listen and act accordingly can be extremely insightful.
There is a lot of hype around big data and unstructured data. Unstructured data is massive, ever growing, and noisy, so it is critical to extract only the high-value signals and to do so efficiently.
## ## ##
About the Author:
Niren Sirohi is Vice President, Predictive Analytics at iKnowtion and is responsible for leading the company’s predictive analytics practice. For more than 20 years, Dr. Sirohi has been developing and implementing strategic analytic solutions for global brands across a variety of industries, including financial services, retail, consumer goods, hospitality, and telecommunications.
Finding the needle in the proverbial haystack has never been more relevant. If you would like to learn more about how your organization can start to explore the power of unstructured data, feel free to contact me at firstname.lastname@example.org.