We should see a 4,300 percent increase in annual data production by the year 2020. Less than two years from today, according to Medium.com, every person on this planet will produce 1.7 megabytes of data per second. Read more mind-blowing stats in the Medium.com article, “Making decisions with data — The importance of data preparation.”
So, what’s that mean for those of us trying to find the bullseye for that segmented group we are hoping to attract? It means we have a lot of statistics at our disposal, and we need to use this data effectively to succeed. Most importantly, we need to collect the right data and sift through it efficiently.
In the worst-case scenario, that’s a whole lot of wasted expense and labor for unnecessary information. In the best case, we have a streamlined arrow headed for the target.
We find that the key to data prep is collaboration between departments: a business approach where tech experts, marketing experts, and virtually everyone else in the company work as a team.
What’s Data Preparation and How Can You Use It?
Data preparation requires a clear process and understanding of what you’re trying to achieve.
The data prep process:
- Collect: Create a dataset. Sounds simple, right? Begin with a solid foundation. According to a 2017 SAS article, “Why analytical data preparation is so important,” sloppy data prep collection leads to inefficient, wasted data.
- Clean: Now it’s time to sort through the data. Where are there discrepancies? Is the data reliable and valid? Do your data sources make the grade? You should see consistencies when cross-checking columns. Take cleansing seriously; any missteps could influence future projects on a variety of levels.
- Consolidate: This is where the clean data replaces the dirty until, finally, the data is stored.
According to the article, “Why data preparation should not be overlooked,” found at Data Science Central, aligning and transforming the data you’ve collected and cleansed will influence your company’s future data standards.
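To make the three steps above concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the records, the field names (customer_id, email, spend), and the validation rules are all made up, and a real project would likely use the tools your IT team recommends.

```python
# Collect: hypothetical raw records pulled from two sources.
# Every field name and value here is an assumption for illustration.
raw_records = [
    {"customer_id": "101", "email": "Ann@Example.com ", "spend": "120.50"},
    {"customer_id": "102", "email": "bob@example.com", "spend": "n/a"},
    {"customer_id": "101", "email": "ann@example.com", "spend": "120.50"},
    {"customer_id": "", "email": "nobody@example.com", "spend": "15.00"},
]


def clean(record):
    """Return a normalized copy of the record, or None if it fails validation."""
    customer_id = record["customer_id"].strip()
    if not customer_id:
        return None  # no key to consolidate on
    try:
        spend = float(record["spend"])
    except ValueError:
        return None  # amount cannot be parsed as a number
    return {
        "customer_id": customer_id,
        "email": record["email"].strip().lower(),
        "spend": spend,
    }


def consolidate(records):
    """Keep one clean record per customer_id; later records replace earlier ones."""
    store = {}
    for record in records:
        store[record["customer_id"]] = record
    return store


# Clean: drop records that fail validation and normalize the rest.
cleaned = [c for c in (clean(r) for r in raw_records) if c is not None]
rejected = len(raw_records) - len(cleaned)

# Consolidate: one stored record per customer.
store = consolidate(cleaned)

print(f"collected={len(raw_records)} cleaned={len(cleaned)} "
      f"rejected={rejected} stored={len(store)}")
```

Counting the rejected records, rather than silently dropping them, gives the team a number to discuss when deciding whether a data source makes the grade.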
Data Preparation Tools and Tips
Here are some practical tools and tips for each step of the process:
- Collection tools: Avoid a cracked foundation by creating an observation and performance window. Here, you and your team choose the period for the predictor variables and define the target variable. Look at the model in the SAS article.
The observation window may include historical data like a response to a specific marketing promotion. In this case, use the responses, or lack thereof, as your target variable. Other data sources could include credit bureau data, customer demographics, customer product data, social data, and customer transactional data.
- Cleansing tools: Excel can only go so far. Consider other available tools: data integration platforms (ETL), data warehouse automation, data discovery, and data science/mining tools. If this is out of your wheelhouse or budget, then ask IT for suggestions. Generally, standards like frequency tables and plots are valuable for any data cleanse.
Don’t be afraid to get down-and-dull. Compare summarized data against existing data. Do they validate each other?
- Consolidation tools: Data continues to grow in complexity and sophistication. As a result, consolidating and classifying the data is ever-evolving. Make sure you and your IT team keep up-to-date with the latest in analytical techniques during ETL.
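The observation and performance window idea from the collection tip can be sketched in a few lines of Python. This is an illustration, not the model from the SAS article: the window dates, the event log, and the names (purchases_in_observation, responded) are assumptions. The sketch counts purchases in the observation window as a predictor and treats a promotion response in the performance window as the target.

```python
from datetime import date

# Hypothetical window boundaries, chosen only for illustration.
OBSERVATION_START = date(2017, 1, 1)
OBSERVATION_END = date(2017, 6, 30)    # predictors come from this period
PERFORMANCE_START = date(2017, 7, 1)
PERFORMANCE_END = date(2017, 9, 30)    # the target is measured in this period

# Made-up event log: (customer_id, event_type, event_date).
events = [
    ("A", "purchase", date(2017, 2, 10)),
    ("A", "purchase", date(2017, 5, 3)),
    ("A", "promo_response", date(2017, 7, 15)),
    ("B", "purchase", date(2017, 3, 22)),
    ("B", "purchase", date(2017, 8, 1)),         # outside the observation window
    ("C", "promo_response", date(2017, 10, 5)),  # after the performance window
]


def build_modeling_rows(events):
    """Build one row per customer: a predictor count and a 0/1 target."""
    rows = {}
    for customer_id, event_type, event_date in events:
        row = rows.setdefault(
            customer_id, {"purchases_in_observation": 0, "responded": 0}
        )
        if event_type == "purchase" and OBSERVATION_START <= event_date <= OBSERVATION_END:
            row["purchases_in_observation"] += 1
        elif event_type == "promo_response" and PERFORMANCE_START <= event_date <= PERFORMANCE_END:
            row["responded"] = 1  # a lack of response leaves the target at 0
    return rows


rows = build_modeling_rows(events)
for customer_id in sorted(rows):
    print(customer_id, rows[customer_id])
```

Keeping the two windows from overlapping means the predictors never include information from the period in which the target is measured.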
Further reading: BARC’s article, “Data preparation: Refining raw data into value,” is a terrific resource. BARC conducted an independent survey to break down valuable data prep trends.
Dig through the survey to find a variety of data warehouse automation tools now available, not to mention sophisticated data discovery and data science/mining programs. Look at the section labeled “3. The right tools” for the specific tools and methods most used.
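Before investing in a dedicated tool, you can try the two cleansing checks mentioned in the tips above, a frequency table and a comparison of summarized data against existing data, with nothing more than standard Python. The records, field names, and the reported total below are hypothetical and exist only for illustration.

```python
from collections import Counter

# Made-up records; the field names and values are assumptions.
records = [
    {"state": "CA", "amount": 40.0},
    {"state": "ca", "amount": 25.0},
    {"state": "NY", "amount": 60.0},
    {"state": "N.Y.", "amount": 10.0},
    {"state": "CA", "amount": 15.0},
]

# Frequency table: inconsistent spellings show up as separate, low-count values.
frequency = Counter(r["state"] for r in records)
for value, count in frequency.most_common():
    print(f"{value:<5} {count}")

# Cross-check: compare a summarized total against a figure from an existing report.
# REPORTED_TOTAL is a hypothetical number, for example from a finance system.
REPORTED_TOTAL = 150.0
computed_total = sum(r["amount"] for r in records)
totals_match = abs(computed_total - REPORTED_TOTAL) < 0.01
print(f"computed={computed_total:.2f} reported={REPORTED_TOTAL:.2f} match={totals_match}")
```

In this sample the totals agree, yet the frequency table still shows four spellings for two states, which is why it pays to run both checks rather than rely on one.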
Collaboration: The Most Important Data Prep Step
We’ve covered the nuts and bolts. Now it’s time to focus on the most important tool in your toolbox: collaboration. Let’s look at a hypothetical data prep timeline to drive this point home.
Jack’s dog food company is considering a new organic dog food, and management needs data analysis to identify the value of the new product.
Here’s where many companies fall short. One department takes the ball and, in isolation, relies on historical data and outdated techniques to collect and cleanse data.
Instead, Jack’s brings departments together to ask, “How do we collect the right data to create an accurate visual?” This brainstorming may include training in areas like data integration and democratization.
- In collaboration, real-time variables are identified and data is studied. New software tools are implemented to segment demographic data further and lessen the Excel reliance. Employees are trained to find stored data and analyze it easily.
- Data is collected.
- Data is cleansed. Collected data is painstakingly studied and validated, and inconsistencies are identified. Only then does the company feel confident moving forward. Plus, the data foundation created for this project fosters future projects.
Jack’s starts its marketing push for the new product. Thanks to effective data preparation, Jack’s understands its organic dog food segmentation in depth.
Once your data is in good shape, it’s time to use advanced analytics to make better business decisions. Advanced analytics can help you use data to predict customer behavior, optimize campaign performance, and design your marketing initiatives to show a strong return on investment.
With effective, successful data preparation, everyone works together. Training, however, is required, and it’s worth every penny.
Don’t get overwhelmed, get informed. Without thorough, in-depth, up-to-date data preparation, your organization is stuck.
Ready to get more out of your data? Talk to an analyst to learn how Lityx can partner with you to help you assess your data management and leverage advanced analytics to improve your marketing efforts.