Introduction to Data Science

Learn the basics of data science and how we use it to gain insights and successful results from your campaigns.

Written by Persado 1st Level Support
Updated over a year ago

Persado’s successful results aren’t random; they’re backed by data and numbers. So, how do we use language with mathematical certainty? Through the use of data science.

After reading this article, you will understand:

  • What data science is and some of its common elements

  • How Persado uses data science

  • Why data science is important for successful results.

Please note, the following applies primarily to Experiments and some aspects may not apply to Predictive Content, which prioritizes speed over the detailed process of experimental design.

What is Data Science?

Data science is the process of studying data for patterns and trends. Several key components are needed to ensure results aren’t random, such as:

  • Experimental design, which makes sure separate elements are tested against one another in a fair, balanced way

  • Statistical significance, which measures how unlikely it is that a result occurred by chance alone

  • Accounting for Simpson’s Paradox, a well-known statistical paradox in which a trend seen within individual groups reverses or disappears when the groups are aggregated.

What does the Data Science team do?

Data Science provides statistical support, explains our testing and methodology, and assists with results and reporting as needed. A Data Scientist might:

  • Make recommendations on test designs and execution for Experiments (especially for non-standard campaigns)

  • Generate reports, monitor data trends, and interpret results

  • Answer data/statistics-related questions regarding your campaigns.

What Data Science Isn’t

Random guessing or luck. Data science operates on numbers, purposeful design, and statistics; the outcome is calculated based on these factors and biases are accounted for within a certain degree of error.

Common Data Science Elements

To make the best use of data science, certain requirements must be met to ensure the results gained are reliable. Let’s review those requirements at a high level.

Experimental Design

Experimental design consists of a few important components to ensure as balanced and efficient a test as possible. It allows for testing multiple elements simultaneously, minimizes the sample size requirement (since every element is tested multiple times within an Experiment), and reduces bias by replicating the Experiment’s structure. At Persado, this comes into play every time we run Experiments with up to 16 Variants: because multiple elements are each tested with multiple values, we can determine with statistical confidence which elements perform better than the others. This provides more insights than a typical A/B test.
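
To make the idea of a balanced design concrete, here is a minimal sketch in Python. It assumes four hypothetical elements with two values each, which yields 16 Variants when every combination is used; the element names are invented for illustration, and the design Persado actually uses may differ.

```python
from itertools import product

# Hypothetical elements and values, for illustration only.
elements = {
    "headline": ["A", "B"],
    "emotion": ["A", "B"],
    "call_to_action": ["A", "B"],
    "formatting": ["A", "B"],
}


def full_factorial(elements):
    """Return every combination of element values as a list of dicts."""
    names = list(elements)
    return [dict(zip(names, combo)) for combo in product(*elements.values())]


def value_counts(variants, element):
    """Count how many variants use each value of one element."""
    counts = {}
    for variant in variants:
        value = variant[element]
        counts[value] = counts.get(value, 0) + 1
    return counts


variants = full_factorial(elements)
print(len(variants))                        # number of variants in the design
print(value_counts(variants, "headline"))   # how often each headline appears
```

Each value of each element appears in the same number of Variants, so every value is tested several times against an equal mix of the other elements. That repetition is what lets a single Experiment compare elements fairly.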

Statistical Significance

Statistical significance measures how confident you can be that your results were caused by what you did, and not by random chance. First, you need to pick a threshold (e.g., 90%, 95%, or 99%); at Persado, we usually go with 95%. This means that, if there were truly no difference between Variants, we would expect to see a difference this large by chance alone no more than 5 times in 100 repeated Experiments.
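
As a rough illustration of how a 95% threshold can be checked, the sketch below runs a two-proportion z-test on made-up conversion counts. The function name, the counts, and the choice of a z-test are assumptions for this example; they are not a description of Persado's internal calculations.

```python
from math import erfc, sqrt


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-tail p-value for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both groups behave the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Made-up example: control converts 200 of 10,000, variant 260 of 10,000.
p = two_proportion_p_value(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
significant = p < 0.05  # a 95% threshold corresponds to p < 0.05
print(round(p, 4), significant)
```

A p-value below 0.05 means a difference this large would be unlikely if the two groups truly performed the same, so the result clears the 95% threshold.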

One-tail vs. Two-tail Tests

A one-tail test is used when you want to detect a change or difference in only one direction, while a two-tail test is used to detect a change or difference in either direction. At Persado, we only use two-tail tests because we're interested in whether results are significant regardless of whether a Persado Variant wins or loses against the control. A one-tail test would only be appropriate if we cared solely about whether the Persado Variant wins, which would be unhelpful for our more learning-based system.
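
The difference between the two tests can be seen by computing both p-values for the same test statistic. This is a minimal sketch that assumes a standard normal test statistic (a z-score); the z-scores used are arbitrary illustrative values.

```python
from statistics import NormalDist


def tail_p_values(z):
    """Return (one_tail, two_tail) p-values for a z-score.

    The one-tail value only asks "did the variant beat the control?";
    the two-tail value asks "is there a difference in either direction?".
    """
    std_normal = NormalDist()
    one_tail = 1 - std_normal.cdf(z)
    two_tail = 2 * (1 - std_normal.cdf(abs(z)))
    return one_tail, two_tail


win = tail_p_values(2.1)    # variant clearly ahead of control
loss = tail_p_values(-2.1)  # variant clearly behind control
print(win, loss)
```

With a two-tail test, a clear win and an equally clear loss both come out as significant at the 95% threshold. A one-tail test reports the loss as a non-result, so the learning from a losing Variant would be missed.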

Simpson’s Paradox

Simpson’s Paradox is a well-known paradox in statistics in which a trend that appears within each group of a data set reverses or disappears when the groups are combined, typically because the groups differ in size. At Persado, this paradox comes into play when we aggregate performance data of multiple campaigns and see counterintuitive results, such as revenue loss in the aggregate even though the incremental revenue of each individual campaign was positive. Our Data Science team applies a correction for this behind the scenes by using a weighting methodology.
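
The sketch below reproduces the paradox with made-up numbers for two campaigns. The data, the function names, and the size-weighted average used as a correction are all assumptions for illustration; the weighting methodology Persado actually applies is not described in this article.

```python
# Made-up data: (recipients, total revenue) per arm in each campaign.
campaigns = [
    {"control": (1_000, 1_000.0), "variant": (9_000, 9_900.0)},
    {"control": (9_000, 45_000.0), "variant": (1_000, 5_500.0)},
]


def campaign_lift(campaign):
    """Relative lift in revenue per recipient within one campaign."""
    control = campaign["control"][1] / campaign["control"][0]
    variant = campaign["variant"][1] / campaign["variant"][0]
    return variant / control - 1


def pooled_lift(campaigns):
    """Naive lift after summing recipients and revenue across campaigns."""
    per_recipient = {}
    for arm in ("control", "variant"):
        recipients = sum(c[arm][0] for c in campaigns)
        revenue = sum(c[arm][1] for c in campaigns)
        per_recipient[arm] = revenue / recipients
    return per_recipient["variant"] / per_recipient["control"] - 1


def weighted_lift(campaigns):
    """Average of per-campaign lifts, weighted by campaign size."""
    sizes = [c["control"][0] + c["variant"][0] for c in campaigns]
    lifts = [campaign_lift(c) for c in campaigns]
    return sum(s * l for s, l in zip(sizes, lifts)) / sum(sizes)


print([round(campaign_lift(c), 3) for c in campaigns])
print(round(pooled_lift(campaigns), 3))
print(round(weighted_lift(campaigns), 3))
```

The Variant is ahead in both campaigns, yet the pooled figure shows a loss because most Variant recipients sit in the low-revenue campaign while most control recipients sit in the high-revenue one. Comparing within each campaign first and then weighting the results avoids that distortion.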

Data Science in Action

In an Experiment with one of our customers, a Variant did well in the Exploration phase but poorly in the Broadcast phase. This happened consistently across all Experiments in the web channel for this customer. The fact that this was a pattern across multiple Experiments was key to allowing our Data Scientists to find the problem.

Statistics is a major part of data science: random variation stays within a predictable range. Because the results fell outside that margin of error, we knew there had to be something wrong with the way the message was being served through their provider. After several rounds of back-and-forth with the customer’s provider, it was confirmed there was an error with how the message was displaying on their end. Thanks to data science, we were able to discover and fix this error.

Conclusion

Data Science is a tool fundamental to how we test Variants and produce statistically validated results. It informs our experimental design process, pinpoints issues, and helps us understand the results of Experiments and Predictive Content so we can leverage them for maximum success.
