Learn how to identify statistically significant differences in group means, survey results and A/B test outcomes with a simple t-test.
While anyone can see the difference between two numbers, finding out whether that difference is statistically significant can require more effort.
Let’s say you have conducted a customer satisfaction survey at work. Your manager wants to analyse whether men give your company a lower Net Promoter Score℠ (NPS) than women.
In the data, you see that the average rating from male respondents was 9, compared to an average score of 12 from female respondents. How can you determine if 9 is significantly different from 12? This is where t-tests are useful.
In this article, we will define t-tests and their use cases, share examples of t-tests and explain how to interpret your results.
A t-test is a statistical test that assesses whether the difference between two means is significant using the t-distribution. It helps you determine whether an observed gap between groups reflects a real difference or is likely due to chance.
Testing for statistical significance is common in concept testing and product testing. In concept testing, A/B tests are commonly used to determine whether one advert concept performs better than another. Similarly, product testing can establish if a product will hold its own when launched into the market.
T-tests use specific formulae to compare means and determine whether a difference is statistically significant. The two-sample t-test is the most common in survey analysis:
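In its standard equal-variance (pooled) form, which is an assumption here, the two-sample t-statistic can be written as shown below. The symbols are illustrative: $\bar{x}_1$ and $\bar{x}_2$ are the two group means, $s_1$ and $s_2$ the group standard deviations, $n_1$ and $n_2$ the group sizes, and $s_p$ the pooled standard deviation.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```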
Here are the formulae for the one-sample t-test and paired t-test:
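In their standard textbook forms, the two statistics can be written as follows. The symbols are illustrative assumptions: $\bar{x}$ is the sample mean, $\mu_0$ the value you are testing against, $s$ the sample standard deviation, $n$ the number of respondents (or pairs), $\bar{d}$ the mean of the paired differences and $s_d$ their standard deviation.

```latex
% One-sample t-test
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}

% Paired t-test (applied to the per-respondent differences d_i)
t = \frac{\bar{d}}{s_d / \sqrt{n}}
```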
In both the one-sample and paired t-tests, the calculated t-value is compared with a critical value from the t-distribution to assess significance.
Use a t-test when you want to know whether two averages are meaningfully different, not just numerically different, in your survey results. T-tests help you compare group means, evaluate sample differences and decide whether a gap is statistically significant based on a p-value and confidence level.
Common survey scenarios include:
Use a t-test when you need to assess a difference in means, test a benchmark comparison or validate a hypothesis with small sample sizes. This makes it a reliable choice for survey analysis, A/B testing and any situation where you need evidence that a difference in your data is genuine.
Before you run a t-test, ensure your data meets a few basic assumptions so the results are reliable.
A quick check on these basics helps ensure that any difference you observe reflects a real signal, not noise in the data.
There are three types of t-tests commonly used by researchers. These t-tests serve different purposes, which we will explain below.
The one-sample t-test examines whether the mean (also known as the average) of data from one group differs from a value you specify, such as an industry benchmark.
Example: Your company's current average Customer Effort Score (CES) is 4.2. Is 4.2 significantly different from the industry standard of 5.0?
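As a rough illustration of the CES example, the sketch below computes a one-sample t-statistic using only Python's standard library. The function name `one_sample_t` and the ten responses are hypothetical, invented so that the mean comes out at 4.2; they are not real survey data.

```python
import math
import statistics


def one_sample_t(scores, benchmark):
    """Return (t_statistic, degrees_of_freedom) for a one-sample t-test."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - benchmark) / standard_error, n - 1


# Hypothetical CES responses, invented for illustration (mean = 4.2)
ces_scores = [4, 5, 3, 4, 5, 4, 4, 5, 4, 4]
t_stat, df = one_sample_t(ces_scores, benchmark=5.0)
print(f"t = {t_stat:.2f}, df = {df}")  # t = -4.00, df = 9
```

With 9 degrees of freedom, the two-tailed critical value at an alpha level of 0.05 is about 2.26, so for this made-up sample the gap from the benchmark would count as significant.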
Two-sample t-tests examine whether the means of two independent groups are significantly different from each other. If group variances appear unequal or sample sizes are unbalanced, switch to Welch’s t-test (offered by most tools) as it does not assume equal variances.
Example: Your hypothesis is that men give your company a lower NPS than women. The average NPS from male respondents is 9, while the average score from women is 12. Is 9 significantly different from 12?
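A minimal sketch of the Welch variant mentioned above, using only the standard library, might look like this. The function name `welch_t` and both lists of ratings are hypothetical; the degrees of freedom use the Welch–Satterthwaite approximation, which is usually not a whole number.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return (t_statistic, degrees_of_freedom) for Welch's two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error contributed by each group
    se_sq_a = statistics.variance(group_a) / n_a
    se_sq_b = statistics.variance(group_b) / n_b
    mean_gap = statistics.mean(group_a) - statistics.mean(group_b)
    t_stat = mean_gap / math.sqrt(se_sq_a + se_sq_b)
    # Welch–Satterthwaite approximation for the degrees of freedom
    df = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (n_a - 1) + se_sq_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical ratings, invented for illustration only
men = [7, 9, 8, 10, 6, 9, 8]
women = [9, 10, 8, 10, 9, 7]
t_stat, df = welch_t(men, women)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

A negative t-statistic here simply means the first group's mean is lower than the second's; significance depends on the absolute value.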
This test is used when you give the same group of people the same survey twice. A paired t-test shows if the mean changed between the first and second surveys.
Example: You surveyed the same group of customers twice: once in April and again in May, after they had seen an advert for your company. Did your company’s NPS change after customers saw the advert?
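The April and May comparison can be sketched as a paired test on the per-customer differences. The function name `paired_t` and the eight pairs of ratings below are hypothetical and exist only to show the mechanics.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t_statistic, degrees_of_freedom) for a paired t-test."""
    if len(before) != len(after):
        raise ValueError("Each respondent needs one 'before' and one 'after' score")
    # Work on the per-respondent change rather than the two raw lists
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical ratings from the same eight customers, invented for illustration
april = [6, 7, 8, 5, 9, 7, 6, 8]
may = [7, 8, 8, 6, 9, 8, 7, 9]
t_stat, df = paired_t(april, may)
print(f"t = {t_stat:.2f}, df = {df}")  # t = 4.58, df = 7
```

Pairing matters because it removes person-to-person variation: each customer acts as their own baseline.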
There are four steps to performing a t-test.
This section walks through the four steps using the NPS ratings example from the beginning:
Your hypothesis is that men give your company a lower NPS than women. The average NPS from men is 9, while the average score for women is 12. Is 9 significantly different from 12? This is an example of performing a two-sample t-test.
Let’s dive into the steps and t-test example.
Each type of t-test uses a different formula for calculating the t-statistic. For this example, we will use the two-sample t-test formula where:
You will probably be conducting the t-tests in a spreadsheet or statistical programme (such as Excel or SPSS). However, if you would like to do the calculations by hand, the formulae for the other two types of t-tests are included below.
Degrees of freedom are the number of values in a calculation that remain free to vary once sample statistics such as the mean have been estimated. In this case, they depend on the number of NPS ratings in each group of respondents. As with the t-statistic, the formula for degrees of freedom varies depending on the type of t-test you perform.
This formula must be used to determine degrees of freedom in two-sample t-tests.
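For the equal-variance two-sample test, the standard expression (assumed here, with $n_1$ and $n_2$ as the two group sizes) is:

```latex
df = n_1 + n_2 - 2
```

Under this expression, 41 degrees of freedom would correspond to 43 responses in total across the two groups. Welch's t-test uses a different approximation that usually gives a non-integer value.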
The critical value is the threshold at which the difference between two numbers is considered statistically significant.
In a standard t-distribution table, for a two-tailed test with an alpha level of 0.05 at 41 degrees of freedom, the critical value is 2.02. Note that most analysts use a two-tailed test rather than a one-tailed test, as it is more conservative.
For more information on the differences between one-tailed and two-tailed tests, see this video from Khan Academy.
If the absolute value of your t-statistic is larger than your critical value, your difference is statistically significant. If it is smaller, the test has not found a significant difference, and your two numbers are, statistically speaking, indistinguishable.
In our example, the absolute value of the t-statistic is 0.86, which is not greater than the critical value of 2.02, so you can conclude that men do not give significantly lower NPS ratings than women.
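The decision rule can be expressed in a couple of lines. The helper name `is_significant` is hypothetical, and the negative sign on the t-statistic is an assumption (the men's mean is the lower one); only its absolute value, 0.86, is given above.

```python
def is_significant(t_statistic, critical_value):
    """Two-tailed decision rule: compare |t| with the critical value."""
    return abs(t_statistic) > critical_value


t_statistic = -0.86    # sign assumed; the example reports an absolute value of 0.86
critical_value = 2.02  # two-tailed, alpha = 0.05, 41 degrees of freedom

print(is_significant(t_statistic, critical_value))  # False
```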
Interpreting t-test results involves reviewing the t-value, p-value and confidence interval to determine whether the difference between your groups reflects a real effect or random variation. These metrics work together to show the size of the gap, the strength of the evidence and the level of confidence you can place in the result. The Q&A below explains what each one tells you and how to analyse t-test results.
The t-value shows how large the difference between group means is relative to the variability in your data. A larger absolute t-value means the signal rises above the noise; a smaller one suggests the gap may be due to chance.
The p-value is the probability of observing a difference at least as large as yours purely by chance if the null hypothesis (no true difference) were actually correct. Many teams use a 0.05 threshold; p ≤ 0.05 suggests a statistically significant difference, while p > 0.05 means this sample does not provide enough evidence of a difference.
A confidence interval (CI) provides a likely range for the true difference in means, adding context beyond a yes/no significance call. If the CI crosses zero, the effect is not conclusive; if it stays above or below zero, the result is significant at your chosen confidence level.
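To make this concrete, the sketch below builds a 95% confidence interval for the NPS example. It assumes the t-statistic of 0.86 was computed from the gap between 9 and 12, so the standard error is backed out from those two figures rather than taken from raw data; the function name `confidence_interval` is hypothetical.

```python
def confidence_interval(mean_difference, standard_error, critical_value):
    """Return (lower, upper) bounds for the difference in means."""
    margin = critical_value * standard_error
    return mean_difference - margin, mean_difference + margin


mean_difference = 9 - 12                     # men's mean minus women's mean
implied_se = abs(mean_difference) / 0.86     # assumption: SE implied by |t| = 0.86
low, high = confidence_interval(mean_difference, implied_se, critical_value=2.02)

crosses_zero = low < 0 < high
print(f"{low:.2f} to {high:.2f}")  # -10.05 to 4.05
print(crosses_zero)                # True
```

The interval crossing zero agrees with the t-statistic falling below the critical value: both are the same test viewed from two angles.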
A meaningful difference is both statistically significant and practically important. Consider the estimated effect size and CI to understand how large the gap could be and whether it matters for your decision.
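One common way to express effect size is Cohen's d, the gap between the means divided by the pooled standard deviation. The text above does not name a specific measure, so treat this as one illustrative option; the function name `cohens_d` and the sample values are hypothetical.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    mean_gap = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_gap / math.sqrt(pooled_variance)


# Hypothetical values, invented for illustration only
d = cohens_d([2, 4, 6], [1, 3, 5])
print(f"d = {d:.2f}")  # d = 0.50
```

Unlike the t-statistic, d does not grow with sample size, which is why it helps separate "statistically significant" from "large enough to matter".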
Larger samples reduce variability, tighten confidence intervals and make it easier to detect real differences. Smaller samples introduce more uncertainty, which can make borderline effects harder to interpret.
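The effect of sample size shows up directly in the standard error of a mean, which shrinks with the square root of the number of responses. The helper name `standard_error` and the standard deviation of 2.0 are assumptions chosen for illustration.

```python
import math


def standard_error(sample_sd, n):
    """Standard error of a mean: spread of responses divided by sqrt(n)."""
    return sample_sd / math.sqrt(n)


# Same hypothetical spread of responses, different sample sizes
se_100 = standard_error(2.0, 100)
se_400 = standard_error(2.0, 400)
print(se_100, se_400)  # 0.2 0.1
```

Quadrupling the sample only halves the standard error, so tighter confidence intervals become progressively more expensive to obtain.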
A clear t-test results summary shows why you carried out the comparison, what the test revealed and how confident you can be in the difference between groups. Your role is to translate the statistical output into plain language, connect it to the original question and highlight what the findings suggest for the decisions that follow.
Include these core elements when summarising t-test results:
Avoiding a few simple errors can help you obtain cleaner, more trustworthy t-test results from your survey data.
T-tests are used to determine whether the difference in the means of two sample groups is statistically significant. You can use t-tests during survey data analysis to check whether the differences you see between groups are likely to be real rather than due to chance.
SurveyMonkey enables you to streamline the process of creating and sending surveys to sample groups for your organisation’s research needs. With SurveyMonkey, you can build market research surveys and questionnaires from scratch or use our broad selection of over 400 survey templates.
Start collecting survey data for analysis today to help your organisation make better decisions for growth. Create a free account today.
NPS, Net Promoter and Net Promoter Score are registered trademarks of Satmetrix Systems, Inc., Bain & Company and Fred Reichheld.
