Ready to send a survey? One of the first things you’ll need to do is define a sample: the set of individuals from whom you intend to collect your data. There are numerous ways of defining a survey sample, though, and it's easy to get lost, especially when distinguishing a sample from a research population. In this guide, we’ll share insights into the different sampling approaches out there, their pros and cons, and when to use which, setting you on the right path for your market research journey.
You might hear the terms sample and population used interchangeably when it comes to survey-based research, but in fact, they are very different groups of individuals. A population is the full set of individuals who could potentially take part in your research. For example, if you’re trying to gain customer feedback on a product you launched last year, the population would be every individual who had purchased, tried or otherwise interacted with the product. A sample, on the other hand, is a subset of the population. The sample can be identified and selected in a number of ways. For instance, you can focus on customer demographics if you’re interested in feedback from female customers, so gender would be the basis of your sampling strategy. Other characteristics that could form the basis of sampling include geographical attributes or behavioural attributes. In addition, if the population is very large, making data collection from the population unwieldy, you might prefer to select a smaller, more manageable sample using a random approach.
Collecting data from a population and collecting it from a sample both have their merits. There are also some good rules of thumb to guide you as to which approach to use, and when.
In an ideal world, when conducting any kind of research, whether brand awareness research, or gathering customer feedback data, data would be gathered from the entire population. Why? If every member of the population supplies research data, you have the best guarantee that inferences you make about the results are representative of the population. In other words, collecting data from the population helps to improve the validity and reliability of your research findings.
In practice, however, it’s not always possible to collect data from a research population. The main reason is that populations are often hard to identify, and even harder to access in full. If the boundaries of the population are clearly delineated, and the audience is in some way captive, gathering data from a population makes sense. For instance, if you’re interested in gathering employee engagement data, you can probably take a population-based approach, using a list of all your employees from HR records, and emailing each one directly.
In addition, this approach makes sense if the population is small and cooperative or interested in the survey outcomes (such as all 30 pilot users of a new service). However, when the boundaries of the population are unclear, or the population is very large or geographically dispersed, it will usually be necessary to draw a sample.
If collecting data from a population puts you in the best position to derive valid and accurate insights, why would you survey a sample instead of a population? The short answer is necessity. It is rarely practicable for researchers to access the entire target population, given its size and geographical dispersion. Let’s suppose that you run a busy food van on a business park. You’re interested in surveying executives in the surrounding offices on their lunch preferences. Taking a population-based approach will require access to a complete and accurate list of all the workers, which is something you’re unlikely to have. In cases like this, it is necessary to gather data from a subset of the population. The findings can then be generalised to the wider population. In other words, using a sample, you can often assume that the research findings are representative of the broader population from which the sample was drawn. Often, but not always. Let's take a deeper look.
There are two main sampling strategies available if you do decide to take a sample-based approach: probability sampling and non-probability sampling.
Probability sampling is a random method of sampling
It describes any approach where every member of the population has a known, non-zero chance of being included in the sample; in the simplest designs, that chance is equal for everyone. For example, if you had a population list, known as a sampling frame, you might use a random number generator and then select each individual whose position in the list corresponds to a number generated. This is known as a simple random sampling approach.
Another way might be to use a systematic random sampling approach, selecting, for example, every 10th or 100th individual in the sampling frame. Stratified sampling is similar to random sampling, but in the first instance, the population is divided into groups with similar attributes. For instance, customers might be divided into groups on the basis of how frequently they shop with you or how much they spend. Then, a simple or systematic random sampling procedure is used to select individuals from each group. This helps to ensure that different segments of the population are represented in the final sample.
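The three probability-based approaches described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the customer list, the "frequency" attribute, the 10% sampling fraction and the function names are all hypothetical, chosen only to mirror the examples in the text.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n individuals from the sampling frame, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, step, seed=None):
    """Pick every step-th individual, starting from a random position."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return frame[start::step]

def stratified_sample(frame, key, fraction, seed=None):
    """Group the frame by key, then draw a simple random sample per group."""
    rng = random.Random(seed)
    strata = {}
    for person in frame:
        strata.setdefault(key(person), []).append(person)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical sampling frame: 1,000 customers tagged by shopping frequency.
frame = [
    {"id": i, "frequency": "weekly" if i % 4 == 0 else "occasional"}
    for i in range(1000)
]

srs = simple_random_sample(frame, 100, seed=1)
systematic = systematic_sample(frame, 10, seed=1)
stratified = stratified_sample(frame, lambda c: c["frequency"], 0.10, seed=1)

print(len(srs), len(systematic), len(stratified))
```

Because the stratified version samples the same fraction from each group, the weekly and occasional shoppers appear in the sample in the same proportions as in the frame, which is the point of stratifying.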
Non-probability sampling is more selective
With this method, some members of the population have little or no chance of being chosen for the sample, and the chance of selection is generally unknown. For example, if you surveyed every visitor to your website on a Saturday morning, only shoppers who shop at the weekend have a chance of being surveyed. Alternatively, you might only send surveys to customers with whom you have a personal relationship, while customers you do not know well are ignored. This can introduce some error into your sample and may mean that the sample is not representative of the population. Why then might you use this approach? Probability-based approaches, while ideal, require you to have access to that all-important and often-elusive population list. When no such list exists, a non-probability approach may be the only practical option.
As we’ve seen, in many instances, you'll need to gather your data from a sample rather than from the full population. Just because you’re driven to do so through necessity, however, doesn’t mean that there are no benefits to collecting data from a sample.
Whether you’re gathering your data from a sample or a population, make sure you get your terminology right. One of the major differences between the population-based and sample-based approaches relates to how you determine sample size, which is an estimate of the target number of individuals you’d ideally like to complete your survey. The terms statistic and parameter are two related but distinct concepts relevant to gathering data from a sample or a population. Let’s take a look at each.
A parameter is a measure of some trait of a population, based on data gathered from the entire population. Let’s suppose, for instance, that you have decided to drop to a four-day working week as a way of improving staff motivation and commitment (lucky staff!). You send out a survey to all members of staff asking which day of the week they’d prefer to have off. If all of your employees fill in the survey, and 80% of your employees say they’d prefer the Friday off, then that figure is a parameter of the population.
A statistic, on the other hand, is a measure calculated from data gathered from a sample of the population. Imagine your employee base is very large, so you decide to send your survey to a random, representative sample. The results are broadly the same as if you had gathered data from the entire population: the vast majority of workers (77%) are hoping for a long weekend with Fridays off. In this case, the outcome barely changes but the way that you describe it does: that 77% is now called a statistic. Why do you need to know the difference between the two? The answer lies in sampling error.
Sampling error is another important piece of sample-related terminology you should know. Simply speaking, sampling error is the difference between a population parameter and a sample statistic. Going back to our earlier example, we saw that when the entire population was surveyed about their preferred day off, 80% said Friday, but when a sample was surveyed, 77% said Friday. The sampling error in this case is therefore 3 percentage points.
This example demonstrates the importance of trying to obtain a sample that is as representative of the population as possible. What if, for instance, you sampled only part-time workers, including many who never work Fridays anyway? You might obtain a very different result that is not indicative of the preferences of the wider population.
Understanding sampling error helps you maintain accuracy and keep errors to a minimum. Sampling errors can occur even when a probability-based sampling strategy is used. This is because statistical measures of central tendency and dispersion (such as means and standard deviations) will differ slightly between a sample and the population, even if the sample is representative. Your goal is to keep your sampling error as low as possible. You can reduce sampling error by increasing the size of your sample.
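To see why a larger sample tends to reduce sampling error, here is a small simulation sketch in Python. It assumes a hypothetical workforce of 5,000 in which exactly 80% prefer Fridays off (the parameter from the earlier example), draws many simple random samples of different sizes, and reports the average gap between each sample's statistic and the parameter. The workforce size, sample sizes, number of trials and function name are illustrative assumptions.

```python
import random

def average_sampling_error(population, n, trials=500, seed=0):
    """Average absolute gap between the sample statistic and the parameter."""
    rng = random.Random(seed)
    parameter = sum(population) / len(population)
    total_error = 0.0
    for _ in range(trials):
        sample = rng.sample(population, n)   # simple random sample
        statistic = sum(sample) / n          # share preferring Friday
        total_error += abs(statistic - parameter)
    return total_error / trials

# Hypothetical workforce: 1 = prefers Friday off, 0 = prefers another day.
population = [1] * 4000 + [0] * 1000

for n in (50, 200, 800):
    error = average_sampling_error(population, n)
    print(f"sample size {n}: average error {error * 100:.1f} percentage points")
```

Under these assumptions the average error should shrink as the sample size grows, although any single sample can still land further from the parameter than the average suggests.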
How do you decide how many individuals to target for your survey? Design it, send it out and hope for the best? Not quite. If you’re able to gather data from your population, this question is moot: the ideal size of the audience is exactly the same as the size of the population. If you’re surveying a sample, however, there’s more to consider.
First, you need to estimate the size of the population. Even if you don’t have an up-to-date population list, it's a good idea to have a rough figure in mind. For instance, if you’re interested in learning about the perils that cyclists perceive on the roads in your region, you might use secondary data to estimate that there are around 20,000 cyclists in your catchment area. Once you have that figure, you can choose a margin of error. This is a measure of the accuracy of your results, expressed in percentage points. If you are willing to tolerate a margin of error of 5%, this means that you estimate that the true result lies within 5 percentage points either side of your statistic. So, applying a 5% margin of error to the statistic showing that 77% of sampled workers would prefer Fridays off means that the true figure is probably between 72% and 82%.
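The arithmetic behind a margin of error can be sketched as follows. The first function simply applies a chosen margin to a statistic, as in the 77% example. The second estimates the margin from a sample size using the common normal approximation at a 95% confidence level (z = 1.96). The guide does not specify a confidence level or formula, so treat both as assumptions, along with the function names and the sample size of 300.

```python
import math

def interval(statistic, margin):
    """Range in which the true figure probably lies, clipped to 0..1."""
    return max(0.0, statistic - margin), min(1.0, statistic + margin)

def margin_of_error(statistic, n, z=1.96):
    """Approximate margin of error for a proportion from a sample of size n.

    Assumes a simple random sample and a 95% confidence level (z = 1.96).
    """
    return z * math.sqrt(statistic * (1 - statistic) / n)

# Applying a 5-point margin to the 77% statistic from the example.
low, high = interval(0.77, 0.05)
print(f"{low:.0%} to {high:.0%}")

# Hypothetical: the margin implied by 300 completed responses.
print(f"{margin_of_error(0.77, 300) * 100:.1f} percentage points")
```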
Finally, you can use a sample size chart to compare your population size with your margin of error to give you a rough estimate of your target sample size. Of course, don’t forget that not everyone will fill in your survey! So, if your target sample size is 100, you will need to invite many more people than that to end up with 100 completed responses.
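If you don't have a sample size chart to hand, a rough equivalent can be computed. The sketch below uses a widely taught formula for estimating a proportion (often attributed to Cochran) with a finite population correction, which may not match the exact method behind any particular chart. The 95% confidence level, the conservative 50% expected proportion, the 20% response rate and the function names are all assumptions made for illustration.

```python
import math

def required_sample_size(population_size, margin, z=1.96, p=0.5):
    """Estimate completed responses needed for a given margin of error.

    Assumes a 95% confidence level (z = 1.96) and p = 0.5, the most
    conservative guess for the proportion being measured.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)        # very large population
    n = n0 / (1 + (n0 - 1) / population_size)          # finite population correction
    return math.ceil(n)

def invitations_needed(sample_size, response_rate):
    """Number of people to invite, given an expected response rate."""
    return math.ceil(sample_size / response_rate)

# The cyclists example: about 20,000 people, 5% margin of error.
target = required_sample_size(20_000, 0.05)
print("completed responses needed:", target)

# Hypothetical 20% response rate.
print("invitations to send:", invitations_needed(target, 0.20))
```

Under these assumptions, the cyclists example needs roughly 377 completed responses, and the expected response rate then determines how many invitations that requires.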
So there, in a nutshell, is the difference between collecting data from a population and a sample. Whatever market research you’re looking to do, start with exploring all the different types of market research surveys that exist and find the best one.