So why don't they? The most common reason is time, as any effective study group only tests for one variable at a time. Since most marketers do not have patience for true scientific models, they tend to test for multiple measures at one time or split variables (A, B, C) among demographics that they hypothesize are more likely to respond favorably to each variable.
An historic case study regarding mixed variables.
Unfortunately, mixing variables can have adverse or disastrous results. Probably one of the most famous accounts is tied to a 1996 McDonald's campaign, which became one of the most expensive marketing flops in history. You can find some background about the campaign on Wikipedia or a reference to it at The New York Times. But it's not the whole story.
Prior to the launch of the Arch Deluxe, McDonald's had simultaneously launched various deluxe versions of its burger across the United States — including one I was directly involved with. Out West, there was no Arch Deluxe (at least, not before the national rollout of the Arch). McDonald's had marketed the California Deluxe, which was also an adult burger with different ingredients.
There were other regions with variations too. If memory serves me right, there were six regions (but I could be wrong here). And as much as the Arch Deluxe was test marketed in the Northeast, the California Deluxe was test marketed in primarily the West Coast.
Each test area also had localized campaigns, created by regional advertising agencies to market the burger. And the winner, determined by total sales, would be the one McDonald's would pick for a national rollout. The Arch Deluxe won, and none of the others were even mentioned again.
On the surface, it seemed to someone that the test market idea was a solid marketing approach. Until, of course, you consider the variables: different products, marketed to different test markets with different concentrations of population, using different messages (within McDonad's mandatories). Add it all up and the marketing study they created measured nothing, even though it had convinced McDonad's to invest $200 million into the campaign.
As a point of interest, the California Deluxe rivaled Big Mac sales in its test markets. But the smaller sampling size predetermined that the better burger was doomed out of the starting block.
A quick take on developing a better test market model, using the historic case study.
McDonald's could have created a different test model, but the timing to execute the campaign would have taken significantly longer. It could have introduced three burgers in one test market with a singular campaign asking people to choose. It could have rolled out one burger at a time in several areas across the United States. Or, well, any number of ways with an emphasis on minimizing variables.
It's one of the lessons marketers (and bloggers for that matter) would be well served to apply. In science, medicine, or psychology, for example, researchers generally create an experimental group (one receiving an independent variable) and a control group (one receiving a similar experimental situation, but without the variable), with the participants randomly assigned.
Provided there is no other tampering, the variable could be anything. It could be two products, one with an "improved feature." It could be the same product, with different creative campaigns. It could be a specific incitement offer. It could be the same everything, but tested in two or more different test markets. Or maybe two different price points. And so on and so forth.
In terms of social media, for example, narrowing the variable can help marketers determine what content different social networks respond to or the style of communication. (Managing several social programs, we've seen differences in each network community emerge over several months.)
The point is to narrow the measurable variables, which increases the reliability (the ability to get the same results in successive studies) and validity (the ability to measure what you want to measure). The benefit is increasing the return on investment by running continuous tests until patterns emerge.
In the case of the Deluxe debacle, for example, they might have found that people in the Northeast also preferred the California Deluxe (or one of the others) over the Arch Deluxe too. But ironically, no one will ever know. Instead, all they learned was the Arch Deluxe could not support itself nationwide.