A recent inquiry prompted me to take a closer look at the General Lifestyle Survey (GLS), something I have not done for some time. This raised a number of issues regarding this widely used data source, and also the wider implications of a topic that I rarely see addressed in the day-to-day application of research: sampling.
It has been a long time since I examined the methodology for the GLS, so there has been quite a bit of catching up to do. The GLS team should be commended for their thorough documentation of methodologies, but I suspect this detail, and the resulting large document, puts many off venturing there. An initial review highlighted a number of issues with the use of GLS data to accurately evaluate smoking prevalence at a regional level, particularly for the North East.
The sampling method for the GLS is stratified random sampling, and the strata are multi-layered. Region forms the primary layer; the secondary layer is a composite of three demographic parameters: car ownership, socio-economic group and number of pensioners. These demographic parameters are used to rank postcode sectors based on 2001 Census data. While two of these demographic parameters are reasonable at describing smoking behaviour, other important ones such as specific age bands are clearly not included - we will come back to that later.
Within the strata an appropriate number of postcode sectors are randomly selected, then within each of those sectors 23 households are randomly selected. Each of these households is then approached to take part - and around 70% do so.
Now consider the North East sample for, say, 2010. The total sample was 368 households. Given 70% participation and 23 randomly selected households per postcode sector, this sample must consist of around 23 sampling units (368 / (23 × 0.7) ≈ 23). The North East region is divided into two geographical strata (metropolitan and non-metropolitan) and each of these into 27 demographic sub-strata - 54 in total. As can be seen, the number of sampling units is not even half the number of strata. Stratified sampling should reduce variance in the data; this is valid for the whole sample, but in the NE example the reverse may well be true (see below).
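The arithmetic behind that comparison can be sketched in a few lines of Python. The figures are the ones quoted above; the variable names are purely illustrative, and the 70% response rate is treated as exact, which it is not.

```python
# Figures quoted in the text for the 2010 North East sample.
responding_households = 368
households_per_sector = 23   # households drawn in each postcode sector
response_rate = 0.70         # approximate participation rate

# Responding households we would expect from one postcode sector.
expected_per_sector = households_per_sector * response_rate

# Implied number of postcode sectors (the sampling units).
implied_sectors = responding_households / expected_per_sector

# Two geographical strata, each split into 27 demographic sub-strata.
total_strata = 2 * 27

# Even with one sector per stratum, this many strata must be empty.
min_empty_strata = total_strata - round(implied_sectors)

print(f"Expected responders per sector: {expected_per_sector:.1f}")
print(f"Implied postcode sectors:       {implied_sectors:.1f}")
print(f"Strata in the region:           {total_strata}")
print(f"Strata with no sector, at least: {min_empty_strata}")
```

On these figures roughly 23 sectors are spread across 54 strata, so at least 31 strata cannot contain a sampled sector at all.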
These sampling variances are increased by one further complication. By surveying households we have a further layer from cluster sampling. As behavioural influence is significant (smokers are more likely to live together, and young people growing up in a smoking household are more likely to become smokers), this clustering further increases the sample variance, and is allowed for in the design factor used.
As the variance in a stratified sample is influenced more by differences between strata than by differences within them, we can expect a large variance in the sample data (and therefore potentially large sampling error). The biased sample is not corrected in this instance by the weighting procedure for two reasons: first, the weightings are based on stratum probabilities of the total sample, which may not be homogeneous in all regions; second, with fewer sampling units than strata, we must have strata with no sample at all. An example of the impact of this is immediately apparent in the gender mix of the NE sample, with women outnumbering men by over 1.3 to 1. Furthermore, those aged 55+ are significantly over-represented (50% of sample) and those aged 35-54 under-represented (27% of sample). While the sample details of age within gender are not published at regional level, given the low probability of younger men partaking in any survey, we can reasonably assume that these groups must be significantly under-represented.
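For a sense of scale, the quoted ratios can be turned into shares. This is only a sketch using the figures above; it assumes the sample covers adults aged 16 and over, so that whatever is not in the two quoted age bands falls in the 16-34 group.

```python
# Gender mix: women outnumber men by (at least) 1.3 to 1.
women_per_man = 1.3
female_share = women_per_man / (women_per_man + 1)

# Age mix quoted for the NE sample.
share_55_plus = 0.50
share_35_to_54 = 0.27

# Assumption: the remainder is the 16-34 age group.
share_under_35 = 1 - share_55_plus - share_35_to_54

print(f"Female share of sample: at least {female_share:.1%}")
print(f"Share aged under 35:    {share_under_35:.0%}")
```

So women make up at least 56% or so of the NE sample, and under-35s less than a quarter of it.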
Since 2005 the GLS has adopted a longitudinal rather than cross-sectional methodology. A quarter of the sample is replaced each year, with three-quarters being retained from the previous year. The fact that only a quarter of the sample is freshly drawn each year compounds the difficulties outlined above.
While I can see the rationale for longitudinal samples, I think this introduces another issue when measuring lifestyle behaviour. While all survey respondents are ultimately self-selecting, this becomes more significant in longitudinal studies. There are two potential problems: those who undertake what may be seen as socially inappropriate behaviour are less likely to take part (common to all surveys, but more likely in longitudinal studies when you partake for up to four years); and the act of participation over four years may in itself influence behaviour (if people know they are going to be asked the same questions again, they may be influenced to try to make what may be seen by others as an improvement in that behaviour).
Further issues are also possible from the longitudinal study. With participants remaining in the sample for four years, by definition the sample is ageing at a greater rate than the population. Given that age is such a critical factor in smoking prevalence, this is clearly an issue that could have considerable impact. In fact we can approximate the rate of change of prevalence as about one third of a percent per year of age (assuming a linear relationship for simplicity), so our over-ageing sample could well reduce overall prevalence by around 1% (approximately the step change we see around 2006 for the national figure). Furthermore, the age issue is also important when considering how people enter and leave the smoking population; at the extremes, these are young (pre-16) smokers who enter the sample by virtue of turning 16 at one end, and those who quit or die at the other. Clearly both of these dynamics can be heavily affected by any concentrations that may or may not occur at critical ages within the sample structure, and can impact significantly on the variance of smoking prevalence data.
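The size of that effect depends on how far the sample's average age runs ahead of the population's, which is not something I have a figure for. The sketch below takes the one-third of a percent per year approximation from above and treats the number of years of excess ageing as a hypothetical input; the function name is mine and purely illustrative.

```python
# Approximate fall in smoking prevalence per extra year of age
# (the linear approximation used in the text).
DECLINE_PER_YEAR = 1 / 3


def prevalence_reduction(excess_years, decline_per_year=DECLINE_PER_YEAR):
    """Reduction in measured prevalence (in percent) if the sample's
    mean age is `excess_years` older than the population's.

    `excess_years` is a hypothetical input, not a published figure.
    """
    return excess_years * decline_per_year


# Try a few illustrative values for the excess ageing.
for years in (1, 2, 3):
    print(f"{years} year(s) older -> about "
          f"{prevalence_reduction(years):.2f}% lower prevalence")
```

Under this linear approximation a reduction of around 1% corresponds to a sample whose mean age is about three years ahead of the population.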
As far as I can see none of these issues are addressed in the technical notes accompanying the survey.
There is some evidence that the switch to a longitudinal study did have an influence: there is an apparent step change in smoking prevalence (also evident in the national survey) around 2006, by which time half of the sample was longitudinal.
Sampling is not the most exciting of subjects, but this example highlights why it is often worthy of a more critical examination than it usually receives.