Time-trend analysis, time series designs

Time-trend designs are a form of longitudinal ecological study, and can provide a dynamic view of a population’s health status. Data are collected from a population over time to look for trends and changes. Like other ecological studies, the data are collected at a population level and can be used to generate hypotheses for further research, rather than demonstrating causality.

Ecological studies are described elsewhere in these notes, but there are four principal reasons for carrying out between-group studies:¹

To investigate differences between populations
To study group-specific effects, for example of a public health intervention aimed at a group
Where only group-level data are available, such as healthcare utilisation
Because such studies are relatively cheap and quick to conduct when routine data are available

In a time-trend analysis, comparisons are made between groups to help draw conclusions about the effect of an exposure on different populations. Observations are recorded for each group at equal time intervals, for example monthly. Examples of measurements, typically expressed as numbers, proportions or rates, include prevalence of disease, levels of pollution, or mean temperature in a region.

Uses of time-trend analysis

Trends in factors such as rates of disease and death, as well as behaviours such as smoking, are often used by public health professionals to assist in healthcare needs assessments, service planning, and policy development. Examining data over time also makes it possible to predict future frequencies and rates of occurrence.

Studies of time trends may focus on any of the following:

Patterns of change in an indicator over time – for example whether usage of a service has increased or decreased over time, and if it has, how quickly or slowly the increase or decrease has occurred
Comparing one time period to another time period – for example, evaluating the impact of a smoking cessation programme by comparing smoking rates before and after its introduction. This is known as an interrupted time series design.
Comparing one geographical area or population to another – for example, comparing changes in rates of cardiovascular deaths between the UK and India.
Making future projections – for example to aid the planning of healthcare services by estimating likely resource requirements
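
As a rough illustration of the interrupted time series idea, the sketch below fits a straight line to the period before a programme, projects it forward, and compares the projection with what was observed afterwards. The prevalence figures, the variable names and the `fit_line` helper are all hypothetical, and a real evaluation would normally use segmented regression with proper uncertainty estimates.

```python
# Hypothetical monthly smoking prevalence (%), invented for illustration only.
before = [20.0, 19.8, 19.6, 19.4, 19.2, 19.0]   # months 0-5, pre-programme
after = [18.0, 17.8, 17.6, 17.4, 17.2, 17.0]    # months 6-11, post-programme


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


# Fit the trend that was already under way before the programme started.
pre_months = list(range(len(before)))
intercept, slope = fit_line(pre_months, before)

# Project that trend forward: what might have been expected without the programme.
post_months = list(range(len(before), len(before) + len(after)))
projected = [intercept + slope * month for month in post_months]

# Average gap between what was observed and what the old trend predicted.
gaps = [obs - proj for obs, proj in zip(after, projected)]
mean_gap = sum(gaps) / len(gaps)

print(f"Pre-programme trend: {slope:.2f} percentage points per month")
print(f"Mean difference from projected trend: {mean_gap:.2f} percentage points")
```

Projecting the earlier trend matters because prevalence was already falling in this invented example, so a simple before-and-after comparison of means would overstate the apparent effect.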

Analysis of time-trend studies

The most obvious first step in assessing a trend is to plot the observations of interest by year (or some other time period deemed appropriate). The observations can also be examined in tabular form. These steps form the basis of subsequent analysis and provide an overview of the general shape of the trend, help identify any outliers in the data, and allow the researcher to become familiar with the rates being studied.

Detailed knowledge of the statistical methods used in analysis is beyond the scope of the DFPH examination, but methods include:

Regression analysis (if the trend can be assumed to be linear)
Mann-Kendall test (a non-parametric method which tests for a consistently increasing or decreasing trend and does not require the trend to be linear)
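
To make the two methods concrete, here is a minimal sketch using only the Python standard library. It uses the fictional annual cost figures from the moving average example later in these notes; the function names `linear_trend` and `mann_kendall` are illustrative, the Mann-Kendall version omits the correction for tied values, and the normal approximation it relies on is crude for a series this short.

```python
import math
from statistics import NormalDist

# Fictional annual costs (£million) from the moving average example in these notes.
years = [2010, 2011, 2012, 2013, 2014, 2015]
costs = [80, 100, 120, 101, 94, 111]


def linear_trend(xs, ys):
    """Least-squares slope: the average change in y per unit of x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def mann_kendall(values):
    """Mann-Kendall trend test with no correction for tied values.

    Returns (S, Z, two-sided p-value) using the normal approximation.
    """
    n = len(values)
    s = 0
    # S counts later-minus-earlier pairs that rise (+1) or fall (-1).
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    variance = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return s, z, p_value


slope = linear_trend(years, costs)
s_stat, z_stat, p_value = mann_kendall(costs)

print(f"Linear trend: {slope:.2f} £million per year")
print(f"Mann-Kendall S = {s_stat}, Z = {z_stat:.2f}, p = {p_value:.2f}")
```

A positive S means that increases outnumber decreases across all pairs of years. The p-value indicates whether that imbalance is larger than would be expected by chance.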

Time series analysis

Time series analysis refers to a particular collection of specialised regression methods that illustrate trends in the data. It involves a complex process that incorporates information from past observations and past errors in those observations into the estimation of predicted values. Briefly, there are three types of modelling used to analyse time series data: autoregressive (AR) models, integrated (I) models and moving average (MA) models.

Autoregression is based on the premise that past observations have an effect on the current observation, and the number of previous observations that contribute to the current observation can be varied in the model. For example, in a first-order autoregressive model – AR(1) – the current observation is predicted only by the immediately preceding value, and in a second-order model – AR(2) – the current observation is predicted by the previous two observations, and so on.

Moving average models are slightly different. Instead of using past observed values as predictors, they use the errors of previous forecasts. Again, the number of previous forecast errors used in the model can be set, so an MA(1) model uses only the error of the previous forecast.

The AR and MA models can be combined to produce autoregressive moving average (ARMA) models. An assumption in ARMA models is that the time series is stationary (i.e. that the mean and variance are constant over time). However, this isn’t always the case, such as with global temperatures over time. Adding an integrated (I) term, which models the differences between consecutive observations rather than the observations themselves, helps account for any underlying trend (i.e. it makes non-stationary data appear stationary). Such models are known as autoregressive integrated moving average (ARIMA) models.
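
The following sketch illustrates two of these ideas on simulated data: estimating the coefficient of an AR(1) model, and using first differences to remove a steady trend. The series, the coefficient `TRUE_PHI` and the helper names are assumptions made for illustration; in practice a statistical package would be used to fit ARIMA models.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulate a stationary AR(1) series: x_t = phi * x_{t-1} + noise.
TRUE_PHI = 0.6  # hypothetical coefficient chosen for illustration
series = [0.0]
for _ in range(499):
    series.append(TRUE_PHI * series[-1] + rng.gauss(0, 1))


def estimate_ar1(xs):
    """Least-squares estimate of phi from regressing x_t on x_{t-1} (no intercept)."""
    numerator = sum(prev * curr for prev, curr in zip(xs[:-1], xs[1:]))
    denominator = sum(prev ** 2 for prev in xs[:-1])
    return numerator / denominator


def difference(xs):
    """First difference: the 'integrated' step of an ARIMA model with d = 1."""
    return [curr - prev for prev, curr in zip(xs[:-1], xs[1:])]


phi_hat = estimate_ar1(series)
next_forecast = phi_hat * series[-1]  # one-step-ahead AR(1) forecast

# A series with a steady upward trend is non-stationary ...
trending = [2, 5, 8, 11, 14]
# ... but its first differences are constant, so the trend has been removed.
detrended = difference(trending)

print(f"Estimated phi: {phi_hat:.2f} (true value {TRUE_PHI})")
print(f"First differences of {trending}: {detrended}")
```

The estimate will not equal the true coefficient exactly, because it is calculated from a finite, noisy sample.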

Presentation of trend data

Presentations of time-trend data should usually include the following:

Graphical plots displaying the observed data over time
Comment on any statistical methods used to transform the data
The average percent change over the period
An interpretation of the trends seen
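
These notes do not specify how average percent change should be calculated, so the sketch below shows two common readings as assumptions: the simple mean of the year-on-year percent changes, and the compound annual rate implied by the first and last values. It uses the fictional annual cost figures from the moving average example in these notes.

```python
# Fictional annual costs (£million) from the moving average example in these notes.
costs = [80, 100, 120, 101, 94, 111]

# Reading 1: mean of the individual year-on-year percent changes.
yearly_changes = [
    (curr - prev) / prev * 100
    for prev, curr in zip(costs[:-1], costs[1:])
]
mean_change = sum(yearly_changes) / len(yearly_changes)

# Reading 2: the constant annual rate that would take the first value to the last.
periods = len(costs) - 1
compound_change = ((costs[-1] / costs[0]) ** (1 / periods) - 1) * 100

print("Year-on-year changes (%):", [round(c, 1) for c in yearly_changes])
print(f"Mean year-on-year change: {mean_change:.1f}%")
print(f"Compound annual change: {compound_change:.1f}%")
```

The two figures differ because the simple mean is sensitive to large swings in individual years, so whichever method is used should be stated alongside the result.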

Moving averages (or rolling averages) provide a useful way of presenting time series data. (Note that a “moving average” is not the same as a “moving average model”, described above!) Calculating and plotting moving averages highlights long-term trends whilst smoothing out short-term fluctuations, and moving averages are also commonly used in financial analysis. For example, if you have six years of cost data (say, the annual cost of statin prescriptions in the UK; fictional data), as follows:

Year	Cost (£million)
2010	80
2011	100
2012	120
2013	101
2014	94
2015	111

… we can calculate a three-year moving average for each year by taking the average of the value for that year and the values on either side of it. For 2010 this can’t be done as we don’t have the data from the preceding year. For 2011, the moving average value would be the average of the 2010, 2011 and 2012 costs [ (80+100+120)/3 = 100 ]. This could be repeated for 2012, 2013 and 2014. (We can’t calculate the moving average value for 2015 as we don’t have the 2016 data.) This would give us the following:

Year	Cost (£million)	Moving average (£million)
2010	80	-
2011	100	(80+100+120)/3 = 100
2012	120	(100+120+101)/3 = 107
2013	101	(120+101+94)/3 = 105
2014	94	(101+94+111)/3 = 102
2015	111	-
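
The same calculation can be sketched in a few lines of Python. The function name `centred_moving_average` is illustrative, and the figures are the fictional costs from the table above.

```python
def centred_moving_average(values, window=3):
    """Centred moving average; None where the window runs off either end."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    result = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            # Not enough neighbouring years on one side (e.g. 2010 and 2015).
            result.append(None)
        else:
            chunk = values[i - half:i + half + 1]
            result.append(sum(chunk) / window)
    return result


years = [2010, 2011, 2012, 2013, 2014, 2015]
costs = [80, 100, 120, 101, 94, 111]  # fictional data, £million

smoothed = centred_moving_average(costs)
for year, cost, avg in zip(years, costs, smoothed):
    shown = "-" if avg is None else f"{avg:.0f}"
    print(f"{year}\t{cost}\t{shown}")
```

Changing `window` to 5 would average each year with the two years on either side, giving a smoother line but leaving two years at each end without a value.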

We can see from the plots of these two data sets, below, that the moving average (blue solid line) gives smoother results than the original dataset (red dashed line).

[Figure: annual cost (red dashed line) and three-year moving average (blue solid line), 2010 to 2015]

When using moving averages to smooth data, be careful not to average too many years’ worth of data for each calculation (e.g. using 10-year moving averages), as you risk over-smoothing the line and losing potentially important trends.

Interpretation of trend data

The results of time-series designs should be interpreted with caution:¹

Data on exposure and outcome may be collected in different ways for different populations
Migration of populations between any groups during the study period may dilute any difference between the groups
Even within a single population, there may be underlying changes, such as in age structure, which affect the outcome
Seasonal variation can result in fluctuations which affect the outcome trend (although this can be accounted for during analysis)
Such studies usually rely on routine data sources, which may have been collected for other purposes
Ecological studies do not allow us to answer questions about individual risks

References

1. Carneiro I, Howard N. Introduction to Epidemiology. Open University Press, 2011.