What is the difference between patient-specific and aggregate data?
In the two-step approach, the same statistical analysis, covariate set, and definitions of exposures and outcomes are used to calculate an estimate within each individual study. These study-specific estimates are then weighted and pooled using fixed- or random-effects methods, as described above.
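To make the pooling step concrete, here is a minimal sketch of fixed-effect inverse-variance pooling of study-specific estimates; the log hazard ratios and standard errors are hypothetical, and a real analysis would typically use a dedicated meta-analysis package.

```python
import numpy as np

# Hypothetical study-specific log hazard ratios and standard errors
# produced by the first (within-study) step of a two-step analysis.
log_hr = np.array([-0.25, -0.10, -0.30, 0.05])
se = np.array([0.12, 0.20, 0.15, 0.25])

# Fixed-effect inverse-variance weights: more precise studies count more.
w = 1.0 / se**2

# Pooled log hazard ratio and its standard error.
pooled = np.sum(w * log_hr) / np.sum(w)
pooled_se = np.sqrt(1.0 / np.sum(w))

hr = np.exp(pooled)
ci = np.exp([pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se])
print(f"Pooled HR = {hr:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```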
In the one-step approach, the individual data points from all of the studies are fitted together in a single model or set of analyses, rather than calculating estimates for each study individually.
When fitting the model, clustering of participants within studies must be accounted for, or investigators risk finding significant effects when there are none, or vice versa (Abo-Zaid et al.). Clustering can be handled with either a fixed effect approach (e.g., including study as a covariate) or a random effects approach. In general, the one-step approach will produce summary effects similar to the two-step approach, except when the outcome data are binary (Stewart et al.). Even when summary effects are similar, the one-step approach may offer additional flexibility that makes it useful for incorporating more complex statistical methods used in single-study analyses.
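As an illustration only, the sketch below shows one way a one-step analysis might account for clustering, using a random intercept for study in a linear mixed model; the data file, variable names, and model are hypothetical, and the appropriate model depends on the outcome type.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical stacked IPD: one row per participant, with a study identifier.
ipd = pd.read_csv("pooled_ipd.csv")  # columns: study, outcome, exposure, age, sex

# One-step analysis: a random intercept for each study accounts for
# clustering of participants within studies.
random_effects_fit = smf.mixedlm(
    "outcome ~ exposure + age + sex", data=ipd, groups=ipd["study"]
).fit()
print(random_effects_fit.summary())

# A fixed effect alternative instead adjusts for study as a covariate.
fixed_effects_fit = smf.ols(
    "outcome ~ exposure + age + sex + C(study)", data=ipd
).fit()
print(fixed_effects_fit.params["exposure"])
```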
First, one-step approaches allow researchers to focus on the within-study, between-individual differences in effects, adjusting for clustering through either fixed-effect or random-effects models.
Second, the one-step approach allows researchers to compare a number of different models with different assumptions, or to compare nested models, using model fit measures such as the AIC (Stewart et al.).
While this may also be accomplished in a two-step approach, each new model must be fitted within each individual study before pooling, which can be time consuming. Third, the one-step approach may be preferable for addressing questions using longitudinal data (Jones et al.).
In the one-step approach, researchers can account for the correlation of repeated observations in the analysis, rather than losing that information in the pooling step of the two-step approach.
This is expected to yield more appropriate standard errors for the pooled estimate. Lastly, the one-step approach may offer more flexibility to investigate interactions than the two-step approach, allowing, for example, for the fitting of non-linear covariates (Stewart et al.). While the one-step approach allows for increased model complexity and non-linearity of exposures and covariates, models can also be more difficult to interpret and may require additional statistical expertise (Riley et al.).
Further, an important assumption of the one-step approach is that variables are measured in a comparable way in all studies (Smith-Warner et al.). Assessing heterogeneity in an IPD meta-analysis is just as important as in a meta-analysis of AD. In an IPD meta-analysis, heterogeneity is reduced through data harmonization and through selection of the same effect measure across studies.
Sensitivity analyses are also completed for IPD meta-analyses, although they focus less on study-level approaches than sensitivity analyses in meta-analyses of AD. Where some individual data were not attainable, a sensitivity analysis combining published results with the results of the IPD might be conducted (Stewart and Tierney). Other sensitivity analyses might include comparing summary estimates derived from the one-step and two-step approaches, or from fixed-effects and random-effects analyses.
However, instead of reviewing the existing literature and contacting investigators to identify all available existing studies, prospectively planned pooled analyses involve the creation of inclusive collaborative groups to plan future studies.
Of the three methods, prospectively planned pooled analyses take the longest to complete (often several years) and are more costly than meta-analyses of AD. Despite the added costs in time and money, they offer greater control over study design and data measurement than IPD meta-analyses because they are planned prospectively.
The steps in conducting a prospectively planned pooled analysis are the same as those outlined above for IPD meta-analysis and are not further detailed here. Instead, two notable differences are described. Rather than starting with a systematic review of the literature, prospectively planned pooled analyses begin with discussion between investigators to form a collaborative group.
The collaboration is designed to be as inclusive as possible and often includes research centers from several countries. While the rewards of participating are great, a significant amount of time and money is needed to handle the logistics of the collaboration.
Logistics include identifying who will act as the secretary overseeing the collaboration, which center will securely store the data, and who will be responsible for data cleaning (Stewart et al.). Additionally, final reporting and publication will need to be discussed; individual studies may often be restricted from publishing their own findings until the collaborative study results have been published.
Data standardization, rather than data harmonization, is one of the major goals in the planning process. Data standardization aims to establish a uniform way in which variables will be defined, measured, and collected.
The process of standardization may be straightforward when there is a gold standard, but may require further discussion and compromise when several measures are available for assessing a variable. Approaches to variable measurement might differ from country to country or between research centers.
Even though variables of interest are standardized across studies, prospectively planned pooled analyses are not the same as multi-center trials. Centers involved in the pooled analyses are expected to adhere to the same standardized variable measures and some key inclusion criteria for participants, but are allowed to differ in many other regards.
Unlike meta-analyses of AD, IPD meta-analyses and prospectively planned pooled analyses must address additional considerations of participant privacy and confidentiality. Any analysis that utilizes participant-level data must ensure that the data can be stored and accessed securely. This requires that one of the institutions within the collaboration take responsibility for storing and hosting this information. In the case of prospectively planned pooled analyses, these considerations can be taken into account from the beginning and should be incorporated into the planned costs for the project, the protocols submitted to human research review boards at each participating institution, and participants' informed consent documents.
In the case of retrospective pooled analyses, data collection may already be complete. In the process of obtaining the original data, investigators conducting the secondary data analysis must also verify that the original studies had approved human subject protocols. Additionally, it is recognized that study participants likely consented for the original research study, but may not have consented to participate in a secondary study with potentially different research questions.
This process differs from country to country (Philips et al.). A brief table comparing the three quantitative approaches to summarizing evidence across studies is provided for quick reference.

Based on these findings, we provide guidance for determining systematically when standard AD meta-analysis will likely generate robust clinical conclusions, and when the IPD approach will add considerable value.
Most standard reviews continue to rely on published AD [1, 2], and if some eligible trials are unpublished, or reported trial analyses are based on a subset of participants or outcomes, then information may be limited, and AD meta-analyses will be at risk of reporting biases [3].
There are additional considerations for AD meta-analyses evaluating the effects of interventions on time-to-event outcomes, which are frequently based on hazard ratios (HRs), either derived directly from trial publications, or estimated indirectly from published statistics or from data extracted from Kaplan–Meier (KM) curves [4-6]. Inevitably, each of these methods requires more, and stronger, assumptions, which, together with varying lengths of follow-up, could have repercussions for the reliability of the results.
The collection of IPD can help circumvent publication and other reporting biases associated with AD, provided data on unpublished trials and all or most participants and outcomes are obtained, and, if relevant, follow-up is extended beyond the time point of the trial publication [7-10]. Also, IPD enable more complex or detailed analyses, such as the investigation of whether intervention effects vary by participant characteristics [11].
However, it remains unclear whether the IPD approach is always needed for the reliable evaluation of the overall effects, and because these projects can take many years to complete, results may not be sufficiently timely. Moreover, the IPD approach may not be feasible, owing to the expertise and resources required [7, 8] or to difficulties obtaining the necessary data. Hence, patients, clinicians, and policy makers will continue to rely on standard AD meta-analyses. While some guidance is available to help reviewers gauge when AD might suffice and when IPD might add value [8, 12], it is not backed by empirical evidence.
A large systematic review of published AD versus IPD meta-analyses found that conclusions were often similar, but the comparisons could only be made on the basis of statistical significance [13]. For meta-analyses of published time-to-event outcomes, individual case studies have shown that they can produce effects that are larger than, smaller than, or similar to their IPD equivalents [14-23].
A review by Bria and colleagues made a similar comparison [24]. Moreover, both reviews [13, 24] included multiple outcomes from the same meta-analyses, complicating interpretation. Here, for a single outcome, we compare the results from a large cohort of cancer systematic reviews and meta-analyses based on IPD with the best meta-analyses of published AD possible at the time these were completed, to establish when the latter are most likely to be reliable, and when the IPD approach might be required.
The study did not follow a protocol or pre-specified plan. We used a cohort of 18 cancer systematic reviews that included IPD meta-analyses: all of those completed and published by the Meta-analysis Group of the MRC Clinical Trials Unit at University College London over a period of many years [25-36], including updates where relevant. Each IPD review included a comprehensive search for all eligible trials, irrespective of publication status.
Thus, at the time point each IPD meta-analysis was completed, we could ascertain which trials were published and include them in the related AD meta-analysis. This ensured that we were comparing each IPD meta-analysis with a meta-analysis of the published data available at that time. We used the corresponding publications for extraction of AD, and if a trial was reported in multiple publications, we used the one with the most up-to-date or complete information.
Although a variety of research and control interventions were used, overall survival was the primary outcome in all of the meta-analyses, and the HR was the effect measure, so these are used as the basis for all our comparisons. One author (JFT, SB, or DJF) independently extracted all data relevant to the derivation of the HR for the effect of treatment on overall survival and the associated standard error (SE) of its natural logarithm [4, 6], and these data were crosschecked by another author.
These data included reported HRs and SEs, confidence intervals and p-values, numbers of participants randomised and analysed, and numbers of events.
If KM curves were available, we also extracted survival probabilities across a series of time intervals and the related numbers at risk [5, 6], or the actual or estimated [4, 6] minimum and maximum follow-up, to estimate HRs and SEs [4-6]. One author (JFT) reviewed all KM curve estimates to ensure a consistent approach to deciding the number and size of these intervals.
We estimated the HRs and SEs using all possible methods [4-6], but preferentially used estimates calculated directly from the reported observed and expected events or the hazard rates for the research intervention and control groups [4, 6]. If this was not possible, we used HRs and SEs estimated indirectly using a published log-rank, Mantel–Haenszel, or Cox p-value, and either the associated confidence interval or the number of events, provided the confidence intervals and p-values were given to at least 2 significant figures [4].
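To illustrate what such an indirect estimate involves, here is a rough sketch of one commonly used calculation that recovers a log HR and its SE from a reported two-sided log-rank p-value and the total number of events, assuming roughly 1:1 randomisation; the input values are hypothetical, and the actual calculations in the review followed the cited methods [4].

```python
from math import sqrt, exp
from scipy.stats import norm

def hr_from_logrank_p(p_two_sided, total_events, favours_research=True):
    """Indirect estimate of the log HR and its SE from a two-sided log-rank
    p-value and the total number of events, assuming ~1:1 randomisation."""
    var = total_events / 4.0              # approximate variance of O - E
    z = norm.isf(p_two_sided / 2.0)       # |z| corresponding to the p-value
    o_minus_e = (-z if favours_research else z) * sqrt(var)
    log_hr = o_minus_e / var              # log HR = (O - E) / V
    se = 1.0 / sqrt(var)                  # SE(log HR) = 1 / sqrt(V)
    return log_hr, se

# Hypothetical trial report: p = 0.03 favouring the research arm, 200 deaths.
log_hr, se = hr_from_logrank_p(0.03, 200)
print(f"HR = {exp(log_hr):.2f}, SE(log HR) = {se:.3f}")
```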
This meant we used the best possible estimate of each trial HR. We matched each AD meta-analysis to the relevant IPD meta-analysis in terms of both the intervention comparisons and the analyses.
Thus, if treatment effects were reported by participant subgroup, the subgroup HRs and SEs were combined using a fixed-effect inverse-variance meta-analysis to provide an appropriate AD estimate for the whole trial or treatment comparison. For a small number of 3-arm trials, we combined very similar treatment arms to provide a single estimate of treatment versus control; whilst not best practice, this replicated the original analyses.
For multi-arm trials with treatment comparisons that were eligible for different meta-analyses or a single treatment comparison that was eligible for more than 1 meta-analysis, estimates for the individual comparisons were included as appropriate.
We also performed sensitivity analyses using the DerSimonian and Laird random-effects model [37-39]. All data included in these analyses were aggregate in nature, whether derived from trial publications or from the original analyses of anonymised participant data, and therefore ethical approval was not required. Estimates were compared on the log scale throughout, because the log HR is approximately normally distributed.
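For readers unfamiliar with the random-effects model used in these sensitivity analyses, the following sketch shows the DerSimonian and Laird calculation applied to hypothetical study-level log HRs and SEs; a published analysis would normally rely on established meta-analysis software rather than hand-rolled code.

```python
import numpy as np

def dersimonian_laird(log_hr, se):
    """DerSimonian-Laird random-effects pooling of study-level log HRs."""
    w = 1.0 / se**2
    fixed = np.sum(w * log_hr) / np.sum(w)          # fixed-effect estimate
    q = np.sum(w * (log_hr - fixed) ** 2)           # Cochran's Q
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(log_hr) - 1)) / c)    # between-trial variance
    w_star = 1.0 / (se**2 + tau2)                   # random-effects weights
    pooled = np.sum(w_star * log_hr) / np.sum(w_star)
    pooled_se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, pooled_se, tau2

# Hypothetical trial-level estimates.
log_hr = np.array([-0.35, -0.05, -0.50, 0.10, -0.20])
se = np.array([0.15, 0.20, 0.25, 0.18, 0.12])
pooled, pooled_se, tau2 = dersimonian_laird(log_hr, se)
print(np.exp(pooled), pooled_se, tau2)
```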
We used paired t tests to assess whether log HRs and SEs from AD differed on average from their IPD equivalents, recognising that the statistical significance of these tests relates to the amount of data available. At the trial level, we also used ANOVA to investigate whether the estimation method (direct, indirect, or KM curve) influenced the extent of agreement.
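A minimal sketch of these two checks, using made-up paired trial-level estimates and an invented grouping by estimation method, might look like this:

```python
import numpy as np
from scipy import stats

# Hypothetical paired trial-level log HRs from the AD and IPD analyses.
ad_log_hr = np.array([-0.31, -0.12, -0.45, 0.05, -0.22, -0.38])
ipd_log_hr = np.array([-0.25, -0.10, -0.30, 0.02, -0.28, -0.35])

# Paired t test: do the AD estimates differ on average from the IPD ones?
t_stat, p_paired = stats.ttest_rel(ad_log_hr, ipd_log_hr)

# One-way ANOVA: does the AD-versus-IPD difference depend on how the AD
# estimate was obtained (directly, indirectly, or from a KM curve)?
method = np.array(["direct", "direct", "indirect", "km", "indirect", "km"])
diff = ad_log_hr - ipd_log_hr
groups = [diff[method == m] for m in np.unique(method)]
f_stat, p_anova = stats.f_oneway(*groups)

print(p_paired, p_anova)
```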
The Bland–Altman method also allowed us to examine whether agreement was associated with trial or meta-analysis characteristics. This involved plotting the differences between the AD and IPD log HRs against each characteristic and testing for a non-zero regression slope for the average agreement and for non-constant limits of agreement [40]. As described above, we initially plotted these differences against their averages, thus testing whether agreement improves or worsens with increasing size of the estimates [42].
We then went on to examine whether agreement was associated with the number of trials, participants, and events in the AD meta-analysis, as well as the proportion of trials, participants, and events in the AD meta-analysis relative to the IPD analysis. Regression slopes were reported as standardised beta coefficients. Subsequently, we also used sensitivity analyses to assess whether agreement at the meta-analysis level might be improved by excluding trials whose reported analyses were at potential risk of bias [43] from incomplete outcome data, or that had limited or imbalanced follow-up.
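The regression part of this assessment could, in principle, be sketched as follows; the trial characteristic and values are invented, and the formal test for non-constant limits of agreement follows the cited method [40] rather than this simplified version.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical paired trial-level log HRs and a trial-level characteristic
# (here, the number of events available in the AD analysis).
ad_log_hr = np.array([-0.31, -0.12, -0.45, 0.05, -0.22, -0.38])
ipd_log_hr = np.array([-0.25, -0.10, -0.30, 0.02, -0.28, -0.35])
events = np.array([120.0, 300.0, 80.0, 450.0, 210.0, 150.0])

diff = ad_log_hr - ipd_log_hr
z = lambda x: (x - x.mean()) / x.std(ddof=1)   # standardise, so slopes are betas

# Does average agreement drift with the characteristic? (non-zero slope test)
mean_fit = sm.OLS(z(diff), sm.add_constant(z(events))).fit()

# Do the limits of agreement widen or narrow with the characteristic?
# (simplified check: regress absolute residuals on the characteristic)
spread_fit = sm.OLS(np.abs(mean_fit.resid), sm.add_constant(z(events))).fit()

print(mean_fit.params, mean_fit.pvalues)
print(spread_fit.params, spread_fit.pvalues)
```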
Trials in which more than half of participants were estimated to have been censored prior to what would be considered an appropriate follow-up time for the site and stage of cancer (Table 1) were considered to have insufficient follow-up. We classified these based on the reported KM curves and the extracted or estimated levels of censoring. Note that only trials judged to be at low risk of bias in terms of randomisation sequence generation and allocation concealment, based on information supplied by investigators and checking of the IPD, were included in our IPD meta-analyses.
We used these results to construct a decision tree for assessing when AD meta-analyses are most likely to be reliable, making it only as generalisable as the data allow. Across the 18 systematic reviews, 5 of the included trials were eligible for inclusion in 2 separate meta-analyses.
As well as being able to include trials that had not been published, and trials that had not been reported in sufficient detail, we were also able to obtain additional participants that had been excluded from published analyses and additional events arising from updated follow-up. Where estimation from a KM curve was the best available method, the associated numbers at risk were reported for only 4 trials, so the minimum and maximum follow-up were used by default to estimate censoring [4].
Figure legend: The red horizontal line represents no difference between the AD and IPD estimates (i.e., a ratio of 1). Dashed and dotted lines represent statistical precision around the average ratio and the limits of agreement, respectively. Individual data points are distinguished by whether the AD estimate was derived directly from a reported HR, indirectly from a reported p-value and associated information, or indirectly from a Kaplan–Meier curve [6].
Figure legend: The red horizontal lines represent no difference (i.e., ratios of 1). Dashed and dotted lines represent statistical precision around the average ratios and the limits of agreement, respectively. Comparisons are ordered by the degree of disagreement.

We also found no evidence that the limits of agreement narrowed when trials whose published analyses were at potential risk of bias from incomplete outcome data, or that had limited or imbalanced follow-up, were excluded (Table 2).
Statistical evidence for some of these associations was less clear under a random-effects model, whereas others remained significant. Hence, ascertaining the absolute and relative information size of the available AD is a critical part of determining whether a meta-analysis of published HRs is sufficient for robust syntheses, and when IPD might be needed (Fig 6). Intuitively, establishing information size should also be a goal for AD meta-analyses of other outcomes and effect measures.
For time-to-event outcomes and binary outcomes, information size will mostly relate to the number of participants and events, and for continuous outcomes, to the number of participants.
For accuracy, this assessment needs to be based on all trials, whether published, unpublished, or ongoing, and on the actual or projected accrual figures for each. If the absolute information size is small, an AD meta-analysis will lack power and be unreliable. Also, the collection of IPD will add little value unless it can bring about an increase in the number of participants or events (Fig 6). If the absolute information size is deemed sufficient, but AD are only available for a small proportion of the eligible participants or the number of events is low, it follows that the relative information size will be small, and any AD estimate is likely to be unreliable.
If further AD are not available, the collection of IPD could be very valuable in increasing the number of participants or events (Fig 6). Where both the absolute and relative information sizes are deemed sufficient, the collection of IPD would only be useful if an intervention effect has been detected and more detailed analyses are required. Our results also suggest that there may still be uncertainty in the size and direction of effect, which could influence any decision to collect IPD.
By applying these limits to a plausible range of AD meta-analysis HRs, we can gauge which observed HRs are most likely to be reliable (Table 3). For an observed AD meta-analysis HR close to the null, for example, it may remain unclear whether a true effect exists.
Hence, IPD might be needed to provide a greater degree of certainty about whether an effect exists, and about its size and precision (Fig 6). Note that our example HR ranges purposefully leave gaps, reflecting regions where the reliability of AD and the need for IPD may be context-specific and harder to judge (Table 3).

Purpose: The purpose of this paper is to show the link between repeated measurement models used with aggregate data and those used when individual patient data (IPD) are available, and to provide guidance on the methods that practitioners might use for aggregate data meta-analyses, depending on the type of data available.
Methods: We discuss models for the meta-analysis of longitudinal continuous outcome data when IPD are available. In these models, time is included either as a factor or as a continuous variable, and account is taken of the correlation between repeated observations. The meta-analysis of IPD can be conducted using either a one-step or a two-step approach: the latter involves analysing the IPD separately in each study and then combining the study estimates, taking into account their covariance structure.
We discuss the link between models for use with aggregate data and the two-step IPD approach, and the problems which arise when only aggregate data are available.
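For the second stage of the two-step approach described above, a minimal sketch of multivariate fixed-effect pooling is given below; the study estimates at three hypothetical follow-up times and their within-study covariance matrices are invented, and the models discussed in the paper may use richer (e.g., random-effects) structures.

```python
import numpy as np

def multivariate_fixed_effect(betas, covs):
    """Pool per-study vectors of time-specific effect estimates, weighting by
    the inverse of each study's within-study covariance matrix (generalised
    least squares, i.e., multivariate fixed-effect meta-analysis)."""
    w_sum = np.zeros_like(covs[0])
    wb_sum = np.zeros_like(betas[0])
    for b, s in zip(betas, covs):
        w = np.linalg.inv(s)      # precision matrix for this study
        w_sum += w
        wb_sum += w @ b
    pooled_cov = np.linalg.inv(w_sum)
    return pooled_cov @ wb_sum, pooled_cov

# Hypothetical effect estimates at three follow-up times for two studies,
# with correlated within-study errors.
betas = [np.array([-0.40, -0.50, -0.60]), np.array([-0.20, -0.30, -0.50])]
covs = [0.04 * np.eye(3) + 0.01, 0.06 * np.eye(3) + 0.02]
pooled_beta, pooled_cov = multivariate_fixed_effect(betas, covs)
print(pooled_beta)
```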