Meta-analysis with zero-event studies: a comparative study with application to COVID-19 data

Background: Meta-analysis is a statistical method for synthesizing evidence from a number of independent studies, including clinical studies with binary outcomes. In practice, zero events in one or both groups may cause statistical problems in the subsequent analysis.

Methods: Taking the relative risk as the effect size, we conduct a comparative study of four continuity correction methods and a state-of-the-art method that requires no continuity correction, namely the generalized linear mixed model (GLMM). To further advance the literature, we also introduce a new continuity correction method for estimating the relative risk.

Results: In the simulation studies, the new method performs well in terms of mean squared error when there are few studies. In contrast, the GLMM performs best when the number of studies is large. In addition, a reanalysis of recent coronavirus disease 2019 (COVID-19) data shows that the double-zero-event studies affect the estimate of the mean effect size.

Conclusions: We recommend the new method for handling zero-event studies when a meta-analysis contains few studies, and the GLMM when the number of studies is large. The double-zero-event studies may be informative, and so we suggest not excluding them.

Supplementary Information: The online version contains supplementary material available at 10.1186/s40779-021-00331-6.

Since $(2n_1 + 3c_1)/(3n_1 + 4c_1) > 0.5$, the squared bias of $(X_1 + c_1)/(n_1 + 2c_1)$ is lower than that of $(X_1 + c_1)/(n_1 + c_1)$ in more than half of the settings. We also note that the variance of $(X_1 + c_1)/(n_1 + 2c_1)$ is always smaller than the variance of $(X_1 + c_1)/(n_1 + c_1)$. Hence, the MSE of $(X_1 + c_1)/(n_1 + 2c_1)$ is smaller than the MSE of $(X_1 + c_1)/(n_1 + c_1)$ in most settings.
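The comparison above can be checked numerically. For $X_1 \sim \mathrm{Binomial}(n_1, p_1)$, the biases of the two estimators are $c_1(1 - 2p_1)/(n_1 + 2c_1)$ and $c_1(1 - p_1)/(n_1 + c_1)$, so the first has the lower squared bias exactly when $p_1 < (2n_1 + 3c_1)/(3n_1 + 4c_1)$. The following is a minimal sketch using these closed-form moments; the function names and the grid of $p_1$ values are illustrative choices, not part of the paper.

```python
def moments(n, p, c, k):
    """Exact squared bias, variance and MSE of (X + c) / (n + k*c), X ~ Binomial(n, p)."""
    denom = n + k * c
    bias = (n * p + c) / denom - p        # E[(X + c) / denom] - p
    var = n * p * (1 - p) / denom ** 2    # Var[(X + c) / denom]
    return bias ** 2, var, bias ** 2 + var


def bias_threshold(n, c):
    """Value of p below which the (n + 2c) denominator has the smaller squared bias."""
    return (2 * n + 3 * c) / (3 * n + 4 * c)


def compare(n, c, grid_size=99):
    """For p = 0.01, ..., 0.99 report whether the (n + 2c) estimator is better
    than the (n + c) estimator in squared bias, variance and MSE."""
    rows = []
    for i in range(1, grid_size + 1):
        p = i / (grid_size + 1)
        b2, v2, m2 = moments(n, p, c, 2)  # denominator n + 2c
        b1, v1, m1 = moments(n, p, c, 1)  # denominator n + c
        rows.append((p, b2 < b1, v2 < v1, m2 < m1))
    return rows


if __name__ == "__main__":
    n, c = 10, 0.5
    rows = compare(n, c)
    print("bias threshold:", bias_threshold(n, c))
    print("share of grid with smaller squared bias:", sum(r[1] for r in rows) / len(rows))
    print("share of grid with smaller variance:", sum(r[2] for r in rows) / len(rows))
    print("share of grid with smaller MSE:", sum(r[3] for r in rows) / len(rows))
```

Running `compare` for other values of `n` and `c` shows how the share of the $p_1$ grid on which each criterion favours the $(n_1 + 2c_1)$ denominator changes with the sample size and the correction.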
Appendix 2: Comparison of the $p_1$ estimates

In this simulation study, we explore the effect of $c_1$ on the family of estimators $p_1(c_1) = (X_1 + c_1)/(n_1 + 2c_1)$ in terms of coverage probability and expected length. To generate the simulation data, we let $p_1$ range from 0.01 to 0.99 and $c_1 = 0.25$, 0.5, 0.75 or 1. We also consider the sample sizes $n_1 = 10$ and 50. With $N = 100{,}000$ repetitions for each setting, we generate random numbers from the binomial distribution with parameters $p_1$ and $n_1$ to yield the estimates of $p_1$ and the CIs. By the delta method, the variance of $\ln[p_1(c_1)]$ is approximately $\mathrm{var}\{\ln[p_1(c_1)]\} \approx 1/(X_1 + c_1) - 1/(n_1 + 2c_1)$. Hence, the 95% confidence interval of $p_1$ is given by

$$\exp\left\{ \ln[p_1(c_1)] \pm 1.96 \sqrt{\frac{1}{X_1 + c_1} - \frac{1}{n_1 + 2c_1}} \right\}.$$

We then compute the frequencies of the true $p_1$ falling in the CIs as the coverage probability estimates. The expected lengths of the CIs on the log scale are computed by $N^{-1} \sum_{s=1}^{N} [\ln(UL_s) - \ln(LL_s)]$, where $UL_s$ and $LL_s$ are the upper and lower limits of the $s$th CI.
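The simulation described in this appendix can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes NumPy, takes the interval to be the usual log-scale Wald interval $\exp\{\ln[p_1(c_1)] \pm 1.96\,\mathrm{SE}\}$ built from the delta-method variance above, and does not truncate the upper limit at 1. The function name and the seed are hypothetical.

```python
import numpy as np


def simulate_p1_interval(p1, n1, c1, n_rep=100_000, z=1.96, seed=0):
    """Estimate coverage probability and expected log-scale length of the
    interval for p1 based on (X1 + c1) / (n1 + 2*c1).

    Assumption: log-scale Wald interval with the delta-method variance
    1/(X1 + c1) - 1/(n1 + 2*c1); no truncation of the limits is applied.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.binomial(n1, p1, size=n_rep)          # N binomial draws
    log_est = np.log((x1 + c1) / (n1 + 2 * c1))    # ln of the estimator
    se = np.sqrt(1.0 / (x1 + c1) - 1.0 / (n1 + 2 * c1))
    log_ll = log_est - z * se                      # ln(LL_s)
    log_ul = log_est + z * se                      # ln(UL_s)
    covered = (log_ll <= np.log(p1)) & (np.log(p1) <= log_ul)
    return covered.mean(), (log_ul - log_ll).mean()


if __name__ == "__main__":
    for c1 in (0.25, 0.5, 0.75, 1.0):
        cov, length = simulate_p1_interval(p1=0.1, n1=10, c1=c1)
        print(f"c1={c1}: coverage={cov:.3f}, expected log-length={length:.3f}")
```

The variance term is always positive because $X_1 + c_1 \le n_1 + c_1 < n_1 + 2c_1$, so the square root is well defined for every draw, including $X_1 = 0$.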
From Additional Fig. 1, it is evident that the CIs with small $c_1$, e.g. $c_1 = 0.25$, yield low coverage probabilities when $p_1$ is close to 1. On the other hand, the CIs with large $c_1$, e.g. $c_1 = 1$, have low coverage probabilities when $p_1$ is close to 0. In addition, the CIs with larger $c_1$ yield shorter expected lengths, but a large $c_1$ may harm the coverage probabilities for small $p_1$. As a compromise, we recommend the intermediate value $c_1 = 0.5$, and our subsequent results show that $c_1 = 0.5$ is indeed a reliable option for estimating RR.
Additional Fig. 1: Comparison of the CIs of $p_1$ with $c_1 = 0.25$, 0.5, 0.75 or 1, and $n_1 = 10$ or 50. The dot-dashed lines represent the simulation results for $c_1 = 0.25$, the solid lines represent the simulation results for $c_1 = 0.5$, the dashed lines represent the simulation results for $c_1 = 0.75$, and the dotted lines represent the simulation results for $c_1 = 1$. CI: Confidence interval

... $p_1 = p_2 \times \mathrm{RR}$ with RR ranging from 0.2 to $\min\{5, 1/p_2\}$. We also consider the sample sizes $n_1 = n_2 = 10$ or 50. With $N = 100{,}000$ repetitions for each setting, we generate random numbers from the binomial distributions with parameters $(p_1, n_1)$ and $(p_2, n_2)$. We then compute the frequencies of the true RR falling in the CIs as the coverage probability estimates. The expected lengths of the CIs on the log scale are computed by $N^{-1} \sum_{s=1}^{N} [\ln(UL_s) - \ln(LL_s)]$, where $UL_s$ and $LL_s$ are the upper and lower limits of the $s$th CI.
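The two-group simulation can be sketched in the same way. The definitions of the RR estimators and of their intervals are not restated in this appendix, so the code below rests on loudly stated assumptions: the estimator is taken to be the ratio of $(X_1 + c_1)/(n_1 + 2c_1)$ to $(X_2 + c_2)/(n_2 + 2c_2)$, the variance of its logarithm is taken to be the sum of the two delta-method variances from Appendix 2 (using independence of the groups), and the RR grid is taken to be equally spaced on the log scale. The function names, the grid size and the seed are hypothetical.

```python
import numpy as np


def rr_grid(p2, n_points=25):
    """RR values from 0.2 to min{5, 1/p2}.
    Assumption: equal spacing on the log scale (not stated in the text)."""
    upper = min(5.0, 1.0 / p2)
    return np.exp(np.linspace(np.log(0.2), np.log(upper), n_points))


def simulate_rr_interval(rr, p2, n1, n2, c1=0.5, c2=0.5,
                         n_rep=100_000, z=1.96, seed=0):
    """Coverage probability and expected log-scale length of a log-scale
    Wald interval for RR.

    Assumptions (illustrative, not confirmed by the text): the estimator is
    [(X1 + c1)/(n1 + 2*c1)] / [(X2 + c2)/(n2 + 2*c2)] and the variance of its
    log is the sum of the per-group delta-method variances.
    """
    rng = np.random.default_rng(seed)
    p1 = min(p2 * rr, 1.0)                 # guard against rounding above 1
    true_log_rr = np.log(p1 / p2)
    x1 = rng.binomial(n1, p1, size=n_rep)
    x2 = rng.binomial(n2, p2, size=n_rep)
    log_rr_hat = (np.log((x1 + c1) / (n1 + 2 * c1))
                  - np.log((x2 + c2) / (n2 + 2 * c2)))
    se = np.sqrt(1.0 / (x1 + c1) - 1.0 / (n1 + 2 * c1)
                 + 1.0 / (x2 + c2) - 1.0 / (n2 + 2 * c2))
    log_ll = log_rr_hat - z * se
    log_ul = log_rr_hat + z * se
    covered = (log_ll <= true_log_rr) & (true_log_rr <= log_ul)
    return covered.mean(), (log_ul - log_ll).mean()


if __name__ == "__main__":
    p2 = 0.15
    for rr in rr_grid(p2, n_points=5):
        cov, length = simulate_rr_interval(rr, p2, n1=10, n2=10)
        print(f"ln(RR)={np.log(rr):+.2f}: coverage={cov:.3f}, length={length:.3f}")
```

Passing different `c1`, `c2`, `n1` and `n2` values gives a rough way to explore other settings, but the output should not be read as a reproduction of the paper's figures.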
From the top four panels of Additional Figs. 2 and 3 with small $p_2$, the CIs associated with $\widetilde{\mathrm{RR}}(0.5, 1)$ and $\widehat{\mathrm{RR}}(0.5, 1)$ yield shorter expected lengths than the other two CIs, but they have low coverage probabilities when $\ln(\mathrm{RR})$ is large. By contrast, the CIs associated with $\widetilde{\mathrm{RR}}(0.5, 0.5)$ and $\widehat{\mathrm{RR}}(0.5, 0.5)$ are able to provide a better performance for large $\ln(\mathrm{RR})$. From the bottom four panels of Additional Figs. 2 and 3 with large $p_2$, we note that the CIs associated with $\widehat{\mathrm{RR}}(0.5, 0.5)$ and $\widehat{\mathrm{RR}}(0.5, 1)$ perform better in terms of coverage probability in most settings. Noting also that the expected lengths of the four CIs are almost the same, we thus conclude that the CI associated with $\widehat{\mathrm{RR}}(0.5, 0.5)$ is the best among the four CIs.
Appendix 4: Simulation study for unbalanced $n_1$ and $n_2$

In this simulation study, we compare the performance of the four existing intervals and the hybrid interval for unbalanced $n_1$ and $n_2$. Specifically, we consider $n_1 = 20$ with $n_2 = 40$, 80 or 160, and consider $n_2 = 20$ with $n_1 = 40$, 80 or 160. The other settings are kept the same as those in the main text.
From the top four panels of Additional Figs. 4 to 9 with small $p_2$, the Haldane and TACC intervals are more stable than the other CIs in terms of coverage probability. We also note that the expected lengths of the Haldane interval are much shorter than those of the TACC interval. From the bottom four panels of Additional Figs. 4 to 9 with large $p_2$, the hybrid interval provides the best performance, with coverage probabilities close to the nominal level when $n_1 < n_2$. When $n_1 > n_2$, the CIs except the Carter interval are almost identical in most settings, as long as $\ln(\mathrm{RR})$ is not very large.

Additional Fig. 2: Comparison of the four CIs of RR with $p_2 = 0.05$, 0.15, 0.85 or 0.95, and $n_1 = n_2 = 10$. The dot-dashed lines represent the simulation results of the CI associated with $\widetilde{\mathrm{RR}}(0.5, 0.5)$, the dashed lines represent the simulation results of the CI associated with $\widetilde{\mathrm{RR}}(0.5, 1)$, the solid lines represent the simulation results of the CI associated with $\widehat{\mathrm{RR}}(0.5, 0.5)$, and the dotted lines represent the simulation results of the CI associated with $\widehat{\mathrm{RR}}(0.5, 1)$. CI: Confidence interval, RR: Relative risk

Additional Fig. 3: Comparison of the four CIs of RR with $p_2 = 0.05$, 0.15, 0.85 or 0.95, and $n_1 = n_2 = 50$. The dot-dashed lines represent the simulation results of the CI associated with $\widetilde{\mathrm{RR}}(0.5, 0.5)$, the dashed lines represent the simulation results of the CI associated with $\widetilde{\mathrm{RR}}(0.5, 1)$, the solid lines represent the simulation results of the CI associated with $\widehat{\mathrm{RR}}(0.5, 0.5)$, and the dotted lines represent the simulation results of the CI associated with $\widehat{\mathrm{RR}}(0.5, 1)$. CI: Confidence interval, RR: Relative risk