The results presented here demonstrate, based on a large sample, a significant correlation between the individual disciplines of the BFT (11 × 10-m sprint test, flexed-arm hang and 1000-m run) and the output measured during a bicycle ergometer test completed by the examined cohort of young subjects. The results also show that the BFT provides a similar assessment of physical fitness. The 4.0 % rate of false-negative and false-positive results, however, showed that the bicycle ergometer test underestimated the level of physical fitness and fitness for service, in contrast with the BFT, which was designed to test more than maximal aerobic power alone. In addition to the size of the sample, the diversity of the areas of physical fitness examined is a distinguishing feature of the present study.
Bicycle ergometer tests are controversial as part of assessments [15, 16]. The reliability of these tests is rated as low in regard to excluding cardiac arrhythmia and detecting coronary heart diseases in a young and healthy cohort [17, 18]. Bicycle ergometer tests are therefore not recommended as part of medical check-ups for athletes under the age of 35 or 40 [19–24]. It is also widely accepted that bicycle ergometer tests are only suitable to evaluate bicycle-specific fitness. Moreover, subjects with a higher body weight have the advantage that the impact of their additional weight is reduced in this evaluation. Ergometer tests nevertheless have advantages over separate physical fitness tests: they can be performed in a standardized way, independent of weather conditions, and require little space. It should be noted, however, that they require more personnel.
The BFT presented here requires little logistical effort. It merely requires a suitable gym for the sprint and the flexed-arm hang tests, the necessary materials (mats, cones to mark the floor, pull-up bar) and a track suitable for running a 1000-m distance (e.g., a 400-m track). The high correlation of the individual BFT disciplines and the overall BFT score with the bicycle ergometer test (both in absolute values and relative to body mass) indicates that the BFT can be used as an alternative to bicycle ergometer tests. Because it targets the physical fitness skills of endurance, strength and speed, the BFT is the more suitable test procedure.
The high correlation between the individual BFT items and the bicycle ergometer test was confirmed for other tests as well. Williford et al. analyzed the correlation between bicycle ergometer tests and maximal treadmill tests and found a correlation of r = 0.74. The maximal oxygen uptake (VO2max), however, was considerably lower during the bicycle ergometer test (−17 %). Basset et al. also found comparability between bicycle and treadmill ergometer tests when examining 6 triathletes, 6 runners and 6 cyclists. They came to the conclusion that in both tests the heart rate and the percentage of the VO2max were comparable. Jaskólska et al. likewise observed a high correlation (r = 0.71–0.86) between these two types of ergometer tests in their examination of 32 male subjects. Although Carey et al. did not identify any differences in the maximum heart rate and the VO2max in their examination of 16 experienced triathletes, they did detect significant differences regarding the determination of the anaerobic threshold. Although we can generally assume a relatively good correlation between both test systems, we must take into consideration that both tests were conducted as stationary laboratory tests and that the majority of examined subjects were athletes who were well-trained in the relevant athletic disciplines.
In a study by Grant et al., a very high correlation (r = 0.92) was observed in 22 young male subjects between the 12-min Cooper test and submaximal cycling output. The results of our study are consistent in this respect because we also observed a very high correlation between the 1000-m run and the bicycle ergometer test output.
The study by Grant et al. and a study by Cairney et al. report a high correlation between shuttle runs and bicycle ergometer tests. Grant et al. conducted a multi-stage progressive shuttle run test and detected a correlation of r = 0.86 with the bicycle ergometer test, while Cairney et al. examined children doing a 20-m shuttle run and found a correlation of r = 0.71. These results are thus also consistent with the findings of the present study, although the test structures used in the cited studies differ from the shuttle run (11 × 10-m sprint test) examined by us.
Whereas subjects with a higher body weight have an advantage in bicycle ergometer tests, a review carried out by Vanderburgh showed that in the fitness tests common in the US Army, Air Force and Navy, subjects with lower body weight were able to perform better. The present study also demonstrates that weight has an influence on the results achieved during the flexed-arm hang. Only the relative output measured during the bicycle ergometer test correlated with the flexed-arm hang output. In addition, regression analysis variance increased for all test items when the output in the BFT disciplines was compared with the relative output measured during the bicycle ergometer test. This is not surprising, as subjects with a higher body weight achieve considerably lower results in the flexed-arm hang test in particular, whereas in the bicycle ergometer test their results are higher in comparison.
This study does have some limitations. Because the data analysis was retrospective, ergometer test and BFT data were compared irrespective of how much time had passed between the tests. We therefore cannot rule out that the physical fitness of the subjects had improved or worsened significantly during this period. Among other things, this could explain the number of false-negative and false-positive results. To provide reliable information on the correlation between the bicycle ergometer test and BFT output, a prospective randomized study with short intervals between the two tests should be conducted. This, however, was beyond the scope of this study.
Because the overall proportion of women in the German armed forces is low (approx. 10 %), only men were included in this study. Therefore, these results cannot be generalized to physical fitness examinations of women. Moreover, it is possible that the group of subjects on which the study is based is not representative of the respective locations. It is conceivable, for example, that only particularly unathletic or sick persons or, on the contrary, especially fit or healthy persons presented to the specialist clinic. This can be considered unlikely because many different locations and unit physicians referred personnel to the Specialist Clinic for Internal Medicine for medical examination, and because an interim evaluation of the cohort of temporary career volunteers used for the analysis (comprising the assessments of the period from 2007 to 2010) includes both soldiers with a high level of physical fitness and a considerable number of soldiers who were unfit for service. It can therefore be assumed that the overall sample of 323 soldiers has not been affected by significant selection bias through the referral/presentation of subjects.