Type | Method | Strengths | Limitations |
---|---|---|---|
ML | PCA | It retains most of the principal information and is computationally simple | Some important information may be lost, and the resulting components are hard to interpret |
 | mRMR | It is well suited to multi-class classification tasks | The correlation between feature interactions (crosses) and the target variable is ignored |
 | LASSO | It handles multicollinearity well, and the results are easy to interpret | It tends to select only one feature from a group of highly correlated features |
 | CV | It evaluates models more reasonably and accurately and extracts more useful information from limited data | It increases the computational cost |
 | SMOTE | It overcomes the overfitting problem of simple over-sampling | Important parameters require repeated tuning |
 | LR | It has low computational cost, runs fast, and is easy to understand and implement | It handles only binary classification tasks and is prone to underfitting |
 | SVM | It can solve high-dimensional problems and has strong generalization ability | Conventional SVM handles only binary classification, and training on large samples is inefficient |
 | KNN | It is suitable for nonlinear classification and achieves high Acc | It requires a lot of memory, and prediction bias is large when the samples are imbalanced |
 | DT | It can be analyzed visually and runs fast | It overfits easily and ignores correlations among attributes in a dataset |
 | RF | It is suitable for high-dimensional data and adapts well to diverse datasets | It handles low-dimensional data poorly and is much slower than DT |
 | Cox regression model | It is highly flexible and makes no assumption about the data distribution | It may not achieve the best fit for every dataset |
 | Naïve Bayes | Its results are easy to interpret, and it performs well on small datasets | It is sensitive to the form of the input data |
DL | 3D-CNN | It handles high-dimensional data easily, and feature extraction is automatic | Results are difficult to interpret, and much valuable information may be lost |
 | ANN | It achieves high classification Acc with strong robustness and fault tolerance | Results are difficult to interpret, and many parameters must be set |
SM | t-test | It is easy to explain, robust, and controls for individual differences well | It cannot be used for multiple comparisons; it only tests whether the difference between two means is significant |
 | Mann–Whitney U test | It makes no assumption about the data distribution | When the data are normally distributed with homogeneous variance, it is less efficient than the t-test |
 | Spearman correlation analysis | It is suitable for nonlinear relationships and for both continuous and discrete data | It is less efficient than the Pearson correlation coefficient |
 | Kaplan–Meier analysis | It supports a variety of test methods and is easy to implement | It can only perform univariate analysis |
 | Log-rank test | It combines data from all time points in the analysis | It requires the proportional hazards assumption to hold and only performs univariate analysis |
 | Fisher’s exact test | It is suitable for small samples and can accurately calculate the significance of deviations from the null hypothesis | It is applicable only when the sample size n < 40 or a theoretical frequency T < 1 |
 | Chi-square test | It is convenient, concise, and widely used | It is more complex than the t-test, and its test efficiency is lower |
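
To make the ML rows of the table concrete, the sketch below chains several of the listed methods (SMOTE, PCA, LASSO, CV, and three of the classifiers) on synthetic data. It is a minimal sketch, assuming scikit-learn and imbalanced-learn are installed; the dataset and all variable names are illustrative, not drawn from any study reviewed here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from imblearn.over_sampling import SMOTE

# Synthetic, imbalanced binary dataset standing in for real radiomic features.
X, y = make_classification(n_samples=300, n_features=50, n_informative=10,
                           weights=[0.8, 0.2], random_state=0)

# SMOTE: synthesize minority-class samples instead of duplicating them.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)

# PCA and LASSO are alternative dimension-reduction routes from the table;
# PCA keeps enough components to explain 95% of the variance ...
X_pca = PCA(n_components=0.95).fit_transform(X_res)

# ... while LASSO keeps only features with nonzero coefficients.
lasso = LassoCV(cv=5).fit(X_res, y_res)
X_sel = X_res[:, lasso.coef_ != 0]

# CV: 5-fold cross-validated Acc for three classifiers from the table,
# run on the LASSO-selected features.
for name, clf in [("LR", LogisticRegression(max_iter=1000)),
                  ("SVM", SVC()),
                  ("RF", RandomForestClassifier(random_state=0))]:
    print(name, cross_val_score(clf, X_sel, y_res, cv=5).mean().round(3))
```

In practice, SMOTE and feature selection should be refit inside each CV fold (e.g., with imblearn's Pipeline) to avoid information leakage; they are applied globally above only for brevity.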
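The hypothesis tests and correlation analysis in the SM rows map directly onto scipy.stats. A minimal sketch follows; the two group samples and the 2×2 contingency table are made-up numbers chosen only to exercise each call.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 30)  # e.g., a feature measured in group A
b = rng.normal(0.5, 1.0, 30)  # the same feature measured in group B

t_stat, p_t = stats.ttest_ind(a, b)     # t-test: difference between two means
u_stat, p_u = stats.mannwhitneyu(a, b)  # Mann-Whitney U: no distribution assumed
rho, p_s = stats.spearmanr(a, b)        # Spearman: rank-based correlation

table = np.array([[8, 2], [1, 9]])      # small 2x2 contingency table
odds, p_f = stats.fisher_exact(table)   # Fisher's exact: exact p for small n
chi2, p_c, dof, _ = stats.chi2_contingency(table)  # chi-square: large-sample test

print(f"t-test p={p_t:.3f}  Mann-Whitney p={p_u:.3f}  Spearman rho={rho:.2f}")
print(f"Fisher exact p={p_f:.3f}  chi-square p={p_c:.3f} (dof={dof})")
```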
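Kaplan–Meier analysis, the log-rank test, and the Cox regression model are all available in the lifelines package. The sketch below assumes lifelines is installed; the follow-up times, event flags, and covariate values are made up, and the tiny dataset is only meant to show the API.

```python
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test

# Made-up follow-up data: time in months, event=1 if the event was observed
# (0 = censored), two groups to compare, and one covariate for the Cox model.
df = pd.DataFrame({
    "time":  [5, 8, 12, 14, 20, 21, 25, 30, 31, 40],
    "event": [1, 1, 0, 1, 0, 1, 1, 0, 1, 0],
    "group": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "age":   [60, 55, 70, 62, 48, 66, 59, 73, 51, 68],
})

# Kaplan-Meier: univariate estimate of the survival function.
kmf = KaplanMeierFitter().fit(df["time"], event_observed=df["event"])
print(f"median survival: {kmf.median_survival_time_}")

# Log-rank test: compares the two groups using all time points.
g0, g1 = df[df["group"] == 0], df[df["group"] == 1]
res = logrank_test(g0["time"], g1["time"],
                   event_observed_A=g0["event"], event_observed_B=g1["event"])
print(f"log-rank p = {res.p_value:.3f}")

# Cox proportional hazards: multivariate, no assumption on the baseline hazard.
cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
cph.print_summary()
```

Note how the Cox model accepts multiple covariates (here, group and age) while Kaplan–Meier and the log-rank test remain univariate, matching the limitations listed in the table.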