Author Credentials

Zeid Khitan MD, Anna P. Shapiro MD, Preeya T. Shah MS, Juan Sanabria MD, Prasanna Santhanam MD, Komal Sodhi MD, Nader G. Abraham PhD, Joseph I. Shapiro MD


Keywords: hypertension, blood pressure, chronic renal disease, correlation, machine learning, cardiovascular disease




Background: Understanding the factors that predict progression of renal failure is of great interest to clinicians.

Objectives: We examined machine learning methods to predict the composite outcome of death, dialysis or doubling of serum creatinine using the Modification of Diet in Renal Disease (MDRD) data set.

Methods: We evaluated a generalized linear model, a support vector machine, a decision tree, a feed-forward neural network and a random forest, each assessed with 10-fold cross-validation using the caret package in the open-source R environment.

Results: Using clinical parameters available at entry into the study, these machine learning methods, trained on 70% of the MDRD population, had prediction accuracies ranging from 66% to 77% on the remaining 30%. Although the support vector machine appeared to have the highest accuracy, all models studied worked relatively well.

Conclusions: These results illustrate the utility of employing machine learning methods within R to address the prediction of long-term clinical outcomes using initial clinical measurements.
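
Illustration: The evaluation protocol summarized above (a 70%/30% train/test split, 10-fold cross-validation on the training portion, and accuracy on the held-out portion) can be sketched in a few lines. This is a minimal sketch in Python rather than the R/caret code the study used; the function names, the random seed and the cohort size of 100 rows are hypothetical and chosen only for illustration, and no model is fitted here.

```python
import random


def holdout_split(n_rows, train_fraction=0.7, seed=0):
    """Shuffle row indices and split them into train and test lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_rows * train_fraction))
    return indices[:cut], indices[cut:]


def k_fold_indices(train_indices, k=10):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    # Deal the training rows into k folds, round-robin.
    folds = [train_indices[i::k] for i in range(k)]
    for i, validate in enumerate(folds):
        # Fit on every fold except the one held out for validation.
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate


def accuracy(predicted, observed):
    """Fraction of predictions that match the observed outcomes."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical cohort size, used only to show the shape of the splits.
train_idx, test_idx = holdout_split(n_rows=100)
folds = list(k_fold_indices(train_idx, k=10))
```

In a real analysis, each candidate model would be fitted on the `fit` rows and tuned against the `validate` rows of every fold, and `accuracy` would then be computed once on the rows in `test_idx`.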

Included in

Nephrology Commons