Abstract
Background: As organizations increasingly seek data-driven insights, the demand for machine learning (ML) expertise outpaces the current workforce supply. Automated Machine Learning (AutoML) frameworks help close this gap by streamlining the ML pipeline, making advanced modeling accessible to non-specialists.
Objective: This study evaluates the performance of four open-source AutoML frameworks (Auto-Keras, Auto-Sklearn, H2O, and TPOT) in predictive analytics, focusing on both binary and multiclass classification. The goal is to identify performance strengths and limitations under varying dataset conditions and to propose improvements for framework optimization.
Methods: A quantitative experimental research design was employed. Twenty-two publicly available datasets were selected from established benchmarking sources, covering diverse predictive analytics challenges. Framework performance was assessed across twelve data segments, defined by characteristics such as sample size, feature count, and categorical feature proportion. Evaluation metrics were AUC for binary classification and accuracy and F1 for multiclass classification, with standardized runtime constraints applied to ensure comparability.
Results: H2O delivered strong results across diverse datasets, particularly for binary classification. However, no single framework achieved superior performance across all data segments. Auto-Sklearn performed well in multiclass classification, especially with higher feature counts, while Auto-Keras and TPOT showed variable outcomes depending on dataset complexity. Performance declined notably in scenarios with high categorical proportions, severe class imbalance, or extensive missing values.
Conclusion: AutoML frameworks can substantially support predictive analytics but exhibit distinct strengths and limitations under specific data conditions. While H2O proved most robust overall, targeted refinements such as enhancing feature selection in Auto-Keras and improving categorical variable handling in Auto-Sklearn could further optimize performance. The findings provide actionable insights for practitioners selecting frameworks and for developers enhancing AutoML design, and they highlight the need for ongoing innovation to ensure adaptability to complex predictive analytics tasks.
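The evaluation metrics named in the Methods can be illustrated with a minimal, dependency-free sketch. This is not the study's evaluation code: the function names and the toy labels are hypothetical, and macro-averaging for F1 is an assumption, since the abstract does not state which averaging scheme was used.

```python
from collections import defaultdict


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    classes = set(y_true) | set(y_pred)
    f1_scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        f1_scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data invented purely for illustration (not taken from the study).
binary_truth = [0, 0, 1, 1]
binary_scores = [0.1, 0.4, 0.35, 0.8]
multi_truth = ["a", "b", "c", "a", "b", "c"]
multi_pred = ["a", "b", "a", "a", "c", "c"]

auc_value = roc_auc(binary_truth, binary_scores)
acc_value = accuracy(multi_truth, multi_pred)
f1_value = macro_f1(multi_truth, multi_pred)
print(auc_value, acc_value, f1_value)
```

The pairwise AUC loop is quadratic in the number of samples, which keeps the definition visible; a library routine would be the more practical choice on datasets of realistic size.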
Suggested Citation
Nicolas Leyh. "Automated Machine Learning in Action: Performance Evaluation for Predictive Analytics Tasks," Acta Informatica Pragensia, Prague University of Economics and Business, vol. 0.
Handle:
RePEc:prg:jnlaip:v:preprint:id:288
DOI: 10.18267/j.aip.288