Predicting Consumer Purchasing Behavior Using Random Forest on Retail Transaction Data

Authors

  • Ningsiah Ningsiah, Department of Pharmacy, Faculty of Health, Aisyah University, Indonesia
  • Nur Aminudin, Department of Software Engineering, Faculty of Technology and Informatics, Aisyah University, Indonesia

DOI:

https://doi.org/10.22441/fifo.2026.v18i1.002

Keywords:

Predictive analytics, consumer purchasing behavior, retail information systems, machine learning, transaction data

Abstract

Rapid digital transformation in the retail sector has generated massive volumes of consumer transaction data stored in retail information systems. Although these data hold strategic value for decision-making, their use often remains limited to descriptive reporting. This study analyzes and predicts consumer purchasing behavior by integrating machine learning–based predictive analytics into retail information systems, using the Kaggle retail transaction dataset. The methodology comprises data preprocessing, exploratory data analysis, feature selection, and predictive model development with logistic regression, decision tree, and random forest algorithms. Model performance was evaluated using accuracy, precision, recall, and ROC–AUC. The random forest model outperformed the other algorithms, achieving an accuracy of 88.76%, a precision of 87.92%, and a recall of 86.48%, which indicates stronger discriminative capability than the two alternatives. These findings suggest that ensemble-based learning methods can capture complex, non-linear consumer purchasing patterns. Theoretically, the study extends the role of retail information systems from descriptive reporting tools to predictive decision-support systems. Practically, it provides an analytical framework to support inventory optimization, targeted promotion strategies, and personalized service delivery in data-driven retail environments.
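
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary target (1 = purchase, 0 = no purchase), a 0.5 decision threshold, and made-up toy scores. None of these come from the study, and the function names are hypothetical. It is not the authors' implementation, only a plain-Python illustration of how accuracy, precision, recall, and ROC–AUC are defined.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all predictions that are correct."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predicted positives that are truly positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true positives that the model recovers."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumption for illustration only, not the study's dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"accuracy={accuracy(y_true, y_pred):.4f}")
print(f"precision={precision(y_true, y_pred):.4f}")
print(f"recall={recall(y_true, y_pred):.4f}")
print(f"roc_auc={roc_auc(y_true, scores):.4f}")
```

Accuracy, precision, and recall depend on the chosen threshold, whereas ROC–AUC is computed from the raw scores and is threshold-independent, which is one reason the two kinds of metric are usually reported together.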

Published

2026-06-08

How to Cite

[1]
N. Ningsiah and N. Aminudin, “Predicting Consumer Purchasing Behavior Using Random Forest on Retail Transaction Data”, FIFO, vol. 18, no. 1, Jun. 2026.

Section

Articles