Implementation of KNN, RF, and XGB Algorithms for Food Allergen Detection in Indonesian Recipes
DOI:
https://doi.org/10.22441/fifo.2026.v18i1.010
Abstract
Food allergies are a growing public health concern, especially in countries such as Indonesia, where traditional recipes often contain hidden allergens. This study develops a machine learning-based system that detects food allergens in Indonesian recipes using the K-Nearest Neighbors (KNN), Random Forest (RF), and Extreme Gradient Boosting (XGB, also called XGBoost) algorithms. A total of 7,840 recipes were collected from Cookpad.com by web scraping and labeled with five allergen categories: milk, peanuts, eggs, seafood, and wheat. The dataset was preprocessed with natural language processing techniques, including tokenization, stemming, and TF-IDF feature extraction. The models were trained and then evaluated on accuracy, precision, recall, and F1-score. Experimental results show that XGBoost with hyperparameter tuning via GridSearchCV achieved the best performance, with the highest average recall (0.9672) and F1-score (0.9826). RF also performed strongly, while KNN had the lowest accuracy and recall of the three models. The system was deployed with Streamlit so that users can enter recipe ingredients or URLs and receive real-time allergen predictions. The novelty of this study lies in the development of a large-scale Indonesian-language allergen dataset (7,840 recipes) that was unavailable in prior work, together with a multilabel allergen classification approach tailored to the Indonesian culinary context. Unlike previous studies, which rely predominantly on English-language datasets and non-Southeast Asian food cultures, this research contributes a localized allergen detection system integrated directly into a web-based interface. The approach offers a practical tool that helps individuals with food allergies identify risky ingredients in local dishes and contributes to food safety awareness in Indonesia.
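To make the pipeline described in the abstract more concrete, the following is a minimal from-scratch sketch of TF-IDF feature extraction followed by multilabel k-nearest-neighbour voting. It is an illustration under stated assumptions, not the authors' implementation: the toy recipes, the function names (`fit_idf`, `tfidf`, `predict`), the smoothed IDF formula, and the majority-vote rule are all hypothetical choices, and stemming, RF, XGBoost, and GridSearchCV tuning are omitted.

```python
import math
from collections import Counter

# Hypothetical toy data: (ingredient text, allergen labels). Not from the paper's dataset.
TRAIN = [
    ("susu telur tepung terigu gula", {"milk", "eggs", "wheat"}),
    ("udang bawang putih cabai", {"seafood"}),
    ("kacang tanah gula merah", {"peanuts"}),
    ("telur tepung terigu mentega", {"eggs", "wheat"}),
    ("ikan cabai bawang merah", {"seafood"}),
    ("susu cokelat gula", {"milk"}),
]
# The five allergen categories named in the abstract.
LABELS = ["milk", "peanuts", "eggs", "seafood", "wheat"]


def tokenize(text):
    """Lowercase and split on whitespace (stemming is left out of this sketch)."""
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every training token."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}


def tfidf(text, idf):
    """Return an L2-normalised sparse TF-IDF vector as a dict; unseen tokens are dropped."""
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {tok: c * idf[tok] for tok, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two L2-normalised sparse vectors, i.e. their cosine similarity."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def predict(text, idf, train_vecs, k=3):
    """Assign each label that a strict majority of the k nearest recipes carry."""
    query = tfidf(text, idf)
    ranked = sorted(train_vecs, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, labels in ranked[:k] for label in labels)
    return {label for label in LABELS if votes[label] * 2 > k}


idf = fit_idf([text for text, _ in TRAIN])
train_vecs = [(tfidf(text, idf), labels) for text, labels in TRAIN]
```

Calling `predict("telur tepung terigu", idf, train_vecs)` returns the set of allergens shared by most of the three closest toy recipes. Each label is voted on independently, which is what makes the classification multilabel: a single recipe can be flagged for several allergens at once.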
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The copyright to this article is transferred to Universitas Mercu Buana (UMB) if and when the article is accepted for publication. The undersigned hereby transfers any and all rights in and to the paper including without limitation all copyrights to UMB. The undersigned hereby represents and warrants that the paper is original and that he/she is the author of the paper, except for material that is clearly identified as to its original source, with permission notices from the copyright owners where required. The undersigned represents that he/she has the power and authority to make and execute this assignment.
We declare that this paper has not been published in the same form elsewhere.
Furthermore, I/We hereby transfer the unlimited rights of publication of the above-mentioned paper as a whole to UMB. The copyright transfer covers the right to reproduce and distribute the article, including reprints, translations, photographic reproductions, microform, electronic form (offline, online) or any other reproductions of similar nature.
The corresponding author signs for and accepts responsibility for releasing this material on behalf of any and all co-authors. This agreement is to be signed by at least one of the authors who have obtained the assent of the co-author(s) where applicable. After submission of this agreement signed by the corresponding author, changes of authorship or in the order of the authors listed will not be accepted.
Retained Rights/Terms and Conditions
Although authors are permitted to re-use all or portions of the Work in other works, this does not include granting third-party requests for reprinting, republishing, or other types of re-use.
Our Articles are licensed under CC BY-NC

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.









