Predictive Analytics for Property Valuation Using Random Forest in Malang City
DOI: https://doi.org/10.57152/malcom.v6i1.2411
Keywords: Malang City, Mean Decrease Impurity, Property Price Prediction, Random Forest, Web Scraping
Abstract
The property market in Malang City continues to expand alongside rising housing demand, yet limited price transparency still constrains informed decision-making for buyers, sellers, and developers. This study develops a data-driven property price prediction model using the Random Forest algorithm, selected for its robustness and its ability to capture complex nonlinear relationships. An initial dataset of 4,358 property listings was collected through web scraping from Rumah123.com. After preprocessing (data cleaning, handling of missing values, and feature refinement), 1,573 valid observations remained for analysis. The model incorporates temporal variables (month, year), physical attributes (land area, building area, number of bedrooms and bathrooms, electricity capacity, number of floors), other property characteristics (certificate type, property type, property condition, furniture condition, hook position), and price information. With tuned hyperparameters, the final Random Forest model achieved an R² of 0.7666 (76.66%) and a MAPE of 25.27%, indicating strong predictive performance relative to standard regression benchmarks. These findings have managerial implications: objective, data-driven price estimates can support developers, agents, and prospective buyers in pricing decisions, marketing strategies, and fair value assessments during negotiations.
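The two evaluation metrics reported in the abstract, R² and MAPE, can be illustrated with a minimal sketch. The function names and the four price pairs below are hypothetical and chosen only for illustration; they are not the study's code or data, and the sketch assumes no actual price equals zero.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (assumes no actual value is 0)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical listing prices and model predictions (illustrative only).
actual_prices = [500.0, 750.0, 1200.0, 2000.0]
predicted_prices = [550.0, 700.0, 1000.0, 2200.0]

r2 = r_squared(actual_prices, predicted_prices)
error_pct = mape(actual_prices, predicted_prices)
print(f"R^2 = {r2:.4f}, MAPE = {error_pct:.2f}%")
```

Read this way, an R² of 0.7666 means the model accounts for roughly 77% of the variance in listing prices, while a MAPE of 25.27% means predictions deviate from the listed price by about a quarter of that price on average.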
License
Copyright (c) 2025 Sandrian Yulian Firmansyah Noorihsan, Tintrim Dwi Ary Widhianingsih, Heri Kuswanto

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright © by Author; Published by Institut Riset dan Publikasi Indonesia (IRPI)
This Indonesian Journal of Machine Learning and Computer Science is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.