Leveraging machine learning and open accessed remote sensing data for precise rainfall forecasting
Abstract
Rainfall forecasts are essential for human activities because they enable communities to anticipate potential impacts. Rainfall events correlate with other natural and hydro-meteorological phenomena, which can therefore be used in modeling and prediction. This study used daily CHIRPS data for the Gajahwong watershed in Yogyakarta, Indonesia, as the precipitation data. The predictor parameters were Sea Surface Temperature, Land Surface Temperature (Day and Night), Minimum and Maximum Temperatures, Solar Radiation, Wind Speed (U and V components), Cloud Pressure (Top and Base), and Cloud Height (Top and Base). Data processing was performed on the Google Earth Engine (GEE) platform. Four machine learning methods were applied: Support Vector Regression, Gradient Boosting Regression, Random Forest, and Deep Neural Networks. The correlation analysis revealed that only the Wind Speed V-component showed a significant correlation with rainfall; seven other parameters showed moderate correlations and four showed weak ones. Accuracy assessments indicated that Support Vector Regression produced the most accurate predictions, with Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Squared Error (MSE), R², and Correlation Coefficient (CC) values of 1.366, 0.947, 1.866, 0.948, and 0.982, respectively. This study demonstrated that openly accessible atmospheric datasets processed through GEE could yield reliable rainfall predictions, facilitating informed decisions on a wide scale. The methodology is adaptable and can be reproduced for comparable research or operational purposes.
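The five accuracy measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions of RMSE, MAE, MSE, R2, and the Pearson correlation coefficient, which may differ in detail from the authors' implementation. The function name `regression_metrics` and the sample rainfall values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, MSE, R2 and Pearson CC for two equal-length sequences.

    Standard textbook definitions are assumed; this is an illustrative sketch,
    not the evaluation code used in the study.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [o - p for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n      # mean squared error
    mae = sum(abs(e) for e in errors) / n     # mean absolute error

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if ss_obs == 0 or ss_pred == 0:
        raise ValueError("R2 and CC are undefined for constant series")

    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares)
    r2 = 1.0 - (mse * n) / ss_obs
    # Pearson correlation coefficient between observed and predicted values
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    cc = cov / math.sqrt(ss_obs * ss_pred)

    return {"rmse": math.sqrt(mse), "mae": mae, "mse": mse, "r2": r2, "cc": cc}


# Made-up daily rainfall values (mm), for illustration only.
observed = [0.0, 2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 2.0, 3.0, 6.0, 8.0]
metrics = regression_metrics(observed, predicted)
```

Under these definitions RMSE is the square root of MSE, which agrees with the values reported above: 1.366 squared is approximately 1.866.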
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright
Open Access authors retain the copyrights of their papers, and all open access articles are distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution and reproduction in any medium, provided that the original work is properly cited.
The use of general descriptive names, trade names, trademarks, and so forth in this publication, even if not specifically identified, does not imply that these names are not protected by the relevant laws and regulations.
While the advice and information in this journal are believed to be true and accurate on the date of its going to press, neither the authors, the editors, nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.