Utilization of geocoding for mapping infrastructure impacts and mobility due to floods in Indonesia based on Twitter analytics

Abstract

Flooding is a frequent natural disaster in Indonesia. It is driven by factors such as high-intensity rainfall, climate change, inadequate drainage, and urban infrastructure challenges, and it affects communities, infrastructure, and economic activity. The lack of accurate, centralized data hinders government efforts to identify affected areas and respond effectively. Named Entity Recognition (NER), a machine learning-based information extraction technique, offers a way to geocode flood-related data from social media such as Twitter. This research develops an NER-based model that extracts location information from tweets and visualizes flood impacts through geocoding. The method combines qualitative analysis with machine learning and geospatial analysis to assess flood impacts from Twitter data. First, a qualitative analysis of tweets extracts flood-related keywords to identify patterns. NER then identifies location mentions, which geocoding converts into geographic coordinates for map visualization. About 50% of the flood-related tweets contained location tokens, which underlines the importance of geographic information for understanding disaster impacts. Location extraction with the NER model proved effective, although some extracted location tokens did not match the actual geographic data, particularly at finer levels of location detail. Even so, the evaluation shows that 99.5% of the extracted locations correspond to valid locations, particularly within Indonesia. These results indicate that combining NER with geocoding is an effective way to analyze flood impacts and can support disaster management and geospatial analysis based on social media data.
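
The extraction and geocoding steps described in the abstract can be illustrated with a minimal Python sketch. It is not the study's model: a simple regular expression stands in for the NER tagger, a small in-memory gazetteer with approximate coordinates stands in for a geocoding service, and the sample tweets, the names `GAZETTEER`, `extract_locations`, `geocode`, and `summarize`, and the two summary ratios are all assumptions made for illustration.

```python
import re

# Hypothetical gazetteer: place name -> approximate (latitude, longitude).
# A real pipeline would query a geocoding service instead.
GAZETTEER = {
    "jakarta": (-6.2, 106.8),
    "bekasi": (-6.2, 107.0),
    "semarang": (-7.0, 110.4),
}

# Stand-in for an NER tagger (assumption): treat a capitalized word that
# follows "di" (Indonesian for "in/at") as a location token.
LOCATION_PATTERN = re.compile(r"\bdi\s+([A-Z][a-z]+)")


def extract_locations(tweet):
    """Return the candidate location tokens found in one tweet."""
    return LOCATION_PATTERN.findall(tweet)


def geocode(token):
    """Look up coordinates for a token; None if the gazetteer lacks it."""
    return GAZETTEER.get(token.lower())


def summarize(tweets):
    """Extract, geocode, and report coverage over a list of tweets."""
    tweets_with_location = 0
    extracted = 0
    points = []
    for tweet in tweets:
        tokens = extract_locations(tweet)
        if tokens:
            tweets_with_location += 1
        for token in tokens:
            extracted += 1
            coords = geocode(token)
            if coords is not None:
                points.append((token, coords))
    return {
        # Share of tweets that contain at least one location token.
        "share_with_location": tweets_with_location / len(tweets),
        # Share of extracted tokens that resolved to coordinates.
        "share_geocoded": len(points) / extracted if extracted else 0.0,
        # (token, (lat, lon)) pairs ready to plot on a map.
        "points": points,
    }


# Made-up sample tweets, for illustration only.
tweets = [
    "Banjir di Jakarta, jalan utama tidak bisa dilewati",
    "Hujan deras sejak pagi, semoga tidak banjir",
    "Banjir di Bekasi merendam perumahan",
    "Banjir di Kampungku makin tinggi",
]
result = summarize(tweets)
```

In this toy run, the token "Kampungku" is extracted but has no gazetteer entry, which mirrors the kind of mismatch between extracted tokens and actual geographic data that the abstract reports at finer levels of detail. The `points` list is what a mapping library would consume for visualization.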
Keywords
  • Geocoding
  • Text Mining
  • Named Entity Recognition
  • StanfordNER
  • Twitter