Sentiment and Linguistic Analysis of Epidemic Outbreak Data from Official and Alternative Sources

Authors

Keywords:

Epidemic outbreaks, sentiment analysis, social networks, epidemiological surveillance, web mining

Abstract

Information on epidemic outbreaks is a key input for health surveillance because it supports assessment of both the spread of disease and the social perception associated with it. This study examines emotional and linguistic patterns in narratives disseminated by international organizations (WHO, UN, CDC) and digital platforms (Google News and Reddit) over a three-month period. The Knowledge Discovery in Databases (KDD) process (selection, preprocessing, transformation, modeling, and evaluation) was applied in RStudio, using the Bing and NRC lexicons together with a supervised Naive Bayes model to improve the detection of emotional nuances. A total of 12,340 texts (3,100 from official sources, 4,240 from Google News, and 5,000 from Reddit) were retrieved with standardized English-language queries (pandemic, confinement, epidemic, and HMPV) and analyzed. Official sources showed a greater presence of positive emotions linked to cooperation and security; Google News concentrated negative narratives, with terms such as “risk” and “dangerous”; Reddit combined fear and sadness with expressions of hope. The statistical analysis included t-tests and ANOVA with 95% confidence intervals. The work is exploratory and preliminary. It suggests that surveillance systems should integrate the monitoring of social networks and digital media, together with public policy measures to improve communication during health crises.
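The lexicon-based step mentioned in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the R environment the study used, and it relies on a tiny hypothetical lexicon (`TOY_LEXICON`) built from terms quoted in the abstract, not on the actual Bing or NRC lexicons. The function names are illustrative assumptions and do not reproduce the authors' scripts.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only.
# The study used the Bing and NRC lexicons, which are far larger.
TOY_LEXICON = {
    "cooperation": "positive",
    "security": "positive",
    "hope": "positive",
    "risk": "negative",
    "dangerous": "negative",
    "fear": "negative",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def polarity_counts(text, lexicon=TOY_LEXICON):
    """Count how many tokens the lexicon labels positive or negative."""
    counts = Counter(lexicon[t] for t in tokenize(text) if t in lexicon)
    return {"positive": counts["positive"], "negative": counts["negative"]}


def net_sentiment(text, lexicon=TOY_LEXICON):
    """Return positive matches minus negative matches for one text."""
    counts = polarity_counts(text, lexicon)
    return counts["positive"] - counts["negative"]
```

Under these assumptions, a text scores above zero when its positive matches outnumber its negative ones. Averaging such scores per source (official, Google News, Reddit) would give the kind of group-level values that t-tests or ANOVA could then compare.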

Author Biographies

  • Karina Michelle Ordóñez Guerrero, Universidad Técnica Estatal de Quevedo

    Karina Michelle Ordóñez Guerrero is a Systems Engineer and holds a Master's Degree in Data Science from the Technical State University of Quevedo (UTEQ). She has experience in programming, data analysis, and developing technological solutions. She collaborates with faculty members on projects involving applied analytics, text mining, and visualization, contributing to experimental design and the creation of scripts in R and Python.

    During her pre-professional internship at the University Wellness Unit (UBU), she participated in data collection, processing, and analysis, provided technical support, and helped generate reports to improve institutional processes. At the National Institute of Statistics and Census (INEC), she was involved in the planning and supervision of operational processes focused on the management and updating of cartographic data, standing out for her organization and results-oriented approach.

    She is currently a programmer and technical assistant at the Royal Dental Center, where she develops solutions that optimize operational and administrative processes. In addition, she supports startups as a technical assistant, gathering requirements, prototyping, testing, and implementing web and mobile applications, and providing technological support to teams starting projects. She has participated in scientific conferences and training sessions on artificial intelligence, data analysis, and emerging technologies, strengthening her technical profile.

  • José Steven Cordero Bazurto, M.Sc., Universidad Técnica Estatal de Quevedo

    José Steven Cordero Bazurto is a Systems Engineer and holds a Master's Degree in Data Science from the Technical State University of Quevedo (UTEQ). He works as a programmer and researcher in technological applications and is affiliated with UTEQ, where he collaborates on innovation projects. He is currently a developer of UTEQ's Academic Management System (SGA), participating in the design, development, and improvement of academic modules, database integration, and information analysis to support institutional decision-making.

    He is the author of scientific articles in indexed journals, including a publication in Ciencia Huasteca from the Higher School of Huejutla. His areas of expertise include software development, data analytics, and visualization, with experience in requirements gathering, architecture design, API creation, dashboard construction, and data flow automation. At UTEQ, he has collaborated with academic teams on initiatives aimed at improving processes through web solutions and integrated services.

  • Geovanny José Brito Casanova, Universidad Técnica Estatal de Quevedo

    Geovanny José Brito Casanova holds a degree in Systems Engineering from the Technical State University of Quevedo (UTEQ), where he is currently a lecturer at the Faculty of Computer Science and Digital Design. He holds a Master's degree in Development and Operations (DevOps) from the International University of La Rioja (Spain) and a Master's degree in Data Science from UTEQ.

    During his academic training, he was recognized for excellent academic performance within his degree program and faculty, receiving institutional distinctions and national and international postgraduate scholarships. His academic and professional experience focuses on the development and implementation of technological solutions, particularly in education, data science, and cloud computing. He has served as a reviewer for scientific journals and has spoken at academic events with national and international reach. His research covers topics such as educational software, digital infrastructure, environmental automation, and the use of new technologies in educational processes.

    He is currently involved in university research projects that focus on data analysis, the development of digital environments and the improvement of educational processes through technology.

  • Eduardo Amable Samaniego Mena, M.Sc., Universidad Técnica Estatal de Quevedo

    Eduardo Amable Samaniego Mena holds a degree in Systems Engineering from the Technical State University of Quevedo (UTEQ), a Master's degree in Connectivity and Computer Networks from UTEQ, and a Master's degree in Visual Analytics and Big Data from the International University of La Rioja (UNIR, Spain). He is a tenured professor at UTEQ, where he teaches undergraduate and graduate courses. His academic activity centers on applied research, with an emphasis on computer networks, data analytics, and visualization.

    In his research work, he develops solutions based on text mining, machine learning, and statistical analysis, integrates end-to-end data pipelines, and builds analytical dashboards for decision-making. He participates in projects with multidisciplinary teams, supervises degree theses, and publishes results aimed at solving real-world problems. He uses tools such as Python, R, SQL, Power BI, Tableau, and network infrastructure and security technologies, combining good engineering practices with scientific methodology.

Published

2026-01-08

Issue

Section

Research Articles for the Regular Issue

How to Cite

[1]
“Sentiment and Linguistic Analysis of Epidemic Outbreak Data from Official and Alternative Sources”, LAJC, vol. 13, no. 1, pp. 34–44, Jan. 2026, Accessed: Jan. 20, 2026. [Online]. Available: https://lajc.epn.edu.ec/index.php/LAJC/article/view/443
