Sentiment and Linguistic Analysis of Epidemic Outbreak Data from Official and Alternative Sources
Keywords:
Epidemic outbreaks, sentiment analysis, social networks, epidemiological surveillance, web mining
Abstract
Information on epidemic outbreaks is a key input for health surveillance, as it supports assessment of both the spread of disease and the associated social perception. This study examines emotional and linguistic patterns in narratives disseminated by international organizations (WHO, UN, CDC) and digital platforms (Google News and Reddit) over a three-month period. The knowledge discovery in databases (KDD) process (selection, preprocessing, transformation, modeling, and evaluation) was applied in RStudio, using the Bing and NRC lexicons and a supervised Naive Bayes model to enhance the detection of emotional nuances. A total of 12,340 texts (3,100 from official sources, 4,240 from Google News, and 5,000 from Reddit), retrieved with standardized English-language queries (pandemic, confinement, epidemic, and HMPV), were analyzed. Official sources showed a greater presence of positive emotions linked to cooperation and security; Google News concentrated negative narratives, with terms such as risk and dangerous; Reddit combined fear and sadness with expressions of hope. The analysis included t-tests and ANOVA with 95% confidence intervals. The work is exploratory and preliminary, and it suggests that surveillance systems should integrate the monitoring of social networks and digital media, along with public policy measures to improve communication during health crises.
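The lexicon-based step mentioned in the abstract can be illustrated with a minimal sketch. The study itself worked in RStudio with the Bing and NRC lexicons; the Python code below is only an illustration of the general technique, not the authors' implementation. The names `TOY_LEXICON`, `score_text`, `mean_score_by_source`, and `SAMPLE` are hypothetical, the six-word lexicon is a stand-in for a real polarity lexicon, and the sample sentences are invented rather than taken from the study's data.

```python
from collections import defaultdict

# Toy polarity lexicon (hypothetical entries; a real lexicon such as Bing
# contains thousands of words). +1 marks a positive term, -1 a negative one.
TOY_LEXICON = {
    "cooperation": 1,
    "security": 1,
    "hope": 1,
    "risk": -1,
    "dangerous": -1,
    "fear": -1,
}


def score_text(text, lexicon=TOY_LEXICON):
    """Return the net polarity of one text: positive hits minus negative hits."""
    # Split on whitespace, strip surrounding punctuation, and lowercase.
    tokens = [tok.strip(".,;:!?()\"'").lower() for tok in text.split()]
    return sum(lexicon.get(tok, 0) for tok in tokens)


def mean_score_by_source(records, lexicon=TOY_LEXICON):
    """Average the net polarity over all texts belonging to each source."""
    scores = defaultdict(list)
    for source, text in records:
        scores[source].append(score_text(text, lexicon))
    return {source: sum(vals) / len(vals) for source, vals in scores.items()}


# Invented example sentences, one per source type (not data from the study).
SAMPLE = [
    ("official", "International cooperation improves health security."),
    ("news", "Experts warn of a dangerous outbreak and rising risk."),
    ("reddit", "There is fear, but also hope."),
]

if __name__ == "__main__":
    print(mean_score_by_source(SAMPLE))
```

Per-source mean scores of this kind are the sort of quantity that could then be compared across sources with t-tests or ANOVA, as the abstract describes.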
License
Copyright (c) 2026 Karina Michelle Ordóñez Guerrero, M.Sc., José Steven Cordero Bazurto, M.Sc., Geovanny José Brito Casanova, M.Sc., Eduardo Amable Samaniego Mena, M.Sc.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright Notice
Authors who publish in this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
Disclaimer
LAJC in no event shall be liable for any direct, indirect, incidental, punitive, or consequential copyright infringement claims related to articles that have been submitted for evaluation, or published in any issue of this journal. Find out more in our Disclaimer Notice.





