|Year : 2022 | Volume
| Issue : 3 | Page : 291-297
Trends and analysing the correlation of population density and percentage of population suffered from COVID-19 - A linear regression model
Kamlesh Garg1, Aarushi Mathur1, Surinder Kumar2, Ruchika Nandha3
1 Department of Pharmacology, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi, India
2 Department of Emergency and Accident Services, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi, India
3 Department of Pharmacology, Dr. Harvansh Singh Judge Institute of Dental Sciences and Hospital, Panjab University, Chandigarh, India
|Date of Submission||07-Oct-2021|
|Date of Acceptance||25-Apr-2022|
|Date of Web Publication||17-Sep-2022|
Dr. Kamlesh Garg
Room No. 607, 6th Floor, Department of Pharmacology, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi - 110 029
Source of Support: None, Conflict of Interest: None
BACKGROUND: Since its emergence in December 2019, coronavirus disease (COVID-19) has spread to many countries and become a worldwide pandemic. Because the disease is highly contagious, transmission is assumed to be more likely where the population of an area is dense and measures such as social distancing cannot be followed. The objectives of this study were to compare the trend of confirmed, recovered, and deceased cases and the recovery and death rates of COVID-19 (severe acute respiratory syndrome coronavirus 2) infection in the top 5 worst-hit states of India with those of the National Capital Territory of Delhi, and to analyze the correlation of population density with percentage of population suffered.
MATERIALS AND METHODS: This descriptive population study retrieved the data published by daily health bulletins of states and Press Information Bureau, Government of India. The correlational coefficient and linear regression analysis were used to analyze the relation between population density and percentage of population suffered from COVID-19.
RESULTS: Maharashtra remained the topmost Indian state, with the highest number of confirmed, recovered, and deceased cases and the highest death rate. Population density showed a negligible to low positive correlation (correlation coefficient: 0.30) with the percentage of population suffered from COVID-19, and the linear regression model showed no statistically significant relationship between the two parameters (P > 0.05).
CONCLUSION: The population density does not have a strong correlation with the percentage of population suffered from COVID-19 in India.
Keywords: COVID cases, COVID-19, percentage of population, population density
|How to cite this article:|
Garg K, Mathur A, Kumar S, Nandha R. Trends and analysing the correlation of population density and percentage of population suffered from COVID-19 - A linear regression model. Indian J Health Sci Biomed Res 2022;15:291-7
|How to cite this URL:|
Garg K, Mathur A, Kumar S, Nandha R. Trends and analysing the correlation of population density and percentage of population suffered from COVID-19 - A linear regression model. Indian J Health Sci Biomed Res [serial online] 2022 [cited 2022 Sep 25];15:291-7. Available from: https://www.ijournalhs.org/text.asp?2022/15/3/291/356273
| Introduction|| |
India is considered the worst-hit country in Asia and ranks second globally among the countries worst hit by COVID-19. The first human cases of COVID-19, the disease caused by the virus subsequently named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), were reported by officials in Wuhan city, China, in December 2019. As of August 18, 2021, there were 32,321,395 confirmed cases of COVID-19 in India, and the top 5 worst-hit states in descending order of number of cases were Maharashtra, Kerala, Karnataka, Tamil Nadu, and Andhra Pradesh. In earlier months, Delhi was among the top three worst-hit states, but its position later dropped in terms of COVID-19 caseload. In contrast, states such as Andhra Pradesh and Karnataka showed the worst growth in COVID-19 cases in the last month of the study period.
India was one of the earliest countries to declare a lockdown, which gave it some time and opportunity to slow the spread. India is perhaps the only major country that was locked down before the virus took hold. Measures that target public behavior, including mandatory face covering and quarantining, are known as nonpharmaceutical interventions. One of the successes of the Indian government's pandemic policy lies in public health messaging. Between March 25 and June 30, the Prime Minister addressed the nation six times, urging people to be disciplined about COVID-appropriate behavior, and rigorous campaigns sent the public frequent reminders about wearing masks and maintaining social distance and created awareness about the vaccines. Such messages helped create a herd effect across the pharma, economic, health, and public safety sectors that enabled a strict national lockdown.
India's high population density poses another challenge in managing this widespread pandemic. The two metropolitan cities, New Delhi and Mumbai, have population densities of 29,259.12 and 73,000 per square mile, respectively, and are among the most densely populated cities in the world. Most residents of Mumbai live in confined houses located far from their workplaces, leading to long commuting times and prolonged exposure in the public transit system. Comparatively, New Delhi covers a larger area than Mumbai. The difficulty of social distancing in such densely populated cities, together with the illiteracy of 10%–15% of these cities' populations and cultural practices that facilitate gathering in groups, might contribute to the infection rate. For these dense communities in India, inadequate shelter and overcrowding are also high-risk factors that aid transmission of the virus. According to a recent report by the National Centers for Disease Control, unauthorized colonies and jhuggi-jhopri clusters pose a serious problem because a large number of people live in them. Residents of these inadequate housing facilities usually lack access to adequate sanitation, and self-isolation is often impossible.
There is a need to explore the various factors, such as population, population density, and precautionary measures, that affect the population suffered from COVID-19 in some of the worst-hit Indian states. Hence, we analyzed whether there is any correlation between population density and the percentage of population suffered from COVID-19 with the help of a linear regression model. Thus, the objective of this study was to compare the trend of COVID-19 in the top 5 worst-hit states of India with that of the National Capital Territory (NCT) of Delhi and to analyze the correlation between population density and percentage of population suffered from COVID-19 in these states. The population, the rank order of population density, and the preventive measures of these states may also influence the percentage of population suffered.
- To compare the trend of confirmed, recovered, deceased cases and death rate and recovery rate due to COVID-19 (SARS-CoV-2) infection in the top 5 worst-hit states of India with NCT of Delhi
- To analyze the relationship of population density with percentage of population suffered from COVID-19 in the top 5 worst-hit states of India and NCT of Delhi.
| Materials and Methods|| |
This was an observational, descriptive, population-based retrospective study to compare the pattern of COVID-19 (SARS-CoV-2) in the top 5 worst-hit states of India, i.e., Maharashtra, Karnataka, Andhra Pradesh, Tamil Nadu, and Kerala, and the NCT of Delhi. The study retrieved, organized, and analyzed the data published in the daily health bulletins of the governments of these states and by the Press Information Bureau, Government of India, which are available in the public domain. We compared the number of confirmed, recovered, and deceased cases and the recovery and death rates in these states with those of Delhi from January 30, 2020, to August 18, 2021, on a cumulative basis. Further, the relationship of population density with percentage of population suffered from COVID-19 in these states and the NCT of Delhi was statistically analyzed using correlation analysis and a linear regression model. Institutional ethics committee permission and informed consent were not required because individual participants were not involved in the research. Although the NCT of Delhi was in 8th position as of August 18, 2021, its trend of COVID-19 was compared with that of the top 5 worst-hit states because it is the capital of India.
Inclusion criteria
- Daily state-wise cumulative data containing the number of confirmed, recovered, and deceased cases of COVID-19 from January 30, 2020, to August 18, 2021
- State-wise population density per sq. km.
Exclusion criteria
- Data of COVID-19 from any other unauthorized sites/sources
- Data before January 30, 2020, and after August 18, 2021.
For the statistical modeling, we estimated the correlation coefficient and then fitted a linear regression model between the percentage of population suffered from COVID-19 and population density in the worst-hit states of India and the NCT of Delhi, to determine whether any correlation exists between these two variables. The purpose of correlation analysis is to provide information on the strength and direction of the linear relationship between two variables, while a simple linear regression analysis estimates the parameters of a linear equation that can be used to predict the values of one variable from the other.
The computation of the correlation coefficient (r) is the most commonly used method for analyzing the statistical relationship between two variables; it essentially measures the degree of linear association. It is also called the product-moment correlation coefficient or the Pearson correlation coefficient. The value of r lies between −1 and +1. A value close to +1 indicates a strong positive linear relationship (i.e., one variable increases as the other increases). A value close to −1 indicates a strong negative linear relationship (i.e., one variable decreases as the other increases). A value close to 0 indicates no linear relationship; however, there could still be a nonlinear relationship between the variables.
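As an illustration of the definition above, the following is a minimal Python sketch of the Pearson coefficient. The function name `pearson_r` and the toy values are assumptions made for this example; they are not the study data, and the study itself used SPSS rather than this code. The second data set shows the point made in the text: r can be 0 even when a clear nonlinear relationship exists.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy values (NOT the study data)
r_linear = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])              # perfectly linear
r_curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])      # y = x**2
print(r_linear, r_curved)
```

Here `r_linear` equals 1 because the points lie on a rising straight line, while `r_curved` equals 0 although y is completely determined by x.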
Linear regression model
In this study, linear regression analysis was used to estimate the effect of population density (independent variable) on percentage of population suffered from COVID-19 (dependent variable). IBM SPSS Statistics 23 software (IBM Corp., Armonk, NY, USA) was used to compute these parameters from the data. We used the linear regression model, which describes the dependent variable with a straight line defined by the equation
$$y = ax + b + \varepsilon \qquad (1)$$
where x and y are the independent and the dependent variables, respectively, a is the slope, b is the intercept on the y-axis, and ε is the error term with zero mean. In the present study, population density is the independent variable (x), while the percentage of population suffered is the dependent variable (y). To examine whether the association is genuine, we tested the null hypothesis, which states that there is no effect or relationship between the variables. We estimated the F value, which compares the variance explained by the regression with the residual variance, and the P value, which is used to decide whether to reject the null hypothesis. The test evaluates the null hypothesis that the slope coefficient is zero: if the P value is < 0.05, the null hypothesis is rejected; otherwise, we fail to reject it.
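The regression step described above can be sketched in Python as follows. This is a minimal ordinary least-squares illustration, assuming a single predictor; the function name `fit_line` and the toy values are hypothetical and are not the study data, and the study's own figures came from SPSS. The function returns the slope a, the intercept b, and the F value with (1, n − 2) degrees of freedom.

```python
def fit_line(x, y):
    """Least-squares fit of y = a*x + b; returns (a, b, F)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx                 # slope
    b = mean_y - a * mean_x       # intercept on the y-axis
    # Variation explained by the line vs. variation left in the residuals
    ss_reg = sum((a * xi + b - mean_y) ** 2 for xi in x)
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    f_value = float("inf") if ss_res == 0 else ss_reg / (ss_res / (n - 2))
    return a, b, f_value


# Hypothetical toy values (NOT the study data)
a, b, f_value = fit_line([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(a, b, f_value)
```

A large F value means the fitted line explains far more variation than it leaves unexplained; a small F value, as in the study, means the line explains little.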
| Results|| |
[Graph 1] shows the comparison of COVID-19-confirmed cases in the NCT of Delhi with the top 5 worst-hit states of India from January 30, 2020, to August 18, 2021, on a cumulative basis. Since the beginning, Maharashtra held the top spot and showed a sharp spike from February 2021 onward, reaching the maximum of 6,406,345 cumulative confirmed COVID-19 cases by August 18, 2021, whereas the trend in the NCT of Delhi and the other 4 states was similar, with 1,437,192 confirmed cases in Delhi on the same date.
Recovered cases and recovery rate
India has witnessed 31,517,510 recoveries out of 32,321,395 confirmed cases, a recovery rate of 97.51% as of August 18, 2021. [Graph 2] presents the comparison of COVID-19-recovered cases in the NCT of Delhi with the top 5 worst-hit states of India from January 30, 2020, to August 18, 2021, on a cumulative basis. As observed, Maharashtra has shown the steepest rise in the number of recoveries since March 2020, whereas the other states also show a steep rise in the number of recovered cases. Among the worst-hit states, Andhra Pradesh has shown the maximum recovery rate of 98.52%, followed by Delhi at 98.23%, and Kerala has the lowest recovery rate of 94.73%, as shown in [Graph 3]. As per the latest data available, India continues to occupy the top global position as the country with the maximum number of recoveries.
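The national recovery rate quoted above follows directly from the two counts given in the text. A short Python sketch of the arithmetic is shown below; the helper name `rate_percent` is an assumption made for this illustration, and the same helper would apply to a death rate given the corresponding count.

```python
def rate_percent(count, confirmed):
    """Share of confirmed cases, expressed as a percentage."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return 100.0 * count / confirmed


# Counts for India as of August 18, 2021, as stated in the text
confirmed_india = 32_321_395
recovered_india = 31_517_510

recovery_rate = rate_percent(recovered_india, confirmed_india)
print(f"{recovery_rate:.2f}%")  # prints 97.51%
```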
Deceased cases and death rate
India overall presented a death rate of 1.34% due to COVID-19. The comparison of the trend of COVID-19-deceased cases in the NCT of Delhi with the top 5 worst-hit states of India from January 30, 2020, to August 18, 2021, on a cumulative basis is presented in [Graph 4]. Maharashtra has shown a steep rise in the number of deceased cases since May 2020. In contrast, Andhra Pradesh had the minimum number of deaths as of August 18, 2021. Karnataka, Tamil Nadu, Andhra Pradesh, and Delhi have shown a gradual increase in the number of deaths from July 2020 onward, whereas Kerala presented an almost flat curve until April 2021 and a small rise thereafter until August 18, 2021. Maharashtra had the maximum death rate of 2.11%, followed by Delhi at 1.74%, whereas Kerala had the lowest death rate of 0.51%, as presented in [Graph 5].
Relationship of population density and percentage of population suffered from COVID-19
[Graph 6] shows the relationship of population density and percentage of population suffered from COVID-19 in the NCT of Delhi and the top 5 worst-hit states of India from January 30, 2020, to August 18, 2021, on a cumulative basis. Although Maharashtra has a moderate population density and the highest number of confirmed cases, Kerala, which has the maximum population density among the top 5 worst-hit states, presented the highest percentage of population suffered from COVID-19, i.e., 10.6%. Delhi, which has the maximum population density among all six, holds the second position in percentage of population suffered from COVID-19, i.e., 7.12%, next to Kerala. Because these 2 states have the highest values of both parameters, a statistical analysis was carried out to examine the relation of population density and percentage of population suffered from COVID-19. In the statistical modeling, the correlation coefficient was calculated, followed by the linear regression analysis. The null hypothesis states that there is no relationship between population density and the percentage of population suffered from COVID-19. The dependent variable, percentage of population suffered from COVID-19, was regressed on the independent variable, population density, to test the hypothesis. In the regression analysis, F(1,4) = 0.397 and P = 0.563 (> 0.05), as given in [Table 1]. The correlation coefficient is 0.30, as given in [Table 2]. This value indicates a negligible to low positive correlation, which was further examined by linear regression analysis as depicted in equation (1). Since the P value of 0.563 is greater than 0.05, the result is non-significant; thus, we failed to reject the null hypothesis, and no statistically significant linear relationship was found between the two variables under consideration.
The scatter plot of percentage of population suffered by COVID-19 against population density and the corresponding regression line and equation for the relationship between the variables for the stated states are depicted in [Graph 7].
In this analysis, the results indicate that population density did not play a significant role in predicting the percentage of population suffered from COVID-19. Hence, there is a need to explore which factors beyond population density lead to a rise in COVID-19 cases at a particular place.
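The reported statistics can be cross-checked against one another. In simple linear regression with one predictor and n observations, F = r²(n − 2)/(1 − r²) with (1, n − 2) degrees of freedom, and the P value equals the two-sided P value of t = √F. The Python sketch below assumes n = 6 (five states plus Delhi, consistent with the reported F(1,4)) and uses the closed-form CDF of the t distribution that is valid only for exactly 4 degrees of freedom; the function names are hypothetical and this is not the SPSS procedure used in the study.

```python
import math


def f_from_r(r, n):
    """F statistic of a one-predictor regression, from Pearson r and sample size n."""
    return (r * r) * (n - 2) / (1.0 - r * r)


def two_sided_p_t4(t):
    """Two-sided P value for a t statistic with exactly 4 degrees of freedom."""
    t = abs(t)
    u = 1.0 + t * t / 4.0
    # Closed-form CDF of Student's t for 4 degrees of freedom only
    cdf = 0.5 + 0.375 * (t / math.sqrt(u)) * (1.0 - (t * t / u) / 12.0)
    return 2.0 * (1.0 - cdf)


r_reported = 0.30   # correlation coefficient reported in the text
n_regions = 6       # assumption: 5 states + NCT of Delhi

f_value = f_from_r(r_reported, n_regions)
p_value = two_sided_p_t4(math.sqrt(f_value))
print(f"F(1,{n_regions - 2}) = {f_value:.3f}, P = {p_value:.2f}")
```

With the rounded r = 0.30, this gives an F value close to 0.40 and a P value close to 0.56, in line with the reported F(1,4) = 0.397 and P = 0.563; the small difference in F is what rounding r to two decimals would produce.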
| Discussion|| |
In the present study, we compared the trend of COVID-19 in the top 5 worst-hit states of India with that of the NCT of Delhi and analyzed the relation between population density and percentage of population suffered from COVID-19 in these states, as presented in the various graphs. In the initial months of its spread (April–August), three states, i.e., Maharashtra, Kerala, and Tamil Nadu, followed by Delhi, held the highest number of COVID-19 infections, but the trend soon changed. Delhi has since slipped to 8th position from 3rd position in April–June 2020, whereas Maharashtra remained the topmost state, with 6,406,345 confirmed COVID-19 cases as of August 18, 2021. Internationally, this trend in Maharashtra (India) can be compared with that of California (United States of America, USA), which has reported COVID-19 cases consistently since the outset of the pandemic, from 238 cases in March 2020 to 4,190,358 by August 18, 2021. Andhra Pradesh is the state with the maximum recovery rate of 98.52%, which can be compared internationally with Texas (USA), which had an 89.39% recovery rate. Maharashtra has the maximum death rate in India, i.e., 2.11%, which can be compared with New Jersey, New York, and Massachusetts (USA), which had a 2.4% death rate. Kerala presented the minimum death rate of 0.51%, which is similar to that of Alaska (USA), with a 0.5% death rate. Among the 5 states and Delhi, Delhi has the maximum population density of 11,297 per km², followed by Kerala with 859 per km², but Kerala has the highest percentage of population suffered from COVID-19, i.e., 10.60%, followed by Delhi with 7.12%.
Because COVID-19 is a highly contagious disease, its transmission is assumed to be more likely where the population of an area is dense and measures such as social distancing cannot be followed. In the present study, we estimated the correlation coefficient between percentage of population suffered from COVID-19 and population density, which was found to be 0.30, indicating a negligible to low positive correlation. The P value was computed using regression analysis to test the significance of the relationship between these two variables. The P value was 0.56 (> 0.05), as shown in [Table 3], which is above the significance level; thus, we failed to reject the null hypothesis, and no statistically significant linear relationship was found between the two variables.
The results of this study are supported by the study done by Hamidi et al. at Johns Hopkins University of Public Health, USA, in which the correlation between activity density and infection rate was found to be 0.280. The P value obtained by structural equation modeling analysis was 0.874 (> 0.05), which indicates that the infection rate increases with activity density but that the relationship is not statistically significant, possibly due to greater adherence to social distancing guidelines. The results of this study can also be compared with the study conducted by Bhadra et al., who found a moderate positive correlation (0.49) between the COVID-19 infection rate and population density in 600 districts of 4 states (West Bengal, Maharashtra, Uttar Pradesh, and Tamil Nadu) of India. This was further confirmed by regression analysis, in which the P value was significant at the 0.000 level, because of which they rejected the null hypothesis. The results of this study are in contrast with the study conducted by Kadi and Khelfaoui, who found a positive correlation (Pearson correlation coefficient 0.71) between population density and COVID-19 cases in Algeria, North Africa. They found that population density has a positive effect on the increase in the number of COVID-19 cases.
In this study and the above-cited studies, the correlation of COVID-19 infection rate with population density was analyzed, but many other factors, such as testing rate, airport traffic, and the proportion of older age groups, may affect the number of people infected in a particular area. Factors such as wind speed, total number of participants in major sports events, and GDP per capita were positively correlated with the numbers of COVID-19 cases and deaths adjusted by the total population of cities (Spearman's correlation test, P < 0.05). In conclusion, population density does not have a strong correlation with the percentage of population suffered from COVID-19 in India. Therefore, there is a need for scientific insight into the unexplored factors beyond population density that lead to a higher infection rate at a particular place.
The authors are thankful to Ms. Divya, Data Consultant, for the acquisition and retrieval of COVID-19 data.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
| References|| |
Mullick J. Pandemic has changed how India moves, lives, works, and consumes. Hindustan Times 2022;98:70.
Joshi A, Mewani AH, Arora S, Grover A. India's COVID-19 Burdens, 2020. Front Public Health 2021;9:608810.
Bewick V, Cheek L, Ball J. Statistics review 7: Correlation and regression. Crit Care 2003;7:451-9.
Hamidi S, Sabouri S, Ewing R. Does density aggravate the COVID-19 pandemic? J Am Plan Assoc 2020;86:495-509.
Bhadra A, Mukherjee A, Sarkar K. Impact of population density on Covid-19 infected and mortality rate in India. Model Earth Syst Environ 2020;14:1-7. [Doi: 10.1007/s40808-020-00984-7].
Kadi N, Khelfaoui M. Population density, a factor in the spread of COVID-19 in Algeria: statistic study. Bull Natl Res Cent 2020;44:138.
Roy S, Ghosh P. Factors affecting COVID-19 infected and death rates inform lockdown-related policymaking. PLoS One 2020;15:e0241165.
[Table 1], [Table 2], [Table 3]