The forecast of Mexico's population has traditionally been made with the demographic components method, which is based on the estimation of births, deaths, emigrants and immigrants, the elements that determine change in human populations. This method estimates births and deaths by projecting fertility and mortality, traditionally with logistic functions (Partida, 2006). Emigrants and immigrants, however, have not been projected with mathematical models; their forecast has been restricted to assumptions about their future behavior.
The demographic components method also describes the country's demographic dynamics by predicting the future behavior of its components: fertility, mortality and international migration. Because of this, the method can introduce several sources of error for each component, such as: 1) errors in the data, since unreliable or inaccurate data produce biased forecasts; 2) the logistic functions used to project mortality and fertility are often not adequate; and 3) the minimum and maximum values set in order to project fertility and mortality are not statistical estimates but are fixed in an ambiguous way.
Authors: National Population Council of Mexico.
Thus, if all the above error types are present in the three components, nine error sources are introduced. If the total population is projected directly, only three error sources are introduced, so the projections may be more accurate. The purpose of this paper is, first, to demonstrate that population growth in Mexico follows very accurate mathematical laws that allow us to estimate the maximum and minimum of the growth and to determine the evolution pattern of Mexico's population through time and, second, to use the results to obtain forecasts of the Mexican population up to the year 2050.
The United Nations (UN) publishes projections of populations around the world. Traditionally, the UN produced them with standard demographic methods based on assumptions about future fertility rates, survival probabilities and net migration. Such projections, however, were not accompanied by formal statements of uncertainty expressed in probabilistic terms. In July 2014 the UN issued, for the first time, official probabilistic population projections for all countries to 2100 (Alkema, 2015).
There are several methods to project a population. Some project the total population from an initial population and future rates of population growth. Another, called the components method, projects the population by age and sex from an initial age and sex structure and projections of fertility and mortality (The Cohort Component Method for Making Population Projections, 2017). Several international organizations prepare population projections for the world, regions and countries. Two of them are the UN and the U.S. Census Bureau. Other organizations, such as the World Bank and the International Institute for Applied Systems Analysis (IIASA), also prepare population projections for the world, major regions and individual countries. Each of these organizations uses a slightly different methodology, makes different assumptions about future demographic trends, and begins with slightly different estimates of current population size. Nevertheless, for the next 50 years their results fall within a relatively small band (Population Reference Bureau, 2017).
According to World Population Prospects: The 2015 Revision, the world's population is currently 7.3 billion and is expected to reach 8.5 billion in 2030, 9.7 billion in 2050 and 11.2 billion in 2100. China and India remain the two most populous countries in the world, with 19% and 18% of the world's population respectively. However, projections indicate that by 2022 India's population will be greater than China's. Today, of the ten most populous countries in the world, one is in Africa (Nigeria), five are in Asia (Bangladesh, China, India, Indonesia and Pakistan), two are in Latin America (Brazil and Mexico), one is in Northern America (USA) and one is in Europe (Russian Federation) (United Nations, 2017).
Currently, the world population continues to grow, although more slowly than in the recent past. Ten years ago, world population was growing by 1.24 per cent per year. Today it is growing by 1.18 per cent per year, or approximately an additional 83 million people annually. Most demographers worldwide expect this growth to continue during the rest of this century (World Population History, 2017).
It is also very important to consider that a projection is not a prediction of what will happen; it indicates what will happen if the assumptions that underpin the projection actually occur. These assumptions are often based on patterns and trends previously observed in the data (Australian Bureau of Statistics, 2017).
The data used in this paper are Mexico's population over the last 225 years, covering the period 1790-2015. The source of these data is the National Institute of Statistics and Geography (INEGI, by its acronym in Spanish) (Table 1).
Over the last 225 years Mexico's population has grown under very different social, political and economic conditions. Within this period we can identify three scenarios that have undoubtedly contributed to establishing the demographic dynamics and the population size that the country has today.
The first scenario is located in the nineteenth century, in which population growth was very slow. The second scenario refers to part of the twentieth century, in which growth continued slowly until, from 1930, exponential growth began. The third scenario is located at the end of the twentieth century and the part of the twenty-first century elapsed so far, in which the exponential growth of the population has ended and given way to slower growth (Figure 1). As figure 1 shows, in 1790 Mexico had only 4.64 million inhabitants, and a little more than 90 years had to pass for the population to double. By 1900 the population was 13.6 million people, while in 1960 it reached 34.9 million, that is, 7.7 million more than double the 1900 figure. This means that a process that took more than 90 years in the nineteenth century took a little more than 50 years in the twentieth. The rapid growth of the population in the twentieth century kept accelerating, so that in 1980 Mexico reached 66.8 million, only 3 million less than double the 1960 figure, which indicated that in a little more than 30 years the population would double again. However, 35 years have passed from 1980 to 2015 and the population has not doubled, which seems to indicate that the rapid growth of the population has stopped.
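The doubling comparisons in this paragraph are simple arithmetic on the Table 1 figures. The following minimal sketch reproduces them; the helper name `gap_to_double` is hypothetical and introduced only for illustration.

```python
# Population in millions of inhabitants, copied from Table 1.
population = {1900: 13.6, 1960: 34.9, 1980: 66.8, 2015: 119.51}


def gap_to_double(base_year, later_year):
    """Later population minus twice the base-year population (millions)."""
    return population[later_year] - 2 * population[base_year]


# Positive value: the 1960 population is already above double the 1900 figure.
excess_1960 = gap_to_double(1900, 1960)
# Positive values: the later population is still below double the base figure.
shortfall_1980 = -gap_to_double(1960, 1980)
shortfall_2015 = -gap_to_double(1980, 2015)

print(f"1960 minus double of 1900: {excess_1960:.1f} million")
print(f"double of 1960 minus 1980: {shortfall_1980:.1f} million")
print(f"double of 1980 minus 2015: {shortfall_2015:.2f} million")
```

The last line shows how far the 2015 population still is from double the 1980 figure, which is the observation the paragraph uses to argue that rapid growth has stopped.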
Figure 1 also shows that Mexico's population has grown continuously for 225 years, so the following question arises: will it keep growing indefinitely? We think not and, like many demographers in our country and worldwide, we expect the population either to stabilize or to reach a maximum value and then start to decrease. Both cases imply that a maximum value for population growth must exist, and the answers regarding its existence and its calculation seem to lie in the Stable Bounded Theory (Gonzalez-Rosas, 2012).
The Stable Bounded Theory rests on two fundamental postulates. First, in each year the population is a random phenomenon, so according to probability theory it must have a mean and a variance in each year. Second, the mean of the population is equal to a mathematical function that depends on time. By the properties of the mean, this implies that in each year the observed population equals a quantity determined by the mathematical function plus a random deviation that occurs according to a probabilistic law. Medhi (1981) called the mathematical function the deterministic component and the random deviation the stochastic component. Under these postulates, the behavior equations of the observations and of the mean of the population at each time are:
$$P_t = f(t) + \varepsilon_t \qquad (1)$$

$$\mu_{P_t} = f(t) \qquad (2)$$

where $P_t$ denotes the population at time $t$; $f(t)$ is an unknown mathematical function; the $\varepsilon_t$ are random variables that we suppose independent and normally distributed, with mean $\mu_\varepsilon = 0$ and constant variance $\sigma_\varepsilon^2$; and $\mu_{P_t}$ denotes the mean of the population at time $t$.
To prove that a maximum value exists, the Stable Bounded Theory uses the amount of change of the population with respect to time. Since the change of the population between one time and another is measured by the slope of the straight line that joins two points of the two-dimensional space defined by time and the population (Leithold, 1973, p. 137), we calculated the slopes and the middle values of each pair of consecutive population values as follows:

$$S_i = \frac{P_{t_{i+1}} - P_{t_i}}{t_{i+1} - t_i} \qquad (3)$$

$$MV_i = P_{t_i} + \frac{P_{t_{i+1}} - P_{t_i}}{2} \qquad (4)$$

where $S_i$ is the slope and $MV_i$ the middle value of the $i$-th pair of consecutive observations. Table 2 contains the results of the calculations and figure 2 plots them. As figure 2 shows, the behavior of the slope in terms of the middle values is also random, so according to probability theory it must also have a mean and a variance and, as a consequence of the second postulate of the Stable Bounded Theory, its mean must be equal to another unknown mathematical function, which we denote by the letter g. It is also important to point out that the function g depends on the population.

In figure 2 we can also observe that the function g seems to be a parabola. If this assumption is true, there must be two values of the population where the curve of g intersects the horizontal axis. We denote those values by $k_1$ and $k_1 + k_2$. At those values the amount of change with respect to time is zero, which implies that $k_1 + k_2$ is a maximum value for the population and $k_1$ is a minimum value. This fact shows empirically that the mean of Mexico's population is bounded by those values. From the mathematical point of view, the maximum and minimum values are the values of the population that make the slope of the deterministic component of equation 5 equal to zero, that is,
$$0 = A P_{t_i}^2 + B P_{t_i} + C$$

and then, using the formula for the roots of a parabola, we have that

$$k_1 = -\frac{B}{2A} + \frac{\sqrt{B^2 - 4AC}}{2A} \qquad (7)$$

$$k_1 + k_2 = -\frac{B}{2A} - \frac{\sqrt{B^2 - 4AC}}{2A} \qquad (8)$$

These results indicate that formulas 7 and 8 are estimators of the minimum and maximum values of the population, respectively. Table 3 presents the ordinary least squares estimates of the constants A, B and C, together with the p-values used to determine their statistical significance. As can be seen, the three coefficients are significantly different from zero, so to estimate the maximum and minimum values of the population the estimates of the coefficients were substituted in 7 and 8, obtaining $k_1 = 7.1$ and $k_1 + k_2 = 153.6$ million people.
In addition to the significance of the parameters, the value of the F statistic was 121.87 with a p-value of 0.0000, which proves that the parabola assumption in equation 6 is true, with a coefficient of determination of 90.29%. (Footnote 1: The residual analysis indicates that the random variables of the model are normally distributed, independent and of constant variance.) These results, together with the fulfillment of the assumptions on the residuals of equation 5, prove mathematically the existence of the maximum and minimum of the Mexican population. Finally, it is important to clarify that the values $k_1 = 7.1$ and $k_1 + k_2 = 153.6$ are bounds for the mean of the population but not for the observations, which according to probability theory will deviate by a certain amount around the mean depending on the variance. The observations can therefore be greater or smaller than $k_1 = 7.1$ and $k_1 + k_2 = 153.6$, but their occurrence will be governed by a probabilistic law.
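The bounds quoted above can be checked from the rounded coefficients in Table 3. The sketch below is only an illustration: the function name `parabola_roots` is hypothetical, and because the inputs are the rounded published estimates, the output agrees with 7.1 and 153.6 only after rounding to one decimal.

```python
import math

# Rounded least squares estimates from Table 3 for g(P) = A*P**2 + B*P + C.
A, B, C = -0.0003, 0.0482, -0.3257


def parabola_roots(a, b, c):
    """Return the two real roots of a*x**2 + b*x + c, smallest first."""
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real roots: the parabola does not cross the axis")
    root_plus = (-b + math.sqrt(disc)) / (2 * a)
    root_minus = (-b - math.sqrt(disc)) / (2 * a)
    return min(root_plus, root_minus), max(root_plus, root_minus)


k1, k_max = parabola_roots(A, B, C)  # minimum k1 and maximum k1 + k2
k2 = k_max - k1

print(f"k1 = {k1:.1f}, k1 + k2 = {k_max:.1f}, k2 = {k2:.1f}")
```

Because A is negative, the root taken with the plus sign in formula 7 is the smaller one, which is why it estimates the minimum.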
According to the postulates of the Stable Bounded Theory, the behavior equations of the observations and of the mean of the population at each time are

$$P_t = f(t) + \varepsilon_t$$

$$\mu_{P_t} = f(t)$$

The problem is that in practice the function $f(t)$ is unknown. However, the trend of the data and the existence of the maximum and minimum values give us an idea of what its derivative is like, and the theory of differential equations can help us deduce its mathematical equation. First, according to the trend of the observed data, the function $f(t)$ has to be increasing, so its derivative will be positive. Second, because of the existence of the maximum and minimum values, its derivative has to be zero at those values. Based on these properties, the Stable Bounded Theory deduces a function that satisfies them.
The Stable Bounded Theory begins by supposing that the derivative of the unknown function is given by the product of two functions $h_1(P)$ and $h_2(t)$, one that depends on the population and another that depends on time, forming a differential equation of separable variables (Wilye, 1979) whose solution is a function that relates the population and time, namely,

$$\frac{dP}{dt} = h_1(P)\,h_2(t) \qquad (9)$$

where $\frac{dP}{dt}$ denotes the derivative of $f(t)$. Since the derivative must be positive and equal to zero at the minimum and maximum values, the function $h_1(P)$ can be as follows:

$$h_1(P) = (P - k_1)(P - k_1 - k_2)$$

and so

$$\frac{dP}{dt} = (P - k_1)(P - k_1 - k_2)\,h_2(t)$$

where $k_1$ and $k_1 + k_2$ are the minimum and maximum values.
We can observe that, because $k_1$ and $k_1 + k_2$ are respectively the lower and upper bounds of the population, the quantity $(P - k_1)$ is always positive but the quantity $(P - k_1 - k_2)$ is negative. Therefore $(P - k_1)(P - k_1 - k_2)$ is negative, which implies that $h_2(t)$ must be negative for the derivative to be positive, as we require. On the other hand, when the population is equal to $k_1$ or to $k_1 + k_2$ the derivative is zero, which is the other condition we require.
Solving the indefinite integral of the left-hand side by partial fractions yields equation 10, in which $w(t)$ is an unknown function such that its derivative is equal to $h_2(t)$ and which can be determined from the observed data, since

$$\ln\left(\frac{k_2}{P - k_1} - 1\right) = w(t) \qquad (11)$$

This implies that, if the Stable Bounded Theory is true, the variable $\ln\left(\frac{k_2}{P - k_1} - 1\right)$ must be a function of time $t$. Gonzalez-Rosas (2010) calls this variable the transformed of the population.
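The partial-fraction step can be written out as follows. This is a sketch under one assumption that the text does not state explicitly: the constant factor $k_2$ and the integration constant $c$ are absorbed into $w(t)$, so that $w'(t)$ is proportional to $h_2(t)$ and has the same sign.

```latex
\begin{aligned}
\frac{dP}{(P-k_1)(P-k_1-k_2)} &= h_2(t)\,dt \\[4pt]
\frac{1}{(P-k_1)(P-k_1-k_2)} &= \frac{1}{k_2}\left[\frac{1}{P-k_1-k_2}-\frac{1}{P-k_1}\right] \\[4pt]
\int \frac{dP}{(P-k_1)(P-k_1-k_2)}
  &= \frac{1}{k_2}\ln\left|\frac{P-k_1-k_2}{P-k_1}\right|
   = \frac{1}{k_2}\ln\left(\frac{k_2}{P-k_1}-1\right),
   \qquad k_1 < P < k_1+k_2 \\[4pt]
\ln\left(\frac{k_2}{P-k_1}-1\right) &= k_2\int h_2(t)\,dt + c \;\equiv\; w(t)
\end{aligned}
```

The third line uses the fact that, between the bounds, $\left|\frac{P-k_1-k_2}{P-k_1}\right| = \frac{k_1+k_2-P}{P-k_1} = \frac{k_2}{P-k_1}-1$, which is the argument of the logarithm in equation 11.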
In order to estimate the function $w(t)$, we first estimated $k_2$ and then substituted the estimates in equation 11, that is,

$$k_1 = 7.1, \qquad k_1 + k_2 = 153.6, \qquad k_2 = 146.5$$

After that, we assigned the observed values of the population and calculated the transformed of the population. Table 4 shows the results and figure 4 the behavior of the transformed over time. Since the derivative of $w(t)$ has to be equal to $h_2(t)$, which has to be negative, in the case of a straight-line pattern the parameter $\beta$ has to be negative, and in the case of a parabola pattern the parameters A and B have to be negative. If we use the straight line we obtain a pattern called Logistic, and if we use the parabola we obtain a pattern called Extended Gaussian. The equations of these patterns are, respectively,

$$P_t = k_1 + \frac{k_2}{1 + e^{\alpha + \beta t}} \qquad (12)$$

$$P_t = k_1 + \frac{k_2}{1 + e^{A t^2 + B t + C}} \qquad (13)$$

In equation 12, because $\beta$ is negative, P approaches $k_1 + k_2$ as time increases, and in equation 13, because A and B are negative, P also approaches $k_1 + k_2$ as time increases. That is, those parameters determine how quickly P approaches the maximum. Because of these characteristics the parameters $\beta$, A and B are called the quickness parameters (González-Rosas, 2018).
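The role of the quickness parameters in equations 12 and 13 can be illustrated numerically. The sketch below assumes the bounds 7.1 and 146.5 quoted in the text; the function names, the argument names `alpha` and `beta`, and the parameter values in the loop are hypothetical and chosen only to show that a more negative slope brings P to the maximum sooner.

```python
import math

K1, K2 = 7.1, 146.5  # estimated minimum k1 and range k2 (millions)


def logistic_pattern(t, alpha, beta, k1=K1, k2=K2):
    """Equation 12: P = k1 + k2 / (1 + exp(alpha + beta*t))."""
    return k1 + k2 / (1.0 + math.exp(alpha + beta * t))


def extended_gaussian_pattern(t, a, b, c, k1=K1, k2=K2):
    """Equation 13: P = k1 + k2 / (1 + exp(a*t**2 + b*t + c))."""
    return k1 + k2 / (1.0 + math.exp(a * t * t + b * t + c))


# Illustrative (not estimated) quickness parameters: the more negative
# value of beta approaches the maximum k1 + k2 in fewer years.
for beta in (-0.02, -0.05):
    values = [round(logistic_pattern(t, 9.0, beta), 1) for t in (0, 200, 400, 800)]
    print(f"beta = {beta}: {values}")
```

In both patterns the exponent tends to minus infinity as time grows, so the denominator tends to 1 and P tends to $k_1 + k_2$.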
In order to determine which pattern fits the observed data better, we estimated both the straight line and the parabola. Table 5 presents the ordinary least squares estimates and the p-values for the straight line and the parabola. As table 5 shows, all parameters are significantly different from zero at the 5% significance level, except the parameter B, which is significant at the 6% level. The p-values of both F tests indicate that both equations are adequate at the 5% significance level. Finally, the coefficient of determination shows that the straight line explains 99.86 percent of the variation of the transformed population, while the parabola explains 99.17 percent; that is, the straight line explains the variation of the data better. The estimated equations of the logistic and Gaussian patterns are, respectively,

$$P_t = 7.1 + \frac{146.5}{1 + e^{9.4776 - 0.0474\,t}} \qquad (14)$$

$$P_t = 7.1 + \frac{146.5}{1 + e^{-0.0001\,t^2 - 0.0076\,t + 5.5019}} \qquad (15)$$

Figure 5 shows the graphs of both patterns. At a glance, both the logistic pattern and the Gaussian pattern fit the observed data very well over the whole period; however, the Gaussian pattern seems to fit better than the logistic pattern in the 1983-1910 period. On the other hand, the logistic pattern seems to [...]

When we substituted the data in equation 16 we found that the MAE for the logistic pattern was 98.73, while for the Gaussian pattern it was 102.26. Based on these results we decided that the logistic pattern explains the evolution of the population through time in Mexico better.
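The straight-line estimation described here can be sketched with a plain ordinary least squares fit of the transformed population on time. Several things below are assumptions: the fit uses every row of Table 4, whereas the estimation sample behind Table 5 is not stated, so the output need not match the published coefficients; MAE is taken to be the usual mean absolute error; and it is computed on the transformed scale rather than on the population scale. The function names are hypothetical.

```python
# (time in years since 1790, transformed population) copied from Table 4.
TABLE_4 = [
    (37, 5.0515), (40, 5.0558), (68, 4.5503), (80, 4.4379), (90, 4.3155),
    (103, 3.3594), (110, 3.0650), (120, 2.8344), (131, 2.9538), (140, 2.6586),
    (150, 2.3609), (160, 1.9202), (170, 1.4504), (180, 0.9410), (190, 0.3737),
    (200, -0.0250), (205, -0.2977), (210, -0.4769), (215, -0.6476),
    (220, -0.9367), (225, -1.1935),
]


def fit_line(points):
    """Least squares fit of y = alpha + beta*t; returns (alpha, beta, r2)."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_t
    ss_res = sum((y - (alpha + beta * t)) ** 2 for t, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return alpha, beta, 1.0 - ss_res / ss_tot


def mean_absolute_error(observed, fitted):
    """Average absolute difference between observed and fitted values."""
    return sum(abs(o - f) for o, f in zip(observed, fitted)) / len(observed)


alpha, beta, r2 = fit_line(TABLE_4)
fitted = [alpha + beta * t for t, _ in TABLE_4]
mae = mean_absolute_error([y for _, y in TABLE_4], fitted)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, R2 = {r2:.4f}, MAE = {mae:.4f}")
```

A negative slope is the property the theory requires of the straight line, since it is what makes the population approach its maximum as time increases.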
All the results above show that the behavior of the mean of the population through time in Mexico is governed by the following mathematical equation:

$$\mu_{P_t} = 7.1 + \frac{146.5}{1 + e^{9.4776 - 0.0474\,t}} \qquad (17)$$

where $t$ is the time in years elapsed since 1790. When we gave values to the time variable in equation 17 we obtained point forecasts of the mean of Mexico's population for the 2016-2050 period (Annex 1). Figure 6 shows that the model fits the observed data very well. According to the results of equation 17, in 2025 the mean of Mexico's population will be 130.2 million people, in 2040 it will be 141.1 million, and by 2050 it will reach 148 million people, still 8 million below the population maximum of 153.6 million people.
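The point forecasts can be approximated by evaluating the logistic pattern with the rounded estimates from Table 5 and the bounds quoted in the text. This is a sketch, not the authors' computation: time is assumed to be measured in years since 1790, as in the Time column of Tables 2 and 4, and the function name `mean_population` is hypothetical.

```python
import math

K1, K2 = 7.1, 146.5            # bounds quoted in the text (millions)
ALPHA, BETA = 9.4776, -0.0474  # rounded straight-line estimates, Table 5
BASE_YEAR = 1790               # assumed origin of the time variable


def mean_population(year):
    """Logistic pattern evaluated at a calendar year (millions of people)."""
    t = year - BASE_YEAR
    return K1 + K2 / (1.0 + math.exp(ALPHA + BETA * t))


for year in (2025, 2040, 2050):
    print(year, round(mean_population(year), 1))
```

With these rounded inputs the 2025 and 2040 values agree with the figures quoted above to one decimal. The 2050 value comes out near 145.6 million, roughly 8 million below the maximum, rather than at 148.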
It is very important to clarify that what we are forecasting is the mean of the population, not the observations, because the latter are random and hence cannot be predicted. Thus in 2025 the actual observation may be below or above 130.2 million, and the same will happen in 2040 and 2050.
If we consider that at each moment of time the population is a random phenomenon, then we can explain the irregular behavior of the population observed in figure 1. However, this hypothesis has the consequence that we cannot forecast the population, since random phenomena cannot be predicted. So the question arose: how can we predict what is not predictable?
The answer came from probability theory: according to this theory, if the population is a random phenomenon it must have a mean and a variance. When we supposed that the mean has a deterministic behavior given by a mathematical function that depends on time, we accepted that we would be able to predict at least the mean of the population.
Next, according to the trend of the data, the function had to be increasing. However, a population cannot grow indefinitely, so it was better to suppose that it must tend to stabilize, or perhaps reach a maximum and then decrease. This situation raised two more questions: first, at what value will the population stabilize or reach its maximum in the future? And second, what function should we use to predict the population? We answered these two questions using the Stable Bounded Theory, which allowed us to prove the existence of a stabilizing value and to calculate it. We also found the function, or pattern, that allowed us to make the predictions of the population.
In Mexico, for the period 1790-2050, the behavior of the mean of the population through time is governed by a mathematical law that depends on time.
For the first time, the scientific community has mathematical tests of matters that had only been observed at a glance, that is, tests of the logistic pattern of population growth.
In Mexico, demographers have used the logistic pattern to explain the evolution of the population through time, but they have never given a mathematical test of it. This paper proves all the hypotheses used about it and replaces the empirical aspects.
Although this exercise was done with data from Mexico, it is important to make clear that the Stable Bounded Theory can be applied to any country for which population data are available. It can also be used to forecast mortality, fertility and net migration.
However, it is necessary to warn that the results of this paper are based on the assumption that the social, economic and political conditions of Mexico will continue without change. If this assumption is not fulfilled, the forecasts we give will not hold.
It is also necessary to warn that the mathematical modeling of reality is based on assumptions, and that theoretical results are true only if the assumptions are fulfilled, so a great effort must be made to prove that the assumptions are true.
Finally, it is clear that any exercise to predict the future is exposed to many error sources: wrong data, false assumptions, wrong models and so on. Therefore, it is necessary to identify all possible error sources and then use methodologies that minimize those errors. The Stable Bounded Theory is an example of that.

Table 1. Mexico's population, 1790-2015 (millions of people)

| Year | Population | Year | Population |
|---|---|---|---|
| 1790 | 4.64 | 1910 | 15.2 |
| 1803 | 5.76 | 1921 | 14.33 |
| 1810 | 6.12 | 1930 | 16.66 |
| 1820 | 6.2 | 1940 | 19.7 |
| 1827 | 8.0 | 1950 | 25.8 |
| 1830 | 7.996 | 1960 | 34.9 |
| 1838 | 7.004 | 1970 | 48.2 |
| 1842 | 7.015 | 1980 | 66.8 |
| 1850 | 7.5 | 1990 | 81.25 |
| 1858 | 8.6 | 1995 | 91.16 |
| 1870 | 8.78 | 2000 | 97.48 |
| 1880 | 9.0 | 2005 | 103.26 |
| 1893 | 11.99 | 2010 | 112.34 |
| 1900 | 13.6 | 2015 | 119.51 |

Source: 1790-2010, INEGI (2017a); 2015, INEGI (2017b).
Note: It is very important to point out that most of the data in table 1 were obtained from population censuses.

[Figure 1. Mexico's population, 1790-2015. Vertical axis: million people; horizontal axis: year. Labeled values: 4.64, 9.0, 13.6, 34.9, 66.8 and 119.51.]

[Figure 2. Slopes against middle values of the population, with the curve g(P) crossing the horizontal axis at k1 and k1+k2. Vertical axis: slope; horizontal axis: middle values.]

Table 2. Middle points and slopes of consecutive population values

| Year | Time | Population | Middle points | Slopes |
|---|---|---|---|---|
| 1790 | 0 | 4.64 | 5.20 | 0.0862 |
| 1803 | 13 | 5.76 | 5.94 | 0.0514 |
| 1810 | 20 | 6.12 | 6.16 | 0.0080 |
| 1820 | 30 | 6.2 | 7.10 | 0.2571 |
| 1827 | 37 | 8.0 | 8.0 | -0.0013 |
| 1830 | 40 | 7.996 | 7.50 | -0.1240 |
| 1838 | 48 | 7.004 | 7.01 | 0.0028 |
| 1842 | 52 | 7.015 | 7.26 | 0.0606 |
| 1850 | 60 | 7.5 | 8.05 | 0.1375 |
| 1858 | 68 | 8.6 | 8.69 | 0.0150 |
| 1870 | 80 | 8.78 | 8.89 | 0.0220 |
| 1880 | 90 | 9 | 10.50 | 0.2300 |
| 1893 | 103 | 11.99 | 12.80 | 0.2300 |
| 1900 | 110 | 13.6 | 14.40 | 0.1600 |
| 1910 | 120 | 15.2 | 14.77 | -0.0791 |
| 1921 | 131 | 14.33 | 15.50 | 0.2589 |
| 1930 | 140 | 16.66 | 18.18 | 0.3040 |
| 1940 | 150 | 19.7 | 22.75 | 0.6100 |
| 1950 | 160 | 25.8 | 30.35 | 0.9100 |
| 1960 | 170 | 34.9 | 41.55 | 1.3300 |
| 1970 | 180 | 48.2 | 57.50 | 1.8600 |
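
The middle points and slopes in this table can be recomputed from consecutive rows of Table 1: the slope is the change in population divided by the change in time, and the middle point is the average of the two populations. The sketch below does this for the first rows only; the function name `slopes_and_middle_values` is hypothetical.

```python
# (year, population in millions) for the first rows of Table 1.
data = [(1790, 4.64), (1803, 5.76), (1810, 6.12), (1820, 6.2), (1827, 8.0)]


def slopes_and_middle_values(points):
    """Return (middle value, slope) for each pair of consecutive observations."""
    result = []
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        slope = (p1 - p0) / (t1 - t0)  # change in population per year
        middle = (p0 + p1) / 2         # midpoint of the two populations
        result.append((middle, slope))
    return result


rows = slopes_and_middle_values(data)
for (year, _), (middle, slope) in zip(data, rows):
    print(year, round(middle, 2), round(slope, 4))
```

Each printed row is attributed to the earlier year of the pair, which is the convention the table appears to follow.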

Table 3. Ordinary least squares estimates of the constants A, B and C

| Parameter | Estimate | Standard error | t-value | p-value |
|---|---|---|---|---|
| A | -0.0003 | 0.00005 | -5.8 | 0.000 |
| B | 0.0482 | 0.00558 | 8.64 | 0.000 |
| C | -0.3257 | 0.08217 | -3.96 | 0.001 |

Table 4. Transformed of the population

| Year | Time | Population | Transformed of the population |
|---|---|---|---|
| 1827 | 37 | 8.00 | 5.0515 |
| 1830 | 40 | 8.00 | 5.0558 |
| 1858 | 68 | 8.60 | 4.5503 |
| 1870 | 80 | 8.78 | 4.4379 |
| 1880 | 90 | 9.00 | 4.3155 |
| 1893 | 103 | 11.99 | 3.3594 |
| 1900 | 110 | 13.60 | 3.0650 |
| 1910 | 120 | 15.20 | 2.8344 |
| 1921 | 131 | 14.33 | 2.9538 |
| 1930 | 140 | 16.66 | 2.6586 |
| 1940 | 150 | 19.70 | 2.3609 |
| 1950 | 160 | 25.80 | 1.9202 |
| 1960 | 170 | 34.90 | 1.4504 |
| 1970 | 180 | 48.20 | 0.9410 |
| 1980 | 190 | 66.80 | 0.3737 |
| 1990 | 200 | 81.25 | -0.0250 |
| 1995 | 205 | 91.16 | -0.2977 |
| 2000 | 210 | 97.48 | -0.4769 |
| 2005 | 215 | 103.26 | -0.6476 |
| 2010 | 220 | 112.34 | -0.9367 |
| 2015 | 225 | 119.51 | -1.1935 |
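
The transformed values in this table follow from equation 11 applied to each observed population. The sketch below uses the rounded bounds 7.1 and 146.5, so it matches the later rows of the table closely and the earliest rows only approximately, since populations close to k1 are sensitive to rounding. The function name `transformed` is hypothetical.

```python
import math

K1, K2 = 7.1, 146.5  # rounded estimates of k1 and k2 (millions)


def transformed(population, k1=K1, k2=K2):
    """Equation 11: ln(k2 / (P - k1) - 1), defined for k1 < P < k1 + k2."""
    if not k1 < population < k1 + k2:
        raise ValueError("population must lie strictly between the bounds")
    return math.log(k2 / (population - k1) - 1.0)


for year, pop in [(1950, 25.80), (1990, 81.25), (2015, 119.51)]:
    print(year, round(transformed(pop), 4))
```

The transformation is undefined for populations at or below k1, which may be why the earliest years of Table 1 have no row here.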
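
Table 5. Ordinary least squares estimates for the straight line and the parabola

| Pattern | Parameter | Estimate | Standard error | p-value of t test | p-value of F test | R² |
|---|---|---|---|---|---|---|
| Straight line | α | 9.4776 | 0.113421 | 0.000 | 0.0000 | 0.9986 |
| Straight line | β | -0.0474 | 0.00058 | 0.000 | | |
| Parabola | A | -0.0001 | 0.000014 | 0.000 | 0.0000 | 0.9917 |
| Parabola | B | -0.0076 | 0.00374 | 0.056 | | |
| Parabola | C | 5.5019 | 0.2301 | 0.000 | | |

Source: Own calculations based on table 4 and Stata/SE 11.1
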
González-Rosas (2010). Teoría Estadística y Probabilística de los Fenómenos Estable Acotados [Statistical and Probabilistic Theory of Stable-Bounded Phenomena]. Master's thesis, Universidad Nacional Autónoma de México (National Autonomous University of Mexico).
González-Rosas (2018). The Stable Bounded Theory. An alternative to projecting the net migration. The case of México. Athens Journal of Social Sciences, 5 (1), January 2018.
Alkema (2015). The United Nations Probabilistic Projections: An introduction to demographic forecasting with uncertainty. Foresight (37).
Australian Bureau of Statistics (2017). Statistical Language. Estimate and Projection. http://www.abs.gov.au/websitedbs/a3121120.nsf/home/statistical+language+-+estimate+and+projection Retrieved June 24, 2017.
INEGI (2017b). Principales resultados de la Encuesta Intercensal 2015 [Main results of the 2015 Intercensal Survey]. http://www.beta.inegi.org.mx/contenidos/proyectos/enchogares/especiales/intercensal/2015/doc/eic2015_resultados.pdf
World Population History (2017). Projecting Global Population to 2050 and Beyond. http://worldpopulationhistory.org/projecting-global-population/ Retrieved June 24, 2017.
INEGI (2017a). Sistema para la Consulta de las Estadísticas Históricas de México 2014 [System for the Consultation of Historical Statistics of Mexico 2014]. http://dgcnesyp.inegi.org.mx/cgi-in/ehm2014.exe/CI010010 Retrieved in May.
Population Reference Bureau (2017). Understanding and Using Population Projections. //www.prb.org/Publications/Reports/2001/UnderstandingandUsingPopulationProjections.aspx