CHAOS Demographic Forecasts

This article summarises how CHAOS forecasts demographic changes at the postcode area level.

Overview

CHAOS applies machine learning (ML) algorithms to produce a wide range of forecasts at the postal code level, the only such offering on the Finnish market. With CHAOS you get access to the following forecasts:

  • population;
  • age/generation and gender;
  • income;
  • household type;
  • education.

In Finland we forecast 54 municipalities, which include all major cities and their satellites. All forecasts will also become available for other municipalities at the same granularity as for the Helsinki Metropolitan Area (HMA). Please contact us if you would like the cities you are interested in to be prioritised in our forecasting plan.

FAQ

  1. How do you do the forecasting? What are these forecasts based on?
  2. Do you consider factors such as migration in your forecasts?
  3. How do you validate your results? What is your accuracy?
  4. CHAOS forecasts Vs Open data forecasts?
  5. What are the use cases for demographic forecasts?

How do you do the forecasting? What are these forecasts based on?

CHAOS uses state-of-the-art probabilistic time series modelling for its forecasts. This approach learns the seasonalities and trends embedded in historical data and uses them to project future values. The technique is widely used across industries, from financial market forecasts to predicting consumer behaviour. It has also been applied successfully to various demographic projection problems in academic research.
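
To give a feel for what a probabilistic time series forecast looks like, here is a minimal sketch. It is not the CHAOS model: it uses a simple random walk with drift as a stand-in, captures only a trend (no seasonality), and approximates the uncertainty with a normal interval that ignores the uncertainty in the drift estimate itself. The function name drift_forecast and the population figures are made up for illustration.

```python
import math
import statistics


def drift_forecast(history, horizon, z=1.96):
    """Random-walk-with-drift forecast with approximate normal intervals.

    Hypothetical illustration only; this is not the CHAOS model.
    Returns a list of (low, point, high) tuples, one per future step.
    """
    if len(history) < 3:
        raise ValueError("need at least three observations")
    # Year-over-year changes capture the historical trend.
    diffs = [later - earlier for earlier, later in zip(history, history[1:])]
    drift = statistics.mean(diffs)
    sigma = statistics.stdev(diffs)  # spread of the yearly changes
    forecasts = []
    for h in range(1, horizon + 1):
        point = history[-1] + h * drift
        # Uncertainty grows with the square root of the horizon.
        half_width = z * sigma * math.sqrt(h)
        forecasts.append((point - half_width, point, point + half_width))
    return forecasts


# Made-up yearly population counts for one postcode area.
history = [5000, 5060, 5110, 5180, 5230]
forecast = drift_forecast(history, horizon=3)
for step, (low, point, high) in enumerate(forecast, start=1):
    print(f"year +{step}: {point:.0f} (approx. 95% interval {low:.0f} to {high:.0f})")
```

The key point is that each forecast is a range rather than a single number, and the range widens the further ahead you look.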

Do you consider factors such as migration in your forecasts?

The CHAOS approach learns the seasonalities and trends embedded in historical data to project future values. It therefore relies only on past data and does not consider external drivers of population change, such as migration.

How do you validate your results? What is your accuracy?

Where possible, CHAOS compares its own forecasts to the national population projections published by Tilastokeskus (Statistics Finland). Specifically, we sum our projections for all individual postcode areas within a municipality or city and compare the total to the Tilastokeskus population projection.
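
The aggregation step described above can be sketched as follows. This is an illustration under stated assumptions, not CHAOS code: the function names, the postcode-to-municipality mapping and all population figures are made up.

```python
from collections import defaultdict


def aggregate_by_municipality(postcode_forecasts, postcode_to_municipality):
    """Sum postcode-level forecasts into one total per municipality."""
    totals = defaultdict(float)
    for postcode, value in postcode_forecasts.items():
        totals[postcode_to_municipality[postcode]] += value
    return dict(totals)


def percent_difference(value, reference):
    """Signed difference of value from reference, in percent of reference."""
    return 100.0 * (value - reference) / reference


# Made-up postcode forecasts for a single year (hypothetical numbers).
postcode_forecasts = {"00100": 18000, "00120": 7000, "02100": 15000}
postcode_to_municipality = {
    "00100": "Helsinki",
    "00120": "Helsinki",
    "02100": "Espoo",
}
# Made-up official projections to compare against (hypothetical numbers).
official_projection = {"Helsinki": 25500, "Espoo": 14700}

municipal_totals = aggregate_by_municipality(
    postcode_forecasts, postcode_to_municipality
)
for name, total in municipal_totals.items():
    diff = percent_difference(total, official_projection[name])
    print(f"{name}: aggregated {total:.0f}, difference {diff:+.2f}%")
```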

The figures below show an example: we compare our forecasts for 2019-2022 with the Tilastokeskus population projections for Helsinki, Espoo and Vantaa, and with the actual numbers. You can access the source of the Tilastokeskus projections used for comparison here.

Figure: Population projections for Helsinki: Tilastokeskus, CHAOS, and actual numbers

Figure: Population projections for Espoo: Tilastokeskus, CHAOS, and actual numbers

Figure: Population projections for Vantaa: Tilastokeskus, CHAOS, and actual numbers

Across the four forecast years (2019-2022), the CHAOS and Tilastokeskus estimates differ by about 1.5% on average. In terms of accuracy, CHAOS estimates are typically about 3% below the actual values, which translates into about 10 000 people per municipality on average. Tilastokeskus projections are slightly closer to the actual numbers: they are about 1.5% below the actual values, which translates into about 5 000 people.

Forecast comparison for Helsinki, Espoo and Vantaa

Metric                                                               Years          Value
Mean difference between CHAOS and Tilastokeskus forecasts            2019 - 2022    1.48%
Mean absolute error of CHAOS forecasts                               2019 - 2020    3%
Mean absolute error of Tilastokeskus forecasts                       2019 - 2020    1.5%
Mean absolute error of CHAOS forecasts (number of people)            2019 - 2020    10 065 persons
Mean absolute error of Tilastokeskus forecasts (number of people)    2019 - 2020    4 972 persons
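
For readers who want to reproduce this kind of accuracy figure on their own data, here is a minimal sketch of the two error measures, one in persons and one in percent. The article does not describe exactly how CHAOS computes its figures, so the formulas below are the standard definitions and are an assumption; the function names and all numbers are made up.

```python
def mean_absolute_error(forecasts, actuals):
    """Average absolute gap between forecast and actual, in persons."""
    if len(forecasts) != len(actuals) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


def mean_absolute_percentage_error(forecasts, actuals):
    """Average absolute gap as a percentage of the actual value."""
    if len(forecasts) != len(actuals) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(f - a) / a for f, a in zip(forecasts, actuals)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up actual populations and forecasts for three municipalities.
actuals = [650000, 290000, 235000]
forecasts = [630500, 281300, 227950]

mae = mean_absolute_error(forecasts, actuals)
mape = mean_absolute_percentage_error(forecasts, actuals)
print(f"mean absolute error: {mae:.0f} persons")
print(f"mean absolute percentage error: {mape:.2f}%")
```

Reporting both measures is useful because the same percentage error means very different numbers of people in a large city and in a small municipality.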

CHAOS validates its models in multiple ways. Among other things, our data science team applies a train-test split to the data before modelling. For example, the team first trains the model on 2010-2015 data and forecasts the years 2016-2020. We then evaluate model performance by comparing it to the 'naive model', the simplest forecasting model, which sets the next forecasted value to the last observed value. Actual figures from the recent past, where available, are also used to check accuracy.
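
The train-test comparison against the naive model can be sketched as below. The 2010-2015 / 2016-2020 split follows the example above; everything else is an assumption for illustration. In particular, the least-squares trend line is only a stand-in for the real forecasting model, and the population series is made up.

```python
def naive_forecast(train, horizon):
    """Naive model: repeat the last observed value for every future step."""
    return [train[-1]] * horizon


def linear_trend_forecast(train, horizon):
    """Stand-in model (assumption): extend a least-squares trend line."""
    n = len(train)
    x_mean = (n - 1) / 2
    y_mean = sum(train) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(train))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n - 1 + h) for h in range(1, horizon + 1)]


def mean_absolute_error(forecasts, actuals):
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


# Made-up yearly population of one area, 2010-2020.
series = {
    2010: 4800, 2011: 4850, 2012: 4910, 2013: 4960, 2014: 5020, 2015: 5070,
    2016: 5130, 2017: 5180, 2018: 5240, 2019: 5290, 2020: 5350,
}
train = [value for year, value in sorted(series.items()) if year <= 2015]
test = [value for year, value in sorted(series.items()) if year >= 2016]

naive_mae = mean_absolute_error(naive_forecast(train, len(test)), test)
model_mae = mean_absolute_error(linear_trend_forecast(train, len(test)), test)
print(f"naive model MAE: {naive_mae:.1f} persons")
print(f"trend model MAE: {model_mae:.1f} persons")
print("model beats naive baseline:", model_mae < naive_mae)
```

A model is only worth using if it clearly beats the naive baseline on years it has not seen during training.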

CHAOS forecasts Vs Open data forecasts?

While demographic projections for individual municipalities can be obtained through some open data portals, the data is often not available for every municipality and, more importantly, the data sources and the models behind the projections are often unclear.

CHAOS uses quality data sources (read more in the data section) and state-of-the-art probabilistic time series modelling that has been used across different industries. Our team has already done the data sourcing, collection and cleaning, and offers you multiple forecasts in one platform. With easily accessible, reliable forecasts, you can save time on tedious data-related tasks and focus on work that brings value.

What are the use cases for demographic forecasts?

In most use cases, demographic forecasts serve as a driver of change in some other phenomenon. For example, demographics is one of the main drivers of demand for residential properties. Forecasts are therefore typically combined with additional information to draw conclusions or support decision making.

Traditional Excel tables provide information and are useful as the final piece of the puzzle, that is, in cases where you already have the rest of the necessary data lined up and visualised. Otherwise, Excel table data is just data without context.

CHAOS forecasts, on the other hand, are placed in a larger context: they are combined with various additional information and visualised on a map, so you can readily apply them to different use cases. Read more in How to apply CHAOS forecasts and Use cases.

If your specific use case is not covered in our dashboard, please don't hesitate to contact our team. We are interested in incorporating different use cases into our offering.