Temporal forecast reconciliation
Daniele Girolimetto
2024-06-07
Source:vignettes/articles/Temporal-forecast-reconciliation.qmd
Temporal-forecast-reconciliation.qmd
Introduction
In this vignette, we explore the process of temporal reconciliation
using the FoReco
package, with a focus on the Capital
Country monthly time series (ADB
) from the
vndata
dataset (Wickramasuriya et
al., 2018). Temporal reconciliation is a method used to ensure
that forecasts are coherent across different time periods, providing a
consistent view from high-frequency (e.g., monthly) to aggregated levels
(e.g., yearly). We will first obtain base forecasts using an ETS model
with log transformation and then proceed with several reconciliation
methods: bottom-up, top-down, and optimal combination. Each method has
its own strengths and practical challenges, such as ensuring
non-negativity and incorporating a priori constraints.
Capital Country monthly time series
In the temporal framework, we are working with a single series. In
particular for this vignette we consider the Capital Country monthly
time series (ADB
) from the vndata
dataset
(Wickramasuriya et al., 2018). See the dataset vignette for more
details.
Base Forecast
To obtained the base forecasts, we fit an ETS model with log transformation. We obtain twelve-, six-, four-, three-, two-, and one-step-ahead base forecasts from the monthly data and the aggregation over 2, 3, 4, 6, and 12 months.
data_k <- aggts(y, te_set)
model <- setNames(vector(mode='list', length=length(te_set)), paste0("k-", te_set))
fc_obj <- setNames(vector(mode='list', length=length(te_set)), paste0("k-", te_set))
# ETS model with log transformation
ets_log <- function(x, ...){
x[x==0] <- min(x[x!=0])/2
ets(x, lambda = 0, ...)
}
for(k in te_set){
model[[paste0("k-", k)]] <- ets_log(data_k[[paste0("k-", k)]])
fc_obj[[paste0("k-", k)]] <- forecast(model[[paste0("k-", k)]], h = m/k)
}
We extract the point forecasts and residuals from the fitted models.
# Point forecasts
base <- lapply(fc_obj, function(x) x$mean)
str(base, give.attr = FALSE)
#> List of 6
#> $ k-12: Time-Series [1:1] from 2017 to 2017: 1.31
#> $ k-6 : Time-Series [1:2] from 2017 to 2018: 0.557 0.557
#> $ k-4 : Time-Series [1:3] from 2017 to 2018: 0.356 0.356 0.356
#> $ k-3 : Time-Series [1:4] from 2017 to 2018: 0.243 0.243 0.243 0.243
#> $ k-2 : Time-Series [1:6] from 2017 to 2018: 0.225 0.225 0.225 0.225 0.225 ...
#> $ k-1 : Time-Series [1:12] from 2017 to 2018: 0.178 0.178 0.178 0.178 0.178 ...
# Residuals
res <- lapply(fc_obj, residuals, type = "response")
str(res, give.attr = FALSE)
#> List of 6
#> $ k-12: Time-Series [1:19] from 1998 to 2016: 18.15 -1.18 1.13 31.64 13.37 ...
#> $ k-6 : Time-Series [1:38] from 1998 to 2016: -0.422 18.911 -0.422 -0.422 -0.422 ...
#> $ k-4 : Time-Series [1:57] from 1998 to 2017: -0.221 14.333 4.423 -0.221 -0.221 ...
#> $ k-3 : Time-Series [1:76] from 1998 to 2017: -0.243 -0.228 14.338 4.377 -0.306 ...
#> $ k-2 : Time-Series [1:114] from 1998 to 2017: -0.0904 -0.0904 -0.0904 14.4633 -0.0905 ...
#> $ k-1 : Time-Series [1:228] from 1998 to 2017: -0.0433 -0.0433 -0.0433 -0.0433 -0.0433 ...
Point Reconciliation
Bottom-up reconciliation (Dunn et al., 1976) aggregates the high-frequency forecasts from the monthly level to higher temporal levels (Girolimetto et al., 2024).
fc_bts <- base$`k-1`
rf_bu <- tebu(fc_bts, agg_order = m)
str(rf_bu, give.attr = FALSE)
#> Named num [1:28] 2.137 1.069 1.069 0.712 0.712 ...
To obtain a list of forecasts at different orders of aggregation, we
can use the FoReco2matrix
function.
str(FoReco2matrix(rf_bu), give.attr=FALSE)
#> List of 6
#> $ k-12: num 2.14
#> $ k-6 : num [1:2] 1.07 1.07
#> $ k-4 : num [1:3] 0.712 0.712 0.712
#> $ k-3 : num [1:4] 0.534 0.534 0.534 0.534
#> $ k-2 : num [1:6] 0.356 0.356 0.356 0.356 0.356 ...
#> $ k-1 : num [1:12] 0.178 0.178 0.178 0.178 0.178 ...
In top-down reconciliation for hierarchical time series, the forecast for the annual series (\(k = 12\)) is distributed proportionally to ensure the top-level value stays the same and all bottom-level forecasts are non-negative (Gross & Sohl, 1990).
y_mat <- matrix(data_k$`k-1`, ncol = m, byrow = TRUE)
x12 <- data_k$`k-12`
fc_x12 <- base$`k-12`
# Average historical proportions - Gross-Sohl method A
p_gsa <- colMeans(apply(y_mat, 2, function(x) x/x12), na.rm = TRUE)
rf_td_gsa <- tetd(fc_x12, agg_order = m, weights = p_gsa)
str(rf_td_gsa, give.attr = FALSE)
#> Named num [1:28] 1.314 0.497 0.818 0.216 0.791 ...
# Proportions of the historical averages - Gross-Sohl method F
p_gsf <- colMeans(y_mat)/mean(x12)
rf_td_gsf <- tetd(fc_x12, agg_order = m, weights = p_gsf)
str(rf_td_gsf, give.attr = FALSE)
#> Named num [1:28] 1.314 0.501 0.813 0.079 0.99 ...
To perform temporal reconciliation with FoReco
using the
base forecasts at any temporal aggregation order, it is necessary to
arrange base forecasts (and residuals) in vector form.
The level conditional coherent reconciliation (LCC) is a generalization of the original proposal by Hollyman et al. (2021) and Di Fonzo & Girolimetto (2024) to include the temporal framework.
rf_lcc <- telcc(base = base_vec, agg_order = m,
res = res_vec, comb = "wlsv")
str(rf_lcc, give.attr = FALSE)
#> Named num [1:28] 1.326 0.663 0.663 0.442 0.442 ...
Finally we can obtained the optimal (in least squares sense) combination temporal reconciled forecast (Athanasopoulos et al., 2017).
rf_opt <- terec(base = base_vec, agg_order = m,
res = res_vec, comb = "sar1")
str(rf_opt, give.attr = FALSE)
#> Named num [1:28] 1.329 0.665 0.665 0.443 0.443 ...
The following table shows some options for the optimal combination
temporal reconciliation function terec()
.
Standard | |
approach="proj" \[
\tilde{\mathbf{y}}_h = \left[\mathbf{I}_n -
\mathbf{W}\mathbf{C}'(\mathbf{C}\mathbf{W}\mathbf{C}')^{-1}\mathbf{C}\right]
\hat{\mathbf{y}}_h
\]
|
approach="strc" \[
\tilde{\mathbf{y}}_h =
\mathbf{S}(\mathbf{S}'\mathbf{W}^{-1}\mathbf{S})^{-1}\mathbf{S}'\mathbf{W}^{-1}
\hat{\mathbf{y}}_h
\]
|
Non-negative forecast reconciliation (osqp) | |
approach="proj" + nn="osqp" or
nn="proj_osqp"
|
approach="strc" + nn="osqp" or
nn="strc_osqp"
|
Non-negative forecast reconciliation (sntz) | |
not supported* |
nn="sntz"
|
Immutable forecast reconciliation | |
supported | supported |
* If the cross-sectional zero-constraints matrix is \(\mathbf{C} =
[\mathbf{I}\quad-\mathbf{A}]\), then nn="sntz" is
supported.
|
Practical challenges
Non negativity issues
It is not always true that reconciled forecasts will remain
non-negative even if the base forecasts are non-negative. For example,
consider the case of an shrinkage covariance matrix
("shr"
).
rf_opt_shr <- terec(base = base_vec, agg_order = m,
res = res_vec, comb = "shr")
recoinfo(rf_opt_shr)
#> ✔ Optimal Temporal Forecast Reconciliation
#> ℹ FoReco function: `terec`
#> ℹ Covariance approximation: `shr`
#> ℹ Non-negative forecasts: `FALSE`
To address this issue, we can use two approaches:
- State-of-the-art numerical optimization procedure, osqp (Stellato et al., 2020).
rf_osqp <- terec(base = base_vec, agg_order = m,
res = res_vec, comb = "shr", nn = "osqp")
tmp <- recoinfo(rf_osqp)
#> ✔ Optimal Temporal Forecast Reconciliation
#> ℹ FoReco function: `terec`
#> ℹ Covariance approximation: `shr`
#> ℹ Non-negative forecasts: `TRUE`
tmp$info # OSQP information matrix
#> obj_val run_time iter pri_res status status_polish
#> 1 -16.58821 0.000280082 50 5.551115e-17 1 1
- Simple heuristic strategy: set-negative-to-zero, sntz (Di Fonzo & Girolimetto, 2023).
A priori constrained (immutable) forecasts
Sometimes we may wish to incorporate a priori knowledge during the reconciliation process (Zhang et al., 2023) in order to improve the accuracy of the reconciled forecasts. For example, suppose we want to fix the forecasts of the annual level at the base forecasts values.
rf_imm <- terec(base = base_vec, agg_order = m,
res = res_vec, comb = "sar1", immutable = rbind(c(12,1)))
str(rf_imm, give.attr = FALSE)
#> Named num [1:28] 1.314 0.657 0.657 0.438 0.438 ...
rf_imm[1] - base_vec[1]
#> k-12 h-1
#> 0
Exploring a subset of temporal aggregation orders
Our approaches so far have involved considering all factors of \(m\) as potential aggregation orders. Nevertheless, it is worth noting that we could also focus only on a given subset of these factors. For example, we could be interested only on monthly, quarterly and annual forecasts.
te_subset <- c(12, 3, 1)
base_vec2 <- unlist(base[paste0("k-", te_subset)], use.names = FALSE)
res_vec2 <- unlist(res[paste0("k-", te_subset)], use.names = FALSE)
rf_sub <- terec(base = base_vec2, agg_order = te_subset,
res = res_vec2, comb = "sar1")
str(rf_sub, give.attr = FALSE)
#> Named num [1:17] 1.48 0.371 0.369 0.369 0.371 ...
Probabilistic Reconciliation
Following Panagiotelis et al. (2023) and Girolimetto et al. (2024), the reconciliation of probabilistic forecasts is a two-step process: first, we sample from an incoherent distributio, and then we reconcile each sample.
We can use a non-parametric method, the joint block bootstrap to simulate \(B\) samples and then reconciled them.
# Base forecasts' sample:
# we simulate from the base models by sampling errors
# while keeping the cross-sectional dimension fixed.
B <- 100
base_tejb <- teboot(model, B, m)$sample
dim(base_tejb)
#> [1] 100 28
# Reconciled forecasts' sample:
# we reconcile each member of a sample from the incoherent distribution.
reco_tejb <- t(apply(base_tejb, 1, terec, agg_order = m, res = res_vec, nn = "sntz",
comb = "sar1"))
str(reco_tejb, give.attr = FALSE)
#> num [1:100, 1:28] 18.293 0.608 2.808 0.627 18.293 ...
A parametric method assumes a normal distribution (Gaussian) to generate the incoherent sample set of forecasts.
# Multi-step residuals
hres <- lapply(model, function(fit)
lapply(1:frequency(fit$x), function(h) residuals(fit, type='response', h = h)))
hres <- Reduce("c", lapply(hres, arrange_hres))
# Re-arrenge multi-step residuals in a matrix form
mres <- res2matrix(hres, agg_order = m)
base_teg <- MASS::mvrnorm(n = B, mu = base_vec, Sigma = shrink_estim(mres))
str(base_teg, give.attr = FALSE)
#> num [1:100, 1:28] 0.714 -10.105 8.384 -0.327 -3.272 ...
reco_teg <- t(apply(base_teg, 1, terec, agg_order = m, res = res_vec, nn = "sntz",
comb = "sar1"))
str(reco_teg, give.attr = FALSE)
#> num [1:100, 1:28] 16.62 4.95 18.48 16.02 4.13 ...