The index_mutation option in
calculate_hedonic_index() provides a deeper analysis of a
price index movement. It shows how much observations or groups of
observations contribute to the index mutation in one selected
period.
An index mutation is the period-over-period change in the index. The contribution analysis helps identify which observations or units have the largest influence on that change.
The analysis is activated by setting index_mutation = TRUE.
When this option is used, calculate_hedonic_index()
returns a list with two elements:
- Index: the regular price index;
- Index_mutation: the contribution-to-index-mutation table.

The contribution-to-index-mutation calculation uses a leave-one-out approach.
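For a selected period, REPS calculates the original index, removes one observation or group from that period, and recalculates the index. This is repeated for every observation or group in the selected period.
The difference between the original index and the recalculated index indicates the contribution of the excluded observation or group.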
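The leave-one-out idea can be sketched on a toy example. The code below is only an illustration under stated assumptions: the data are made up, and toy_index() is a hypothetical stand-in that uses the mean price per period relative to the first period. It is not the hedonic index that REPS calculates, and it does not show how REPS implements the procedure internally.

```r
# Toy data: two periods with made-up prices (hypothetical values).
toy <- data.frame(
  period = c("P1", "P1", "P1", "P2", "P2", "P2", "P2"),
  price  = c(100, 110, 90, 120, 100, 110, 150),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in index: mean price per period, first period = 100.
# This is NOT the hedonic index computed by REPS.
toy_index <- function(data) {
  means <- vapply(split(data$price, data$period), mean, numeric(1))
  100 * means / means[[1]]
}

selected_period <- "P2"
original <- toy_index(toy)[[selected_period]]

# Drop one row of the selected period at a time and recalculate the index.
# Rows in other periods stay in the data for every recalculation.
rows <- which(toy$period == selected_period)
recalculated <- vapply(
  rows,
  function(i) toy_index(toy[-i, ])[[selected_period]],
  numeric(1)
)

contribution <- data.frame(
  row = rows,
  index_excl = recalculated,
  index_original = original,
  index_difference = original - recalculated
)
contribution
```

In this sketch the most expensive P2 observation gets the largest positive difference, because the index is lower without it.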
The input data are the same as for a regular call to
calculate_hedonic_index().
For this vignette, we use a small sample of hedonic_data
to keep rendering fast. The index mutation calculation recalculates the
index many times, so using the full dataset would make the vignette
slower than necessary.
dataset <- hedonic_data
dataset$floor_area <- log(dataset$floor_area)
set.seed(123)
dataset <- dataset |>
dplyr::group_by(period) |>
dplyr::slice_sample(n = 25) |>
dplyr::ungroup()
head(dataset)
#> # A tibble: 6 × 6
#> period price floor_area dist_trainstation neighbourhood_code dummy_large_city
#> <chr> <int> <dbl> <dbl> <chr> <int>
#> 1 2008Q1 6.82e5 4.55 0.216 E 0
#> 2 2008Q1 8.74e5 4.73 0.999 D 1
#> 3 2008Q1 1.69e6 5.04 2.32 D 0
#> 4 2008Q1 8.39e5 4.75 2.50 C 1
#> 5 2008Q1 5.43e5 4.37 1.39 D 0
#> 6 2008Q1 6.40e5 4.51 2.47 A 0

The example below calculates a Fisher index and adds contribution-to-index-mutation analysis.
The important arguments are:
- index_mutation = TRUE, which activates the analysis;
- index_mutation_period, which selects the period to analyse;
- unit_variable, which defines whether observations are excluded individually or by group.

result <- calculate_hedonic_index(
dataset = dataset,
method = "fisher",
period_variable = "period",
dependent_variable = "price",
numerical_variables = c("floor_area", "dist_trainstation"),
categorical_variables = c("neighbourhood_code", "dummy_large_city"),
reference_period = "2015",
number_of_observations = FALSE,
index_mutation = TRUE,
index_mutation_period = "2019Q1",
unit_variable = "neighbourhood_code"
)
head(result$Index)
#> period Index
#> 1 2008Q1 101.27033
#> 2 2008Q2 99.89943
#> 3 2008Q3 105.92769
#> 4 2008Q4 103.22324
#> 5 2009Q1 98.38427
#> 6 2009Q2 99.83063
head(result$Index_mutation)
#> neighbourhood_code period Index_excl_observation Index_original
#> 1 B 2019Q1 100.39420 99.34528
#> 2 D 2019Q1 100.15980 99.34528
#> 3 A 2019Q1 98.84730 99.34528
#> 4 C 2019Q1 98.03411 99.34528
#> 5 E 2019Q1 97.92741 99.34528
#> Index_difference PoP_excl_observation PoP_original PoP_difference
#> 1 -1.0489226 -4.172429 -5.173639 1.0012103
#> 2 -0.8145198 -4.396169 -5.173639 0.7774698
#> 3 0.4979707 -5.648958 -5.173639 -0.4753195
#> 4 1.3111694 -6.425167 -5.173639 -1.2515283
#> 5 1.4178636 -6.527008 -5.173639 -1.3533693

index_mutation_period

The index_mutation_period parameter selects the period
for which the contribution analysis is performed.
This matters because index mutation analysis is calculated for one period at a time. The function removes observations or groups only from the selected period. Observations in other periods remain in the dataset during each recalculation.
In the example above, this is done with index_mutation_period = "2019Q1".
If index_mutation_period = NULL, REPS automatically
selects the latest available period.
Use index_mutation_period when a specific period needs
closer inspection, for example when the index movement is unusually
large or when the latest published period needs to be explained.
The unit_variable argument controls whether the
contribution analysis is performed at observation level or group
level.
If unit_variable = NULL, each row in the selected period
is removed once.
If unit_variable is supplied, REPS removes all
observations belonging to one group at a time. In the example above,
contributions are calculated by neighbourhood, using unit_variable = "neighbourhood_code".
Grouped output is often easier to interpret than observation-level output, because it shows which units, such as neighbourhoods, contributed most to the index mutation.
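The difference between observation-level and group-level exclusion can be illustrated with a small sketch. The helper exclusion_sets() below is hypothetical and is not part of REPS. It only lists which rows would be dropped together in each recalculation, on made-up data.

```r
# Made-up data: neighbourhood A appears in both periods.
toy <- data.frame(
  period = c("P1", "P1", "P2", "P2", "P2", "P2"),
  neighbourhood_code = c("A", "B", "A", "A", "B", "C"),
  price = c(100, 120, 110, 130, 125, 90),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a REPS function): returns one set of row numbers
# per recalculation, restricted to the selected period.
exclusion_sets <- function(data, selected_period, unit_variable = NULL) {
  in_period <- which(data$period == selected_period)
  if (is.null(unit_variable)) {
    # Observation level: each row of the selected period is its own set.
    sets <- as.list(in_period)
    names(sets) <- in_period
    return(sets)
  }
  # Group level: rows of the selected period that share a group value.
  split(in_period, data[[unit_variable]][in_period])
}

by_row   <- exclusion_sets(toy, "P2")
by_group <- exclusion_sets(toy, "P2", unit_variable = "neighbourhood_code")

length(by_row)    # one recalculation per row in P2
length(by_group)  # one recalculation per neighbourhood present in P2
by_group$A        # rows 3 and 4; the P1 row of neighbourhood A is not touched
```

Grouping reduces the number of recalculations from four to three here. On real data with many rows per group the reduction is much larger.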
The Index_mutation table contains the following main
columns:
- Index_excl_observation: index value after excluding the observation or group;
- Index_original: original index value;
- Index_difference: original index minus the recalculated index;
- PoP_excl_observation: period-over-period growth after exclusion;
- PoP_original: original period-over-period growth;
- PoP_difference: recalculated period-over-period growth minus the original period-over-period growth.

The main column for identifying influence on the index level is Index_difference.
A positive Index_difference means the excluded
observation or group increased the original index. A negative
Index_difference means it lowered the original index.
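For analysing the period-over-period mutation, inspect
PoP_difference. Its sign runs the other way: a positive PoP_difference means that growth is higher without the observation or group, so that unit pulled the original period-over-period growth down.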
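The two difference columns can be reproduced by hand from the other columns. The sketch below uses the values printed for neighbourhood B in the output above. The subtraction order is inferred from those printed values, not taken from the REPS source code, and the inputs are rounded, so the results match the table only to about five decimals.

```r
# Values for neighbourhood B in 2019Q1, copied from the printed output.
index_original <- 99.34528
index_excl     <- 100.39420
pop_original   <- -5.173639
pop_excl       <- -4.172429

# Inferred convention: original minus recalculated.
index_difference <- index_original - index_excl

# Inferred convention: recalculated minus original.
pop_difference <- pop_excl - pop_original

index_difference  # close to the printed Index_difference of -1.0489226
pop_difference    # close to the printed PoP_difference of 1.0012103
```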
mutation_table <- result$Index_mutation
head(mutation_table[order(-mutation_table$Index_difference), ])
#> neighbourhood_code period Index_excl_observation Index_original
#> 5 E 2019Q1 97.92741 99.34528
#> 4 C 2019Q1 98.03411 99.34528
#> 3 A 2019Q1 98.84730 99.34528
#> 2 D 2019Q1 100.15980 99.34528
#> 1 B 2019Q1 100.39420 99.34528
#> Index_difference PoP_excl_observation PoP_original PoP_difference
#> 5 1.4178636 -6.527008 -5.173639 -1.3533693
#> 4 1.3111694 -6.425167 -5.173639 -1.2515283
#> 3 0.4979707 -5.648958 -5.173639 -0.4753195
#> 2 -0.8145198 -4.396169 -5.173639 0.7774698
#> 1 -1.0489226 -4.172429 -5.173639 1.0012103
head(mutation_table[order(mutation_table$Index_difference), ])
#> neighbourhood_code period Index_excl_observation Index_original
#> 1 B 2019Q1 100.39420 99.34528
#> 2 D 2019Q1 100.15980 99.34528
#> 3 A 2019Q1 98.84730 99.34528
#> 4 C 2019Q1 98.03411 99.34528
#> 5 E 2019Q1 97.92741 99.34528
#> Index_difference PoP_excl_observation PoP_original PoP_difference
#> 1 -1.0489226 -4.172429 -5.173639 1.0012103
#> 2 -0.8145198 -4.396169 -5.173639 0.7774698
#> 3 0.4979707 -5.648958 -5.173639 -0.4753195
#> 4 1.3111694 -6.425167 -5.173639 -1.2515283
#> 5 1.4178636 -6.527008 -5.173639 -1.3533693

The results should be read as a sensitivity analysis: they show what happens to the selected-period index movement when a specific observation or group is removed.
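When the size of the influence matters more than its direction, the table can also be ranked by absolute difference. The sketch below rebuilds only the relevant columns from the output printed above, so that it runs without REPS. With a real result object, result$Index_mutation would be used instead.

```r
# Columns copied from the printed Index_mutation table for 2019Q1.
mutation_table <- data.frame(
  neighbourhood_code = c("B", "D", "A", "C", "E"),
  Index_difference = c(-1.0489226, -0.8145198, 0.4979707, 1.3111694, 1.4178636),
  PoP_difference = c(1.0012103, 0.7774698, -0.4753195, -1.2515283, -1.3533693),
  stringsAsFactors = FALSE
)

# Rank units by the absolute size of their effect on the index level.
ranked <- mutation_table[order(-abs(mutation_table$Index_difference)), ]
ranked
```

In this sample, neighbourhoods E and C have the largest influence on the 2019Q1 index level, and A has the smallest.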