---
title: "Getting Started with Price Index Calculation using REPS"
author: ""
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with Price Index Calculation using REPS}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: ./REFERENCES.bib
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup, include=FALSE}
library(REPS)
data("hedonic_data")
```

## Introduction

The `calculate_hedonic_index()` function is the central entry point in `REPS` for computing price indices using various **hedonic-based methods**. It supports seven commonly used approaches:

* **Laspeyres** - hedonic double imputation base-weighted index
* **Paasche** - hedonic double imputation current-weighted index
* **Fisher** - geometric average of Laspeyres and Paasche (both hedonic double imputation)
* **HMTS** - Hedonic Multilateral Time Series re-estimation Splicing
* **Time Dummy** - single-regression hedonic index using time dummies
* **Rolling Time Dummy** - chained index based on overlapping time-dummy regressions
* **Repricing** - quasi-repeat-sales method comparing observed vs predicted price changes

This vignette demonstrates how to apply each method using a consistent interface, making it easy to compare results across approaches.

The HMTS method implemented in `REPS` is a multilateral, time-series-based index that balances stability, limited revision, and early detection of turning points in the context of property price indices [@51c4602ed48c4adbb7b7d15176d2da7a].

For broader context and international guidelines on the compilation of property price indices, including traditional methods such as hedonic double imputation Laspeyres, Paasche, Fisher and Repricing, we refer to Eurostat's *Handbook on Residential Property Price Indices (RPPIs)* [@eurostat2013rppi]. For (Rolling) Time Dummy we refer to Hill et al. [@hill2018repricing; @hill2022rolling].

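
For orientation, the relationship between the first three methods in the list above can be written compactly. This is the standard textbook definition of a Fisher index rather than a description of `REPS` internals; the symbols $I^{L}_{t}$, $I^{P}_{t}$ and $I^{F}_{t}$ are notation introduced here for the Laspeyres, Paasche and Fisher index values in period $t$.

$$
I^{F}_{t} = \sqrt{I^{L}_{t} \cdot I^{P}_{t}}
$$

As an illustration with made-up values, $I^{L}_{t} = 104$ and $I^{P}_{t} = 102$ give $I^{F}_{t} = \sqrt{104 \cdot 102} \approx 103.0$, which lies between the two.
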

---

## Required Data

Before running any calculations, ensure that your dataset is available and contains the necessary variables:

```{r}
# Example dataset (you should already have this loaded)
head(hedonic_data)
```

The function needs to know which columns play which role. These are passed through the following arguments:

* `period_variable`: the column holding the time period
* `dependent_variable`: the column to be explained, usually price
* `numerical_variables`: numerical characteristics, e.g., `floor_area`
* `categorical_variables`: categorical characteristics, e.g., `neighbourhood_code`

Some numerical variables are typically log-transformed before estimation. For example, `floor_area` is often log-transformed because this tends to make its relationship with price more linear, makes the residuals more homoscedastic (constant variance), and reduces the impact of extreme values.

Example of log-transforming `floor_area`:

```{r}
dataset <- hedonic_data
dataset$floor_area <- log(dataset$floor_area)
```

---

## Using `calculate_hedonic_index()`

The `calculate_hedonic_index()` function provides a unified interface for estimating hedonic price indices. You only need to specify the method via the `method` argument - the function handles the rest.

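
Because the function relies on the column names passed to it, a quick check of the input before the call can catch problems early. The sketch below is an illustration in base R, not part of `REPS`: the `toy` data frame, its values, and the period labels are hypothetical, and only the column names follow the examples in this vignette. It checks that the expected columns exist and that `floor_area` is strictly positive before taking logs, since `log()` returns `-Inf` for zero and `NaN` for negative input.

```r
# Hypothetical toy data; values and period labels are made up for illustration
toy <- data.frame(
  period = c("2015Q1", "2015Q1", "2015Q2", "2015Q2"),
  price = c(250000, 310000, 255000, 320000),
  floor_area = c(80, 120, 85, 0),
  neighbourhood_code = c("A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Columns that the index call would refer to by name
required <- c("period", "price", "floor_area", "neighbourhood_code")
missing_columns <- setdiff(required, names(toy))
if (length(missing_columns) > 0) {
  stop("Missing columns: ", paste(missing_columns, collapse = ", "))
}

# Flag rows where the log transformation would not give a finite value
non_positive <- !is.na(toy$floor_area) & toy$floor_area <= 0

# One possible treatment: drop the flagged rows, then transform
toy_clean <- toy[!non_positive, ]
toy_clean$floor_area <- log(toy_clean$floor_area)
```

Dropping the flagged rows is only one option; whether such records should be removed or corrected depends on the data source.
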

### Example: Single Index Method - Time Dummy

```{r}
Tbl_TD <- REPS::calculate_hedonic_index(
  dataset = dataset,
  method = "timedummy",
  period_variable = "period",
  dependent_variable = "price",
  numerical_variables = c("floor_area", "dist_trainstation"),
  categorical_variables = c("dummy_large_city", "neighbourhood_code"),
  reference_period = 2015,
  number_of_observations = FALSE
)
head(Tbl_TD)
```

### Example: Multiple Index Methods

```{r}
multi_result <- REPS::calculate_hedonic_index(
  method = c(
    "fisher", "hmts", "laspeyres", "paasche",
    "repricing", "timedummy", "rolling_timedummy"
  ),
  dataset = dataset,
  period_variable = "period",
  dependent_variable = "price",
  numerical_variables = c("floor_area", "dist_trainstation"),
  categorical_variables = c("neighbourhood_code", "dummy_large_city"),
  reference_period = "2015",
  number_of_observations = FALSE,
  periods_in_year = 4,
  number_preliminary_periods = 1,
  window_length = 4,
  production_since = NULL,
  resting_points = FALSE,
  imputation = FALSE
)

head(multi_result$fisher)
```

---

## Chaining Price Indices (Annual Overlap Method)

While standard calls to `calculate_hedonic_index()` compute multilateral or direct indices over an entire dataset, long-term pooled models can suffer from structural market changes over time. To account for shifting consumer preferences, indices are often calculated over shorter periods and linked together.

You can easily implement the **Annual Overlap Method** directly inside the hub function by setting the `chained = TRUE` argument. This tells the function to automatically split the data by year, calculate a short-term index using the final period of the previous year as the overlap base, and chain them together into a continuous series.

### Example: Chained Fisher Index

You can use the `chained = TRUE` parameter with any underlying hedonic method supported by the package. Here is an example using the Fisher index.


```{r}
# Calculate the chained index
chained_fisher <- calculate_hedonic_index(
  dataset = dataset,
  method = "fisher",
  chained = TRUE,
  period_variable = "period",
  dependent_variable = "price",
  numerical_variables = c("floor_area", "dist_trainstation"),
  categorical_variables = c("dummy_large_city", "neighbourhood_code"),
  reference_period = "2015"
)

head(chained_fisher)
```

---

## Visualizing the Index

For quick and clear visualizations, the `plot_price_index()` utility function can be used to generate time-series plots of the calculated indices. Because the output of both chained and unchained indices mirrors the standard REPS dataframe structure, it is fully compatible with this built-in plotting utility.

While we encourage users to create custom visualizations suited to their analytical needs, this function provides a convenient starting point for simple and consistent line plots.

```r
plot_price_index(multi_result)
```

```{r echo=FALSE, out.width="100%", fig.align="center"}
knitr::include_graphics("multi_index.png")
```

## Summary

The `calculate_hedonic_index()` function streamlines access to multiple hedonic index methods via a consistent interface, including robust support for chained indices. This allows analysts to easily compare outputs and select the most appropriate method for their context without switching between different wrapper functions.

## References