| Title: | Hedonic and Multilateral Index Methods for Real Estate Price Statistics |
|---|---|
| Description: | Compute price indices using various hedonic and multilateral methods, including Laspeyres, Paasche, Fisher, and HMTS (Hedonic Multilateral Time series re-estimation with splicing). The central function calculate_hedonic_index() offers a unified interface for running these methods on structured datasets. This package is designed to support index construction workflows in any domain, including but not limited to real estate, where quality-adjusted price comparisons over time are essential. The development of this package was funded by Eurostat and Statistics Netherlands (CBS), and carried out by Statistics Netherlands. The HMTS method implemented here is described in Ishaak, Ouwehand and Remøy (2024) <doi:10.1177/0282423X241246617>. For broader methodological context, see Eurostat (2013, ISBN:978-92-79-25984-5, <doi:10.2785/34007>). |
| Authors: | Farley Ishaak [aut], Pim Ouwehand [aut], David Pietersz [aut], Liu Nuo Su [aut], Cynthia Cao [aut], Mohammed Kardal [aut], Odens van der Zwan [aut], Vivek Gajadhar [aut, cre] |
| Maintainer: | Vivek Gajadhar <[email protected]> |
| License: | EUPL-1.2 |
| Version: | 1.1.1 |
| Built: | 2026-06-09 00:06:03 UTC |
| Source: | https://github.com/vivekag7/reps |
Central hub function to calculate index figures using different methods. Can also calculate chained indices using the Annual Overlap Method.
calculate_hedonic_index(dataset, method, period_variable, dependent_variable, numerical_variables = NULL, categorical_variables = NULL, reference_period = NULL, number_of_observations = TRUE, chained = FALSE, index_mutation = FALSE, parallel = FALSE, ...)
dataset |
Data frame with input data |
method |
One of: "fisher", "laspeyres", "paasche", "hmts", "timedummy", "rolling_timedummy", "repricing" |
period_variable |
A string with the name of the column containing time periods. |
dependent_variable |
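A string with the name of the column containing the dependent variable (usually the price) |
numerical_variables |
Character vector with the names of the numeric quality-determining variables |
categorical_variables |
Character vector with the names of the categorical variables (including dummy variables) |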
reference_period |
Period or group of periods that will be set to 100 |
number_of_observations |
Logical, whether to show number of observations (default = TRUE) |
chained |
Logical. If TRUE, calculates a chained index using the Annual Overlap Method. Default is FALSE. |
index_mutation |
Logical. If TRUE, calculates a contribution-to-index-mutation table for one method. Default is FALSE. |
parallel |
Logical. If TRUE, independent calculations are parallelized where useful. Default is FALSE. |
... |
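Additional method-specific arguments passed to the underlying functions. The example below passes periods_in_year, number_preliminary_periods, window_length, production_since, resting_points and imputation this way. |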
A data.frame. A list is returned instead for HMTS with resting_points = TRUE, a named list when multiple methods are requested, and a list with elements Index and Index_mutation when index_mutation = TRUE.
Vivek Gajadhar
## Not run:
data("hedonic_data")
Tbl_indices <- REPS::calculate_hedonic_index(
  method = c("fisher", "hmts", "laspeyres", "paasche", "repricing", "timedummy", "rolling_timedummy"),
  dataset = hedonic_data,
  period_variable = "period",
  dependent_variable = "price",
  numerical_variables = c("floor_area", "dist_trainstation"),
  categorical_variables = c("neighbourhood_code", "dummy_large_city"),
  reference_period = "2015",
  number_of_observations = FALSE,
  periods_in_year = 4,
  number_preliminary_periods = 1,
  window_length = 4,
  production_since = NULL,
  resting_points = FALSE,
  imputation = FALSE
)
## End(Not run)
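To make the relationship between three of the listed methods concrete, the following is a minimal base R sketch of one common log-linear formulation of hedonic imputation indices, in which the Fisher index is the geometric mean of the Laspeyres-type and Paasche-type indices. It does not call REPS and is not claimed to reproduce the package's internals. The toy data, the helper `make_period()` and all object names are assumptions made for illustration only.

```r
# Toy data for two periods (illustrative only, not taken from REPS).
set.seed(1)
make_period <- function(n, level) {
  floor_area <- runif(n, 4, 5)  # log floor area
  price <- exp(level + 0.8 * floor_area + rnorm(n, sd = 0.05))
  data.frame(price = price, floor_area = floor_area)
}
base <- make_period(200, level = 8.00)
curr <- make_period(200, level = 8.10)  # true log price level is 0.10 higher

# One log-linear hedonic model per period
fit_base <- lm(log(price) ~ floor_area, data = base)
fit_curr <- lm(log(price) ~ floor_area, data = curr)

# Laspeyres-type: value the base-period dwellings with both models
laspeyres <- exp(mean(predict(fit_curr, base)) - mean(predict(fit_base, base)))

# Paasche-type: value the current-period dwellings with both models
paasche <- exp(mean(predict(fit_curr, curr)) - mean(predict(fit_base, curr)))

# Fisher: geometric mean of the Laspeyres-type and Paasche-type indices
fisher <- sqrt(laspeyres * paasche)

round(100 * c(laspeyres = laspeyres, paasche = paasche, fisher = fisher), 1)
```

Because the two toy periods have similar dwelling characteristics, the three figures lie close together. They diverge more when the mix of dwellings sold changes between periods.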
This function has been renamed. Please use calculate_hedonic_index instead.
calculate_price_index(...)
... |
Arguments passed to calculate_hedonic_index |
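For readers unfamiliar with how a renamed function is usually kept available, here is a minimal sketch of the general pattern in base R, using `.Deprecated()` to warn and then forwarding all arguments. The functions `new_index()` and `old_index()` are hypothetical stand-ins. This is not the REPS source code, and the package may implement its wrapper differently.

```r
# Hypothetical stand-ins; these are not the REPS functions.
new_index <- function(x, base = 1) 100 * x / x[base]

old_index <- function(...) {
  .Deprecated("new_index")  # emits a deprecation warning naming the replacement
  new_index(...)            # forward every argument unchanged
}

values <- c(200, 210, 231)

# The old name still works, but warns; suppressWarnings() hides the warning here.
result_old <- suppressWarnings(old_index(values))
result_new <- new_index(values)
identical(result_old, result_new)
```

Calling the new name directly avoids the warning and is the safer choice for new code.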
For each period in the data, fits a log-linear model and computes diagnostics:
Normality test (Shapiro-Wilk)
Adjusted R-squared
Breusch-Pagan test for heteroscedasticity
Durbin-Watson test for autocorrelation
calculate_regression_diagnostics(dataset, period_variable, dependent_variable, numerical_variables = NULL, categorical_variables = NULL, parallel = FALSE)
dataset |
A data.frame with input data |
period_variable |
Name of the period variable (string) |
dependent_variable |
Name of the dependent variable (string) |
numerical_variables |
Vector of numerical independent variables (default = NULL) |
categorical_variables |
Vector of categorical independent variables (default = NULL) |
parallel |
Logical; whether independent period-level diagnostics are parallelized. |
A data.frame with diagnostics by period
Mohammed Kardal, Vivek Gajadhar
diagnostics <- calculate_regression_diagnostics(
  dataset = hedonic_data,
  period_variable = "period",
  dependent_variable = "price",
  numerical_variables = c("floor_area", "dist_trainstation"),
  categorical_variables = c("dummy_large_city", "neighbourhood_code")
)
head(diagnostics)
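The four diagnostics listed for this function can be illustrated with base R alone. The sketch below fits a log-linear model per period on toy data and computes a Shapiro-Wilk p-value, the adjusted R-squared, a Breusch-Pagan p-value (in Koenker's studentized form, computed by hand) and the Durbin-Watson statistic. It only illustrates the statistics themselves: the toy data, the helper `diagnose_period()` and the output column names are assumptions, and the package's own implementation and column names may differ.

```r
# Toy data (illustrative only): two periods, one numeric characteristic.
set.seed(42)
toy <- data.frame(
  period = rep(c("2015Q1", "2015Q2"), each = 100),
  floor_area = runif(200, 4, 5)
)
toy$price <- exp(8 + 0.8 * toy$floor_area + rnorm(200, sd = 0.1))

# Hypothetical helper: diagnostics for the data of a single period.
diagnose_period <- function(d) {
  fit <- lm(log(price) ~ floor_area, data = d)
  e <- residuals(fit)

  # Studentized Breusch-Pagan statistic: n * R^2 of squared residuals on regressors
  d$e2 <- e^2
  aux <- lm(e2 ~ floor_area, data = d)
  bp_stat <- nrow(d) * summary(aux)$r.squared

  data.frame(
    normality_p = shapiro.test(e)$p.value,                # Shapiro-Wilk
    adj_r_squared = summary(fit)$adj.r.squared,           # adjusted R-squared
    bp_p = pchisq(bp_stat, df = 1, lower.tail = FALSE),   # one regressor, so df = 1
    dw_stat = sum(diff(e)^2) / sum(e^2)                   # Durbin-Watson statistic
  )
}

# Apply the helper to each period and stack the results
by_period <- lapply(split(toy, toy$period), diagnose_period)
toy_diagnostics <- cbind(period = names(by_period), do.call(rbind, by_period))
rownames(toy_diagnostics) <- NULL
toy_diagnostics
```

A Durbin-Watson statistic near 2 suggests little first-order autocorrelation in the residual order. For transaction data that order depends on how rows are sorted within a period.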
A subset of data from a fictitious real estate data frame containing transaction prices and some categorical and numerical characteristics of each dwelling.
hedonic_data
A data frame with 7,800 rows and 6 columns:
period |
A (string) vector indicating a time period |
price |
A (string) vector indicating the transaction price of the dwelling |
floor_area |
A real-valued vector of (the logarithm of) the floor area of the dwelling |
dist_trainstation |
A real-valued vector of (the logarithm of) the distance of the dwelling to the nearest train station |
neighbourhood_code |
A categorical code/string referring to the neighbourhood the dwelling belongs to |
dummy_large_city |
A vector indicating whether the dwelling belongs to a large city or not |
A fictitious dataset for illustration purposes
data(hedonic_data)
head(hedonic_data)
Static price index plot using base R graphics with grid lines and external legend.
plot_price_index(index_output, title = NULL)
index_output |
A data.frame or named list of data.frames (from calculate_hedonic_index()) |
title |
Optional plot title |
Supports both a single index data.frame and a named list of data.frames for multiple methods. The x-axis shows only the first period of each year, with rotated labels to avoid clutter.
None. Draws plots in the active graphics device.
Vivek Gajadhar
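The axis behaviour described above (labelling only the first period of each year, with rotated labels) can be sketched in base R graphics as follows. This is an illustration of the idea and not the code of plot_price_index(). The toy index, the column names `period` and `Index`, and the assumption that the year is the first four characters of the period label are all hypothetical.

```r
# Toy quarterly index (illustrative only).
toy_index <- data.frame(
  period = paste0(rep(2015:2017, each = 4), "Q", 1:4),
  Index = 100 * cumprod(c(1, rep(1.01, 11)))
)

# Assume the year is the first four characters of the period label
year <- substr(toy_index$period, 1, 4)
first_of_year <- !duplicated(year)  # TRUE only for the first period of each year

pdf(NULL)  # null device, so the sketch also runs non-interactively
plot(seq_along(toy_index$Index), toy_index$Index, type = "l",
     xaxt = "n", xlab = "", ylab = "Index", main = "Toy price index")
# Ticks only at the first period of each year; las = 2 rotates the labels
axis(1, at = which(first_of_year), labels = toy_index$period[first_of_year], las = 2)
grid()
invisible(dev.off())
```

In an interactive session, the `pdf(NULL)` and `dev.off()` lines can be dropped so the plot appears in the active graphics device.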
Creates a static 3x2 grid of base R plots showing regression diagnostics:
Normality (Shapiro-Wilk)
Linearity (Adjusted R-squared)
Heteroscedasticity (Breusch-Pagan)
Autocorrelation (Durbin-Watson)
Autocorrelation (p-value DW)
plot_regression_diagnostics(diagnostics, title = "Regression Diagnostics")
diagnostics |
A data.frame as returned by calculate_regression_diagnostics() |
title |
Optional overall title for the entire plot grid (default: "Regression Diagnostics") |
None. Produces plots in the active graphics device.
Vivek Gajadhar
plot_regression_diagnostics(
  calculate_regression_diagnostics(
    dataset = hedonic_data,
    period_variable = "period",
    dependent_variable = "price",
    numerical_variables = c("floor_area", "dist_trainstation"),
    categorical_variables = c("dummy_large_city", "neighbourhood_code")
  )
)