Can Expert Knowledge Enhance Real Estate Mass Appraisal Models?
Real estate professionals rely on accurate property valuations for tax assessments, lending, and investment decisions. In recent years, mass appraisal methods have grown increasingly sophisticated, integrating statistical models and machine learning techniques. However, these methods often depend heavily on high-quality, extensive datasets—a luxury that many local or underdeveloped markets lack. A recent study by Mariusz Doszyń, published in The Journal of Real Estate Finance and Economics, explores the role of expert knowledge in improving econometric models for real estate mass appraisal, particularly in low-data environments.
Key Findings
The study evaluated six econometric models:
- Ordinary Least Squares (OLS)
- Ridge Regression
- LASSO Regression
- Mixed Estimation Model (with prior knowledge)
- Bayesian Regression Model (with prior knowledge)
- Inequality Restricted Least Squares (IRLS) (with prior knowledge)
The focus was on the appraisal of undeveloped residential land, comparing models that use only sample data with those that integrate expert-provided prior knowledge. Notably, models incorporating prior knowledge (Mixed, Bayesian, and IRLS) outperformed their data-only counterparts in prediction accuracy and theoretical consistency.
Why Does Prior Knowledge Matter?
In many smaller or inefficient markets, data quality is a significant challenge. Key property features like location, utilities, and neighborhood are often measured subjectively, with low variability across samples. When datasets are sparse or noisy, purely data-driven models like OLS, Ridge, and LASSO may produce unreliable or counterintuitive results—such as assigning negative impacts to positive property traits.
Expert knowledge—such as local appraisers’ insights into the relative importance of property features—can help bridge this gap. In the study, appraisers provided parameter ranges (e.g., the weight of land area or utilities on property value), which were used as constraints or priors in the models. This additional layer of information improved the models’ ability to generate realistic, actionable valuations.
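To make the idea of turning expert ranges into priors concrete, here is a minimal sketch of a Theil-Goldberger style mixed estimator. It is an illustration under stated assumptions, not the specification used in the study: the function name `mixed_estimation`, the synthetic data, the choice of the range midpoint as the prior mean, and the choice of one quarter of the range width as the prior standard deviation are all hypothetical.

```python
import numpy as np

def mixed_estimation(X, y, prior_lo, prior_hi, sigma2=None):
    """Combine sample data with one expert range per coefficient.

    Assumptions (not from the study): the prior mean is the midpoint of
    each expert range and the prior standard deviation is a quarter of
    the range width, so the range spans roughly +/- 2 standard deviations.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    prior_lo = np.asarray(prior_lo, dtype=float)
    prior_hi = np.asarray(prior_hi, dtype=float)
    n, k = X.shape

    prior_mean = (prior_lo + prior_hi) / 2.0
    prior_sd = (prior_hi - prior_lo) / 4.0
    prior_precision = np.diag(1.0 / prior_sd**2)

    if sigma2 is None:
        # Estimate the error variance from the OLS residuals.
        b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
        resid = y - X @ b_ols
        sigma2 = resid @ resid / max(n - k, 1)

    # Precision-weighted combination of sample and prior information.
    A = X.T @ X / sigma2 + prior_precision
    c = X.T @ y / sigma2 + prior_precision @ prior_mean
    return np.linalg.solve(A, c)

# Synthetic, hypothetical sample: intercept, plot area, utilities score.
rng = np.random.default_rng(0)
n = 15
area = rng.uniform(5, 15, n)
utilities = rng.integers(0, 3, n)
X = np.column_stack([np.ones(n), area, utilities])
y = 20 + 3.0 * area + 5.0 * utilities + rng.normal(0, 8, n)

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_mixed = mixed_estimation(
    X, y,
    prior_lo=[-100, 2, 3],   # hypothetical expert ranges per coefficient
    prior_hi=[100, 4, 8],
)
print("OLS:  ", np.round(b_ols, 2))
print("Mixed:", np.round(b_mixed, 2))
```

The narrower an expert range, the more strongly the corresponding coefficient is pulled toward its midpoint; a very wide range leaves the estimate essentially at its OLS value.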
Model Comparisons
Here’s how the models stacked up:
- OLS, Ridge, and LASSO: These models struggled with parameter consistency, often producing nonsensical results like negative coefficients for location or utilities. Ridge and LASSO add regularization to combat overfitting, but the penalty only shrinks coefficients toward zero; it does not enforce the expected sign, so these models still depend on large, high-quality datasets to perform well.
- Mixed Estimation and Bayesian Models: Both models incorporated prior knowledge effectively, yielding more accurate and theoretically sound predictions. The Bayesian model’s flexibility in handling all parameters made it slightly more robust, but it required more detailed prior input.
- IRLS: This method imposed inequality constraints (e.g., a better neighborhood must not reduce property value). While less flexible than Bayesian or Mixed Estimation, IRLS consistently delivered reliable, monotonic results in line with theoretical expectations.
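The inequality-constraint idea in the last bullet can be sketched as a bounded least-squares problem. This is a minimal illustration, assuming SciPy is available and that the restrictions are simple lower and upper bounds on individual coefficients; the function name `restricted_least_squares` and the toy data are hypothetical, and the data are deliberately constructed so that plain OLS assigns a negative coefficient to utilities.

```python
import numpy as np
from scipy.optimize import lsq_linear

def restricted_least_squares(X, y, lower, upper):
    """Least squares with box constraints on each coefficient."""
    # BVLS is an active-set solver suited to small, dense problems.
    result = lsq_linear(X, y, bounds=(lower, upper), method="bvls")
    return result.x

# Toy sample: intercept, plot area, utilities indicator (all hypothetical).
area = np.array([5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
utilities = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
X = np.column_stack([np.ones_like(area), area, utilities])
# Prices built so that utilities appear to lower value in this sample.
y = 10.0 + 2.0 * area - 1.0 * utilities

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Restriction: area and utilities must not reduce value; intercept is free.
lower = np.array([-np.inf, 0.0, 0.0])
upper = np.array([np.inf, np.inf, np.inf])
b_restricted = restricted_least_squares(X, y, lower, upper)

print("OLS:       ", np.round(b_ols, 3))
print("Restricted:", np.round(b_restricted, 3))
```

When the data alone would push a coefficient past its bound, the restricted estimate sits on the bound, and the remaining coefficients are refit accordingly. Expert-provided ranges could be passed as finite `lower` and `upper` values in the same way.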
Practical Implications
- For Rural and Small Markets: Real estate professionals operating in data-scarce regions can benefit significantly from integrating expert knowledge into their valuation models. Instead of relying solely on sparse transaction data, appraisers’ insights can provide valuable context and improve model reliability.
- For Automated Valuation Models (AVMs): Firms using AVMs to estimate property values should consider hybrid approaches that combine data-driven techniques with expert input. This could be particularly impactful in markets with unique characteristics or limited sales activity.
- Reducing Bias in Valuations: By ensuring that models adhere to theoretical principles (e.g., a better feature never reduces value), approaches like Mixed Estimation and IRLS can help mitigate appraisal errors and improve stakeholder confidence.
- Efficiency Gains for Lenders and Tax Assessors: Accurate mass appraisal models save time and resources by reducing the need for manual, individual property evaluations. Integrating expert knowledge could make these models viable even in less developed markets.
Potential Shortcomings
The paper does not compare its expert-knowledge models to more advanced machine learning techniques. OLS, LASSO, and Ridge regression, while included, are relatively basic methods by industry standards. Doszyń rightly highlights that machine learning models perform best with large datasets, but such models can still outperform these older methods in certain contexts. Including comparisons with models like XGBoost or Random Forest would have strengthened the analysis, especially as Random Forest often performs well with smaller datasets. If the expert models beat the most advanced machine learning techniques, even in just a few markets, that would be truly informative. Nevertheless, the paper correctly emphasizes that incorporating expert knowledge can enhance model performance, particularly when working with limited data.
Implementing expert knowledge in practice may also be cost-prohibitive. Explicitly soliciting expert opinion would require vast resources unless it were done in just a handful of markets. Otherwise, organizations would need to collect expert input passively, perhaps by embedding data collection into a product used by appraisers or other real estate professionals (e.g., software for processing appraisals). Without such innovations, the practical application of the paper’s findings may be limited.
Conclusion
Doszyń’s research underscores the value of combining human expertise with statistical rigor. In markets where data is abundant and high-quality, machine learning and data-driven methods remain powerful tools. But in smaller, less-developed markets, incorporating expert insights can be a game-changer. By leveraging techniques like Mixed Estimation or IRLS, real estate professionals can produce more accurate, consistent valuations—even with limited datasets. For the industry, this approach offers a practical path to bridging the gap between theory and application, ensuring fair and reliable property valuations across diverse markets.