Airbnb Pricing Prediction Using Machine Learning: A Case Study on Seattle Listings

Abstract

The growth of peer-to-peer accommodation platforms has transformed the tourism and hospitality industry by introducing decentralized, host-driven pricing systems. However, many Airbnb hosts rely on intuition or limited platform recommendations to set nightly rates, often resulting in inconsistent pricing strategies. This study develops and evaluates a machine-learning model for predicting Airbnb prices using publicly available data from Inside Airbnb for Seattle, Washington. The analysis integrates listing, review, and calendar data to identify key determinants of nightly rates. Following extensive data cleaning and feature engineering, three predictive models were tested: Linear Regression, Ridge Regression, and Random Forest Regression. The Random Forest model achieved the best performance, with an R² of 0.726 and a mean absolute error (MAE) of approximately $51 per night. Cross-validation and multi-seed testing confirmed model stability and reproducibility. Feature-importance analysis revealed that property capacity and amenity richness were the strongest predictors of price, while neighborhood tier and host activity contributed moderately. These findings reinforce hedonic pricing theory by demonstrating that tangible property characteristics explain most pricing variation in peer-to-peer rentals. The study contributes a reproducible and interpretable framework for short-term rental analytics, offering practical guidance for hosts, policymakers, and researchers seeking to understand data-driven pricing in the sharing economy.
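The two headline metrics reported above, R² and mean absolute error, can be illustrated with a minimal sketch. The helper names and the prices below are hypothetical and chosen only for illustration. They are not taken from the study's data or code, and the study's own pipeline may compute these metrics differently (for example, through a library).

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted prices, in dollars."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Share of price variance explained by the predictions: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical nightly prices in USD (assumed values, not from Inside Airbnb).
actual = [100, 150, 200, 250, 300]
predicted = [110, 140, 210, 230, 320]

mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"MAE: ${mae:.2f} per night, R^2: {r2:.3f}")
```

Read this way, an MAE of about $51 means a typical prediction misses the actual nightly rate by roughly that amount, while an R² of 0.726 means the model accounts for about 73% of the variance in price across listings.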
