Keywords: feature engineering, exploratory data analysis, cross-validation
The House Price Index (HPI) is commonly used to estimate changes in housing prices. The sale price of a house is correlated with many other factors, such as geographical location, the size and age of the house, and the area and population of the neighborhood. A considerable number of datasets covering various locations have been released in the literature to explore how the sale price of houses correlates with their features. However, not all features affect the sale price equally. Some features correlate strongly with each other, while others carry little importance or are probably redundant. To explore the impact of the individual features on sale prices, we therefore performed a detailed data analysis on the original house dataset. This report also validates the individual steps of the data analysis with supporting statistics and visualizations, to give a well-supported account of the various features and their impact on the sale price of houses.
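The kind of correlation screening described above can be illustrated with a minimal sketch. It is not the analysis from this report: the data are synthetic, the feature names (`living_area`, `total_rooms`, `age`, `noise_feature`) are hypothetical, and the 0.9 redundancy threshold is an arbitrary assumption. It only shows how features might be ranked by their Pearson correlation with the sale price and how strongly inter-correlated (possibly redundant) pairs might be flagged.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def make_synthetic_houses(n=500, seed=0):
    """Build a toy dataset; all names and coefficients are illustrative assumptions."""
    rng = random.Random(seed)
    living_area = [rng.uniform(50, 300) for _ in range(n)]
    # total_rooms is almost a rescaled copy of living_area (a redundant feature).
    total_rooms = [a / 25 + rng.gauss(0, 0.3) for a in living_area]
    age = [rng.uniform(0, 80) for _ in range(n)]
    # noise_feature is unrelated to the price by construction.
    noise_feature = [rng.gauss(0, 1) for _ in range(n)]
    sale_price = [
        1000 * a - 1000 * g + rng.gauss(0, 20000)
        for a, g in zip(living_area, age)
    ]
    features = {
        "living_area": living_area,
        "total_rooms": total_rooms,
        "age": age,
        "noise_feature": noise_feature,
    }
    return features, sale_price


def rank_by_target_correlation(features, target):
    """Sort features by the absolute value of their correlation with the target."""
    scores = {name: pearson(col, target) for name, col in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


def redundant_pairs(features, threshold=0.9):
    """Return feature pairs whose absolute mutual correlation reaches the threshold."""
    names = list(features)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(features[first], features[second])
            if abs(r) >= threshold:
                pairs.append((first, second, r))
    return pairs


if __name__ == "__main__":
    features, sale_price = make_synthetic_houses()
    for name, r in rank_by_target_correlation(features, sale_price):
        print(f"{name:>14}: r = {r:+.3f}")
    for first, second, r in redundant_pairs(features):
        print(f"possibly redundant: {first} / {second} (r = {r:+.3f})")
```

Pearson correlation only captures linear relationships, so a low score in a screen like this does not by itself show that a feature is unimportant. On a real dataset, categorical features such as location would also need encoding before any correlation could be computed.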