Python-DSCI 551

DSCI 551 Project: House Side-by-Side Comparison

Project Topic: To help people better understand houses’ properties and clearly show how well each house serves their different purposes.

Project Motivation: When people try to rent or buy a house, we found that there is no website that shows the differences between houses’ properties, and we believe an intuitive comparison between houses is very important. Websites such as Zillow, Redfin, and Realtor provide information about a house: they list the estimated value, the number of bedrooms, the lot size, and the neighborhood. However, it is difficult to visualize the exact differences between houses. If we want to compare the overall size or the living room size of the houses we like most, we have to switch between apps to find out. Therefore, we came up with the idea of a website that presents house information with the most transparency and convenience, to help people make a comprehensive comparison.

Reference:

Project Plan: The user first searches online for houses they like through websites such as Zillow, Redfin, and Realtor. After narrowing down the candidates, the user enters the addresses of the houses they are interested in on our website, and the website shows the differences between those houses. For example, if two or three addresses are entered, the website displays our customized comparison table, which includes images and a detailed description of each house.

To do that, we first scrape the houses’ information using those websites’ APIs from Python. The packages and libraries we expect to use for web scraping are requests, urllib, and BeautifulSoup. Then we expect to convert the scraped raw data into datasets, store them in Firebase Cloud Database, and analyze the data by running queries. The packages and libraries for this step are json, Pandas, and NumPy. Finally, we will develop a web-based UI in JavaScript that is accessed through a web browser over the Internet.
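
The comparison step in the plan above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the house records, field names (`bedrooms`, `lot_size_sqft`, `estimated_value`), and the helpers `build_comparison` and `print_side_by_side` are all hypothetical, and the records are hard-coded rather than scraped. It uses only the standard library so it runs without extra installs; a Pandas DataFrame indexed by address and then transposed would give a similar layout.

```python
import json

# Hypothetical records standing in for data scraped from a listing site.
# Addresses, field names, and values are made up for illustration.
houses = [
    {"address": "123 Example St", "bedrooms": 3,
     "lot_size_sqft": 5000, "estimated_value": 750000},
    {"address": "456 Sample Ave", "bedrooms": 4,
     "lot_size_sqft": 6200, "estimated_value": 910000},
]


def build_comparison(records, key="address"):
    """Pivot per-house records into {attribute: {house: value}}."""
    table = {}
    for record in records:
        house = record[key]
        for attribute, value in record.items():
            if attribute == key:
                continue
            table.setdefault(attribute, {})[house] = value
    return table


def print_side_by_side(table, addresses):
    """Print one row per attribute and one column per house."""
    header = ["attribute"] + list(addresses)
    rows = [
        [attribute] + [str(values.get(a, "n/a")) for a in addresses]
        for attribute, values in table.items()
    ]
    all_rows = [header] + rows
    # Pad every column to the width of its longest cell.
    widths = [max(len(row[i]) for row in all_rows) for i in range(len(header))]
    for row in all_rows:
        print("  ".join(cell.ljust(w) for cell, w in zip(row, widths)))


comparison = build_comparison(houses)
print_side_by_side(comparison, [h["address"] for h in houses])

# The nested dict is plain JSON, so it can be serialized before being
# stored in a database or sent to the web UI.
payload = json.dumps(comparison, indent=2)
```

Keying the table by attribute first means each row of the output compares one property across all entered houses, which matches the side-by-side layout the plan describes. A house that lacks an attribute is shown as `n/a` rather than raising an error.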