This series features mid-course projects for our Data Science Bootcamp. Students were first tasked with posing an interesting data question and finding a dataset to address that question. Next, they spent time cleaning, wrangling, and exploring the data, before designing and building an interactive Shiny app to display their findings and allow for further exploration.
Although Airbnb is a popular online service that helps travelers discover and book one-of-a-kind lodgings all over the world, the company has been criticized for not fully addressing its users' safety concerns. So for his mid-course capstone project, Rod Miller of Data Science Cohort 6 decided to explore what it would take to create an integrated map of Airbnb listings and pair it with Google Street View, giving users a more thorough understanding of the safety of their intended accommodation and its surrounding area.
The Data Question
Rod started with the question “What areas in the city have the most/least Airbnb listings?” Answering it led to his second and third questions: “What do crime rates look like around a particular Airbnb listing?” and “What does the surrounding neighborhood look like?”
Cleaning The Data
To answer these questions, Rod downloaded Airbnb data from insideairbnb.com. He then filtered, downloaded, and cleaned six months of Nashville crime data. Next, he extracted the latitude and longitude coordinates of each Nashville Airbnb and, using R’s sf package, used them as reference points to keep only the crimes within a quarter mile of each rental unit. He also created URL links to each Airbnb listing and to Google Street View for use in Leaflet map popups.
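To make the quarter-mile filter and the popup links concrete, here is a minimal sketch in Python. It is not Rod's code (his project used R's sf package); it swaps in a plain haversine distance calculation to illustrate the same idea. The field names (`id`, `lat`, `lon`) and the helper names are hypothetical, and the URL patterns are assumptions based on Airbnb's public listing URLs and the Google Maps URLs format for Street View.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def crimes_near(listing, crimes, radius_miles=0.25):
    """Keep only the crimes within radius_miles of the listing (hypothetical helper)."""
    return [
        crime for crime in crimes
        if haversine_miles(listing["lat"], listing["lon"], crime["lat"], crime["lon"]) <= radius_miles
    ]


def popup_links(listing):
    """Build the two popup URLs for a listing (URL patterns are assumptions)."""
    return {
        "airbnb": f"https://www.airbnb.com/rooms/{listing['id']}",
        "street_view": (
            "https://www.google.com/maps/@?api=1&map_action=pano"
            f"&viewpoint={listing['lat']},{listing['lon']}"
        ),
    }
```

For thousands of listings and a large crime table, a spatial library such as sf (which is what Rod used) is likely the better choice, since it can use spatial indexing instead of comparing every pair of points.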
One of the challenges Rod faced in the cleaning process was filtering out crimes that would not be relevant to Airbnb’s users, such as shoplifting, fraud, and non-specific police inquiries. “There are a lot of crime categories and similar sub-categories which made the process difficult,” he explains. “I tried to filter the crime data before downloading the csv but had to do it by each individual category, which was tedious.” He used tidyverse, sf, and glue to clean and package his data.
Visualizing The Data
Rod’s Shiny App
Using a Leaflet map, Rod visualized the locations of all the Nashville Airbnb properties as well as the surrounding crimes. “Leaflet provides an easy way for the user to interact with the map, click on a popup, etc.,” he shares. “Each property is color-coded based on the total amount of crime that took place in the six months of data so the user can instantly visualize the areas where more/less crime occurred.”
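The color-coding Rod describes amounts to sorting each property's crime total into a bin and assigning that bin a color. The article does not state his cutoffs or palette, so the thresholds and colors in this Python sketch are purely hypothetical.

```python
import bisect

# Hypothetical cutoffs and palette; the article does not give Rod's actual values.
THRESHOLDS = (10, 50, 150)
COLORS = ("green", "yellow", "orange", "red")


def crime_color(total_crimes, thresholds=THRESHOLDS, colors=COLORS):
    """Return the marker color for a property's six-month crime total.

    A total below thresholds[0] gets colors[0]; a total at or above the
    last threshold gets the last color.
    """
    return colors[bisect.bisect_right(thresholds, total_crimes)]
```

In an R Shiny app, Leaflet's own palette helpers could play the same role.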
Rod also included bar charts, which summarize crime locations so the user can see the types of property where most crimes took place, and donut charts, which break down crime types by number and percentage. Rod says that deciding how to visualize specific crimes was difficult “due to the amount of various types and length of the titles used for particular crimes.” But the Plotly library he used in the final application includes hover functionality, so users can see the relevant details by hovering over a chart with the cursor.
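The numbers behind a donut chart like this are simply a count and a percentage share for each crime type. As a minimal sketch (the function name and the sample category labels are hypothetical, not taken from Rod's data):

```python
from collections import Counter


def crime_breakdown(crime_types):
    """Return (crime type, count, percent of total) tuples, most frequent first."""
    counts = Counter(crime_types)
    total = sum(counts.values())
    return [
        (name, count, round(100 * count / total, 1))
        for name, count in counts.most_common()
    ]


# Hypothetical sample: the crime types recorded near one listing.
sample = ["vehicle burglary", "vehicle burglary", "vehicle burglary", "assault"]
for name, count, percent in crime_breakdown(sample):
    print(f"{name}: {count} ({percent}%)")
```

Each tuple maps directly onto one slice of the chart, with the count and percentage available for the hover text.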
What Rod discovered from the data surprised him! He had expected more crime near the downtown area, but not such a wide gap. “For the six months of data, total crime occurrence downtown was 16-20x the average amount of total crime occurrence elsewhere in Nashville,” he explains.
He also learned that there are a total of 7,733 Airbnb properties in the Nashville area, and motor vehicle burglary was the most committed crime in Nashville in 2022 with about 9,000 total incidents.
Rod believes that this information is useful not only for renters but also for policymakers and urban planners, who could use it to plan for city expansion and tackle potential safety concerns.
For more of Rod’s insights, visit his Shiny app or his project on GitHub.