Location AI: The Next Generation of Geospatial Analysis
Real-world problems are multidimensional and multifaceted. Location data is a key dimension whose volume and availability have grown exponentially in the last decade. At the confluence of cloud computing, geospatial data analytics, and machine learning, we are able to unlock new patterns and meaning within geospatial data structures that help improve business decision-making, performance, and operational efficiency.
The following example demonstrates the power of this convergence. Cleaned and enriched geospatial data, combined with geostatistical feature engineering, substantially improves the accuracy of a housing price prediction model. The question we’ll be looking at is: What is the predicted sale price for a home sale listing? Keep in mind, however, that this workflow can be used for a broad range of geospatial use cases.
A Light Gradient Boosted Trees Regressor with Early Stopping model was trained without any geospatial data on 5,657 residential home listings to provide a baseline for comparison. This produced a cross-validation RMSLE (root mean squared logarithmic error) of 0.3530. For one example listing, this model predicted a price roughly $21,000 above the true price.
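For readers unfamiliar with the metric, RMSLE can be sketched in a few lines. This is a minimal illustration of the standard formula, not the platform's implementation; the prices below are hypothetical and do not come from the listing dataset.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    # log1p(x) = log(1 + x), which keeps the metric defined for zero values.
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical sale prices, for illustration only.
true_prices = [300_000, 450_000, 525_000]
predicted_prices = [321_000, 430_000, 540_000]
score = rmsle(true_prices, predicted_prices)
print(f"RMSLE: {score:.4f}")
```

Because the error is measured on a log scale, RMSLE penalizes relative rather than absolute misses, which suits a target like sale price that spans a wide range.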
To isolate the impact of the geospatial features, we compare modeling results from the same blueprint as the baseline model, this time using the data’s available location identifiers. Enabling spatial data in the modeling workflow improved cross-validation RMSLE by 7.14% over the baseline and produced a prediction roughly $12,000 above the true price, about $9,000 closer than the baseline model.
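The percentage quoted above is relative to the baseline score. The sketch below shows that arithmetic, assuming "improvement" means (baseline - new) / baseline. The implied RMSLE it computes is derived under that assumption and is not a figure reported in this article.

```python
def relative_improvement(baseline, new):
    """Fractional reduction in error relative to the baseline score."""
    return (baseline - new) / baseline


def implied_score(baseline, improvement):
    """Score implied by a fractional improvement over the baseline."""
    return baseline * (1.0 - improvement)


BASELINE_RMSLE = 0.3530          # reported baseline score
LOCATION_IMPROVEMENT = 0.0714    # reported 7.14% improvement

# Derived value under the stated assumption, not a reported result.
location_rmsle = implied_score(BASELINE_RMSLE, LOCATION_IMPROVEMENT)
print(f"Implied RMSLE with location features: {location_rmsle:.4f}")
```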
As a practice, spatial data scientists attempt to encode human spatial reasoning in a form that machines can learn from. Five factors hypothesized to contribute to housing prices were used to enrich the listing data through spatial joins:
- select demographic variables from the U.S. Census Bureau,
- walkability scores from the Environmental Protection Agency,
- highway distance,
- school district scores, and
- distance to recreation, namely, ski resorts.
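Two of the factors above, highway distance and distance to ski resorts, are distance features. The following is a minimal sketch of how such a feature could be derived, assuming each listing and each resort is a (latitude, longitude) pair in degrees. The coordinates, names, and the `dist_to_ski_km` column are hypothetical, and the great-circle calculation stands in for whatever tooling the original workflow used.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_distance_km(point, candidates):
    """Distance from one point to the closest of several candidate points."""
    lat, lon = point
    return min(haversine_km(lat, lon, c_lat, c_lon) for c_lat, c_lon in candidates)


# Hypothetical coordinates, for illustration only.
listings = {"listing_a": (40.65, -111.50), "listing_b": (40.76, -111.89)}
ski_resorts = [(40.62, -111.49), (40.58, -111.65)]

# Attach the new feature to each listing.
enriched = {
    name: {"dist_to_ski_km": nearest_distance_km(coords, ski_resorts)}
    for name, coords in listings.items()
}
for name, features in enriched.items():
    print(name, round(features["dist_to_ski_km"], 1))
```

A brute-force minimum is fine for a few thousand listings. Larger datasets would typically use a spatial index instead.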
Geospatial enrichment, in combination with Location AI’s Spatial Neighborhood Featurizer, reveals local spatial dependence structures, such as the spatial autocorrelation that exists among the number of bedrooms, the square footage of the listing, and the enriched walkability score feature. Spatial data enrichment improved cross-validation RMSLE by 8.73% over the baseline and produced a prediction roughly $1,300 above the true price, about $11,000 closer than the location-enabled model and about $20,000 closer than the baseline model.
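Spatial autocorrelation is commonly summarized with Moran's I, which is positive when neighboring locations have similar values and negative when they alternate. The sketch below computes global Moran's I for a hypothetical row of four listings with a simple adjacency weight matrix. It illustrates the statistic only and is not the Spatial Neighborhood Featurizer's implementation.

```python
def morans_i(values, weights):
    """Global Moran's I for values with a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products of deviations between location pairs.
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    variance_sum = sum(d * d for d in deviations)
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical: four listings along a street, each adjacent to the next.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered_scores = [1, 2, 3, 4]     # similar values sit next to each other
alternating_scores = [1, 4, 1, 4]   # neighbors differ as much as possible

print(morans_i(clustered_scores, adjacency))
print(morans_i(alternating_scores, adjacency))
```

The clustered arrangement yields a positive statistic and the alternating one a negative statistic, which is the pattern a neighborhood-based feature is meant to capture.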
Spatial predictive modeling is applicable to a wide range of industries because of the general availability of spatial data. Analyzing and understanding the applicability of spatial data enrichment to any particular machine learning scenario does not have to be a complex undertaking. To learn more about the best practices used to develop this location-aware model, read the full white paper here.