Anacostia Riverkeeper Forecasts River Water Quality with DataRobot
The water quality of an urban-adjacent river like the Anacostia River in Washington, D.C., can vary wildly from day to day. Severe rainstorms, sewage overflows, and excessive littering are just some of the factors that can affect its quality and cleanliness.
The primary mission of environmental group Anacostia Riverkeeper is to restore and maintain the river and connect it to the local D.C. and Maryland population through program work and advocacy. Part of that work includes using data science modeling to predict the river’s water quality. For this reason, Anacostia Riverkeeper was among the inaugural class of the AI for Good program from enterprise AI platform DataRobot, and it is already seeing its data science work accelerate.
“The DataRobot platform is so easy to use,” said Project Coordinator and Development Lead, Olivia Anderson. “I’m not a data scientist, but I went through the intro DataRobot Essentials course and I feel super comfortable using it. We’re really happy DataRobot teamed up with us to try and figure out this water quality issue.”
The AI For Good: Powered by DataRobot program incorporates trusted AI technology to support organizations as they work to effectively solve global challenges. As one of the five organizations in the inaugural class, Anacostia Riverkeeper is building AI models to forecast E. coli levels throughout the Anacostia River on any given day, significantly supplementing the organization’s current volunteer-driven, manual efforts to measure and predict water quality.
Olivia testing water samples for E. coli and other bacteria
The current efforts to measure water quality rely heavily on citizen science: engaged residents volunteer to collect water samples from 22 sites across the city as part of the D.C. Citizen Science Water Quality Monitoring program, funded by the Department of Energy and Environment. The Anacostia Riverkeeper team then tests those samples and provides a measure of the river’s water quality at that time. However, the water quality can change between the time the samples are collected and the time the results are delivered 24 hours later. To address this issue, the team is turning to data science to become more proactive and predictive. With the help of DataRobot, the small Anacostia Riverkeeper team of three non-data scientists can address gaps in bandwidth and skills.
“The whole platform is so well put together,” said Anderson. “It allows us to instantaneously pull in all these other data and water quality parameters like water temperature or pH or tide data. It’s so user-friendly and it makes me feel like I could go in there and play around and make something I could be confident in using, even without the amazing support of our customer-facing data scientist.”
The team is currently adding to the dataset so they can build an even more robust model. They’re also deciding whether to structure their predictions around continuous or categorical variables and whether to publicize the water’s bacteria level range. They are also weighing whether a simple pass/fail indicator on water quality would clearly tell the public when it is safe to enjoy the river.
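To make the choice between those reporting formats concrete, here is a minimal Python sketch of how one continuous forecast could be turned into either a bacteria level range or a pass/fail indicator. It is an illustration only, not the team's model or a DataRobot API: the `Forecast` type, the function names, the site name, the band edges, and the 410 MPN/100 mL cutoff are all hypothetical placeholders chosen for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass

# Hypothetical placeholder values (not from the article): a single
# pass/fail cutoff and range-band edges, both in MPN per 100 mL.
ASSUMED_CUTOFF = 410.0
ASSUMED_BAND_EDGES = (100.0, 410.0, 1000.0)


@dataclass
class Forecast:
    """One continuous E. coli prediction for one sampling site."""
    site: str
    ecoli_mpn_per_100ml: float


def pass_fail(forecast: Forecast, cutoff: float = ASSUMED_CUTOFF) -> str:
    """Collapse the continuous prediction into a binary indicator."""
    return "pass" if forecast.ecoli_mpn_per_100ml <= cutoff else "fail"


def range_band(forecast: Forecast, edges=ASSUMED_BAND_EDGES) -> str:
    """Report the prediction as a labeled range (edges must be sorted)."""
    labels = [f"up to {edges[0]:g}"]
    labels += [f"{lo:g} to {hi:g}" for lo, hi in zip(edges, edges[1:])]
    labels.append(f"above {edges[-1]:g}")
    # bisect_left counts the edges strictly below the value, so a value
    # equal to an edge falls in the lower band.
    return labels[bisect_left(edges, forecast.ecoli_mpn_per_100ml)]


if __name__ == "__main__":
    example = Forecast(site="Example Site", ecoli_mpn_per_100ml=900.0)
    print(example.site, pass_fail(example), range_band(example))
```

The trade-off mirrors the one the team describes: the pass/fail view is easiest for the public to read, while the range view keeps more of the information in the underlying continuous prediction.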
The team hopes to finalize and validate the model and begin posting daily water quality forecasts by the summer. There’s also the potential to expand to other watersheds further along the Potomac and the Chesapeake Bay. But as Anderson describes, it all starts with the Anacostia River, and with the help of DataRobot, the team hopes this is a turning point for water quality measurements and predictions.
The beautiful Anacostia River
“We call it a small river, but with big potential,” said Anderson. “This model we’re working on could really stretch to more than just the Anacostia River if it’s accurate and successful here.
“We’re really humbled that we were chosen to be part of the inaugural class of the DataRobot AI for Good program.”