Developing a framework for classifying water lead levels at private drinking water systems: A Bayesian Belief Network approach. (1st February 2021)
- Record Type:
- Journal Article
- Title:
- Developing a framework for classifying water lead levels at private drinking water systems: A Bayesian Belief Network approach. (1st February 2021)
- Main Title:
- Developing a framework for classifying water lead levels at private drinking water systems: A Bayesian Belief Network approach
- Authors:
- Fasaee, Mohammad Ali Khaksar
Berglund, Emily
Pieper, Kelsey J.
Ling, Erin
Benham, Brian
Edwards, Marc
- Abstract:
- Highlights: Framework fuses discretization, feature selection, and Bayes classifiers. Bayesian Belief Network classifies lead levels in drinking water above 15 ppb. Applied for dataset collected at private drinking water systems in Virginia. Features include plumbing, treatment, water quality, and perceptions. Naïve Bayes model classifies lead with 81.8% accuracy and 47.6% recall.
Abstract: The presence of lead in drinking water creates a public health crisis, as lead causes neurological damage at low levels of exposure. The objective of this research is to explore modeling approaches to predict the risk of lead at private drinking water systems. This research uses Bayesian Network approaches to explore interactions among household characteristics, geological parameters, observations of tap water, and laboratory tests of water quality parameters. A knowledge discovery framework is developed by integrating methods for data discretization, feature selection, and Bayes classifiers. Forward selection and backward selection are explored for feature selection. Discretization approaches, including domain-knowledge, statistical, and information-based approaches, are tested to discretize continuous features. Bayes classifiers that are tested include General Bayesian Network, Naive Bayes, and Tree-Augmented Naive Bayes, which are applied to identify Directed Acyclic Graphs (DAGs). Bayesian inference is used to fit conditional probability tables for each DAG. The Bayesian framework is applied to fit models for a dataset collected by the Virginia Household Water Quality Program (VAHWQP), which collected water samples and conducted household surveys at 2,146 households that use private water systems, including wells and springs, in Virginia during 2012 and 2013. Relationships among laboratory-tested water quality parameters, observations of tap water, and household characteristics, including plumbing type, source water, household location, and on-site water treatment are explored to develop features for predicting water lead levels. Results demonstrate that Naive Bayes classifiers perform best based on recall and precision, when compared with other classifiers. Copper is the most significant predictor of lead, and other important predictors include county, pH, and on-site water treatment. Feature selection methods have a marginal effect on performance, and discretization methods can greatly affect model performance when paired with classifiers. Owners of private wells remain disadvantaged and may be at an elevated level of risk, because utilities and governing agencies are not responsible for ensuring that lead levels meet the Lead and Copper Rule for private wells. Insight gained from models can be used to identify water quality parameters, plumbing characteristics, and household variables that increase the likelihood of high water lead levels to inform decisions about lead testing and treatment.
- Is Part Of:
- Water research. Volume 189(2021)
- Journal:
- Water research
- Issue:
- Volume 189(2021)
- Issue Display:
- Volume 189, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 189
- Issue:
- 2021
- Issue Sort Value:
- 2021-0189-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-02-01
- Subjects:
- Bayesian Belief Network -- Lead in Drinking Water -- Contamination Classification -- Water Quality
Water -- Pollution -- Research -- Periodicals
363.7394
- Journal URLs:
- http://catalog.hathitrust.org/api/volumes/oclc/1769499.html
http://www.sciencedirect.com/science/journal/00431354
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.watres.2020.116641
- Languages:
- English
- ISSNs:
- 0043-1354
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 9273.400000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15410.xml