Urban neighborhood socioeconomic status (SES) inference: A machine learning approach based on semantic and sentimental analysis of online housing advertisements. (June 2022)
- Record Type:
- Journal Article
- Title:
- Urban neighborhood socioeconomic status (SES) inference: A machine learning approach based on semantic and sentimental analysis of online housing advertisements. (June 2022)
- Main Title:
- Urban neighborhood socioeconomic status (SES) inference: A machine learning approach based on semantic and sentimental analysis of online housing advertisements
- Authors:
- Wang, Lingqi
He, Shenjing
Su, Shiliang
Li, Yu
Hu, Lirong
Li, Guie
- Abstract:
- Understanding the dynamic distribution of residents' socioeconomic status (SES) across neighborhoods within cities is essential for urban planning and policy-making aligning to the Sustainable Development Goals 2030. Whereas the promise in explicitly linking geographical features to SES has been highlighted fairly clear in previous works, scholars hold an eclectic attitude in their outlook, given the absence of theoretical ground, the heavy reliance on nontransparent proprietary data sources and the relatively coarse resolution predictions. Drawing on a case study of Hangzhou metropolitan in China, this paper aims to address these problems by demonstrating a novel approach to neighborhood SES inference based on online housing advertisements. We first revisit the theoretical debates on the linkage between neighborhood SES and online housing advertisements. Then, the Naïve Bayes classifier is employed to semantically identify the topics from online housing advertisements and the associated sentiments are quantified using the lexicon-based approach. Following that, seven commonly used machine learning algorithms are compared and utilized to infer the fine-grained neighborhood SES at residential quarters scale based on the housing attributes and extracted topics from online housing advertisements. Results show that machine learning algorithms vary with predictive ability and the tree-based algorithms are much more powerful in inferring neighborhood SES. More specifically, we distinguish 8 reliable features which not only present relative high importance estimated by all the machine learning algorithms but also exhibit great robustness in inferring neighborhood SES and show promising potential to being applied for unraveling social inequalities. We also observe noteworthy spatial heterogeneity in neighborhood SES across the research site. The demonstrated approach not only enables the policymakers to take stock of deprived neighborhoods in a timely manner, but also lays firm ground for framing contextualized strategies of urban governance. This study is among the first attempts to bridge the theoretical interpretation of housing attributes with the proxy indicator-based approach for fine-grained neighborhood SES measurement.
- Is Part Of:
- Habitat international. Volume 124(2022)
- Journal:
- Habitat international
- Issue:
- Volume 124(2022)
- Issue Display:
- Volume 124, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 124
- Issue:
- 2022
- Issue Sort Value:
- 2022-0124-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Neighborhood socioeconomic status -- Area deprivation -- Machine learning -- Open data -- Social inequalities -- Online housing listings
Human settlements -- Periodicals
307
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01973975
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.habitatint.2022.102572
- Languages:
- English
- ISSNs:
- 0197-3975
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4237.403000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21496.xml