A data-driven approach to studying changing vocabularies in historical newspaper collections. (5th November 2021)
- Record Type:
- Journal Article
- Title:
- A data-driven approach to studying changing vocabularies in historical newspaper collections. (5th November 2021)
- Main Title:
- A data-driven approach to studying changing vocabularies in historical newspaper collections
- Authors:
- Hengchen, Simon
Ros, Ruben
Marjanen, Jani
Tolonen, Mikko
- Abstract:
- Nation and nationhood are among the most frequently studied concepts in the field of intellectual history. At the same time, the word 'nation' and its historical usage are very vague. The aim in this article was to develop a data-driven method using dependency parsing and neural word embeddings to clarify some of the vagueness in the evolution of this concept. To this end, we propose the following two-step method. First, using linguistic processing, we create a large set of words pertaining to the topic of nation. Second, we train diachronic word embeddings and use them to quantify the strength of the semantic similarity between these words and thereby create meaningful clusters, which are then aligned diachronically. To illustrate the robustness of the study across languages, time spans, as well as large datasets, we apply it to the entirety of five historical newspaper archives in Dutch, Swedish, Finnish, and English. To our knowledge, thus far there have been no large-scale comparative studies of this kind that purport to grasp long-term developments in as many as four different languages in a data-driven way. A particular strength of the method we describe in this article is that, by design, it is not limited to the study of nationhood, but rather expands beyond it to other research questions and is reusable in different contexts.
- Is Part Of:
- Digital scholarship in the humanities. Volume 36(2021)Supplement 2
- Journal:
- Digital scholarship in the humanities
- Issue:
- Volume 36(2021)Supplement 2
- Issue Display:
- Volume 36, Issue 2 (2021)
- Year:
- 2021
- Volume:
- 36
- Issue:
- 2
- Issue Sort Value:
- 2021-0036-0002-0000
- Page Start:
- ii109
- Page End:
- ii126
- Publication Date:
- 2021-11-05
- Subjects:
- Philology -- Data processing -- Periodicals
Computational linguistics -- Periodicals
410.285
- Journal URLs:
- http://www.oxfordjournals.org/
http://dsh.oxfordjournals.org/
- DOI:
- 10.1093/llc/fqab032
- Languages:
- English
- ISSNs:
- 2055-768X
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20346.xml