Search results


Showing results 1 to 2 of 2.


Peeking Inside the DH Toolbox - Detection and Classification of Software Tools in DH Publications

Authors: Ruth, Nicolas; Niekler, Andreas; Burghardt, Manuel

Published: 2022

Publisher: CEUR-WS.org

Full text:	https://ul.qucosa.de/id/qucosa%3A92316 https://ul.qucosa.de/api/qucosa%3A92316/attachment/ATT-0/
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:15-qucosa2-923161

Digital tools have played an important role in Digital Humanities (DH) since its beginnings. Accordingly, a lot of research has been dedicated to the documentation of tools as well as to the analysis of their impact from an epistemological perspective. In this paper we propose a binary and a multi-class classification approach to detect and classify tools. The approach builds on state-of-the-art neural language models. We test our model on two different corpora and report the results for different parameter configurations in two consecutive experiments. In the end, we demonstrate how the models can be used for actual tool detection and tool classification tasks in a large corpus of DH journals.


Source:	BASE Fachausschnitt AVL
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Literature and rhetoric (800)
Keywords:	environmental humanities; computational literary studies; text mining; ecology; biodiversity; content analysis; literature
License:	info:eu-repo/semantics/openAccess

Into the bibliography jungle: using random forests to predict dissertations’ reference section

Authors: Gutiérrez De la Torre, Silvia E.; Niekler, Andreas; Equihua, Julián; Burghardt, Manuel

Published: 2022

Publisher: CEUR-WS.org

Full text:	https://ul.qucosa.de/id/qucosa%3A92321 https://ul.qucosa.de/api/qucosa%3A92321/attachment/ATT-0/
Citable link:	https://nbn-resolving.org/urn:nbn:de:bsz:15-qucosa2-923215

Cited-works lists in Humanities dissertations are typically the result of five years of work. However, despite the long-standing tradition of reference mining, no research has systematically tapped the bibliographic data of existing electronic thesis collections. One of the main reasons for this is the difficulty of creating a tagged gold standard for theses that are around 300 pages long. In this short paper, we propose a page-based random forest (RF) prediction approach which uses a new corpus of Literary Studies dissertations from Germany. Moreover, we explain the handcrafted but computationally informed feature-selection process. The evaluation demonstrates that this method achieves an F1 score of 0.88 on this new dataset. In addition, it has the advantage of being derived from an interpretable model, where feature relevance for prediction is clear, and incorporates a simplified annotation process.


Source:	BASE Fachausschnitt AVL
Language:	English
Media type:	Conference publication
Format:	Online
DDC classification:	Literature and rhetoric (800)
Keywords:	electronic theses and dissertations; bibliographic reference parsing; information retrieval; machine learning
License:	info:eu-repo/semantics/openAccess
