Search results

EuReCo: Not Building and Yet Using Federated Comparable Corpora for Cross-Linguistic Research

Author(s): Kupietz, Marc; Bański, Piotr; Diewald, Nils; Trawiński, Beata; Witt, Andreas

Published: 2024

Publisher: Leibniz-Institut für Deutsche Sprache (IDS), Mannheim

Access:

Resolving system

Long-term archive, National Library

Publisher (free of charge)

Source:	Union catalogues
Contributors:	Zweigenbaum, Pierre (editor); Rapp, Reinhard (editor); Sharoff, Serge (editor)
Language:	English
Media type:	Dissertation
Format:	Online
Other identifiers:	urn: urn:nbn:de:bsz:mh39-126961
Subject headings:	Corpus <linguistics>; Multilingualism
Further keywords:	Reference Corpora; National Corpora; Federated Corpora; Multilingual Corpora; Cross-Linguistic Research; Comparability
Extent:	Online resource
Note(s):	In: Paris: ELRA Language Resource Association, (2024). In: Proceedings of the 17th Workshop on Building and Using Comparable Corpora (BUCC) @ LREC-COLING 2024. Diploma thesis, Mannheim, Leibniz-Institut für Deutsche Sprache (IDS), Library

Building and using comparable corpora for multilingual natural language processing

Author(s): Sharoff, Serge; Rapp, Reinhard; Zweigenbaum, Pierre

Published: [2023]

Publisher: Springer, Cham, Switzerland

Stuttgart: Universitätsbibliothek Stuttgart

Location: Universitätsbibliothek Stuttgart

Interlibrary loan: unrestricted interlibrary loan, copy and lending

This book provides a comprehensive overview of methods to build comparable corpora and of their applications, including machine translation, cross-lingual transfer, and various kinds of multilingual natural language processing. The authors begin with a brief history of the topic, followed by a comparison to parallel resources and an explanation of why comparable corpora have become more widely used. In particular, comparable corpora provide the basis for the multilingual capabilities of pre-trained models, such as BERT or GPT. The book then focuses on building comparable corpora, aligning their sentences to create a database of suitable translations, and using these sentence translations to produce dictionaries and term banks. Finally, it explains how comparable corpora can be used to build machine translation engines and to develop a wide variety of multilingual applications.

Source:	Union catalogues
Language:	English
Media type:	Book (monograph)
Format:	Print
ISBN:	9783031313837
Series:	Synthesis lectures on human language technologies
Subject headings:	Applied computer science; COM094000; COMPUTERS / Computer Science; COMPUTERS / Natural Language Processing; Computational linguistics; Computational linguistics and corpus linguistics; Information technology: general issues; Machine learning; Natural language & machine translation; Natural languages and machine translation
Extent:	viii, 133 pages, illustrations, diagrams
Note(s):	Chapter 1 Introduction.- Chapter 2 Basic principles of cross-lingual models.- Chapter 3 Building comparable corpora.- Chapter 4 Extraction of parallel sentences.- Chapter 5 Induction of bilingual Dictionaries.- Chapter 6 Comparable and Parallel Corpora for Machine Translation.- Chapter 7 Other applications of comparable corpora.- Chapter 8 Conclusions and future research.- Index.

Building and using comparable corpora for multilingual Natural language processing

Author(s): Sharoff, Serge; Rapp, Reinhard; Zweigenbaum, Pierre

Published: [2023]

Publisher: Springer International Publishing AG, Cham

Access:

Aggregator (licence required)

Leipzig: Universitätsbibliothek Leipzig

Location: Universitätsbibliothek Leipzig

Interlibrary loan: none

Intro -- Contents -- About the Authors -- 1 Introduction -- 1.1 Rationale for Working with Comparable Corpora -- 1.1.1 Availability of Truly Parallel Data -- 1.1.2 Translationese in Parallel Data -- 1.2 Levels of Comparability -- 1.3 Methodology for Dealing with Comparable Resources -- 2 Basic Principles of Cross-Lingual Models -- 2.1 Monolingual VSMs -- 2.2 Cross-Lingual VSMs -- 2.3 Contextual Embeddings -- 3 Building Comparable Corpora -- 3.1 Measures for Document Similarity Across Languages -- 3.2 Evaluation Methods and Datasets -- 3.3 Natural Annotation: Building Strongly Comparable Corpora -- 3.4 Low-Hanging Fruit: Building Weakly Comparable Corpora -- 3.5 Large Scale Document Alignment -- 3.5.1 Structural Similarity -- 3.5.2 Lexical Similarity -- 3.6 Comparable Corpora of Unrelated Documents -- 4 Extraction of Parallel Sentences -- 4.1 Extraction from Parallel Corpora -- 4.2 Assessing Cross-Lingual Sentence Similarity -- 4.3 Datasets and Evaluation -- 4.4 General Principles -- 4.5 Pre-neural Methods -- 4.6 Supervised Neural Methods -- 4.7 Limitations of Supervised Methods in Low-Resource Settings -- 4.8 Unsupervised Neural Methods -- 5 Induction of Bilingual Dictionaries -- 5.1 Setting the Task -- 5.2 Bilingual Lexicon Induction From Parallel Corpora -- 5.3 Matching Contexts -- 5.4 Geometric Properties of Word Embedding Spaces -- 5.5 Alignment of Word Embeddings -- 5.6 Alignment of Contextual Embeddings -- 5.7 Evaluation -- 5.7.1 Evaluation Experiments for BLI -- 5.7.2 Evaluation on Multilingual Termbanks -- 5.8 The BUCC 2020 Shared Task on Bilingual Dictionary Induction -- 5.8.1 Resources -- 5.8.2 Evaluation -- 5.9 The BUCC 2022 Shared Task on Bilingual Terminology Extraction -- 5.9.1 Specifications of the Task -- 5.9.2 Shared Task Results -- 6 Comparable and Parallel Corpora for Machine Translation.

Source:	Union catalogues
Language:	English
Media type:	Ebook
Format:	Online
ISBN:	9783031313844
Series:	Synthesis lectures on human language technologies series
Subject headings:	Applied computer science; COM094000; COMPUTERS / Computer Science; COMPUTERS / Natural Language Processing; Computational linguistics; Computational linguistics and corpus linguistics; Information technology: general issues; Machine learning; Natural language & machine translation; Natural languages and machine translation
Extent:	1 online resource (viii, 133 pages)
Note(s):	Description based on publisher supplied metadata and other sources

Building and Using Comparable Corpora

Contributors: Sharoff, Serge (editor); Rapp, Reinhard (editor); Zweigenbaum, Pierre (editor); Fung, Pascale (editor)

Published: 2013

Publisher: Springer Berlin Heidelberg, Berlin, Heidelberg; Springer International Publishing AG, Cham

Access:

Resolving system

Darmstadt: Bibliothek der Hochschule Darmstadt, Zentralbibliothek

Location: Bibliothek der Hochschule Darmstadt, Zentralbibliothek

Interlibrary loan: none

Frankfurt/Main: Universitätsbibliothek J. C. Senckenberg, Zentralbibliothek (ZB)

Location: Universitätsbibliothek J. C. Senckenberg, Zentralbibliothek (ZB)

Interlibrary loan: none

Source:	Union catalogues
Contributors:	Sharoff, Serge (editor); Rapp, Reinhard (editor); Zweigenbaum, Pierre (editor); Fung, Pascale (editor)
Language:	English
Media type:	Ebook
Format:	Online
ISBN:	9783642201288; 3642201288
Other identifiers:	doi: 10.1007/978-3-642-20128-8
RVK classification:	ES 960
DDC classification:	Data processing; Computer science (004); Language (400)
Edition:	1st ed. 2013
Subject headings:	Computational linguistics; Translation; Corpus <linguistics>; Comparability; Natural language processing (Computer science); Application software; Natural Language Processing (NLP); Computer and Information Systems Applications
Extent:	1 online resource (XII, 335 pages), 70 illus., 14 illus. in color.

Building and using comparable corpora

Contributors: Sharoff, Serge (ed.)

Published: 2013

Publisher: Springer, Berlin [et al.]

Frankfurt/Main: Universitätsbibliothek J. C. Senckenberg, Zentralbibliothek (ZB)

Location: Universitätsbibliothek J. C. Senckenberg, Zentralbibliothek (ZB)

Shelf mark: 90.199.59

Interlibrary loan: unrestricted interlibrary loan, copy and lending

Germersheim: Universität Mainz, Bereichsbibliothek Translations-, Sprach- und Kulturwissenschaft

Location: Universität Mainz, Bereichsbibliothek Translations-, Sprach- und Kulturwissenschaft

Shelf mark: 20034481

Interlibrary loan: unrestricted interlibrary loan, copy and lending

Gießen: Universitätsbibliothek Gießen

Location: Universitätsbibliothek Gießen

Shelf mark: 000 ES 960 S531

Interlibrary loan: unrestricted interlibrary loan, copy and lending

Source:	Union catalogues
Contributors:	Sharoff, Serge (ed.)
Language:	English
Media type:	Book (monograph)
Format:	Print
ISBN:	364220127X; 9783642201271
Other identifiers:	9783642201271
RVK classification:	ES 960
DDC classification:	Language (400); Data processing; Computer science (004)
Subject headings:	Corpus <linguistics>; Comparability; Computational linguistics; Translation
Extent:	XII, 335 pp., ill., diagrams
Note(s):	Includes bibliographical references

Building and Using Comparable Corpora for Multilingual Natural Language Processing

Author(s): Sharoff, Serge; Rapp, Reinhard; Zweigenbaum, Pierre

Published: 2024

Publisher: Springer International Publishing AG, Cham

This book provides a comprehensive overview of methods to build comparable corpora and of their applications, including machine translation, cross-lingual transfer, and various kinds of multilingual natural language processing. The authors begin with a brief history of the topic, followed by a comparison to parallel resources and an explanation of why comparable corpora have become more widely used. In particular, comparable corpora provide the basis for the multilingual capabilities of pre-trained models, such as BERT or GPT. The book then focuses on building comparable corpora, aligning their sentences to create a database of suitable translations, and using these sentence translations to produce dictionaries and term banks. Finally, it explains how comparable corpora can be used to build machine translation engines and to develop a wide variety of multilingual applications.

Source:	Union catalogues
Language:	English
Media type:	Book (monograph)
Format:	Print
ISBN:	9783031313868
Series:	Synthesis Lectures on Human Language Technologies
Subject headings:	Applied computer science; Artificial intelligence; COMPUTERS / Artificial Intelligence; COMPUTERS / Data Processing / Speech & Audio Processing; COMPUTERS / Enterprise Applications; COMPUTERS / General; Computational linguistics; Computer science; Computational linguistics and corpus linguistics; Information technology: general issues; LANGUAGE ARTS & DISCIPLINES / Linguistics; MATHEMATICS / Probability & Statistics / General; Machine learning; Natural language & machine translation; Natural languages and machine translation
Extent:	133 pages
Note(s):	Chapter 1 Introduction.- Chapter 2 Basic principles of cross-lingual models.- Chapter 3 Building comparable corpora.- Chapter 4 Extraction of parallel sentences.- Chapter 5 Induction of bilingual Dictionaries.- Chapter 6 Comparable and Parallel Corpora for Machine Translation.- Chapter 7 Other applications of comparable corpora.- Chapter 8 Conclusions and future research.- Index.
