Suche

Wo soll gesucht werden?
Erweiterte Literatursuche

Ariadne Pfad:

Inhalt

Literaturnachweis - Detailanzeige

 
Autor/inn/enMeylan, Stephan C.; Griffiths, Thomas L.
TitelThe Challenges of Large-Scale, Web-Based Language Datasets: Word Length and Predictability Revisited
QuelleIn: Cognitive Science, 45 (2021) 6, (26 Seiten)
PDF als Volltext Verfügbarkeit 
ZusatzinformationORCID (Meylan, Stephan C.)
Spracheenglisch
Dokumenttypgedruckt; online; Zeitschriftenaufsatz
ISSN1551-6709
DOI10.1111/cogs.12983
SchlagwörterData Analysis; Data Collection; Word Frequency; Prediction; Language Research; Measurement
AbstractLanguage research has come to rely heavily on large-scale, web-based datasets. These datasets can present significant methodological challenges, requiring researchers to make a number of decisions about how they are collected, represented, and analyzed. These decisions often concern long-standing challenges in corpus-based language research, including determining what counts as a word, deciding which words should be analyzed, and matching sets of words across languages. We illustrate these challenges by revisiting "Word lengths are optimized for efficient communication" (Piantadosi, Tily, & Gibson, 2011), which found that word lengths in 11 languages are more strongly correlated with their average predictability (or average information content) than their frequency. Using what we argue to be best practices for large-scale corpus analyses, we find significantly attenuated support for this result and demonstrate that a stronger relationship obtains between word frequency and length for a majority of the languages in the sample. We consider the implications of the results for language research more broadly and provide several recommendations to researchers regarding best practices. (As Provided).
AnmerkungenWiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us
Erfasst vonERIC (Education Resources Information Center), Washington, DC
Update2024/1/01
Literaturbeschaffung und Bestandsnachweise in Bibliotheken prüfen
 

Standortunabhängige Dienste
Bibliotheken, die die Zeitschrift "Cognitive Science" besitzen:
Link zur Zeitschriftendatenbank (ZDB)

Artikellieferdienst der deutschen Bibliotheken (subito):
Übernahme der Daten in das subito-Bestellformular

Tipps zum Auffinden elektronischer Volltexte im Video-Tutorial

Trefferlisten Einstellungen

Permalink als QR-Code

Permalink als QR-Code

Inhalt auf sozialen Plattformen teilen (nur vorhanden, wenn Javascript eingeschaltet ist)

Teile diese Seite: