Skill level: Advanced
Language: English
Format: Code, Case Study, Tutorial, Self-Study Unit
Media type: Code Snippets, Data, Text Media
Published: 18 December 2024
ID: 10.71627/textprep
TextPrep
Yannik Peters, Kunjan Shah
Contributors: This tool was developed as part of the KODAQS project, a partnership between GESIS, the University of Mannheim, and LMU Munich.
The TextPrep tool, implemented in R, provides text preprocessing and comparative strategies to improve the quality of social media data. It supports common techniques, such as automated translation, minor text operations, and stopword filtering, while allowing systematic comparisons of alternative approaches. The tool shows how different procedures may affect analytical outcomes and provides metrics to quantify the differences, so that researchers can evaluate their preprocessing choices transparently. As an illustrative use case, a synthetic dataset of social media posts about the 2024 Summer Olympics is processed using different configurations to compare their impact on the resulting text.
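The idea of comparing preprocessing configurations can be illustrated with a minimal sketch. It is written in Python for illustration only and does not reproduce TextPrep's R functions or its actual metrics. The function names `preprocess` and `jaccard`, the tiny stopword list, the two configurations, and the example posts are all hypothetical. The sketch runs two configurations over the same posts and reports how many tokens each keeps and how similar the resulting token sets are.

```python
import re

# Tiny illustrative stopword list (an assumption, not TextPrep's list).
STOPWORDS = {"the", "a", "is", "at", "of", "in", "and", "to"}


def preprocess(text, lowercase=True, strip_urls=True, remove_stopwords=True):
    """Apply a configurable chain of simple text operations and return tokens."""
    if strip_urls:
        text = re.sub(r"https?://\S+", " ", text)
    if lowercase:
        text = text.lower()
    # Keep hashtags and mentions attached to their words.
    tokens = re.findall(r"[#@]?\w+", text)
    if remove_stopwords:
        tokens = [t for t in tokens if t.lower() not in STOPWORDS]
    return tokens


def jaccard(a, b):
    """Jaccard similarity of two token lists, compared as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Synthetic example posts (invented for this sketch).
posts = [
    "The opening ceremony in Paris is amazing! https://example.com/x",
    "Watching the 100m final at the Olympics #Paris2024",
]

# Two hypothetical configurations that differ only in stopword filtering.
config_a = {"lowercase": True, "strip_urls": True, "remove_stopwords": False}
config_b = {"lowercase": True, "strip_urls": True, "remove_stopwords": True}

for post in posts:
    tokens_a = preprocess(post, **config_a)
    tokens_b = preprocess(post, **config_b)
    print(len(tokens_a), len(tokens_b), round(jaccard(tokens_a, tokens_b), 2))
```

A low similarity between two configurations signals that the choice between them changes the text substantially, which is the kind of difference worth reporting alongside downstream results.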
This resource is available under the following license:
Creative Commons Attribution NonCommercial 4.0 International (CC-BY-NC-4.0)