B05

Modelling the Information Landscape (IL) for Assessing and Analyzing Domain-Specific and Generic Critical Online Reasoning

Background and Study Focus

The role of linguistic cues for text readability or Web source credibility has been widely studied. Using a corpus of short offline texts, principal investigators of B05 have also shown that linguistic features are important for predicting university student performance in domain-specific knowledge tests. However, it remains under-researched to what extent such correlations can be generalized to the online information landscape (IL). B05 addresses a key desideratum regarding the modelling of linguistic cues used in the online IL, which students navigate when solving critical online reasoning (COR) tasks.

Concept and Research Objective

B05 aims to develop a theoretically grounded model of linguistic features that predicts students’ COR processes (which are embedded in the IL) and performance, depending on the texts students process or produce while solving COR tasks. B05 addresses the following research questions: (i) To what extent do the linguistic features associated with students’ performance differ between generic and domain-specific COR, and among the four domains of economics, medicine, social sciences and physics? (ii) How do these features differ with respect to the three cognitive COR facets of online information acquisition, critical information evaluation, and reasoning with evidence, argumentation and synthesis? (iii) At which levels do these features apply: single or multiple texts, domains or genres, or the entire IL and the underlying language(s) (e.g. German) as a whole?

Measurements and Analyses

B05 consists of both quantitative and qualitative parts. It starts with the qualitative selection of linguistic features that provide information on the evidentiality status, the source of information and the organization of texts. The quantitative part operationalizes these features, expands them using a machine learning-based model, and tests their predictive power and specificity with regard to the above research questions. The integration of qualitative and quantitative analyses is in line with the computational hermeneutic circle, in which the quantitative part generates statistical evaluations and predictions that are interpretable as the results of the qualitative part’s linguistic analyses.

Research Outcomes

B05 provides machine learning models that make linguistic features of multiple texts as part of the information landscape accessible to automated analysis, based on a linguistic analysis of COR at the level of fine-grained linguistic information units.

Contribution within the Research Unit CORE

The A-projects provide texts and data on students’ performances from the longitudinal sample and will obtain results from B05’s linguistic analyses. Since linguistic features are by far the most detailed information units analyzed in the research unit, they are relevant for research in the other B-projects in terms of media and content properties (B04) and narrative and latent meaning structures (B06). The Multimodal Learning Data Science System of C08 is crucial for integrating all data from B05.

Publications

Peer Reviewed Articles

Abrami, G., Genios, M., Fitzermann, F., Baumartz, D., & Mehler, A. (2025). Docker Unified UIMA Interface: New perspectives for NLP on big data. SoftwareX, 29, 102033. https://doi.org/10.1016/j.softx.2024.102033

Scherer, T., Laufer, A., Maurer, M., & Schemer, C. (2025). Assessing the information quality of online sources used by first-year students. Zeitschrift für Erziehungswissenschaft. https://doi.org/10.1007/s11618-025-01344-w

Konca, M., Mehler, A., Lücking, A., & Baumartz, D. (2024). Visualizing domain-specific and generic critical online reasoning related structures of online texts. In O. Zlatkin-Troitschanskaia et al. (Eds.), Students’, graduates’ and young professionals’ critical use of online information (pp. 195–239). Springer Nature. https://doi.org/10.1007/978-3-031-69510-0_10

Baumartz, D., Konca, M., Mehler, A., Schrottenbacher, P., & Braunheim, D. (2024). Measuring group creativity of dialogic interaction systems by means of remote entailment analysis. Proceedings of the 35th ACM Conference on Hypertext and Social Media (pp. 153–166). Association for Computing Machinery. https://doi.org/10.1145/3648188.3675140

Mehler, A., Bagci, M., Henlein, A., Abrami, G., Spiekermann, C., Schrottenbacher, P., Konca, M., Lücking, A., Engel, J., Quintino, M., Schreiber, J., Saukel, K., & Zlatkin-Troitschanskaia, O. (2023). A multimodal data model for simulation-based learning with Va.Si.Li-Lab. In V. G. Duffy (Ed.), Digital human modeling and applications in health, safety, ergonomics and risk management (pp. 539–565). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-35741-1_39

Paper and Poster Presentations

Abrami, G., Bönisch, K., & Mehler, A. (2025). Towards unified, dynamic, and annotation-based visualisations and exploration of annotated big data corpora with the help of Unified Corpus Explorer. In Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL): System Demonstrations.

Abrami, G., Baumartz, D., & Mehler, A. (2025). DUUI: A toolbox for the construction of a new kind of natural language processing. In Proceedings of DHd 2025: Under Construction. Geisteswissenschaften und Data Humanities (pp. 446–448). https://doi.org/10.5281/zenodo.14887461

Scherer, T., Laufer, A., Maurer, M., & Schemer, C. (2025). Incomplete, incorrect, and unbalanced? The quality of online content used by students to solve COR tasks [Conference presentation]. Presentation at the 12th Conference of the Society for Empirical Educational Research (GEBF), January 27–29, Mannheim, Germany.

Scherer, T., Laufer, A., Maurer, M., & Schemer, C. (2025). Information quality of online sources used by first-semester students when solving generic COR tasks [Conference presentation]. Presentation at the 2025 Conference of the European Association for Research on Learning and Instruction (EARLI), August 25–30, Graz, Austria.

Konca, M., Mehler, A., Bagci, M., Bönisch, K., Engel, J., Henlein, A., Schrottenbacher, P., Stoeckel, M., & Spiekermann, C. (2024, April 13). Modelling and analyzing the online information landscape university students use for their learning [Paper presentation]. Presentation at the Annual Meeting of the American Educational Research Association (AERA), Philadelphia, Pennsylvania, USA.