Modelling the Information Landscape (IL) for Assessing and Analyzing Domain-Specific and Generic Critical Online Reasoning

The role of linguistic cues for text readability and Web source credibility has been widely studied. Using a corpus of short offline texts, the principal investigators of B05 have also shown that linguistic features are important for predicting university students’ performance in domain-specific knowledge tests. However, the extent to which such correlations generalize to the online information landscape (IL) remains under-researched. B05 addresses a key desideratum: modelling the linguistic cues used in the online IL, which students navigate when solving critical online reasoning (COR) tasks.

B05 aims to develop a theoretically grounded model of linguistic features that allows predictions of students’ COR processes (which are embedded in the IL) and performance, depending on the texts students process or produce during COR task-solving. B05 addresses the following research questions: (i) To what extent do the linguistic features involved differ in relation to students’ performance in generic vs. domain-specific COR, and within the four domains of economics, medicine, social sciences and physics? (ii) How do these features differ with respect to the three cognitive COR facets of online information acquisition, critical information evaluation, and reasoning with evidence, argumentation and synthesis? (iii) At which levels do these features apply: single or multiple texts, domains or genres, or the entire IL and the underlying language(s) (e.g. German) as a whole?

B05 consists of a qualitative and a quantitative part. It starts with the qualitative selection of linguistic features that provide information on the evidentiality status, the source of information and the organization of texts. The quantitative part operationalizes these features, expands them using a machine learning-based model, and tests their predictive power and specificity with regard to the above research questions. This integration of qualitative and quantitative analyses follows the computational hermeneutic circle: the quantitative part generates statistical evaluations and predictions that can be interpreted in light of the qualitative part’s linguistic analyses.
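The step from qualitatively selected features to a quantitative test can be illustrated with a minimal sketch. It is not B05's model: the cue lists, the function names `extract_features` and `pearson`, and the toy texts and scores are all hypothetical, English cues stand in for whatever inventory the qualitative part would actually select, and a plain Pearson correlation stands in for the machine learning-based model.

```python
import math
import re

# Hypothetical cue lists for the three feature groups named in the text.
# A real inventory would come from the qualitative linguistic analysis.
CUES = {
    "evidentiality": {"apparently", "reportedly", "allegedly", "presumably"},
    "source": {"according", "cited", "reports", "states"},
    "organization": {"however", "therefore", "moreover", "firstly"},
}


def extract_features(text):
    """Return the relative frequency of each cue group in a text."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    n = max(len(tokens), 1)  # avoid division by zero for empty texts
    return {
        name: sum(token in cues for token in tokens) / n
        for name, cues in CUES.items()
    }


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences.

    Returns 0.0 if either sequence has zero variance.
    """
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Toy data (invented): three student texts and their task scores.
texts = [
    "The price rose.",
    "According to the report, the price rose.",
    "According to the report, which cited the bank, the price rose.",
]
scores = [1.0, 2.0, 3.0]

source_rates = [extract_features(t)["source"] for t in texts]
r = pearson(source_rates, scores)
print(f"source-cue rate vs. score: r = {r:.2f}")
```

Relative frequencies are used rather than raw counts so that longer texts do not score higher merely because of their length. A correlation on three invented texts says nothing about real predictive power; the sketch only shows the shape of the pipeline.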

B05 provides machine learning models that make the linguistic features of multiple texts within the information landscape accessible to automated analysis. These models build on a linguistic analysis of COR at the level of fine-grained linguistic information units.

The A-projects provide texts and data on students’ performance from the longitudinal sample and in turn receive the results of B05’s linguistic analyses. Since linguistic features are by far the most detailed information units analyzed in the research unit, they are relevant for research in the other B-projects in terms of media and content properties (B04) and narrative and latent meaning structures (B06). The Multimodal Learning Data Science System of C08 is crucial for integrating all data from B05.