In the past, several approaches to rhetorical parsing of linearly organised texts were developed as applications of relational theories of discourse. These approaches are mainly based on lexical discourse markers and deal mostly with the analysis of newspaper texts. For discourse parsing of more complex text types, text type structure and logical document structure can play an additional role.
The first phase of the project focused on corpus-based research into features of the text type structure of scientific journal articles and on their representation in a text-technological semantic formalism. The second phase focuses on procedural aspects: a discourse parser will be developed for linearly organised texts that exhibit complex document structures. Besides traditional discourse markers, i.e. connectives and morphological and syntactic features, properties of document structure, thematic structure and text type structure will be described as (abstract) discourse markers and processed using text-technological methods.