Table 2 Features for EDTU identification

From: Building a Chinese discourse topic corpus with a micro-topic scheme based on theme-rheme theory

| Name | Description |
| --- | --- |
| POS_Pre_Word | Part-of-speech tag of the previous word |
| Rep_Pre_Word | String representation of the previous word |
| POS_Foll_Word | Part-of-speech tag of the following word |
| Rep_Foll_Word | String representation of the following word |
| Left_Phrase_Label | Phrase label of the comma's left sibling |
| Right_Phrase_Label | Phrase label of the comma's right sibling |
| Con_Phrase_Label | Conjunction of the phrase labels of the left and right siblings |
| Con_Family_Label | Conjunction of the ancestor labels and Con_Phrase_Label |
| Is_Sub_Conjunction | Is there a subordinating conjunction to the left of the comma? |
| Is_CoordIP | Is the parent of the comma a coordinating IP construction? |
| Is_Top_Child | Is the comma a top-level child? |
| Is_Top_CoordIP | Is the comma a top-level child whose parent is a coordinating IP construction? |
| Pun_Mark_Temp | Punctuation mark template of the sentence |
| Distance_Left_Right | Length difference between the left and right segments of the comma |
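
As an illustration of the word-level features in Table 2, the following minimal sketch computes a few of them for one comma in a sentence that has already been segmented and POS-tagged. It is not the authors' implementation. The function name `comma_features`, the `(word, tag)` token format, the `<BOS>`/`<EOS>` padding symbols, the use of the Chinese Treebank tag `CS` to detect a subordinating conjunction, and the choice to measure Distance_Left_Right as an absolute difference in token counts are all assumptions made for this example. The features that depend on a parse tree (the phrase labels and the IP tests) are left out.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, POS tag); hypothetical input format


def comma_features(tokens: List[Token], comma_index: int) -> Dict[str, object]:
    """Compute a subset of the Table 2 features for one comma (illustrative only)."""
    word, _ = tokens[comma_index]
    if word not in {",", "，"}:
        raise ValueError("comma_index must point at a comma token")

    left = tokens[:comma_index]        # segment before the comma
    right = tokens[comma_index + 1:]   # segment after the comma

    # Pad with boundary symbols when the comma is the first or last token.
    prev_word, prev_pos = left[-1] if left else ("<BOS>", "<BOS>")
    next_word, next_pos = right[0] if right else ("<EOS>", "<EOS>")

    return {
        "POS_Pre_Word": prev_pos,
        "Rep_Pre_Word": prev_word,
        "POS_Foll_Word": next_pos,
        "Rep_Foll_Word": next_word,
        # Assumption: "CS" marks a subordinating conjunction in the tag set.
        "Is_Sub_Conjunction": any(pos == "CS" for _, pos in left),
        # Assumption: length is counted in tokens; the absolute difference is used.
        "Distance_Left_Right": abs(len(left) - len(right)),
    }


if __name__ == "__main__":
    sentence = [
        ("因为", "CS"), ("下雨", "VV"), ("，", "PU"),
        ("我们", "PN"), ("没有", "AD"), ("去", "VV"), ("公园", "NN"), ("。", "PU"),
    ]
    for name, value in comma_features(sentence, 2).items():
        print(name, value)
```

Returning a flat dictionary keyed by the feature names from the table keeps the output easy to pass to a feature vectorizer; the tree-based features could be added to the same dictionary once a constituency parse is available.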