The Penn Treebank Tagset
Like the SUSANNE Corpus Tagset, the Penn Treebank Tagset consists of two main parts: the syntactic tagset, which labels phrases and clauses, and the POS (part-of-speech) tagset, which labels individual words and punctuation marks.
The syntactic tagset
| ADJP | Adjective phrase |
| ADVP | Adverb phrase |
| NP | Noun phrase |
| PP | Prepositional phrase |
| S | Simple declarative clause |
| SBAR | Clause introduced by subordinating conjunction or 0 (see below) |
| SBARQ | Direct question introduced by wh-word or wh-phrase |
| SINV | Declarative sentence with subject-aux inversion |
| SQ | Subconstituent of SBARQ excluding wh-word or wh-phrase |
| VP | Verb phrase |
| WHADVP | Wh-adverb phrase |
| WHNP | Wh-noun phrase |
| WHPP | Wh-prepositional phrase |
| X | Constituent of unknown or uncertain category |
| Null elements | |
| * | “Understood” subject of infinitive or imperative |
| 0 | Zero variant of that in subordinate clauses |
| T | Trace—marks position where moved wh-constituent is interpreted |
| NIL | Marks position where preposition is interpreted in pied-piping contexts |
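To illustrate how the syntactic tags are used, here is a minimal sketch that reads a parse written in the bracketed notation commonly used for Penn Treebank trees and collects the phrase-level labels it contains. The example sentence and the helper names `parse_brackets` and `phrase_labels` are made up for illustration and are not part of any Treebank tool; the sketch also assumes that the input contains no literal bracket tokens.

```python
# Phrase and clause labels from the syntactic tagset listed above.
SYNTACTIC_TAGS = {
    "ADJP", "ADVP", "NP", "PP", "S", "SBAR", "SBARQ",
    "SINV", "SQ", "VP", "WHADVP", "WHNP", "WHPP", "X",
}


def parse_brackets(text):
    """Parse a bracketed tree into nested (label, children) tuples.

    Hypothetical helper: assumes well-formed input with balanced brackets.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1                      # skip "("
        label = tokens[pos]           # the tag that follows the bracket
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse_node())   # nested constituent
            else:
                children.append(tokens[pos])    # a word (leaf)
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return parse_node()


def phrase_labels(tree, known):
    """Return, in top-down order, every label in the tree that is in `known`."""
    label, children = tree
    found = [label] if label in known else []
    for child in children:
        if isinstance(child, tuple):
            found.extend(phrase_labels(child, known))
    return found


# Invented example sentence, annotated by hand for this sketch.
example = "(S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))))"
tree = parse_brackets(example)
labels = phrase_labels(tree, SYNTACTIC_TAGS)
print(labels)
```

Word-level tags such as DT or NN are skipped here because they belong to the POS tagset, not the syntactic one.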
The POS tagset
| CC | Coordinating Conjunction |
| CD | Cardinal Number |
| DT | Determiner |
| EX | Existential there |
| FW | Foreign word |
| IN | Preposition/subordinating conjunction |
| JJ | Adjective |
| JJR | Adjective, comparative |
| JJS | Adjective, superlative |
| LS | List item marker |
| MD | Modal |
| NN | Noun, singular or mass |
| NNS | Noun, plural |
| NNP | Proper noun, singular |
| NNPS | Proper noun, plural |
| PDT | Predeterminer |
| POS | Possessive ending |
| PRP | Personal pronoun |
| PRP$ | Possessive pronoun |
| RB | Adverb |
| RBR | Adverb, comparative |
| RBS | Adverb, superlative |
| RP | Particle |
| SYM | Symbol (mathematical or scientific) |
| TO | to |
| UH | Interjection |
| VB | Verb, base form |
| VBD | Verb, past tense |
| VBG | Verb, gerund/present participle |
| VBN | Verb, past participle |
| VBP | Verb, non-3rd person singular present |
| VBZ | Verb, 3rd person singular present |
| WDT | wh-determiner |
| WP | wh-pronoun |
| WP$ | Possessive wh-pronoun |
| WRB | wh-adverb |
| # | Pound sign |
| $ | Dollar sign |
| . | Sentence-final punctuation |
| , | Comma |
| : | Colon, semi-colon |
| ( | Left bracket character |
| ) | Right bracket character |
| " | Straight double quote |
| ` | Left open single quote |
| `` | Left open double quote |
| ' | Right closed single quote |
| '' | Right closed double quote |
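As a rough illustration of how the POS tags can be looked up in practice, the sketch below maps a few of the tags listed above to their descriptions and applies them to a tagged sentence. It assumes the common `word/TAG` convention for tagged text; the sentence, the partial dictionary, and the function name `describe` are invented for this example.

```python
# A small excerpt of the POS tagset above; extend it as needed.
POS_TAGS = {
    "DT": "Determiner",
    "JJ": "Adjective",
    "NN": "Noun, singular or mass",
    "NNS": "Noun, plural",
    "VBD": "Verb, past tense",
    "VBZ": "Verb, 3rd person singular present",
    ".": "Sentence-final punctuation",
}


def describe(tagged_sentence):
    """Return (word, tag, description) for each whitespace-separated token.

    Hypothetical helper: splits each token on its LAST slash, so a word
    that itself contains "/" keeps its slash. A token without any slash
    is treated as a bare tag with an empty word.
    """
    rows = []
    for token in tagged_sentence.split():
        word, _, tag = token.rpartition("/")
        rows.append((word, tag, POS_TAGS.get(tag, "unknown tag")))
    return rows


rows = describe("The/DT quick/JJ fox/NN jumps/VBZ ./.")
for word, tag, description in rows:
    print(f"{word}\t{tag}\t{description}")
```

Tags missing from the dictionary are reported as "unknown tag" so that an incomplete table does not raise an error.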
This list is taken from the HTML version of ‘Building a large annotated corpus of English: the Penn Treebank’ by Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini, which also contains further useful information about the Penn Treebank.
