Lemmatization and parsing with TACT preprocessing programs
Listed in Article
Version 1.0 - published on 13 Jun 2022 doi: 10.25547/J8XF-NR20
Licensed under Creative Commons BY 4.0
Description
By its ideal definition, lemmatization is a process wherein the inflectional and variant forms of a word are reduced to their lemma: their base form, or dictionary look-up form. When one lemmatizes a text, one replaces each individual word in that text with its lemma; a text in English which has been lemmatized, then, would contain all forms of a verb represented by its infinitive, all forms of a noun by its nominative singular, and so forth.[1]
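The replacement step described above can be illustrated with a minimal sketch. It assumes a small hand-made lookup table (`LEMMA_TABLE`) and a hypothetical helper (`lemmatize`), neither of which comes from the article, and it does not reproduce how the TACT preprocessing programs work; it only shows the idea of mapping each word form to its dictionary look-up form.

```python
import re

# Hypothetical, hand-made lookup table: inflected or variant form -> lemma.
# A real lemmatizer would rely on a full dictionary or on morphological rules.
LEMMA_TABLE = {
    "is": "be", "was": "be", "were": "be", "been": "be",
    "went": "go", "goes": "go", "gone": "go", "goeth": "go",
    "children": "child", "mice": "mouse",
}

def lemmatize(text, table=LEMMA_TABLE):
    """Return the words of text, each replaced by its lemma where one is known.

    Words missing from the table are passed through unchanged (lowercased).
    """
    words = re.findall(r"[A-Za-z]+", text.lower())
    return [table.get(word, word) for word in words]

print(lemmatize("The children went home"))
# prints ['the', 'child', 'go', 'home']
```

A table lookup like this cannot separate homographs (for example, "saw" as a noun and "saw" as a past tense of "see"), which is one reason lemmatization is usually discussed together with parsing.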
Notes
Original publication information:
Originally published in Digital Studies/le Champ Numérique (1)
Year: 1996
DOI: http://doi.org/10.16995/dscn.233
License: (CC BY 4.0)
Original citation:
Siemens, R. G. (1996). Lemmatization and parsing with TACT preprocessing programs. Digital Studies/le Champ Numérique, (1). DOI: http://doi.org/10.16995/dscn.233