Extracting Significant Phrases from Text
229. Y. J. Lui, R. P. Brent and A. Calinescu,
Extracting significant phrases from text,
Proc. 21st International Conference on Advanced Information
Networking and Applications (AINA 2007),
Workshop Proceedings (AINAW07),
Vol. 1, May 2007, 361-366.
Preprint: pdf (87K).
Paper: available from the IEEE CS Digital Library.
Abstract
Prospective readers can quickly determine whether a document is relevant to
their information need if the significant phrases (or keyphrases)
in this document are provided. Although keyphrases are useful, not many
documents have keyphrases assigned to them, and manually assigning
keyphrases to existing documents is costly. Therefore, there is a need for
automatic keyphrase extraction. This paper introduces a new
domain-independent keyphrase extraction algorithm. The algorithm approaches
the problem of keyphrase extraction as a classification task, and uses a
combination of statistical and computational linguistics techniques, a new
set of attributes, and a new learning method, to distinguish keyphrases from
non-keyphrases. Experiments indicate that this algorithm performs at least
as well as other keyphrase extraction tools and that it significantly
outperforms Microsoft Word 2000's AutoSummarize feature.
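
Illustrative sketch (not from the paper). The idea of treating keyphrase
extraction as a classification task can be shown with a minimal Python
example. The abstract does not specify the new attributes or the new learning
method, so everything here is an assumption made for brevity: the two features
(relative frequency and relative first position), the nearest-centroid
classifier, the tiny stopword list, and all function names are hypothetical
stand-ins.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the paper's).
STOPWORDS = {"a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
             "in", "is", "it", "of", "on", "or", "that", "the", "this", "to",
             "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidates(tokens, max_len=3):
    """Yield (phrase, start_index) for n-grams of up to max_len words
    that neither begin nor end with a stopword."""
    for i in range(len(tokens)):
        for n in range(1, max_len + 1):
            gram = tokens[i:i + n]
            if len(gram) < n:  # ran past the end of the document
                break
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            yield " ".join(gram), i


def features(text):
    """Map each candidate phrase to
    (relative frequency, relative position of first occurrence)."""
    tokens = tokenize(text)
    counts, first = Counter(), {}
    for phrase, i in candidates(tokens):
        counts[phrase] += 1
        first.setdefault(phrase, i)
    total = len(tokens) or 1
    return {p: (counts[p] / total, first[p] / total) for p in counts}


def centroid(vectors):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return tuple(sum(v[k] for v in vectors) / n for k in range(len(vectors[0])))


def train(docs):
    """docs: list of (text, set of gold keyphrases).
    Returns (keyphrase centroid, non-keyphrase centroid).
    Assumes the training data yields at least one example of each class."""
    pos, neg = [], []
    for text, gold in docs:
        for phrase, vec in features(text).items():
            (pos if phrase in gold else neg).append(vec)
    return centroid(pos), centroid(neg)


def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def extract(text, model, top_k=5):
    """Return up to top_k candidates that lie closer to the keyphrase
    centroid than to the non-keyphrase centroid, best margin first."""
    pos_c, neg_c = model
    scored = []
    for phrase, vec in features(text).items():
        margin = sqdist(vec, neg_c) - sqdist(vec, pos_c)
        if margin > 0:
            scored.append((margin, phrase))
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [p for _, p in scored[:top_k]]
```

A model would be trained on documents that already have author-assigned
keyphrases, for example `model = train([(text, {"keyphrase extraction"})])`,
and then applied to unseen text with `extract(new_text, model)`. A practical
system would use a richer attribute set, feature scaling, and a stronger
learner than this sketch.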