Publication Details
Signature Extraction Methods for Text
Zendulka Jaroslav, Doc. Ing., CSc. (DCSE FEECS BUT)
Zezula Pavel, prof. Ing., CSc. (FI MUNI)
compression-based method, analytic model, two-level organization of signatures
Signature files seem to be a promising method for text and document retrieval. In this method, the documents are stored in one file (the "text file"), while abstractions of the documents ("signatures") are stored in another file (the "signature file"). To resolve a query, the signature file is searched first, and many non-qualifying documents are rejected immediately. In this paper, we present signature extraction methods for text and study their performance as a function of the query weight. We derive approximate formulas for estimating the query time of several existing methods. We also propose a new two-level organization method. All analytic formulas are verified by experiments.
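The filtering step described in the abstract can be illustrated with a minimal Python sketch. It assumes superimposed coding, a common textbook way of building signatures, and is not necessarily one of the methods studied in the paper. The names `word_signature`, `text_signature`, and `search`, as well as the values of `SIG_BITS` and `BITS_PER_WORD`, are hypothetical and chosen only for illustration.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (assumed value, for illustration)
BITS_PER_WORD = 3   # bit positions hashed per word (assumed value)


def word_signature(word: str) -> int:
    """Map a word to a bit pattern with at most BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode("utf-8")).digest()
        pos = int.from_bytes(digest[:4], "big") % SIG_BITS
        sig |= 1 << pos
    return sig


def text_signature(text: str) -> int:
    """Superimpose (bitwise OR) the signatures of all words in the text."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word)
    return sig


def search(query: str, documents: list[str], signatures: list[int]) -> list[int]:
    """Return indices of documents that contain every query word."""
    q = text_signature(query)
    # Step 1: scan the "signature file"; a document can qualify only if
    # its signature has every bit of the query signature set.
    candidates = [i for i, s in enumerate(signatures) if s & q == q]
    # Step 2: read the "text file" for the surviving candidates only,
    # to discard false drops (signature matches that are not real matches).
    terms = query.lower().split()
    return [
        i for i in candidates
        if all(t in documents[i].lower().split() for t in terms)
    ]
```

In this sketch the signatures would be computed once, for example with `signatures = [text_signature(d) for d in documents]`, and stored separately from the texts. The number of 1-bits in the query signature `q` plays the role of the query weight: the more bits are set, the more document signatures fail the test in step 1.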
@INPROCEEDINGS{FITPUB6660,
   author = "Zden\v{e}k Kor\v{c}\'{a}k and Jaroslav Zendulka and Pavel Zezula",
   title = "Signature Extraction Methods for Text",
   pages = "51--58",
   booktitle = "Proceedings of the Conference Modelling and Simulation of systems",
   volume = 2,
   year = 1998,
   location = "Sv.Host\'{y}n - Byst\v{r}ice pod Host\'{y}nem, CZ",
   ISBN = "80-85988-24-0",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/6660"
}