IBM Systems Journal  
Volume 41, Number 3, 2002
Artificial Intelligence

Automated generation of model cases for help-desk applications - References

by S. M. Weiss and C. V. Apte

Cited references and notes

  1. D. Radev, H. Jing, and M. Budzikowska, “Summarization of Multiple Documents: Clustering, Sentence Extraction, and Evaluation,” Proceedings, ANLP/NAACL Workshop on Automatic Summarization, Seattle, WA (April 30, 2000).
  2. J. Hartigan and M. Wong, “A K-Means Clustering Algorithm,” Applied Statistics 28, 100–108 (1979).
  3. A. McCallum, K. Nigam, and L. Ungar, “Efficient Clustering of High-Dimensional Data Sets with Application to Reference Matching,” Proceedings, 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA (August 20–23, 2000).
  4. A. Griffiths, H. Luckhurst, and P. Willett, “Using Interdocument Similarity Information in Document Retrieval Systems,” K. Sparck Jones and P. Willett, Editors, Readings in Information Retrieval, Morgan Kaufmann Publishers, San Francisco, CA (1997), pp. 365–373.
  5. B. Larsen and C. Aone, “Fast and Effective Text Mining Using Linear-Time Document Clustering,” Proceedings, 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA (August 15–18, 1999), pp. 16–22.
  6. G. Salton and C. Buckley, “Term-Weighting Approaches in Automatic Text Retrieval,” K. Sparck Jones and P. Willett, Editors, Readings in Information Retrieval, Morgan Kaufmann Publishers, San Francisco, CA (1997), pp. 323–328.
  7. D. Cutting, D. Karger, J. Pedersen, and J. Tukey, “Scatter/Gather: A Cluster-Based Approach to Browsing Large Document Collections,” Proceedings, 15th ACM SIGIR International Conference on Research and Development in Information Retrieval, Copenhagen, Denmark (June 21–24, 1992).
  8. E. Voorhees, “Implementing Agglomerative Hierarchical Clustering Algorithms for Use in Document Retrieval,” Information Processing and Management 22, 465–476 (1986).
  9. A “stopword” is a word that is ignored because it has little statistical predictive value; for example, “it” and “we” are common stopwords.
  10. S. Weiss, B. White, C. Apte, and F. Damerau, “Lightweight Document Matching for Help-Desk Applications,” IEEE Intelligent Systems 15, No. 2, 57–61 (2000).
  11. “Nearest neighbor” refers to a standard method that compares a new vector to a stored collection of vectors and finds the stored vector most similar to it.
  12. O. Zamir, O. Etzioni, O. Madani, and R. Karp, “Fast and Intuitive Clustering of Web Documents,” Proceedings, 3rd International Conference on Knowledge Discovery and Data Mining, Newport Beach, CA (August 14–17, 1997).