Volume 43, Number 1, 2004
Utility Computing
How to build a WebFountain: An architecture for very large-scale text analytics - References
by
D. Gruhl, L. Chavet, D. Gibson, J. Meyer, P. Pattanayak, A. Tomkins, and J. Zien
Cited references and notes
-
Google, http://www.google.com.
-
AltaVista, http://www.altavista.com.
-
T. Sterling, J. Salmon, D. J. Becker, and D. F. Savarese, How to Build a Beowulf, The MIT Press, Cambridge, MA (1999).
-
A. Broder and M. R. Henzinger, “Algorithmic Aspects of Information Retrieval on the Web,” in Handbook of Massive Data Sets, J. Abello, P. M. Pardalos, and M. G. C. Resende, Editors, Kluwer Academic Publishers, Boston, forthcoming.
-
S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. Wiener, “The Lorel Query Language for Semistructured Data,” International Journal of Digital Libraries 1, No. 1, 68–88 (1997).
-
J. M. Hellerstein, M. J. Franklin, S. Chandrasekaran, A. Deshpande, K. Hildrum, S. Madden, V. Raman, and M. A. Shah, “Adaptive Query Processing: Technology in Evolution,” IEEE Data Engineering Bulletin 23, No. 2, 7–18 (June 2000).
-
G. Arocena, A. Mendelzon, and G. Mihaila, “Applications of a Web Query Language,” Proceedings of the 6th International World Wide Web Conference (WWW6), Santa Clara, CA (1997), pp. 1305–1315.
-
E. Spertus and L. A. Stein, “Squeal: A Structured Query Language for the Web,” Proceedings of the 9th International World Wide Web Conference (WWW9) (2000), pp. 95–103.
-
G. Mecca, A. Mendelzon, and P. Merialdo, “Efficient Queries over Web Views,” Proceedings of the 6th International Conference on Extending Database Technology (EDBT), Valencia, Spain, Lecture Notes in Computer Science 1377, Springer-Verlag (1998), pp. 72–86.
-
The Internet Archive, http://www.archive.org.
-
J. Hirai, S. Raghavan, A. Paepcke, and H. Garcia-Molina, “WebBase: A Repository of Web Pages,” Proceedings of the 9th International World Wide Web Conference (WWW9) (2000), pp. 277–293.
-
Web-in-a-Box, Web Archeology, Hewlett Packard SRC Classic Lab, Palo Alto, CA, http://research.compaq.com/SRC/WebArcheology/wib.html.
-
I. Foster, C. Kesselman, and S. Tuecke, “The Anatomy of the Grid: Enabling Scalable Virtual Organizations,” Lecture Notes in Computer Science 2150 (2001).
-
Semantic Web Activity: Advanced Development, Technology and Society Domain, W3C, http://www.w3.org/2000/01/sw/.
-
O. Lassila and R. R. Swick, Resource Description Framework (RDF) Model and Syntax Specification, W3C Recommendation, http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/ (February 1999).
-
D. L. McGuinness and E. F. van Harmelen, OWL Web Ontology Language Overview, W3C Candidate Recommendation, http://www.w3.org/TR/owl-features/ (August 18, 2003).
-
The DARPA Agent Markup Language (DAML) Homepage, http://www.daml.org.
-
A. Wolfe, “IBM Sets Its Sights on Autonomic Computing,” News Analysis, IEEE Spectrum (January 2002).
-
P. Horn, Autonomic Computing: IBM's Perspective on the State of Information Technology, IBM Corporation (October 15, 2001), http://www.research.ibm.com/autonomic/manifesto/autonomic_computing.pdf.
-
A. Newell, “Some Problems of the Basic Organization in Problem-Solving Programs,” Proceedings of the Second Conference on Self-Organizing Systems, Washington, DC (1962), pp. 393–423.
-
L. D. Erman, F. Hayes-Roth, V. R. Lesser, and D. R. Reddy, “The Hearsay Speech Understanding System: Integrating Knowledge to Resolve Uncertainty,” Computing Surveys 12, No. 2, 213–253 (1980).
-
R. Agrawal, R. Bayardo, D. Gruhl, and S. Papadimitriou, “Vinci: A Service-Oriented Architecture for Rapid Development of Web Applications,” Proceedings of the Tenth International World Wide Web Conference (WWW10), Hong Kong, China (2001), pp. 355–365.
-
F. Yergeau, UTF-8, A Transformation Format of ISO 10646, Internet Engineering Task Force (January 1998), http://www.ietf.org/rfc/rfc2279.txt.
-
M. Minsky, A Framework for Representing Knowledge, Technical Report, MIT-AI Laboratory Memo 306, Massachusetts Institute of Technology Artificial Intelligence Laboratory, Cambridge, MA (June 1974).
-
D. A. Patterson, G. Gibson, and R. H. Katz, “A Case for Redundant Arrays of Inexpensive Disks (RAID),” Proceedings of the ACM Conference on Management of Data (SIGMOD) (June 1988), pp. 109–116.
-
M. Seltzer, P. Chen, and J. Ousterhout, “Disk Scheduling Revisited,” Proceedings of the USENIX Winter 1990 Technical Conference, USENIX Association, Berkeley, CA (1990), pp. 313–324.
-
G. H. Sockut and B. R. Iyer, “A Survey of Online Reorganization in IBM Products and Research,” IEEE Bulletin of the Technical Committee on Data Engineering 19, No. 2, 4–11 (1996).
-
D. Gruhl, The Search for Meaning in Large Text Databases, Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA (2000).
-
User queries may be returned in rank orders that are appropriate for viewing, but long-running queries that are processed to completion are returned in UEID order.
-
C. Clarke, G. Cormack, and F. Burkowski, “Shortest Substring Ranking (MultiText Experiments for TREC-4),” Proceedings of the Fourth Text Retrieval Conference (November 1995).
-
S. Chakrabarti, B. Dom, and P. Indyk, “Enhanced Hypertext Classification Using Hyper-Links,” ACM SIGMOD International Conference on Management of Data (1998), pp. 307–318.
-
K. Bharat, A. Broder, M. Henzinger, P. Kumar, and S. Venkatasubramanian, “The Connectivity Server: Fast Access to Linkage Information on the Web,” Proceedings of the 7th International World Wide Web Conference (April 1998), pp. 14–18.
-
Simple Object Access Protocol (SOAP) 1.1, W3C, http://www.w3.org/TR/SOAP/.
-
Web Services Description Language (WSDL), W3C, http://www.w3.org/TR/wsdl.
-
Since each node represents less than half a percent of our data, having one or even two nodes down does not materially affect the quality of queries that develop an aggregate statistical understanding over a broad data set.
-
For information on the particular set of mining and applications, please contact the WebFountain team directly (see reference 37, the final entry in this list).
-
WebFountain Overview, IBM Corporation, Almaden Research Center, http://www.almaden.ibm.com/webfountain.