ERIC Number: ED425727
Record Type: Non-Journal
Publication Date: 1998
Pages: 9
Abstractor: N/A
ISBN: N/A
ISSN: N/A
EISSN: N/A
Undergraduate Data Mining on the World Wide Web.
Scime, Anthony
Currently available World Wide Web search engines determine a site's qualification as a response to a search request by matching keywords in the request to keywords representing the site. The returned sites are given a score and ranked according to the match on keywords. Many of these retrieved sites can be irrelevant to the user's true information needs. Undergraduate students with information retrieval and computer literacy skills should be able to search the Web to find and extract information relevant to a domain. A course in the fundamentals of information gathering from distributed heterogeneous sites can improve these skills. The course described in this paper examines methods, theories, and techniques and provides practice in information retrieval, categorization, and knowledge discovery from text and other unstructured data sources such as the Internet and World Wide Web. This paper reports on the results of the course and the experiences gained. Highlights include: course objectives; assignments; the research paper search architecture; outlining the research paper; Web data mining; Web site evaluation, including syntactic and semantic methods, as well as computing overall relevance using a repertory grid; design of a "data warehouse," i.e., a subject-specific relational database of highly ranked documents; knowledge discovery from the data warehouse; and writing the research paper. (Author/AEF)
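The keyword matching and ranking that the abstract attributes to search engines can be illustrated with a minimal sketch. It is not taken from the paper: the scoring rule (the fraction of query keywords found among a site's keywords), the function names, and the sample sites are all assumptions chosen for illustration.

```python
def keyword_score(query, site_keywords):
    """Return the fraction of query keywords found among a site's keywords.

    This scoring rule is an assumption for illustration, not the method
    of any particular search engine or of the paper.
    """
    query_terms = {term.lower() for term in query.split()}
    if not query_terms:
        return 0.0
    site_terms = {keyword.lower() for keyword in site_keywords}
    return len(query_terms & site_terms) / len(query_terms)


def rank_sites(query, sites):
    """Score every site against the query and sort by descending score."""
    scored = [(name, keyword_score(query, keywords))
              for name, keywords in sites.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical sites, each represented only by a keyword list.
sites = {
    "mining-equipment.example": ["mining", "equipment", "coal"],
    "kdd-course.example": ["data", "mining", "knowledge", "discovery"],
    "cooking.example": ["recipes", "food"],
}

ranking = rank_sites("data mining", sites)
for name, score in ranking:
    print(f"{score:.2f}  {name}")
```

In this toy data the mining equipment site still receives a nonzero score for the query "data mining" because it shares one keyword, which is the kind of irrelevant match the abstract describes.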
Publication Type: Reports - Descriptive; Speeches/Meeting Papers
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A