Title: | DSM-TKP: Mining Top-K Path traversal patterns over Web click-streams |
Authors: | Li, HF; Lee, SY; Shan, MK; Department of Computer Science |
Issue Date: | 2005 |
Abstract: | Online, single-pass mining of Web click-streams poses some interesting computational issues, such as the unbounded length of streaming data, a possibly very fast arrival rate, and just one scan over previously arrived click-sequences. In this paper, we propose a new, single-pass algorithm, called DSM-TKP (Data Stream Mining for Top-K Path traversal patterns), for mining top-k path traversal patterns, where k is the desired number of path traversal patterns to be mined. An effective summary data structure called TKP-forest (Top-K Path forest) is used to maintain the essential information about the top-k path traversal patterns of the click-stream seen so far. Experimental studies show that the DSM-TKP algorithm maintains stable memory usage and makes only one pass over the streaming data. |
URI: | http://hdl.handle.net/11536/18054 |
ISBN: | 0-7695-2415-X |
Journal: | 2005 IEEE/WIC/ACM International Conference on Web Intelligence, Proceedings |
Begin Page: | 326 |
End Page: | 329 |
Appears in Collections: | Conferences Paper |