Title: DSM-FI: an efficient algorithm for mining frequent itemsets in data streams
Authors: Li, Hua-Fu
Shan, Man-Kwan
Lee, Suh-Yin
Department of Computer Science
Keywords: Data mining; Data streams; Frequent itemsets; Single-pass algorithm; Landmark window
Issue Date: 1-Oct-2008
Abstract: Online mining of data streams is an important data mining problem with broad applications. It is also a difficult one because of the inherent characteristics of streaming data. In this paper, we propose a new single-pass algorithm, DSM-FI (data stream mining for frequent itemsets), for online incremental mining of frequent itemsets over a continuous stream of online transactions. The algorithm projects each transaction of the stream into a set of sub-transactions and inserts them into a new in-memory summary data structure, the SFI-forest (summary frequent itemset forest), which maintains the set of all frequent itemsets embedded in the transaction data stream generated so far. The set of all frequent itemsets is then determined from the current SFI-forest. Theoretical analysis and experimental studies show that DSM-FI uses stable memory, makes only one pass over an online transactional data stream, and outperforms existing one-pass algorithms for mining frequent itemsets.
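The project-and-insert step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the projection rule or the layout of the SFI-forest, so the following are assumptions made purely for illustration: each sub-transaction is taken to be a suffix of the sorted transaction, the forest is a set of counting prefix trees keyed by first item, and the names `project`, `SummaryForest` and `support` are hypothetical. The sketch also omits any pruning of infrequent nodes, so it does not reproduce the stable-memory behaviour the paper reports.

```python
from typing import Dict, Iterable, List, Tuple


class Node:
    """Prefix-tree node; count = number of sub-transactions passing through it."""

    def __init__(self) -> None:
        self.count = 0
        self.children: Dict[str, "Node"] = {}


def project(transaction: Iterable[str]) -> List[Tuple[str, ...]]:
    """Assumed projection rule: every suffix of the sorted, de-duplicated transaction."""
    items = tuple(sorted(set(transaction)))
    return [items[i:] for i in range(len(items))]


def _count(node: Node, target: Tuple[str, ...], matched: int) -> int:
    """Sum counts of nodes labelled target[-1] whose path contains all of target."""
    if matched == len(target):
        return node.count
    need = target[matched]
    total = 0
    for item, child in node.children.items():
        if item == need:
            total += _count(child, target, matched + 1)
        elif item < need:
            # Paths are sorted, so `need` may still appear deeper on this branch.
            total += _count(child, target, matched)
        # item > need: `need` can no longer occur below this child.
    return total


class SummaryForest:
    """Hypothetical stand-in for an in-memory summary forest (no pruning)."""

    def __init__(self) -> None:
        self.roots: Dict[str, Node] = {}
        self.n_transactions = 0

    def insert(self, transaction: Iterable[str]) -> None:
        """Single pass: project one transaction and insert each sub-transaction."""
        self.n_transactions += 1
        for sub in project(transaction):
            node = self.roots.setdefault(sub[0], Node())
            node.count += 1
            for item in sub[1:]:
                node = node.children.setdefault(item, Node())
                node.count += 1

    def support(self, itemset: Iterable[str]) -> int:
        """Number of inserted transactions that contain every item of `itemset`."""
        target = tuple(sorted(set(itemset)))
        if not target:
            return self.n_transactions
        root = self.roots.get(target[0])
        return 0 if root is None else _count(root, target, 1)


forest = SummaryForest()
for t in (["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]):
    forest.insert(t)
```

Under these assumptions, an itemset would be reported as frequent when `forest.support(itemset) / forest.n_transactions` meets a user-chosen minimum support threshold. Each transaction is read once and never revisited, which is the single-pass property the abstract refers to.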
URI: http://dx.doi.org/10.1007/s10115-007-0112-4
http://hdl.handle.net/11536/8314
ISSN: 0219-1377
DOI: 10.1007/s10115-007-0112-4
Journal: KNOWLEDGE AND INFORMATION SYSTEMS
Volume: 17
Issue: 1
Begin Page: 79
End Page: 97
Appears in Collections: Articles


