{"title":"Approximate Frequent Pattern Discovery Over Data Stream","authors":"Kittisak Kerdprasop, Nittaya Kerdprasop","volume":11,"journal":"International Journal of Computer and Information Engineering","pagesStart":3439,"pagesEnd":3444,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/9297","abstract":"Frequent pattern discovery over data stream is a hard\nproblem because a continuously generated nature of stream does not\nallow a revisit on each data element. Furthermore, pattern discovery\nprocess must be fast to produce timely results. Based on these\nrequirements, we propose an approximate approach to tackle the\nproblem of discovering frequent patterns over continuous stream.\nOur approximation algorithm is intended to be applied to process a\nstream prior to the pattern discovery process. The results of\napproximate frequent pattern discovery have been reported in the\npaper.","references":"[1] C. Aggarwal, J. Han, J. Wang, and P. Yu, \"A framework for clustering\nevolving data streams,\" in Pro. Very Large Data Bases, 2003.\n[2] R. Agrawal, T. Imielinski, and A. Swami, \"Mining association rules\nbetween sets of items in large databases,\" in Proc. ACM SIGMOD Int.\nConf. Management of Data, 1993, pp. 207-216.\n[3] R. Agrawal and R. Srikant, \"Fast algorithm for mining association\nrules,\" in Proc. Int. Conf. Very Large Data Bases, 1994, pp. 487-499.\n[4] B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom, \"Model and\nissues in data stream systems,\" in Pro. ACM PODS, 2002.\n[5] J. Bilmes, \"A gentle tutorial of the EM algorithm and its application to\nparameter estimation for Gaussian mixture and hidden Markov models,\"\nDept. Electrical Engineering and Computer Science, University of\nCalifornia Berkeley, Technical Report TR-97-021, 1998.\n[6] G. Coremode and S. Muthukrishnan, \"What-s hot and what-s not:\nTracking most frequent items dynamically,\" in Pro. ACM PODS, 2003.\n[7] A. P. Dempster, N. M. Laird, and D. B. Rubin, \"Maximum likelihood\nfrom incomplete data via the EM algorithm,\" Journal of the Royal\nStatistical Society B, vol. 39, pp. 1-22, 1977.\n[8] P. Domingos and G. Hulten, \"A general method to scaling up machine\nlearning algorithms and its application to clustering,\" in Pro. ICML,\n2001.\n[9] M. A. T. Figueiredo and A. K. Jain, \"Unsupervised learning of finite\nmixture models,\" IEEE Trans. Pattern Analysis and Machine\nIntelligence, vol. 24, pp. 381-396, 2002.\n[10] M. Gaber, A. Zaslavsky, and S. Krishnaswamy, \"Mining data stream: A\nreview,\" SIGMOD Record, vol. 34, pp. 18-26, 2005.\n[11] A. Gilbert, Y. Kotidis, S. Muthukrishnan, and M. Strauss, \"One-pass\nwavelet decompositions of data streams,\" IEEE Trans. Knowledge and\nData Engineering, vol. 15, pp. 541-554, 2003.\n[12] J. M. Marin, K. Mengersen, and C. Robert, \"Bayesian modelling and\ninference on mixtures of distributions,\" in Handbook of Statistics, vol.\n25, Elsevier-Science, 2005.\n[13] S. Muthukrishnan, \"Data streams: Algorithms and applications,\" in\nProc. ACM-SIAM Symposium on Discrete Algorithm, 2003.\n[14] B. Resch, \"A tutorial for the course computational intelligence,\"\nAvailable: http:\/\/www.igi.tugraz.at\/lehre\/CI\n[15] J. von Neumann, \"Various techniques used in connection with random\ndigits,\" Applied Mathematics Series, vol. 12, National Bureau of\nStandards, Washington, D.C., 1951.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 11, 2007"}