Dennis Lin presents at the Institute for Computational Science's Fall Seminar Series
On November 1, 2004, Dennis Lin presented "Statistical Data Mining: A Global View and Some Research Potentials" at the Fall Seminar Series hosted by the Institute for Computational Science.
Discussion abstract: Statistical data mining is the exploration and analysis of a large data set by automatic or semiautomatic means with the purpose of discovering meaningful patterns. The patterns are then used for decision making via a process known as knowledge discovery. Much of exploratory data analysis and inferential statistics concerns the same problems. The chief distinction between statistical data mining and exploratory data analysis lies in the size and dimensionality of the data set involved. Data mining in general deals with much more massive data sets, for which highly interactive analysis is not fully feasible. In this talk, I will begin by attempting to provide a global view of statistical data mining and sharing some teaching experience. I will then discuss the scales of data set sizes and their limits of feasibility, and finally address some research problems in this area.