Gediz University, Computer Engineering Department, Spring Semester 2012, Tuesday
| Instructor: Halûk Gümüşkaya | Teaching Assistant: |
| Office: D107 | Office: |
| Office Hours: Mon, Wed, Thur: 13:00 - 13:45 | Office Hours: |
| Phone: 0232-355 0000 - 2305 | Phone: |
| e-mail: haluk.gumuskaya@gediz.edu.tr | e-mail: |
      
    Course Description 
		Introduction to data mining. Descriptions of data. Data preprocessing: data cleaning, integration and reduction. Data warehousing and online analytical processing. Association and correlation analysis. Classification: decision trees, naïve Bayesian classification, support vector machines, neural networks, rule-based classification, pattern-based classification, logistic regression. Cluster analysis. Outlier analysis.
     
    Prerequisite    
	Probability and Statistics 
      
    Lecture Schedule (tentative)
| Week | Date | Lec | Topics Covered |
| 1 | 14/02 |  | Introduction: An Overview of Data Mining |
| 2 | 21/02 |  | Getting to Know Your Data: Data Objects and Attribute Types, Basic Statistical Descriptions of Data, Data Visualization, Measuring Data Similarity and Dissimilarity |
| 3 | 28/02 |  | Data Preprocessing (1/2): Data Preprocessing: An Overview: Data Quality, Major Tasks in Data Preprocessing, Data Cleaning, Data Integration |
| 4 | 06/03 |  | Data Preprocessing (2/2): Data Reduction, Data Transformation and Data Discretization |
| 5 | 13/03 |  | Mining Frequent Patterns, Association and Correlations-Basic Concepts and Methods: Basic Concepts, Frequent Itemset Mining Methods, Which Patterns are Interesting? Pattern Evaluation Methods |
| 6 | 20/03 |  | Advanced Frequent Pattern Mining: Pattern Mining: A Road Map, Pattern Mining in Multi-Level Multi-Dimensional Space, Constraint-Based Frequent Pattern Mining, Mining High-Dimensional Data and Colossal Patterns, Mining Compressed or Approximate Patterns, Pattern Exploration and Application |
| 7 | 27/03 |  | Classification-Basic Concepts: Classification: Basic Concepts, Decision Tree Induction |
| 8 | 03/04 |  | Classification-Advanced Methods: Rule-Based Classification, Bayes Classification Methods, Neural Networks and Classification by Backpropagation, Support Vector Machines, Classification by Using Frequent Patterns, Lazy Learners (or Learning from Your Neighbors) |
| 9 | 10/04 |  | Classification-Additional Topics: Other Classification Methods, Model Evaluation and Selection, Techniques to Improve Classification Accuracy: Ensemble Methods |
| 10 | 17/04 |  | Cluster Analysis-Basic Concepts and Methods: Cluster Analysis: Basic Concepts, Partitioning Methods, Hierarchical Methods, Density-Based Methods, Grid-Based Methods, Evaluation of Clustering |
| 11 | 24/04 |  | Cluster Analysis-Advanced Methods: Probability Model-Based Clustering, Clustering High-Dimensional Data, Clustering Graphs and Network Data, Clustering with Constraints |
| 12 | 01/05 |  | Outlier Analysis: Outlier and Outlier Analysis, Outlier Detection Methods, Statistical Approaches, Proximity-Based Approaches, Clustering-Based Approaches, Classification Approaches, Mining Contextual and Collective Outliers, Outlier Detection in High-Dimensional Data |
| 13 | 08/05 |  | Project Demonstrations 1 |
| 14 | 15/05 |  | Project Demonstrations 2 |
     
    Textbooks  
        Main Textbook 
        Recommended 
		Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, I. H. Witten, E. Frank, M. A. Hall, Morgan Kaufmann, 629 pp, 2011.
		Introduction to Data Mining, P. Tan, M. Steinbach, V. Kumar, Addison-Wesley, 769 pp, 2006.
		Introduction to Machine Learning, 2nd Edition, Ethem Alpaydın, The MIT Press, 2010.

    Tools and Development Environments
		Weka, Data Mining Software in Java
		Matlab
     
    Grading
 10 % : ADC (Attendance, Discussion and Contribution)
 20 % : HW Assignments
 20 % : Midterm 1 (Classification)
 20 % : Midterm 2 (Clustering)
 30 % : Project
 