Manchester Est. 1824
 

COMP37332 - Data Integration and Analysis
Lab test 2: Data mining

 

Date, time and place


  • Tuesday 11 May 2010, 12-14 in the 3rd year lab (Kilburn).
 
 

Aims


  • Show understanding of the main data mining concepts
  • Show familiarity with related software (Weka)
  • Analyse the results obtained by data mining
 
 

Questions and Marking


  • This test is worth 10 marks overall; some parts will be assessed during the lab session.
  • There will be 5 questions:
    • 2 'theoretical' and
    • 3 practical (involving associations, classification and clustering in Weka).
  • You will have 90 minutes to complete all 5 questions.
  • This is an individual, closed-book assignment.
  • Slides from the lecture
 
 

Datasets


  • Dataset1 - vote.arff (used in questions 3 and 4)
  • Dataset2 - autos.arff (used in question 5)
 
 

Materials and preparation


  • Revise the lecture notes for data mining, association rules, data classification and clustering, and solve all tutorial questions in advance.
    • Data mining introduction slides
    • Association rule mining slides
    • Data classification slides
    • Data clustering slides
    • Tutorial sheet and guide answers
  • Familiarise yourself with Weka
    • Weka 3: data mining software
    • Weka tutorial
  • Complete all lab tutorials
    • Weka introductory lab/tutorial
      • data files: labor.arff; contact-lenses.arff
    • Association rule mining in Weka (lab/tutorial)
      • data files: contact-lenses.arff; zoo.arff
    • Classification lab/tutorial (Weka)
      • data files: labor.arff; glass.arff; glass-minusatt.arff; glass-withnoise.arff; vehicle.arff
    • Clustering lab/tutorial (Weka)
      • data files: weather.arff; iris.arff; bank.arff; flagdata.arff
    • Data Mining tutorial sheet
      • Tutorial Solution