How to calculate classification accuracy with confusion matrix?


I have train and test data. How do I calculate classification accuracy with a confusion matrix? Thanks

@attribute outlook {sunny, overcast, rainy}
@attribute temperature {hot, mild, cool}
@attribute humidity {high, normal}
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}


1   sunny       hot     high    FALSE   no
2   sunny       hot     high    TRUE    no
3   overcast    hot     high    FALSE   yes
4   rainy       mild    high    FALSE   yes
5   rainy       cool    normal  FALSE   yes
6   rainy       cool    normal  TRUE    no
7   sunny       cool    normal  FALSE   yes
8   rainy       mild    normal  FALSE   yes
9   sunny       mild    normal  TRUE    yes
10  overcast    mild    high    TRUE    yes
11  overcast    hot     normal  FALSE   yes
12  rainy       mild    high    TRUE    no


overcast    cool    normal  TRUE    yes
sunny       mild    high    FALSE   no

Rules found:

(humidity,normal), (windy,FALSE) -> (play,yes) [Support=0.33 , Confidence=1.00 , Correctly Classify= 4, 8, 9, 12]
(outlook,overcast) -> (play,yes) [Support=0.25 , Confidence=1.00 , Correctly Classify= 2, 11]
(outlook,rainy), (windy,FALSE) -> (play,yes) [Support=0.25 , Confidence=1.00 , Correctly Classify= 3]
(outlook,sunny), (temperature,hot) -> (play,no) [Support=0.17 , Confidence=1.00 , Correctly Classify= 0, 1]
(outlook,sunny), (humidity,normal) -> (play,yes) [Support=0.17 , Confidence=1.00 , Correctly Classify= 10]
(outlook,rainy), (windy,TRUE) -> (play,no) [Support=0.17 , Confidence=1.00 , Correctly Classify= 5, 13]
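
For reference, here is a minimal Python sketch of how the Support and Confidence figures above could be recomputed from the 12 training rows. It assumes support is the fraction of training rows that match both the rule's conditions and its class, and confidence is the fraction of rows matching the conditions that also have that class. The helper name `support_confidence` and the data layout are hypothetical, not part of the tool that produced the rules.

```python
# Training rows copied from the question: outlook, temperature, humidity, windy, play.
COLUMNS = ("outlook", "temperature", "humidity", "windy", "play")
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]
train = [dict(zip(COLUMNS, row)) for row in ROWS]


def support_confidence(data, conditions, label):
    """Return (support, confidence) for the rule `conditions -> play=label`.

    Assumed definitions (not confirmed by the source):
      support    = rows matching conditions AND label / all rows
      confidence = rows matching conditions AND label / rows matching conditions
    """
    covered = [r for r in data if all(r[a] == v for a, v in conditions.items())]
    correct = [r for r in covered if r["play"] == label]
    support = len(correct) / len(data)
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence


# First rule: (humidity,normal), (windy,FALSE) -> (play,yes)
s, c = support_confidence(train, {"humidity": "normal", "windy": "FALSE"}, "yes")
print(round(s, 2), round(c, 2))  # 0.33 1.0
```

Under these assumed definitions the first rule gives 4/12 = 0.33 support and 4/4 = 1.00 confidence, which agrees with the values listed for it.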

Xuan Dung

Posted 2014-10-25T23:46:14.493

Reputation: 153

How do I calculate classification accuracy with a confusion matrix? Thanks – Xuan Dung – 2014-10-27T11:14:43.897



A confusion matrix is a cross-tabulation of your predicted values against the true observed values, and (test) accuracy is the empirical rate of correct predictions. So in this case you'll need to

  1. Predict the 'play' attribute for your test set. (Currently none of your rules matches your second test case, so for the sake of argument let's assume your model would predict yes for the sunny example.)
  2. Keep track of your predictions in a confusion matrix. The top labels are the predicted values and the side labels are the observed values.
         ¦     ¦ yes ¦ no ¦
Observed ¦ yes ¦ 1   ¦ 0  ¦
         ¦ no  ¦ 1   ¦ 0  ¦

Here the 1 in the top row is your first test case (observed yes, predicted yes) and the 1 in the bottom row is the misclassified second test case (observed no, predicted yes).

  3. Calculate accuracy:

Accuracy = (# correct predictions)/(# total predictions) = 1 / 2 = 0.50.
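
The steps above can be sketched in a few lines of Python. This is only an illustration: the helper names `confusion_matrix` and `accuracy` are made up here, rows are observed classes and columns are predicted classes, and the "yes" prediction for the second test case is the same assumption made in step 1.

```python
from collections import Counter


def confusion_matrix(observed, predicted, labels):
    """Rows are observed classes, columns are predicted classes."""
    counts = Counter(zip(observed, predicted))
    return [[counts[(o, p)] for p in labels] for o in labels]


def accuracy(matrix):
    """Correct predictions sit on the diagonal of the matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


labels = ["yes", "no"]
observed = ["yes", "no"]    # true 'play' values of the two test cases
predicted = ["yes", "yes"]  # second prediction is assumed, as in step 1

matrix = confusion_matrix(observed, predicted, labels)
acc = accuracy(matrix)
print(matrix)  # [[1, 0], [1, 0]]
print(acc)     # 0.5
```

With more classes or more test cases the same two functions apply unchanged; only the `labels`, `observed`, and `predicted` lists grow.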


Posted 2014-10-25T23:46:14.493

Reputation: 321


This is how test objects are classified: "In classification, let R be the set of generated rules and T the training data. The basic idea of the proposed method is to choose a set of high confidence rules in R to cover T. In classifying a test object, the first rule in the set of rules that matches the test object condition classifies it. This process ensures that only the highest ranked rules classify test objects."

Suppose one test case is (overcast, cool, normal, TRUE). Look through the rules from top to bottom and check whether all of a rule's conditions are matched. The first rule, for example, tests the humidity and windy features. The windy value doesn't match, so the rule isn't matched. Move on to the next rule, and so on. In this case, rule 2 matches the test case, and the classification for the play variable is "yes". The second test case is misclassified.
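
A minimal Python sketch of this first-match procedure, using the six rules in the order they were listed. The function name `classify` and its `default` parameter are hypothetical: the quoted description does not say what to do when no rule matches, so the sketch simply returns `default` in that case.

```python
ATTRIBUTES = ("outlook", "temperature", "humidity", "windy")

# Rules in listed order: (conditions, predicted value of 'play').
RULES = [
    ({"humidity": "normal", "windy": "FALSE"}, "yes"),
    ({"outlook": "overcast"}, "yes"),
    ({"outlook": "rainy", "windy": "FALSE"}, "yes"),
    ({"outlook": "sunny", "temperature": "hot"}, "no"),
    ({"outlook": "sunny", "humidity": "normal"}, "yes"),
    ({"outlook": "rainy", "windy": "TRUE"}, "no"),
]


def classify(case, rules, default=None):
    """Return the class of the first rule whose conditions all match `case`.

    `default` is an assumption for the no-match situation; it is not part
    of the quoted method.
    """
    for conditions, label in rules:
        if all(case[attr] == value for attr, value in conditions.items()):
            return label
    return default


test_cases = [
    dict(zip(ATTRIBUTES, ("overcast", "cool", "normal", "TRUE"))),
    dict(zip(ATTRIBUTES, ("sunny", "mild", "high", "FALSE"))),
]
predictions = [classify(case, RULES) for case in test_cases]
print(predictions)  # ['yes', None]
```

The first test case fails rule 1 on windy and is caught by rule 2. The second test case matches none of the six rules, so its prediction depends entirely on the chosen default.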


Xuan Dung

Posted 2014-10-25T23:46:14.493

Reputation: 153