Thursday, June 10, 2010

Classification Error

An error is made when an observation $\bf{x}$ that actually belongs to class $\omega_i$ falls in $\Omega_i^c$, the complement of the decision region $\Omega_i$ assigned to $\omega_i$, and is therefore assigned to a wrong class.

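The probability of error is obtained by integrating, for each class, the weighted density $P({\bf x}|\omega_i)P(\omega_i)$ over the region $\Omega_i^c$ where that class is misclassified, and then summing over all classes.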

\[P(error)=\sum\limits_{i=1}^n\int\limits_{\Omega_i^c}P({\bf x}|\omega_i)P(\omega_i)d\bf x\]
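
For two classes, as in the example that follows, the sum has only two terms. A sketch of the expanded form, assuming $\Omega_1$ and $\Omega_2$ denote the regions assigned to $\omega_1$ and $\omega_2$ (so that $\Omega_1^c=\Omega_2$ and $\Omega_2^c=\Omega_1$):

\[P(error)=\int\limits_{\Omega_2}P({\bf x}|\omega_1)P(\omega_1)d{\bf x}+\int\limits_{\Omega_1}P({\bf x}|\omega_2)P(\omega_2)d{\bf x}\]
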
Using this example, we can estimate the classification error numerically as follows:

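xl=-8:.1:8;                  % evaluation grid with step 0.1
pc1=0.6*normpdf(xl,-1,1);    % P(x|w1)P(w1): prior 0.6, mean -1, std 1
pc2=0.4*normpdf(xl,1,1);     % P(x|w2)P(w2): prior 0.4, mean 1, std 1
% The decision boundary is where the two curves meet.
ind1=find(pc1>=pc2);         % grid points assigned to class 1
ind2=find(pc1<pc2);          % grid points assigned to class 2
pmis1=sum(pc1(ind2))*.1;     % class 1 mass in the class 2 region (Riemann sum)
pmis2=sum(pc2(ind1))*.1;     % class 2 mass in the class 1 region
error=pmis1+pmis2;           % note: this name shadows MATLAB's built-in error()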

>> error
error =
    0.1539
 
The error, 0.1539, is the shaded area.
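
As a sanity check on the Riemann sum: with the rule used above (assign each point to the class with the larger weighted density), the error is the area under the smaller of the two weighted curves. The following is a minimal sketch, assuming the same priors and Gaussians and a MATLAB release that provides integral. The names f and p_err are introduced here for illustration; p_err also avoids shadowing the built-in error function.

```matlab
% Pointwise minimum of the two weighted class-conditional densities.
f = @(x) min(0.6*normpdf(x,-1,1), 0.4*normpdf(x,1,1));
% Adaptive quadrature over the whole real line instead of a fixed 0.1 grid.
p_err = integral(f, -Inf, Inf);
fprintf('P(error) = %.4f\n', p_err);
```

This should come out at roughly 0.154, in line with the grid estimate.
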
For the two-class case, the decision boundary is the point where the two weighted densities are equal:

\[P(x|\omega_1)P(\omega_1)=P(x|\omega_2)P(\omega_2)\]
The following commands locate the decision boundary of the above problem by searching the grid between -2 and 2 for the point where the two curves nearly coincide.

>> l=find(xl>-2 & xl <2);
>> xpos=find(abs(pc1(l)-pc2(l))<0.001);
>> xl(l(xpos))

ans =
    0.2000

Thus, to the resolution of the 0.1 grid, the decision boundary is at approximately x=0.2.
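
The grid search can only resolve the boundary to the nearest 0.1. As a rough cross-check, assuming the same two Gaussians (means -1 and 1, unit standard deviation, priors 0.6 and 0.4), taking logs of the boundary condition gives

\[\ln\frac{0.6}{0.4}=\frac{(x+1)^2-(x-1)^2}{2}=2x\quad\Rightarrow\quad x=\tfrac{1}{2}\ln 1.5\approx 0.2027\]

The same value can be found numerically with fzero. The function handle g and the variable xb are names introduced here for illustration and are not part of the original script.

```matlab
% Difference of the two weighted densities; its root is the decision boundary.
g = @(x) 0.6*normpdf(x,-1,1) - 0.4*normpdf(x,1,1);
% g changes sign on [-2, 2], so fzero can bracket the root there.
xb = fzero(g, [-2 2]);
fprintf('boundary: %.4f (closed form: %.4f)\n', xb, log(1.5)/2);
```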
