Saturday, June 12, 2010

Receiver Operating Characteristic (ROC) Curve

The Receiver Operating Characteristic (ROC) curve is a characteristic of a classifier. It plots P(CC), the "true positive rate" of the class $\omega_1$ (target class), against P(FA), the "false positive rate" of the class $\omega_2$ (non-target class), as the decision threshold varies.

Each point on the curve also answers the question: "If we set the false alarm rate alpha to this value, what would P(correct) for the target class be?"
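
One way to write this down, as a sketch that assumes a scalar feature $x$ and a rule that assigns $x$ to $\omega_1$ whenever $x<t$ (the threshold $t$ is notation introduced here, chosen to match the comparison used in the code later in this post): each threshold gives one point on the curve,

\[\alpha=P(FA)=P(x<t\,|\,\omega_2),\;\;\; P(CC)=P(x<t\,|\,\omega_1)\]

Sweeping $t$ traces out the whole curve. Going the other way, fixing $\alpha$ pins down $t$, and $t$ in turn fixes P(CC).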

For example, the following code plots the theoretical ROC curve (black, dotted) and the ROC curve of a classifier built from sampled data (blue) for

\[p(x|\omega_1)\sim N(-1,1),\;\;\; p(\omega_1)=0.6\]
\[p(x|\omega_2)\sim N(1,1),\;\;\; p(\omega_2)=0.4\]
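
As a sketch of where the theoretical curve comes from, assume the rule "assign $x$ to $\omega_1$ when $x<t$" with a threshold $t$ (a symbol introduced here), and let $\Phi$ denote the standard normal CDF. The false alarm rate $\alpha$ determines $t$ through the $\omega_2$ density, and $t$ then gives P(CC) through the $\omega_1$ density:

\[\alpha=P(x<t\,|\,\omega_2)=\Phi(t-1)\;\;\Rightarrow\;\; t=1+\Phi^{-1}(\alpha)\]
\[P(CC)=P(x<t\,|\,\omega_1)=\Phi(t+1)=\Phi\left(\Phi^{-1}(\alpha)+2\right)\]

For instance, $\alpha=0.5$ gives $t=1$ and $P(CC)=\Phi(2)\approx 0.977$. The priors do not enter this curve, because both rates are conditioned on the class. This is the quantity that normcdf(norminv(pfa,1,1),-1,1) evaluates in the code.
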
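% draw n samples from the two-class mixture
n=1000;
u=rand(1,n);
x=zeros(size(u));
ind1 = u<= 0.60;        % class 1 (target), prior 0.6
ind2 = ~ind1;           % class 2 (non-target), prior 0.4
n1=sum(ind1);
n2=sum(ind2);
x(ind1)=randn(1,n1)-1;  % class 1 ~ N(-1,1)
x(ind2)=randn(1,n2)+1;  % class 2 ~ N(1,1)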
 

% P(false alarm), alpha, False Positive Rate
pfa = 0.01:.01:0.99; 
% theoretical line: threshold from the class 2 density for each P(FA),
% then P(x < threshold | class 1)
pcc=normcdf(norminv(pfa,1,1),-1,1);
plot(pfa,pcc,'k:'); hold on;

% building a classifier
% (setting false alarm value)
pfa2= zeros(size(pfa)); % observed false alarm rate
pcc = zeros(size(pfa)); % observed P(correct classification) for class 1

% estimate the class 2 model parameters from the samples
mu2=mean(x(ind2));
sd2=std(x(ind2));
% get decision boundary for each P(fa)
xi=norminv(pfa,mu2,sd2);
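for i=1:length(xi)
    c1=x < xi(i);               % classified as class 1 (below the i-th boundary)
    pcc(i)=sum(c1 & ind1)/n1;   % fraction of class 1 classified correctly
    pfa2(i)=sum(c1 & ind2)/n2;  % fraction of class 2 classified as class 1
end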

plot(pfa2,pcc,'b'); hold off;
xlabel('P(FA)');
ylabel('P(CC_{\omega_1})');

