Computing the standard normal cumulative distribution with R

I recently had to compute the Bayes error for a Gaussian model.
Let xi be a predictive variable following a Gaussian distribution with variance 1 and mean Ti in one class and -Ti in the other class.
Assuming the variables are independent and the two classes are equally likely, the Bayes error of the model is e = 1 - Φ(sqrt(sum(Ti^2))).
In R, the standard normal cumulative distribution function Φ is computed with pnorm().
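
For readers who want to see where the formula comes from, here is a short derivation sketch. It assumes independent unit-variance variables and equal class priors, and it writes T for the vector of the Ti and "+" / "-" for the two classes (notation introduced here for illustration only). Under those assumptions the Bayes rule predicts the sign of the projection of x onto T, and the error is the probability that this projection has the wrong sign:

```latex
\begin{align*}
\log\frac{p(x \mid +)}{p(x \mid -)}
  &= -\tfrac12\lVert x - T\rVert^2 + \tfrac12\lVert x + T\rVert^2
   = 2\,T^\top x, \\
T^\top x \mid +
  &\sim \mathcal{N}\!\left(\lVert T\rVert^2,\ \lVert T\rVert^2\right), \\
e = P\!\left(T^\top x < 0 \mid +\right)
  &= \Phi\!\left(\frac{0 - \lVert T\rVert^2}{\lVert T\rVert}\right)
   = \Phi\!\left(-\lVert T\rVert\right)
   = 1 - \Phi\!\left(\sqrt{\textstyle\sum_i T_i^2}\right).
\end{align*}
```

By symmetry the error is the same when x comes from the "-" class, so this is also the overall error.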

So, for instance, here’s the Bayes error with 10 variables that each have mean 0.5 in one class and -0.5 in the other class:
Bayes_error <- 1 - pnorm(sqrt(10 * 0.5^2))
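
A side note on numerical precision: when the class means are far apart, pnorm() returns a value so close to 1 that subtracting it from 1 gives exactly 0 in double precision. A minimal sketch of one way around this, using the lower.tail argument of pnorm() to get the upper tail directly. The 400-variable case and the variable names are illustrative assumptions:

```r
# Illustrative case (an assumption for this sketch): 400 variables with mean 0.5,
# so the argument of pnorm is sqrt(400 * 0.5^2) = 10
q <- sqrt(400 * 0.5^2)

naive <- 1 - pnorm(q)                       # pnorm(10) rounds to 1, so this is 0
upper_tail <- pnorm(q, lower.tail = FALSE)  # upper tail computed directly, stays positive

# For a moderate separation, such as the 10-variable case, both forms agree
q_small <- sqrt(10 * 0.5^2)
same <- all.equal(1 - pnorm(q_small), pnorm(q_small, lower.tail = FALSE))
```

For separations of the size used in this post the two forms are interchangeable; the difference only matters when the error is far below machine precision.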

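For something more complicated, such as 5 variables with means 0.5, 0.6, 0.7, 0.8, 0.9 in one class and -0.5, -0.6, -0.7, -0.8, -0.9 in the other, we could do something like:
mus <- c(0.5, 0.6, 0.7, 0.8, 0.9)
Bayes_error <- 1 - pnorm(sqrt(t(mus) %*% mus))  # matrix product, so the result is a 1x1 matrix

or, with an element-wise product instead of the matrix product (the result is then a plain number):
Bayes_error <- 1 - pnorm(sqrt(sum(mus * mus)))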
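
As a sanity check, the closed-form value can be compared with a simulated error rate. This is only a sketch under stated assumptions: equal class priors, independent unit-variance variables, and the rule "predict the sign of the projection onto the mean vector". The helpers bayes_error() and simulate_error() are hypothetical names introduced for this example.

```r
# Hypothetical helper: closed-form Bayes error for class means +mus and -mus
bayes_error <- function(mus) {
  pnorm(sqrt(sum(mus^2)), lower.tail = FALSE)  # same value as 1 - pnorm(...)
}

# Hypothetical helper: Monte Carlo estimate of the error of the sign rule,
# assuming equal priors and independent N(0, 1) noise on every variable
simulate_error <- function(mus, n = 200000, seed = 1) {
  set.seed(seed)
  p <- length(mus)
  labels <- sample(c(-1, 1), n, replace = TRUE)
  # Row i is one observation: class mean (labels[i] * mus) plus standard normal noise
  x <- outer(labels, mus) + matrix(rnorm(n * p), nrow = n)
  # Predict the class from the sign of the projection onto mus
  predicted <- ifelse(as.vector(x %*% mus) > 0, 1, -1)
  mean(predicted != labels)
}

mus <- c(0.5, 0.6, 0.7, 0.8, 0.9)
exact <- bayes_error(mus)
estimate <- simulate_error(mus)
print(c(exact = exact, estimate = estimate))
```

The two numbers should agree up to sampling noise, which shrinks as n grows.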

Edit: I just came across this post again through my statistics panel, and I really have no idea why I titled it “computing the std normal cumulative distrib”. It should rather be “computing the Bayes error”… I’m leaving it this way so as not to break the established pretty URL, though.

Posted in R (R-project), statistics.

