When I rerun the code, I get the following result:
...
hardham.res <- ifelse(hardham.spamtest > hardham.hamtest,
TRUE,
FALSE)
summary(hardham.res)
...
The result is:
Mode FALSE TRUE NA's
logical 243 6 0
I also tried:
hardham.res <- ifelse(hardham.spamtest == hardham.hamtest,
TRUE,
FALSE)
The result is:
Mode FALSE TRUE NA's
logical 21 228 0
That means most of the results are equal.
So I suspect the problem is floating-point underflow: the product of many small probabilities gets rounded to zero. I then changed the classify.email function to work with logarithms, as below:
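One way this many ties could arise is if both products are too small for a double and get stored as exactly zero. A minimal sketch with made-up numbers (the term counts `n.spam` and `n.ham` are hypothetical, not taken from the actual emails):

```r
# Hypothetical values, chosen only to illustrate the effect
prior <- 0.5
c <- 1e-6
n.spam <- 60   # terms not matched in the spam training set
n.ham  <- 70   # terms not matched in the ham training set

# Product form: both results are smaller than the smallest
# representable double, so both are stored as exactly 0
spam.score <- prior * c ^ n.spam
ham.score  <- prior * c ^ n.ham
print(spam.score == ham.score)

# Log form: the same two quantities remain distinguishable
spam.log <- log10(prior) + n.spam * log10(c)
ham.log  <- log10(prior) + n.ham * log10(c)
print(spam.log > ham.log)
```

If the real messages have a few dozen unmatched terms, the same thing would happen to them, but that is an assumption that needs checking against the data.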
classify.email <- function(path, training.df, prior = 0.5, c = 1e-6)
{
# Here, we use many of the support functions to get the
# email text data in a workable format
msg <- get.msg(path)
msg.tdm <- get.tdm(msg)
msg.freq <- rowSums(as.matrix(msg.tdm))
# Find intersections of words
msg.match <- intersect(names(msg.freq), training.df$term)
# Now, we just perform the naive Bayes calculation
if(length(msg.match) < 1)
{
return(log10(prior) + length(msg.freq) * log10(c)) # was: return(prior * c ^ (length(msg.freq)))
}
else
{
match.probs <- training.df$occurrence[match(msg.match, training.df$term)]
return(log10(prior) + sum(log10(match.probs)) + (length(msg.freq) - length(msg.match)) * log10(c)) # was: return(prior * prod(match.probs) * c ^ (length(msg.freq) - length(msg.match)))
}
}
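As a sanity check on the rewrite, the log form should equal log10 of the original product whenever the product does not underflow. A small sketch with hypothetical probabilities (`match.probs` and `n.unmatched` here are invented values, not output from the training data):

```r
# Hypothetical inputs for a tiny message
match.probs <- c(0.2, 0.05, 0.1)  # occurrence probabilities of matched terms
prior <- 0.5
c <- 1e-6
n.unmatched <- 2                  # terms with no match in the training set

# Original product form (safe here because the numbers are not tiny)
product.form <- prior * prod(match.probs) * c ^ n.unmatched

# Log form, as used in the modified classify.email
log.form <- log10(prior) + sum(log10(match.probs)) + n.unmatched * log10(c)

# The two should agree up to floating-point tolerance
print(isTRUE(all.equal(log10(product.form), log.form)))
```

Since log10 is monotonic, comparing the two log scores with `>` ranks them the same way as comparing the original products, only the values are now negative numbers rather than probabilities.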
This time I get the result:
hardham.res <- ifelse(hardham.spamtest > hardham.hamtest,
TRUE,
FALSE)
summary(hardham.res)
Mode FALSE TRUE NA's
logical 80 169 0
Now the result is just wrong: most of the hard ham is classified as spam.
Has anyone encountered the same problem?
Where did I make a mistake?