I have a neural net and want to measure its uncertainty on classification. Currently I just use the probability of the top class as a proxy; how would a Bayesian neural net change that?
This answer is a little longer than I originally expected, but I hope it is clear.

Bayesian NNs are NN ensembles

It is important to define exactly what it is we are trying to compute, so I am going to briefly go over the difference between Bayesian neural nets and others. We train classifiers because we are interested in the probability that an item indexed by $i$ belongs to a category $c$ given a model and a dataset on which we have "trained" the model:

$$P(y_i = c \mid x_i, \mathcal{D}) = \int P(y_i = c \mid x_i, \theta)\, p(\theta \mid \mathcal{D})\, \mathrm{d}\theta$$

where $\theta$ is a vector that contains the model's weights and $\mathcal{D} = \left\{x_i, y_i \right\}$ is the training data. In practice, inference for Bayesian neural networks consists in drawing samples from the posterior distribution $p(\theta \mid \mathcal{D})$ and averaging the predictions made with each sampled weight vector; in that sense a Bayesian neural net is an ensemble of networks, and for a given input it yields a histogram of class probabilities rather than a single number.

How confident can a model be?

Now we can go back to what I understood as your question: how do we estimate how "confused" our model is? I take it that you currently consider the value of the top-class probability as a proxy. Which of these two predicted histograms would you say is more certain?

(figure: first histogram, captioned "This one?")

(figure: second histogram, captioned "Or this one?")
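The sampling-and-averaging procedure described above can be sketched as follows. This is a minimal illustration, not real inference code: `predict_probs` and `posterior_sample` are hypothetical stand-ins for a network forward pass and an actual posterior sampler, and the three-class setup is an assumption.

```python
import math
import random

random.seed(0)

def predict_probs(x, theta):
    # Stand-in for one forward pass of the net with sampled weights theta:
    # produce logits, then a numerically stable softmax over 3 classes.
    logits = [theta[c] * x for c in range(3)]
    m = max(logits)
    exp = [math.exp(l - m) for l in logits]
    z = sum(exp)
    return [e / z for e in exp]

def posterior_sample():
    # Stand-in for a draw from the posterior p(theta | D):
    # weights scattered around some illustrative means.
    return [random.gauss(mu, 0.1) for mu in (1.0, 0.5, -0.5)]

def bayesian_predictive(x, n_samples=200):
    # Approximate p(y = c | x, D) by averaging the class probabilities
    # over n_samples posterior draws -- the "ensemble" view of a BNN.
    samples = [predict_probs(x, posterior_sample()) for _ in range(n_samples)]
    return [sum(p[c] for p in samples) / n_samples for c in range(3)]

probs = bayesian_predictive(x=2.0)
```

The per-sample outputs (before averaging) are exactly the histogram of class probabilities discussed here.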
I think most people would say that the result is more certain in the first situation. If I were to work with non-Bayesian nets, and without any other information about the downstream requirements, I would use the entropy of the histogram as a measure instead; after all, it does measure the amount of information in that histogram. We saw how to compute this histogram with a Bayesian net earlier, and we could indeed apply the same entropy measure in the Bayesian case. The difference being that this histogram is supposedly more representative of the "true" probability of belonging to either category.

This may sound disappointing so far, but you can do something much better with Bayesian nets. We often ask how confident the model is because we are worried about the consequences of misclassification. If misclassification cost us nothing, we would just go with the highest-probability class all the time and not think too much about it. To illustrate, let's take the example of a neural net that takes pictures of machine parts and marks them as defective or good to go. There are two possibilities for misclassification:

- marking a good part as defective;
- marking a defective part as good.
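The entropy measure mentioned above is straightforward to compute; here is a short sketch, with made-up histograms standing in for model outputs:

```python
import math

def entropy(probs):
    # Shannon entropy of a class-probability histogram: a flatter
    # histogram has higher entropy, i.e. a more "confused" prediction.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

peaked = [0.95, 0.04, 0.01]  # the model strongly favours one class
flat = [0.34, 0.33, 0.33]    # the model has no real preference

# entropy(peaked) is much smaller than entropy(flat)
```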
Each of these errors typically has a different cost. The Bayesian way of handling this situation is to average each of the cost functions over the posterior distribution, and then pick the decision with the lowest expected cost.

TL;DR

A Bayesian neural net gives you a whole distribution of class probabilities instead of a single number. You can summarize it with the entropy of the predictive histogram or, better, average your misclassification costs over the posterior and decide based on expected cost.
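To make the cost-averaging step concrete, here is a minimal sketch for the machine-parts example. The cost matrix values are illustrative assumptions, and `p_defective` stands for the posterior-averaged predictive probability computed earlier:

```python
# cost[action][true_class]: e.g. scrapping a good part wastes 1 unit,
# while shipping a defective part costs 50 units (made-up numbers).
COST = {
    "scrap": {"good": 1.0, "defective": 0.0},
    "ship": {"good": 0.0, "defective": 50.0},
}

def expected_costs(p_defective):
    # Average each action's cost over the predictive distribution.
    p = {"defective": p_defective, "good": 1.0 - p_defective}
    return {a: sum(COST[a][c] * p[c] for c in p) for a in COST}

def decide(p_defective):
    # Pick the action with the lowest expected cost.
    costs = expected_costs(p_defective)
    return min(costs, key=costs.get)
```

With these costs, even a modest probability of a defect makes scrapping the cheaper decision, which is exactly the kind of asymmetry a bare top-class probability cannot express.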
Update 1