Interpret Caffe FCN output classes

I have set up Caffe and am using the FCN-8s model with a small change to the output classes:

layer {
  name: "score_5classes"
  type: "Convolution"
  bottom: "score"
  top: "score_5classes"
  convolution_param {
    num_output: 2
    pad: 0
    kernel_size: 1
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "score_5classes"
  bottom: "label"
  top: "loss"
  loss_param {
    normalize: true
  }
}

I changed the last layer's num_output to 2 because I want to classify my input images into 2 classes, 0 and 1. (So it seems I need 2 outputs, but I can't understand why. Couldn't the output be a single matrix of zeros and ones?)

So my questions are:

1. Should I sum these 2 classes? I need only 1 output.

2. The loss is very small, even when the output is far from the desired one. How does Caffe compute the loss in the loss layer?



When doing binary classification, using "SoftmaxWithLoss" with two outputs is mathematically equivalent to using "SigmoidCrossEntropyLoss" with one output. So, if you really need only one output, you can set your last layer to num_output: 1 and use "SigmoidCrossEntropyLoss". However, if you want to take advantage of Caffe's "Accuracy" layer, you need to use two outputs and the "SoftmaxWithLoss" layer.
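The equivalence can be checked numerically. The sketch below is plain Python, not Caffe code, and all function names and scores in it are made up for illustration. It shows that the class-1 probability of a two-way softmax equals the sigmoid of the score difference, so the two cross-entropy losses agree for either label.

```python
import math

def softmax2(s0, s1):
    """Two-way softmax; returns (P(class 0), P(class 1))."""
    m = max(s0, s1)  # subtract the max for numerical stability
    e0, e1 = math.exp(s0 - m), math.exp(s1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def softmax_loss(s0, s1, label):
    """Cross-entropy of the two-way softmax for label in {0, 1}."""
    return -math.log(softmax2(s0, s1)[label])

def sigmoid_ce_loss(x, label):
    """Binary cross-entropy on a single score x for label in {0, 1}."""
    p = sigmoid(x)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

# Made-up scores for one pixel: s0 for class 0, s1 for class 1.
s0, s1 = 0.3, 1.7
p0, p1 = softmax2(s0, s1)

# The single equivalent score is the difference s1 - s0.
print(p1, sigmoid(s1 - s0))
print(softmax_loss(s0, s1, 1), sigmoid_ce_loss(s1 - s0, 1))
```

In other words, a single-output network only has to learn the difference between the two scores, which is why one output is enough for the sigmoid variant.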

Regarding your questions:
1. If you opt to use "SoftmaxWithLoss" and you need only one output, take the second output (channel 1) for each pixel; after a "Softmax" layer this entry is the probability of class 1.
I'll leave it to you as an exercise to figure out what you'll get if you take the sum (hint: "Softmax" outputs probabilities...)
2. The loss is very small most likely because you have severe class imbalance - most of your pixels are 0 while only very few are 1 (or vice versa), so always predicting 0 does not incur a large penalty. If this is your case, I suggest looking at Focal Loss, which addresses this issue.
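Both points can be illustrated with a small sketch in plain Python (not Caffe code). The scores, pixel counts, and function names are made up for illustration. The sketch assumes the loss is averaged over pixels, which is what normalize: true in the question's loss_param does. It keeps only the class-1 softmax output per pixel, then compares the mean loss of a predictor that always favours class 0 on an imbalanced and on a balanced label map.

```python
import math

def softmax2(s0, s1):
    """Two-way softmax; returns (P(class 0), P(class 1))."""
    m = max(s0, s1)  # subtract the max for numerical stability
    e0, e1 = math.exp(s0 - m), math.exp(s1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)

def class1_map(scores):
    """Keep only the second softmax output (class 1) for every pixel."""
    return [softmax2(s0, s1)[1] for s0, s1 in scores]

def mean_softmax_loss(scores, labels):
    """Per-pixel cross-entropy averaged over all pixels."""
    losses = [-math.log(softmax2(s0, s1)[y])
              for (s0, s1), y in zip(scores, labels)]
    return sum(losses) / len(losses)

# Hypothetical predictor that favours class 0 at every one of 1000 pixels.
scores = [(2.0, -2.0)] * 1000

imbalanced = [1] * 10 + [0] * 990   # 1% of pixels are class 1
balanced = [1] * 500 + [0] * 500    # 50% of pixels are class 1

prob_class1 = class1_map(scores)
loss_imbalanced = mean_softmax_loss(scores, imbalanced)
loss_balanced = mean_softmax_loss(scores, balanced)
print(loss_imbalanced, loss_balanced)
```

With these made-up numbers the imbalanced loss comes out at roughly 0.06 and the balanced one at roughly 2.0, even though the predictor misses every class-1 pixel in both cases. A small average loss therefore says little about how well the rare class is segmented.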