Cost-sensitive loss function in TensorFlow


I'm doing research on cost-sensitive neural networks based on TensorFlow, but because of TensorFlow's static graph structure, I haven't been able to implement some network structures myself.

My loss function (cost), cost matrix, and the computation procedure are described below; my goal is to compute the total cost and then optimize the NN:

Approximate computation procedure: (described in an image in the original post)

  • y_ is the last fully connected output of the CNN and has shape (1024, 5)
  • y is a tensor of shape (1024) that holds the ground-truth class of each x[i]
  • y_soft[i][j] is the probability of x[i] belonging to class j
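In code, those pieces look roughly like this (just a sketch of the shapes; the names are illustrative):

import tensorflow as tf

# rough sketch of the pieces described above
y_ = tf.placeholder(tf.float32, [1024, 5])   # last fully connected output of the CNN
y = tf.placeholder(tf.int32, [1024])         # ground-truth class of each x[i]
y_soft = tf.nn.softmax(y_)                   # y_soft[i][j]: probability of x[i] being class j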

How can I implement this in TensorFlow?

1 Answer


cost_matrix:

[[0,1,100],
[1,0,1],
[1,20,0]]

label:

[1,2]

y*:

[[0,1,0],
[0,0,1]]
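In TensorFlow, y* is just the one-hot encoding of the labels, e.g. (a minimal sketch):

import tensorflow as tf

label = tf.constant([1, 2])
y_star = tf.one_hot(label, depth=3)   # -> [[0, 1, 0], [0, 0, 1]]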

y (prediction):

[[0.2,0.3,0.5],
[0.1,0.2,0.7]]

label, cost_matrix --> cost_embedding (each label selects the corresponding row of the cost matrix):

[[1,0,1],
[1,20,0]]
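In TensorFlow this lookup is just tf.nn.embedding_lookup; a minimal sketch with the example values above:

import tensorflow as tf

cost_matrix = tf.constant([[0., 1., 100.],
                           [1., 0., 1.],
                           [1., 20., 0.]])
label = tf.constant([1, 2])
# each label selects the corresponding row of the cost matrix
cost_embedding = tf.nn.embedding_lookup(cost_matrix, label)
# -> [[1, 0, 1], [1, 20, 0]]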

Obviously the 0.3 in [0.2,0.3,0.5] is the probability of the correct label for [0,1,0], so it should not contribute to the loss.

The 0.7 in [0.1,0.2,0.7] is the same. In other words, the positions where y* is 1 do not contribute to the loss.

So I have (1-y*):

[[1,0,1],
[1,1,0]]

The cross-entropy is target * log(predict) + (1 - target) * log(1 - predict). The positions with value 0 in y* should use the (1 - target) * log(1 - predict) term, so I work with (1 - predict), i.e. (1 - y).

1-y:

[[0.8,*0.7*,0.5],
[0.9,0.8,*0.3*]]

(the numbers marked with asterisks are not used)

The custom loss is

[[1,0,1], [1,20,0]]   *   log([[0.8,0.7,0.5],[0.9,0.8,0.3]])   *   [[1,0,1],[1,1,0]]

(all element-wise), and you can see that the (1-y*) factor can be dropped here, since cost_embedding already has 0 in the true-label positions.

So the loss is -tf.reduce_mean(cost_embedding * log(1 - y)). To make it numerically safe, it should be:

-tf.reduce_mean(cost_embedding * tf.log(tf.clip_by_value(1 - y, 1e-10, 1.0)))
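To check this against the toy numbers above, a small NumPy sketch (not part of the graph):

import numpy as np

cost_embedding = np.array([[1., 0., 1.], [1., 20., 0.]])
y = np.array([[0.2, 0.3, 0.5], [0.1, 0.2, 0.7]])  # predictions
loss = -np.mean(cost_embedding * np.log(np.clip(1 - y, 1e-10, 1.0)))
print(loss)  # roughly 0.914 for these example values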

The full demo is below.

import tensorflow as tf
import numpy as np

hidden_units = 50
num_class = 3

class Model():
    def __init__(self, name_scope, is_custom):
        self.name_scope = name_scope
        self.is_custom = is_custom
        self.input_x = tf.placeholder(tf.float32, [None, hidden_units])
        self.input_y = tf.placeholder(tf.int32, [None])

        self.instantiate_weights()
        self.logits = self.inference()
        self.predictions = tf.argmax(self.logits, axis=1)
        self.losses, self.train_op = self.optimizer()

    def instantiate_weights(self):
        with tf.variable_scope(self.name_scope + 'FC'):
            self.W = tf.get_variable('W', [hidden_units, num_class])
            self.b = tf.get_variable('b', [num_class])

            # cost_matrix[i][j]: cost of putting probability on class j
            # when the true class is i (0 on the diagonal)
            self.cost_matrix = tf.constant(
                np.array([[0, 1, 100], [1, 0, 100], [20, 5, 0]]),
                dtype=tf.float32
            )

    def inference(self):
        return tf.matmul(self.input_x, self.W) + self.b

    def optimizer(self):
        if not self.is_custom:
            # standard (cost-insensitive) cross-entropy baseline
            loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
                labels=self.input_y, logits=self.logits)
        else:
            # look up the cost-matrix row for each label: shape (batch, num_class)
            batch_cost_matrix = tf.nn.embedding_lookup(
                self.cost_matrix, self.input_y
            )
            # clip so log never sees 0; true-class positions cost 0 and drop out
            loss = -tf.log(tf.clip_by_value(1 - tf.nn.softmax(self.logits),
                                            1e-10, 1.0)) * batch_cost_matrix

        loss = tf.reduce_mean(loss)
        train_op = tf.train.AdamOptimizer().minimize(loss)
        return loss, train_op

import random

batch_size = 128
norm_model = Model('norm', False)
custom_model = Model('cost', True)

# `datasets` (numpy array of shape [N, hidden_units]) and `labels`
# (numpy array of shape [N], integer class ids) are assumed to be defined elsewhere
dataset_size = len(datasets)
split_point = int(0.9 * dataset_size)
train_set = datasets[:split_point]
test_set = datasets[split_point:]


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        batch_index = random.sample(range(split_point), batch_size)
        train_batch = train_set[batch_index]
        train_labels = labels[batch_index]
        _, eval_predict, eval_loss = sess.run(
            [norm_model.train_op, norm_model.predictions, norm_model.losses],
            feed_dict={
                norm_model.input_x: train_batch,
                norm_model.input_y: train_labels
            })
        _, eval_predict1, eval_loss1 = sess.run(
            [custom_model.train_op, custom_model.predictions, custom_model.losses],
            feed_dict={
                custom_model.input_x: train_batch,
                custom_model.input_y: train_labels
            })
        # print('norm', eval_predict, '\ncustom', eval_predict1)
        # number of correct predictions in this batch: normal model vs custom model
        print(np.sum((eval_predict == train_labels).astype(np.int32)),
              np.sum((eval_predict1 == train_labels).astype(np.int32)))
        if i % 10 == 0:
            print('norm_test', sess.run(norm_model.predictions,
                  feed_dict={
                      norm_model.input_x: test_set,
                      norm_model.input_y: labels[split_point:]
                  }))
            print('custom_test', sess.run(custom_model.predictions,
                  feed_dict={
                      custom_model.input_x: test_set,
                      custom_model.input_y: labels[split_point:]
                  }))
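To apply the same idea to the shapes in the question (logits of shape (1024, 5), integer labels of shape (1024)), the custom branch would look roughly like this. This is only a sketch under the same TF 1.x API as the demo; the 5x5 cost-matrix values below are a placeholder for whatever your problem requires:

import tensorflow as tf
import numpy as np

num_class = 5
y_ = tf.placeholder(tf.float32, [None, num_class])  # logits, e.g. shape (1024, 5)
y = tf.placeholder(tf.int32, [None])                # ground truth, shape (1024,)

# replace with your own 5x5 misclassification costs (0 on the diagonal)
cost_matrix = tf.constant(np.ones((num_class, num_class)) - np.eye(num_class),
                          dtype=tf.float32)

y_soft = tf.nn.softmax(y_)
cost_embedding = tf.nn.embedding_lookup(cost_matrix, y)
loss = -tf.reduce_mean(cost_embedding *
                       tf.log(tf.clip_by_value(1 - y_soft, 1e-10, 1.0)))
train_op = tf.train.AdamOptimizer().minimize(loss)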