How to calculate uncertainty for named entities?

21 Views Asked by Balkhrod At 17 August 2025 at 22:34

I'm trying to apply uncertainty sampling (selecting the most uncertain samples) to my named-entity dataset, which is a list of sentences, each made up of tokens. To do this, I'm using modAL, which already implements this function, but I've adapted it so that it works for my data.

First, my CRF() model returns, for each token in each sentence, one probability per class. I want a single uncertainty score per token, so I take 1 minus the token's maximum class probability. I then want a single score per sentence, so I sum the token scores over the sentence.
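For reference, a minimal sketch of the computation described above, written as a standalone function so it can be checked on toy data. It assumes the per-class probabilities are available as a padded array of shape (n_sentences, max_len, n_classes) together with the true sentence lengths; the name `sentence_uncertainty` and the `lengths` argument are hypothetical and not part of modAL or my CRF wrapper.

```python
import numpy as np

def sentence_uncertainty(marginals, lengths):
    """Least-confidence uncertainty per sentence.

    marginals: padded array, shape (n_sentences, max_len, n_classes) (assumed layout)
    lengths:   true number of tokens in each sentence (assumed to be known)
    Returns (summed uncertainty, length-normalised uncertainty) per sentence.
    """
    marginals = np.asarray(marginals, dtype=float)
    lengths = np.asarray(lengths)

    # Per-token score: 1 - probability of the most likely class.
    token_unc = 1.0 - marginals.max(axis=2)

    # Zero out padding positions so they do not contribute to the score.
    positions = np.arange(marginals.shape[1])[None, :]
    mask = positions < lengths[:, None]
    token_unc = np.where(mask, token_unc, 0.0)

    total = token_unc.sum(axis=1)          # what the question describes
    mean = total / np.maximum(lengths, 1)  # same score divided by sentence length
    return total, mean
```

The summed score grows with the number of tokens, so it tends to rank long sentences first; the length-normalised variant is returned alongside it only to make that difference easy to inspect.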

```
def uncertainty_sampling(self, x, n_instances: int = 1):
    # Per-token uncertainty matrix: each value is 1 - max class probability
    uncertainty = self.classifier_uncertainty(x)
    print(uncertainty)
    # Collapse the last axis, then sum over the tokens of each sentence
    sequence_uncertainties = np.sum(uncertainty, axis=2)
    total_sequence_uncertainties = np.sum(sequence_uncertainties, axis=1)
    print(total_sequence_uncertainties)
    # Indices of the n_instances sentences with the largest summed uncertainty
    most_uncertain_indices = np.argpartition(-total_sequence_uncertainties, n_instances)[:n_instances]
    print(most_uncertain_indices)
    # Computed but not returned
    most_uncertain_sequences = uncertainty[most_uncertain_indices]
    return most_uncertain_indices
```
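A minimal sketch of just the selection step, isolated so it can be checked on a small array. The helper name `top_n_indices` is hypothetical; it only uses `np.argpartition` and `np.argsort`, and it clamps `n_instances` to the number of available scores, which the method above does not do.

```python
import numpy as np

def top_n_indices(scores, n_instances=1):
    """Indices of the n_instances largest scores, most uncertain first."""
    scores = np.asarray(scores, dtype=float)
    # Never ask for more items than there are scores.
    n = min(n_instances, scores.shape[0])
    if n <= 0:
        return np.empty(0, dtype=int)
    # Partition on the negated scores so the n largest come first (unordered).
    idx = np.argpartition(-scores, n - 1)[:n]
    # Order those n indices by descending score.
    return idx[np.argsort(-scores[idx])]

# Toy scores, not my real data.
print(top_n_indices([3.2, 9.1, 3.7, 2.7], n_instances=2))
```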

The numbers all look consistent to me. However, when I compare my crf() trained on a random sample with my crf() trained with active learning, the one trained on the random sample performs better, so I think I've made a mistake in my code. Can you help me?

Here's an example of my data. The matrix called uncertainty, where each value = 1 - np.max(probabilities of each class):

[[[0.35675915]
  [0.3411394 ]
  [0.32315629]
  ...
  [0.        ]
  [0.        ]
  [0.        ]]

 [[0.37441699]
  [0.44393638]
  [0.31663258]
  ...
  [0.        ]
  [0.        ]
  [0.        ]]

 [[0.34106778]
  [0.23466529]
  [0.1312631 ]
  ...
  [0.        ]
  [0.        ]
  [0.        ]]

total_sequence_uncertainties (sum of the token uncertainties for each sentence): [3.21548365 9.10346426 3.68245306 ... 2.66100948 1.95914588 1.56346819]

Thanks a lot!
