A 2020 study by Zhang et al. compared the accuracy of BioBERT and scispaCy NER models; overall, BioBERT won. How do I download and import (preferably using spaCy and from Hugging Face) the latest **trained** official version of BioBERT to perform NER on **uncased** medical text? If there is a better-performing NER model for medical text, please let me know. The goal is to identify diagnoses, operations and, *optionally*, drug mentions.

I have looked at a lot of Hugging Face code, but none of it seems to show how to use an already-trained model.

There is 1 solution below

norm On

Answering my own question: I created a conda environment and installed a few packages.

    conda create --name biobertner python=3.11
    conda activate biobertner
    pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
    pip3 install transformers

I used a BioBERT model from https://huggingface.co/alvaroalon2/biobert_diseases_ner. The Python code I used is as follows:

    from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

    # Load the fine-tuned BioBERT disease NER checkpoint from the Hugging Face Hub
    model_name = "alvaroalon2/biobert_diseases_ner"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(model_name)

    # Build a token-classification (NER) pipeline from the model and tokenizer
    nlp = pipeline("ner", model=model, tokenizer=tokenizer)

    example = "she had a cold on the day she was diagnosed with cancer in her left lung on June 2023"

    # Each result is a dict describing one tagged token
    for ent in nlp(example):
        print(ent)
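
With the default settings, the `ner` pipeline returns one dict per tagged token rather than one per entity, so a multi-word mention comes back in pieces. Passing `aggregation_strategy="simple"` to `pipeline(...)` is the built-in way to group them. If you want to do the grouping yourself, here is a minimal sketch of the idea. The helper name `merge_entities`, the hand-written sample tokens and the `B-DISEASE`/`I-DISEASE` label names are my assumptions for illustration; check `model.config.id2label` for the labels your model actually emits.

```python
def merge_entities(text, tokens, outside_labels=("O", "0")):
    """Merge token-level NER predictions into contiguous entity spans.

    `tokens` is expected to look like the un-aggregated pipeline output:
    dicts with at least "entity", "word", "start" and "end" keys.
    """
    spans = []
    for tok in tokens:
        label = tok["entity"]
        if label in outside_labels:
            continue  # skip tokens tagged as "outside any entity"
        # "B-DISEASE" -> prefix "B", kind "DISEASE"; a bare label has an empty prefix
        prefix, _, kind = label.rpartition("-")
        last = spans[-1] if spans else None
        # Extend the previous span only for an inside tag or a word-piece
        # of the same entity type that directly follows it in the text.
        continues = (
            last is not None
            and last["label"] == kind
            and (prefix == "I" or tok["word"].startswith("##"))
            and tok["start"] - last["end"] <= 1
        )
        if continues:
            last["end"] = tok["end"]
        else:
            spans.append({"label": kind, "start": tok["start"], "end": tok["end"]})
    # Recover the surface text of each span from the character offsets
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Hand-written sample in the pipeline's output shape (not real model output)
text = "a cold and lung cancer"
tokens = [
    {"entity": "B-DISEASE", "word": "cold", "start": 2, "end": 6},
    {"entity": "B-DISEASE", "word": "lung", "start": 11, "end": 15},
    {"entity": "I-DISEASE", "word": "cancer", "start": 16, "end": 22},
]
spans = merge_entities(text, tokens)
for span in spans:
    print(span["label"], span["text"])
```

On real output you would call it as `merge_entities(example, nlp(example))`. The built-in `aggregation_strategy` option is probably the better choice in practice; the sketch only shows what the grouping step does.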