Added configuration for Auto models in downstream tasks
Enabled initializing the model as a TokenClassification or SequenceClassification model for use in a downstream task.
Now using
model = AutoModelForTokenClassification.from_pretrained(model, trust_remote_code=True)
or
model = AutoModelForSequenceClassification.from_pretrained(model, trust_remote_code=True)
works, as it does for the NT-V1 models.
Was this functionality left out intentionally? I have tested this change on a Token Classification fine-tuning task with LoRA, and it seems to work fine.
If this change is desired, it should be integrated in all other NT-V2 models.
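For readers who want to try the two calls above, here is a minimal sketch of a helper that picks the right Auto class for a downstream task. The helper name `load_for_task`, the task strings, and the `TASK_TO_AUTO_CLASS` table are illustrative assumptions and not part of the repository; only the two Auto classes and `trust_remote_code=True` come from the description above.

```python
MODEL_ID = "InstaDeepAI/nucleotide-transformer-v2-500m-multi-species"

# Hypothetical task names mapped to the transformers Auto class that handles them.
TASK_TO_AUTO_CLASS = {
    "token-classification": "AutoModelForTokenClassification",
    "sequence-classification": "AutoModelForSequenceClassification",
}


def load_for_task(task, num_labels, model_id=MODEL_ID):
    """Load the checkpoint with a newly initialised head for `task`."""
    if task not in TASK_TO_AUTO_CLASS:
        raise ValueError(f"unsupported task: {task!r}")

    # Imported lazily so this module can be imported without transformers installed.
    import transformers

    auto_class = getattr(transformers, TASK_TO_AUTO_CLASS[task])
    # trust_remote_code=True is needed because the NT-v2 architecture is
    # defined in the model repository, not in the transformers package.
    return auto_class.from_pretrained(
        model_id, num_labels=num_labels, trust_remote_code=True
    )
```

Calling `load_for_task("token-classification", num_labels=2)` would download the checkpoint and return a model whose classification head is untrained, so it still needs fine-tuning; the value 2 is only an example.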
Hello @carlesonielfa ,
Good catch, this was not left out intentionally. Since the NT-v1 models are based on HuggingFace's official ESM implementation, TokenClassification and SequenceClassification were enabled by default, but I forgot to add them to the NT-v2 models.
I will be adding this to all other NT-v2 models.
Cheers !
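For context on rolling this out to the other NT-v2 checkpoints: models loaded with `trust_remote_code=True` are resolved through the `auto_map` section of `config.json`, so a change like this one presumably adds entries for the two extra Auto classes there. The sketch below checks a config dictionary for those entries. The module and class names in `EXAMPLE_CONFIG` are assumptions for illustration, not copied from the repository, and `missing_downstream_entries` is a hypothetical helper.

```python
# Hypothetical excerpt of a config.json; the values on the right-hand side
# (module and class names) are assumed, not taken from the repository.
EXAMPLE_CONFIG = {
    "auto_map": {
        "AutoModelForMaskedLM": "modeling_esm.EsmForMaskedLM",
        "AutoModelForTokenClassification": "modeling_esm.EsmForTokenClassification",
        "AutoModelForSequenceClassification": "modeling_esm.EsmForSequenceClassification",
    }
}

# Auto classes this thread is about.
DOWNSTREAM_AUTO_CLASSES = (
    "AutoModelForTokenClassification",
    "AutoModelForSequenceClassification",
)


def missing_downstream_entries(config):
    """Return the downstream Auto classes that have no auto_map entry."""
    auto_map = config.get("auto_map", {})
    return [name for name in DOWNSTREAM_AUTO_CLASSES if name not in auto_map]
```

To check a real checkpoint, one could read its `config.json` with `json.load` and pass the resulting dictionary to `missing_downstream_entries`; an empty list would mean both Auto classes are mapped.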