Model Finetuning

The Finetune section of the BioLM Console allows you to train and personalize your own biological language models using your private datasets.

Finetuning lets you adapt BioLM’s pretrained foundation models (for proteins or DNA) to your specific task, such as custom classification, property prediction, or de novo sequence generation.

Each model type is presented as a card with:

Model Type – the kind of base model being finetuned (e.g. protein classifier, DNA classifier).
Description – what the model does and typical use cases.
API Link – opens the API documentation for the selected finetuning workflow.
Finetune New Model – starts a new finetuning job through the console.

Available Finetuning Workflows

Protein Classifier

Fine-tune one of the largest pretrained protein language models (ESM2) on your own labeled sequence data.

Your model learns to classify protein sequences based on user-defined categories or functional labels.

The resulting model is deployed as a GPU-backed API endpoint for inference, enabling:

Custom sequence classification
Prediction of probabilities or confidence scores for new sequences
Reuse of your personalized model via the API or SDK
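Before uploading labeled data, it can help to check it locally. The following is a minimal sketch, not the console's own validation: it assumes a two-column layout with the hypothetical header `sequence,label` and accepts only the 20 standard amino acids. Check the API documentation linked from the model card for the column names and alphabet the workflow actually expects.

```python
import csv
import io

# The 20 standard amino acids. Extend this set if your data uses
# ambiguity codes such as X, B, or Z (an assumption, not a BioLM rule).
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def validate_rows(rows):
    """Split (sequence, label) pairs into clean rows and rejected rows."""
    clean, rejected = [], []
    for sequence, label in rows:
        seq = sequence.strip().upper()
        if seq and label.strip() and set(seq) <= AMINO_ACIDS:
            clean.append((seq, label.strip()))
        else:
            rejected.append((sequence, label))
    return clean, rejected


def to_csv(rows):
    """Serialize rows as CSV text with a hypothetical sequence,label header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sequence", "label"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    ("MKTAYIAKQR", "enzyme"),
    ("mkvlaagiv", "binder"),  # lowercase input is normalized to uppercase
    ("MKT1AYI", "enzyme"),    # contains a digit, so it is rejected
    ("MSDNEL", ""),           # missing label, so it is rejected
]
clean, rejected = validate_rows(rows)
csv_text = to_csv(clean)
print(csv_text)
print(f"kept {len(clean)} rows, rejected {len(rejected)}")
```

Rejected rows are returned rather than silently dropped so that you can inspect and fix them before training.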

Protein Generator

Adapt a generative protein model to your own dataset to create new, biologically coherent sequences.

Fine-tuned models can:

Generate de novo protein sequences consistent with your data
Explore new areas of “sequence space” guided by structure or functional similarity
Be used via the API to produce optimized sequences for experimental validation

Only positive-class sequences (e.g., known active or functional examples) are required for training.
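Because only positive examples are needed, a training file can be built by filtering assay results down to the active sequences. The sketch below is one way to do that under stated assumptions: the record fields (`id`, `sequence`, `active`) are hypothetical, and FASTA is assumed to be an accepted input format for this workflow.

```python
def positives_only(records):
    """Keep active records, dropping repeated sequences (first one wins)."""
    seen = set()
    kept = []
    for record in records:
        seq = record["sequence"].upper()
        if record["active"] and seq not in seen:
            seen.add(seq)
            kept.append((record["id"], seq))
    return kept


def to_fasta(named_sequences, line_width=60):
    """Format (name, sequence) pairs as FASTA text, wrapping long lines."""
    lines = []
    for name, seq in named_sequences:
        lines.append(f">{name}")
        for start in range(0, len(seq), line_width):
            lines.append(seq[start:start + line_width])
    return "\n".join(lines) + "\n"


# Hypothetical assay results; field names are illustrative only.
assays = [
    {"id": "var1", "sequence": "MKTAYIAKQR", "active": True},
    {"id": "var2", "sequence": "MKTAYIAKQA", "active": False},
    {"id": "var3", "sequence": "MKTAYLAKQR", "active": True},
    {"id": "var4", "sequence": "MKTAYIAKQR", "active": True},  # same as var1
]
positives = positives_only(assays)
fasta_text = to_fasta(positives)
print(fasta_text)
```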

DNA Classifier

Fine-tune a DNA-based BERT model for classification tasks such as promoter prediction, splice site detection, or transcription factor binding site analysis.

You can:

Upload or connect NGS or public datasets
Train classifiers for genomic features or functional labels
Export DNA embeddings for use in downstream ML workflows

On benchmark tasks, fine-tuned models reach up to 99% AUC (area under the ROC curve). They can be accessed via the API for batch or online inference.
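To compare a benchmark AUC figure with results on your own held-out data, it helps to recall what the metric measures: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as half. The sketch below computes that directly in plain Python. It is illustrative only and says nothing about how BioLM computes its benchmark numbers; the labels and scores are made up.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth classes.
    scores: iterable of predicted scores for the positive class.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Made-up held-out labels and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of examples, which is fine for small validation sets; for large ones a rank-based implementation from an established library is a better choice.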

How to Finetune a Model

Select a finetuning workflow (e.g. Protein Classifier).
Click Finetune New Model to begin setup.
Provide:
- Input dataset (FASTA or CSV format depending on workflow)
- Training configuration (epochs, batch size, validation split, etc.)
- Output model name and description
Launch the job and monitor progress from your workspace.
Once the job completes, your model appears as a deployable endpoint accessible through the BioLM SDK or REST API.
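The training configuration above can be reasoned about locally before launching a job. The following sketch shows what a validation split means in practice: a seeded, per-label (stratified) hold-out, so every class appears in the validation set. The `TrainingConfig` class, its field names, and its defaults are hypothetical and are not the console's actual schema; whether the console itself stratifies the split is not stated here.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    """Hypothetical container for the settings named in the setup step."""
    epochs: int = 5
    batch_size: int = 16
    validation_split: float = 0.2

    def __post_init__(self):
        if self.epochs < 1 or self.batch_size < 1:
            raise ValueError("epochs and batch_size must be positive")
        if not 0.0 < self.validation_split < 1.0:
            raise ValueError("validation_split must be between 0 and 1")


def stratified_split(rows, validation_split, seed=0):
    """Hold out a fraction of each label group for validation."""
    by_label = defaultdict(list)
    for sequence, label in rows:
        by_label[label].append((sequence, label))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        # Hold out at least one example per label when the group allows it.
        n_val = max(1, round(len(group) * validation_split)) if len(group) > 1 else 0
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val


# Made-up labeled sequences: five per class.
rows = [("MKT" + "A" * i + "YIAK", "active") for i in range(1, 6)]
rows += [("MKV" + "L" * i + "GIV", "inactive") for i in range(1, 6)]

config = TrainingConfig(epochs=10, batch_size=8, validation_split=0.2)
train, val = stratified_split(rows, config.validation_split, seed=0)
print(f"{len(train)} training rows, {len(val)} validation rows")
```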

Typical Use Cases

Custom protein function prediction
Antibody classification or paratope identification
Genomic site prediction or regulatory element detection
Generation of novel protein variants aligned to your own experimental data

Notes

Finetuning jobs consume GPU compute; ensure your workspace has sufficient budget or quota.
You can view API examples and deployment details through the API link on each model card.
Finetuned models are stored securely under your workspace and can be shared with team members in collaborative plans.