Model Finetuning
The Finetune section of the BioLM Console allows you to train and personalize your own biological language models using your private datasets.
Finetuning lets you adapt BioLM’s pretrained foundation models (for proteins or DNA) to your specific task, such as custom classification, property prediction, or de novo sequence generation.
Each model type is presented as a card with:
- Model Type – the kind of base model being finetuned (e.g. protein classifier, DNA classifier).
- Description – what the model does and typical use cases.
- API Link – opens the API documentation for the selected finetuning workflow.
- Finetune New Model – starts a new finetuning job through the console.
Available Finetuning Workflows
Protein Classifier
Fine-tune one of the largest pretrained protein language models (ESM2) on your own labeled sequence data.
Your model learns to classify protein sequences based on user-defined categories or functional labels.
The resulting model is deployed as a GPU-backed API endpoint for inference, enabling:
- Custom sequence classification
- Prediction of probabilities or confidence scores for new sequences
- Reuse of your personalized model via the API or SDK
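Before uploading labeled data, it can help to check the file locally. The following is a minimal sketch, assuming a CSV with `sequence` and `label` columns; those column names and the `load_labeled_sequences` helper are illustrative assumptions, not a documented BioLM schema.

```python
import csv
import io

# Standard 20 amino acids; extend if your data uses ambiguity codes such as X or B.
VALID_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def load_labeled_sequences(csv_text, seq_col="sequence", label_col="label"):
    """Parse CSV text into (sequence, label) pairs, rejecting malformed rows.

    The column names are assumptions for this sketch, not a documented schema.
    """
    records = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # start=2 because line 1 is the header (assumes no multi-line fields).
    for line_no, row in enumerate(reader, start=2):
        seq = (row.get(seq_col) or "").strip().upper()
        label = (row.get(label_col) or "").strip()
        if not seq or not label:
            raise ValueError(f"line {line_no}: missing sequence or label")
        invalid = set(seq) - VALID_AMINO_ACIDS
        if invalid:
            raise ValueError(f"line {line_no}: invalid residues {sorted(invalid)}")
        records.append((seq, label))
    return records


example_csv = "sequence,label\nMKTAYIAKQR,enzyme\nGSHMLEDPVA,binder\n"
records = load_labeled_sequences(example_csv)
```

Failing fast on empty labels or unexpected residues is usually cheaper than discovering the problem after a GPU job has started.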
Protein Generator
Adapt a generative protein model to your own dataset to create new, biologically coherent sequences.
Fine-tuned models can:
- Generate de novo protein sequences consistent with your data
- Explore new areas of “sequence space” guided by structural or functional similarity
- Be used via the API to produce optimized sequences for experimental validation
Only positive-class sequences (e.g., known active or functional examples) are required for training.
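Because only positive-class sequences are needed, a labeled dataset can be reduced to a FASTA file of positives before training. This is a minimal sketch; the `positives_to_fasta` helper, the `active` label, and the `seqN` record identifiers are hypothetical choices for illustration.

```python
def positives_to_fasta(records, positive_label, line_width=60):
    """Keep only sequences with the positive label and format them as FASTA text."""
    lines = []
    kept = 0
    for seq, label in records:
        if label != positive_label:
            continue  # Skip negatives; the generator trains on positives only.
        kept += 1
        lines.append(f">seq{kept}")
        # Wrap long sequences at a fixed width, as is conventional in FASTA files.
        for start in range(0, len(seq), line_width):
            lines.append(seq[start:start + line_width])
    return "\n".join(lines) + ("\n" if lines else "")


labeled = [
    ("MKTAYIAKQR", "active"),
    ("GSHMLEDPVA", "inactive"),
    ("MSTNPKPQRK", "active"),
]
fasta_text = positives_to_fasta(labeled, positive_label="active")
```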
DNA Classifier
Fine-tune a DNA-based BERT model for classification tasks such as promoter prediction, splice site detection, or transcription factor binding site analysis.
You can:
- Upload or connect NGS or public datasets
- Train classifiers for genomic features or functional labels
- Export DNA embeddings for use in downstream ML workflows
Fine-tuned models achieve up to 99% AUC on benchmark tasks and can be accessed via API for batch or online inference.
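For readers unfamiliar with the metric, AUC (area under the ROC curve) equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes it directly from that definition; it is an illustration of the metric, not the evaluation code behind the benchmark figure, and the labels and scores are made-up values.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Three of the four positive/negative pairs are ordered correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

The pairwise loop is quadratic in dataset size, which is fine for small checks; for large evaluation sets a rank-based implementation is the usual choice.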
How to Finetune a Model
- Select a finetuning workflow (e.g. Protein Classifier).
- Click Finetune New Model to begin setup.
- Provide:
  - Input dataset (FASTA or CSV format depending on workflow)
  - Training configuration (epochs, batch size, validation split, etc.)
  - Output model name and description
- Launch the job and monitor progress from your workspace.
- Once complete, your model will appear as a deployable endpoint accessible through the BioLM SDK or REST API.
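To make the training configuration options concrete, here is a minimal local sketch of what epochs, batch size, and validation split mean in practice. The dictionary keys, the example values, and both helper functions are assumptions for illustration; they are not the console's actual field names or defaults.

```python
import random

# Hypothetical configuration keys mirroring the options listed above.
# The values are illustrative, not recommended defaults.
training_config = {
    "epochs": 5,
    "batch_size": 16,
    "validation_split": 0.2,
}


def validate_config(config):
    """Check basic sanity constraints before launching a job."""
    if config["epochs"] < 1 or config["batch_size"] < 1:
        raise ValueError("epochs and batch_size must be positive integers")
    if not 0.0 < config["validation_split"] < 1.0:
        raise ValueError("validation_split must be strictly between 0 and 1")


def split_dataset(records, validation_split, seed=0):
    """Shuffle deterministically, then hold out a fraction for validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one record so validation is never empty.
    n_val = max(1, round(len(shuffled) * validation_split))
    return shuffled[n_val:], shuffled[:n_val]


validate_config(training_config)
dataset = [(f"SEQ{i}", i % 2) for i in range(10)]
train, val = split_dataset(dataset, training_config["validation_split"])
```

With ten records and a 0.2 split, two records are held out for validation and eight remain for training. Fixing the seed keeps the split reproducible across runs.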
Typical Use Cases
- Custom protein function prediction
- Antibody classification or paratope identification
- Genomic site prediction or regulatory element detection
- Generation of novel protein variants aligned to your own experimental data
Notes
- Finetuning jobs consume GPU compute; ensure your workspace has sufficient budget or quota.
- You can view API examples and deployment details through the API link on each model card.
- Finetuned models are stored securely under your workspace and can be shared with team members in collaborative plans.