LLM Fine-tuning
Fine-tune large language models efficiently with LoRA and QLoRA.
Overview
AiDotNet provides 34 LoRA adapter implementations for efficient fine-tuning:
| Category | Adapters |
|---|---|
| Core | StandardLoRAAdapter, QLoRAAdapter, DoRAAdapter, AdaLoRAAdapter |
| Low-Parameter | VeRAAdapter, LoRAXSAdapter, NOLAAdapter, VBLoRAAdapter, LoRAFAAdapter |
| Composition | LoHaAdapter, LoKrAdapter, DenseLoRAAdapter, GLoRAAdapter, MultiLoRAAdapter, ChainLoRAAdapter |
| Efficiency | DyLoRAAdapter, FloraAdapter, SLoRAAdapter, LoftQAdapter, PiSSAAdapter |
| Advanced | MoRAAdapter, DVoRAAdapter, DeltaLoRAAdapter, HRAAdapter, RoSAAdapter |
| Specialized | LongLoRAAdapter, GraphConvolutionalLoRAAdapter, LoRETTAAdapter, ReLoRAAdapter |
| Scaling | LoRAPlusAdapter, LoRADropAdapter, XLoRAAdapter, QALoRAAdapter, TiedLoRAAdapter |
Why LoRA?
| Method | Memory (7B Model) | Parameters Trained |
|---|---|---|
| Full Fine-tune | 28+ GB | 100% |
| LoRA (r=8) | 8-12 GB | ~0.1% |
| QLoRA (4-bit) | 4-6 GB | ~0.1% |
Basic LoRA with AiModelBuilder
All LoRA fine-tuning uses the standard AiModelBuilder pattern with .ConfigureLoRA():
using AiDotNet;
using AiDotNet.LoRA;
// Configure LoRA via ILoRAConfiguration
var loraConfig = new DefaultLoRAConfiguration<float>(
rank: 8,
alpha: 16.0f,
dropout: 0.05f
);
// Build model with LoRA using AiModelBuilder
var builder = new AiModelBuilder<float, Tensor<float>, Tensor<float>>();
var result = await builder
.ConfigureModel(model)
.ConfigureLoRA(loraConfig)
.ConfigureOptimizer(new AdamWOptimizer<float>(learningRate: 1e-4f))
.BuildAsync(trainingData, trainingLabels);
// Make predictions
var predictions = builder.Predict(testData, result);
Training Configuration
// Prepare training data (use realistic dataset sizes)
var trainingData = LoadTrainingFeatures(); // Tensor<float> of shape [numSamples, inputDim]
var trainingLabels = LoadTrainingLabels(); // Tensor<float> of shape [numSamples, outputDim]
// Configure LoRA
var loraConfig = new DefaultLoRAConfiguration<float>(
rank: 8,
alpha: 16.0f,
dropout: 0.05f
);
// Train with AiModelBuilder
var builder = new AiModelBuilder<float, Tensor<float>, Tensor<float>>();
var result = await builder
.ConfigureModel(model)
.ConfigureLoRA(loraConfig)
.ConfigureOptimizer(new AdamWOptimizer<float>(
learningRate: 1e-4f,
weightDecay: 0.01f))
.ConfigureLearningRateScheduler(new CosineAnnealingLR(tMax: 100))
.ConfigurePreprocessing()
.BuildAsync(trainingData, trainingLabels);
Console.WriteLine($"Training Loss: {result.TrainingLoss:F4}");
Console.WriteLine($"Validation Loss: {result.ValidationLoss:F4}");
QLoRA (4-bit Quantized)
QLoRA applies LoRA on top of a quantized model to reduce memory usage:
using AiDotNet;
using AiDotNet.LoRA;
// QLoRA uses the QLoRAAdapter directly
var architecture = new NeuralNetworkArchitecture<float>(
inputType: InputType.OneDimensional,
taskType: NeuralNetworkTaskType.MultiClassClassification,
inputSize: inputDim,
outputSize: numClasses
);
var loraConfig = new DefaultLoRAConfiguration<float>(
rank: 8,
alpha: 16.0f,
dropout: 0.05f
);
var builder = new AiModelBuilder<float, Tensor<float>, Tensor<float>>();
var result = await builder
.ConfigureModel(model)
.ConfigureLoRA(loraConfig)
.ConfigureQuantization(new QuantizationConfig { Bits = 4 })
.ConfigureOptimizer(new AdamWOptimizer<float>(learningRate: 1e-4f))
.BuildAsync(trainingData, trainingLabels);
Saving and Loading Models
// Save trained model in AIMF format (includes LoRA weights)
builder.SaveModel(result, "lora-finetuned.aimf");
// Save with encryption (requires license key)
builder.SaveModel(result, "lora-finetuned-encrypted.aimf", encrypt: true);
// Load and use later
var loadedResult = builder.LoadModel("lora-finetuned.aimf");
var predictions = builder.Predict(testData, loadedResult);
LoRA Adapter Variants
AiDotNet provides specialized adapter implementations for different scenarios. Each adapter extends LoRAAdapterBase<T> and implements ILoRAAdapter<T>.
DoRA (Weight-Decomposed LoRA)
Decomposes weight updates into magnitude and direction for better convergence:
// DoRAAdapter decomposes updates into magnitude and direction
var doraAdapter = new DoRAAdapter<float>(
inputSize: 256,
outputSize: 256,
rank: 8,
alpha: 16.0f
);
AdaLoRA (Adaptive Rank)
Dynamically adjusts rank per layer based on importance:
var adaLoraAdapter = new AdaLoRAAdapter<float>(
inputSize: 256,
outputSize: 256,
initialRank: 12,
targetRank: 8,
alpha: 16.0f
);
VeRA (Vector-based LoRA)
Uses shared random matrices with learned scaling vectors for extreme parameter efficiency:
var veraAdapter = new VeRAAdapter<float>(
inputSize: 256,
outputSize: 256,
rank: 256,
alpha: 16.0f
);
Best Practices
Rank Selection
| Task | Recommended Rank |
|---|---|
| Simple tasks | 4-8 |
| Complex tasks | 16-32 |
| Multi-task | 32-64 |
Training Tips
- Start small: Try rank 4-8 first, increase if underfitting
- Learning rate: Use lower rates than full fine-tuning (1e-4 to 5e-5)
- Gradient accumulation: Use gradient accumulation if batch size is memory-limited
- Regularization: Apply dropout (0.05-0.1) to prevent overfitting
- Validation monitoring: Split data into train/validation sets and track validation loss to detect overfitting early
// Example with validation monitoring via AiModelBuilder
var result = await builder
.ConfigureModel(model)
.ConfigureLoRA(loraConfig)
.ConfigureOptimizer(new AdamWOptimizer<float>(learningRate: 1e-4f))
.ConfigurePreprocessing()
.BuildAsync(trainingData, trainingLabels);
// Check for overfitting
if (result.ValidationLoss > result.TrainingLoss * 1.5)
{
Console.WriteLine("Warning: model may be overfitting. Consider increasing dropout or reducing rank.");
}
Memory Comparison
| Model | Full FT | LoRA | QLoRA |
|---|---|---|---|
| 7B | 28+ GB | 10 GB | 5 GB |
| 13B | 52+ GB | 18 GB | 8 GB |
| 70B | OOM | 90 GB | 24 GB |