Fine-Tuning & Alignment Terms
Terms and explanations from the Agentic AI Glossary.

Adapter Tuning
Definition
A fine-tuning method that adds small trainable adapter layers while keeping most original model weights frozen.
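For illustration only (not part of the glossary entry), here is a minimal PyTorch sketch of one common adapter design, a bottleneck module with a residual connection; the class name and dimensions are assumptions.

import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    # Small trainable module inserted into an otherwise frozen transformer block.
    def __init__(self, hidden_dim=768, bottleneck_dim=64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, hidden_states):
        # Residual connection keeps the frozen model's behavior recoverable.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Only the adapter's parameters are trained; the base model stays frozen.
adapter = BottleneckAdapter()
x = torch.randn(2, 16, 768)      # (batch, sequence, hidden)
print(adapter(x).shape)          # torch.Size([2, 16, 768])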
Batch Size
Definition
The number of training examples processed together before updating model parameters.
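A minimal sketch of where batch size appears in practice, assuming a PyTorch DataLoader; the toy dataset and sizes are illustrative.

import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset: 128 examples with 10 features each.
dataset = TensorDataset(torch.randn(128, 10), torch.randint(0, 2, (128,)))

# batch_size=32 means each parameter update is computed from 32 examples.
loader = DataLoader(dataset, batch_size=32, shuffle=True)
print(len(loader))  # 4 batches per epoch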
Catastrophic Forgetting
Definition
When fine-tuning causes a model to lose useful abilities it learned during earlier training.
Checkpoint
Definition
A saved model state that can be reused, evaluated, restored, or continued during training.
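A minimal sketch of saving and restoring a checkpoint with PyTorch; the filename and the tiny model are assumptions made for the example.

import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Save a checkpoint: model weights, optimizer state, and training progress.
torch.save({"model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "step": 1000}, "checkpoint.pt")

# Restore it later to evaluate, roll back, or continue training from the same point.
state = torch.load("checkpoint.pt")
model.load_state_dict(state["model"])
optimizer.load_state_dict(state["optimizer"])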
Data Cleaning
Definition
Removing errors, duplicates, unsafe content, or irrelevant examples from training or evaluation data.
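A small illustrative sketch of two such checks, dropping duplicates and empty responses; real pipelines would also apply safety and relevance filters. The records are invented for the example.

raw = [
    {"prompt": "Summarize this article", "response": "Here is a summary..."},
    {"prompt": "Summarize this article", "response": "Here is a summary..."},  # duplicate
    {"prompt": "Translate to French", "response": ""},                          # empty output
]

seen, cleaned = set(), []
for example in raw:
    key = (example["prompt"], example["response"])
    if example["response"].strip() and key not in seen:  # drop empties and duplicates
        seen.add(key)
        cleaned.append(example)

print(len(cleaned))  # 1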
Data Curation
Definition
Selecting and organizing high-quality examples that teach the model the desired behavior.
DPO
Definition
Direct Preference Optimization, a preference-tuning method that trains a model to favor preferred responses without a separate reward model.
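A minimal sketch of the DPO objective, assuming you already have sequence-level log-probabilities for the preferred and rejected responses under the trained policy and a frozen reference model; beta and the toy values are illustrative.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # Push the policy to rank the preferred response above the rejected one.
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()

# Toy tensors standing in for per-example log-probabilities.
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(loss)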
Epoch
Definition
One full pass through the training dataset during model training.
Evaluation
Definition
Testing a tuned model against base-model behavior, target tasks, safety cases, and regression benchmarks.
Full Fine-Tuning
Definition
Updating all or most model weights during fine-tuning instead of using small adapters.
Human Feedback
Definition
Examples where people compare or rate outputs to teach the model preferred behavior.
Hyperparameter
Definition
A training configuration value, such as learning rate or batch size, chosen before training begins.
Instruction Tuning
Definition
Fine-tuning a model on instruction-response examples so it follows user requests better.
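An illustrative sketch of the instruction-response format such data often takes; the field names and records are assumptions, not a fixed standard.

# Hypothetical instruction-response records as they might appear in a JSONL file.
examples = [
    {"instruction": "Rewrite this sentence in a formal tone.",
     "input": "gonna send the report tmrw",
     "output": "I will send the report tomorrow."},
    {"instruction": "List three uses of a checkpoint during training.",
     "input": "",
     "output": "Resuming training, evaluating intermediate models, rolling back regressions."},
]

# During fine-tuning, the instruction and input form the prompt; the output is the target.
for ex in examples:
    prompt = ex["instruction"] + ("\n" + ex["input"] if ex["input"] else "")
    target = ex["output"]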
Learning Rate
Definition
A training hyperparameter that controls how large each model-weight update is during optimization.
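A minimal sketch of a single gradient step, showing how the learning rate scales the weight update; the toy loss and value 0.1 are illustrative.

import torch

# One gradient step: the learning rate scales how far the weights move.
weights = torch.tensor([1.0, -2.0], requires_grad=True)
loss = (weights ** 2).sum()
loss.backward()

learning_rate = 0.1
with torch.no_grad():
    weights -= learning_rate * weights.grad   # w <- w - lr * dL/dw
print(weights)  # tensor([0.8000, -1.6000], requires_grad=True)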
LoRA
Definition
Low-Rank Adaptation, a parameter-efficient fine-tuning method that trains small adapter matrices instead of all model weights.
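To make the idea concrete, a small sketch of the low-rank update for one weight matrix; the shapes, rank, and scaling factor are illustrative assumptions.

import torch

d, k, r = 512, 512, 8            # layer shape and a small rank r << d, k
W = torch.randn(d, k)            # frozen pretrained weight
A = torch.randn(r, k) * 0.01     # trainable low-rank factors
B = torch.zeros(d, r)            # B starts at zero, so the adapted model initially matches the base
alpha = 16

# Effective weight during fine-tuning: only A and B receive gradient updates.
W_adapted = W + (alpha / r) * (B @ A)
print(W_adapted.shape)           # torch.Size([512, 512])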
Model Merging
Definition
Combining weights or adapters from multiple models to create a new model variant.
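One simple merging strategy is linear interpolation of matching weights; the tiny state dicts and mixing coefficient below are illustrative, and real merges assume both variants share the same base architecture.

import torch

# Hypothetical state dicts from two fine-tuned variants of the same base model.
model_a = {"layer.weight": torch.tensor([[1.0, 2.0], [3.0, 4.0]])}
model_b = {"layer.weight": torch.tensor([[2.0, 0.0], [1.0, 4.0]])}

# Weight averaging with mixing coefficient alpha.
alpha = 0.5
merged = {name: alpha * model_a[name] + (1 - alpha) * model_b[name]
          for name in model_a}
print(merged["layer.weight"])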
PEFT
Definition
Parameter-efficient fine-tuning, a family of methods that adapt models by training only a small number of parameters.
PPO
Definition
Proximal Policy Optimization, a reinforcement learning algorithm commonly used in RLHF pipelines to update the policy model against reward-model scores.
Preference Dataset
Definition
A dataset containing preferred and rejected outputs used to train alignment behavior.
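An illustrative sketch of the record structure such a dataset often uses; the field names and example text are assumptions.

# Hypothetical preference records: each pairs a prompt with a preferred and a rejected output.
preference_data = [
    {"prompt": "Explain what an epoch is.",
     "chosen": "An epoch is one full pass through the training dataset.",
     "rejected": "An epoch is when the model finishes learning."},
]

# Preference-tuning methods such as DPO consume this (prompt, chosen, rejected) structure.
for record in preference_data:
    assert {"prompt", "chosen", "rejected"} <= record.keys()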
Prefix Tuning
Definition
A parameter-efficient method that trains continuous prefix vectors prepended to each layer's attention inputs, leaving the original model weights frozen.
Prompt Tuning
Definition
A parameter-efficient method that learns soft prompt embeddings instead of changing the full model.
QLoRA
Definition
Quantized LoRA, a memory-efficient fine-tuning method that combines quantization with LoRA adapters.
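In practice this is often set up with the Hugging Face transformers and peft libraries; the model id and exact settings below are illustrative assumptions, and running it requires a GPU with bitsandbytes installed.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the frozen base model in 4-bit precision to cut memory use.
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_quant_type="nf4",
                                bnb_4bit_compute_dtype=torch.bfloat16)
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf",  # illustrative model id
                                            quantization_config=bnb_config)

# Attach small LoRA adapters; they stay in higher precision and are the only trained weights.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(base, lora)
model.print_trainable_parameters()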
Reward Model
Definition
A model trained to score outputs so a fine-tuning or alignment process can prefer better responses.
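A rough stand-in sketch of the idea: an encoder followed by a scalar scoring head. Real reward models are typically fine-tuned language models; the sizes and names here are invented for illustration.

import torch
import torch.nn as nn

class TinyRewardModel(nn.Module):
    # Hypothetical stand-in: an encoder followed by a scalar "reward" head.
    def __init__(self, hidden_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(128, hidden_dim), nn.ReLU())
        self.score_head = nn.Linear(hidden_dim, 1)

    def forward(self, features):
        return self.score_head(self.encoder(features)).squeeze(-1)

# Higher scores mean "preferred"; alignment training then favors higher-scoring outputs.
rm = TinyRewardModel()
scores = rm(torch.randn(4, 128))   # one scalar score per candidate response
print(scores.shape)                # torch.Size([4])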
RLAIF
Definition
Reinforcement Learning from AI Feedback, where AI-generated preferences help guide model alignment.
RLHF
Definition
Reinforcement Learning from Human Feedback, where human preferences guide model behavior after pretraining.
SFT
Definition
Supervised Fine-Tuning, where a model is trained on labeled examples of desired instructions and responses.
Supervised Fine-Tuning
Definition
Training a pretrained model on labeled prompt-response examples.
Synthetic Data
Definition
Artificially generated examples used for testing, evaluation, training, simulation, or privacy-preserving development.
Test Set
Definition
Held-out data used for final evaluation after training decisions are complete.
Training Data
Definition
The examples used to update model parameters during training.
Validation Set
Definition
Held-out data used during development to tune settings and detect overfitting.
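A minimal sketch of how the training, validation, and test splits relate; the ratios and toy data are illustrative assumptions.

import random

# Hypothetical labeled examples; the 80/10/10 split below is illustrative.
examples = list(range(1000))
random.seed(0)
random.shuffle(examples)

train_set = examples[:800]          # used to update model parameters
validation_set = examples[800:900]  # used during development to tune settings and detect overfitting
test_set = examples[900:]           # held out until all training decisions are final

print(len(train_set), len(validation_set), len(test_set))  # 800 100 100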