How to Add Special Tokens to LLMs Safely

Learn how to add special tokens to LLMs during fine-tuning without causing catastrophic forgetting. Our guide covers smart initialization and PEFT/LoRA.
Bao Bao Suan Fa Bi Ji
6 min read
#add special tokens to LLM · #LLM fine-tuning · #catastrophic forgetting · #PEFT



How to Add Special Tokens to LLMs Without Catastrophic Forgetting

Adding special tokens to a Large Language Model (LLM) during Supervised Fine-Tuning (SFT) is a common technique for structuring conversations with tokens like <|user|> or for custom tasks. While it seems simple, this process can lead to a serious issue known as catastrophic forgetting, degrading your model's performance.

The problem is that the new tokens are unknown to the pre-trained LLM. Their vectors in the model's embedding matrix are randomly initialized, and the output layer (LM Head) has no trained weights for them, so the logits it produces for them are meaningless. Introducing these "noisy" tokens during LLM fine-tuning can destabilize the model and erase its pre-trained knowledge.

This guide explains how to add special tokens to LLMs safely. We'll cover the right way to initialize new token embeddings and use Parameter-Efficient Fine-Tuning (PEFT) to preserve your model's core capabilities.

The Risk of Catastrophic Forgetting When Adding Tokens

When you add a new special token without a proper strategy, you risk catastrophic forgetting by introducing instability at three key points in the model architecture.

1. Randomly Initialized Token Embeddings

When you call model.resize_token_embeddings(), the rows allocated for new tokens are, by default in most transformers versions, randomly initialized (recent releases expose a mean_resizing option, but you should not rely on library defaults). These random vectors are essentially noise compared to the model's existing embeddings, which have been trained on trillions of tokens to capture rich semantic meaning. The LLM has no basis for interpreting this new, random information.
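As a quick illustration, here is a minimal sketch that resizes the embedding matrix and inspects the appended rows. The "gpt2" checkpoint is purely an illustrative small model, and exactly how the new rows are filled depends on your transformers version:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative small checkpoint; any causal LM works the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

old_vocab_size = model.get_input_embeddings().weight.shape[0]
tokenizer.add_special_tokens({"additional_special_tokens": ["<|user|>"]})
model.resize_token_embeddings(len(tokenizer))

emb = model.get_input_embeddings().weight.data
# Trained rows carry learned structure; the appended rows do not.
# (Older transformers versions draw them randomly; newer ones may
# mean-initialize via the mean_resizing flag -- either way, the new
# rows carry no task-specific signal yet.)
print("trained rows, mean norm:", emb[:old_vocab_size].norm(dim=-1).mean().item())
print("new rows, mean norm:   ", emb[old_vocab_size:].norm(dim=-1).mean().item())
```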

2. Unstable LM Head Logits

The same issue affects the model's output layer, the LM Head, which predicts the next token. A new, randomly initialized weight row is added for your special token, so the logit it produces is meaningless at first. The model is essentially guessing whenever it tries to generate this token, leading to poor output quality.

3. Large and Unstable Gradients

During the initial stages of SFT, the model generates a large loss signal when it encounters these unlearned tokens. This triggers large, unstable gradients that propagate backward through the network, disrupting the model's carefully pre-trained weights and erasing its general knowledge.

How to Add Special Tokens to LLMs: A Step-by-Step Guide

To safely add special tokens during LLM fine-tuning and avoid catastrophic forgetting, follow these steps.

Step 1: Update the Tokenizer and Model Vocabulary

First, you must make both the tokenizer and the model aware of the new tokens.

  • Update the Tokenizer: Use the tokenizer's built-in methods to add your new special tokens. This ensures it can correctly encode them.
  • Resize Model Embeddings: Call model.resize_token_embeddings() to expand the model's vocabulary size and allocate space in the embedding matrix for the new tokens. Both steps are sketched below.
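A minimal sketch of Step 1 using the transformers library; the checkpoint name is an illustrative choice, so substitute your own base model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative Llama-style checkpoint; replace with your base model.
model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# 1. Register the new special tokens with the tokenizer.
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|user|>", "<|assistant|>"]}
)

# 2. Grow the embedding matrix (and tied LM head, if any) to the new vocab size.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; new vocab size: {len(tokenizer)}")
```

The later snippets in this guide reuse model, tokenizer, and num_added from this step.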

Step 2: Smartly Initialize New Token Embeddings

This is the most critical step to prevent performance degradation. Never use the default random initialization. Instead, choose one of these methods to initialize new token embeddings.

Method 1: Initialize with the Mean of Existing Embeddings

This is the most robust approach. By initializing the new token's vector with the average of all existing token embeddings, you give it a neutral starting point. This places the new token at the "center of gravity" of the model's semantic space, reducing initial instability.
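Continuing from the Step 1 snippet (reusing model and num_added), a sketch of mean initialization for both the input embeddings and the LM head:

```python
import torch

with torch.no_grad():
    input_emb = model.get_input_embeddings().weight
    output_emb = model.get_output_embeddings().weight  # same tensor if weights are tied

    # Average only the pre-trained rows (everything before the appended rows),
    # then write that mean into each new row.
    input_emb[-num_added:] = input_emb[:-num_added].mean(dim=0, keepdim=True)
    output_emb[-num_added:] = output_emb[:-num_added].mean(dim=0, keepdim=True)
```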

Method 2: Initialize with Semantically Similar Tokens

If your new token has a clear meaning, you can initialize its embedding using the vector(s) of similar existing tokens. For example, to initialize <|user|>, you could average the embeddings for "user," "User," and "human." This anchors the new token in a relevant semantic area from the start.
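A sketch of this approach for <|user|>, again reusing model and tokenizer from Step 1; the anchor words are illustrative choices:

```python
import torch

with torch.no_grad():
    emb = model.get_input_embeddings().weight
    new_id = tokenizer.convert_tokens_to_ids("<|user|>")

    # Collect the ids of semantically related tokens (each word may split
    # into multiple sub-word pieces, depending on the tokenizer).
    anchor_ids = []
    for word in ["user", "User", "human"]:
        anchor_ids.extend(tokenizer.encode(word, add_special_tokens=False))

    # Anchor the new token at the centroid of its semantic neighbors.
    # Mirror the same initialization in the LM head if weights are not tied.
    emb[new_id] = emb[anchor_ids].mean(dim=0)
```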

Step 3: Use Parameter-Efficient Fine-Tuning (PEFT)

How you train the model after initialization is just as important. PEFT methods like LoRA are the best defense against catastrophic forgetting.

LoRA (Low-Rank Adaptation) works by freezing the original pre-trained LLM weights and adding small, trainable "adapter" matrices to certain layers. This focuses the training effort only on learning the new task and the function of your special tokens, without overwriting the model's core knowledge.

Pro Tip: When using LoRA, leverage the modules_to_save parameter in your configuration. This lets you specify that the embedding layer and LM Head should be fully trained, not just adapted. This is crucial because the new token embeddings and their corresponding logits must be learned from scratch. This gives you the best of both worlds: targeted training for new tokens and strong protection for the base model.
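A sketch of such a configuration with the peft library; the target module names match Llama-style architectures and will differ for other model families:

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                    # illustrative rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    # Fully train (not just adapt) the layers that host the new tokens.
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Note that modules_to_save creates full trainable copies of those layers, which are saved alongside the LoRA adapter weights.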

Alternative: Staged Full Fine-Tuning

If you cannot use PEFT, a staged or "warm-up" approach makes full fine-tuning less risky, though it remains a less optimal alternative to LoRA.

  1. Stage 1 (Warm-up): Freeze all model layers except for the embed_tokens and lm_head. Train for a few steps on data containing your new tokens. This allows the new embeddings to stabilize.
  2. Stage 2 (Full SFT): Unfreeze all layers and proceed with your complete Supervised Fine-Tuning process. This is more complex and computationally expensive than using LoRA. The freezing logic for both stages is sketched below.
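A sketch of the freezing logic, reusing model from the earlier snippets; the warm-up training loop itself is elided:

```python
# Stage 1 (warm-up): freeze everything except the embeddings and LM head.
for param in model.parameters():
    param.requires_grad = False
for param in model.get_input_embeddings().parameters():
    param.requires_grad = True
for param in model.get_output_embeddings().parameters():
    param.requires_grad = True

# ... run a short training loop here on data containing the new tokens ...

# Stage 2 (full SFT): unfreeze all layers and continue training.
for param in model.parameters():
    param.requires_grad = True
```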

Checklist for Adding Special Tokens to LLMs

Here is a quick checklist for adding special tokens during SFT:

  1. Add Tokens: Use tokenizer.add_special_tokens to update the tokenizer.
  2. Resize Model: Call model.resize_token_embeddings to expand the model's vocabulary.
  3. Initialize Smartly: Initialize new token embeddings with the mean of existing embeddings for stability.
  4. Use PEFT: Fine-tune using a PEFT method like LoRA to prevent catastrophic forgetting. Use modules_to_save for the embedding and LM head layers.
  5. Use High-Quality Data: Ensure your SFT dataset correctly and consistently uses the new special tokens. The model learns their function from your examples.
  6. Evaluate Thoroughly: After fine-tuning, test your model on general benchmarks (e.g., MMLU, GLUE) to ensure its core capabilities haven't regressed, in addition to testing your specific task.

By following these strategies, you can enhance your LLM for specific tasks while preserving the powerful, foundational knowledge that makes it so capable. This methodical approach ensures your LLM fine-tuning efforts build upon, rather than undermine, the model's core competence.

Key Takeaways

• Use smart initialization techniques to prevent catastrophic forgetting in LLMs.
• Implement PEFT/LoRA methods during fine-tuning for better token integration.
• Regularly evaluate model performance to detect and mitigate potential forgetting issues.
