Hello, I encountered the following error when finetuning Llama2: some indices in the input tensor for the embedding layer are outside the vocabulary dimension. I suspect the problem occurs when the special tokens are added to the HF tokenizer vocabulary in tokenizer.py.

For myself, I worked around it by changing the corresponding line in tokenizer.py. Is this the right way?
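For context, the same mismatch can be reproduced with plain HF Transformers: `tokenizer.vocab_size` does not count added special tokens, while `len(tokenizer)` does, so an embedding table sized from `vocab_size` can receive out-of-range token ids. Below is a minimal sketch of that behaviour (not the actual Megatron-LM tokenizer.py code; the checkpoint name is only an example):

```python
# Minimal sketch (not the actual Megatron tokenizer.py): reproduces the
# out-of-range embedding indices with plain HF Transformers.
# The checkpoint name below is only an example.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Llama-2 ships without a pad token; adding one grows the tokenizer
# but not the model's embedding table.
tokenizer.add_special_tokens({"pad_token": "<pad>"})

print(tokenizer.vocab_size)                         # base vocab only, e.g. 32000
print(len(tokenizer))                               # base vocab + added tokens, e.g. 32001
print(model.get_input_embeddings().num_embeddings)  # still 32000 -> id 32000 is out of range

# One common fix: size the embedding table from len(tokenizer),
# not from tokenizer.vocab_size.
model.resize_token_embeddings(len(tokenizer))
```

If I understand the error correctly, the equivalent question on the Megatron side is whether the (padded) vocab size used to build the embedding covers the highest token id the tokenizer can emit after the special tokens are added.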