
[D] Adding new vocab tokens + fine-tuning LLMs to follow instructions is ineffective

I've been experimenting with instruction-tuning LLMs and VLMs, either adding new specialized tokens to the corresponding tokenizer/processor or not. The setup is typical: mask the instructions/prompts (compute loss only on the responses/answers) and apply CE loss. Nothing special, standard SFT.
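The masking step described above can be sketched in plain Python (hypothetical token ids; real pipelines do this through a tokenizer, with `-100` as the label value that cross-entropy implementations conventionally ignore):

```python
IGNORE_INDEX = -100  # label value skipped by CE loss in common frameworks

def build_labels(prompt_ids, response_ids):
    """Mask prompt tokens so the loss is computed only on the response."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Example with made-up ids: prompt = [1, 5, 9], response = [7, 2]
input_ids, labels = build_labels([1, 5, 9], [7, 2])
print(input_ids)  # [1, 5, 9, 7, 2]
print(labels)     # [-100, -100, -100, 7, 2]
```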

However, I've observed better validation losses and output quality from models trained with their base tokenizer/processor than from models trained with a modified tokenizer. Any thoughts on this? Feel free to shed light on it.

(My hunch: it's difficult to increase the likelihood of these newly added tokens, and the model simply can't learn them properly.)
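One commonly discussed mitigation for this hunch is to initialize each new token's embedding as the mean of the existing embeddings rather than random noise, so the new token's initial logits are not far out of distribution. A minimal pure-Python sketch (real code would operate on the model's embedding matrix after resizing it for the new vocab entries):

```python
def mean_init_new_row(embedding_matrix):
    """Append a new embedding row equal to the column-wise mean of existing rows."""
    n = len(embedding_matrix)
    dim = len(embedding_matrix[0])
    new_row = [sum(row[d] for row in embedding_matrix) / n for d in range(dim)]
    embedding_matrix.append(new_row)
    return new_row

# Toy 2-row, 2-dim embedding table
emb = [[1.0, 2.0], [3.0, 4.0]]
print(mean_init_new_row(emb))  # [2.0, 3.0]
```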

submitted by /u/AnyIce3007
