python - SFTTrainer Error: prepare_model_for_kbit_training() got an unexpected keyword argument 'gradient_checkpointing'
I'm trying to fine-tune a model using SFTTrainer from trl.
These are my SFTConfig arguments:
from trl import SFTConfig

training_arguments = SFTConfig(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard",
    dataset_text_field="instruction",
    max_seq_length=None,
    packing=False,
    gradient_checkpointing=False,
)
and this is my SFTTrainer block:
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    args=training_arguments,
)
The error comes from the internal method SFTTrainer._prepare_model_for_kbit_training:

    """Prepares a quantized model for kbit training."""
    prepare_model_kwargs = {
        "use_gradient_checkpointing": args.gradient_checkpointing,
        "gradient_checkpointing_kwargs": args.gradient_checkpointing_kwargs or {},
    }
I tried passing gradient_checkpointing as False and gradient_checkpointing_kwargs as an empty dictionary, but no luck.
How can I avoid this error?
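
For reference, this TypeError is usually reported when a newer trl forwards gradient_checkpointing_kwargs to an older peft whose prepare_model_for_kbit_training() does not accept that parameter, so checking the installed versions is a reasonable first step (a minimal sketch; the version-mismatch diagnosis is an assumption on my part, not something the traceback confirms):

import peft
import trl

# If the installed peft predates the gradient_checkpointing_kwargs parameter
# of prepare_model_for_kbit_training(), upgrading peft (pip install -U peft)
# should remove the TypeError regardless of the SFTConfig flags.
print("peft:", peft.__version__)
print("trl:", trl.__version__)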
Comment: With gradient_checkpointing_kwargs={'use_reentrant':False}? – rehaqds
1 Answer
Using gradient_checkpointing_kwargs={'use_reentrant':False} instead of gradient_checkpointing=False might work.
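
A minimal sketch of that change, keeping the rest of the config as in the question (SFTConfig inherits both gradient_checkpointing fields from transformers.TrainingArguments):

from trl import SFTConfig

training_arguments = SFTConfig(
    output_dir=output_dir,
    # ... other arguments unchanged from the question ...
    # Drop gradient_checkpointing=False and pass non-reentrant
    # checkpointing kwargs instead, as suggested above.
    gradient_checkpointing_kwargs={"use_reentrant": False},
    dataset_text_field="instruction",
    packing=False,
)

If the TypeError persists even with this change, upgrading peft so that its prepare_model_for_kbit_training() accepts gradient_checkpointing_kwargs is the other common fix.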