python - SFTTrainer Error: prepare_model_for_kbit_training() got an unexpected keyword argument 'gradient_checkpointing'
I'm trying to fine-tune a model using SFTTrainer from trl.
These are my SFTConfig arguments:
from trl import SFTConfig

training_arguments = SFTConfig(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard",
    dataset_text_field="instruction",
    max_seq_length=None,
    packing=False,
    gradient_checkpointing=False,
)
and this is my SFTTrainer block:
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    args=training_arguments,
)
The error comes from the internal method SFTTrainer._prepare_model_for_kbit_training:

    """Prepares a quantized model for kbit training."""
    prepare_model_kwargs = {
        "use_gradient_checkpointing": args.gradient_checkpointing,
        "gradient_checkpointing_kwargs": args.gradient_checkpointing_kwargs or {},
    }
I tried passing gradient_checkpointing as False and gradient_checkpointing_kwargs as an empty dictionary, but no luck.
How can I avoid this error?
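
For reference, this TypeError is usually reported when a newer trl forwards gradient_checkpointing_kwargs to an older peft whose prepare_model_for_kbit_training() does not accept that parameter, so checking the installed versions is a reasonable first step (a minimal sketch; the version-mismatch diagnosis is an assumption on my part, not something the traceback confirms):

import peft
import trl

# If the installed peft predates the gradient_checkpointing_kwargs parameter
# of prepare_model_for_kbit_training(), upgrading peft (pip install -U peft)
# should remove the TypeError regardless of the SFTConfig flags.
print("peft:", peft.__version__)
print("trl:", trl.__version__)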
Comment: With gradient_checkpointing_kwargs={'use_reentrant':False}? – rehaqds
1 Answer
Using gradient_checkpointing_kwargs={'use_reentrant':False} instead of gradient_checkpointing=False might work.
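
A minimal sketch of that change, keeping the rest of the config as in the question (SFTConfig inherits both gradient_checkpointing fields from transformers.TrainingArguments):

from trl import SFTConfig

training_arguments = SFTConfig(
    output_dir=output_dir,
    # ... other arguments unchanged from the question ...
    # Drop gradient_checkpointing=False and pass non-reentrant
    # checkpointing kwargs instead, as suggested above.
    gradient_checkpointing_kwargs={"use_reentrant": False},
    dataset_text_field="instruction",
    packing=False,
)

If the TypeError persists even with this change, upgrading peft so that its prepare_model_for_kbit_training() accepts gradient_checkpointing_kwargs is the other common fix.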