I am working on a LLaMA fine-tuning task. When I train on a single GPU, the program runs fine:
import os
import torch
from transformers import AutoModelForCausalLM

os.environ["CUDA_VISIBLE_DEVICES"] = "0"
os.environ["TOKENIZERS_PARALLELISM"] = "false"

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model_name = "../models/llama3_8b/"

# compute_dtype and bnb_config come from my QLoRA setup (sketched below)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map=device,
    torch_dtype=compute_dtype,
    quantization_config=bnb_config,
)
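For context, compute_dtype and bnb_config follow the usual 4-bit QLoRA recipe, something like this (a sketch with typical values, not necessarily my exact settings):

from transformers import BitsAndBytesConfig

# Illustrative 4-bit QLoRA quantization settings (sketch)
compute_dtype = torch.bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=True,
)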
But when I tried to use multiple GPUs for fine-tuning, an error occurred. The modified code is as follows:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    # device_map=device,
    device_map="auto",  # modification: shard the model across all visible GPUs
    torch_dtype=compute_dtype,
    quantization_config=bnb_config,
)
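With device_map="auto", the checkpoint gets sharded across the visible GPUs; a quick sanity check is to print the placement map that transformers records on the model:

# Minimal check: how did device_map="auto" place the modules?
# Indices are relative to CUDA_VISIBLE_DEVICES, so GPUs 3,4 show up as 0,1.
print(model.hf_device_map)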
from peft import LoraConfig

peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
training_arguments = TrainingArguments(
    ...
    local_rank=int(os.getenv("LOCAL_RANK", -1)),  # modification: the env var arrives as a string, so cast it
    ddp_find_unused_parameters=False,  # modification
)
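As far as I understand, these DDP arguments only take effect under a distributed launcher that spawns one process per GPU and sets LOCAL_RANK, i.e. something like:

# Hypothetical launch for DDP: torchrun starts one process per GPU and sets LOCAL_RANK
CUDA_VISIBLE_DEVICES=3,4 torchrun --nproc_per_node=2 llama3.py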
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    args=training_arguments,
    train_dataset=train_data,
    # eval_dataset=eval_data,
    peft_config=peft_config,
    dataset_text_field="text",
    tokenizer=tokenizer,
    max_seq_length=max_seq_length,
    packing=False,
    dataset_kwargs={
        "add_special_tokens": False,
        "append_concat_token": False,
    },
)
trainer.train()
The error is as follows:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:0!
Launch command:
CUDA_VISIBLE_DEVICES=3,4 python llama3.py
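My current understanding (which may be wrong) is that device_map="auto" splits a single model copy across both GPUs, while a plain python launch runs one process, so the DDP settings never take effect and activations end up on different devices. If that is the issue, I suspect the fix is to pin the whole model to each process's own GPU and launch with torchrun, roughly like this (untested sketch):

# Sketch (untested): one full model copy per DDP process, on that process's GPU.
# LOCAL_RANK is set by torchrun; indices are relative to CUDA_VISIBLE_DEVICES.
local_rank = int(os.getenv("LOCAL_RANK", 0))
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map={"": local_rank},  # pin every module to this process's GPU
    torch_dtype=compute_dtype,
    quantization_config=bnb_config,
)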
Does anyone know how to solve it?