I'm new to ExecuTorch and am trying to convert the mDeBERTa model to run on edge devices. I was able to export the model at first, but after quantization with XNNPACKQuantizer the quantized graph fails to export with the following error:
torch._dynamo.exc.TorchRuntimeError: Failed running call_function aten.gather.default(*(FakeTensor(..., size=(12, 28, 512)), -1, FakeTensor(..., size=(12, 28, 28))), **{}): gather(): Expected dtype int64 for index, but got torch.float32
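For context, aten.gather does hard-require an int64 index tensor; here is a minimal standalone repro of just that constraint, using the same shapes as in the error above (nothing here depends on my model):

import torch

src = torch.randn(12, 28, 512)
idx = torch.zeros(12, 28, 28, dtype=torch.int64)
torch.gather(src, -1, idx)          # works
torch.gather(src, -1, idx.float())  # RuntimeError: gather(): Expected dtype int64 for index

So something in the quantized graph seems to be feeding a float32 tensor into the index argument.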
Please find the code snippet below:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
#from torch.export import export_for_training
#from torch._export import exported_for_training
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
get_symmetric_quantization_config,
XNNPACKQuantizer,
)
# For ExecuTorch
from torch.export import export, ExportedProgram
from executorch.exir import to_edge
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
premise = "Angela Merkel ist eine Politikerin in Deutschland und Vorsitzende der CDU"
hypothesis = "Emmanuel Macron is the President of France"
model_name = "MoritzLaurer/mDeBERTa-v3-base-mnli-xnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, _fast_init=False, torchscript=True)
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
input_shape = [251000, 768]  # unused
input_data = inputs["input_ids"].to(dtype=torch.int64)
print("Input Data: ", input_data, " Datatype: ", input_data.dtype)
aten_dialect: ExportedProgram = export(model, (input_data,))
print("Got Aten operation")
quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
#prepared_graph = prepare_pt2e(aten_dialect, quantizer)
exported_model = capture_pre_autograd_graph(model, (input_data,))
prepared_graph = prepare_pt2e(exported_model, quantizer)
converted_graph = convert_pt2e(prepared_graph)
print("Quantized Graph")
print("Input Data: ", input_data, " Datatype: ", input_data.type())
inpdatai64 = torch.tensor(input_data.tolist(), dtype=torch.int64)
print("Input Data Datatype: ", inpdatai64.type())
# ERROR: The following line results in error with the gather operation
aten_dialect1: ExportedProgram = export(converted_graph, (inpdatai64,))
print("ATen Dialect Graph")
Note that I have also tried passing dynamic shapes to export, but that ended with the same error; see the sketch below.
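For reference, this is roughly how I specified the dynamic shapes; the Dim name seq_len and its bounds are just placeholders I picked:

from torch.export import Dim

# Mark the sequence dimension (dim 1 of input_ids) as dynamic
seq_len = Dim("seq_len", min=2, max=512)
aten_dialect1 = export(converted_graph, (inpdatai64,), dynamic_shapes=({1: seq_len},))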
What I observe is that export creates FakeTensors while tracing the model. Even though I passed a real input tuple, why does it trace with FakeTensors? Is there a known issue with FakeTensor, or am I missing something in how I pass the inputs to export?
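For what it's worth, my understanding is that export always traces with FakeTensors, which are metadata-only (shape/dtype) stand-ins for the real inputs, so seeing one may be expected. A quick check with the internal FakeTensorMode (torch._subclasses, not a stable API) suggests the input dtype survives the faking:

import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# Internal/unstable API, used here only for debugging
fake_mode = FakeTensorMode()
real = torch.ones(12, 28, dtype=torch.int64)
fake = fake_mode.from_tensor(real)
print(fake.shape, fake.dtype)  # torch.Size([12, 28]) torch.int64

This makes me suspect the float32 index is produced inside the quantized graph rather than coming from my inputs.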