python - RuntimeError: CUDA error: device-side assert triggered - Compile with TORCH_USE_CUDA_DSA to enable device-side assertions
I’m encountering the following error while running a TTS (text-to-speech) model on a GPU using PyTorch:
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
This issue occurs when multiple requests are made in quick succession using threading. When requests are spaced out (i.e., single or slower sequential requests), the function works as expected, and no errors are thrown. However, when multiple threads simultaneously process text, the error is triggered inconsistently.
Here is the function I use to split text into smaller parts, synthesize it into audio, and save the results as WAV files:
!git clone https://github.com/coqui-ai/TTS.git
%cd TTS
!pip install -r requirements.txt
!pip install .
!pip install numpy==1.24.3
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts
import torch
import soundfile as sf
from pydub import AudioSegment
import base64
import TTS.tts.layers.xtts.tokenizer as xttsTokenizer
import numpy as np
import io
import os
config_path = "/content/drive/MyDrive/XTTS-v2/config.json"
model_path = "/content/drive/MyDrive/XTTS-v2/"
config = XttsConfig()
config.load_json(config_path)
model = Xtts.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir=model_path, eval=True)
model.cuda()
def TTS_XTTSv2(prompt, speaker_wav_path, id, lang, speed, text_split_length=226):
    # Fall back to language detection before lang is used anywhere.
    if lang is None or lang == "":
        lang = lang_detect(prompt)  # lang_detect is defined elsewhere in the notebook
    split_tts_sentence = xttsTokenizer.split_sentence(text=prompt, lang=lang, text_split_length=text_split_length)
    output_files = []
    for i, part in enumerate(split_tts_sentence):
        # voice_test_path is a global output directory defined elsewhere.
        splitted_text_voice_output_path = f"{voice_test_path}/{id}_{i+1}.wav"
        outputs = model.synthesize(
            part,
            config=config,
            speaker_wav=speaker_wav_path,
            language=lang,
            speed=speed
        )
        wav_output = outputs['wav']
        sf.write(splitted_text_voice_output_path, wav_output, 24000)  # XTTS outputs 24 kHz audio
        output_files.append(splitted_text_voice_output_path)
    return output_files
The error is triggered at the model.synthesize step, which is computationally heavy and runs on the GPU. This function is called within a threaded API using threading.Thread to parallelize text processing.
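For reference, the dispatch looks roughly like this (a simplified sketch; generate_tts_response is the wrapper named in the traceback, and incoming_requests stands in for the real request source):

import threading

def generate_tts_response(prompt, speaker_wav_path, id, lang, speed):
    # Per-request wrapper (simplified; the real one also builds the API response).
    TTS_XTTSv2(prompt, speaker_wav_path, id, lang, speed)

# One thread per request, so several threads can reach model.synthesize at once.
for req in incoming_requests:
    threading.Thread(target=generate_tts_response, kwargs=req).start()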
Full error traceback (three worker threads, Thread-33 through Thread-35, hit the same assert at once, so their tracebacks were interleaved in the console; the reconstruction below follows one thread, and the others failed in nearby frames such as transformers/generation/stopping_criteria.py, line 494):
Exception in thread Thread-33 (generate_tts_response):
Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-9-d8f16acdbf2d>", line 1168, in generate_tts_response
  File "<ipython-input-9-d8f16acdbf2d>", line 1034, in TTS_XTTSv2
  File "/usr/local/lib/python3.11/dist-packages/TTS/tts/models/xtts.py", line 419, in synthesize
    return self.full_inference(text, speaker_wav, language, **settings)
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/TTS/tts/models/xtts.py", line 488, in full_inference
    return self.inference(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/TTS/tts/models/xtts.py", line 541, in inference
    gpt_codes = self.gpt.generate(
  File "/usr/local/lib/python3.11/dist-packages/TTS/tts/layers/xtts/gpt.py", line 590, in generate
    gen = self.gpt_inference.generate(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py", line 2252, in generate
    result = self._sample(
  File "/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py", line 3251, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/TTS/tts/layers/xtts/gpt_inference.py", line 94, in forward
    emb = emb + self.pos_embedding.get_fixed_embedding(
  File "/usr/local/lib/python3.11/dist-packages/TTS/tts/layers/xtts/gpt.py", line 40, in get_fixed_embedding
    return self.emb(torch.tensor([ind], device=dev)).unsqueeze(0)
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/sparse.py", line 190, in forward
    return F.embedding(
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/functional.py", line 2551, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Here’s what I’ve tried so far:
- Validated input data: ensured the prompt text, speaker_wav_path, and lang inputs are in the expected format.
- Checked GPU usage: monitored GPU memory with nvidia-smi. Memory usage appears normal, and there’s no indication of overflow.
- Tokenization: confirmed that xttsTokenizer.split_sentence splits the text correctly and stays within the text_split_length limit (see the extra bounds check sketched after this list).
Despite these checks, the error persists. I expect the function to run without errors, splitting the text, synthesizing audio, and saving a WAV file for each text part.
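Since a device-side assert inside F.embedding is classically an out-of-range index, one extra sanity check worth running is to confirm every token id fits the GPT text-embedding table. A minimal sketch, assuming the attribute paths model.tokenizer and model.gpt.text_embedding (these may differ across TTS versions):

# Hypothetical bounds check: verify no token id exceeds the embedding table.
vocab_size = model.gpt.text_embedding.num_embeddings  # assumed attribute path
for part in split_tts_sentence:
    ids = model.tokenizer.encode(part, lang)  # assumed encode signature
    bad = [t for t in ids if t >= vocab_size]
    assert not bad, f"token ids {bad} out of range for vocab size {vocab_size}"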
Comment (Karl, Feb 13 at 19:21): I don't have an answer, but this error is usually caused by out-of-bounds indexing. You can try running on the CPU to see if the error is reproduced; you will get a more informative description.
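To try that CPU reproduction, the model can be moved off the GPU for a single request; a bad index then surfaces as an ordinary Python exception pointing at the exact failing op. A minimal sketch, assuming the model fits in host RAM:

# Reproduce on CPU to get a readable error instead of a deferred CUDA assert.
model.cpu()
outputs = model.synthesize(
    "test sentence",
    config=config,
    speaker_wav=speaker_wav_path,
    language="en",
    speed=1.0,
)
model.cuda()  # move back to the GPU afterwards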
1 Answer
First thing I'd try is to build PyTorch with TORCH_USE_CUDA_DSA enabled, to see if there are underlying issues that don't affect single-threaded runs but become more prevalent under multithreading. The PyTorch build documentation describes how to set that up.
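Short of rebuilding PyTorch, you can usually recover the true failing call site by forcing synchronous kernel launches, since an asynchronous CUDA assert is otherwise reported at whatever op happens to synchronize next. Set this before the process first touches the GPU (in a notebook, restart the runtime and run it in the first cell):

import os
# Synchronous launches make the traceback point at the kernel that asserted.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"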
Alternatively, though you lose some parallelism, you could guard model.synthesize with a lock so only one thread can run it at a time:

import threading

synthesize_lock = threading.Lock()  # create this once, outside your function

def TTS_XTTSv2(prompt, speaker_wav_path, id, lang, speed, text_split_length=226):
    ...
    for i, part in enumerate(split_tts_sentence):
        splitted_text_voice_output_path = f"{voice_test_path}/{id}_{i+1}.wav"
        with synthesize_lock:  # ensure exclusive access to the model
            outputs = model.synthesize(
                part,
                config=config,
                speaker_wav=speaker_wav_path,
                language=lang,
                speed=speed
            )
        ...
Good luck and lemme know how it goes
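A note on the trade-off: with the lock in place the GPU work is fully serialized, so throughput matches sequential processing; the threads only overlap the text splitting and file I/O. If the lock helps but serializes too much of your pipeline, another pattern worth sketching (untested against your setup) is a single owner thread for the model, fed through a queue, so request threads never touch CUDA at all:

import queue
import threading

tts_jobs = queue.Queue()

def tts_worker():
    # The only thread that ever calls into the model, so GPU calls never overlap.
    while True:
        part, kwargs, box, done = tts_jobs.get()
        try:
            box["outputs"] = model.synthesize(part, **kwargs)
        except Exception as exc:  # propagate failures back to the caller
            box["error"] = exc
        finally:
            done.set()
            tts_jobs.task_done()

threading.Thread(target=tts_worker, daemon=True).start()

def synthesize_serialized(part, **kwargs):
    # Called from any request thread; blocks until the worker finishes the job.
    box, done = {}, threading.Event()
    tts_jobs.put((part, kwargs, box, done))
    done.wait()
    if "error" in box:
        raise box["error"]
    return box["outputs"]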