I am working with pytorch-forecasting to create a TimeSeriesDataSet where I have 30 target variables that I want to predict. However, when I pass this dataset to a DataLoader, I encounter an issue:
Expected Behavior:
Since I have 30 target variables, I expect TimeSeriesDataSet to return a batch where the targets are in the shape (batch_size, 30) as a single torch.Tensor. The dataset should be structured so that the DataLoader can correctly package it into mini-batches without issues. In other words, I expect each batch to contain:
- A dict of inputs with the necessary features.
- A torch.Tensor for the targets, with shape (batch_size, 30).
Actual Behavior:
Instead, TimeSeriesDataSet returns a tuple with two elements:
- The first element is a dict containing the input tensors, which is fine.
- The second element is another tuple with two elements:
  - sample[1][0]: A list of 30 tensors instead of a single tensor.
  - sample[1][1]: None, which causes an error when passed to PyTorch's default_collate.
Error Message from DataLoader:
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts, or lists; found <class 'NoneType'>
This suggests that PyTorch cannot handle the None value being returned by TimeSeriesDataSet.
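To make the failure concrete without loading the real data, here is a minimal sketch that imitates the sample layout described above: a dict of inputs, then a tuple holding a list of 30 target tensors and None. The helper names `make_sample` and `collate_stacking_targets`, the `encoder_cont` key, and the assumption that each target tensor has shape (prediction_length,) are illustrative and not taken from pytorch-forecasting itself. It only shows why default_collate rejects the None and one way a custom collate function could produce a (batch_size, 30) target tensor.

```python
import torch
from torch.utils.data import default_collate


def make_sample(n_targets=30, encoder_length=10, prediction_length=1):
    """Build a fake sample with the layout described in the question (hypothetical helper)."""
    x = {"encoder_cont": torch.zeros(encoder_length, n_targets)}
    # One tensor per target, plus None where a sample weight would otherwise go.
    y = ([torch.zeros(prediction_length) for _ in range(n_targets)], None)
    return x, y


def collate_stacking_targets(samples):
    """Collate inputs as usual, drop the None, and stack targets into one tensor.

    Assumes prediction_length == 1, so each sample's targets concatenate to shape (n_targets,).
    """
    inputs = default_collate([x for x, _ in samples])
    targets = torch.stack([torch.cat(y[0]) for _, y in samples])
    return inputs, targets


samples = [make_sample() for _ in range(4)]

try:
    default_collate(samples)
except TypeError as err:
    # default_collate has no rule for NoneType, so it raises here.
    print("default_collate failed:", err)

inputs, targets = collate_stacking_targets(samples)
print(targets.shape)
```

A function like this could be passed to DataLoader through its collate_fn argument. Note that pytorch-forecasting also provides TimeSeriesDataSet.to_dataloader, which builds a DataLoader with the dataset's own collate function, so that route may avoid the error without any custom code.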
Dataset Information:
- 7061 time steps.
- Each row contains 30 numerical values (representing different features).
- Goal: Predict the values for the next step based on the previous 10 time steps.
- The dataset is structured as a single time series (one group).
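As a sanity check on the sizes involved, the number of samples can be estimated with simple sliding-window arithmetic. This is a rough sketch that assumes the dataset only emits full-length windows (10 encoder steps plus 1 prediction step) from the single series. The function name `count_full_windows` is made up for illustration, and the batch size of 32 is the one used in the code in this question.

```python
import math


def count_full_windows(n_steps, encoder_length, prediction_length):
    """Number of contiguous windows of encoder + prediction steps in one series."""
    window = encoder_length + prediction_length
    return max(n_steps - window + 1, 0)


n_samples = count_full_windows(n_steps=7061, encoder_length=10, prediction_length=1)
n_batches = math.ceil(n_samples / 32)  # the last batch is smaller than 32

print(n_samples)  # 7051
print(n_batches)  # 221
```

If len(training) differs from this estimate, the dataset is probably also producing shorter windows or skipping rows, which would be worth checking before debugging the batching itself.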
Code:
import pandas as pd
import torch
from pytorch_forecasting.data.encoders import TorchNormalizer
from pytorch_forecasting import TimeSeriesDataSet, MultiNormalizer
from torch.utils.data import DataLoader
# Load dataset
file_name = 'DATA' # CSV file
data = pd.read_csv(f'{file_name}.CSV')
# Drop unnecessary columns
if "Date" in data.columns:
    data = data.drop(columns=["Date"])
# Add time index
data["time_idx"] = range(len(data))
data["time_idx"] = data["time_idx"].astype(int)
# Add a dummy group column (since all data belongs to one group)
data["group"] = "single_group"
# Rename columns for uniformity
data.columns = ["num_" + str(i+1) for i in range(30)] + ["time_idx", "group"]
# Convert 'group' to category codes
data["group"] = data["group"].astype("category").cat.codes
# Fill any NaN values
if data.isna().sum().sum() > 0:
    print("⚠️ Found NaN values, filling with 0.")
    data.fillna(0, inplace=True)
# TimeSeriesDataSet configuration
max_encoder_length = 10 # Past observations
max_prediction_length = 1 # Future prediction
target_cols = ["num_" + str(i+1) for i in range(30)]
# Create TimeSeriesDataSet
training = TimeSeriesDataSet(
    data=data,
    time_idx="time_idx",
    target=target_cols,  # 30 targets
    group_ids=["group"],
    max_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    time_varying_unknown_reals=target_cols,
    target_normalizer=MultiNormalizer([TorchNormalizer(method="identity") for _ in range(30)]),
    add_relative_time_idx=True,
    add_target_scales=False,
    add_encoder_length=True,
)
# DataLoader
batch_size = 32
train_dataloader = DataLoader(
    training,
    batch_size=batch_size,
    shuffle=False,
)
# DEBUG: Inspect the DataLoader output
for batch in train_dataloader:
print("
Source: PyTorch Forecasting TimeSeriesDataSet Returns None in DataLoader Batch (Stack Overflow, tagged tensorflow)