I'm trying to use torchrl's SyncDataCollector with a DQN I implemented myself in torch. Since the DQN uses Conv2d and Linear layers, I have to calculate the correct input size for the first Linear layer; the size variable in the following net

import torch
import torch.nn as nn
from torchrl.modules import NoisyLinear  # or your own NoisyLinear implementation


class PixelDQN(nn.Module):
    def __init__(self, input_shape, n_actions) -> None:
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(input_shape[0], 32, kernel_size=8, stride=4),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
            nn.ReLU(),
            nn.Flatten(),
        )
        size = self.conv(torch.zeros(1, *input_shape)).size()[-1]  # flattened conv output size for a batched dummy observation
        self.fc_adv = nn.Sequential(
            NoisyLinear(size, 256),
            nn.ReLU(),
            NoisyLinear(256, n_actions),
        )
        self.fc_val = nn.Sequential(
            NoisyLinear(size, 256),
            nn.ReLU(),
            NoisyLinear(256, 1)
        )

    def forward(self, x: torch.Tensor):
        print(x.shape)
        conv = self.conv(x)
        print(conv.shape)
        adv = self.fc_adv(conv)
        val = self.fc_val(conv)
        outp = val + (adv - adv.mean(dim=1, keepdim=True))
        return outp

is responsible for that. As you can see, I expect batched inputs, since I will use a replay buffer and sample batches from it.
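
For reference, here is a standalone sketch of what that dummy forward pass computes for my observation shape (assuming a [4, 84, 84] input, as produced by the env below); the resulting 3136 is the in_features the NoisyLinear layers are built with:

import torch
import torch.nn as nn

# Rebuild only the conv trunk to check the flattened feature size.
conv = nn.Sequential(
    nn.Conv2d(4, 32, kernel_size=8, stride=4),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2),
    nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, stride=1),
    nn.ReLU(),
    nn.Flatten(),
)

# Batched dummy input [1, 4, 84, 84]: spatially 84 -> 20 -> 9 -> 7, so 64 * 7 * 7 = 3136 features.
print(conv(torch.zeros(1, 4, 84, 84)).shape)  # torch.Size([1, 3136])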

I wrap that DQN in the following way and then use the SyncDataCollector:

n_obs = [4,84,84]
n_act = 6

agent = QValueActor(
  module=PixelDQN(n_obs, n_act), in_keys=["pixels"], spec=env.action_spec
)
policy_explore = EGreedyModule(
  env.action_spec, eps_end=EPS_END, annealing_num_steps=ANNEALING_STEPS
)
agent_explore = TensorDictSequential(
  agent, policy_explore
)

collector = SyncDataCollector(
  env,
  agent_explore,
  frames_per_batch=FRAMES_PER_BATCH,
  init_random_frames=INIT_RND_STEPS,
  postproc=MultiStep(gamma=GAMMA, n_steps=N_STEPS)
)

This, however, fails because the SyncDataCollector doesn't batch the observations from the env before passing them to the DQN, so the size calculation no longer matches and the first Linear layer receives an input of the wrong dimension:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x49 and 3136x256)
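
The shapes in the error message line up with an unbatched observation: a [4, 84, 84] tensor goes through Conv2d as a single 3D image, nn.Flatten() then treats the 64 output channels as the batch dimension, and the first NoisyLinear (built for 3136 features) is handed 49 features instead. A minimal reproduction, with a plain nn.Linear standing in for the NoisyLinear:

import torch
import torch.nn as nn

conv = nn.Sequential(
    nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
    nn.Flatten(),
)
fc = nn.Linear(3136, 256)  # stand-in for the first NoisyLinear

obs = torch.zeros(4, 84, 84)  # unbatched observation, as the collector passes it
feat = conv(obs)              # 3D input -> [64, 7, 7] -> Flatten -> [64, 49]
print(feat.shape)             # torch.Size([64, 49])
fc(feat)                      # RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x49 and 3136x256)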

I already tried setting buffer=True in SyncDataCollector. I also tried using

agent_explore = TensorDictSequential(
  UnsqueezeTransform(0, allow_positive_dim=True), agent, policy_explore
)

as this was suggested by ChatGPT, but it didn't seem to have any effect.

I also tried adding the UnsqueezeTransform in my env creation, but that didn't work either. My env looks like this:

def make_env(env_name: str):
    return TransformedEnv(
        GymEnv(env_name, from_pixels=True),
        Compose(
            RewardSum(),
            EndOfLifeTransform(),
            NoopResetEnv(noops=30),
            ToTensorImage(),
            Resize(84, 84),
            GrayScale(),
            FrameSkipTransform(frame_skip=4),
            CatFrames(N=4, dim=-3),
        )
    )

I could pull the size calculation into the forward pass of my PixelDQN and check the size of the input tensor to adapt it, but that seems like a weird thing to do, since it would mean running the size calculation on every single forward pass.
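
For illustration, the kind of per-call shape handling I mean would look roughly like this; instead of literally re-running the size calculation, this sketch just special-cases unbatched input inside forward:

    def forward(self, x: torch.Tensor):
        # Workaround sketch: if the collector hands over an unbatched [C, H, W]
        # observation, add the batch dim here so the flattened size matches
        # what the NoisyLinear layers were built for in __init__.
        if x.dim() == 3:
            x = x.unsqueeze(0)
        conv = self.conv(x)
        adv = self.fc_adv(conv)
        val = self.fc_val(conv)
        return val + (adv - adv.mean(dim=1, keepdim=True))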

1 Answer

I found the solution: I changed to UnsqueezeTransform(-4, in_keys=["pixels"]) within agent_explore, and now I get the desired behaviour.
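
For completeness, a minimal sketch of the wrapping that now works (agent and policy_explore are the same as in the question):

agent_explore = TensorDictSequential(
    UnsqueezeTransform(-4, in_keys=["pixels"]),  # [4, 84, 84] pixels -> [1, 4, 84, 84] before the Q-network
    agent,
    policy_explore,
)

Specifying in_keys=["pixels"] makes the transform act on the image tensor, and the negative dim counts from the end of the pixels shape, so [4, 84, 84] becomes [1, 4, 84, 84], which is what the conv net and the __init__-time size calculation expect.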
