I have a feed-forward neural network that takes 1404 input values (468 3D facial landmark points flattened as [x1, y1, z1, x2, y2, z2, ...]) and is meant to regress 3 output values. All 1404 values belong to one sample, and I feed many samples to this network. The samples come from different subjects, and for each subject I have input data captured from the angles np.arange(-40, 41, 10). If I plot the ground truth over these angles, it forms a cosine curve. For each sample of each subject, I have 3 ground-truth values obtained from SVD, so each of the 3 targets traces a cosine curve (with different parameters) across the angles. I used the architecture below to train an FFN to regress the three values, but after training on data from 1300 subjects the model does not fit very well.
self.input_size = 1404            # 468 landmarks x 3 coordinates, flattened
self.total_output_size = 3        # 3 regression targets per sample
self.encoder = nn.Sequential(
    nn.Linear(self.input_size, 2048),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(2048, 4096),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(4096, 4096),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(4096, 4096),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(4096, 4096),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(4096, 2048),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(2048, 1024),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(1024, 512),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(512, 256),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(256, 128),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(128, 64),
    nn.Tanh(),
    nn.Dropout(0.1),
    nn.Linear(64, self.total_output_size),
)
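To make the input and target shapes concrete, this is roughly what the data looks like (just a sketch with random placeholders, not my actual loading code; the counts follow from the description above):

import numpy as np
import torch

n_subjects = 1300
angles = np.arange(-40, 41, 10)        # 9 angles per subject
n_samples = n_subjects * len(angles)   # 11700 samples in total

# each sample: 468 landmarks x 3 coords flattened to 1404 values
X = torch.randn(n_samples, 1404)       # placeholder inputs
# each sample: 3 SVD-derived targets; per subject each target traces a cosine curve over the angles
y = torch.randn(n_samples, 3)          # placeholder ground truth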
For the above network, I used an MSE loss, an SGD optimizer with lr = 1e-4, batch size = 256, weight_decay = 1e-5, scheduler = StepLR(optimizer, step_size=10, gamma=0.9), and gradient clipping. The validation loss starts at 0.084 and ends at 0.005 after 100 epochs. I tweaked all of the aforementioned hyperparameters, but with very little or no improvement.
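For reference, the training loop is essentially the standard PyTorch pattern (a simplified sketch of the setup described above; model and train_loader are my own objects, and the clipping max_norm of 1.0 is just an example value):

import torch
import torch.nn as nn

criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, weight_decay=1e-5)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.9)

for epoch in range(100):
    for inputs, targets in train_loader:   # batch size = 256
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping (example norm)
        optimizer.step()
    scheduler.step()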
I started with simpler architectures, e.g. 1404 -> 512 -> ReLU -> 256 -> ReLU -> 128 -> ReLU -> 3 and 1404 -> 512 -> ReLU -> 256 -> ReLU -> 128 -> ReLU -> 64 -> ReLU -> 3, then tried adding more 128 or 64 blocks, but those models don't learn anything. Using an optimizer other than SGD didn't help either. Based on my experiments, the diamond-shaped network above worked best for this task. The first baseline is written out below for reference.
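The first of those simpler baselines, written out, was just:

baseline = nn.Sequential(
    nn.Linear(1404, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 3),
)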
Does anyone have any insight or advice to help me with this problem? Thank you in advance!