I am using Hugging Face to build a model that can identify mistakes in a given sentence. Say I have a sentence and a corresponding per-token label as follows:
```python
correct_sentence = "we used to play together."
correct_label = [1, 1, 1, 1, 1]

changed_sentence = "we use play to together."
changed_label = [1, 2, 2, 2, 1]
```
These labels are further padded with 0s to an equal length of 512. The sentences are also tokenized and padded (or truncated) to the same length.
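For reference, this is roughly how I pad the labels (a minimal sketch; `pad_labels` is just an illustrative helper, not my exact code):

```python
# Hypothetical helper: pad a per-token label list with 0s to a fixed
# length, truncating first if it is longer (mirrors the 512-token setup).
MAX_LEN = 512

def pad_labels(labels, max_len=MAX_LEN, pad_id=0):
    labels = labels[:max_len]                       # truncate if too long
    return labels + [pad_id] * (max_len - len(labels))

changed_label = [1, 2, 2, 2, 1]
padded = pad_labels(changed_label)
```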
The model is as follows:
```python
class Camembert(torch.nn.Module):
    """
    The definition of the custom model: the last 15 layers of Camembert
    will be retrained, followed by a fully connected layer to 512
    (the size of every label).
    """
    def __init__(self, cam_model):
        super(Camembert, self).__init__()
        self.l1 = cam_model
        total_layers = 199
        # Freeze everything except the last `retrain_layers` parameter tensors
        for i, param in enumerate(cam_model.parameters()):
            if total_layers - i > hparams["retrain_layers"]:
                param.requires_grad = False
        self.l2 = torch.nn.Dropout(hparams["dropout_rate"])
        self.l3 = torch.nn.Linear(768, 512)

    def forward(self, ids, mask):
        _, output = self.l1(ids, attention_mask=mask)
        output = self.l2(output)
        output = self.l3(output)
        return output
```
With batch_size = 2, the output will therefore have shape (2, 512), which is the same as the target labels.
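To double-check that shape claim, here is a stand-in for the pooled encoder output passed through the same Linear(768, 512) head (the random tensor is only a placeholder for the real Camembert output):

```python
import torch

batch = 2
pooled = torch.randn(batch, 768)        # placeholder for the pooled encoder output
head = torch.nn.Linear(768, 512)        # same head as self.l3 in the model
out = head(pooled)
print(out.shape)                        # torch.Size([2, 512])
```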
To the best of my knowledge, this setup amounts to saying there are 512 classes to be classified, which is not what I want. The problem arises when I try to calculate the loss using torch.nn.CrossEntropyLoss(), which gives me the following error (truncated):
```
File "D:\Anaconda\lib\site-packages\torch\nn\functional.py", line 1838, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: multi-target not supported at C:/w/1/s/tmp_conda_3.7_100118/conda/conda-bld/pytorch_1579082551706/work/aten/src\THCUNN/generic/ClassNLLCriterion.cu:15
```
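As I understand the error, CrossEntropyLoss expects class-index targets, one integer per prediction, rather than a whole label vector per sample. A minimal sketch of per-token shapes that it does accept (the 3-class scheme and ignore_index=0 for padding are my assumptions, not necessarily the right fix for this exact model):

```python
import torch

# For per-token labels the usual shapes are:
#   logits:  (batch, seq_len, num_classes)
#   targets: (batch, seq_len) of integer class ids
# and the loss is computed over the flattened tokens.
batch, seq_len, num_classes = 2, 512, 3   # classes: 0 = pad, 1 = ok, 2 = changed

logits = torch.randn(batch, seq_len, num_classes)
targets = torch.randint(0, num_classes, (batch, seq_len))

loss_fn = torch.nn.CrossEntropyLoss(ignore_index=0)  # skip padded positions
loss = loss_fn(logits.view(-1, num_classes), targets.view(-1))
```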
How am I supposed to solve this issue? Are there any tutorials for similar kinds of models?