Multi-field text input for LSTM


I'm using an LSTM to categorize medium-sized pieces of text. Each item to be categorized has several free-form text fields as well as several categorical fields. What is the best way to use all of this information for categorization? I see two options:

  • Concatenate the text from all fields, preceding each field's content with a special token, and run the concatenated text through a single LSTM.
  • Train one model per field, concatenate the outputs of these models in a hidden layer, and pass the result into subsequent layers.
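
To make the first option concrete, here is a minimal sketch of the preprocessing I have in mind. It covers only the concatenation step, not the LSTM itself. The field names (`title`, `body`, `notes`), the marker tokens, and the whitespace tokenizer are placeholders I made up for illustration, not my real data.

```python
# Hypothetical field names and marker tokens, chosen only for illustration.
FIELD_MARKERS = {
    "title": "<title>",
    "body": "<body>",
    "notes": "<notes>",
}


def concatenate_fields(item, field_markers=FIELD_MARKERS):
    """Return one token list: each field's marker followed by that field's tokens."""
    tokens = []
    for field, marker in field_markers.items():
        text = item.get(field, "")
        # The marker is emitted even when the field is empty or missing,
        # so the model always sees the same sequence of field boundaries.
        tokens.append(marker)
        # Naive lowercase + whitespace tokenization (an assumption).
        tokens.extend(text.lower().split())
    return tokens


def build_vocab(token_lists, pad="<pad>", unk="<unk>"):
    """Map each distinct token to an integer id; ids 0 and 1 are reserved."""
    vocab = {pad: 0, unk: 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, unk="<unk>"):
    """Convert tokens to ids, falling back to the unknown-token id."""
    return [vocab.get(tok, vocab[unk]) for tok in tokens]


# Made-up example item.
item = {"title": "Printer jams", "body": "Paper sticks in tray two", "notes": ""}
tokens = concatenate_fields(item)
vocab = build_vocab([tokens])
ids = encode(tokens, vocab)  # integer sequence that would feed an embedding + LSTM
```

The marker tokens get their own vocabulary entries, so in principle the LSTM could learn that the words after `<title>` play a different role from the words after `<body>`.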

What are the benefits of each approach? Is there an alternative I'm missing?

Derek Hans

Posted 2019-04-01T03:39:55.907

Could you give an example of the data for both options? Have you tried either, and do you have any results yet? – benbyford – 2019-04-09T13:27:25.430

No answers