I'm currently working with the CHILDES corpus, trying to build a classifier that distinguishes children who have specific language impairment (SLI) from those who are typically developing (TD).
In my reading, I noticed that no convincing set of features for distinguishing the two groups has been found yet, so I came up with the idea of building a feature learning algorithm that could potentially discover better ones.
Is this possible? If so, how would you suggest I approach it? From what I have read, most feature learning work has been done in image processing. Another problem is that my dataset may be too small for this to work (in the hundreds) unless I find a way to get more transcripts from children.