I am trying to build an anomaly detector with a low false-positive rate. The dataset I am using is patient health-sensor data: a number of parameters are collected from each patient's sensors every hour, and I have roughly 7k parameters that can act as features.
The issues I am facing are the following:
1: For each patient, I gather hourly data over 8 days, which gives me ~190 rows of these parameters. Since 7,000 parameters are collected each hour, the dataset for each patient is roughly 190 rows × 7,000 columns, so the number of attributes (columns) is very large compared to the number of rows. Are there any recommendations on how to deal with such a big feature set? Should I do dimensionality reduction first and then pass the result to an isolation forest? Are there better ways of handling so many attributes (columns) with so few rows?
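To make the question concrete, here is a minimal sketch of the "dimensionality reduction first, then isolation forest" idea using scikit-learn. The data is synthetic (shapes mirror my 190 × 7,000 case), and the `n_components=50` choice is an arbitrary placeholder, not a recommendation:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(190, 7000))  # stand-in for one patient's hourly readings

pipe = make_pipeline(
    StandardScaler(),            # sensors are on different scales
    PCA(n_components=50),        # keep n_components well below n_samples (190)
    IsolationForest(contamination="auto", random_state=0),
)
pipe.fit(X)
labels = pipe.predict(X)  # +1 = normal, -1 = anomaly
print(labels.shape)
```

This is the pipeline shape I am asking about, not something I have validated; part of my question is whether PCA in this position is a sensible choice at all.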
2: Are there any additional algorithms I could try that might give the lowest false-positive rate? I am currently using PCA, isolation forests, and one-class SVM, since the data currently represents only normal behavior.
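For context, this is roughly how I am using the one-class SVM, trained on normal-only data and then scoring held-out rows. The data below is synthetic with injected anomalies, and the `nu`/`gamma` values are guesses that would need tuning, not settings I am confident in:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
X_train = rng.normal(size=(150, 100))             # normal behavior only
X_test = np.vstack([
    rng.normal(size=(30, 100)),                   # more normal rows
    rng.normal(loc=6.0, size=(10, 100)),          # injected anomalies
])

model = make_pipeline(
    StandardScaler(),
    OneClassSVM(nu=0.05, gamma="scale"),  # nu is roughly the tolerated false-positive rate
)
model.fit(X_train)                # fit on normal behavior only
pred = model.predict(X_test)      # +1 = inlier, -1 = outlier
print((pred[-10:] == -1).mean())  # fraction of injected anomalies flagged
```

My false positives here come from normal test rows getting a −1 label, which is what I am trying to drive down.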