I am attempting to solve a system of equations with 40 independent variables (x1, ..., x40) and one dependent variable (y). There are ~300 equations (rows), and I want to find the set of 40 coefficients that minimizes the sum of squared errors between y and the predicted value.
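To make the objective concrete, here is a minimal sketch of the least-squares problem described above, assuming the predicted value is a linear combination of the 40 variables. The data are synthetic stand-ins (the 15% density, the value ranges, and the names `X`, `true_coef`, and `coef` are all assumptions for illustration, not my real dataset), and `numpy.linalg.lstsq` is used only as a reference solver.

```python
import numpy as np

rng = np.random.default_rng(0)

n_rows, n_cols = 300, 40  # sizes taken from the question

# Synthetic stand-in for the real data: mostly zeros, small integer counts.
# The 15% density is an assumption, not a measured property of the dataset.
mask = rng.random((n_rows, n_cols)) < 0.15
X = (mask * rng.integers(1, 15, size=(n_rows, n_cols))).astype(float)

# Hypothetical "true" coefficients, used only to generate a synthetic y.
true_coef = rng.uniform(100.0, 5000.0, size=n_cols)
y = X @ true_coef + rng.normal(0.0, 50.0, size=n_rows)

# Least-squares fit: finds coef minimizing sum((y - X @ coef) ** 2).
coef, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

sse = float(np.sum((y - X @ coef) ** 2))
print("rank of X:", rank)
print("sum of squared errors:", sse)
```

If the reported rank is below 40, the coefficients are not uniquely determined, which may matter for very sparse data.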

My problem is that the matrix is very sparse, and I do not know the best way to solve a system of equations with sparse data. An example of the dataset is shown below:

```
y x1 x2 x3 x4 x5 x6 ... x40
87169 14 0 1 0 0 2 ... 0
46449 0 0 4 0 1 4 ... 12
846449 0 0 0 0 0 3 ... 0
....
```
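As a rough way to quantify the sparsity, the sketch below counts zeros, finds all-zero columns, and computes the rank. It uses only the three rows and six columns visible in the excerpt; the elided columns and rows are deliberately left out, so the numbers say nothing about the full dataset. The name `X_visible` is an assumption for illustration.

```python
import numpy as np

# Only the values visible in the excerpt (x1 ... x6 for the three rows shown).
# The elided columns and the remaining rows are not reconstructed here.
X_visible = np.array([
    [14, 0, 1, 0, 0, 2],
    [0, 0, 4, 0, 1, 4],
    [0, 0, 0, 0, 0, 3],
], dtype=float)

# Fraction of entries that are exactly zero.
zero_fraction = float(np.mean(X_visible == 0))

# Columns with no nonzero entry carry no information about their coefficient
# (0-based indices, so index 1 corresponds to x2).
empty_columns = np.flatnonzero(~X_visible.any(axis=0))

# Rank below the number of columns means the coefficients are not unique.
rank = np.linalg.matrix_rank(X_visible)

print("zero fraction:", zero_fraction)
print("all-zero columns:", empty_columns)
print("rank:", rank, "of", X_visible.shape[1], "columns")
```

Running the same checks on the full 300 x 40 matrix would show whether any variable is never observed and whether the system is rank-deficient.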

I am currently using a genetic algorithm to solve this, and the predicted values differ from the observed values by roughly a factor of two.

Can anyone suggest other methods or techniques that are capable of solving a system of equations with sparse data?

Typo in the title: spare => sparse. – Aleksandr Blekh – 2014-08-06T02:52:16.923