How can I speed up SQL queries in Mathematica?

11

5

I have an application that queries a MySQL database many times for data. At the moment, Mathematica and MySQL are running on the same box and yet it takes over 1.6 seconds for the query to run and return a value. This feels very slow, and I was hoping somebody might give me counsel on how this might be sped up.

AbsoluteTiming[
 SQLSelect[commodDB, "tbcommodprices", "price",  
   SQLColumn["ticker"] == "LAV05 Comdty" && 
    SQLColumn["whichprice"] == "last" && 
    SQLColumn["date"] == "2005-09-01"];]

{1.6093750, Null}

Michael Stern

Posted 2012-01-26T21:54:03.247

Reputation: 4 380

@Michael the mysql will update the indices automatically (unless you do an 'ALTER TABLE DISABLE KEYS'). If your problem is solved, remember to pick an answer. – Gustavo Delfino – 2012-04-14T21:42:23.403

In your database, do you have indices on columns ticker, whichprice and date? – Gustavo Delfino – 2012-01-26T22:06:11.577

1How long does the exact same query take in MySQL itself (command line interface or one of the basic MySQL clients out there)? – Arnoud Buzing – 2012-01-26T22:13:22.437

1Regarding running in MySQL itself: This is a great question. I have just checked and it takes 1.6-1.7 seconds. It looks like this is going to require speeding things up in MySQL, not Mathematica. Regarding indices: I don't think so. ticker is just a VARCHAR, price is a DOUBLE, and date is a DATE. Each record has an INTEGER key. There are about 909,000 records in the relevant table. – Michael Stern – 2012-01-26T22:18:45.550

Following the hint of gdelfino and the DBAs over at http://dba.stackexchange.com/, I created an index for this table with

ALTER TABLE commoddb.tbcommodprices ADD INDEX (ticker,whichprice,date);

That sped up my queries by a factor of 27x. Does anybody know if this index needs to be updated manually, or will the server take care of it automatically?

– Michael Stern – 2012-01-26T22:56:28.300

Looks like you solved your problem. – Leonid Shifrin – 2012-01-26T23:01:30.923

@Michael from what I know, an SQL INDEX does not need to be updated manually - new rows which are added to the database will be automatically added to the index. But it may be useful to optimize the index from time to time, I'm not sure. That would be something to look into at [dba.SE]. – David Z – 2012-01-26T23:10:27.443

Answers

5

Adding a database index is very important when SELECTing data from a big table. Once you add the index, MySQL will take care of keeping it updated. The disadvantage of indices is that your database now takes more space in your HD and that your INSERTs are now slower.

Note that you created a multiple-column index that speeds up you SELECTs when you lookup by:

  • ticker, whichprice and date
  • ticker and whichprice
  • ticker

If you do a SELECTs by whichprice or date or (whichprice and date) will still be slow.

Lastly, if this is a read-only table you could use the MyISAM storage engine and maybe even move to MariaDB (a mysql fork) for better performance.

Gustavo Delfino

Posted 2012-01-26T21:54:03.247

Reputation: 6 847

Thanks for the tip about MariaDB! – nilo de roock – 2012-02-01T16:04:18.637

7

The answer here is found in the comments above:

The MySQL query itself takes 1.6-1.7 seconds, so the time spend in Mathematica on this query is negligible and can not realistically be optimized in this case.

Arnoud Buzing

Posted 2012-01-26T21:54:03.247

Reputation: 9 213

I was too quick with my suggestions, in retrospect this certainly is the first and most natural thing to try / check. +1. – Leonid Shifrin – 2012-01-27T09:12:09.677

3

It looks like you have the answer by added indexes. Note you can check how MySql is satisfying your query using the keyword explain . This will tell you if there are any full table scan going on which will slow down the query enormously depending on the size of the tables.

See here for more info.

Eric

Posted 2012-01-26T21:54:03.247

Reputation: 233