23

6

What is the fastest possible square number test in *Mathematica*, both for machine size and big integers?

I presume starting in version 8 the fastest will be a dedicated C LibraryLink function.

23

6

What is the fastest possible square number test in *Mathematica*, both for machine size and big integers?

I presume starting in version 8 the fastest will be a dedicated C LibraryLink function.

7

Here's an idea similar to Carl Woll's that's a little faster:

```
sQ[n_] := FractionalPart@Sqrt[n + 0``1] == 0;
```

Here are some timing runs similar to @fgrieu's:

```
timeRun[f_] := Module[{a, m},
a = (2^1024 - 3^644)^2;
m = (2^1024 - 3^644)^2 + 9;
First@ AbsoluteTiming@ Do[f[n], {n, m - 200000, m}]
]
timeRun2[f_] :=
First@ AbsoluteTiming[
Do[
f /@ (n^2 + {-2, -1, 0, 1, 2}),
{n, 2^1357, 0, -Floor[2^1357/99]}]
];
```

Tests of a long sequence of consecutive integers about single large square number:

```
timeRun[sQ]
timeRun[SqQ]
timeRun[sqQ1]
timeRun[SquareQ2]
timeRun[SquareQ08]
(*
0.626601 sQ
0.789668 SqQ (@fgrieu)
1.11774 sqQ1 (@CarlWoll)
1.63489 SquareQ2 (@Mr.Wizard)
3.39258 SquareQ08 (@KennyColnago)
*)
```

Tests of short sequences of consecutive integers about many small to large square numbers:

```
timeRun2[sQ]
timeRun2[SqQ]
timeRun2[sqQ1]
timeRun2[SquareQ2]
timeRun2[SquareQ08]
(*
0.002639 sQ
0.003289 SqQ
0.0039 sqQ1
0.005791 SquareQ2
0.01749 SquareQ08
*)
```

A test of just smaller numbers:

```
aa = 1; bb = 10^6;
AbsoluteTiming@Do[sQ@(n), {n, aa, bb}]
AbsoluteTiming@Do[SqQ@(n), {n, aa, bb}]
AbsoluteTiming@Do[sqQ1@(n), {n, aa, bb}]
AbsoluteTiming@Do[SquareQ2@(n), {n, aa, bb}]
AbsoluteTiming@Do[SquareQ08@(n), {n, aa, bb}]
(*
{2.34658, Null}
{3.2571, Null}
{3.18561, Null}
{3.42899, Null}
{3.25997, Null}
*)
```

If you want to verify its accuracy, you can test it against other solutions like this:

```
aa = 10^20 - 100; bb = aa + 10^3;
Table[sQ[n], {n, aa, bb}] === Table[IntegerQ@Sqrt[n], {n, aa, bb}]
(* True *)
aa = 1; bb = 10^6;
Table[sQ[n], {n, aa, bb}] === Table[IntegerQ@Sqrt[n], {n, aa, bb}]
(* True *)
```

Admirable simplicity!! To watch: `sQ`

's correctness varies with Mathematica version. On M4.0 PowerPC, `sQ[9]`

is `False`

. On M5.2 x86, `sQ[2]`

is `True`

. On M7.0.1 x64 it passes my tests (negative of squares give `True`

but that's a matter of convention). Same on M12.0 x64, except for a hiccup at -1. Inexact arithmetic scares me. Also: I tuned my `SqQ`

by removing `n>=0 &&`

and it wins more benchmarks (negatives are still never squares). Incorporating your idea is next (imitation is the sincerest form of flattery). – fgrieu – 2019-11-01T14:52:57.363

12

Sorry for my ignorance not taking into account that the question specifically asked for a Mathematica 7 solution. I updated the complete post.

In Mathematica 7 we don't have the option the compile code into a C-library which includes the thread parallelization which can be turned on when using `RuntimeAttributes->Listable`

and `Parallelization->True`

. Therefore, acl's solution will not run in Mathematica 7 because the RuntimeAttributes option for Compile was introduced in version 8.

This leaves the possibility to not compile the used function and make it a normal Mathematica function where you *can* set the attribute Listable. I tried this, but it was horrible slow.

After a bit of research I found a nice solution which uses some number-properties in base 16. Since (at least in V7) it seems somewhat hard to return lists of True|False, I use 0 and 1 where 0 means no square.

```
fPat = Compile[{{numbers, _Integer, 1}},
With[{l = Length[numbers]},
Module[{n = 0, i = 0, h = 0, test = 0.0, result = Table[0, {l}]},
For[i = 1, i <= l, ++i,
n = numbers[[i]];
h = BitAnd[15, n];
If[h > 9, Continue[]];
If[h != 2 && h != 3 && h != 5 && h != 6 && h != 7 && h != 8,
test = Sqrt[n];
result[[i]] = test == Floor[test];
];
];
result
]
]
];
```

Comparing this with the *almost one-liner* of Sal gives

```
data = Table[i, {i, 1, 10^6}];
fSal = Compile[{{n, _Integer}},
With[{test = Sqrt[n]}, Floor[test] == test]];
BarChart[{Timing[fSal /@ data][[1]], Timing[fPat[data]][[1]]
}, ChartLabels -> {"Sal Mangano", "Patrick V7"},
ChartStyle -> {Gray, Green}]
```

I leave it to you to decide whether such a C-like programming style is worth the small speed up.

The fastest way (using Mathematica only) I know is to compile a C-library and process all data in parallel. Since most computers these days have at least 2 cores, this gives a boost. In Mathematica 8 the compilation to a C-library does not copy the data when it is called.

To make the computation parallel you have to use the Parallization option and the compiled function must be Listable. If you are sure of your input-data, you can additionally switch off most of the data-checks by using RuntimeOptions set to "Speed".

**Update** I include here the parallelized version of the Mathematica 7 code above:

```
data = Table[i, {i, 1, 10^6}];
fSal = Compile[{{n, _Integer}},
With[{test = Sqrt[n]}, Floor[test] == test]];
fAcl = Compile[{{n, _Integer}},
With[{test = Sqrt[n]}, Floor[test] == test],
RuntimeAttributes -> {Listable}];
fPat = Compile[{{n, _Integer}},
With[{test = Sqrt[n]}, Floor[test] == test],
CompilationTarget -> "C", RuntimeAttributes -> {Listable},
Parallelization -> True, RuntimeOptions -> "Speed"];
fPat2 = Compile[{{numbers, _Integer, 1}},
With[{l = Length[numbers]},
Module[{n = 0, i = 0, h = 0, test = 0.0, result = Table[0, {l}]},
For[i = 1, i <= l, ++i,
n = numbers[[i]];
h = BitAnd[15, n];
If[h > 9, Continue[]];
If[h != 2 && h != 3 && h != 5 && h != 6 && h != 7 && h != 8,
test = Sqrt[n];
result[[i]] = test == Floor[test];
];
];
result
]
], CompilationTarget -> "C", RuntimeAttributes -> {Listable},
Parallelization -> True, RuntimeOptions -> "Speed"
];
BarChart[{Timing[fSal /@ data][[1]], Timing[fAcl[data]][[1]],
Timing[fPat[data]][[1]],
Timing[fPat2[data]][[1]]},
ChartLabels -> {"Sal Mangano", "acl", "Patrick",
"Patrick V7 parallel"},
ChartStyle -> {Gray, Gray, Darker[Green], Green}]
```

The results here come from my MacBook in battery-save mode which has 2 Intel cores. The disadvantage is that you need a C-compiler installed on your system which is most likely not true for the majority of Mathematica users.

1

Here's some background info on the method employed in `fPat`

and `fPat2`

for others. I was almost done implementing it, but I see you beat me to it :) I believe that will be the fastest of all approaches

@R.M, sorry ;-) I was ashamed that I didn't read the question carefully and I felt that I should correct this. – halirutan – 2012-01-21T21:52:04.770

btw, really nice is that in a compiled function `result[[i]] = test == Floor[test];`

works as expected for integers and gives a speed-up compared to its if-then-else counterpart. – halirutan – 2012-01-21T21:53:57.747

Very nice, +1. Mind you, @Mr.W will probably not consider this mathematica code, though: http://stackoverflow.com/a/8891064/559318 [see comments] :)

– acl – 2012-01-21T23:40:33.7831At least I didn't use Goto ;-) – halirutan – 2012-01-21T23:50:58.503

@acl in this case I specified fastest possible and I meant it. I'd like to use it for some Project Euler stuff. I don't know that I can make use of a Listable function. For machine integers the `Compile`

solutions are all much faster than `IntegerQ@Sqrt@# &`

but what about big integers? – Mr.Wizard – 2012-01-22T04:46:55.773

11

I voted for all three previous answer because they all taught me something. However they, being `Compile`

solutions, are not helpful with big integers.

At least on my system, Sal Mangano's code appears reducible to this without loss of speed:

```
isSq2 = Compile[n, Floor@# == # & @ Sqrt @ n];
```

For big integers between about 2*10^9 and 2*10^11 I am currently using this code from Sasha:

```
SquareQ =
JacobiSymbol[#, 13] =!= -1 &&
JacobiSymbol[#, 19] =!= -1 &&
JacobiSymbol[#, 17] =!= -1 &&
JacobiSymbol[#, 23] =!= -1 &&
IntegerQ@Sqrt@# &;
```

For integers larger than that I am using code (modified) from Daniel Lichtblau:

```
SquareQ2 = # == Round@# & @ Sqrt @ N[#, Log[10`, #] + $MachinePrecision] &;
```

@Mr.Wizard `SquareQ2 = FractionalPart@Sqrt@N[#, 2 Log[10, #]] == 0 &;`

By removing the extra step you get marginal improvement. Of course, we both know that there is a big assumption there, which is probably not true for sufficiently large $N$. – mhp – 2014-11-06T08:17:05.117

1Also, this is all academic. No serious application would use this method. Mathematica is missing integer remainder square root function $(N-\lfloor \sqrt{N} \rfloor^2)$ which would solve this problem. The computation of that function is more efficient than multiplication of two numbers of the size of $N$. – mhp – 2014-11-06T08:31:50.820

@mhp I haven't thought this through but could `PowerMod`

be of any use here? – Mr.Wizard – 2014-11-06T08:38:38.880

2

Not really. It took me several weeks to develop the algorithm, just to find out that someone had already beat me to it. See: https://hal.inria.fr/inria-00072854/PDF/RR-3805.pdf

– mhp – 2014-11-06T08:42:14.4872@rm -rf `JacobiSymbol[n,p]=1`

for any odd prime $p$ when $n$ is square. `JacobiSymbol[n,p]=1`

for some odd primes $p$ when $n$ is not square. `JacobiSymbol[n,p]=0`

for any odd prime $p$ that divides $n$, where $n$ may or may not be square. `JacobiSymbol[n,p]=-1`

otherwise. Hence, `JacobiSymbol[n,p]=-1`

means $n$ is not square. Evaluating `JacobiSymbol[n,p]`

is much faster than `Sqrt[n]`

. Use `JacobiSymbol`

to filter candidate $n$ before passing to `Sqrt[n]`

. I tested with 8 $p$ near 541 on $n>10^{11}$ and found it faster than SquareQ by @Sasha and SquareQ2 by @Daniel Lichtblau. – KennyColnago – 2012-10-18T16:41:45.417

@KennyColnago would you please post your complete function as an answer? – Mr.Wizard – 2012-10-19T00:24:55.787

Speaking of the code you say you got from Sasha: have a look at the definition of `NumberTheory\`

PowersRepresentationsDump`ProbablePerfectSquareQ[]` sometime... – J. M.'s ennui – 2012-10-20T15:46:41.047

@J.M. what to I need to `Get`

to load that function? – Mr.Wizard – 2012-10-20T16:00:53.433

It's supposed to be one of the subroutines used by `PowersRepresentations[]`

; if you have that, you're supposed to have the Jacobi symbol-based test as well... (unless the implementation changed a bit between version seven and version eight) – J. M.'s ennui – 2012-10-20T16:08:09.740

@J.M. Ah yes, I just needed to use `PowersRepresentations`

once to load it. Quite similar indeed... – Mr.Wizard – 2012-10-20T16:45:59.430

Could you add references or a short explanation as to how `JacobiSymbol`

comes into play here. I don't know the underlying workings of this and I'd like to know... – rm -rf – 2012-02-20T14:21:37.630

9

I don't think there are any built-in functions for this but the following is probably fast enough for most purposes.

```
isSq = Compile[{{n, _Integer}}, With[{test = Sqrt[n]},
Floor[test] == test]];
```

Does 1 million integers in under a second.

```
Timing[Table[isSq[i], {i, 1, 1000000}]][[1]]
(*
0.76195
*)
```

This is under 2 orders of magnitude faster than the un-compiled equivalent, by the way.

1The `Sqrt`

operation is comparatively expensive; I believe one should filter out numbers that are destined to fail with a less expensive test before progressing to `Sqrt`

. – Mr.Wizard – 2012-01-21T15:08:00.817

Adding a test for EvenQ gives a slight speed up but not much. If the numbers are very large it probably helps more. Testing for ending in 0,1,4,6,9, or 25 is possible but I have not tried it and seem like overkill. – Sal Mangano – 2012-01-21T15:23:30.953

1Ignore what I said about EvenQ! Need coffee! – Sal Mangano – 2012-01-21T15:30:15.750

7

More info as requested by @Mr.Wizard.
For $n$ below the $\approx 2*10^9$ limit, Compile gives the fastest solutions. For larger $n$, Sasha used `JacobiSymbol`

with four primes 13, 19, 17, and 23, before resorting to the expensive `IntegerQ[Sqrt[n]]`

. The number of ambiguous cases where `JacobiSymbol[n,p]=0`

decreases as the size of the prime $p$ increases. So using larger $p$ helps filter out more candidates before `Sqrt`

must be called. Similarly, using more primes filters more candidates. However, the computation of `JacobiSymbol`

slows as the number of and size of $p$ increases (no free lunch). As a rough balance, I used SquareQ08.

```
SquareQ08[n_] :=
JacobiSymbol[n, 541] =!= -1 && JacobiSymbol[n, 547] =!= -1 &&
JacobiSymbol[n, 557] =!= -1 && JacobiSymbol[n, 563] =!= -1 &&
JacobiSymbol[n, 569] =!= -1 && JacobiSymbol[n, 647] =!= -1 &&
JacobiSymbol[n, 653] =!= -1 && JacobiSymbol[n, 659] =!= -1 &&
IntegerQ[Sqrt[n]]
SetAttributes[SquareQ08, Listable]
```

@KennyColnago I just found this thread, and tried it myself. I tried `isSq2`

as well as `SquareQ08`

on sets of 100 random integers in various ranges (10^8 to 10^9, 10^9 to 10^10, 10^10 to 10^11, and 10^11 to 10^12) with `RepeatedTiming`

. In each case, the results were remarkably consistent - `isSq2`

was about 7.4 x 10^{-5} seconds and `SquareQ08`

was about 4.5 x 10^{-4} seconds. This is on Mathematica 10.3; I wonder what version you were using and if the performance of `isQq2`

was somehow improved. – rogerl – 2015-12-24T17:09:33.147

@rogerl I don't recall which version was used. On version 10.3, I find `isSq2`

about 6 to 10 times faster than `SquareQ08`

, which is written for integers of any size. I do not understand how `isSq2`

, which is *compiled*, works with integers larger than 2 billion. Can you explain? – KennyColnago – 2015-12-25T01:09:07.037

@KennyColnago Well, my machine is a 64-bit machine, so my reading of the tea leaves is that `Compile`

will work with 64-bit integers (or about $10^18$). Is that right? And, just to be clear, your result regarding isQq1 and SquareQ08 that you give in your comment above is at variance with your original answer, right? Do you know what's going on, or am I just confused? – rogerl – 2015-12-25T01:44:04.047

@rogerl Having failed before by using `Compile`

on integers greater than 2 billion, my original answer never tested `isSq2`

on such large numbers. Therefore, the result in the comment is new, and not at odds with the original answer. However, before I trust any compiled result on large integers, I must see a documented source which explains that `Compile`

on 64 bit machines works on integers above 2 billion. – KennyColnago – 2015-12-25T01:57:06.510

@KennyColnago You might want to look at this question that I posted this morning. I've done some more investigation of this issue.

– rogerl – 2015-12-25T16:12:05.3931Would it be advantageous to select the number of primes to use based on the size of the integer? – s0rce – 2012-10-24T20:31:13.033

@s0rce My rough tests showed that about 50% of candidate $n$ were filtered out for each `JacobiSymbol`

test, and that this percentage was independent of the size of $n$. Your mileage may vary... – KennyColnago – 2012-10-24T23:55:09.583

6

This is a variation of Daniel Lichtblau's contribution that avoids the need to compute logarithms:

```
sqQ1[i_Integer] := Floor[Sqrt[i + If[i>10^16, .1`1, .1]]]^2 == i
```

It seems to be a bit faster than `SquareQ2`

. For example:

```
n = 432^2;
sqQ1[n] //RepeatedTiming
SquareQ2[n]//RepeatedTiming
```

{2.42*10^-6, True}

{3.2*10^-6, True}

and:

```
n = 43212113212231231241334^2;
sqQ1[n] //RepeatedTiming
SquareQ2[n]//RepeatedTiming
```

{3.61*10^-6, True}

{5.3*10^-6, True}

But not always:

```
n = 432121231231241334^2;
sqQ1[n] //RepeatedTiming
SquareQ2[n]//RepeatedTiming
```

{7.8*10^-6, True}

{5.26*10^-6, True}

A "listable" version appears to be faster than the compiled versions (at least when the maximum value is less than 10^16):

```
sqQ2[x:{__Integer}] := With[{add = If[Max[x]>10^16, .1`1, .1]},
UnitStep[Floor[Sqrt[x+add]]^2 - x]
]
```

Comparison with `fPat2`

:

```
data = RandomInteger[10^15, 10^6];
r1 = sqQ2[data]; //RepeatedTiming
r2 = fPat2[data]; //RepeatedTiming
r1 === r2
```

{0.0075, Null}

{0.023, Null}

True

Of course, `sqQ2`

works for any size integers, while the compile solutions only work for integers less than `Developer`$MaxMachineInteger`

.

5

The following is optimized for large values. The main idea is to reduce the integer tested modulo a product of small primes less than 64-bit, so that the cost is low and linear with the bit size of the argument, and filter the remainder using precomputed Jacobi tables to weed out all except very few (1/11595) of the non-squares.

```
SqQ::usage =
"SqQ[n] is True when n is an exact square, and False otherwise.";
(* We reduce n modulo a product of small primes and use *)
(* pre-computed tables of Jacobi symbols to filters out *)
(* most non-squares with a single multi-precision operation. *)
(* We use IntegerQ[Sqrt[n]] on less than 1/11595 integers. *)
(* Pre-computed variables starting in SqQ$ are for internal use; *)
SqQ$m = (SqQ$0 = 59*13*7*5*3)*(SqQ$1 = 23*19*17*11)*
(SqQ$2 = 47*37*31) *(SqQ$3 = 43*41*29);
SqQ$u = SqQ$v = SqQ$w = SqQ$x = 0;
Block[{j},
For[j = SqQ$0, j-- > 0, SqQ$u += SqQ$u + If[
JacobiSymbol[j, 59] < 0 || JacobiSymbol[j, 13] < 0 ||
JacobiSymbol[j, 7] < 0 || JacobiSymbol[j, 5] < 0 ||
JacobiSymbol[j, 3] < 0, 1, 0]];
For[j = SqQ$1, j-- > 0, SqQ$v += SqQ$v + If[
JacobiSymbol[j, 23] < 0 || JacobiSymbol[j, 19] < 0 ||
JacobiSymbol[j, 17] < 0 || JacobiSymbol[j, 11] < 0, 1, 0]];
For[j = SqQ$2, j-- > 0, SqQ$w += SqQ$w + If[
JacobiSymbol[j, 47] < 0 || JacobiSymbol[j, 37] < 0 ||
JacobiSymbol[j, 31] < 0, 1, 0]];
For[j = SqQ$3, j-- > 0, SqQ$x += SqQ$x + If[
JacobiSymbol[j, 43] < 0 || JacobiSymbol[j, 41] < 0 ||
JacobiSymbol[j, 29] < 0, 1, 0]]
];
(* The function itself starts here *)
SqQ[n_Integer] := Block[{m = Mod[n, SqQ$m]},
BitGet[SqQ$u, Mod[m, SqQ$0]] == 0 &&
BitGet[SqQ$v, Mod[m, SqQ$1]] == 0 &&
BitGet[SqQ$w, Mod[m, SqQ$2]] == 0 &&
BitGet[SqQ$x, Mod[m, SqQ$3]] == 0 &&
IntegerQ[Sqrt[n]]]
(* Automatically thread over lists *)
SetAttributes[SqQ, Listable];
```

It comfortably beats `sqQ1`

, `SquareQ2`

and `SqareQ08`

when benchmarked with large non-squares

```
m = (2^1024 - 3^644)^2 + 9;
Timing[s = 0;
For[n = m - 200000, n < m, ++n, If[SqQ[n], ++s]];
s == 1]
```

and more narrowly so when benchmarked/validated as

```
Timing[For[n = 2^1357,
n > 0 && SqQ[s = n^2] && ! SqQ[s + 1] && ! SqQ[s + 2], --n,
n -= Floor[n/99]]; n == 0]
```

5

I am not sure how to speed up each comparison (as in, I spent half an hour trying different things and didn't manage to), but making the compiled function listable speeds things up quite a bit.

If `isSq`

is the direct implementation that Sal gave, simply make it listable and compare:

```
isSqL = Compile[
{{n, _Integer}}, With[{test = Sqrt[n]}, Floor[test] == test],
RuntimeAttributes -> {Listable}
];
```

and then compare:

```
Timing[Table[isSq[i], {i, 1, 10^6}]]; // Timing
isSq /@ Range[1, 10^6]; // Timing
isSqL[Range[1, 10^6]]; // Timing
(*
{0.697799, Null}
{0.545856, Null}
{0.150171, Null}
*)
```

ie, it's 3-4 times faster.

What makes you say `Sqrt`

is expensive? (ie, compared to what?).

True, but this does not speed up the sq test per se it is just a faster way to apply the function across a list than Table is. Good idea though! – Sal Mangano – 2012-01-21T16:25:35.100

@Sal That's true, but I can't get the test faster. Looking at `CompilePrint[isSq]`

, it's hard to see anything more costly thatn `Sqrt`

. – acl – 2012-01-21T16:34:31.623

Perhaps, you'd like to use AbsoluteTiming for these tests, especially if stuff is running in parallel. See the "How can I improve the speed of eigenvalue decompositions for large matrices?" thread. – None – 2012-01-21T17:49:32.687

@ruebenko that is a good point (in this case, though, they give the same answer, so I'll leave this the way it is--but your point about running them in parallel stands) – acl – 2012-01-21T20:17:00.603

Even in version 7, you have MathLink. I don't know how much the MathLink overhead is, but on the local machine MathLink should be very fast as it uses memory mapped files for communication. – Szabolcs – 2012-01-21T14:45:04.300

@Szabolcs; you're right; I was curious what would be best in native

Mathematica, but if someone cares to create a robust MathLink executable and share it here I certainly won't turn it down! – Mr.Wizard – 2012-01-21T15:01:01.7472The overhead of messaging involved in MatheLink would kill it. Possibly Mathematica 8's ability to load a DLL would help if you used C. But consider if your application really warrants this compared to what I offered below. The speed up you may get is probably not worth the effort. – Sal Mangano – 2012-01-21T16:04:40.717