Find second largest elements in list

16

4

Given a list:

lis = {37.21, 37.21, 37.2, 44, 44, 44, 101, 101}

What is a simple way to extract the second largest elements?

In[]:= someFunction[lis]

Out[]= {44, 44, 44}

Conor Cosnett

Posted 2020-06-13T12:12:13.340

Reputation: 6 215

Answers

18

One way, not highly efficient:

lis = {37.21, 37.21, 37.2, 44, 44, 44, 101, 101};

lis ~Cases~ Union[lis][[-2]]
{44, 44, 44}

This should be a bit more efficient:

ConstantArray @@ Sort[Tally@lis][[-2]]

Caveat: both of these methods rely on sorting and therefore require numeric data.


flinty's method with refinements by both C. E. and me:

Pick[lis, lis, RankedMax[DeleteDuplicates@lis, 2]]

This appears to be the fastest overall and it avoids the sorting issue referenced above.


Benchmarking

A quick test of the methods posted so far reveals an interesting pattern. Note that in the benchmark I use a list of a fixed length of one million and vary the number of unique elements within that list.

Adding methods f5, f6, and f7, and a second test with unpackable data.

Performed in Mathematica 10.1

Needs["GeneralUtilities`"]

SetOptions[Benchmark, TimeConstraint -> 30];

f1[lis_] := lis ~Cases~ Union[lis][[-2]]
f2[lis_] := ConstantArray @@ Sort[Tally@lis][[-2]]
f3[lis_] := MaximalBy[DeleteCases[lis, Max@lis], # &] (* Conor/kglr *)
f4[lis_] := Split[Sort@lis][[-2]]  (* kglr *)
f5[lis_] := Pick[lis, lis - RankedMax[DeleteDuplicates@lis, 2], 0]; (* flinty/C. E. *)
f6[lis_] := Extract[List/@KeySort[PositionIndex[lis]][[-2]]][lis] (* CA Trevillian *)
f7[lis_] := Pick[lis, lis, RankedMax[DeleteDuplicates@lis, 2]] (* flinty/C.E./me *)

BenchmarkPlot[{f1, f2, f3, f4, f5, f6, f7},
  RandomInteger[#, 1*^6] &, 10^Range[6], Joined -> True]

BenchmarkPlot[{f1, f2, f3, f4, f5, f6, f7},
  Prepend[0.5]@RandomInteger[#, 1*^6] &, 10^Range[6], Joined -> True]

enter image description here

enter image description here

Mr.Wizard

Posted 2020-06-13T12:12:13.340

Reputation: 259 163

2+1 Using v12.1 on my Mac, the benchmarks for f1, f3, and f4 all stop at n == 10^5, only f2 goes up to n == 10^6. Also, the PlotMarkers are visible in the plot just like in the PlotLegends. – Bob Hanlon – 2020-06-13T14:19:17.673

flinty's DeleteDuplicates + RankedMax approach appears to be competitive when used in conjunction with Pick to select all elements. Benchmark - f5 is flinty's method.

– C. E. – 2020-06-13T20:41:59.787

@BobHanlon Try SetOptions[BenchmarkPlot, TimeConstraint -> 30]. I don't need this in version 10.1 when explicitly specifying test points, but I think that should do it. – Mr.Wizard – 2020-06-13T22:36:50.827

1@C.E. Benchmark updated. – Mr.Wizard – 2020-06-13T22:37:07.417

@Mr.Wizard - I get the error message SetOptions::optnf : TimeConstraint is not a known option for BenchmarkPlot. On the first plot, only f2 and f5 go to 10^6. The others only go to 10^5. On the second plot, f2 only goes to 10^5 and the others only go to 10^4. There appears to have been significant changes since v10.1, not all for the better. – Bob Hanlon – 2020-06-14T00:42:12.417

@BobHanlon It seems that the TimeConstraint option is no longer passed down. Please try SetOptions[Benchmark, TimeConstraint -> 30] after loading GeneralUtilities and see if that works. – Mr.Wizard – 2020-06-14T01:00:12.167

1@Mr.Wizard - that fixed everything. Thanks. – Bob Hanlon – 2020-06-14T01:42:22.357

11

another way...

MaximalBy[DeleteCases[lis, Max@lis], # &]
{44, 44, 44}

Conor Cosnett

Posted 2020-06-13T12:12:13.340

Reputation: 6 215

1shorter: MaximalBy[DeleteCases[lis, Max@lis], # &]? – kglr – 2020-06-13T12:43:53.020

@kglr your right. updated – Conor Cosnett – 2020-06-14T06:30:00.927

9

Split[ Sort @ lis][[-2]]
 {44, 44, 44}

Also

Nearest[DeleteCases[Max @ #] @ #, Max @ #] & @ lis
{44, 44, 44}

kglr

Posted 2020-06-13T12:12:13.340

Reputation: 302 076

8

Find the second largest unique element:

RankedMax[DeleteDuplicates@lis, 2]

... or alternatively:

Last@TakeLargest[DeleteDuplicates@lis, 2]

There are multiple ways to get them all:

Cases[lis, RankedMax[DeleteDuplicates@lis, 2]]
Cases[lis, Last@TakeLargest[DeleteDuplicates@lis, 2]]
Select[lis, # == Last@TakeLargest[DeleteDuplicates@lis, 2] &]

flinty

Posted 2020-06-13T12:12:13.340

Reputation: 14 405

1+1. It's faster to use Pick to get all, e.g. Pick[lis, lis - RankedMax[DeleteDuplicates@lis, 2], 0]; The Select one can be sped up by using With to inject the sought after value into the anonymous function (but it will still be much slower than alternatives). – C. E. – 2020-06-13T20:16:46.363

1@C.E. This seems faster still: Pick[lis, lis, RankedMax[DeleteDuplicates@lis, 2]] – Mr.Wizard – 2020-06-14T06:58:25.337

@Mr.Wizard Thank you. I was messing around and somehow didn’t notice... – C. E. – 2020-06-14T08:35:27.960

5

This is probably terribly expensive compared to other methods, but I think it could be done better too, regardless...also I find it odd that Ordering doesn't manage for duplicated values...

Extract[List/@KeySort[PositionIndex[lis]][[-2]]][lis]
{44, 44, 44}

You can just grab the positions directly with

KeySort[PositionIndex[lis]][[-2]]
{4, 5, 6}

Though, I will say this is the only presented method so far that "Extracts" the second-largest value(s) in a list ;)

This is better to look at:

lis[[KeySort[PositionIndex[lis]][[-2]]]]

CA Trevillian

Posted 2020-06-13T12:12:13.340

Reputation: 2 817

5

Another solution:

lis // DeleteCases[#, Max@#]& // Cases[#, Max@#]&

sakra

Posted 2020-06-13T12:12:13.340

Reputation: 4 563

2

Select[Select[c=Sort[lis],#!=Last[c] &],#==Last[Select[c,#!=Last[c] &]]&]

Reda.Kebbaj

Posted 2020-06-13T12:12:13.340

Reputation: 367