What is a Mathematica packed array?

117

83

A simple-sounding question with a few sub-questions:

  • What is the difference between unpacked vs packed array?
  • Are packed arrays more space efficient, and if so by how much?
  • Are packed arrays more time efficient for certain types of access over the unpacked form?

Bonus:

Is it ever undesirable to make use of packed arrays, even if the data can fit?

nixeagle

Posted 2012-03-25T21:44:05.453

Reputation: 2 193

Related: http://library.wolfram.com/infocenter/Demos/391/ (R. Knapp)

– Michael E2 – 2015-11-28T15:47:04.200

2

Related: http://stackoverflow.com/q/8775372

– rm -rf – 2012-03-25T21:49:07.020

Answers

94

I will answer a couple of your questions only.

Space efficiency

Packed arrays are significantly more space efficient. Example: Let's create an unpacked array, check its size, then do the same after packing it:

f = Developer`FromPackedArray[RandomReal[{-1, 1}, 10000]];
ByteCount[f]
ByteCount[Developer`ToPackedArray[f]]

(*
320040
80168
*)
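As a rough sanity check on those numbers, here is a small sketch that works out the cost per element. The names n, packed and unpacked are introduced only for this illustration, and exact byte counts can differ between versions and platforms.

```mathematica
n = 10000;
packed = RandomReal[{-1, 1}, n];                 (* RandomReal returns a packed array *)
unpacked = Developer`FromPackedArray[packed];    (* same values, stored unpacked *)

(* bytes per element in each representation *)
N[ByteCount[packed]/n]
N[ByteCount[unpacked]/n]
```

With the byte counts quoted above this comes to about 8 bytes per element for the packed array (one machine double plus a small fixed overhead) and about 32 bytes per element for the unpacked one.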

Time efficiency

The difference seems to be how they are stored; packed arrays can only contain objects of the same type, so mma does not need to keep track of the type of each element. This can also speed up operations with them. Define

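ClearAll[timeIt];
SetAttributes[timeIt, HoldAll]
(* average time per evaluation of expr: the repetition count is doubled
   until a whole batch takes at least one second *)
timeIt[expr_] := Module[{t = Timing[expr;][[1]], tries = 1},
    While[t < 1.,
    tries *= 2;
    t = AbsoluteTiming[Do[expr, {tries}];][[1]];
    ];
    t/tries]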

then

ClearAll[f, fpacked];
f = Developer`FromPackedArray[RandomReal[{-1, 1}, 500000]];
fpacked = Developer`ToPackedArray[RandomReal[{-1, 1}, 500000]];

fpacked.fpacked // timeIt
f.f // timeIt

Sin[fpacked] // timeIt
Sin[f] // timeIt

(*
0.0001610173
0.01167263
0.00487482
0.01420070
*)

Unpacking

To be warned when arrays are unpacked, you can do SetSystemOptions[PackedArrayOptions->UnpackMessage->True] or, in version 7 and later, On["Packing"] (thanks to OleksandrR for pointing this out). Then you see that e.g. Select unpacks: try Select[fpacked, 3] and a message is produced. Assigning a value of a different type to a packed array also unpacks it: try fpacked[[2]] = 4 to see this.

In my experience, this kind of unpacking is the explanation for most mysterious slowdowns in mma code.
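A minimal sketch of the assignment case, using a small integer vector v (a name made up for this example) so that the effect can be checked directly with Developer`PackedArrayQ:

```mathematica
On["Packing"];                 (* ask to be told whenever an array gets unpacked *)

v = Range[10];                 (* Range gives a packed array of machine integers *)
Developer`PackedArrayQ[v]      (* True *)

v[[2]] = 1.5;                  (* a Real does not fit into an Integer packed array *)
Developer`PackedArrayQ[v]      (* False: v has been unpacked *)

Off["Packing"];                (* switch the messages off again *)
```

The assignment should also trigger an unpacking message while On["Packing"] is in effect.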

Addressing

It appears that it is twice as slow to address a single element in a packed vs an unpacked array:

ClearAll[f, fpacked];
f = Developer`FromPackedArray[RandomReal[{-1, 1}, 500000]];
fpacked = Developer`ToPackedArray[RandomReal[{-1, 1}, 500000]];

fpacked[[763]] // timeIt
f[[763]] // timeIt
(*
4.249656*10^-7
2.347070*10^-7
*)

AppendTo is not faster:

AppendTo[fpacked, 5.] // timeIt
AppendTo[f, 5.] // timeIt
(*
0.00592841
0.00584807
*)

I don't know if there are other kinds of addressing-like operations that are faster for packed arrays (I doubt it but could be wrong).

Aside

In the Developer` context there are these names involving Packed:

Select[
 Names["Developer`*"],
 Not@StringFreeQ[#, ___ ~~ "Packed" ~~ ___] &
 ]
(*
{"Developer`FromPackedArray", "Developer`PackedArrayForm", 
"Developer`PackedArrayQ", "Developer`ToPackedArray"}
*)

Developer`PackedArrayForm does this:

ClearAll[f, fpacked];
f = Developer`FromPackedArray[RandomInteger[{-1, 1}, 5]];
fpacked = Developer`ToPackedArray[RandomInteger[{-1, 1}, 5]];

Developer`PackedArrayForm[f]
Developer`PackedArrayForm[fpacked]
(*
{-1, -1, -1, -1, -1}
"PackedArray"[Integer, <5>]
*)

So, you could set $Post = Developer`PackedArrayForm and then packed arrays would be displayed in a special way. I am not sure if this has any other side effects (this has been suggested in this great answer by ruebenko).

acl

Posted 2012-03-25T21:44:05.453

Reputation: 19 146

2

$Post = Developer`PackedArrayForm is pretty useful. Thanks! – Henrik Schumacher – 2017-07-04T14:37:24.947

Interestingly this is actually documented: http://reference.wolfram.com/language/Developer/ref/ToPackedArray

– b3m2a1 – 2018-04-16T00:12:48.867

3

I think On["Packing"] may be one of the most useful things I've come across on this site. Thanks! – Pillsy – 2013-06-19T19:30:01.997

12

As of version 7, one may use On["Packing"] as an alternative to SetSystemOptions[PackedArrayOptions->UnpackMessage->True]. Also, AppendTo will not be appreciably faster for packed arrays in most cases because the great bulk of its runtime consists of copying the array. At least this is done without unpacking unless absolutely necessary (therefore probably using optimized memcpy). – Oleksandr R. – 2012-03-26T00:11:24.640

84

The difference

Packed arrays give you more or less direct access to a C memory layout in which the array data is stored. Unpacked arrays reference arrays of pointers to their elements. This explains most of the other differences, in particular:

  • Space efficiency: if you look at how much space is required for packed arrays, you see that it is exactly the amount you'd need in C
  • Limitation to be rectangular: this allows arrays to be allocated as contiguous blocks of memory, and perhaps allows fast operations for array copying etc. (such as memset, memcpy, or whatever custom analogs of them may exist in the M implementation).

Runtime efficiency

Packed arrays by themselves would not bring much to the table except space efficiency. However, in addition to the new data structure, most fundamental functions have been internally overloaded to automatically use specialized and much more efficient versions when their arguments are packed arrays. Among these functions: Join, Tally, DeleteDuplicates, UnitStep, Clip, Unitize, Pick, Part, Transpose, Partition, etc.

This is a kind of a partial replacement of compilation in an interpreted environment. Some important things related to this:

  • Most numeric functions are Listable. This Listability is often not distinguished from the high-level one, where you can assign the Listable attribute to any function you write. While conceptually they serve the same purpose, being Listable means a different thing for numeric built-in functions in terms of implementation: it tells them that, given a packed array, they should use a specialized low-level version. This is the reason for huge speed-ups, because you effectively compile this part of the code.

  • Most built-in functions which take and process packed arrays also output packed arrays, which provides a means for composition.

  • Compile operates on packed arrays and produces packed arrays. Most common iteration functions such as Map, Table etc often auto-compile the functions they iterate, thus also produce packed arrays. This adds a lot, since the user is able to extend the set of fast (packed-array based) functions by using Compile. Since M8, the user is also able to produce Listable compiled functions, in the same sense as numeric Listable functions.

  • Sparse arrays use packed arrays internally to store their data
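To illustrate the point about Listable compiled functions, here is a minimal sketch. The function cf and the test data are made up for this example; RuntimeAttributes is the Compile option that is available since M8.

```mathematica
(* a scalar compiled function that threads over packed arrays *)
cf = Compile[{{x, _Real}}, x^2 + Sin[x], RuntimeAttributes -> {Listable}];

data = RandomReal[1, 10^5];         (* packed array of machine reals *)
res = cf[data];                     (* applied element-wise inside the compiled code *)

Developer`PackedArrayQ[res]         (* the result should again be packed *)
```

The design choice here is to write the function for a single number and let the Listable runtime attribute do the threading, instead of looping over elements at top level.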

The main idea of all this is to operate on large chunks of data at once, and avoid the main evaluator by pushing most of the work to the kernel. As I said, this IMO can be viewed as a sort of a partial compilation technique. I just want to stress once again that for this to work, the most important part is a tight integration of packed arrays into the core language, which affects many functions. All these functions have specialized low-level versions which are used when packed arrays are supplied to them. Because of the rectangular layout of the arrays, they map directly on native C arrays, so these specialized implementations can be very fast.

Addressing

In addition to the observations of @acl, I just want to stress that addressing measured in isolation seems not really that important (the twofold difference is most likely due to the extra pointer dereferencing, although I may be wrong). The point IMO is that packed arrays are effective when used with an entirely different programming style, where explicit individual indexing is avoided as much as possible (except possibly inside Compile), and instead the code is rewritten in such a way that this indexing is done internally by built-in functions, at a much lower level.
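As a rough illustration of the two styles, the sketch below computes a sum of squares once with explicit indexing and once with a single whole-array operation. The names v, s1 and s2 are introduced only for this example, and the actual timings depend on the machine and version.

```mathematica
v = RandomReal[{-1, 1}, 10^5];      (* packed array *)

(* explicit indexing: every v[[i]] goes through the main evaluator *)
AbsoluteTiming[s1 = 0.; Do[s1 += v[[i]]^2, {i, Length[v]}]; s1]

(* vectorized: the indexing happens inside Dot, on the packed data *)
AbsoluteTiming[s2 = v.v]
```

Both forms give the same sum up to rounding; the second one is the style packed arrays are designed for.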

Limitations

  • As mentioned already, arrays must be rectangular and of the same native type (Integer, Real, or Complex)
  • Not all functions benefit from packed arrays. One notable example that does not is Sort (and also Union, Complement, Intersection, Ordering) with the default comparison function.
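A quick way to check the first limitation is to ask Developer`PackedArrayQ after an attempted packing. This is only a sketch with made-up lists:

```mathematica
(* rectangular and all Integer: can be packed *)
Developer`PackedArrayQ[Developer`ToPackedArray[{{1, 2}, {3, 4}}]]

(* ragged: cannot be packed, ToPackedArray leaves the list as it is *)
Developer`PackedArrayQ[Developer`ToPackedArray[{{1, 2}, {3}}]]

(* converting everything to machine reals with N first makes the type uniform *)
Developer`PackedArrayQ[Developer`ToPackedArray[N[{1, 2.5, 3}]]]
```

The first and third checks should give True and the second False.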

When to use

Actually, whenever you can. I can't recall any case off the top of my head where the use of packed arrays would hurt (if they can be used). Just one hypothetical scenario comes to mind: you store a large amount of data in a packed array, but then somewhere in your code it gets unpacked and eats up all your memory. However, while it is stated in the documentation that computations on packed arrays would always produce the same results as on identical unpacked ones, there are probably corner cases like this one, where this is not so. It seems however that such cases are, so to speak, of measure zero.

One useful trick which isn't emphasized enough is that often you can store your data very space-efficiently even when the main array cannot be packed, but its elements can. Given such a list as unpacked, you can Map Developer`ToPackedArray on it, which may lead to very significant savings, both in terms of run-time and memory efficiency. One example of such use is here.
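A minimal sketch of that trick, with a small ragged list built by hand (ragged and repacked are names made up for this example):

```mathematica
(* three sublists of different lengths, each deliberately unpacked *)
ragged = Developer`FromPackedArray[RandomReal[1, #]] & /@ {1000, 2000, 3000};

(* the outer list cannot be packed, but every sublist can *)
repacked = Developer`ToPackedArray /@ ragged;

Developer`PackedArrayQ /@ ragged
Developer`PackedArrayQ /@ repacked

(* memory before and after packing the sublists *)
ByteCount /@ {ragged, repacked}
```

The values are unchanged (ragged === repacked gives True); only the storage of the sublists differs.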

In general, when you see the recommendation to "vectorize the problem" or "use vectorized operations" for speed, this is exactly about using packed arrays. Various solutions for this question (except mine) are good examples of such vectorized use. There are plenty of other similar ones here on SE, on SO and MathGroup. One example which I find interesting and somewhat standing out is this one, where I used packed arrays to pack a small matrix of positions, and this still led to a dramatic speed-up because that matrix was used to extract huge numbers of elements from a list at once, and Extract is also optimized on packed arrays. So, in some cases packing even small arrays can be beneficial.

This illustrates once again my main message: the big deal is not just packed arrays as a stand-alone data structure, but a different programming style possible when all relevant ingredients are packed. It is this style which leads to huge performance boosts, not just packing by itself.

Leonid Shifrin

Posted 2012-03-25T21:44:05.453

Reputation: 108 027

Hi, @LeonidShifrin. You mentioned "Unpacked arrays reference arrays of pointers to their elements." I have doubts about this. AFAIK, a pointer just holds a memory address, commonly 8 bytes, the same as a float64. But an unpacked array is 4 times bigger than a packed one. What is the rest of the space used for? I found that I have no picture of what an unpacked array looks like in memory. What is the memory structure of an unpacked array? – matheorem – 2016-09-03T09:05:14.697

@matheorem How do you measure the size of unpacked arrays? If you mean ByteCount, then certainly what is measured includes both pointers and real data they point at. Since unpacked arrays hold general expressions, there is no other sensible way to store them other than as *expr[], where *expr is a pointer to generic Mathematica expression. That is what I meant. I didn't look at internal code to confirm that, but I am pretty sure that's the way it is. – Leonid Shifrin – 2016-09-03T09:16:01.627

@LeonidShifrin Thank you. So can I understand the unpacking process as "the array of real numbers becomes an array of 8-byte pointers, and what these pointers point to is not a machine number anymore; instead the machine numbers become expressions with heads, and all this accounts for the 3 times bigger memory consumption"? So the copy that occurs in AppendTo[list,el] is actually a copy of the array of pointers, not a copy of the whole list contents, right? – matheorem – 2016-09-03T11:19:10.423

@matheorem Probably true (AppendTo). At least that would've been my guess as well. The general strategy here is copy-on-write, which means that the hard copy is only created once one of the two references to the same array (or, generally, expression) attempts to modify it. It wouldn't make sense to create fully deep copy of an array for an AppendTo operation, unless that is an array of primitive types (integers, doubles, etc), in which case there is no other choice - given that such packed arrays are exposed pretty much as is (as they are in C) for efficiency (I mean, their memory layout) – Leonid Shifrin – 2016-09-03T12:01:36.657

@LeonidShifrin I found I can't understand your "copy-on-write". Do you mean that if func1 modifies list a, and func2 uses a, then func2 will make a deep copy of a? I think func2 only needs pointers, so why a deep copy? – matheorem – 2016-09-03T12:59:30.663

@matheorem If you pass somewhere a mutable reference (symbol) that stores the same array as in some other place in code, it won't get copied, and that symbol will point to the same array as in the other place. But at the point where your code would attempt to modify the array, that array will be copied. If the array is of general expressions, the copy will be shallow. If it is from primitive types (packed), the copy will be deep. Try searching for copy-on-write on the site for more details, Szabolcs gave a good answer some time ago. – Leonid Shifrin – 2016-09-03T13:09:47.213

@LeonidShifrin Oh, that is much clearer. Szabolcs's answer is really good. Thank you so much for your patience, I learned such an important thing from you. Best regards : ) – matheorem – 2016-09-03T13:34:50.183

@matheorem Was glad to help. – Leonid Shifrin – 2016-09-03T13:37:12.580

1

good point about them being useful for a different style, and useful discussion :) +1 – acl – 2012-03-25T23:37:53.090

@acl Thanks. Your answer is pretty good too, I voted for it. And I also noticed a rather untypical for your answering style length and organization of it. This seems painfully familiar, and apparently contagious :) – Leonid Shifrin – 2012-03-25T23:42:17.787

thanks. About the style, I thought I'd up my game :) I find this organization useful in other people's answers so I try to do the same – acl – 2012-03-25T23:45:39.713

please note how flexibly I am entering into extended discussions in comments, mere hours after proclaiming their absence being the sole differentiator from other online resources! – acl – 2012-03-25T23:46:20.383

@acl Oh yes, I do note. Actually, people do want to get engaged in discussions, and comments on SE is a backdoor for that. I can imagine that SE "doesn't like this" (and there is some explicit evidence, such as automatic invitations to switch to chat), but it has to live with that. Will leave now, to avoid such an invitation and get some sleep - it is 4 a.m. here. – Leonid Shifrin – 2012-03-25T23:51:10.800

@acl I often try to guess who has given a specific answer before I see the tag. Some authors here are pretty recognizable because of their style. In this case I really mixed you up with Leonid ;-). WRT extended discussions: I do some clean-ups now and then. Removing some chit-chat long after they have worked their community-building magic. You can always flag old stuff like that for moderation if you like. – Sjoerd C. de Vries – 2012-03-26T21:58:48.470

27

I would like to point out that Listable in a pure Function effectively unpacks the array, and makes it much slower than Map for pure Functions.

Downvalues always unpack so SetAttributes[f, Listable] doesn't affect performance there.

The bottom line is that if one wants to use user-defined listability, it must be inside a compiled function; otherwise use Map.

data = RandomReal[1, 5 10^6];

AbsoluteTiming[ Developer`PackedArrayQ[Function[u, u^2, Listable]@data]]

  {4.54275,False}

AbsoluteTiming[ Developer`PackedArrayQ[Function[u, u^2, Listable]/@data]]

  {0.177237,True}

I will expand on my answer a bit. The normal evaluation sequence will always unpack a packed array. f/@{1,2,3} >> {f[1],f[2],f[3]} >> .... The second step in the above sequence will unpack the array, even if ... can be packed. The reason Map sometimes returns packed arrays is that by default, it will autocompile when the list is longer than 99.

SystemOptions["CompileOptions" -> "MapCompileLength"]

  {"CompileOptions" -> {"MapCompileLength" -> 100}}

Developer`PackedArrayQ[vec = RandomReal[1, 99]]

  True

Developer`PackedArrayQ[#^2 & /@ vec]

  False

Developer`PackedArrayQ[vec = RandomReal[1, 100]]

  True

Developer`PackedArrayQ[#^2 & /@ vec]

  True

This doesn't apply to downvalues or pure functions with the Listable attribute.

The proper way to deal with packed arrays is to write vectorized code. One can also use Map or CompiledFunctions, but downvalues or pure functions with the Listable attribute should be avoided.

Eduardo Serna

Posted 2012-03-25T21:44:05.453

Reputation: 611

5

Interesting finding. Could you explain the reason? – luyuwuli – 2016-07-28T07:26:56.863

1

this behaviour is so odd... – matheorem – 2016-09-03T02:33:57.780