## How to differentiate formally?

38

23

I have been wrapping my head around this for a while now and I have not found a solution so far. I want to work with an arbitrary number of variables in Mathematica and use some built in functions. To make things more specific for starters I want to do the following. Define a sum with $n$ summands each containing a new variable x[i] (in the $i$-th summand):

sum[n_] = Sum[i x[i], {i, 1, n}]


Then I want to differentiate the expression with respect to some x[i] like:

D[sum[n], x]


Mathematica returns $0$ instead of $2$. However if I supply a specific $n$ like:

D[sum, x]


everything works fine. I thought about using Assumption for $n$, but with no success so far. How can I do that right?

@m_goldberg Sorry for the newbie question, but how exactly is D[sum[n], x] the same thing as D[Sum[n, y], x]? In particular, Sum[n,y] = ny since it sums n y times but sum[n] = $\sum^{n}_{i=1} i x[i]$ (you can check it on a notebook on mathematica). – Pinocchio – 2015-08-07T19:48:53.750

Evaluating D[sum[n], x] is essentially the same as evaluating D[Sum[n, y], x], where y is value-free, which is nothing D recognizes as depending on x. – m_goldberg – 2012-12-15T16:10:56.940

2

This question is closely related to and might even be considered a duplicate of this one

– m_goldberg – 2012-12-15T16:25:32.800

1Extending Mathematica to allow sums of symbolic length really would require extended it to allow lists of symbolic length; only then could one properly worry about differentiating such objects. But how to make a robust, general design for such an extension is hardly obvious and raises all sorts of issues, e.g.: Should Infinity to an acceptable value for n? What would the effect be upon processing speed for specific-length objects? – murray – 2012-12-15T17:41:56.640

1I've changed your title to something more specific. please change it to something else if you think it's better – acl – 2012-12-15T19:39:57.960

5For formal differentiation, what I've noticed is if it's in a tuxedo it's probably a guy, and if it's in a gown it's usually a gal. But this is only a rough guideline, and anyway times have changed. – Daniel Lichtblau – 2012-12-15T22:15:22.260

This question is not related to generating a given amount of variables to use later on. The calculation should be without the need to specify a specific amount. The only prior should be that the number of variables is limited. – Wizard – 2012-12-22T17:01:57.537

39

sum[n_] := Sum[i x[i], {i, 1, n}]

x /: D[x[i_], x[j_], NonConstants -> {x}] :=
KroneckerDelta[i, j]

D[sum[n], x, NonConstants -> x]


$\begin{cases} 2 & n>1 \\ 1-n & \text{True} \end{cases}$

The trick here is the use of the NonConstants option of the derivative operator. This then has to be combined with a definition stating that the x[i] are independent variables for the purposes of this differentiation (hence the KroneckerDelta on the second line).

Edit: more discussion

And here is another cool result, completely symbolic:

Assuming[m ∈ Integers, D[sum[n], x[m], NonConstants -> x]]


$\left( \begin{array}{cc} \{ & \begin{array}{cc} m & m\geq 1 \\ 0 & \text{True} \\ \end{array} \\ \end{array} \right)-\left( \begin{array}{cc} \{ & \begin{array}{cc} m & m>2\land m\geq n+1 \\ n+1 & m=2\land n=1 \\ \end{array} \\ \end{array} \right)$

This isn't easy to absorb, but it works if you check it with specific examples by doing

condition = %;

Simplify[condition /. m -> 10]


$\begin{cases} 10 & n>9 \\ 0 & \text{True} \end{cases}$

In summary, it's worth pointing out that a lot of symbolic differentiation tasks can be achieved by using either NonConstants specifications in D or conversely using Constants specifications in Dt.

It might be worth mentioning that when doing this with more than one variable (say $x$ and $y$), it is necessary to have them Sorted in the NonConstants rule for the UpValue of the variables, otherwise it won't find a match when trying to evaluate D (and leave it unevaluated). – Joel Bosveld – 2015-06-17T07:44:04.710

sorry if this a super newbie question, but why does it say $1-n$ for...true? :/ What does that condition even mean? – Pinocchio – 2015-08-07T19:58:25.250

@Pinocchio True is the entry that humans would call "otherwise" - it's the default alternative if none of the above holds. – Jens – 2015-08-07T20:05:49.263

@Jens I see what true means now (weird, the programmer could have just said otherwise in a print statement...). But, how is the derivative wrt x be 1 - n for n < 1? for n<1 the sum doesn't even exist...so wouldn't technically it be the derivative of a non-existent sum (so zero), so the derivative of zero is zero? Or am I completely off? – Pinocchio – 2015-08-07T20:10:42.370

@Pinocchio No, the only other case in that sum is n=1. – Jens – 2015-08-07T22:06:57.973

@Jens if n=1 then $\sum_i i x_i = 1 x_1$ and n-1 = 0. Isn't that still sort of a bizarre ting to do if we are taking the derivative with respect to x? Sorry, I am just trying to understand why the output is the way it is. Currently, it still doesn't quite make sense to me. – Pinocchio – 2015-08-07T22:22:18.010

@Pinocchio Of course it looks weird, but nonetheless it's a true statement because the derivative is taken with respect to a variable that (for n=1) doesn't occur in the sum. Therefore its value would be 1-n==0. What makes it a little confusing is that the output is not in the most simplified form because n still appears even though it can only have one value in that last line. Still, it's correct. – Jens – 2015-08-07T23:01:24.353

@Jens sorry for taking this discussion further, I am genuinely curious about this and there's something that still "bothers me". In particular, why didn't mathematica return $1 - n + n^2 - n^3 + n^4 - n^ 5$ that is also zero...and you might ask me, why did I choose that weird equation and my answer is...Its random. But is mathematica expression random? Or was there a reason it chose $1 - n$ and not (any) other expression that involves n and is zero (in this special case n=1). Do you know what I mean? I wish I knew. If I did I wouldn't have extended this discussion :p – Pinocchio – 2015-08-08T01:35:12.210

This is really helpful. Thank you very much. – Wizard – 2012-12-22T17:03:11.890

21

I did some computation of formal derivatives a while back which might be of interest in this context (though keep in mind that this is anything but bullet proof! it does work for the cases I bothered to check though).

Clear[a]; Format[a[k_]] = Subscript[a, k]


Let us say we have an objective function which is formally a function of the vector a[i]

Q = Sum[Log[Sum[a[r] Subscript[B, r][Subscript[x, i]], {r, 1, p}]/
Sum[a[r] , {r, 1, p}]], {i, 1, n}] Let us define a couple of rules for formal differentiation as follows

Clear[d];
d[Log[x_], a[k_]] := 1/x d[x, a[k]]
d[Sum[x_, y__], a[k_]] := Sum[d[x, a[k]], y]
d[ a[k_] b_., a[k_]] := b /; FreeQ[b, a]
d[ a[q_] b_., a[k_]] := b Subscript[δ, k, q] /; FreeQ[b, a]
d[ c_  b_, a[k_]] := d[c, a[k]] b + d[b, a[k]] c
d[ b_ +  c_, a[k_]] := d[c, a[k]] + d[b, a[k]]
d[Subscript[δ, r_, q_], a[k_]] := 0
d[x_, a[k_]] := 0 /; FreeQ[x, a]
d[G_^n_, a[k_]] := n G^(n - 1) d[G , a[k]] /; ! FreeQ[G, a]
d[Exp[G_], a[q_]] := Exp[G] d[G , a[q]] /; ! FreeQ[G, a]



And a rule to deal with Kroneckers

ds = {Sum[a_ + b_, {s_, 1, p_}] :> Sum[a, {s, 1, p}] + Sum[b, {s, 1, p}],
Sum[ y_ Subscript[δ, r_, s_], {s_, 1, p_}] :> (y /. s -> r),
Sum[ y_ Subscript[δ, s_, r_], {s_, 1, p_}] :> (y /. s -> r),
Sum[ Subscript[δ, s_, r_], {r_, 1, p_}] :> 1,
Sum[δ[i_, k_] δ[j_, k_] y_. , {k_, n_}] -> δ[i, j] (y /. k -> i),
Sum[a_ b_, {r_, 1, p_}] :> a Sum[b, {r, 1, p}] /; NumberQ[a],
Sum[a__, {r_, 1, p_}] :>  Sum[Simplify[a], {r, 1, p}] }


Then, for instance, the gradient of Q with respect to one of the a[k] reads

grad = d[Q, a[k]]  /. ds // Simplify; Similarly the tensor of second derivatives w.r.t. a[k] and a[s] is given by

hess = d[d[Q, a[k]], a[s]] /. ds // Simplify As a less trivial example let us consider the 4th order derivatives of Q

 d[d[d[d[Q, a[k]], a[s]], a[m]], a[t]]; /. ds // Simplify For the problem at hand we check easily that

 Q = Sum[r a[r] , {r, 1, p}];

grad = d[Q, a[k]] // Simplify;


returns k as it should

EDIT

This process can be made a bit more general, say, on this Objective function

  Q = 1/2 Sum[(Sum[a[r] Subscript[B, r, i][a[q]], {r, 1, p}] -
Subscript[y, i])^2, {i, 1, n}] which depends non linearly on a[k] via B.

All we need is to add a new rule for d

d[H_[a[q_]],
a[k_]] := (D[H[x] , x] /. x -> a[k] ) Subscript[δ, k, q]


grad = d[Q, a[k]] // Simplify;
hess = d[d[Q, a[k]], a[s]]; hess /. ds // Simplify As a other example, let us look at a parametrized entropy distance,

 Q = -Sum[(Sum[a[r] Subscript[B, r, i], {r, 1, p}]/
Subscript[y, i]) Log[(Sum[a[r] Subscript[B, r, i], {r, 1, p}]/
Subscript[y, i])], {i, 1, n}] we can compute its Hessian while mapping twice the sum rule

 Map[# /. ds &, d[d[Q, a[k]], a[s]] /. ds] As a final example, consider a Poisson likelihood

 Q = Sum[Log[Exp[-a[k]] a[k]^Subscript[y, k]/Subscript[y, k]!], {k, 1, n}] so that

 grad = d[Q, a[k]] // Simplify and

  hess =d[d[Q, a[k]], a[s]]  /. ds // Simplify Of course these algebraic rules are not bullet proof but illustrate nicely the way mathematica handles new grammar.

unfortunately, i can not replicate these results in Mathematica 10.2. I would be appreciative if someone could try to reproduce this and/or suggest edits. – Eric Brown – 2015-09-11T01:53:07.447

I also cannot replicate these results using Mathematica 11 – MaPo – 2017-01-26T15:01:26.367

1I have added one rule to solve the pb on the last two example. It now works for Mathematica 10.3. – chris – 2017-01-26T19:18:26.153

@Nasser have you seen this?

– Mr.Wizard – 2012-12-15T19:51:06.057

8

Starting in M11.1, this works:

sum[n_] = Sum[i x[i],{i,1,n}];

D[sum[n],x] //InputForm


Piecewise[{{2, n >= 2}}, 0]

I've got M11, and I still get a result of 0. – clr66 – 2017-07-23T01:27:17.223

Works for me in Mathematica 11.1.1 on MacOS Sierra. – Wizard – 2017-07-24T15:13:23.157

@clr66 I should have said starting in M11.1. – Carl Woll – 2017-07-24T15:44:34.147

Finally managed to upgrade to M11.1.1 and attempted the following: Assuming[{1<=j<=n,Element[j,Integers]},D[Sum[i f[i],{i,1,n}],f[j]]]

(* j *)

So far, so good. However, I also get the following: Assuming[{1<=j<=n,Element[j,Integers]},D[Sum[i g[f[i]],{i,1,n}],f[j]]]

(* Sum[i*KroneckerDelta[i, j]*Derivative[g][f[i]], {i, 1, n}] *)

This answer is technically correct, although it would be nice to get a clean result of j g'[f[j]] Is there a way to get the KroneckerDelta to return this result? – clr66 – 2017-08-02T19:34:04.193

6

One solution might be ...

Method 1.

Define some variables:

x = Table[Unique[], {5}];


Form the inner product and differentiate:

D[Inner[Times, x, Range@Length@x], x[]]


2

Or if you prefer it in a functional form:

sum[n_] := Inner[Times, x[[1 ;; n]], Range@n] /; ( Length@x >= n)

D[sum, x[]]


3

Method 2.

You could take a different approach, but you need to be aware that it leaks variables of the form {x1,x2,x3,...} out into the general context:

sum[n_] := Inner[Times, Symbol["x" <> ToString@#] & /@ Range@n, Range@n]

sum


x1 + 2 x2 + 3 x3 + 4 x4 + 5 x5

D[sum3, x3]


3

This is exactly what I was trying to avoid. The exact amount of variables should not have to be specified. – Wizard – 2012-12-22T17:04:14.123

4

I like image_doctor's solution better, but how about using Array and looking for the position using that index each time? Like this:

xx = Array[x, 10, 1];
sum[n_] := Times[List @@ xx , Range]

sum[n]
(* {x, 2 x, 3 x, 4 x, 5 x, 6 x, 7 x, 8 x, 9 x, 10 x} *)


Now

sum[n_] := Times[List @@ xx^4, Range]
D[sum[n], xx[]][]
(*  20 x^3 *)


and

sum[n_] := Times[List @@ xx, Range]
D[sum[n], xx[]][]

(* 2 *)


It is kinda clumsy though. The problem, of course, is that one can't treat as a single variable as we do on paper. P.S., I tried to see if one can do this in Maple and could not do it directly as you wanted. Had to do a hack as above. (But I do not know Maple much, though.)

1

Nasser, if it matters to you, your "gravatar" is changing along with your I.P. address. You can make it a fixed one by entering an email address in your profile. This email is only visible to moderators and furthermore it doesn't need to be a real one anyway. Also see: http://mathematica.stackexchange.com/q/13525/121

– Mr.Wizard – 2012-12-15T19:42:42.390

3

A possible solution is to use the fact that Mathematica can easily take the derivative of an actual sum, but has problems with the symbolic one. In order to take the derivative on j-th term of a sum, we extract from the sum few terms centered around j. The number of extracted terms is equal to 2*dj+1. There is one drawback of this solution that you can always differentiate on j-th term. This method works with subscripted variables as well.

s1 = Sum[p[k]*a[k + 1] + p[k + 1]*a[k], {k, 1, nn}];
derivSum[s0_, xj_, dj_] := Block[{k, j}, D[Sum[s0[], {k, j - dj, j + dj}], xj]];
derivSum[s1, p[j], 4]
(* a[-1+j]+a[1+j] *)


2

\$Version

(*  "11.1.1 for Mac OS X x86 (64-bit) (April 18, 2017)"  *)

sum[n_] = Sum[i x[i], {i, 1, n}];

Assuming[Element[{n, m}, Integers] && n >= m >= 1, D[sum[n], x[m]]]

(*  m  *)

Assuming[Element[{n, m}, Integers] && n >= 1 && m >= 1,
D[sum[n], x[m]] // Simplify] 