Skip to main content

How to customize derivative behavior via upvalues?


I'm having trouble getting customized derivatives (defined using upvalues) to behave in sane ways. By this I mean that I'd like to define a derivative using an upvalue and then have Mathematica apply all derivative rules it normally would, delegating to the upvalues when appropriate. For example.


X /: D[X[x_], y_] := A KroneckerDelta[x,y]
Y /: D[Y[x_], y_] := B KroneckerDelta[x,y]
D[X[x], x] (* Gives the Expected Answer: A *)
D[X[x] + Y[x], x] (* Desired Output: A + B *)
D[X[x] Y[x], x] (* Desired Output: A Y[x] + X[x] B *)

For the latter two, I get X'[x] + Y'[x] and Y[x] X'[x] + X[x] Y'[x]. I'm not understanding why the upvalues aren't kicking in. A similar question was asked here, but I couldn't adjust it for my problem.



Update:


It looks like I didn't specify my problem in enough detail. To give a better sense of the direction I'm looking, see this question. My (present) understanding of Derivative suggests it cannot achieve what I want since the arguments to my functions are abstract indices (e.g. multiple arguments are implicit).


Consider a (discrete) joint probability pXY(i,j) where i and j are indexes (with unspecified ranges) that represent the possible values. Now, consider a marginal distribution pX(i)=jpXY(i,j). Each pX(i) depends on all of the pXY(j,k) with j=i. I was hoping to declare the derivatives of pXY(i,j) and pX(i) with respect to pXY(k,l) (treating each pXY(i,j) as independent variables). Then, I wanted to be able to differentiate any generic expression involving those functions. Mathematically, this is the behavior:


pXY(i,j)pXY(k,l)=δijδjkpX(i)pXY(k,l)=δik


Here is a toy expression I'd like to differentiate:


pXY(k,l)[pXY(i,j)pX(i)2]=δikδjlpX(i)2+2pXY(i,j)pX(i)δik


The above is just an application of the product rule, so it's pretty straight-forward to do these calculations by hand, once the derivatives of the primitives are specified. However, I don't want to re-specify all the rules of differentiation. Somehow I'd like to tell Mathematica to use its own system, while incorporating my primitives.


Things I've tried...


pXY /: D[pXY[i_, j_], pXY[k_, l_]] := Simplify[KroneckerDelta[i, k] KroneckerDelta[j, l]]
pX /: D[pX[i_], pXY[j_, k_]] := KroneckerDelta[i, j]

D[pXY[i, j] pX[i]^2, pXY[k, l]] (* Undesirably returns 0 *)

I think I understand why (via pattern matching) this gives 0. Another attempt:


pXY /: D[pXY[i_, j_], pXY[k_, l_]] := Simplify[KroneckerDelta[i, k] KroneckerDelta[j, l]]
pX[i_] := MySumX[pXY[i, j], {i, j}]
MySum /: D[MySum[s_, {i_, j_}], pXY[k_, l_]] := D[s /. {i -> k, j -> l}, pXY[k, l]]
MySumX /: D[MySumX[s_, {i_, j_}], pXY[k_, l_]] := D[s /. {i -> k}, pXY[k, l]]

D[pX[i], pXY[i, k]] (* Result is correct *)
D[pXY[i, j] + pX[i], pXY[k, l]] (* Result is not correct *)


Hopefully its clearer now how the original question relates to what I'm truly after. Just to keep it fun, here is another expression I'd like to differentiate:


pXY(k,l)i,jpXY(i,j)logpXY(i,j)pX(i)pY(j)



Answer



Update


In my original answer (which I have left below for entertainment value) I asserted that it would be difficult to make it work with upvalues using D, but Tom has reminded me of the NonConstants option. When the symbols listed as NonConstants are differentiated they remain as expressions with head D:


D[a x, x, NonConstants -> {a}]
(* a + x D[a, x, NonConstants -> {a}] *)

This means that it is perfectly possible to use the upvalues approach suggested in the question, all that is required is to extend the pattern with a BlankNullSequence[] to allow for the presence of the option.



pXY /: D[pXY[i_, j_], pXY[k_, l_], ___] := Simplify[KroneckerDelta[i, k] KroneckerDelta[j, l]]
pX /: D[pX[i_], pXY[j_, k_], ___] := KroneckerDelta[i, j]

SetOptions[D, NonConstants -> {pXY, pX}];

D[pXY[i, j] pX[i]^2, pXY[k, l]]
(* KroneckerDelta[i, k] KroneckerDelta[j, l] pX[i]^2 +
2 KroneckerDelta[i, k] pX[i] pXY[i, j] *)

Original answer



In response to the updated question:


I think it will be very difficult to make it work using rules based on D. This simple example shows the problem:


a /: D[a[x], x] := 1

D[a[x], x]
(* 1 *)

D[2 a[x], x]
(* 2 a'[x] *)


Because there is no special rule defined for D[2 a[x], x] it is evaluated as normal, giving 2 Derivative[1][a][x]. You could of course add lots more upvalues to deal with different expressions, but you will end up effectively rewriting the rules of differentiation.


A better solution is to create your special rules based on Derivative, like this:


pXY[i_, j_]'[pXY[k_, l_]] := Simplify[KroneckerDelta[i, k] KroneckerDelta[j, l]]
pX[i_]'[pXY[j_, k_]] := KroneckerDelta[i, j]

You will note that your example does not appear to work, however:


D[pXY[i, j] pX[i]^2, pXY[k, l]]
(* 0 *)

What's going on here is that D sees that the function does not appear to depend on the differentiation variable, so it returns 0 immediately. It's the same as D[Sin, x] returning 0 while D[Sin[x], x] returns Cos[x]. The solution is to make the expression explicitly a function of the differentiation variable:



D[pXY[i, j][pXY[k, l]] pX[i][pXY[k, l]]^2, pXY[k, l]]
(* KroneckerDelta[i, k] KroneckerDelta[j, l] pX[i][pXY[k, l]]^2 +
2 KroneckerDelta[i, k] pX[i][pXY[k, l]] pXY[i, j][pXY[k, l]] *)

This works, but has made the notation messy. A possible solution is to define your own version of D which automatically converts func to func[var] before differentiating, and then converts func[var] back to func afterwards.


myD[expr_, var_] := D[expr /. (p_pX | p_pXY) :> p[var], var] /. (p_pX | p_pXY)[var] :> p

You now have:


myD[pXY[i, j] pX[i]^2, pXY[k, l]] 
(* KroneckerDelta[i, k] KroneckerDelta[j, l] pX[i]^2 +

2 KroneckerDelta[i, k] pX[i] pXY[i, j] *)

Comments

Popular posts from this blog

functions - Get leading series expansion term?

Given a function f[x] , I would like to have a function leadingSeries that returns just the leading term in the series around x=0 . For example: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x)] x and leadingSeries[(1/x + 2 + (1 - 1/x^3)/4)/(4 + x)] -(1/(16 x^3)) Is there such a function in Mathematica? Or maybe one can implement it efficiently? EDIT I finally went with the following implementation, based on Carl Woll 's answer: lds[ex_,x_]:=( (ex/.x->(x+O[x]^2))/.SeriesData[U_,Z_,L_List,Mi_,Ma_,De_]:>SeriesData[U,Z,{L[[1]]},Mi,Mi+1,De]//Quiet//Normal) The advantage is, that this one also properly works with functions whose leading term is a constant: lds[Exp[x],x] 1 Answer Update 1 Updated to eliminate SeriesData and to not return additional terms Perhaps you could use: leadingSeries[expr_, x_] := Normal[expr /. x->(x+O[x]^2) /. a_List :> Take[a, 1]] Then for your examples: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x), x] leadingSeries[Exp[x], x] leadingSeries[(1/x + 2 + (1 - 1/x...

mathematical optimization - Minimizing using indices, error: Part::pkspec1: The expression cannot be used as a part specification

I want to use Minimize where the variables to minimize are indices pointing into an array. Here a MWE that hopefully shows what my problem is. vars = u@# & /@ Range[3]; cons = Flatten@ { Table[(u[j] != #) & /@ vars[[j + 1 ;; -1]], {j, 1, 3 - 1}], 1 vec1 = {1, 2, 3}; vec2 = {1, 2, 3}; Minimize[{Total@((vec1[[#]] - vec2[[u[#]]])^2 & /@ Range[1, 3]), cons}, vars, Integers] The error I get: Part::pkspec1: The expression u[1] cannot be used as a part specification. >> Answer Ok, it seems that one can get around Mathematica trying to evaluate vec2[[u[1]]] too early by using the function Indexed[vec2,u[1]] . The working MWE would then look like the following: vars = u@# & /@ Range[3]; cons = Flatten@{ Table[(u[j] != #) & /@ vars[[j + 1 ;; -1]], {j, 1, 3 - 1}], 1 vec1 = {1, 2, 3}; vec2 = {1, 2, 3}; NMinimize[ {Total@((vec1[[#]] - Indexed[vec2, u[#]])^2 & /@ R...

What is and isn't a valid variable specification for Manipulate?

I have an expression whose terms have arguments (representing subscripts), like this: myExpr = A[0] + V[1,T] I would like to put it inside a Manipulate to see its value as I move around the parameters. (The goal is eventually to plot it wrt one of the variables inside.) However, Mathematica complains when I set V[1,T] as a manipulated variable: Manipulate[Evaluate[myExpr], {A[0], 0, 1}, {V[1, T], 0, 1}] (*Manipulate::vsform: Manipulate argument {V[1,T],0,1} does not have the correct form for a variable specification. >> *) As a workaround, if I get rid of the symbol T inside the argument, it works fine: Manipulate[ Evaluate[myExpr /. T -> 15], {A[0], 0, 1}, {V[1, 15], 0, 1}] Why this behavior? Can anyone point me to the documentation that says what counts as a valid variable? And is there a way to get Manpiulate to accept an expression with a symbolic argument as a variable? Investigations I've done so far: I tried using variableQ from this answer , but it says V[1...