Skip to main content

How to customize derivative behavior via upvalues?


I'm having trouble getting customized derivatives (defined using upvalues) to behave in sane ways. By this I mean that I'd like to define a derivative using an upvalue and then have Mathematica apply all derivative rules it normally would, delegating to the upvalues when appropriate. For example.


X /: D[X[x_], y_] := A KroneckerDelta[x,y]
Y /: D[Y[x_], y_] := B KroneckerDelta[x,y]
D[X[x], x] (* Gives the Expected Answer: A *)
D[X[x] + Y[x], x] (* Desired Output: A + B *)
D[X[x] Y[x], x] (* Desired Output: A Y[x] + X[x] B *)

For the latter two, I get X'[x] + Y'[x] and Y[x] X'[x] + X[x] Y'[x]. I'm not understanding why the upvalues aren't kicking in. A similar question was asked here, but I couldn't adjust it for my problem.



Update:


It looks like I didn't specify my problem in enough detail. To give a better sense of the direction I'm looking, see this question. My (present) understanding of Derivative suggests it cannot achieve what I want since the arguments to my functions are abstract indices (e.g. multiple arguments are implicit).


Consider a (discrete) joint probability $p_{XY}(i,j)$ where $i$ and $j$ are indexes (with unspecified ranges) that represent the possible values. Now, consider a marginal distribution $p_X(i) = \sum_j p_{XY}(i,j)$. Each $p_X(i)$ depends on all of the $p_{XY}(j,k)$ with $j = i$. I was hoping to declare the derivatives of $p_{XY}(i,j)$ and $p_X(i)$ with respect to $p_{XY}(k,l)$ (treating each $p_{XY}(i,j)$ as independent variables). Then, I wanted to be able to differentiate any generic expression involving those functions. Mathematically, this is the behavior:


$$ \frac{\partial p_{XY}(i,j)}{\partial p_{XY}(k,l)} = \delta_{ij} \delta_{jk} \qquad\qquad \frac{\partial p_X(i)}{\partial p_{XY}(k,l)} = \delta_{ik} $$


Here is a toy expression I'd like to differentiate:


$$\frac{\partial}{\partial p_{XY}(k,l)} \left[ p_{XY}(i,j) p_X(i)^2 \right] = \delta_{ik} \delta_{jl}\, p_X(i)^2 + 2 p_{XY}(i,j)\, p_X(i) \, \delta_{ik} $$


The above is just an application of the product rule, so it's pretty straight-forward to do these calculations by hand, once the derivatives of the primitives are specified. However, I don't want to re-specify all the rules of differentiation. Somehow I'd like to tell Mathematica to use its own system, while incorporating my primitives.


Things I've tried...


pXY /: D[pXY[i_, j_], pXY[k_, l_]] := Simplify[KroneckerDelta[i, k] KroneckerDelta[j, l]]
pX /: D[pX[i_], pXY[j_, k_]] := KroneckerDelta[i, j]

D[pXY[i, j] pX[i]^2, pXY[k, l]] (* Undesirably returns 0 *)

I think I understand why (via pattern matching) this gives 0. Another attempt:


pXY /: D[pXY[i_, j_], pXY[k_, l_]] := Simplify[KroneckerDelta[i, k] KroneckerDelta[j, l]]
pX[i_] := MySumX[pXY[i, j], {i, j}]
MySum /: D[MySum[s_, {i_, j_}], pXY[k_, l_]] := D[s /. {i -> k, j -> l}, pXY[k, l]]
MySumX /: D[MySumX[s_, {i_, j_}], pXY[k_, l_]] := D[s /. {i -> k}, pXY[k, l]]

D[pX[i], pXY[i, k]] (* Result is correct *)
D[pXY[i, j] + pX[i], pXY[k, l]] (* Result is not correct *)


Hopefully its clearer now how the original question relates to what I'm truly after. Just to keep it fun, here is another expression I'd like to differentiate:


$$ \frac{\partial}{\partial p_{XY}(k,l)} \sum_{i,j} p_{XY}(i,j)\log \frac{ p_{XY}(i,j)}{p_X(i) p_Y(j)}$$



Answer



Update


In my original answer (which I have left below for entertainment value) I asserted that it would be difficult to make it work with upvalues using D, but Tom has reminded me of the NonConstants option. When the symbols listed as NonConstants are differentiated they remain as expressions with head D:


D[a x, x, NonConstants -> {a}]
(* a + x D[a, x, NonConstants -> {a}] *)

This means that it is perfectly possible to use the upvalues approach suggested in the question, all that is required is to extend the pattern with a BlankNullSequence[] to allow for the presence of the option.



pXY /: D[pXY[i_, j_], pXY[k_, l_], ___] := Simplify[KroneckerDelta[i, k] KroneckerDelta[j, l]]
pX /: D[pX[i_], pXY[j_, k_], ___] := KroneckerDelta[i, j]

SetOptions[D, NonConstants -> {pXY, pX}];

D[pXY[i, j] pX[i]^2, pXY[k, l]]
(* KroneckerDelta[i, k] KroneckerDelta[j, l] pX[i]^2 +
2 KroneckerDelta[i, k] pX[i] pXY[i, j] *)

Original answer



In response to the updated question:


I think it will be very difficult to make it work using rules based on D. This simple example shows the problem:


a /: D[a[x], x] := 1

D[a[x], x]
(* 1 *)

D[2 a[x], x]
(* 2 a'[x] *)


Because there is no special rule defined for D[2 a[x], x] it is evaluated as normal, giving 2 Derivative[1][a][x]. You could of course add lots more upvalues to deal with different expressions, but you will end up effectively rewriting the rules of differentiation.


A better solution is to create your special rules based on Derivative, like this:


pXY[i_, j_]'[pXY[k_, l_]] := Simplify[KroneckerDelta[i, k] KroneckerDelta[j, l]]
pX[i_]'[pXY[j_, k_]] := KroneckerDelta[i, j]

You will note that your example does not appear to work, however:


D[pXY[i, j] pX[i]^2, pXY[k, l]]
(* 0 *)

What's going on here is that D sees that the function does not appear to depend on the differentiation variable, so it returns 0 immediately. It's the same as D[Sin, x] returning 0 while D[Sin[x], x] returns Cos[x]. The solution is to make the expression explicitly a function of the differentiation variable:



D[pXY[i, j][pXY[k, l]] pX[i][pXY[k, l]]^2, pXY[k, l]]
(* KroneckerDelta[i, k] KroneckerDelta[j, l] pX[i][pXY[k, l]]^2 +
2 KroneckerDelta[i, k] pX[i][pXY[k, l]] pXY[i, j][pXY[k, l]] *)

This works, but has made the notation messy. A possible solution is to define your own version of D which automatically converts func to func[var] before differentiating, and then converts func[var] back to func afterwards.


myD[expr_, var_] := D[expr /. (p_pX | p_pXY) :> p[var], var] /. (p_pX | p_pXY)[var] :> p

You now have:


myD[pXY[i, j] pX[i]^2, pXY[k, l]] 
(* KroneckerDelta[i, k] KroneckerDelta[j, l] pX[i]^2 +

2 KroneckerDelta[i, k] pX[i] pXY[i, j] *)

Comments

Popular posts from this blog

functions - Get leading series expansion term?

Given a function f[x] , I would like to have a function leadingSeries that returns just the leading term in the series around x=0 . For example: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x)] x and leadingSeries[(1/x + 2 + (1 - 1/x^3)/4)/(4 + x)] -(1/(16 x^3)) Is there such a function in Mathematica? Or maybe one can implement it efficiently? EDIT I finally went with the following implementation, based on Carl Woll 's answer: lds[ex_,x_]:=( (ex/.x->(x+O[x]^2))/.SeriesData[U_,Z_,L_List,Mi_,Ma_,De_]:>SeriesData[U,Z,{L[[1]]},Mi,Mi+1,De]//Quiet//Normal) The advantage is, that this one also properly works with functions whose leading term is a constant: lds[Exp[x],x] 1 Answer Update 1 Updated to eliminate SeriesData and to not return additional terms Perhaps you could use: leadingSeries[expr_, x_] := Normal[expr /. x->(x+O[x]^2) /. a_List :> Take[a, 1]] Then for your examples: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x), x] leadingSeries[Exp[x], x] leadingSeries[(1/x + 2 + (1 - 1/x...

How to thread a list

I have data in format data = {{a1, a2}, {b1, b2}, {c1, c2}, {d1, d2}} Tableform: I want to thread it to : tdata = {{{a1, b1}, {a2, b2}}, {{a1, c1}, {a2, c2}}, {{a1, d1}, {a2, d2}}} Tableform: And I would like to do better then pseudofunction[n_] := Transpose[{data2[[1]], data2[[n]]}]; SetAttributes[pseudofunction, Listable]; Range[2, 4] // pseudofunction Here is my benchmark data, where data3 is normal sample of real data. data3 = Drop[ExcelWorkBook[[Column1 ;; Column4]], None, 1]; data2 = {a #, b #, c #, d #} & /@ Range[1, 10^5]; data = RandomReal[{0, 1}, {10^6, 4}]; Here is my benchmark code kptnw[list_] := Transpose[{Table[First@#, {Length@# - 1}], Rest@#}, {3, 1, 2}] &@list kptnw2[list_] := Transpose[{ConstantArray[First@#, Length@# - 1], Rest@#}, {3, 1, 2}] &@list OleksandrR[list_] := Flatten[Outer[List, List@First[list], Rest[list], 1], {{2}, {1, 4}}] paradox2[list_] := Partition[Riffle[list[[1]], #], 2] & /@ Drop[list, 1] RM[list_] := FoldList[Transpose[{First@li...

front end - keyboard shortcut to invoke Insert new matrix

I frequently need to type in some matrices, and the menu command Insert > Table/Matrix > New... allows matrices with lines drawn between columns and rows, which is very helpful. I would like to make a keyboard shortcut for it, but cannot find the relevant frontend token command (4209405) for it. Since the FullForm[] and InputForm[] of matrices with lines drawn between rows and columns is the same as those without lines, it's hard to do this via 3rd party system-wide text expanders (e.g. autohotkey or atext on mac). How does one assign a keyboard shortcut for the menu item Insert > Table/Matrix > New... , preferably using only mathematica? Thanks! Answer In the MenuSetup.tr (for linux located in the $InstallationDirectory/SystemFiles/FrontEnd/TextResources/X/ directory), I changed the line MenuItem["&New...", "CreateGridBoxDialog"] to read MenuItem["&New...", "CreateGridBoxDialog", MenuKey["m", Modifiers-...