
contexts - Second argument of BeginPackage with nested package loading


I naively thought that the second argument of BeginPackage can simply be used to ensure the loading and availability in the $ContextPath of additional packages.


Example:


(* Pack1.m *)
BeginPackage["Pack1`"]

Print["Pack1: ", $ContextPath]


EndPackage[]

and


(* Pack2.m *)
BeginPackage["Pack2`", {"Pack1`"}]

Print["Pack2: ", $ContextPath]

EndPackage[]


Now loading Pack2` gives me this:


Needs["Pack2`"]

During evaluation of Pack1: {Pack1`,System`}

During evaluation of Pack2: {Pack2`,Pack1`,System`}

$ContextPath
(* {"Pack2`", "Pack1`", ...} *)


Everything is as I expected. Pack1` is available for use within the implementation of Pack2 and it's also available to the user (i.e. included in the $ContextPath) after Pack2 has finished loading.


This second behaviour is what makes the second argument of BeginPackage so convenient. If we had done BeginPackage["Pack2`"]; Needs["Pack1`"]; ... instead, then Pack1` would have been available for use only within the implementation of Pack2, but not after Pack2 has finished loading.


Let us add a third package that depends on Pack2` now.


(* Pack3.m *)
BeginPackage["Pack3`", {"Pack2`"}]

Print["Pack3: ",$ContextPath]

EndPackage[]


In a fresh kernel, let's load Pack3.


Needs["Pack3`"]

During evaluation of Pack1: {Pack1`,System`}

During evaluation of Pack2: {Pack2`,Pack1`,System`}

During evaluation of Pack3: {Pack3`,Pack2`,System`}

$ContextPath

(* {"Pack3`", "Pack2`", "Pack1`", ...} *)

Just like before, loading only Pack3` makes all three packages available to the user (i.e. includes all of them in the $ContextPath).


But wait! Adding the Pack2` dependency to Pack3 did not make Pack1` available within the implementation of Pack3. I found this surprising and I got bitten by it (it caused a strange bug).


Do others find this surprising too? What is the reasoning for this behaviour? What are its advantages compared to simply ensuring that Pack2 makes Pack1 available as well regardless of whether we need it inside of a package (for its implementation) or outside of it (for interactive use)?


What implications does this behaviour have for proper package design, especially in the case when the Kernel/init.m file of a package Gets several sub-packages, each of which has its own context and BeginPackage? This complex package might then be the dependency of another.



Answer



I've certainly encountered this behavior before. While I can't speak authoritatively, I think this is as designed, although it does introduce a certain inconsistency. I also think that this issue results from a clash of cultures: the end-user-oriented one from the earlier days of Mathematica, and the one coming from standard software-engineering practices.


A bit of history, and the package extension model


I think it may help to go a little into the history of the package mechanism.



On the one hand, private import via Needs inside a package wasn't always available: in early versions of Mathematica, such an import would still keep the context of the loaded package on the $ContextPath after the package had been loaded, even if Needs was called in the private section of the package.


On the other hand, to me it looks like early package development practices had, in particular, these features:



  • (Deeply) nested package dependencies were not very common

  • In many cases, one wanted to extend one of the built-in packages, giving the end user the functionality of a core package extended with some additional functionality, rather than to build a package based on other packages but expose only that package's interface.

  • The end-user typically was not supposed to know anything about package development and such. In other words, the majority of end-users were considered non-programmers (which was IMO quite justified).


So, the main goals of the second argument of BeginPackage were, I think, these:



  • The usual one - make the functionality from packages listed in the second argument of BeginPackage available for the implementation of a given package


  • Provide a formal declarative syntax to specify package dependencies

  • Provide a way to expose an extended package "MyPackage`", that would also make additional packages available to the end-users without a need to call Needs on them separately.


While the first two goals are really necessary and are present in one form or another in most programming languages, the last one is different. The inconsistency it causes is exactly what you noted: if we think of a package as an extension of one or more other packages, then it should always behave like that, no matter which way we load it.


Why I think the current behavior is better


From the developer's perspective, I'd argue that the current behavior is better than if BeginPackage put all those packages and their dependencies on the $ContextPath. Here are the reasons:



  • Information-hiding. It is always better to make available to the client code only as much information as it needs, and no more.


  • Level of control. By restricting the set of loaded packages to only those listed in the second argument of BeginPackage (but not their dependencies), the system allows the developer to second-guess the developers of the packages s/he uses to build the package at hand. The developer can then control exactly which packages are on the $ContextPath during the loading of their own code.



    If the packages were loaded automatically with all their dependencies, the developer of a given package would need to care about name clashes with all those dependent packages, and perhaps remove them from the $ContextPath. However, in general this is not even possible, since s/he is not supposed to know all those dependencies of dependencies.




  • Nesting ambiguity. If the dependencies of dependencies were loaded too, then which packages do we keep on the $ContextPath after the package loads? Where do we stop in this nesting? We may end up adding a bunch of contexts to the $ContextPath, which would be really bad.
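In terms of the Pack1`/Pack2`/Pack3` example from the question, this level of control means that a package names every context its own implementation relies on, rather than counting on the dependencies of its dependencies. Here is a minimal sketch that can be pasted into a fresh kernel. It assumes empty stand-ins for Pack1.m and Pack2.m, and the symbol pathInside is introduced purely for illustration:

```mathematica
(* Empty stand-ins for Pack1.m and Pack2.m from the question *)
BeginPackage["Pack1`"];
EndPackage[];

BeginPackage["Pack2`", {"Pack1`"}];
EndPackage[];

(* Pack3 lists both contexts its implementation wants to see *)
BeginPackage["Pack3`", {"Pack2`", "Pack1`"}];
Pack3`pathInside = $ContextPath;  (* recorded while Pack3` is being defined *)
EndPackage[];

(* Check whether Pack1` was visible inside Pack3 *)
MemberQ[Pack3`pathInside, "Pack1`"]
```

Listing Pack1` explicitly keeps the decision with the author of Pack3: if Pack2 later drops or replaces its own dependency, Pack3 is unaffected.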




Implications for package design


I think the guiding principles should be separation of responsibility and information-hiding.


The fact that your package A depends on packages B and C is the internal business of your package, about which the users of A could not care less. So, there are two different cases here:





  • You do want to expose some of the functionality of B and / or C to the end user of your package A


    As I said before, this really doesn't sound like the right approach to me, in particular because, preferably, for any functionality there should be a single source that provides it and is responsible for it. Of course, duplication of code is the worst, but duplication of the interface is almost as bad.


    However, if you really need to, there is always delegation: create your own wrappers around the functions from B and C that you need, and expose these wrappers as a part of the interface of A. This means that you now take responsibility for those functions as a part of A's interface, which is IMO the right thing to do. If something later changes in the functionality of B or C, it will be your responsibility to maintain the consistency of the interface of A, so you don't put that burden on the users of A.




  • You don't require the end user of A to have access to B and C. Then you don't need anything besides the standard single-package development machinery.
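The delegation idea from the first case can be sketched as follows. This is only an illustration under stated assumptions: B` is defined inline as a stand-in for a real dependency, and the function names bNorm and aNorm are hypothetical.

```mathematica
(* Stand-in for a dependency B`; in practice it would live in its own file. *)
BeginPackage["B`"];
bNorm::usage = "bNorm[v] gives the Euclidean norm of the list v.";
Begin["`Private`"];
bNorm[v_List] := Sqrt[v . v];
End[];
EndPackage[];

(* Simulate a user who has not loaded B` themselves. *)
$ContextPath = DeleteCases[$ContextPath, "B`"];

(* Package A` imports B` privately and exposes its own wrapper. *)
BeginPackage["A`"];
aNorm::usage = "aNorm[v] gives the Euclidean norm of the list v.";
Begin["`Private`"];
Needs["B`"];                      (* private import; undone by EndPackage[] *)
aNorm[v_List] := B`bNorm[v];      (* fully qualified, independent of $ContextPath *)
End[];
EndPackage[];

aNorm[{3, 4}]
```

The user of A` sees only aNorm. If B` later renames or changes bNorm, only the one-line wrapper inside A` has to be adjusted.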




If your package contains several sub-packages, you simply load those in your Kernel/init.m. This does not change the fact that you expose a single interface. If you want to make those sub-packages relatively independent, and packages in their own right, this may require using some delegation / wrappers inside your main package, as a price for it, as I mentioned above. In most cases, however, I wouldn't do it, but would let those users who need that functionality load those packages (B and / or C in this example) separately, on their own.
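As a rough sketch of such a layout, a Kernel/init.m could simply Get each sub-package in dependency order. The package name MyPackage and the sub-package names SubA, SubB and Main are hypothetical, and the sketch assumes the files MyPackage/SubA.m, MyPackage/SubB.m and MyPackage/Main.m exist in a directory on $Path:

```mathematica
(* MyPackage/Kernel/init.m  (hypothetical layout) *)

(* Each sub-package has its own BeginPackage[...] / EndPackage[] pair. *)
Get["MyPackage`SubA`"]
Get["MyPackage`SubB`"]

(* Main` defines the single public interface and may wrap SubA` / SubB`. *)
Get["MyPackage`Main`"]
```

Which contexts end up visible to the user after Needs["MyPackage`"] still depends on what each sub-package does with its own BeginPackage, so this only fixes the loading order, not the exposure policy.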



Summary


Putting it more bluntly: the extension package idea (leaving contexts on the $ContextPath after the package has been loaded) may be OK from the end-user usability point of view, but is IMO a failed concept for software development.


The advantage of such an approach is that it is friendly to non-programmers. The disadvantage is that it does not nest, and therefore also does not scale. From the software engineering perspective, a package cannot and should not take any responsibility for other packages, and should only be responsible for its own interface. In addition, the "extension package" model makes it harder to enforce information hiding and encapsulation, two more of the fundamental software engineering principles.


So, here I agree with Kuba, in that I tend to use the second argument of BeginPackage rarely. The only scalable model of imports is private imports, which is also what all other programming languages I am familiar with use.

