
searching - Find file names in directory


I am trying to list every file that sits directly inside any directory with a given name (a file in a subdirectory of such a directory does not count). For the sake of example, suppose I want to find all files in each folder called "Preferences", and let's restrict the search to the folder ~/.Mathematica. If I wanted to do this from the terminal, I could just do


find ~/.Mathematica -regex "$HOME/.Mathematica.*Preferences/[^/]*."


This works, and I see there is a single file matching my criterion: ~/.Mathematica/ApplicationData/Parallel/Preferences/Preferences.m


But I want to do it conveniently in Mathematica. I think the FileNames function should do it.


I will first run


SetDirectory["~/.Mathematica"]



Then I would run


fileAndDirectoryNames = 
FileNames["*",RegularExpression[".*Preferences"], 1]

followed by


fileNames = Select[fileAndDirectoryNames, ! DirectoryQ[#] &]

However, this gives incorrect results for me: fileAndDirectoryNames is an empty list. If I instead run


fileAndDirectoryNames = 
FileNames["*", RegularExpression[".*/.*/Preferences"], 1]


and recompute fileNames as before, then I get correct output.


I am confused because it seems to me that the regular expression in my second attempt is stronger (allows fewer matches) than the one in my first attempt. The FileNames function should be monotonic in its second argument: if you weaken the pattern, the new output ought to be a superset of the original output. Yet this doesn't seem to happen. Why is this? I am not sure whether the problem lies with Mathematica or with my understanding of regular expressions.



Answer



All three parameters of FileNames can affect the depth at which Mathematica searches for results. It seems like your confusion is a result of the interaction among these parameters. This is easily understandable, as the documentation for FileNames is not very illustrative. (Indeed, my first attempt at answering this question was faulty for the same reason.)


The first parameter -- the form -- should be thought of as a relative path. It has no intrinsic depth specification, but will be tested at the depths specified by the next two parameters. However, it is possible to control the depth of the search with this parameter by specifying a folder hierarchy in the form you are searching for. (See below.) The form can be a literal string, a string with simple wildcards (*, etc.), a Mathematica-style string pattern, or a regular expression.


The second parameter -- the directories -- specifies the top-level locations in which Mathematica will conduct its search. The first parameter will be tested relative to what is specified here. The directory specification can also be a literal or a pattern, as above.


The third parameter -- the depth -- tells Mathematica whether it should repeat the search for the first parameter in subdirectories of the paths specified in the second parameter. When its value is 1 (the default), Mathematica will only return matches that are immediately relative to a directory specified in the second argument.


Rather than writing a bunch of prose, I think it will be easier to just supply some examples to see how these things can interact.


First, here is the entire directory tree of the folder tmp:



FileNames["*", "tmp", Infinity]


{"tmp/1B.2010-2011.dataless", "tmp/Preferences", "tmp/Preferences/test6", "tmp/t1", "tmp/t1/Preferences", "tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1", "tmp/t1/st1/Preferences", "tmp/t1/st1/Preferences/test9", "tmp/t2", "tmp/t3", "tmp/t3/Preferences", "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4", "tmp/test5"}
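To follow along, the tree can be rebuilt in a scratch location. The following is only a sketch: the names scratch, exampleDirs and exampleFiles are made up for illustration, it assumes Unix-style "/" separators, and where the listing does not say whether a leaf is a file or an empty directory (t2, test5, the .dataless entry) the choice below is a guess.

```mathematica
(* Sketch: rebuild the example tree in a scratch location.
   Assumptions: Unix-style "/" separators; t2 is an empty directory;
   test5 and the .dataless entry are plain files. *)
scratch = CreateDirectory[];  (* no arguments: a fresh directory under $TemporaryDirectory *)
SetDirectory[scratch];

exampleDirs = {
  "tmp/Preferences", "tmp/t1/Preferences/dir1",
  "tmp/t1/Preferences/Preferences", "tmp/t1/st1/Preferences",
  "tmp/t2", "tmp/t3/Preferences"};

exampleFiles = {
  "tmp/1B.2010-2011.dataless", "tmp/Preferences/test6",
  "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1",
  "tmp/t1/Preferences/test2", "tmp/t1/st1/Preferences/test9",
  "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4", "tmp/test5"};

(* Intermediate directories such as tmp/t1 are created as needed. *)
Scan[CreateDirectory[#, CreateIntermediateDirectories -> True] &, exampleDirs];

(* Opening a file for writing and closing it right away leaves an empty file. *)
Scan[Close[OpenWrite[#]] &, exampleFiles];
```

Afterwards, ResetDirectory[] followed by DeleteDirectory[scratch, DeleteContents -> True] should remove the scratch tree again.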



So of course we see that Infinity directs Mathematica to walk the whole tree. By contrast, the default value (1) yields:


FileNames["*", "tmp"]


{"tmp/1B.2010-2011.dataless", "tmp/Preferences", "tmp/t1", "tmp/t2", "tmp/t3", "tmp/test5"}




Similarly,


FileNames["*", "tmp", 2]


{"tmp/1B.2010-2011.dataless", "tmp/Preferences", "tmp/Preferences/test6", "tmp/t1", "tmp/t1/Preferences", "tmp/t1/st1", "tmp/t2", "tmp/t3", "tmp/t3/Preferences", "tmp/test5"}



This is all straightforward. Now, consider these examples. Take note of how we are controlling the depth of the search in various ways.


FileNames["t1/*", "tmp"]



{"tmp/t1/Preferences", "tmp/t1/st1"}



FileNames["*", "tmp/t1"]


{"tmp/t1/Preferences", "tmp/t1/st1"}



FileNames["t1/*", "tmp", 2]



{"tmp/t1/Preferences", "tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1", "tmp/t1/st1/Preferences"}



FileNames["t1/*", "tmp", Infinity]


{"tmp/t1/Preferences", "tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1", "tmp/t1/st1/Preferences", "tmp/t1/st1/Preferences/test9"}



FileNames["test*", "tmp/t1", Infinity]



{"tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1/Preferences/test9"}



FileNames["*", "tmp/*/Preferences"]


{"tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4"}



Note that * in the second parameter is not matching nested directories. (E.g., we are not getting "tmp/t1/Preferences/Preferences/test7".) The same happens if we try RegularExpression["tmp/.*/Preferences"]. The reason is given in the documentation:




Mathematica syntax is sometimes inconsistent in unpredictable ways to remind users of the imperfection of the human condition.



FileNames["*", "tmp/*/*/Preferences", Infinity]


{"tmp/t1/Preferences/Preferences/test7", "tmp/t1/st1/Preferences/test9"}
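If one wanted to stay with directory patterns, the observations above suggest that each wildcard covers a single path component, so one pattern per nesting depth would be needed. The sketch below builds such a list; dirPatternsUpTo and contentsOfNamedDirs are hypothetical helper names, and the explicit maxDepth is a limitation of this approach, since the depth has to be known in advance.

```mathematica
(* Hypothetical helper: one directory pattern per nesting depth, on the
   assumption (taken from the outputs above) that each "*" in the directory
   argument covers exactly one path component. *)
dirPatternsUpTo[root_String, name_String, maxDepth_Integer?NonNegative] :=
  Table[
    FileNameJoin[Join[{root}, ConstantArray["*", k], {name}]],
    {k, 0, maxDepth}]

(* Immediate contents (files and directories) of every matching directory. *)
contentsOfNamedDirs[root_String, name_String, maxDepth_Integer?NonNegative] :=
  FileNames["*", dirPatternsUpTo[root, name, maxDepth]]
```

For the tmp tree, contentsOfNamedDirs["tmp", "Preferences", 2] would search "tmp/Preferences", "tmp/*/Preferences" and "tmp/*/*/Preferences" in a single call.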



The best way to conduct the search in question, then, is to describe the folder hierarchy in the first argument.


paths = FileNames[RegularExpression["Preferences/[^/]+"], "tmp", Infinity]



{"tmp/Preferences/test6", "tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1/Preferences/test9", "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4"}



Notice how RegularExpression is doing what we would expect when it is passed to the form parameter.


And then we can filter as needed.


Select[Not@*DirectoryQ]@paths


{"tmp/Preferences/test6", "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1/Preferences/test9", "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4"}
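As a possible alternative that avoids writing the "/" separator into a regular expression, the search can be split into two steps: find the directories with the given name, then list their immediate contents. This is only a sketch and not part of the approach above; filesDirectlyIn is a hypothetical helper name, and its name argument is passed to FileNames as a form, so wildcard characters in it would act as wildcards.

```mathematica
(* Hypothetical helper: first find the directories called `name`, then list
   the non-directory entries that sit directly inside them. *)
filesDirectlyIn[name_String, root_String] :=
  Module[{dirs},
    (* Everything named `name` at any depth below root, keeping directories only. *)
    dirs = Select[FileNames[name, root, Infinity], DirectoryQ];
    If[dirs === {},
      {},
      (* Depth 1 is the default, so only immediate children are returned. *)
      Select[FileNames["*", dirs], Not@*DirectoryQ]]]
```

With the tmp tree in the current directory, filesDirectlyIn["Preferences", "tmp"] is intended to give the same files as the filtered list above.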



