
searching - Find file names in directory


I am trying to list every file that sits directly inside any directory with a given name (files in a subdirectory of such a directory do not count). As an example, suppose I want to find all files in each folder called "Preferences", and let's restrict the search to the folder ~/.Mathematica. If I wanted to do this from the terminal, I could just do


find ~/.Mathematica -regex ~/.Mathematica.*Preferences/[^/]*.


This works, and I see there is a single file matching my criterion, ~/.Mathematica/ApplicationData/Parallel/Preferences/Preferences.m


But I want to do it conveniently in Mathematica. I am thinking the FileNames function should do it.


I will first run


SetDirectory["~/.Mathematica"]



Then I would run


fileAndDirectoryNames = 
FileNames["*",RegularExpression[".*Preferences"], 1]

followed by


fileNames = Select[fileAndDirectoryNames, ! DirectoryQ[#] &]

However, this gives incorrect results for me: fileAndDirectoryNames is an empty list. If I instead run


fileAndDirectoryNames = 
FileNames["*", RegularExpression[".*/.*/Preferences"], 1]


and recompute fileNames as before, then I get correct output.


I am confused because it seems to me that the regular expression in my second attempt is stronger (allows fewer matches) than the one in my first attempt. The FileNames function should be monotonic in its second argument: if you weaken the pattern, the new output ought to be a superset of the original output. Yet this doesn't seem to happen. Why is this? I am not sure whether the problem lies with Mathematica or with my understanding of regular expressions.



Answer



All three parameters of FileNames can affect the depth at which Mathematica searches for results. It seems your confusion results from the interaction among these parameters. This is easily understandable, as the documentation for FileNames is not very illustrative. (Indeed, my first attempt at answering this question was faulty for the same reason.)


The first parameter -- the form -- should be thought of as a relative path. It has no intrinsic depth specification, but will be tested at the depths specified by the next two parameters. However, it is possible to control the depth of the search with this parameter by specifying a folder hierarchy in the form you are searching for. (See below.) The form can be a literal string, a string with simple wildcards (*, etc.), a Mathematica-style string pattern, or a regular expression.


The second parameter -- the directories -- specifies the top-level locations in which Mathematica will conduct its search. The first parameter will be tested relative to what is specified here. This can also be a literal or a pattern, same as above.


The third parameter -- the depth -- tells Mathematica whether it should repeat the search for the first parameter in subdirectories of the paths specified in the second parameter. When its value is 1 (the default), Mathematica will only return matches that are immediately relative to a directory specified in the second argument.


Rather than writing a bunch of prose, I think it will be easier to just supply some examples to see how these things can interact.
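For anyone who wants to run the examples that follow, here is a minimal sketch that recreates the tmp tree in a fresh scratch directory. The split into files and directories is partly an assumption: the listings only reveal which entries under the Preferences folders are directories, so t2 is assumed to be an empty directory, and 1B.2010-2011.dataless and test5 are assumed to be plain files. The forward slashes assume a Unix-like system, and the names scratch, dirs and files are just illustrative.

```mathematica
(* Sketch: recreate the example tree in a fresh temporary directory. *)
scratch = CreateDirectory[];   (* new directory in the default temporary area *)
SetDirectory[scratch];

(* Leaf directories; intermediate ones are created along the way. *)
(* Assumption: t2 is an empty directory. *)
dirs = {"tmp/Preferences", "tmp/t1/Preferences/dir1",
   "tmp/t1/Preferences/Preferences", "tmp/t1/st1/Preferences",
   "tmp/t2", "tmp/t3/Preferences"};
Scan[CreateDirectory[#, CreateIntermediateDirectories -> True] &, dirs];

(* Empty placeholder files. *)
(* Assumption: 1B.2010-2011.dataless and test5 are files, not directories. *)
files = {"tmp/1B.2010-2011.dataless", "tmp/Preferences/test6",
   "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1",
   "tmp/t1/Preferences/test2", "tmp/t1/st1/Preferences/test9",
   "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4", "tmp/test5"};
Scan[Close[OpenWrite[#]] &, files];

(* 10 directories plus 9 files, so this should count 19 entries. *)
Length[FileNames["*", "tmp", Infinity]]
```

SetDirectory changes the working directory for the session, so ResetDirectory[] restores the previous one when you are done.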


First, here is the entire directory tree of the folder tmp:



FileNames["*", "tmp", Infinity]


{"tmp/1B.2010-2011.dataless", "tmp/Preferences", "tmp/Preferences/test6", "tmp/t1", "tmp/t1/Preferences", "tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1", "tmp/t1/st1/Preferences", "tmp/t1/st1/Preferences/test9", "tmp/t2", "tmp/t3", "tmp/t3/Preferences", "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4", "tmp/test5"}



So of course we see that Infinity directs Mathematica to walk the whole tree. By contrast, the default value (1) yields:


FileNames["*", "tmp"]


{"tmp/1B.2010-2011.dataless", "tmp/Preferences", "tmp/t1", "tmp/t2", "tmp/t3", "tmp/test5"}




Similarly,


FileNames["*", "tmp", 2]


{"tmp/1B.2010-2011.dataless", "tmp/Preferences", "tmp/Preferences/test6", "tmp/t1", "tmp/t1/Preferences", "tmp/t1/st1", "tmp/t2", "tmp/t3", "tmp/t3/Preferences", "tmp/test5"}



This is all straightforward. Now, consider these examples. Take note of how we are controlling the depth of the search in various ways.


FileNames["t1/*", "tmp"]



{"tmp/t1/Preferences", "tmp/t1/st1"}



FileNames["*", "tmp/t1"]


{"tmp/t1/Preferences", "tmp/t1/st1"}



FileNames["t1/*", "tmp", 2]



{"tmp/t1/Preferences", "tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1", "tmp/t1/st1/Preferences"}



FileNames["t1/*", "tmp", Infinity]


{"tmp/t1/Preferences", "tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1", "tmp/t1/st1/Preferences", "tmp/t1/st1/Preferences/test9"}



FileNames["test*", "tmp/t1", Infinity]



{"tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1/Preferences/test9"}



FileNames["*", "tmp/*/Preferences"]


{"tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4"}



Note that * in the second parameter is not matching nested directories. (E.g., we are not getting "tmp/t1/Preferences/Preferences/test7".) The same happens if we try RegularExpression["tmp/.*/Preferences"]. The reason is given in the documentation:




Mathematica syntax is sometimes inconsistent in unpredictable ways to remind users of the imperfection of the human condition.



FileNames["*", "tmp/*/*/Preferences", Infinity]


{"tmp/t1/Preferences/Preferences/test7", "tmp/t1/st1/Preferences/test9"}



The best way to conduct the search in question, then, is to describe the folder hierarchy in the first argument.


paths = FileNames[RegularExpression["Preferences/[^/]+"], "tmp", Infinity]



{"tmp/Preferences/test6", "tmp/t1/Preferences/dir1", "tmp/t1/Preferences/Preferences", "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1/Preferences/test9", "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4"}



Notice how RegularExpression is doing what we would expect when it is passed to the form parameter.


And then we can filter as needed.


Select[Not@*DirectoryQ]@paths


{"tmp/Preferences/test6", "tmp/t1/Preferences/Preferences/test7", "tmp/t1/Preferences/test1", "tmp/t1/Preferences/test2", "tmp/t1/st1/Preferences/test9", "tmp/t3/Preferences/test3", "tmp/t3/Preferences/test4"}
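The two steps above can be wrapped into a reusable function. The following is only a sketch: filesDirectlyIn and filesDirectlyIn2 are hypothetical names, not built-ins. The first variant reuses the regular-expression form shown above and assumes that dirName contains no regex metacharacters and that paths use forward slashes. The second variant avoids pattern matching in FileNames altogether: it lists everything and keeps the files whose parent directory has the requested name.

```mathematica
(* Hypothetical helpers; neither name is built in. *)

(* Variant 1: the regular-expression form from above, wrapped in a function.
   Assumptions: dirName has no regex metacharacters; "/" is the path separator. *)
filesDirectlyIn[dirName_String, root_String] :=
  Select[
    FileNames[RegularExpression[dirName <> "/[^/]+"], root, Infinity],
    ! DirectoryQ[#] &]

(* Variant 2: list everything under root, then keep the non-directories
   whose second-to-last path element equals dirName. *)
filesDirectlyIn2[dirName_String, root_String] :=
  Select[
    FileNames["*", root, Infinity],
    ! DirectoryQ[#] && Length[FileNameSplit[#]] >= 2 &&
      FileNameSplit[#][[-2]] === dirName &]

filesDirectlyIn["Preferences", "tmp"]
filesDirectlyIn2["Preferences", "tmp"]
```

On the tmp tree, both calls should return the same seven files as the filtered list above. The second variant walks the whole tree first, so it may be slower on large directories, but it does not depend on how FileNames interprets patterns in each argument.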



