I have a list of around 300 elements. I want to sample subsets of length 25 such that my samples are all distinct. My first inclination was to use something like RandomSample[Subsets[list, {25}], 1000], but the number of subsets of length 25 of a 300-element set is way too big for the computer to deal with. Does anyone have a nice way to do this?
Answer
This question may be a duplicate but for the time being:
list = Range[300];
The number of subsets of length 25:
n = Binomial[300, 25]
1953265141442868389822364184842211512
Five samples:
samp = RandomInteger[{1, n}, 5]
{1097179597483122074395819626389736050,
1278400886908268917844987164926797363,
1855898035549513136165016617586671669,
1005956584417012779260052361741534263,
1845054078551378518016127833496347335}
Your subsets:
Subsets[list, {25}, {#}][[1]] & /@ samp
{{10,15,57,64,65,73,82,115,120,130,133,160,161,164,178,192,196,218,223,235,238,240,267,271,290},
{12,54,58,81,90,91,115,130,146,181,189,204,205,218,222,230,233,234,235,254,256,268,281,283,284},
{33,42,45,65,78,81,85,118,151,167,172,174,202,203,207,208,211,212,223,239,246,251,254,262,267},
{9,12,35,69,72,77,79,109,113,116,141,144,158,163,195,202,221,228,230,231,254,259,267,280,292},
{32,39,49,53,62,102,104,132,135,159,164,167,169,172,191,211,244,245,253,263,265,271,282,283,286}}
Be aware that RandomInteger could produce duplicate samples; for the example given, however, this is extremely unlikely. You can generate extra samples and use DeleteDuplicates and Take as needed.
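A minimal sketch of that idea, assuming you want exactly k distinct ranks. The helper name distinctRanks is hypothetical (it is not a built-in); rather than oversampling by a fixed amount and applying Take, it redraws only the shortfall until k distinct values remain, which avoids guessing how many extras to generate:

```mathematica
(* distinctRanks is a hypothetical helper, not a built-in function. *)
(* It returns k distinct integers in 1..n and assumes 1 <= k <= n.  *)
distinctRanks[n_Integer, k_Integer] := Module[{r = {}},
  While[Length[r] < k,
   (* draw only as many new values as are still missing, then drop repeats *)
   r = DeleteDuplicates[Join[r, RandomInteger[{1, n}, k - Length[r]]]]];
  r]

n = Binomial[300, 25];
samp = distinctRanks[n, 1000];
```

With n this large the loop will almost always finish on the first pass. The ranks in samp can then be turned into subsets with the Subsets call shown above.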
I think kguler's answer is the better method, and I wish I had had the insight to realize it myself. However, there is still some value in the method above: referring to subsets by a single number can make them easier to handle.
- They take less space.
- Comparison (e.g. for removing duplicates) requires a single numeric comparison rather than a list comparison.
- A given subset is independent of the input list; only the lengths of the input and the subset matter.
One can "unrank" them at any time using the third parameter of Subsets as shown above.
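As a small illustration of unranking, and of the point that a rank does not depend on the contents of the input list, here is a sketch on a 5-element list. The wrapper name unrank is hypothetical; it only repackages the Subsets call used earlier:

```mathematica
(* unrank is a hypothetical wrapper around the built-in Subsets.          *)
(* Subsets[list, {m}, {rank}] returns a list holding the rank-th subset   *)
(* of length m, so First extracts the subset itself.                      *)
unrank[list_List, m_Integer, rank_Integer] := First[Subsets[list, {m}, {rank}]]

unrank[Range[5], 2, 3]          (* {1, 4} *)
unrank[{a, b, c, d, e}, 2, 3]   (* {a, d} *)
```

The same rank, 3, picks out the same positions (first and fourth) in both lists, so a stored rank can be applied later to any list of the same length.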