Select x random files in subdirectories


Problem description

I need to randomly pick exactly 10 files (images) from a dataset, but the dataset is organized hierarchically.

So for each subdirectory that contains images, I need to keep just 10 of them, chosen at random. Is there an easy way to do that, or should I do it manually?

    import os
    import random

    def getListOfFiles(dirName, count=10):
        # List the entries (files and subdirectories) in the given directory
        listOfFile = os.listdir(dirName)
        allFiles = list()
        filesHere = list()
        # Iterate over all the entries
        for entry in listOfFile:
            # Create full path
            fullPath = os.path.join(dirName, entry)
            # If the entry is a directory, recurse into it
            if os.path.isdir(fullPath):
                allFiles = allFiles + getListOfFiles(fullPath, count)
            else:
                filesHere.append(fullPath)
        # Sample whole paths, not the characters of a single path string,
        # and keep at most `count` files from this directory
        allFiles += random.sample(filesHere, min(count, len(filesHere)))
        return allFiles

    dirName = 'C:/Users/bla/bla'

    # Get the sampled files from the directory tree at the given path
    listOfFiles = getListOfFiles(dirName)

    # mode='x' raises FileExistsError if elements.txt already exists
    with open("elements.txt", mode='x') as f:
        for elem in listOfFiles:
            f.write(elem + '\n')
                  

Recommended answer

A good way to sample from a directory listing of unknown size is reservoir sampling. With this approach you don't have to list all the files in the directory up front: read the entries one by one and sample as you go. It also works when you have to sample a fixed number of files across multiple directories.
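To make the idea concrete, here is a minimal sketch of reservoir sampling over any iterable, using only the standard library. The name `reservoir_sample` and its `rng` parameter are illustrative assumptions, not part of the original answer.

```python
import random

def reservoir_sample(items, n, rng=random):
    """Return up to n items chosen uniformly at random from an iterable of unknown length."""
    reservoir = []
    for seen, item in enumerate(items):
        if seen < n:
            # Fill the reservoir with the first n items.
            reservoir.append(item)
        else:
            # Keep the new item with probability n / (seen + 1).
            slot = rng.randint(0, seen)  # inclusive on both ends
            if slot < n:
                reservoir[slot] = item
    return reservoir

# The input can be a generator: its length is never needed.
sample = reservoir_sample((f"img_{i}.png" for i in range(1000)), 10)
print(len(sample))  # 10
```

If the stream holds fewer than `n` items, the function simply returns all of them.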

It is best to use generator-based directory scanning code that yields one file at a time, so you don't use gobs of memory up front to hold all the file names.
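A generator of that kind might look like the sketch below. It assumes a recursive walk built on `os.scandir`; the name `iter_files` is hypothetical, and symbolic links to directories are deliberately not followed to avoid cycles.

```python
import os

def iter_files(root):
    """Yield the path of every regular file under root, one at a time."""
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Recurse lazily; a subdirectory is only scanned when the
                # consumer asks for the next path.
                yield from iter_files(entry.path)
            elif entry.is_file():
                yield entry.path
```

Because it is a generator, its output can be fed directly into a streaming sampler without first building a list of every file name.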

Something along these lines (NB: untested code!)

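    import numpy as np
    import os

    def ResSampleFiles(dirname, N):
        """Pick up to N files from a directory using reservoir sampling."""

        sampled_files = list()
        k = 0  # number of files seen so far
        for item in os.scandir(dirname):
            if item.is_dir():
                continue
            full_path = os.path.join(dirname, item.name)
            if k < N:
                # Fill the reservoir with the first N files
                sampled_files.append(full_path)
            else:
                # Replace a random slot with probability N / (k + 1)
                idx = np.random.randint(0, k + 1)  # upper bound is exclusive
                if idx < N:
                    sampled_files[idx] = full_path
            k += 1

        return sampled_files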
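
The function above handles a single directory, while the question asks for 10 files from every subdirectory. One way to cover the whole tree is sketched below. Note that it swaps in `os.walk` and `random.sample` instead of reservoir sampling, so it holds one directory's listing in memory at a time; the name `sample_per_directory` is an assumption made for illustration.

```python
import os
import random

def sample_per_directory(root, n, rng=random):
    """Map each directory under root that contains files to up to n random file paths."""
    picked = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if not filenames:
            continue  # skip directories that only contain subdirectories
        # random.sample raises ValueError if asked for more items than exist.
        chosen = rng.sample(filenames, min(n, len(filenames)))
        picked[dirpath] = [os.path.join(dirpath, name) for name in chosen]
    return picked
```

Calling `sample_per_directory(dirName, 10)` would return a dictionary keyed by directory path. For directories too large to list comfortably, the per-directory `random.sample` call could be replaced with a reservoir sampler.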
                  
