ListFiles performance vs file system visitor

Paulo Levi i30817 at gmail.com
Wed Feb 18 11:15:46 PST 2009


I managed a massive improvement by refactoring the code somewhat. On the
older machine with USB 1.1 and a USB drive, it cut the Java time from
41 to 25, which is fairly awesome, and on another, more recent dual-core
machine I can no longer tell the difference.

Testing with 2,306 files and 417 folders, I just got a massive win by
removing the null check and just using list files:

   import java.io.File;
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Comparator;
   import java.util.List;

   // Strings is not a JDK class; its source is not shown here.
   public static void getFiles(int levels, File[] sum,
           List<File> files, List<File> directories) {
       Comparator<String> orderFiles = Strings.getNaturalComparator();
       getFiles(levels, sum, files, directories, orderFiles);
   }

   private static void getFiles(int levels, File[] sum,
           List<File> files, List<File> directories,
           Comparator<String> comp) {
       // Directories added by this call start at this index.
       int dirIndex = directories.size();
       List<String[]> subFilesList = new ArrayList<String[]>(50);

       for (File f : sum) {
           // list() returns null when f is not a directory (or on an
           // I/O error), so it doubles as the file/directory test.
           String[] subFiles = f.list();
           if (subFiles == null) {
               files.add(f);
           } else {
               directories.add(f);
               Arrays.sort(subFiles, comp);
               subFilesList.add(subFiles);
           }
       }

       if (levels > 0) {
           // Recurse into each directory found above, reusing its
           // saved listing instead of listing it a second time.
           for (int dirLen = directories.size(), subCounter = 0;
                   dirIndex < dirLen; dirIndex++, subCounter++) {
               File current = directories.get(dirIndex);
               String[] childs = subFilesList.get(subCounter);
               File[] children = new File[childs.length];
               createFiles(current, childs, children);
               getFiles(levels - 1, children, files, directories, comp);
           }
       }
   }

   // Turns the child names of parent into File objects, in place.
   private static void createFiles(File parent, String[] childStrings,
           File[] childsOut) {
       for (int i = 0; i < childStrings.length; i++) {
           childsOut[i] = new File(parent, childStrings[i]);
       }
   }

This is my junit test:

   @Test
   public void testGetFiles() {
       List<File> files = new ArrayList<File>();
       List<File> directories = new ArrayList<File>();
       File[] arr = {new File("e:\\LargeDir")};
       long time = System.currentTimeMillis();
       IoUtils.getFiles(5, arr, files, directories);
       System.out.println("Time indexing : "
               + (System.currentTimeMillis() - time));
   }

Output : Time indexing : 2859
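
A single System.currentTimeMillis() measurement like this one can be
skewed by the operating system caching directory entries, so repeating
the run may give a clearer picture. A minimal sketch of that idea
follows; the class name ListTiming, the method timeRuns and the default
path are hypothetical and not part of the code above.

```java
import java.io.File;

class ListTiming {
    /** Runs task the given number of times; returns each run's elapsed milliseconds. */
    public static long[] timeRuns(int runs, Runnable task) {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Hypothetical default: scan the current directory unless a path is given.
        File root = new File(args.length > 0 ? args[0] : ".");
        long[] times = timeRuns(3, () -> {
            String[] names = root.list(); // null if root is not a directory
            if (names == null) {
                throw new IllegalArgumentException(root + " is not a readable directory");
            }
        });
        for (int i = 0; i < times.length; i++) {
            System.out.println("Run " + (i + 1) + " : " + times[i] + " ms");
        }
    }
}
```

If the first run is much slower than the later ones, the difference is
probably cache warm-up rather than the listing code itself.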

I don't know how to record the Windows time, but it's about the same,
2-4 s. I guess the bottleneck is in USB 1.1... It's also strange that
this listFiles approach is faster. I would have thought it does more
work, since it is executed for both files and directories and saves the
results in a list (to break the time complexity up), versus just calling
isDirectory on all files and calling listFiles only on the directories.
OK, maybe not so strange now that I write it out.
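
For comparison, the isDirectory-based alternative mentioned here might
look roughly like the sketch below. It is only an illustration under
assumptions: the class name IsDirectoryWalk and the method name collect
are hypothetical, it does not sort the names, and it recurses
depth-first, so the order of the results differs from the code above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class IsDirectoryWalk {
    /**
     * Classifies each entry with isDirectory() and calls listFiles()
     * only on directories, descending at most `levels` levels.
     */
    public static void collect(int levels, File[] entries,
            List<File> files, List<File> directories) {
        for (File f : entries) {
            if (f.isDirectory()) {
                directories.add(f);
                if (levels > 0) {
                    File[] children = f.listFiles();
                    // listFiles() can still return null on an I/O error.
                    if (children != null) {
                        collect(levels - 1, children, files, directories);
                    }
                }
            } else {
                files.add(f);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical default: scan the current directory unless a path is given.
        File root = new File(args.length > 0 ? args[0] : ".");
        List<File> files = new ArrayList<File>();
        List<File> directories = new ArrayList<File>();
        collect(5, new File[] {root}, files, directories);
        System.out.println(files.size() + " files, "
                + directories.size() + " directories");
    }
}
```

This version makes two calls per directory (isDirectory and listFiles)
and one call per file, whereas the list-only version makes a single
list call per entry of either kind.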



More information about the nio-discuss mailing list