Recursive file iteration in Java

In a Java frameworks talk I gave recently, I showed the following example for finding files recursively…

public void listFilesInDirectory(File dir) {
  File[] files = dir.listFiles();
  if (files != null) {
    for (File f : files) {
      if (f.isDirectory()) {
        listFilesInDirectory(f);
      } else {
        System.out.println(f.getName());
      }
    }
  }
}

… which in real projects often grows into a larger block of code. The web is full of code blocks like this for walking a directory tree. The best version I came across is this object-oriented one by Torsten Curdt. Since you usually don’t want to write this yourself, I suggested in my talk using FileUtils, which makes recursive iteration much easier:

Collection jspFiles = FileUtils.listFiles(new File(rootDirName),
                        new String[] { "jsp" }, true);

This looks concise and useful, but when I tried to use it I wasn’t too pleased with FileUtils’ solution. Here is why:

  • The recursion is processed in one go, i.e. all results are written to a List first, even when using the iterateFiles method. Nothing is processed lazily while you iterate.
  • You cannot influence which directories are searched.
  • Only files are returned; you cannot search for directories.
  • The API is not very expressive (e.g. what does the “true” mean?).
  • No generics (raw collection types are returned).

A Better API

Not being satisfied with the solutions I found, I “dreamed up” my own API for listing and finding files. I don’t consider it complete, but for the most part I am pleased with the ease of use that the builder pattern provides. The code for this can currently be found in an unrelated Google Code project. The rest of this article shows the functions that are currently supported.

Find files two ways

There are generally two ways to use the result, as an iterator or as a list:

1. Iterate over all files in the Windows directory:

for (File f : Files.find("c:\\windows")) {

}

2. Get all the files in a directory as a list of files:

List<File> allFiles = Files.find(somedir).list();

Except for the return type, the second version does the same as the JDK method listFiles:

File[] allFiles = new File(somedir).listFiles();

Easy recursive listing

To iterate over all the files in the C:\Windows directory and its subdirectories, you would use:

for (File f : Files.find("c:\\windows").recursive()) {

}

Note: This actually works iteratively, i.e. the recursion happens as you fetch files from the iterator. The result is not fetched into a huge list.
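To illustrate what "iteratively" means here, the following is a minimal sketch of a lazy walker built only on the JDK. It is not the code behind Files.find(); the class name LazyFileWalker and its structure are assumptions made for this example. Directories are expanded only when the iterator needs the next file, so at no point is the complete result held in memory.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical sketch: walks a directory tree lazily and yields regular files only. */
class LazyFileWalker implements Iterable<File> {
    private final File root;

    LazyFileWalker(File root) {
        this.root = root;
    }

    @Override
    public Iterator<File> iterator() {
        return new Iterator<File>() {
            // Entries that have been discovered but not yet visited.
            private final Deque<File> pending = new ArrayDeque<File>();
            private File next;

            {
                pending.push(root);
                advance();
            }

            // Expands directories one at a time until the next regular file turns up.
            private void advance() {
                next = null;
                while (next == null && !pending.isEmpty()) {
                    File candidate = pending.pop();
                    if (candidate.isDirectory()) {
                        File[] children = candidate.listFiles();
                        if (children != null) { // null if the directory cannot be read
                            for (File child : children) {
                                pending.push(child);
                            }
                        }
                    } else if (candidate.isFile()) {
                        next = candidate;
                    }
                }
            }

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public File next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                File result = next;
                advance();
                return result;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        File start = new File(args.length > 0 ? args[0] : ".");
        for (File f : new LazyFileWalker(start)) {
            System.out.println(f.getPath());
        }
    }
}
```

The sketch does not guard against symbolic link cycles, and the order in which files are returned depends on what listFiles() delivers.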

With a Predicate you can limit the recursion to specific directories. In this example all .svn directories within a source tree are skipped:

// Accept every directory except Subversion metadata directories.
Predicate<File> noSvnDirs = new Predicate<File>() {
  public boolean apply(File file) {
    return !file.getName().equals(".svn");
  }
};
for (File f : Files.find("src/java/").recursive(noSvnDirs)) {

}
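The point at which such a predicate takes effect can be shown with a small JDK-only sketch. It is an illustration rather than the implementation behind recursive(): the class PrunedListing is hypothetical, it uses java.util.function.Predicate (method test) instead of the Predicate type with apply() shown above, and for brevity it collects into a list instead of iterating lazily.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: recursive listing that only descends into accepted directories. */
class PrunedListing {

    /** Returns all regular files below dir, skipping subtrees rejected by descendInto. */
    static List<File> listFiles(File dir, Predicate<File> descendInto) {
        List<File> result = new ArrayList<>();
        collect(dir, descendInto, result);
        return result;
    }

    private static void collect(File dir, Predicate<File> descendInto, List<File> result) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            if (child.isDirectory()) {
                // The predicate is asked once per directory; a rejected
                // directory is never listed, so its whole subtree is skipped.
                if (descendInto.test(child)) {
                    collect(child, descendInto, result);
                }
            } else {
                result.add(child);
            }
        }
    }

    public static void main(String[] args) {
        Predicate<File> noSvnDirs = d -> !d.getName().equals(".svn");
        for (File f : listFiles(new File("src/java/"), noSvnDirs)) {
            System.out.println(f.getPath());
        }
    }
}
```

Because the check happens before a directory is listed, skipping .svn saves the I/O for the entire subtree, which a filter applied to the finished result could not do.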

Want Files, Directories or both?

Define whether you want only files, only directories or both in your result with the yield*() methods.

Files.find(someBaseDir).recursive().yieldFiles()  // this is the default
Files.find(someBaseDir).recursive().yieldDirectories()
Files.find(someBaseDir).recursive().yieldFilesAndDirectories()

Filtering the results

To get all text files within a directory, use:

Files.find(dir).withExtension("txt").list();
Files.find(dir).ignoreCase().withExtension("txt").list();

You can also filter by name, e.g. to find README files:

Files.find(dir).withName("README").list();
Files.find(dir).ignoreCase().withName("readme").list();

Note that the default matching is case sensitive. The commands caseSensitive() and ignoreCase() can be used to toggle the matching behaviour.

For special needs you can also specify a Predicate<File> to filter the resulting files.

Files.find(dir).recursive().withFilter(somePredicate).list();
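The difference between case-sensitive and case-insensitive matching can be sketched with plain JDK predicates. The class NameFilters and its method signatures are assumptions for this example (the boolean parameter stands in for the ignoreCase() toggle); they are not the API shown above.

```java
import java.io.File;
import java.util.Locale;
import java.util.function.Predicate;

/** Hypothetical sketch of extension and name filters with optional case folding. */
class NameFilters {

    /** Matches files whose name ends with "." + extension. */
    static Predicate<File> withExtension(String extension, boolean ignoreCase) {
        String suffix = "." + extension;
        return file -> {
            String name = file.getName();
            if (ignoreCase) {
                // Locale.ROOT avoids locale-specific surprises such as the Turkish dotless i.
                return name.toLowerCase(Locale.ROOT).endsWith(suffix.toLowerCase(Locale.ROOT));
            }
            return name.endsWith(suffix);
        };
    }

    /** Matches files whose complete name equals the expected name. */
    static Predicate<File> withName(String expected, boolean ignoreCase) {
        return file -> ignoreCase
                ? file.getName().equalsIgnoreCase(expected)
                : file.getName().equals(expected);
    }

    public static void main(String[] args) {
        File notes = new File("notes.TXT");
        System.out.println(withExtension("txt", false).test(notes)); // prints "false"
        System.out.println(withExtension("txt", true).test(notes));  // prints "true"
        System.out.println(withName("readme", true).test(new File("README"))); // prints "true"
    }
}
```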

Finding Directories

When looking for directories there are some special use cases that are supported, e.g. looking for directories that contain a specific file:

Files.find(dir).recursive().yieldDirectories()
               .containingFile("Thumbs.db");
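A check like this boils down to one lookup per directory. The following sketch shows one way it could be expressed with the JDK alone; the class DirectoryFilters is hypothetical and only mirrors the name of the containingFile() method above.

```java
import java.io.File;
import java.util.function.Predicate;

/** Hypothetical sketch: accepts directories that directly contain a file with the given name. */
class DirectoryFilters {

    static Predicate<File> containingFile(String fileName) {
        // Only direct children count; files in nested subdirectories do not.
        return dir -> dir.isDirectory() && new File(dir, fileName).isFile();
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(containingFile("Thumbs.db").test(dir));
    }
}
```

Whether "thumbs.db" also matches depends on the file system, since the lookup is delegated to the operating system.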

Android exploration continued

I have now extended my Android application with a preferences screen, which is quite easy to do.

  • A tutorial for creating a Preferences Activity got me started. Unfortunately, the XML preferences definition it uses is incorrect: the tags are names of classes, which have to be capitalized.
  • To open the preferences activity I added an options menu.
  • I wanted to add some kind of progress indicator. I ended up using the ProgressDialog and the AsyncTask to run the downloading and XML parsing in the background. To fix issues with device rotation I might have a look at the BetterAsyncTask in the Droid-FU library later.
  • I also ran into DateFormat and Date issues with the UTC-formatted date in the XML file. I thought about using joda-time at least twice but then stuck with the JDK implementation to keep the app small. The fact that Android brings its own class named DateFormat, which just provides a localized JDK DateFormat object, doesn’t help either.
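For the date issue, a JDK-only approach might look like the sketch below. The timestamp pattern is an assumption, since the actual format of the XML file is not given here; the class name UtcDateParsing is hypothetical as well. The important detail is setting the time zone on the formatter, because otherwise the text is interpreted in the device's default zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical sketch: parses UTC timestamps with the JDK's java.text classes. */
class UtcDateParsing {

    // Assumed pattern, e.g. 2010-05-01T12:30:00Z; adjust it to the real XML content.
    private static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    static Date parseUtc(String text) throws ParseException {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false);
        return format.parse(text);
    }

    public static void main(String[] args) throws ParseException {
        Date date = parseUtc("1970-01-01T00:00:10Z");
        System.out.println(date.getTime()); // prints "10000"
    }
}
```

On Android the imports matter: java.text.SimpleDateFormat is the JDK class used here, not the class named DateFormat from the Android packages.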

First Steps with Android

In the last few days I started out with some Android development. Here are some things I have learned so far while developing my first app:

  • The Android Tutorials are a great starting point, though I only followed through with the HelloWorld Tutorial. In retrospect I should have looked at the Notepad Tutorial a little more closely because it explains important concepts (namely activities/intents).
  • http://www.anddev.org/ is a useful source for tutorials and code snippets.
  • It’s still Java but a completely different API, so you often have to look for classes and methods via code completion or in examples to get things done.
  • Downloading: Can be done with the included HTTP Client library. Unfortunately, Android still uses an old version of the HTTP Client, which made it hard to find documentation (e.g. how to set authentication credentials). Additionally, you shouldn’t forget to declare the INTERNET permission in your application manifest.
  • Storing and retrieving files looks fairly easy at first sight (the getDir() and getCacheDir() methods are there), but you have to understand the Android filesystem security model if you don’t want to spend hours debugging. The aforementioned methods use internal storage, where each application stores its data independently. Public read/write (e.g. file exchange with other applications) is only possible when you store your content with the specific method openFileOutput(). The external SD card, on the other hand, can be accessed openly with the regular Java File API.
  • XML Parsing: I started out with the SAX parser, but since my XML file was pretty complex I ditched it and downloaded dom4j, which has a really easy-to-use API. Unfortunately it adds at least 200KB to the final app size. I have now realized I could have gone with the regular DOM parser, which has a decent API. I’ll have to reevaluate this later; maybe end-user responsiveness does require the faster streaming parser approach (SAX).
  • UI design: The declarative XML-based approach looks powerful and well thought out, but I mostly stuck to the tutorial layout for now. This is an area I still have to get into.
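As a rough idea of what the regular DOM parser mentioned in the XML parsing item looks like in use, here is a small sketch with the javax.xml.parsers and org.w3c.dom classes. The element name "item", the attribute "title" and the class DomTitles are made up for the example and have nothing to do with the XML file of the app.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical sketch: reads attribute values with the standard DOM parser. */
class DomTitles {

    /** Returns the "title" attribute of every <item> element, in document order. */
    static List<String> titles(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The whole document is parsed into memory before any element is read.
        Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList items = doc.getElementsByTagName("item");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            result.add(item.getAttribute("title"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed><item title=\"First\"/><item title=\"Second\"/></feed>";
        System.out.println(titles(xml)); // prints "[First, Second]"
    }
}
```

The trade-off against SAX is visible in the comment: DOM builds the complete tree first, which is convenient for complex documents but costs memory and time on large inputs.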

Ok, that’s it for now. Android is turning out to be a great platform. Exciting times.

Adding Google’s GData Java API to your Maven repository

The Google GData APIs do not come with Maven POM files. Someone went to the trouble of “mavenizing” the source, but that build is limited to Linux as the build platform and is currently out of date (compile errors). So I installed the JARs from the binary distribution of the APIs into my local repository; the resulting artifacts are of course missing the dependencies between the individual JAR files. Here are two batch files which I used to install the JARs quite painlessly:

install.bat:

@REM Usage: install.bat artifactId version jarFile
@SET mvn=d:\java\maven\bin\mvn
@%mvn% install:install-file -DgroupId=com.google.gdata ^
       -DartifactId=%1 -Dversion=%2 -Dfile=%3 -Dpackaging=jar ^
       -DgeneratePom=true

installall.bat:

call install.bat gdata-analytics 2.0 gdata-analytics-2.0.jar
call install.bat gdata-appsforyourdomain 1.0 gdata-appsforyourdomain-1.0.jar
call install.bat gdata-base 1.0 gdata-base-1.0.jar
call install.bat gdata-blogger 2.0 gdata-blogger-2.0.jar
call install.bat gdata-books 1.0 gdata-books-1.0.jar
call install.bat gdata-calendar 2.0 gdata-calendar-2.0.jar
call install.bat gdata-client 1.0 gdata-client-1.0.jar
call install.bat gdata-codesearch 2.0 gdata-codesearch-2.0.jar
call install.bat gdata-contacts 3.0 gdata-contacts-3.0.jar
call install.bat gdata-core 1.0 gdata-core-1.0.jar
call install.bat gdata-docs 2.0 gdata-docs-2.0.jar
call install.bat gdata-finance 2.0 gdata-finance-2.0.jar
call install.bat gdata-health 2.0 gdata-health-2.0.jar
call install.bat gdata-maps 2.0 gdata-maps-2.0.jar
call install.bat gdata-media 1.0 gdata-media-1.0.jar
call install.bat gdata-photos 2.0 gdata-photos-2.0.jar
call install.bat gdata-spreadsheet 3.0 gdata-spreadsheet-3.0.jar
call install.bat gdata-webmastertools 2.0 gdata-webmastertools-2.0.jar
call install.bat gdata-youtube 2.0 gdata-youtube-2.0.jar

You surely could get fancy and automate the splitting between artifact-name and version number, but hey, I needed those JARs installed quickly and that’s what it does.
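For the record, the "fancy" variant is not much work either. The sketch below splits a JAR file name into artifact name and version and prints the matching call line; the class JarNameSplitter is hypothetical, and it assumes that every file name ends in "-<dotted version number>.jar", as the names in installall.bat do.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: derives artifact name and version from a JAR file name. */
class JarNameSplitter {

    // Group 1: artifact name, group 2: dotted numeric version before ".jar".
    private static final Pattern JAR_NAME = Pattern.compile("(.+)-(\\d+(?:\\.\\d+)*)\\.jar");

    /** Returns {artifactName, version}; rejects names that do not follow the assumed scheme. */
    static String[] split(String jarFileName) {
        Matcher matcher = JAR_NAME.matcher(jarFileName);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Unexpected JAR name: " + jarFileName);
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String jar = "gdata-analytics-2.0.jar";
        String[] parts = split(jar);
        // prints "call install.bat gdata-analytics 2.0 gdata-analytics-2.0.jar"
        System.out.println("call install.bat " + parts[0] + " " + parts[1] + " " + jar);
    }
}
```

Feeding it the file names from the binary distribution would generate installall.bat instead of typing it by hand.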