I am currently working through the book Real World Haskell, and one of its exercises asks the reader to implement file name matching with support for `**`, which behaves like `*` but also looks in subdirectories all the way down the file system. Below is a fragment of my code with comments (there is a lot of duplication at the moment), followed by additional information about the code. I think the posted code is sufficient for the problem, so I have not listed the whole program here.

    case splitFileName pat of
        ("", baseName) -> do -- just the file name passed
            curDir <- getCurrentDirectory
            if searchSubDirs baseName -- check if file name has `**` in it
                then do
                    contents <- getDirectoryContents curDir
                    subDirs <- filterM doesDirectoryExist contents
                    let properSubDirs = filter (`notElem` [".", ".."]) subDirs
                    subDirsNames <- forM properSubDirs $ \dir -> do
                        namesMatching (curDir </> dir </> baseName) -- call the function recursively on subdirectories
                    curDirNames <- listMatches curDir baseName -- list matches in the current directory
                    return (curDirNames ++ (concat subDirsNames)) -- concatenate results into a single list
                else listMatches curDir baseName
        (dirName, baseName) -> do -- full path passed
            if searchSubDirs baseName
                then do
                    contents <- getDirectoryContents dirName
                    subDirs <- filterM doesDirectoryExist contents
                    let properSubDirs = filter (`notElem` [".", ".."]) subDirs
                    subDirsNames <- forM properSubDirs $ \dir -> do
                        namesMatching (dirName </> dir </> baseName) -- call the function recursively on subdirectories
                    curDirNames <- listMatches dirName baseName -- list matches in the passed directory
                    return (curDirNames ++ (concat subDirsNames)) -- concatenate results into a single list

Additional information:

- `pat` is the pattern I'm looking for (e.g. `*.txt` or `C:\\A\[a-z].*`).
- `splitFileName` is a function which splits a file path into the directory path and the file name. The first element of the tuple will be empty if we specify just a file name in `pat`.
- `searchSubDirs` returns `True` if the file name has `**` in it.
- `listMatches` returns a list of file names that match the pattern in the directory, substituting `**` for `*`.
- `namesMatching` is the name of the function whose excerpt I posted.
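
For readers without the rest of the program, the two pattern helpers described above might look roughly like the sketch below. This is an illustration only: the body of `searchSubDirs` is a guess based on its description, and `collapseStars` is a hypothetical name for the `**`-to-`*` substitution that `listMatches` is said to perform. It does not account for escaped stars or stars inside character classes.

```haskell
import Data.List (isInfixOf)

-- Assumed implementation of the question's searchSubDirs:
-- True when the file-name pattern contains "**".
searchSubDirs :: String -> Bool
searchSubDirs = ("**" `isInfixOf`)

-- Hypothetical helper (not from the question): collapse every run of
-- '*' characters to a single '*', so that "**.txt" can be matched as
-- "*.txt" inside one directory.
collapseStars :: String -> String
collapseStars ('*':'*':rest) = collapseStars ('*' : rest)
collapseStars (c:rest)       = c : collapseStars rest
collapseStars []             = []
```

With these definitions, `searchSubDirs "**.txt"` is `True` and `collapseStars "**.txt"` is `"*.txt"`.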
Why doesn't it work?
When I pass just the file name, the program searches only the current directory and the first level of subdirectories. When I pass a full path, it searches only the specified directory. It looks like the `(dirName, baseName)` case doesn't recurse properly. I've been looking at the code for some time now and I can't figure out where the problem is.
Note
If any more information is needed, please let me know in the comments and I'll add whatever is necessary to the question.
Here's an issue: `getDirectoryContents` only returns the leaf names of the directory's entries, so you have to prepend `dirName` (along with a `/`) to the elements of `contents` before calling `doesDirectoryExist`.
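
A minimal sketch of that fix, assuming the `directory` and `filepath` packages that the question's code already appears to use. `listSubDirs` is a hypothetical helper name, not part of the original program. It joins each leaf name with `dirName` before testing it, so the check no longer depends on the process's current working directory. That dependence would also be consistent with the reported symptom: leaf names resolve correctly only while the directory being listed is the current one, which holds for the first level and not for anything below it.

```haskell
import Control.Monad (filterM)
import System.Directory (doesDirectoryExist, getDirectoryContents)
import System.FilePath ((</>))

-- Hypothetical helper: return the full paths of the immediate
-- subdirectories of dirName, skipping the "." and ".." entries.
listSubDirs :: FilePath -> IO [FilePath]
listSubDirs dirName = do
  contents <- getDirectoryContents dirName
  let entries   = filter (`notElem` [".", ".."]) contents
      fullPaths = map (dirName </>) entries   -- prepend dirName to each leaf name
  filterM doesDirectoryExist fullPaths
```

Because the returned paths already include `dirName`, the recursive call in the question would become `namesMatching (subDir </> baseName)` for each `subDir` in the result. The same helper could serve both `case` branches, which would also remove most of the duplication.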