Avoiding recursion at all costs?
For once a question from me ;)
I'm writing a file search algorithm that searches every physical fixed drive and every network drive including all subdirectories for a filename. Although there is a shell function that brings up the file search box of Explorer, there seems to be no function to do this programmatically, neither in the shell nor in the file I/O functions.
The problem I have is: the function that searches the directory tree could either be recursive (shiver) or use some kind of loop and an array to simulate the recursion.
The tree-nature of directories requires two loops: one that walks into the depth of the tree and another that walks the directory siblings. The sibling-walker is the inner loop. The problem is that I can't use continue or break (or something like that) from within the inner loop to act on the outer loop. As I don't want to have cryptic continue flags the only way that seems left to me is using goto (*shiver again*).
Here's my code (MFC, but a lot of API access).
CList is a templated linked list. CStringList is equivalent to a specialization of CList for CString.
CFileFind is a class that encapsulates the API WIN32_FIND_DATA, FindFirstFile and FindNextFile.
Code:
bool CJBStarterApp::SearchTree(const CString & strInitDir,
const CString & strFilename, CString & strOut)
{
// Uses a freaky goto to avoid recursiveness.
// Makes you wonder...
CStringList lsStack;
CList<CFileFind *, CFileFind *> lpffStack;
CFileFind *pCurrentFF = NULL;
CString strCurDir(strInitDir);
recur:
pCurrentFF = new CFileFind();
::SetCurrentDirectory(strCurDir);
if(pCurrentFF->FindFile(strFilename))
{
pCurrentFF->FindNextFile();
strOut = pCurrentFF->GetFilePath();
// Unroll stack and return
delete pCurrentFF;
while(!lpffStack.IsEmpty())
delete lpffStack.RemoveHead();
return true;
}
pCurrentFF->FindFile();
while(pCurrentFF->FindNextFile())
{
if(pCurrentFF->IsDirectory())
{
lpffStack.AddHead(pCurrentFF);
lsStack.AddHead(strCurDir);
strCurDir = pCurrentFF->GetFilePath();
goto recur;
cont:
delete pCurrentFF;
pCurrentFF = lpffStack.RemoveHead();
strCurDir = lsStack.RemoveHead();
::SetCurrentDirectory(strCurDir);
}
}
if(lpffStack.IsEmpty())
return false;
goto cont;
}
I believe it should work (haven't tried yet), but I'm not sure and that's a bad sign. If I'm not able to follow program flow right after I've written the app then this can't be good.
I welcome any ideas on restructuring this code to make it more readable. I welcome any reference to an API function that does this even more.
Thanks in advance.
Re: Avoiding recursion at all costs?
Quote:
Originally Posted by jim mcnamara
FWIW -
In VMS's lib$findfile - which NT's FindFirstFile is based on - there is a context pointer. Win 9X FindFirstFile has one as well.
If you maintain your own context pointer - a pointer to a doubly linked list - the list tells you where you are and where you have come from, so you avoid recursion, you simply iterate, and maintain the linked list to tell you where you are, where you need to go in either direction (backwards or forwards).
FindClose cleans up the memory associated with FindFirstFile's context heap.
There is a little bit of discusion about this on the NT Native API page here:
http://www.sysinternals.com/ntw2k/info/ntdll.shtml
Hi jim,
using these context pointers (i am pushing them on a stack), can you retrieve their filedata, as to findfirstfile the subfolder? I could of course use the previous pointer and then use findnextfile but that doesn't work out with the first one..
Re: Avoiding recursion at all costs?
Nothing for an April Fool's joke like waking a two-year-old thread ;)
Re: Avoiding recursion at all costs?
I just happened to find this searching for findfirstfile ;)
Re: Avoiding recursion at all costs?
Of course, these last two years I have learned that avoiding recursion at all cost, as I'm doing above, is a BAD thing. Tree iteration is inherently recursive.
Re: Avoiding recursion at all costs?
not so sure about that.. doing recursion will always add some performance overhead. In a way all iteration is recursive, but that is in the abstract sense, not in the sense of recursive function calls. Tree iteration is the same, the stack however can be implemented without using function calls. In the abstract sense both do the same thing, but performance wise it should be better with your own stack.
Re: Avoiding recursion at all costs?
There is a point where managing your own stack takes more time than the CPU managing the system stack. You still require branches, you still require memory. Only, with your own stack it's quite likely that the memory will be complicated heap memory and not the already available system stack, and the branches might be harder for the CPU to predict. The fact that it's a recursive call ensures locality of reference, while the stack handling code just might make the function too big for the cache.
And of course, not managing your own stack makes the code a lot more understandable.
Re: Avoiding recursion at all costs?
hmm there is something in what you are saying, the stack i'm using is using heap memory, but how does the system stack avoid being blown then? It wouldn't be such a big deal to make a stack that doesn't use the heap, if only I could avoid the risk to have it blown. Another thing with recursion is that variables that don't change need not be on the stack, and taking them out makes your code less readable.