If I were you, I'd write an application that connects to the website's index page using SHDOCVW.dll and logs every link it comes across. Find the <a> elements, read their href attributes, filter out any values that don't start with the site's base URL (to avoid external links), and store the rest.
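
To make the filtering step concrete, here is a rough sketch in Python rather than against SHDOCVW.dll, using only the standard library. The names `LinkCollector`, `internal_links` and `site_root` are made up for illustration, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, page_url, site_root):
    """Return the links on a page that stay inside site_root, without duplicates."""
    parser = LinkCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        # Keep only links that begin with the site's own base URL.
        if absolute.startswith(site_root) and absolute not in links:
            links.append(absolute)
    return links
```

Pass `site_root` with a trailing slash (for example `https://example.com/`), otherwise a plain prefix check would also accept a host like `example.com.evil.org`.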

Then work through each link you have logged and do the same until you have a tree. Skip any page you've already logged, otherwise pages that link to each other will send you round in circles. I'm not really sure what the best way to log the data would be; perhaps create a dictionary/collection with the last term of the URL as the key and each item as an array containing the links on that page. If the page has no links, store an empty array or a 'Null'.
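
The crawl loop could look something like the sketch below. It's only an illustration: `crawl` and `get_links` are hypothetical names, and `get_links` stands for whatever function you write to load a page and return its internal links. I've used the full URL as the key here instead of the last term, on the assumption that two pages in different folders might share the same last term.

```python
from collections import deque


def crawl(start_url, get_links):
    """Build {page_url: [links on that page]} starting from start_url.

    get_links is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns the internal links found on that page.
    """
    site = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in site:
            # Already logged this page, so skip it to avoid looping forever.
            continue
        site[url] = list(get_links(url))  # empty list if the page has no links
        for link in site[url]:
            if link not in site:
                queue.append(link)
    return site
```

A page with no links ends up with an empty list, which is a bit easier to work with later than a 'Null'.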

Once you've got a key for every page, close the webpage and construct your site map programmatically using the data you've found.
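
As for building the site map from that data, one possible approach is a simple indented outline. This is a sketch only: `render_site_map` is a made-up name, it assumes a dictionary that maps each page to the list of links on it, and it lists each page once, under the first page that links to it.

```python
def render_site_map(site, root):
    """Return an indented text outline of the site, starting at root."""
    lines = []
    seen = set()

    def visit(url, depth):
        seen.add(url)
        lines.append("  " * depth + url)
        for child in site.get(url, []):
            # Show each page only once, where it is first reached.
            if child not in seen:
                visit(child, depth + 1)

    visit(root, 0)
    return "\n".join(lines)
```

Because it recurses once per level, a very deep site could hit Python's recursion limit, in which case you'd want to swap the recursion for an explicit stack.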