I'm using regular expressions to format C++ code to HTML with syntax highlighting. I've run into several problems right off the bat. Have a look at the functions I have so far and see if you can help me figure this out.

I'll start with the simplest function, which wraps string and character literals in a dark red span:
Code:
public string FormatStrings(string strSource)
{
	Regex r = new Regex("(\".*?\")+?|('.*?')+?|(\"|').*");
	MatchEvaluator eval = new MatchEvaluator(ReplaceRed);
	strSource = r.Replace(strSource, eval);

	return strSource;
}

public string ReplaceRed(Match m)
{
	return "<span style=\"color: #800000\">" + m + "</span>";
}
This looks for:

1) a " followed by any characters except \n (.), as few as possible (*?), up to the next " on the same line. That whole group is then repeated one or more times, but as few as possible (+?).

2) the same scenario, but with single quotes

3) an unmatched " or ' followed by everything up to the end of the line
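
To make the three cases easier to check, here is a small sketch that runs the same pattern over a sample line and collects what each match captures. The class name StringPatternDemo and the method FindStringTokens are made up for this illustration; only the pattern itself comes from FormatStrings above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StringPatternDemo
{
    // Same pattern as in FormatStrings, written as a verbatim string
    // (inside @"..." a doubled "" stands for one double quote).
    private static readonly Regex StringPattern =
        new Regex(@"("".*?"")+?|('.*?')+?|(""|').*");

    // Hypothetical helper: returns every piece of text that
    // FormatStrings would wrap in a red span.
    public static List<string> FindStringTokens(string source)
    {
        var tokens = new List<string>();
        foreach (Match m in StringPattern.Matches(source))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }
}
```

For the input char c = 'x'; s = "hi"; t = "open this returns 'x' (case 2), "hi" (case 1) and "open (case 3). Note that the pattern does not know about escaped quotes, so a literal such as "a \" b" is split at the escaped quote.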




Next, functions to format comments in dark green:
Code:
public string FormatComments(string strSource)
{

	Regex r = new Regex("(/\\*(.|\n)*?\\*/)+?|//.*|/\\*(.|\n)*");
	MatchEvaluator eval = new MatchEvaluator(ReplaceGreen);
	strSource = r.Replace(strSource, eval);

	return strSource;
}

public string ReplaceGreen(Match m)
{
	return "<span style=\"color: #008000\">" + m + "</span>";
}
This searches for:

1) a /* followed by any characters, including newlines, as few as possible (*?), up to the next */. That whole group is then repeated one or more times, but as few as possible (+?).

2) a // followed by everything up to the end of the line

3) an unterminated /* followed by everything up to the end of the input
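
The same kind of check for the comment pattern, again only as a sketch. CommentPatternDemo and FindCommentTokens are hypothetical names; the pattern is the one from FormatComments, rewritten as a verbatim string so that \n is now a regex escape rather than a C# escape (it matches the same character).

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CommentPatternDemo
{
    // Same pattern as in FormatComments, written as a verbatim string.
    private static readonly Regex CommentPattern =
        new Regex(@"(/\*(.|\n)*?\*/)+?|//.*|/\*(.|\n)*");

    // Hypothetical helper: returns every piece of text that
    // FormatComments would wrap in a green span.
    public static List<string> FindCommentTokens(string source)
    {
        var tokens = new List<string>();
        foreach (Match m in CommentPattern.Matches(source))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }
}
```

Given three lines containing a closed block comment that spans a newline, a // comment, and an unterminated /*, this returns one token for each of the three cases.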



The first problem is with strings nested inside comments: after I call both replace functions, the inner string tags override the comment coloring.

For example, this is some text before formatting:
Code:
// this is a "comment"
and after running FormatStrings and FormatComments:
Code:
<span style="color: #008000">// this is a <span style="color: #800000">"comment"</span>
</span>
So as I see it, I have at least two options:

1) Don't match strings that are inside comments, but this looks very difficult since it would need a complex lookahead/lookbehind

2) Run the strings first, then, when running comments, remove any nested string tags found inside the comment
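
A rough sketch of what I mean by option 2, assuming the strings have already been wrapped by ReplaceRed. The names CommentCleanup and ReplaceGreenStripNested are made up for this example. It only handles red spans that lie completely inside the comment match; a comment that ends in the middle of a string span would still leave a stray tag behind.

```csharp
using System.Text.RegularExpressions;

public static class CommentCleanup
{
    // Matches a red span exactly as ReplaceRed writes it;
    // group 1 is the original text inside the span.
    private static readonly Regex RedSpan = new Regex(
        "<span style=\"color: #800000\">(.*?)</span>",
        RegexOptions.Singleline);

    // Hypothetical variant of ReplaceGreen: unwrap any nested
    // string spans first, then wrap the whole comment in green.
    public static string ReplaceGreenStripNested(Match m)
    {
        string text = RedSpan.Replace(m.Value, "$1");
        return "<span style=\"color: #008000\">" + text + "</span>";
    }
}
```

It would be passed to the comment regex in place of ReplaceGreen, i.e. new MatchEvaluator(CommentCleanup.ReplaceGreenStripNested).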

Is there a better way to do this?
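
One possibility I can think of is to match comments and strings in a single pass, so that whichever token starts first in the text consumes everything inside it. Below is a minimal sketch under that assumption, not a tested solution. SinglePassHighlighter, Highlight and Colorize are hypothetical names, and the sub-patterns are the ones from my two functions with the redundant +? groups dropped.

```csharp
using System.Text.RegularExpressions;

public static class SinglePassHighlighter
{
    // One alternation with two named groups. The regex engine scans left
    // to right, so a quote inside a comment is consumed by the comment
    // match and never gets the chance to start a string match.
    private static readonly Regex Token = new Regex(
        @"(?<comment>/\*(?:.|\n)*?\*/|//.*|/\*(?:.|\n)*)" +
        @"|(?<str>"".*?""|'.*?'|[""'].*)");

    public static string Highlight(string source)
    {
        return Token.Replace(source, new MatchEvaluator(Colorize));
    }

    // Picks the color from whichever named group took part in the match.
    private static string Colorize(Match m)
    {
        string color = m.Groups["comment"].Success ? "#008000" : "#800000";
        return "<span style=\"color: " + color + "\">" + m.Value + "</span>";
    }
}
```

With this, // this is a "comment" comes out as a single green span, and a string that contains // stays a single red span.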