Best way to extract sections from sql string
What would be the best way to split a sql string from one column into many?
I have a column which contains html, within this there are many <h3> tags containing headings.
i.e.
row 1
<h3>overview</h3>gjbvjjkvbjkvv<h3>Details</h3>hcvhjchjcj<h3>cost</h3>cccc
I need to split the string based on the h3 tags into the correct columns of a new table.
In addition i need to support incorrectly titled tags such as the overview may sometimes be referred to as summary.
row 2
<h3>summary</h3>ghghcvcvv<h3>Details</h3>hcvhjchjcj<h3>cost</h3>cccc
I need to do this for a result set of 3k rows.
I have SSIS and sql server 2010 along with VS2008 and 2010 at my disposal.
Re: Best way to extract sections from sql string
Currently I have created a SSIS package which gets the result set, I have then added a for loop with a script task to look at each row as below.
The script runs but doesnt appear to be finding the overview text?
Code:
public void Main()
{
// TODO: Add your code here
//MessageBox.Show(Dts.Variables["objccod"].Value.ToString());
String coursecode = Dts.Variables["objccod"].Value.ToString();
String coursedesc = Dts.Variables["objcdesc"].Value.ToString();
String Overview = "";
String Prereq;
String Objectives;
String Outline;
String Special;
Boolean results = false;
//HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
//HtmlNode.ElementsFlags.Remove("option");
//doc.LoadHtml(coursedesc);
//foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//h3//overview"))
//{
// //Console.WriteLine("Value=" + node.Attributes["value"].Value);
// Overview = node.InnerText;
// MessageBox.Show(node.ChildNodes.ToString());
//}
results = GetTextBetweenTextBlocks("coursedesc", "<h3>overview</h3>", "<h3>", out Overview);
MessageBox.Show(Overview);
Dts.TaskResult = (int)ScriptResults.Success;
}
static bool GetTextBetweenTextBlocks(string input_expression, string left_text, string right_text, out string matched_text)
{
// Declare results variable.
bool results = false;
// Define the regular expression that needs to be found.
string regex_find = left_text + "(?'text'.*?)" + right_text;
// Match the string.
Match string_output = Regex.Match(input_expression, regex_find);
// Output results
if (string_output.Success.ToString() == "True")
{
matched_text = string_output.ToString().Substring(left_text.Length, string_output.Length - left_text.Length - right_text.Length);
results = true;
return results;
}
else
{
matched_text = "";
return results;
}
}
}
Re: Best way to extract sections from sql string
Check out my response in this thread, it details how to parse xml with sql syntax.
http://www.vbforums.com/showthread.php?t=651476