Friday, June 28, 2013

Regular expression on XML attributes

Please try this code:



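using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Captures the value of the text="..." attribute.
private static readonly Regex GetHtmlTextRegex = new Regex("text=\"([^\"]*)\"");
// Captures the text that sits between one tag and the next tag containing '/' (typically its closing tag).
private static readonly Regex GetTextRegex = new Regex("<[^>]*>([^<]+)<[^>]*/[^>]*>");

// Returns the plain-text pieces found inside the text attribute, or null if there is no text attribute.
private static List<string> ParseXmlText(string xmlText)
{
    var htmlTextMatch = GetHtmlTextRegex.Match(xmlText);
    if (htmlTextMatch.Success)
    {
        var htmlText = htmlTextMatch.Groups[1].Value;
        var plainTextMatches = GetTextRegex.Matches(htmlText);
        var result = new List<string>(plainTextMatches.Count);
        result.AddRange(from Match plainTextMatch in plainTextMatches select plainTextMatch.Groups[1].Value);
        return result;
    }
    return null;
}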



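You can call the ParseXmlText method with the XML text you want to parse. It returns the list of plain-text pieces found inside the first text attribute, or null if the input has no text attribute.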
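To make the call concrete, here is a minimal console sketch. The class name XmlTextDemo and the sample input string are made up for illustration, and the method is made public only so it can be called from outside the class. The sample assumes the text attribute holds raw, unescaped markup; well-formed XML would escape `<` as `&lt;` inside an attribute value, and in that case the second regex would find nothing unless the value is decoded first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, used only for this example.
public static class XmlTextDemo
{
    private static readonly Regex GetHtmlTextRegex = new Regex("text=\"([^\"]*)\"");
    private static readonly Regex GetTextRegex = new Regex("<[^>]*>([^<]+)<[^>]*/[^>]*>");

    // Same logic as ParseXmlText in the post, written with LINQ method syntax.
    public static List<string> ParseXmlText(string xmlText)
    {
        var htmlTextMatch = GetHtmlTextRegex.Match(xmlText);
        if (!htmlTextMatch.Success)
        {
            return null; // no text="..." attribute in the input
        }

        var htmlText = htmlTextMatch.Groups[1].Value;
        return GetTextRegex.Matches(htmlText)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample input: an element whose text attribute contains markup.
        const string sample = "<item text=\"<p>Hello</p><b>World</b>\" />";

        var pieces = ParseXmlText(sample);
        if (pieces == null)
        {
            Console.WriteLine("No text attribute found.");
            return;
        }

        // Prints "Hello" and then "World", one per line.
        foreach (var piece in pieces)
        {
            Console.WriteLine(piece);
        }
    }
}
```

Because the method returns null when there is no text attribute, the caller has to check for null before iterating, as Main does here.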


Please let me know if I got something wrong. :)

