C# parsing part of a string - c#

We have an application that prints log lines. Within the log lines we also print the full SyncML payload as XML. I need to parse out just the SyncML payloads: the actual XML, with everything else stripped out.
A log line looks like this:
`2016-01-06T15:13:45.188-0500 [DEBUG] {} Logger
[{{Correlation,(longID)}{Uri,POST (post
URL)}{host,(HOST)}{userID,(userID)}}] - request class SyncML: <?xml
version="1.0" encoding="UTF-8" standalone="yes"?></ns3:SyncML>`
My regex for the request class is as follows.
Regex request = new Regex(@"request class SyncML");
String line;
while ((line = sr.ReadLine()) != null)
{
    Match req = request.Match(line);
    if (req.Success)
    {
        string s = line.Substring(line.IndexOf("<?xml "));
    }
}
After the request.Match(line) call, Visual Studio shows the full line, so I know the match really is a success.
However, when I call line.Substring(line.IndexOf(...)) I get a System.ArgumentOutOfRangeException. When I printed the result of IndexOf, it was -1.
Perhaps I am using this wrong. I guess my question is what do I need to do to just strip out everything before

If the "<?xml" string begins on the next line, use this:
Regex request = new Regex(@"request class winmo.SyncML");
String line;
while ((line = sr.ReadLine()) != null)
{
    Match req = request.Match(line);
    if (req.Success)
    {
        // The XML starts on the line after the matching log line.
        var xmlLine = line = sr.ReadLine();
        if (null == xmlLine) break;
        string s = xmlLine.Substring(xmlLine.IndexOf("<?xml "));
    }
}
Or, you can improve your Regex for the newly edited example:
Regex request = new Regex(@"^.+request class winmo.SyncML[^\<]+(\<\?xml [^`]+)`");
string line;
while ((line = sr.ReadLine()) != null)
{
    Match req = request.Match(line);
    if (req.Success)
    {
        // Group 1 holds everything from "<?xml " up to the closing backtick.
        string s = req.Groups[1].Value;
    }
}
Additionally, you can search more than one line at a time with the improved Regex:
Regex request = new Regex(@"^.+request class winmo.SyncML[^\<]+(\<\?xml [^`]+)");
var lines = new List<String>(5);
string line;
while ((line = sr.ReadLine()) != null)
{
    // NOTE: You'll need to make sure this gets enough of your log file to get what you want
    lines.Add(line);
    while (lines.Count > 4)
        lines.RemoveAt(0);
    Match req = request.Match(string.Join("\r\n", lines));
    if (req.Success)
    {
        string s = req.Groups[1].Value;
    }
}

Maybe you want something like this:
String line;
while ((line = sr.ReadLine()) != null)
{
if(line.Contains("<?xml "))
{
string s = line.Substring(line.IndexOf("<?xml "));
// do something useful with s
}
}

Your Regex looks wrong; it should be Regex request = new Regex(@"request class SyncML");

Try using "<?xml" instead of "<?xml ". I don't see that space after xml.
This question has been edited. If the string is split across several lines, you should do:
while ((line = sr.ReadLine()) != null)
{
    Match req = request.Match(line);
    if (req.Success)
    {
        if (line.Contains("<?xml"))
        {
            string s = line.Substring(line.IndexOf("<?xml"));
        }
    }
}

If you have the entire log as one long string, you can use Substring(x) with IndexOf(string) to strip everything before the part you are interested in. I'm assuming from your last line that everything after the initial log info is part of the wanted XML.
string sFullLog = ReadFullLogAsASingleString();//Could be taxing in large logs
string sXML = sFullLog.Substring(sFullLog.IndexOf("<?xml"));
I see that the provided sample is a single log entry, and that the log entry contains the XML of interest.
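The exception in the question follows from how the two calls interact: IndexOf returns -1 when the marker is not found, and Substring(-1) throws ArgumentOutOfRangeException. A minimal sketch of a guarded version is below. The class and method names are hypothetical, and it assumes the marker is "<?xml" without a trailing space, as one of the answers suggests.

```csharp
using System;

public static class SyncMlExtractor
{
    // Returns the text from the first "<?xml" onward, or null when the
    // marker is absent. IndexOf returns -1 in that case, and passing -1
    // to Substring would throw ArgumentOutOfRangeException.
    public static string ExtractXml(string line)
    {
        if (line == null) return null;
        int start = line.IndexOf("<?xml", StringComparison.Ordinal);
        return start < 0 ? null : line.Substring(start);
    }
}
```

Inside the ReadLine loop this could be called as string s = SyncMlExtractor.ExtractXml(line); followed by a null check before using s.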

Related

Fix for CWE-113: Improper Neutralization of CRLF Sequences in HTTP Headers ('HTTP Response Splitting')

After running a Veracode scan, I got a CWE-113 finding. I found a solution that replaces the cookie value, but the issue is still not fixed.
string ReplaceHTTPRequestValue(string Value)
{
string replacedValue = string.Empty;
if (!string.IsNullOrEmpty(Value))
{
replacedValue = Value.Replace("\r", string.Empty)
.Replace("%0d", string.Empty)
.Replace("%0D", string.Empty)
.Replace("\n", string.Empty)
.Replace("%0a", string.Empty)
.Replace("%0A", string.Empty);
}
return replacedValue;
}
void WebTrends_PreRender()
{
HttpCookie cookie = Request.Cookies["WT_CID"];
string campaignIdVal = string.Empty;
if (cookie != null)
{
campaignIdVal = ReplaceHTTPRequestValue(Request.Cookies["WT_CID"].Value);
}
else
{
campaignIdVal = string.Empty;
}
}
How can I solve this?
Please take a look at this link
https://community.veracode.com/s/question/0D53n00007YVaMrCAL/how-to-fix-flaws-for-cwe-id-113-http-response-splitting
It is likely the reason the flaw continues to be reported is because
the functions you are using are not in the list of Supported Cleansing
Functions, which you can find in the Help Center here:
https://help.veracode.com/go/review_cleansers. For example the
supported function org.owasp.encoder.Encode.forJava() would cleanse
for CWE-113, as well as CWE-117, CWE-80 and CWE-93. Please note that
it is important to select the appropriate cleansing function for the
context.
string ReplaceHTTPRequestValue(string Value)
{
string NonCRLF = string.Empty;
foreach (char item in Value)
{
NonCRLF += item.ToString().Replace("\n", "").Replace("\r","");
}
return NonCRLF;
}
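For comparison, here is a compact sketch of the same idea (removing CR and LF in both raw and percent-encoded form) using a single case-insensitive regular expression. The class and method names are hypothetical. As the quoted Veracode answer points out, a static scanner may keep reporting the flaw if it does not recognize a custom function as a cleanser, so this shows the sanitizing logic only and is not a guaranteed way to clear the finding.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeaderValueCleaner
{
    // Matches raw CR/LF and their percent-encoded forms (%0d, %0a) in any case.
    private static readonly Regex CrLf =
        new Regex(@"\r|\n|%0d|%0a", RegexOptions.IgnoreCase);

    public static string RemoveCrLf(string value)
    {
        if (string.IsNullOrEmpty(value)) return string.Empty;
        string previous;
        // Repeat until stable, so that removing one sequence cannot
        // leave behind a newly formed one (e.g. "%0%0dd" -> "%0d" -> "").
        do
        {
            previous = value;
            value = CrLf.Replace(value, string.Empty);
        } while (value != previous);
        return value;
    }
}
```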

C# Stream keeps skipping first line

I'm doing something that should be rather simple, so I believe I am overlooking something.
I am using an HttpWebRequest and a WebResponse to detect whether a robots.txt exists on a server (that part works perfectly fine). However, I am also trying to do myList.Add(reader.ReadLine()); which works, except that it keeps skipping the very first line.
https://www.assetstore.unity3d.com/robots.txt is the file I first noticed the problem on. It is just for testing purposes; look at that link to get an idea of what I'm talking about.
The first line is not being added to my list either. I don't understand what's going on. I've tried looking this up, but everything I find is about skipping a line on purpose, which I don't want to do.
My code is below.
Console.WriteLine("Robots.txt Found: Presenting Rules in (Robot Rules).");
HttpWebRequest getResults = (HttpWebRequest)WebRequest.Create(ur + "robots.txt");
WebResponse getResponse = getResults.GetResponse();
using (StreamReader reader = new StreamReader(getResponse.GetResponseStream())) {
string line = reader.ReadLine();
while(line != null && line != "#") {
line = reader.ReadLine();
rslList.Add(line);
results.Text = results.Text + line + Environment.NewLine; // At first I thought it might have been this (nope).
}
// This didn't work either (I figured it might be skipping because I had too many things going on),
// so I put it into a for loop. Nope, it still skips the first line.
// for(int i = 0; i < rslList.Count; i++) {
// results.Text = results.Text + rslList[i] + Environment.NewLine;
// }
}
// Close the connection since it is no longer needed.
getResponse.Close();
// Now check for user-rights.
CheckUserRights();
Move the next ReadLine call to the end of the loop body, so that the line you just read is processed before the next one is fetched:
var line = reader.ReadLine(); //Read first line
while(line != null && line != "#") { //while line condition satisfied
//perform your desired actions
rslList.Add(line);
results.Text = results.Text + line + Environment.NewLine;
line = reader.ReadLine(); //read the next line
}
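The reason the question's code loses the first line: it reads one line before the loop, then reads again at the top of the loop body before adding anything, so the first line is overwritten without ever being stored. A common alternative is to read inside the loop condition. The sketch below uses hypothetical names and takes a TextReader so it can be tried with a StringReader instead of a live HTTP response.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineReader
{
    // Reads lines until end of input or until a line equal to "#".
    public static List<string> ReadRules(TextReader reader)
    {
        var lines = new List<string>();
        string line;
        // Reading in the loop condition means every line that is read
        // is also processed, including the first one.
        while ((line = reader.ReadLine()) != null && line != "#")
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

With the question's code, this would be called as LineReader.ReadRules(reader) inside the using block, and the returned list used to fill rslList and results.Text.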

StreamReader get string between certain characters

I have a program that sends emails utilizing templates via a web service. To test the templates, I made a simple program that reads the templates, fills it up with dummy value and send it. The problem is that the templates have different 'fill in' variable names. So what I want to do is open the template, make a list of the variables and then fill them with dummy text.
Right now I have something like:
StreamReader SR = new StreamReader(myPath);
.... //Email code here
Msg.Body = SR.ReadToEnd();
SR.Close();
Msg.Body = Msg.Body.Replace("%myFillInVariable%", "Test String");
....
So I'm thinking, opening the template, search for values in between "%" and put them in an ArrayList, then do the Msg.Body = SR.ReadToEnd(); part. Loop the ArrayList and do the Replace part using the value of the Array.
What I can't find is how to read the value between the % tags. Any suggestions on what method to use will be greatly appreciated.
Thanks,
MORE DETAILS:
Sorry if I wasn't clear. I'm passing the name of the TEMPLATE to the script from a drop-down. I might have a few dozen templates, and they all have different %VariableToBeReplace% names. That is why I want to read the template with the StreamReader, find all the %value names%, put them into an array AND THEN fill them in, which I already know how to do. Getting the names of what I need to replace is the part I don't know how to do in code.
I am not sure on your question either but here is a sample of how to do the replacement.
You can run and play with this example in LinqPad.
Copy this content into a file and change the path to what you want. Content:
Hello %FirstName% %LastName%,
We would like to welcome you and your family to our program at the low cost of %currentprice%. We are glad to offer you this %Service%
Thanks,
Some Person
Code:
var content = string.Empty;
using(var streamReader = new StreamReader(@"C:\EmailTemplate.txt"))
{
content = streamReader.ReadToEnd();
}
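var matches = Regex.Matches(content, @"%(.*?)%", RegexOptions.ExplicitCapture);
var extractedReplacementVariables = new List<string>(matches.Count);
foreach(Match match in matches)
{
    // match.Value includes the surrounding % signs.
    // Skip repeats, otherwise Dictionary.Add below throws on a duplicate key.
    if (!extractedReplacementVariables.Contains(match.Value))
        extractedReplacementVariables.Add(match.Value);
}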
extractedReplacementVariables.Dump("Extracted KeyReplacements");
//Do your code here to populate these, this part is just to show it still works
//Modify to meet your needs
var replacementsWithValues = new Dictionary<string, string>(extractedReplacementVariables.Count);
for(var i = 0; i < extractedReplacementVariables.Count; i++)
{
replacementsWithValues.Add(extractedReplacementVariables[i], "TestValue" + i);
}
content.Dump("Template before Variable Replacement");
foreach(var key in replacementsWithValues.Keys)
{
content = content.Replace(key, replacementsWithValues[key]);
}
content.Dump("Template After Variable Replacement");
I am not really sure that I understood your question but, you can try to put on the first line of the template your 'fill in variable'.
Something like:
StreamReader SR = new StreamReader(myPath);
String fill_in_var=SR.ReadLine();
String line;
while((line = SR.ReadLine()) != null)
{
Msg.Body+=line;
}
SR.Close();
Msg.Body = Msg.Body.Replace(fill_in_var, "Test String");
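A variation on the regex approach above, sketched without the LinqPad Dump calls: one method lists the placeholder names, and another fills them in a single pass with Regex.Replace and a callback. The class and method names are hypothetical. The pattern assumes placeholders never span lines and that the template contains no stray % signs; text such as "50% off, 10% more" would be mistaken for a placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TemplateFiller
{
    // A % sign, one or more characters that are not % or a line break, a % sign.
    private static readonly Regex Placeholder = new Regex(@"%([^%\r\n]+)%");

    // Lists distinct placeholder names (without the % signs)
    // in order of first appearance.
    public static List<string> GetNames(string template)
    {
        var names = new List<string>();
        foreach (Match m in Placeholder.Matches(template))
        {
            string name = m.Groups[1].Value;
            if (!names.Contains(name)) names.Add(name);
        }
        return names;
    }

    // Replaces every placeholder with the value the callback returns for its name.
    public static string Fill(string template, Func<string, string> valueFor)
    {
        return Placeholder.Replace(template, m => valueFor(m.Groups[1].Value));
    }
}
```

For a quick test of a template, something like Msg.Body = TemplateFiller.Fill(Msg.Body, name => "Test " + name); would fill every variable with dummy text that still shows which variable it came from.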

500 error when querying yahoo placefinder with a particular character?

I am using the Yahoo Placefinder service to find some latitude/longitude positions for a list of addresses I have in a csv file.
I am using the following code:
String reqURL = "http://where.yahooapis.com/geocode?location=" + HttpUtility.UrlEncode(location) + "&appid=KGe6P34c";
XmlDocument xml = new XmlDocument();
xml.Load(reqURL);
XPathNavigator nav = xml.CreateNavigator();
// process xml here...
I just found a very stubborn error, that I thought (incorrectly) for several days was due to Yahoo forbidding further requests from me.
It is for this URL:
http://where.yahooapis.com/geocode?location=31+Front+Street%2c+Sedgefield%2c+Stockton%06on-Tees%2c+England%2c+TS21+3AT&appid=KGe6P34c
My browser complains about a parsing error for that url. My c# program says it has a 500 error.
The location string here comes from this address:
Agape Business Consortium Ltd.,michael.cutbill#agapesolutions.co.uk,Michael A Cutbill,Director,,,9 Jenner Drive,Victoria Gardens,,Stockton-on-Tee,,TS19 8RE,,England,85111,Hospitals,www.agapesolutions.co.uk
I think the error comes from the first hyphen in Stockton-on-Tee, but I can't explain why. If I replace this hyphen with a 'normal' hyphen, the query goes through successfully.
Is this error due to a fault at my end (the HttpUtility.UrlEncode function being incorrect?) or a fault at Yahoo's end?
Even though I can see what is causing this problem, I don't understand why. Could someone explain?
EDIT:
Further investigation on my part indicates that the character this hyphen is being encoded to, "%06", is the ASCII control character "Acknowledge" (ACK). I have no idea why this character would turn up here. It seems that different places render Stockton-on-Tee in different ways: it appears normal when opened in a text editor, but by the time it appears in Visual Studio, before being encoded, it is Stocktonon-Tees. Note that when I copied the previous into this text box in Firefox, the hyphen rendered as a weird, square box character, but on this subsequent edit the SO software appears to have sanitized the character.
I include below the function & holder class I am using to parse the csv file - as you can see, I am doing nothing strange that might introduce unexpected characters. The dangerous character appears in the "Town" field.
public List<PaidBusiness> parseCSV(string path)
{
List<PaidBusiness> parsedBusiness = new List<PaidBusiness>();
List<string> parsedBusinessNames = new List<string>();
try
{
using (StreamReader readFile = new StreamReader(path))
{
string line;
string[] row;
bool first = true;
while ((line = readFile.ReadLine()) != null)
{
if (first)
first = false;
else
{
row = line.Split(',');
PaidBusiness business = new PaidBusiness(row);
if (!business.bad) // no problems with the formatting of the business (no missing fields, etc)
{
if (!parsedBusinessNames.Contains(business.CompanyName))
{
parsedBusinessNames.Add(business.CompanyName);
parsedBusiness.Add(business);
}
}
}
}
}
}
catch (Exception e)
{ }
return parsedBusiness;
}
public class PaidBusiness
{
public String CompanyName, EmailAddress, ContactFullName, Address, Address2, Address3, Town, County, Postcode, Region, Country, BusinessCategory, WebAddress;
public String latitude, longitude;
public bool bad;
public static int noCategoryCount = 0;
public static int badCount = 0;
public PaidBusiness(String[] parts)
{
bad = false;
for (int i = 0; i < parts.Length; i++)
{
parts[i] = parts[i].Replace("pithawala", ",");
parts[i] = parts[i].Replace("''", "'");
}
CompanyName = parts[0].Trim();
EmailAddress = parts[1].Trim();
ContactFullName = parts[2].Trim();
Address = parts[6].Trim();
Address2 = parts[7].Trim();
Address3 = parts[8].Trim();
Town = parts[9].Trim();
County = parts[10].Trim();
Postcode = parts[11].Trim();
Region = parts[12].Trim();
Country = parts[13].Trim();
BusinessCategory = parts[15].Trim();
WebAddress = parts[16].Trim();
// data testing
if (CompanyName == "")
bad = true;
if (EmailAddress == "")
bad = true;
if (Postcode == "")
bad = true;
if (Country == "")
bad = true;
if (BusinessCategory == "")
bad = true;
if (Address.ToLower().StartsWith("po box"))
bad = true;
// its ok if there is no contact name.
if (ContactFullName == "")
ContactFullName = CompanyName;
//problem if there is no business category.
if (BusinessCategory == "")
noCategoryCount++;
if (bad)
badCount++;
}
}
Welcome to real world data. It's likely that the problem is in the CSV file. To verify, read the line and inspect each character:
foreach (char c in line)
{
Console.WriteLine("{0}, {1}", c, (int)c);
}
A "normal" hyphen will give you a value of 45.
The other problem could be that you're reading the file using the wrong encoding. It could be that the file is encoded as UTF8 and you're reading it with the default encoding. You might try specifying UTF8 when you open the file:
using (StreamReader readFile = new StreamReader(path, Encoding.UTF8))
Do that, and then output each character on the line again (as above), and see what character you get for the hyphen.
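The per-character inspection described above can be narrowed to just the suspicious characters. The sketch below, with hypothetical names, reports the position and code point of every control character in a string, which would flag the U+0006 (ACK) the question found in the Town field. It only detects the characters; whether to strip or replace them depends on what the source file was supposed to contain.

```csharp
using System;
using System.Collections.Generic;

public static class ControlCharFinder
{
    // Returns one "index:U+XXXX" entry for each control character in the text.
    public static List<string> Find(string text)
    {
        var found = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsControl(text[i]))
                found.Add(i + ":U+" + ((int)text[i]).ToString("X4"));
        }
        return found;
    }
}
```

Running it on each field before calling HttpUtility.UrlEncode would show which rows of the CSV need cleaning.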

Microsoft Word Document Controls not accepting carriage returns

So, I have a Microsoft Word 2007 Document with several Plain Text Format (I have tried Rich Text Format as well) controls which accept input via XML.
For carriage returns, I had the string being passed through XML containing "\r\n" when I wanted a carriage return, but the word document ignored that and just kept wrapping things on the same line. I also tried replacing the \r\n with System.Environment.NewLine in my C# mapper, but that just put in \r\n anyway, which still didn't work.
Note also that on the control itself I have set "Allow Carriage Returns (Multiple Paragraphs)" in the control properties.
This is the XML for the listMapper
<Field id="32" name="32" fieldType="SimpleText">
<DataSelector path="/Data/DB/DebtProduct">
<InputField fieldType=""
path="/Data/DB/Client/strClientFirm"
link="" type=""/>
<InputField fieldType=""
path="strClientRefDebt"
link="" type=""/>
</DataSelector>
<DataMapper formatString="{0} Account Number: {1}"
name="SimpleListMapper" type="">
<MapperData></MapperData>
</DataMapper>
</Field>
Note that this is the listMapper C# where I actually map the list (notice where I try and append the system.environment.newline)
namespace DocEngine.Core.DataMappers
{
public class CSimpleListMapper:CBaseDataMapper
{
public override void Fill(DocEngine.Core.Interfaces.Document.IControl control, CDataSelector dataSelector)
{
if (control != null && dataSelector != null)
{
ISimpleTextControl textControl = (ISimpleTextControl)control;
IContent content = textControl.CreateContent();
CInputFieldCollection fileds = dataSelector.Read(Context);
StringBuilder builder = new StringBuilder();
if (fileds != null)
{
foreach (List<string> lst in fileds)
{
if (CanMap(lst) == false) continue;
if (builder.Length > 0 && lst[0].Length > 0)
builder.Append(Environment.NewLine);
if (string.IsNullOrEmpty(FormatString))
builder.Append(lst[0]);
else
builder.Append(string.Format(FormatString, lst.ToArray()));
}
content.Value = builder.ToString();
textControl.Content = content;
applyRules(control, null);
}
}
}
}
}
Does anybody have any clue at all how I can get MS Word 2007 (docx) to quit ignoring my newline characters??
Use a function like this:
private static Run InsertFormatRun(Run run, string[] formatText)
{
    foreach (string text in formatText)
    {
        run.AppendChild(new Text(text));
        // A Break is a child of the Run itself, not of RunProperties.
        run.AppendChild(new Break());
    }
    return run;
}
None of the above answers were any help for me.
However I figured out that the InsertAfter method swaps the \n in the original XML string for \v and when this is passed into the content control it then renders correctly.
contentControl.MultiLine = true
contentControl.Range.InsertAfter(your string)
I had the same problem, but in a table cell.
I had one string containing carriage returns (multiple lines) in a Text object that was appended to a paragraph, which was appended to a table cell.
=> The carriage returns were ignored by Word.
The solution was simple:
Create one paragraph per line and add all of these paragraphs to the table cell.
I think it works
WordprocessingDocument _docx = WordprocessingDocument.Create("c:\\Test.docx", WordprocessingDocumentType.Document);
// A newly created package has no main part yet, so add one with an empty body.
MainDocumentPart _part = _docx.AddMainDocumentPart();
Body _body = new Body();
_part.Document = new Document(_body);
string _str = "abc\ndef\ngeh";
string[] _strArr = _str.Split('\n');
foreach (string _line in _strArr)
{
    // One paragraph per line of text.
    _body.Append(NewText(_line));
}
_part.Document.Save();
_docx.Close();
static Paragraph NewText(string _text)
{
Paragraph _head = new Paragraph();
Run _run = new Run();
Text _line = new Text(_text);
_run.Append(_line);
_head.Append(_run);
return _head;
}
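To make the "one paragraph per line" idea concrete without depending on the Open XML SDK, the sketch below builds the WordprocessingML markup with System.Xml.Linq. The class and method names are hypothetical, and it only produces the paragraph elements; it does not create or modify a .docx package.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ParagraphXml
{
    // The main WordprocessingML namespace (conventionally prefixed "w").
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds one p > r > t element chain per line of the input text.
    public static XElement[] FromText(string text)
    {
        return text.Replace("\r\n", "\n").Split('\n')
            .Select(line => new XElement(W + "p",
                new XElement(W + "r",
                    new XElement(W + "t", line))))
            .ToArray();
    }
}
```

Each returned element corresponds to what NewText above produces with the SDK types: a paragraph holding a single run with a single text node.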
