I have the following C# code to produce a small PHP file. The reason I am doing this is to update 400 plus sites automatically. The sites are in PHP on a Windows Environment so using C# for utility apps is the easiest for me.
fileContents.AppendFormat("<?php{0}",Environment.NewLine);
fileContents.AppendFormat("# FileName=\"clientsite.php\"{0}",Environment.NewLine);
fileContents.AppendFormat("# HTTP=\"true\"{0}",Environment.NewLine);
fileContents.AppendFormat("$clientname = \"{0}\";{1}", clientsiteName, Environment.NewLine);
fileContents.AppendFormat("$version = \"v6.2i\";{0}",Environment.NewLine);
fileContents.Append("?>");
The resulting file causes a strange character to appear on the PHP page that includes it. When I manually open the created PHP file, press Backspace on the last line and then Enter, it works. Is there something better than Environment.NewLine to use for this? Or is there another problem I am missing?
EDIT: The character looks like something I can't reproduce on the keyboard (a squiggly line), but it ends with ?
You could just try "\n", I believe Environment.NewLine is "\r\n".
But it could also be about how you write the StringBuilder (I assume fileContents is a StringBuilder) to the file. If you e.g. use WriteAllText, you could try using different encoding.
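The two suggestions above can be combined in a small sketch. It assumes fileContents is a StringBuilder and that the stray character is either a carriage return or a UTF-8 byte order mark (BOM), which the question does not confirm. The names PhpFileWriter, BuildContents and Write are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

static class PhpFileWriter
{
    // Builds the PHP source with "\n" line endings instead of Environment.NewLine.
    public static string BuildContents(string clientsiteName)
    {
        var fileContents = new StringBuilder();
        fileContents.Append("<?php\n");
        fileContents.Append("# FileName=\"clientsite.php\"\n");
        fileContents.Append("# HTTP=\"true\"\n");
        fileContents.AppendFormat("$clientname = \"{0}\";\n", clientsiteName);
        fileContents.Append("$version = \"v6.2i\";\n");
        fileContents.Append("?>");
        return fileContents.ToString();
    }

    // Writes the text as UTF-8 and explicitly asks for no byte order mark.
    public static void Write(string path, string contents)
    {
        File.WriteAllText(path, contents, new UTF8Encoding(false));
    }

    static void Main()
    {
        // Assumption: a temp-folder path stands in for the real site folder.
        string path = Path.Combine(Path.GetTempPath(), "clientsite.php");
        Write(path, BuildContents("Example Client"));
        Console.WriteLine("Wrote " + path);
    }
}
```

If the character persists, comparing the bytes of a generated file with those of a hand-fixed file should show exactly what differs.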
Related
I wrote a program that crawls a website to get data and outputs it to an Excel sheet. The program is written in C# using Microsoft Visual Studio 2010.
Most of the time, I have no problem getting content from the website, parsing it, and storing the data in Excel.
However, once in a while I run into an issue saying that there are illegal characters (such as ▶) that prevent output to the Excel file, which crashes the program.
I also went to the website manually and found other illegal characters such as Ú.
I tried to do a .Replace(), but the code can't seem to find those characters.
string htmlContent = getResponse(url); //get full html from given url
string newHtml = htmlContent.Replace("▶", "?").Replace("Ú", "?");
So my question is: is there a way to strip all characters of those types out of an HTML string (the HTML of the web page)? Below is the error message I got.
I tried Anthony and woz's solution and that didn't work...
See System.Text.Encoding.Convert
Example usage:
var htmlText = "▶Hello World"; // the text you're trying to convert
var convertedText = System.Text.Encoding.ASCII.GetString(
    System.Text.Encoding.Convert(
        System.Text.Encoding.Unicode,
        System.Text.Encoding.ASCII,
        System.Text.Encoding.Unicode.GetBytes(htmlText)));
I tested this with the string ▶Hello World and it gave me ?Hello World.
You could try stripping all non-ASCII characters.
string htmlContent = getResponse(url);
string newHtml = Regex.Replace(htmlContent, @"[^\u0000-\u007F]", "?");
Thank you for the replies and thanks for the help.
After a couple more hours of googling I have found the solution to my question. The problem was that I had to "sanitize" my HTML string.
http://seattlesoftware.wordpress.com/2008/09/11/hexadecimal-value-0-is-an-invalid-character/
Above is the helpful article I found, which also provides code example.
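Judging by its title, the linked article deals with characters that are not allowed in XML. A minimal sketch of that kind of sanitizing follows. It is not the article's code: it relies on XmlConvert.IsXmlChar and XmlConvert.IsXmlSurrogatePair (available from .NET 4.0), and the names XmlSanitizer and RemoveInvalidXmlChars are hypothetical.

```csharp
using System;
using System.Text;
using System.Xml;

static class XmlSanitizer
{
    // Returns a copy of the text without characters that XML 1.0 does not allow.
    public static string RemoveInvalidXmlChars(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (i + 1 < text.Length &&
                     XmlConvert.IsXmlSurrogatePair(text[i + 1], c))
            {
                // Keep a valid high/low surrogate pair together.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            // Anything else (for example '\0') is dropped.
        }
        return sb.ToString();
    }

    static void Main()
    {
        string dirty = "<p>Hello\0 World\u0001</p>";
        Console.WriteLine(RemoveInvalidXmlChars(dirty));
    }
}
```

Note that ▶ and Ú are valid XML characters and pass through unchanged; this only removes control characters and unpaired surrogates.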
I want to process a file coming from a post request.
The file looks like this:
first line: text1
second line: empty line
third line: text2
an example:
"
asdasdasd1
asdasdasd2
"
So far I have processed the file like this:
byte[] data = Request.BinaryRead(Request.TotalBytes);
String processedfile = Encoding.UTF8.GetString(data);
But this way I lose the line breaks, and the whole string becomes 1 line instead of 3.
How can I process the request so that the line breaks are kept?
Thanks in advance!
Sincerely,
Zoli
Since you're not using an HTML upload control on the page (which would preserve the entire file without issue) - my guess is that the significant whitespace in your 'file', isn't percent-encoded (or URL encoded: http://en.wikipedia.org/wiki/Percent-encoding) and so is getting dropped.
Your client needs to URL encode the file contents otherwise stuff like spaces, tabs, etc will most likely be lost.
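As a rough illustration of the encoding step, the sketch below uses WebUtility.UrlEncode and WebUtility.UrlDecode (System.Net, .NET 4.5 and later). How the client actually builds its request is not shown in the question, so the class and method names here are hypothetical.

```csharp
using System;
using System.Net;

static class UploadEncoding
{
    // Client side: percent-encode the text so line breaks survive the transfer.
    public static string Encode(string fileContents)
    {
        return WebUtility.UrlEncode(fileContents);
    }

    // Server side: turn the received text back into the original content.
    public static string Decode(string encoded)
    {
        return WebUtility.UrlDecode(encoded);
    }

    static void Main()
    {
        string original = "asdasdasd1\r\n\r\nasdasdasd2";
        string encoded = Encode(original);
        Console.WriteLine(encoded);
        Console.WriteLine(Decode(encoded) == original);
    }
}
```

On the server, Decode would be applied to the string obtained from Encoding.UTF8.GetString(data).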
Within C#, I can put data into separate strings.
For example, I put the current date into a string called line1 and some info into a string called line2.
What I want to do now is send these 2 strings to a web address that handles them and writes them into a simple text file. (Or can I write to a text file on a website directly from C#?)
My knowledge of PHP is very limited, but so far I have found this code to be working:
<?php
$File = "name.txt";
$Handle = fopen($File, 'a');
$Data = "line1\n";
fwrite($Handle, $Data);
$Data = "line2\n";
fwrite($Handle, $Data);
print "Data Added";
fclose($Handle);
?>
The C# application is running on a computer, not the website (WPF window).
But so far it only writes the hard-coded content of $Data to the "name.txt" file.
Does anyone know how I could link the text held in the strings in C# to the data fields defined in the PHP, so that the text from the strings gets written to the text file on the website? Or would it be possible to write directly to a text file without the PHP in between?
So, you have a C# app that you want to use to send 2 bits of data to a PHP based website, and have the website write the data into a file? If that's what you want, you'll need to do something like the following...
On the website, create a receiving PHP file. The bones of it would be something like :
<?php
$File = "name.txt";
$Handle = fopen($File, 'a');
$line1 = $_GET["line1"] . "\n";
fwrite($Handle, $line1);
$line2 = $_GET["line2"];
fwrite($Handle, $line2);
print "Data Added";
fclose($Handle);
echo "Completed writing data to the file";
?>
and to submit that data from the C# app to the website, do something as simple as
WebClient wc = new WebClient();
Console.WriteLine(wc.DownloadString("http://example.com/Receiver.php?line1=this is the first line&line2=and this is the second"));
(
NOTE : No error handling is included in this code, and anyone who knows the URL for the receiver will be able to append whatever they like to your file. Take care when actually implementing this.
ALSO NOTE : It is years since I did much with PHP, so you will probably need to tweak the code.
AND ANOTHER THING : the WebClient.DownloadString approach is as basic as it gets. You may want to look at HttpWebRequest if you need more control.
)
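Because the values travel in the query string, spaces and characters such as & should be escaped before the URL is sent. A small sketch using Uri.EscapeDataString follows. The helper name ReceiverUrl.Build is hypothetical, and the URL is the example one from above.

```csharp
using System;

static class ReceiverUrl
{
    // Builds the request URL with both values percent-encoded.
    public static string Build(string baseUrl, string line1, string line2)
    {
        return baseUrl
            + "?line1=" + Uri.EscapeDataString(line1)
            + "&line2=" + Uri.EscapeDataString(line2);
    }

    static void Main()
    {
        string url = Build(
            "http://example.com/Receiver.php",
            DateTime.Now.ToString("yyyy-MM-dd"),
            "some info & more");
        Console.WriteLine(url);
    }
}
```

The resulting string can be passed to wc.DownloadString as in the snippet above.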
You can write to a text file on a website directly from C#, provided the code runs inside the website itself (for example in an ASP.NET page), because Server.MapPath is only available there.
System.IO.StreamWriter file = new System.IO.StreamWriter(Server.MapPath("/file.txt"));
file.WriteLine("First line.");
file.WriteLine("Second line.");
file.Close();
It will create a file in the root of your website (the user running the site has to have write permissions in this directory).
I saw similar topics but could not find a solution. My problem is that I have a .txt file whose text is in Bulgarian (which is Cyrillic), but my attempts to read it have had no success. I tried to read it with this code:
if (File.Exists(fileName))
{
    using (StreamReader reader = new StreamReader(fileName, Encoding.UTF8))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            Console.WriteLine(line);
        }
    }
}
I also changed the Encoding value to every available option, and I tried GetEncoding(1251), which I read is for Cyrillic. When saving the .txt file I tried each available encoding (Unicode, UTF-8, BigEndianUnicode, ANSI) in every combination with the Encoding I set in the code, but again with no success.
Any ideas on how to read the Cyrillic symbols correctly will be appreciated.
And here is sample text for this: "Ето примерен текст."
Thanks in advance! :)
Your problem is that the console can't show cyrillic characters. Try putting a breakpoint on the Console.WriteLine and inspect the line variable. Clearly you'll need to know the correct encoding first! :-)
If you don't trust me, try this: make a console program that does this:
string line = "Ето примерен текст";
Console.WriteLine(line);
return 0;
put a breakpoint on the return 0;, watch the console and watch the line variable.
I'll add that unicode consoles should be one of the "new" things in .NET 4.5
And you can try to read this page: c# unicode string output
The problem you are having is not reading the text, but displaying it.
If your real intention is to display Unicode text in a console window, then you'll have to make a few changes. If however, you will be displaying the text in a WinForms or WPF app for instance, then you will not have problems - they work with Unicode by default.
By default, the console will not handle unicode, or use a font which has unicode glyphs. You need to do the following:
Save your text file as UTF8.
Start a console which is unicode enabled: cmd /u
Change the font to "Lucida Sans Unicode": console window menu -> properties -> font
Change the codepage to Unicode: chcp 65001
Run your app.
Your characters will now be displayed correctly.
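On the code side, the steps above can be complemented by setting Console.OutputEncoding before writing. This is a minimal sketch, assuming the file was saved as UTF-8. The class name and the default file name are hypothetical, and the console font still has to contain Cyrillic glyphs.

```csharp
using System;
using System.IO;
using System.Text;

static class CyrillicReader
{
    // Reads all lines of the file using the given encoding.
    public static string[] ReadLines(string fileName, Encoding encoding)
    {
        return File.ReadAllLines(fileName, encoding);
    }

    static void Main(string[] args)
    {
        // Ask the console to emit UTF-8 instead of the default code page.
        Console.OutputEncoding = Encoding.UTF8;

        // Hypothetical default file name; pass the real path as an argument.
        string fileName = args.Length > 0 ? args[0] : "sample.txt";
        if (!File.Exists(fileName))
        {
            Console.WriteLine("File not found: " + fileName);
            return;
        }

        foreach (string line in ReadLines(fileName, Encoding.UTF8))
        {
            Console.WriteLine(line);
        }
    }
}
```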
I want to read an HTML file, and for that I use System.IO.File.ReadAllText(path). It reads all my HTML files except one, which this function fails to read.
I have also tried
using (StreamReader reader = File.OpenText(fileName)) {
text = reader.ReadToEnd();
}
but the problem is still the same.
What could be the reason, and what is the solution? Or is there another way to read the file?
I'll take a wild guess:
The file contains unicode sequences for extended chars, and the diagnosis is based on a (mismatched) length.
If I debug the code, it looks like
"<\0h\0t\0m\0l\0>\0<\0h\0e\0a\0d\0>\0\r\0\n\0<\0M\0E\0T\0A\0
\0h\0t\0t\0p\0-\0e\0q\0u\0i\0v\0=\0\"\0C\0o\0n\0t\0e\0n
Which is a valid beginning of an HTML file except for the very first char. The file is probably damaged: it is missing a unicode marker (byte order mark) at the start. This damage was probably caused when it was written and is not easily repairable now.
You could try setting WebClient.Encoding to UTF8 (and try a few others, such as ASCII, as well).
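The alternating \0 bytes in the debugger output suggest, as an assumption on my part, that the file is UTF-16 little-endian without a byte order mark. If so, naming the encoding explicitly when reading may be enough. The sketch below recreates such a file in the temp folder and reads it both ways. Utf16Reader and ReadAsUtf16 are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

static class Utf16Reader
{
    // Reads the file as UTF-16 little-endian, whether or not it has a BOM.
    public static string ReadAsUtf16(string path)
    {
        return File.ReadAllText(path, Encoding.Unicode);
    }

    static void Main()
    {
        // Simulate the damaged file: UTF-16 bytes written without a BOM.
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, Encoding.Unicode.GetBytes("<html><head>\r\n"));

        // The default read finds no BOM and leaves '\0' characters in the text.
        Console.WriteLine(File.ReadAllText(path).IndexOf('\0') >= 0);

        // The explicit read decodes the text without them.
        Console.WriteLine(ReadAsUtf16(path).IndexOf('\0') >= 0);

        File.Delete(path);
    }
}
```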
Does the MessageBox show anything? Any error? What does varText.Length show?
string varText = File.ReadAllText(varFile, Encoding.Default);
MessageBox.Show(varFile + " Text: " + varText + " Length: " + varText.Length);
Verify in the MessageBox that the path to the file is correct, and verify that your application has the same access rights as you have when reading the file with Notepad.
Came across this on google recently. The correct way to do it is via WebClient...
WebClient client = new WebClient();
String guestMsg = client.DownloadString("C:\\temp\\TheBarGuestDetailsEmail.htm");
File.ReadAllText can garble characters like £ or ' when the file is not in the encoding it assumes (UTF-8, unless a byte order mark says otherwise).