I have seen the error "The ';' character, hexadecimal value 0x3B, cannot be included in a name." in my log files for an ASP.NET Web App. The url that's logged looks something like this:
mypage.aspx?paramone=one+two&amp;paramtwo=zero+1
So my first question is what type of system/browser is encoding the original query string? (This happens rarely)
I've tried to address this problem with the following snippet of code in the Page_Load() event:
string rawUrl = Request.RawUrl;
if (rawUrl.Contains("&amp;"))
{
rawUrl = rawUrl.Replace("&amp;", "&");
Server.Transfer(rawUrl, false);
return;
}
However, when it transfers back to this page, the &amp; is back in the query string. So I'm guessing that the .Transfer() function encodes the first param.
Suggestions about solving this problem?
Your web server should be able to log the "user agent" field from the HTTP Request, which should enable you to identify the culprit.
Don't fix it - there's a very well defined set of legal syntaxes for URI parameters, and this ain't one of them.
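To see where the ';' in the error comes from, it can help to look at what an HTML-encoded ampersand does to the query string. The following is only an illustrative sketch, assuming the client sent the link text exactly as it appears in HTML source (with &amp; left undecoded). The class and method names are hypothetical, and WebUtility.HtmlDecode is used for demonstration only, not as a recommended fix.

```csharp
using System;
using System.Net;

public static class QueryStringDemo
{
    // Hypothetical helper: undoes HTML entity encoding in a logged query string.
    public static string DecodeEntities(string query)
    {
        return WebUtility.HtmlDecode(query);
    }

    // Hypothetical helper: returns the parameter names that a plain split on '&' produces.
    public static string[] ParameterNames(string query)
    {
        string[] pairs = query.Split('&');
        string[] names = new string[pairs.Length];
        for (int i = 0; i < pairs.Length; i++)
        {
            int eq = pairs[i].IndexOf('=');
            // Everything before '=' is the name; a pair without '=' is all name.
            names[i] = eq >= 0 ? pairs[i].Substring(0, eq) : pairs[i];
        }
        return names;
    }
}
```

With the logged query string, ParameterNames returns "paramone" and "amp;paramtwo". The second name contains the ';' that the error message complains about. After DecodeEntities, the names are "paramone" and "paramtwo".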
When you try to export a Microsoft catalog to XML, the resulting file cannot be imported, and you receive the following error message
"The XML file path/filename contains an error at line. " "A Name contained an invalid character."
If you validate the XML catalog by using Microsoft Visual Studio .NET you receive the following error message:
"The '(' character, hexadecimal value 0x28, cannot begin a name. Line #, Position #"
This problem occurs because the Commerce Server export was not encoding the following special characters:
The range 0x0021 – 0x002F includes ! " # $ % & ' ( ) * + , - . /
The range 0x003A – 0x0040 includes : ; < = > ? @
The range 0x005B – 0x005E includes [ \ ] ^
The range 0x007B – 0x007E includes { | } ~
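For code that builds its own XML names from arbitrary text, the .NET XmlConvert class can check and encode such characters. This is a minimal sketch, not the Commerce Server fix itself. The class and method names are hypothetical, and the input is assumed to be a non-empty string.

```csharp
using System;
using System.Xml;

public static class XmlNameDemo
{
    // Returns true when the text can be used as an XML name as-is.
    public static bool IsValidName(string text)
    {
        try
        {
            XmlConvert.VerifyName(text);
            return true;
        }
        catch (XmlException)
        {
            // VerifyName throws when the text contains a character that is not allowed in a name.
            return false;
        }
    }

    // Replaces characters that are illegal in a name with _xHHHH_ escape sequences.
    public static string ToSafeName(string text)
    {
        return XmlConvert.EncodeName(text);
    }
}
```

For example, XmlNameDemo.IsValidName("Price(USD)") returns false, and XmlNameDemo.ToSafeName("Price(USD)") returns "Price_x0028_USD_x0029_". XmlConvert.DecodeName reverses the encoding.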
We have a website, and one of our pages has a form with a textarea and radio buttons that submits via a jQuery $.ajax() call to a web handler (ashx). This handler runs an SQL UPDATE and writes into an XML-typed field. The text entered in the textarea is written to one XML node, and the radio button values are written to other nodes of this field. I remove illegal XML characters from the entered text before submitting. In the handler I make another attempt to remove illegal characters with WebUtility.HtmlEncode().
My problem is that in some circumstances (which I wasn't able to identify), submitting the client's text and chosen radio button results in this error:
SqlException: XML parsing: line 28, character 80, illegal xml character
The line number varies between 28, 29, 30, and 31. These lines correspond to the XML node that is filled by submitting this form.
The error happens on the cmd.ExecuteNonQuery() line.
I think the key is "character 80". The illegal character should not be in the entered text, since the XML parsing error always happens at "character 80"; if it were in the entered text, its position would change from error to error. I also tried all the radio buttons, and none of them results in the error.
Here is the XML from a successful update of this page. What do you think character 80 refers to?
<Details xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<hasAwardPenalty>true</hasAwardPenalty>
<TranslatorPayment>177800</TranslatorPayment>
<TranslatorPaids />
<ProofreaderPayment>53340</ProofreaderPayment>
<FileReplace>
<FileStatus>NONE</FileStatus>
<AddedTime>0001-01-01T00:00:00</AddedTime>
<UploadTime>0001-01-01T00:00:00</UploadTime>
<AffectOnPayment>false</AffectOnPayment>
<AffectOnScore>false</AffectOnScore>
</FileReplace>
<PaymentDetails>
<AddedTime>2015-02-05T12:02:47.5618565+03:30</AddedTime>
<PaymentCode>2be92023-9e69-4215-8394-1b81f5b7fc51</PaymentCode>
<PaymentId>60508</PaymentId>
<BankResponse>تراکنش موفق</BankResponse>
<BankName>PASARGAD</BankName>
<Amount>362700</Amount>
<Status>PAID</Status>
<AuthorityCode>6653537</AuthorityCode>
<Type>SHETAB</Type>
<OrderId>138587</OrderId>
</PaymentDetails>
<MyProperty>0</MyProperty>
<RequestDate xsi:nil="true" />
<TranslationPurpose>
<Id>aa8cf8be-2e7c-42d7-8208-1721bb07299c</Id>
<TargetCategory>OTHERS</TargetCategory>
<TargetDescription>سایر</TargetDescription>
<PublicationMethod>PERSONAL</PublicationMethod>
<Tone>Formal</Tone>
<Keynote>FluidityAndLoyality</Keynote>
<GuidLines>با سلام و احترام و تشکر از زحمات شما لطفا مطابق رزومه جهت کافرمایان خارجی تهیه شود.</GuidLines>
<References />
<Modified>true</Modified>
<AddedTime>2015-02-05T12:18:24.6859596+03:30</AddedTime>
</TranslationPurpose>
</Details>
UPDATE: Do you think that the Windows language setting (Control Panel --> Language --> Change Date, Time & Number --> Administrative --> Language for non-Unicode programs) and IIS globalization (ASP.NET --> .NET Globalization --> File) have any effect on this problem?
The XmlSerializer class generates XML that can contain invalid XML characters (according to XML 1.0 standard). In particular, the control characters in the ASCII / Unicode range from U+0001 to U+001F (except for U+0009, U+000A and U+000D) are encoded as numeric entities by XmlSerializer but are illegal.
SQL Server does not accept illegal XML characters such as in this XML snippet:
<TargetDescription>abc&#x1F;def</TargetDescription>
So to fix it, you can clean all strings by removing these illegal characters:
public static class XmlHelper
{
// Control characters U+0001 to U+001F that XML 1.0 forbids (tab, LF and CR are allowed).
static readonly char[] IllegalXmlCharacters = new char[] {
'\u0001', '\u0002', '\u0003', '\u0004', '\u0005', '\u0006', '\u0007',
'\u0008', '\u000b', '\u000c', '\u000e', '\u000f', '\u0010', '\u0011',
'\u0012', '\u0013', '\u0014', '\u0015', '\u0016', '\u0017', '\u0018',
'\u0019', '\u001a', '\u001b', '\u001c', '\u001d', '\u001e', '\u001f'
};
// Public so that it can be called from outside the class.
public static string RemoveIllegalXmlCharacters(string value)
{
// Split on every illegal character and join the remaining pieces back together.
string[] validParts = value.Split(IllegalXmlCharacters, StringSplitOptions.RemoveEmptyEntries);
return String.Join("", validParts);
}
}
To clean a string, just call the static method:
var cleanString = XmlHelper.RemoveIllegalXmlCharacters(dirtyString);
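The list above covers U+0001 to U+001F only. As a possible variation, the sketch below filters against the full Char production of XML 1.0, so it also drops U+0000, U+FFFE, U+FFFF and unpaired surrogates. The class and method names are hypothetical, and this is an alternative to the XmlHelper approach rather than a replacement the original answer proposed.

```csharp
using System;
using System.Text;

public static class XmlText
{
    // True for BMP characters allowed by the XML 1.0 Char production.
    static bool IsLegalXmlChar(char c)
    {
        return c == '\t' || c == '\n' || c == '\r' ||
               (c >= '\u0020' && c <= '\uD7FF') ||
               (c >= '\uE000' && c <= '\uFFFD');
    }

    public static string StripIllegalXmlChars(string value)
    {
        if (string.IsNullOrEmpty(value)) return value;

        var sb = new StringBuilder(value.Length);
        for (int i = 0; i < value.Length; i++)
        {
            char c = value[i];
            if (char.IsHighSurrogate(c) && i + 1 < value.Length && char.IsLowSurrogate(value[i + 1]))
            {
                // A well-formed surrogate pair encodes U+10000 or above, which XML allows.
                sb.Append(c).Append(value[i + 1]);
                i++;
            }
            else if (IsLegalXmlChar(c))
            {
                sb.Append(c);
            }
            // Anything else (control characters, lone surrogates) is dropped.
        }
        return sb.ToString();
    }
}
```

Calling XmlText.StripIllegalXmlChars("abc\u001Fdef") returns "abcdef", while tabs and line breaks are kept.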
I finally managed to fix the problem by talking to the customer who hit this error and asking him which options he had selected on this page.
I found an illegal character (U+001F) in one of our HTML input values.
I'm working on a problem with the IMAP SORT extension:
My command is the following:
var query = "icône";
//byte[] bytes = Encoding.Default.GetBytes(query);
//query = Encoding.UTF8.GetString(bytes);
var command = "SORT (REVERSE ARRIVAL) UTF-8 " + "{" + query.Length + "}";
var imapAnswerString = client.Command(command);
imapAnswerString = client.Command(query);
I get the following error:
BAD Error in IMAP command SORT: 8bit data in atom
I found this:
C# Imap search command with special characters like á,é
But I don't see how to prepare my code to send this request successfully.
If you want to stick with MailSystem.NET, the answer that arnt gave is correct.
However, as I point out here (and below for convenience), MailSystem.NET has a lot of architectural design problems that make it unusable.
If you use an alternative open source library, like MailKit, you'd accomplish this search query far more easily:
var query = SearchQuery.BodyContains ("icône");
var orderBy = new OrderBy[] { OrderBy.ReverseArrival };
var results = folder.Search (query, orderBy);
Hope that helps.
Architectural problems in MailSystem.NET include:
MailSystem.NET does not properly handle literal tokens - either sending them (for anything other than APPEND) or for receiving them (for anything other than the actual message data in a FETCH request). What none of the authors seem to have noticed is that a server may choose to use literals for any string response.
What does this mean?
It means that the server may choose to respond to a LIST command using a literal for the mailbox name.
It means that any field in a BODYSTRUCTURE may be a literal and does not have to be a quoted-string like they all assume.
(and more...)
MailSystem.NET, for example, also does not properly encode or quote mailbox names:
Example from MailSystem.NET:
public string RenameMailbox(string oldMailboxName, string newMailboxName)
{
string response = this.Command("rename \"" + oldMailboxName + "\" \"" + newMailboxName + "\"");
return response;
}
This deserves a Jean-Luc Picard and Will Riker face-palm. This code just blindly puts double-quotes around the mailbox name. This is wrong for at least 2 reasons:
What if the mailbox name has any double quotes or backslashes? It needs to escape them with \'s.
What if the mailboxName has non-ASCII characters or an &? It needs to encode the name using a modified version of the UTF-7 character encoding.
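The first point can be sketched as follows. This is a minimal illustration with hypothetical names, not MailSystem.NET or MailKit code. It only covers the backslash and double-quote escaping, and it assumes the name contains no CR/LF and has already been encoded with modified UTF-7 where needed.

```csharp
using System;
using System.Text;

public static class ImapQuoting
{
    // Wraps a value in double quotes, escaping embedded backslashes and double quotes.
    public static string Quote(string value)
    {
        var sb = new StringBuilder("\"");
        foreach (char c in value)
        {
            if (c == '\\' || c == '"')
            {
                // Both special characters are escaped with a preceding backslash.
                sb.Append('\\');
            }
            sb.Append(c);
        }
        return sb.Append('"').ToString();
    }
}
```

A rename command built this way would use "rename " + ImapQuoting.Quote(oldMailboxName) + " " + ImapQuoting.Quote(newMailboxName) instead of concatenating raw quotes.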
Most (all?) of the .NET IMAP clients I could find read the entire response from the server into 1 big string and then try and parse the response with some combination of regex, IndexOf(), and Substring(). What makes things worse is that most of them were also written by developers that don't know the difference between unicode character counts (i.e. string.Length) and octets (i.e. byte counts), so when they try to parse a response to a FETCH request for a message, they do this after parsing the "{}" value in the first line of the response:
int startIndex = response.IndexOf ("}") + 3;
int endIndex = startIndex + octets;
string msg = response.Substring (startIndex, endIndex - startIndex);
The MailSystem.NET developers obviously got bug reports about this not working for international mails, so their "fix" was to do this:
public string Body(int messageOrdinal)
{
this.ParentMailbox.SourceClient.SelectMailbox(this.ParentMailbox.Name);
string response = this.ParentMailbox.SourceClient.Command("fetch "+messageOrdinal.ToString()+" body", getFetchOptions());
return response.Substring(response.IndexOf("}")+3,response.LastIndexOf(" UID")-response.IndexOf("}")-7);
}
Essentially, they assume that the UID key/value pair will come after the message and use that as a hack-around for their incompetence. Unfortunately, adding more incompetence to existing incompetence only multiplies the incompetence, it doesn't actually fix it.
The IMAP specification specifically states that the order of the results can vary and that they may not even be in the same untagged response.
Not only that, but their FETCH request doesn't even request the UID value from the server, so it's up to the server whether to return it or not!
TL;DR
How to Evaluate an IMAP Client Library
The first thing you should do when evaluating an IMAP client library implementation is to see how they parse responses. If they don't use an actual tokenizer, you can tell right off the bat that the library was written by people who have no clue what they are doing. That is the most sure-fire warning sign to STAY AWAY.
Does the library handle untagged ("*") responses in a central place (such as their command pipeline)? Or does it do something retarded like try and parse it in every single method that sends a command (e.g. ImapClient.SelectFolder(), ImapClient.FetchMessage(), etc)? If the library doesn't handle it in a central location that can properly deal with these untagged responses and update state (and notify you of important things like EXPUNGE's), STAY AWAY.
If the library reads the entire response (or even just the "message") into a System.String, STAY AWAY.
You're almost there. Your final command should be something like
x sort (reverse arrival) utf-8 subject {6+}
icône
i.e., you're just missing a search key that tells the IMAP server where to search for icône before it sorts the results. There are many other search keys, not just subject. See RFC3501 page 49 and following pages.
Edit: The + is needed after the 6 in order to send that as a single command (but requires that the server support the LITERAL+ extension). If the server doesn't support LITERAL+, then you will need to break up your command into multiple segments, like so:
C: a001 SORT (REVERSE ARRIVAL) UTF-8 SUBJECT {6}
S: + ok, whenever you are ready...
C: icône
S: ... <response goes here>
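The number in braces is the size of the literal in octets, not in characters, which is why it is 6 for the five-character string icône when sent as UTF-8. A small sketch of building that first line is shown here. The class, method and parameter names are hypothetical, and the tag and the SUBJECT key are taken from the example above.

```csharp
using System;
using System.Text;

public static class ImapLiteral
{
    // Builds the first line of a SORT command that announces a literal of the right size.
    public static string BuildSortLine(string tag, string searchText, bool literalPlus)
    {
        // Count UTF-8 bytes, because string.Length counts UTF-16 code units instead.
        int octets = Encoding.UTF8.GetByteCount(searchText);
        string marker = literalPlus ? "+" : "";
        return tag + " SORT (REVERSE ARRIVAL) UTF-8 SUBJECT {" + octets + marker + "}";
    }
}
```

ImapLiteral.BuildSortLine("a001", "ic\u00F4ne", false) returns "a001 SORT (REVERSE ARRIVAL) UTF-8 SUBJECT {6}". The search text itself is then sent as UTF-8 bytes on the next line, after the server's "+" continuation unless LITERAL+ is in use.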
Thanks all for your answer.
Basically the way MailSystem.net sends requests (Command method) is the crux of this problem, and some others actually.
The command method should be corrected as follows:
First, when sending the request to imap, the following code works better than the original one:
// Encode the command as UTF-8 once and write the bytes directly.
var myCommand = stamp + ((stamp.Length > 0) ? " " : "") + command + "\r\n";
var bytesUtf8 = Encoding.UTF8.GetBytes(myCommand);
#if !PocketPC
if (this._sslStream != null)
{
this._sslStream.Write(bytesUtf8, 0, bytesUtf8.Length);
}
else
{
// Use the byte count, not the character count: a non-ASCII character takes more than one byte in UTF-8.
base.GetStream().Write(bytesUtf8, 0, bytesUtf8.Length);
}
#endif
#if PocketPC
base.GetStream().Write(bytesUtf8, 0, bytesUtf8.Length);
#endif
Then, in the same method, to avoid some deadlock or wrong exceptions, improve the validity tests as follows:
if (temp.StartsWith(stamp) || temp.ToLower().StartsWith("* " + command.Split(' ')[0].ToLower()) || (temp.StartsWith("+ ") && options.IsPlusCmdAllowed) || temp.Contains("BAD Error in IMAP command"))
{
lastline = temp;
break;
}
Finally, update the return condition as follows:
if (lastline.StartsWith(stamp + " OK") || temp.ToLower().StartsWith("* " + command.Split(' ')[0].ToLower()) && !options.IsSecondCallCommand || temp.ToLower().StartsWith("* ") && options.IsSecondCallCommand && !temp.Contains("BAD") || temp.StartsWith("+ "))
return bufferString;
With this change, all commands work fine, including double-call commands. There are fewer side effects than with the original code.
This resolved most of my problems.
I have an issue with a filename that contains the & character.
I have to get an image named Test&TestAgain.jpg. In an ASP.NET MVC3 application, on the view side, I put
Url.Content( "~/Content/Images/"+ filename );
In Chrome, I see error 400 Bad Request because the filename contains &.
I think that & is used for the query string, and the browser interprets the file name as a query string.
Because the filename doesn't contain a ?, that error is raised. (A potentially dangerous Request.Path value was detected from the client (&).)
Is there a way (workaround) to fix this without replacing that character?
You need to UrlEncode the path:
Url.Content("~/Content/Images/" + HttpUtility.UrlEncode(filename));
Edit:
It seems this doesn't work. The only way I can think of right now is to allow restricted characters and enable VerificationCompatibility by setting it in the registry:
// 32-bit IIS
HKLM\SOFTWARE\Microsoft\ASP.NET VerificationCompatibility = 1
// 64-bit IIS
HKLM\SOFTWARE\Wow6432Node\Microsoft\ASP.NET VerificationCompatibility = 1
I have a PostgreSQL database, which uses character encoding WIN1252.
When querying the database, some records will produce an error when trying to read the data, because it is trying to convert it to UTF8. This happens on some foreign names containing certain non-Latin characters.
The error is:
ERROR: 22P05: character with byte sequence 0x81 in encoding "WIN1252" has no equivalent in encoding "UTF8"
It happens when I call Read() on the NpgsqlDataReader.
My connection is defined as:
new NpgsqlConnection("Server=127.0.0.1;Port=5432;Database=xyz;User Id=****;Password=****;");
What can I do to read this data using C#?
I've managed to solve the problem. There is no way of setting the property in the connection string or any of the properties of the NpgsqlConnection or NpgsqlCommand.
However, I was able to set the value of client_encoding in a query. So directly after opening the connection I first executed the (non)query:
set client_encoding = 'WIN1252'
After that, any subsequent command on the same connection used the proper encoding and returned the results without complaints.
I tried to change the connection string, but I had no luck with that.
The problem was solved by changing the database settings file and reloading it.
So I started pgAdmin and executed
SHOW config_file;
which gave me
C:/Program Files/PostgreSQL/14/data/postgresql.conf
In this file I changed lc_messages from lang_language.1252 to UTF8.
After that I reloaded the config in pgAdmin by right-clicking the server name and pressing "Reload Configuration".
All settings are now set to UTF8 and it just worked fine.
lc_messages = 'UTF8' # locale for system error message
# strings
lc_monetary = 'UTF8' # locale for monetary formatting
lc_numeric = 'UTF8' # locale for number formatting
lc_time = 'UTF8' # locale for time formatting
...
How do I provide an input string with automatic escaping to a console application?
I mean inside my code, I can do
public static void Main(string[] args)
{
string myFolder;
myFolder = @"C:\temp\january\"; //just for testing
myFolder = args[0]; // I want to do this eventually
}
How can I provide values to myFolder without me having to escape it manually via command line?
If possible, I want to avoid calling this app like this:
C:\test> myapplication.exe "C:\\temp\\january\\"
EDIT:
instead I'd prefer calling the app like this if possible
C:\test> myapplication.exe @"C:\temp\january\"
Thank you.
EDIT:
This is actually for a console application that calls Sharepoint Web services. I tried
string SourceFileFullPath, SourceFileName, DestinationFolder, DestinationFullPath;
//This part didn't work. Got Microsoft.SharePoint.SoapServer.SoapServerException
//SourceFileFullPath = args[0]; // C:\temp\xyz.pdf
//SourceFileName = args[1]; // xyz.pdf
//DestinationFolder = args[2]; // "http://myserver/ClientX/Performance" Reports
//This worked.
SourceFileFullPath = @"C:\temp\TestDoc2.txt";
SourceFileName = @"TestDoc2.txt";
DestinationFolder = @"http://myserver/ClientX/Performance Reports";
DestinationFullPath = string.Format("{0}/{1}", DestinationFolder, SourceFileName);
The requirement to escape \ inside a string that is not a verbatim string (one that starts with @) is a C# feature. When you start your application from a console, you are outside of C#, and the console does not consider \ to be a special character, so C:\test> myapplication.exe "C:\temp\january" will work.
Edit: My original post had "C:\temp\january\" above; however, the Windows command line also seems to treat \ as an escape character, but only when it is in front of a ", so that command would pass C:\temp\january" to the application. Thanks to @zimdanen for pointing this out.
Please note that whatever you put between quotes in C# is a representation of a string; the actual string may be different - for instance, \\ represents a single \. If you use other means to get strings into the program, such as the command line arguments or by reading from a file, the strings do not need to follow C#'s rules for string literals. The command line has different rules for representation, in which a \ represents itself.
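The difference between a literal's representation and the actual string can be sketched like this. The class and method names are hypothetical. The Normalize helper is only one possible way to deal with the stray trailing quote described in the edit above, and it assumes the folder is not a drive root such as C:\ (where the trailing backslash matters).

```csharp
using System;

public static class FolderArgs
{
    // Both literals describe the same string; only the source representation differs.
    public static bool LiteralsMatch()
    {
        string verbatim = @"C:\temp\january";   // verbatim literal, backslashes taken as-is
        string escaped = "C:\\temp\\january";   // regular literal, each \\ is one backslash
        return verbatim == escaped;
    }

    // Trims a trailing quote or backslash that the command line may leave on an argument.
    public static string Normalize(string rawArgument)
    {
        return rawArgument.TrimEnd('"', '\\');
    }
}
```

FolderArgs.LiteralsMatch() returns true, and FolderArgs.Normalize applied to an argument that arrived as C:\temp\january" returns C:\temp\january. Inside Main, this would be called as FolderArgs.Normalize(args[0]).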
"The prefix "@" enables the use of keywords as identifiers, which is useful when interfacing with other programming languages. The character @ is not actually part of the identifier, so the identifier might be seen in other languages as a normal identifier, without the prefix. An identifier with an @ prefix is called a verbatim identifier. Use of the @ prefix for identifiers that are not keywords is permitted, but strongly discouraged as a matter of style."
1. You can use one of the reserved words of C# as an identifier with the @ symbol, e.g.:
string @int = "senthil kumar";
string @class = "MCA";
2. Before a string, especially when using file paths:
string filepath = @"D:\SENTHIL-DATA\myprofile.txt";
instead of
string filepath = "D:\\SENTHIL-DATA\\myprofile.txt";
3. For multi-line text:
string ThreeIdiots = @"Senthil Kumar,
Norton Stanley,
and Pavan Rao!";
MessageBox.Show(ThreeIdiots);
instead of
string ThreeIdiots = "Senthil Kumar,\n Norton Stanley,\n and Pavan Rao!";