I'm using the Xamarin SQLite.NET PCL package, and I have a DB with 1 million records.
The encoding of my source DB is UTF-8. For best performance, I chose an FTS engine. Everything works fine, except for text with Cyrillic characters.
For example, the following simple query:
"SELECT name FROM customers WHERE name MATCH '"+filter+"*' LIMIT 50";
Here filter is a string whose value depends on what the user types on the keyboard (it's like a dynamic search).
So the problem is that I get an SQLite exception when filter contains a Cyrillic value:
SQLite.Net.SQLiteException: near " ": syntax error
I tried something like this (encoding my query string to UTF-8):
string Query = String.Format("SELECT name FROM customer WHERE name MATCH '{0}*' LIMIT 50;",filter);
Encoding encod = Encoding.UTF8;
byte[] bts = encod.GetBytes(Query);
encod = Encoding.Unicode;
string Output = encod.GetString(bts, 0, bts.Length);
But the result is the same!
Any suggestions? What can I do?
By now you have surely found a solution, but for anyone else, this is what the query with SQLite.Net PCL should look like with parameters:
string query = "SELECT name FROM customers WHERE name MATCH @param1 LIMIT 50";
string param = "'your unicode characters'";
conn.Execute(query, param);
If you have more parameters in your query (e.g. @param1, @param2), you separate them with commas, like this:
string param = "'your unicode characters','your second parameter'";
I have a Student table with two fields (id: number, name: 10 char).
Example values of the name column are: 'William[3 space]', 'Ethan[5 space]'
(Spaces are added to reach the max length.)
The queries below work fine (hard-coded or with string interpolation):
select * from Student where name = 'William'
or select * from Student where name = 'William '
But when I use a parameter like below, it doesn't work:
select * from Student where name = :Name
and then inject the parameters
var result = ctx.ExecuteStatement(query, new { Name = name })
So when name = 'William ', it works.
But when name = 'William', it doesn't work.
=> I want it to work in both cases. Please help me address the issue.
My temporary solution is to trim the column before comparing. But I think that is just a workaround and does not completely resolve the problem, since Oracle automatically ignores the trailing whitespace (as shown in my first sample):
select * from Student where trim(name) = :Name
Do not use CHAR as the datatype for strings of variable length. Use VARCHAR2 instead.
You could add the whitespace yourself:
name = name.PadRight(10);
PadRight left-aligns the text and pads the string with spaces on the right until it reaches the given length.
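As a minimal sketch of the padding approach, assuming the column is CHAR(10) as in the question: `CharPadding` and `ToFixedWidth` are hypothetical names used only for illustration, and the width constant should be adjusted to the real column definition.

```csharp
using System;

public static class CharPadding
{
    // Width of the CHAR column from the question (name: 10 char).
    // This is an assumption; change it to match your schema.
    public const int NameWidth = 10;

    // Pads a value with trailing spaces so it has the same length as the
    // blank-padded value stored in a CHAR(10) column.
    // Values already at or beyond the width are returned unchanged.
    public static string ToFixedWidth(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        return value.PadRight(NameWidth);
    }
}
```

The padded value would then be passed as the parameter, for example `new { Name = CharPadding.ToFixedWidth(name) }`, so the bound value has the same length as the stored one.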
I have an existing SQL Server 2000 database that stores UTF-8 representations of text in a TEXT column. I don't have the option of modifying the type of the column, and must be able to store non-ASCII Unicode data from a C# program into that column.
Here's the code:
sqlcmd.CommandText =
"INSERT INTO Notes " +
"(UserID, LocationID, Note) " +
"VALUES (" +
Note.UserId.ToString() + ", " +
Note.LocationID.ToString() + ", " +
"@note); " +
"SELECT CAST(SCOPE_IDENTITY() AS BIGINT) ";
SqlParameter noteparam = new SqlParameter( "@note", System.Data.SqlDbType.Text, int.MaxValue );
At this point I've tried a few different ways to get my UTF-8 data into the parameter. For example:
// METHOD ONE
byte[] bytes = (byte[]) Encoding.UTF8.GetBytes( Note.Note );
char[] characters = bytes.Select( b => (char) b ).ToArray();
noteparam.Value = new String( characters );
I've also tried simply
// METHOD TWO
noteparam.Value = Note.Note;
And
// METHOD THREE
byte[] bytes = (byte[]) Encoding.UTF8.GetBytes( Note.Note );
noteparam.Value = bytes;
Continuing, here's the rest of the code:
sqlcmd.Parameters.Add( noteparam );
sqlcmd.Prepare();
try
{
Note.RecordId = (Int64) sqlcmd.ExecuteScalar();
}
catch
{
return false;
}
Method one (get UTF8 bytes into a string) does something strange -- I think it is UTF-8 encoding the string a second time.
Method two stores garbage.
Method three throws an exception in ExecuteScalar() claiming it can't convert the parameter to a String.
Things I already know, so no need telling me:
SQL Server 2000 is past/approaching end-of-life
TEXT columns are not meant for Unicode text
Seriously, SQL Server 2000 is old. You need to upgrade.
Any suggestions?
If your database collation is SQL_Latin1_General_CP1 (the default for the U.S. edition of SQL Server 2000), then you can use the following trick to store Unicode text as UTF-8 in a char, varchar, or text column:
byte[] bytes = Encoding.UTF8.GetBytes(Note.Note);
noteparam.Value = Encoding.GetEncoding(1252).GetString(bytes);
Later, when you want to read back the text, reverse the process:
SqlDataReader reader;
// ...
byte[] bytes = Encoding.GetEncoding(1252).GetBytes((string)reader["Note"]);
string note = Encoding.UTF8.GetString(bytes);
If your database collation is not SQL_Latin1_General_CP1, then you will need to replace 1252 with the correct code page.
Note: If you look at the stored text in Enterprise Manager or Query Analyzer, you'll see strange characters in place of non-ASCII text, just as if you opened a UTF-8 document in a text editor that didn't support Unicode.
How it works: When storing Unicode text in a non-Unicode column, SQL Server automatically converts the text from Unicode to the code page specified by the database collation. Any Unicode characters that don't exist in the target code page will be irreversibly mangled, which is why your first two methods didn't work.
But you were on the right track with method one. The missing step is to "protect" the raw UTF-8 bytes by converting them to Unicode using the Windows-1252 code page. Now, when SQL Server performs the automatic conversion from Unicode to Windows-1252, it gets back the original UTF-8 bytes untouched.
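The round trip described above can be sketched as a pair of pure helpers that do not touch the database. This is only an illustration under stated assumptions: `Utf8InAnsiColumn`, `Protect` and `Restore` are hypothetical names, the code page is assumed to be 1252, and the `RegisterProvider` call is assumed to be needed because the code targets .NET Core / .NET 5+, where code page 1252 is not available by default.

```csharp
using System;
using System.Text;

public static class Utf8InAnsiColumn
{
    // Code page of the column's collation; 1252 matches SQL_Latin1_General_CP1.
    // Replace it if your database uses a different collation.
    private const int ColumnCodePage = 1252;

    private static Encoding GetColumnEncoding()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before Encoding.GetEncoding(1252) succeeds.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(ColumnCodePage);
    }

    // Turns text into a string whose characters are the UTF-8 bytes of the
    // text, each interpreted as one character of the column code page.
    public static string Protect(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        return GetColumnEncoding().GetString(utf8);
    }

    // Reverses Protect: maps the stored characters back to bytes using the
    // column code page, then decodes those bytes as UTF-8.
    public static string Restore(string stored)
    {
        byte[] utf8 = GetColumnEncoding().GetBytes(stored);
        return Encoding.UTF8.GetString(utf8);
    }
}
```

`Protect(Note.Note)` would be assigned to the parameter value before the insert, and `Restore((string)reader["Note"])` applied after reading the column back.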
Original String is:
csrfToken=ajax:1238044988226892967&postTitle=Job Openings Linux Systems Administrator
Staff&postText=Security Clearance: Public Trust -- Linux systems administration experience specifically in managing or supporting RedHat and/or Centos Linux in...&pollChoice1-
ANetPostForm=&pollChoice2-ANetPostForm=&pollChoice3-ANetPostForm=&pollChoice4-ANetPostForm=&pollChoice5-ANetPostForm=&pollEndDate-ANetPostForm=0&contentImageCount=1&contentImageIndex=0&
contentImage=http://www.ideal-jobs.net/images/image070.jpg&contentEntityID=5637974394992087135&contentUrl=http%3a%2f%2fwww.ideal-jobs.net%2fjob-openings-
linux-systems-administrator-staff%2f&contentTitle=Job Openings Linux Systems Administrator
Staff&contentSummary=Security Clearance: Public Trust -- Linux systems administration experience specifically in managing or supporting RedHat and/or Centos Linux in...&contentImageIncluded=true&%23=Save&tweet=&postItem=Share&gid=50565&ajax=true&tetherAccountID=&facebookTetherID=
The string I want after encoding:
csrfToken=ajax%3A6293994705950333071&postTitle=hello&postText=Hi%20everyone%20hae%20a%20good%20day%20%2C%20i%20am%20new%20to%20this%20%3A)&
pollChoice1-ANetPostForm=&pollChoice2-ANetPostForm=&pollChoice3-ANetPostForm=&pollChoice4-
ANetPostForm=&pollChoice5-ANetPostForm=&pollEndDate-ANetPostForm=0&contentImageCount=0&contentImageIndex=-1&contentImage=&contentEntityID=&contentUrl=&contentTitle=&
contentSummary=&contentImageIncluded=true&%23=&gid=163857&postItem=&ajax=true&tetherAccountID=&facebookTetherID=
And currently I am using:
byte[] byteData = HttpUtility.UrlEncodeToBytes(postData);
and I am getting this string (as seen in Fiddler):
csrfToken%3dajax%3a1238044988226892967%26postTitle%3dJob+Openings+Linux+Systems+Administrat
or+Staff%26postText%3dSecurity+Clearance%3a+Public+Trust+--+Linux+systems+administration+ex
perience+specifically+in+managing+or+supporting+RedHat+and%2for+Centos+Linux+in...%26pollCh
oice1-ANetPostForm%3d%26pollChoice2-ANetPostForm%3d%26pollChoice3-ANetPostForm
%3d%26pollChoice4-ANetPostForm%3d%26pollChoice5-ANetPostForm%3d%26pollEndDate-
ANetPostForm%3d0%26contentImageCount%3d1%26contentImageIndex%3d0%26contentImage%3dhttp
%3a%2f%2fwww.ideal-
jobs.net%2fimages%2fimage070.jpg%26contentEntityID%3d5637974394992087135%26contentUrl%3dhtt
p%253a%252f%252fwww.ideal-jobs.net%252fjob-openings-linux-systems-administrator-
staff%252f%26contentTitle%3dJob+Openings+Linux+Systems+Administrator+Staff%26contentSummary
%3dSecurity+Clearance%3a+Public+Trust+--+Linux+systems+administration+experience+specifical
ly+in+managing+or+supporting+RedHat+and%2for+Centos+Linux+in...%26contentImageIncluded%3dtr
ue%26%2523%3dSave%26tweet
%3d%26postItem%3dShare%26gid%3d50565%26ajax%3dtrue%26tetherAccountID
%3d%26facebookTetherID%3d
ALSO TRIED:
UTF8Encoding encoding = new UTF8Encoding();
AND
byte[] byteData = HttpUtility.UrlEncodeUnicodeToBytes(postData);
Still no luck..
Thank you
The problem is you're URL encoding the whole string, including the delimiter characters & and =. You first need to parse the string into fields, then url encode just the field names and values and finally recombine into a string.
Give this a try:
string input; // Your input string
List<string> outputs = new List<string>();
// Parse the original string
NameValueCollection parms = HttpUtility.ParseQueryString(input);
// Loop over each item, url encoding
foreach (string key in parms.AllKeys) {
foreach (string val in parms.GetValues(key))
outputs.Add(HttpUtility.UrlEncode(key) + "=" + HttpUtility.UrlEncode(val));
}
// combine the encoded strings, joining with &
string result = string.Join("&", outputs); // the final result
EDIT
Here is a simpler version I figured out while trying out my previous idea:
string result = HttpUtility.ParseQueryString(postData).ToString();
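For comparison, here is a minimal sketch of the same split, encode, recombine idea without `System.Web`, using `Uri.EscapeDataString`, which writes spaces as `%20` as in the desired output from the question. `FormEncoder` and `EncodeFields` are hypothetical names. The sketch assumes fields are separated by `&` and that the first `=` in each field separates name from value. It decodes each part first so that values that are already percent-encoded (such as contentUrl in the question) are not encoded twice.

```csharp
using System;
using System.Linq;

public static class FormEncoder
{
    // Encodes only the names and values of a form body, leaving the
    // '&' and '=' delimiters between them untouched.
    public static string EncodeFields(string postData)
    {
        if (string.IsNullOrEmpty(postData)) return string.Empty;

        var pairs = postData.Split('&').Select(pair =>
        {
            // Split on the first '=' only; a value may itself contain '='.
            int eq = pair.IndexOf('=');
            string name = eq < 0 ? pair : pair.Substring(0, eq);
            string value = eq < 0 ? string.Empty : pair.Substring(eq + 1);
            return Encode(name) + "=" + Encode(value);
        });

        return string.Join("&", pairs);
    }

    // Decode first so already-encoded input is normalized, then encode.
    private static string Encode(string part)
    {
        return Uri.EscapeDataString(Uri.UnescapeDataString(part));
    }
}
```

Note that a raw `&` or `=` inside a value cannot be distinguished from a delimiter once the fields have been joined, so where possible it is more robust to encode each value before building the string.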
I have a table that needs to handle various characters, including Ø, ® etc.
I have set my table to utf-8 as the default collation, all columns use table default, however when I try to insert these characters I get error: Incorrect string value: '\xEF\xBF\xBD' for column 'buyerName' at row 1
My connection string is defined as
string mySqlConn = "server="+server+";user="+username+";database="+database+";port="+port+";password="+password+";charset=utf8;";
I am at a loss as to why I am still seeing errors. Have I missed anything with either the .net connector, or with my MySQL setup?
--Edit--
My (new) C# insert statement looks like:
MySqlCommand insert = new MySqlCommand( "INSERT INTO fulfilled_Shipments_Data " +
"(amazonOrderId,merchantOrderId,shipmentId,shipmentItemId,"+
"amazonOrderItemId,merchantOrderItemId,purchaseDate,"+ ...
VALUES (@amazonOrderId,@merchantOrderId,@shipmentId,@shipmentItemId,"+
"@amazonOrderItemId,@merchantOrderItemId,@purchaseDate,"+
"paymentsDate,shipmentDate,reportingDate,buyerEmail,buyerName,"+ ...
insert.Parameters.AddWithValue("@amazonorderId",lines[0]);
insert.Parameters.AddWithValue("@merchantOrderId",lines[1]);
insert.Parameters.AddWithValue("@shipmentId",lines[2]);
insert.Parameters.AddWithValue("@shipmentItemId",lines[3]);
insert.Parameters.AddWithValue("@amazonOrderItemId",lines[4]);
insert.Parameters.AddWithValue("@merchantOrderItemId",lines[5]);
insert.Parameters.AddWithValue("@purchaseDate",lines[6]);
insert.Parameters.AddWithValue("@paymentsDate",lines[7]);
insert.ExecuteNonQuery();
Assuming that this is the correct way to use parametrized statements, it is still giving an error
"Incorrect string value: '\xEF\xBF\xBD' for column 'buyerName' at row 1"
Any other ideas?
\xEF\xBF\xBD is the UTF-8 encoding of the Unicode character U+FFFD. This is a special character, also known as the "replacement character". A quote from the Wikipedia page about the Unicode Specials block:
The replacement character � (often a black diamond with a white question mark) is a symbol found in the Unicode standard at codepoint U+FFFD in the Specials table. It is used to indicate problems when a system is not able to decode a stream of data to a correct symbol. It is most commonly seen when a font does not contain a character, but is also seen when the data is invalid and does not match any character:
So it looks like your data source contains corrupted data. It is also possible that you are reading the data using the wrong encoding. Where do the lines come from?
If you can't fix the data, and your input indeed contains invalid characters, you could just remove the replacement characters:
lines[n] = lines[n].Replace("\uFFFD", "");
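To illustrate the "wrong encoding" case, here is a small sketch showing how U+FFFD can appear when bytes written in a single-byte encoding are decoded as UTF-8. The source of the lines is not known, so ISO-8859-1 is only an assumed example of the real encoding, and `ReplacementCharDemo` with its methods is a hypothetical helper.

```csharp
using System;
using System.Text;

public static class ReplacementCharDemo
{
    // True if the text contains U+FFFD, i.e. some bytes could not be decoded.
    public static bool HasReplacementChar(string text)
    {
        return text.IndexOf('\uFFFD') >= 0;
    }

    // Decodes raw bytes as UTF-8; invalid byte sequences become U+FFFD.
    public static string DecodeAsUtf8(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw);
    }

    // Decodes raw bytes as ISO-8859-1, where every byte maps to one character.
    // Assumption: this stands in for whatever encoding the data really uses.
    public static string DecodeAsLatin1(byte[] raw)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(raw);
    }
}
```

For the bytes `0xD8 0x73 0x74` (the text "Øst" in ISO-8859-1), `DecodeAsUtf8` yields a string for which `HasReplacementChar` is true, while `DecodeAsLatin1` recovers the original text. If the lines are read from a file, passing the correct `Encoding` to the reader is the real fix; stripping U+FFFD only hides the data loss.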
Mattmanser is right: never write an SQL query by concatenating the parameters directly into the query. An example of a parameterized query is:
string lastname = "Doe";
double height = 6.1;
DateTime date = new DateTime(1978,4,18);
var connection = new MySqlConnection(connStr);
try
{
connection.Open();
var command = new MySqlCommand(
"SELECT * FROM tblPerson WHERE LastName = @Name AND Height > @Height AND BirthDate < @BirthDate", connection);
command.Parameters.AddWithValue("@Name", lastname);
command.Parameters.AddWithValue("@Height", height);
command.Parameters.AddWithValue("@BirthDate", date);
MySqlDataReader reader = command.ExecuteReader();
...
}
finally
{
connection.Close();
}
To those who have a similar problem using PHP, try the function utf8_encode($string). It just works!
I had the same problem when my website encoding was UTF-8 and I tried to send a CP-1250 string in a form (for example, one taken from listdir dictionaries).
I think you must send the string in the same encoding as the website.
I have a string in the format:
PROVIDER=Sybase.ASEOLEDBProvider.2;User ID=sa;Server Name=UKServer;Server Port Address=5001;Initial Catalog=master
Using a regular expression in C#, how can I get the value of Server Name?
Please note that Server Name could be at any position in the string, and there may or may not be a space on either side of the "=", i.e. the format could be
... Server Name=UKServer;....
... Server Name = UKServer;....
... Server Name =UKServer;....
... Server Name= UKServer;....
You don't really have to parse the connection string yourself; the handy OdbcConnectionStringBuilder class can do it for you. It implements IDictionary, allowing you to retrieve all of the attributes of the connection string by key. I'm sure it is reasonably resistant to the different kinds of input that you mention, e.g. additional whitespace, different ordering of key-value pairs, etc.
Here's an example, tested for your sample:
var connString = @"PROVIDER=Sybase.ASEOLEDBProvider.2;User ID=sa;Server Name=UKServer;Server Port Address=5001;Initial Catalog=master";
var connStringBuilder = new OdbcConnectionStringBuilder(connString);
var serverName = connStringBuilder["Server Name"].ToString();
_serverName = Regex.Match(inputString, "Server Name ?= ?([\\w]+);").Groups[1].Value;
Breakdown:
Server Name ?= ? // Literal text; the ? means that the preceding character
// or group is optional (0 or 1 occurrences)
([\w]+); // The parentheses define a capture group (Groups[0] is
// always the whole match), so that you can easily get a
// substring of the match.
[\w]+ // Matches one or more letters, digits or underscores
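Putting the pieces together, here is a minimal sketch of a helper that extracts the value. It is a variation on the patterns in this thread, not the exact expression from any answer: it allows any amount of whitespace around `=`, captures everything up to the next `;`, and therefore also matches when Server Name is the last attribute or the value contains characters other than `\w`. `ConnectionStringParser` and `GetServerName` are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class ConnectionStringParser
{
    // (?:^|;)   the key must start the string or follow a ';'
    // \s*=\s*   optional whitespace on either side of '='
    // ([^;]*)   the value: everything up to the next ';' or end of string
    private static readonly Regex ServerNamePattern =
        new Regex(@"(?:^|;)\s*Server Name\s*=\s*([^;]*)", RegexOptions.IgnoreCase);

    // Returns the Server Name value with surrounding whitespace removed,
    // or null when the attribute is not present.
    public static string GetServerName(string connectionString)
    {
        Match match = ServerNamePattern.Match(connectionString);
        return match.Success ? match.Groups[1].Value.Trim() : null;
    }
}
```

Calling `ConnectionStringParser.GetServerName(inputString)` on any of the four spacing variants from the question should give the same value, since the whitespace around `=` is consumed by the pattern and any trailing space is trimmed.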
Something like this should work:
"Server Name\s*=\s*(\w+)\s*;"
How about something like this :
[^&]*(i?)(Server Name\s|)((i?)[a-z]);
_serverName = Regex.Match(inputString, @"[^&]*(i?)(Server Name\s|)((i?)[a-z]);").Groups[2].Value;