Incorrect string value: '\xEF\xBF\xBD' for column - c#

I have a table that needs to store a variety of characters, including Ø, ® and so on.
I have set my table to utf-8 as the default collation, and all columns use the table default. However, when I try to insert these characters I get the error: Incorrect string value: '\xEF\xBF\xBD' for column 'buyerName' at row 1
My connection string is defined as
string mySqlConn = "server="+server+";user="+username+";database="+database+";port="+port+";password="+password+";charset=utf8;";
I am at a loss as to why I am still seeing errors. Have I missed anything with either the .net connector, or with my MySQL setup?
--Edit--
My (new) C# insert statement looks like:
MySqlCommand insert = new MySqlCommand( "INSERT INTO fulfilled_Shipments_Data " +
"(amazonOrderId,merchantOrderId,shipmentId,shipmentItemId,"+
"amazonOrderItemId,merchantOrderItemId,purchaseDate,"+ ...
VALUES (@amazonOrderId,@merchantOrderId,@shipmentId,@shipmentItemId,"+
"@amazonOrderItemId,@merchantOrderItemId,@purchaseDate,"+
"paymentsDate,shipmentDate,reportingDate,buyerEmail,buyerName,"+ ...
insert.Parameters.AddWithValue("@amazonOrderId",lines[0]);
insert.Parameters.AddWithValue("@merchantOrderId",lines[1]);
insert.Parameters.AddWithValue("@shipmentId",lines[2]);
insert.Parameters.AddWithValue("@shipmentItemId",lines[3]);
insert.Parameters.AddWithValue("@amazonOrderItemId",lines[4]);
insert.Parameters.AddWithValue("@merchantOrderItemId",lines[5]);
insert.Parameters.AddWithValue("@purchaseDate",lines[6]);
insert.Parameters.AddWithValue("@paymentsDate",lines[7]);
insert.ExecuteNonQuery();
Assuming that this is the correct way to use parametrized statements, it is still giving an error
"Incorrect string value: '\xEF\xBF\xBD' for column 'buyerName' at row 1"
Any other ideas?

\xEF\xBF\xBD is the UTF-8 encoding of the Unicode character U+FFFD. This is a special character, also known as the "replacement character". A quote from the Wikipedia page about the special Unicode characters:
The replacement character � (often a black diamond with a white question mark) is a symbol found in the Unicode standard at codepoint U+FFFD in the Specials table. It is used to indicate problems when a system is not able to decode a stream of data to a correct symbol. It is most commonly seen when a font does not contain a character, but is also seen when the data is invalid and does not match any character.
So it looks like your data source contains corrupted data. It is also possible that you are reading the data using the wrong encoding. Where do the lines come from?
If you can't fix the data, and your input indeed contains invalid characters, you could just remove the replacement characters:
lines[n] = lines[n].Replace("\uFFFD", "");
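To illustrate how U+FFFD can end up in the data in the first place, here is a minimal sketch. It assumes the source bytes were written in a single-byte encoding such as ISO-8859-1 and then decoded as UTF-8; the class and method names are made up for this example and are not part of the question's code.

```csharp
using System;
using System.Text;

public static class ReplacementCharDemo
{
    // Decodes raw bytes with the given encoding. Invalid byte sequences
    // are replaced with U+FFFD by the default decoder fallback.
    public static string Decode(byte[] raw, Encoding encoding)
    {
        return encoding.GetString(raw);
    }

    // Returns true if the text already contains the replacement character.
    public static bool HasReplacementChar(string text)
    {
        return text.IndexOf('\uFFFD') >= 0;
    }

    // The bytes of "Øre" as stored in an ISO-8859-1 file (hypothetical sample data).
    public static readonly byte[] Latin1Sample = { 0xD8, 0x72, 0x65 };
}
```

Decoding `Latin1Sample` with `Encoding.UTF8` yields a string that starts with U+FFFD, because the byte 0xD8 is not followed by a valid continuation byte. Decoding the same bytes with `Encoding.GetEncoding("iso-8859-1")` yields "Øre". If `HasReplacementChar` is true right after reading the file, the damage happened at read time and fixing the reader's encoding is likely the better cure than stripping characters.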

Mattmanser is right: never build a SQL query by concatenating the parameter values directly into the query text. An example of a parametrized query:
string lastname = "Doe";
double height = 6.1;
DateTime birthDate = new DateTime(1978,4,18);
var connection = new MySqlConnection(connStr);
try
{
    connection.Open();
    var command = new MySqlCommand(
        "SELECT * FROM tblPerson WHERE LastName = @Name AND Height > @Height AND BirthDate < @BirthDate", connection);
    // Each placeholder in the query gets exactly one parameter with the same name.
    command.Parameters.AddWithValue("@Name", lastname);
    command.Parameters.AddWithValue("@Height", height);
    command.Parameters.AddWithValue("@BirthDate", birthDate);
    MySqlDataReader reader = command.ExecuteReader();
    ...
}
finally
{
    connection.Close();
}

For those who have a similar problem in PHP: try the function utf8_encode($string), which converts ISO-8859-1 text to UTF-8. It only helps if the input really is ISO-8859-1.

I had the same problem when my website encoding was UTF-8 and I tried to submit a CP-1250 string through a form (for example, one taken from a listdir of directories).
I think you must send the string in the same encoding as the website.

Related

SqlCommand with parameters accepting different data formats

Imagine this code:
string value = "1.23";
string query = "UPDATE MYTABLE SET COL1=@p1";
SqlCommand cmd = new SqlCommand(query, connection);
cmd.Parameters.AddWithValue("@p1", value);
cmd.ExecuteNonQuery();
On my database it will work with value="1.23" if COL1 is decimal type column. But it will fail if value is "1,23" (comma instead of a dot as a decimal point). The error is
Error converting data type nvarchar to numeric
I'd like it to work in both cases with "value" being a string variable.
Unfortunately I cannot just replace the comma with a dot, as this code is going to be more universal, dealing with both numeric and varchar columns.
Is there any way for a query to accept a parameter holding a number written as a string, with either a dot or a comma, and store it correctly in the table?
Thanks for any help
If the value isn't semantically a string, you shouldn't send it as a string. The type of the parameter value is important, and influences how it is transmitted and can lead to culture issues (comma vs dot, etc).
If the value is semantically a decimal, then use something like:
string value = "1.23";
var typedValue = decimal.Parse(value); // possibly specifying a CultureInfo
//...
cmd.Parameters.AddWithValue("@p1", typedValue);
Then it should work reliably.
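As a sketch of the "specifying a culture" part, the helper below accepts either a dot or a comma as the decimal separator. It is only one possible approach, under the assumption that the input never contains thousands separators; `FlexibleDecimal` is a made-up name, not an ADO.NET type.

```csharp
using System;
using System.Globalization;

public static class FlexibleDecimal
{
    // Only a leading sign and a decimal separator are allowed, so a comma
    // can never be silently read as a thousands separator.
    private const NumberStyles Style =
        NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint;

    // A format that uses ',' as the decimal separator, built by hand so the
    // result does not depend on which cultures are installed on the machine.
    private static readonly NumberFormatInfo CommaFormat = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    // Tries the dot form first, then the comma form; throws if neither matches.
    public static decimal Parse(string text)
    {
        decimal value;
        if (decimal.TryParse(text, Style, CultureInfo.InvariantCulture, out value))
            return value;
        if (decimal.TryParse(text, Style, CommaFormat, out value))
            return value;
        throw new FormatException("'" + text + "' is not a decimal number.");
    }
}
```

The parsed `decimal` can then be passed to `AddWithValue` as in the snippet above, so the database receives a typed number rather than a culture-dependent string.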
Unrelated, but you can make ADO.NET a lot easier with tools like "Dapper":
connection.Execute("UPDATE MYTABLE SET COL1=@typedValue", new { typedValue });

Incorrect syntax near '​'

First I was getting this error:
Invalid object name "product_images​_temporary"
and after I have added the [] brackets, everything worked fine. But then when I removed them again, I got this error:
Incorrect syntax near '​'
Why does this work:
[product_images​_temporary]
but this throws an exception ("Incorrect syntax near '​'"):
product_images​_temporary
More code:
try
{
using (var sqlConnection = new DapperHelper().DatabaseConnection())
{
var sqlStatement = "SELECT * FROM product_images​_temporary";
sqlConnection.Execute(sqlStatement);
}
}
catch (Exception e)
{
}
Is product_images​_temporary a reserved word in SQL Server? Like datetime etc.? I can't explain this.
Between the s and the _ is the Unicode zero-width space character \u200B. It is invisible, so the string is not what it appears to be.
This character is not legal in an unquoted SQL object identifier, which is the cause of the error you see; wrapping the name in [] makes it legal.
Simply retype the name manually, or delete across the boundary between the two characters and type them again.
Since your code does work with [], the actual table name contains \u200B as well, so the table should also be renamed.
Just rename the table, you have an invisible character in your table's name
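A small check like the following can confirm whether a name contains such a character before you go hunting for it by eye. This is only a sketch: it treats every character in the Unicode "format" category (which includes U+200B) as invisible, and the class name is invented for the example.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class IdentifierCheck
{
    // True for characters in Unicode category Cf (format), e.g. U+200B.
    private static bool IsInvisible(char c)
    {
        return char.GetUnicodeCategory(c) == UnicodeCategory.Format;
    }

    // Returns true if the text contains at least one invisible format character.
    public static bool HasInvisibleChars(string text)
    {
        return text.Any(IsInvisible);
    }

    // Returns a copy of the text with all invisible format characters removed.
    public static string StripInvisibleChars(string text)
    {
        return new string(text.Where(c => !IsInvisible(c)).ToArray());
    }
}
```

Running `HasInvisibleChars` on the SQL string from the question would flag the problem, and `StripInvisibleChars` gives the name as it was presumably meant to be typed.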

SQLite bug with Cyrillic encode

I'm using the Xamarin SQLite.NET PCL Package. I also have a DB with 1 million records.
The encoding on my source DB is UTF8. For best performance, I chose a FTS engine. Everything works fine, except text with Cyrillic encoding.
Example, the following simple query:
"SELECT name FROM customers WHERE name MATCH '"+filter+"*' LIMIT 50";
Here filter is a string whose value depends on what the user types on the keyboard (it works like a dynamic search).
The problem is that I get an SQLite exception when filter contains a Cyrillic value:
SQLite.Net.SQLiteException: near " ": syntax error
I tried something like this (encoding my query string as UTF-8):
string Query = String.Format("SELECT name FROM customer WHERE name MATCH '{0}*' LIMIT 50;",filter);
Encoding encod = Encoding.UTF8;
byte[] bts = encod.GetBytes(Query);
encod = Encoding.Unicode;
string Output = encod .GetString(bts, 0, bts.Length);
But the result is the same!
Any suggestions? What can I do?
By now you have surely found a solution, but for anyone else: with SQLite.Net PCL, pass the search text as a parameter instead of concatenating it into the SQL. A sketch, assuming a Customer class mapped to the customers table (that class is not shown in the question):
string query = "SELECT name FROM customers WHERE name MATCH ? LIMIT 50";
string param = filter + "*";
var results = conn.Query<Customer>(query, param);
The parameter value is passed as-is, without surrounding quotes. If the query has more than one placeholder, pass each value as a separate argument after the query, in the order the placeholders appear:
var results = conn.Query<Customer>(query, param1, param2);

Input string was not in a correct format. C# error SQL database

I have a very big problem.
I want to delete a row from my SQL database in C#.
Here is my code:
int x = Convert.ToInt32(dataGridView1.SelectedCells[0].Value);
cmd.Parameters.Clear();
cmd.CommandText = "delete from Table2 where Name=@N";
cmd.Parameters.AddWithValue("@N", x);
con.Open();
cmd.ExecuteNonQuery();
con.Close();
and in the end I get this error:
Input string was not in a correct format.
Help me.
I get the error on the first line.
First problem:
dataGridView1.SelectedCells[0].Value is not a valid integer value, so Convert.ToInt32 fails.
Did you mean
string x = Convert.ToString(dataGridView1.SelectedCells[0].Value);
instead?
Second problem:
You are comparing the Name field to an integer value. I'm assuming Name is a string value in the database (otherwise it's poorly named). When SQL looks for matching values, it will try to convert every value of Name in the database to a number. If any value in the Name field is not a valid number, the query will fail.
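If the cell really is supposed to hold a number, a non-throwing conversion makes the failure case explicit. The helper below is a sketch under that assumption; `CellValue.TryGetInt` is a hypothetical name, and the `object` parameter stands in for `dataGridView1.SelectedCells[0].Value`.

```csharp
using System;
using System.Globalization;

public static class CellValue
{
    // Converts a grid cell value (typed as object) to an int without throwing.
    // Returns false for null, empty, or non-numeric content.
    public static bool TryGetInt(object cellValue, out int result)
    {
        string text = Convert.ToString(cellValue, CultureInfo.InvariantCulture);
        return int.TryParse(text, NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out result);
    }
}
```

When `TryGetInt` returns false, the cell does not contain an integer, and the value should be sent to the query as a string parameter instead.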
Convert.ToInt32 returns 0 for a null reference, but it throws a FormatException for an empty string.
Here is my guess:
Depending on your culture settings, numbers are written as:
1.234
or
1,234
Solution
Ugly solution: you can simply do:
Convert.ToInt32(dataGridView1.SelectedCells[0].Value.ToString().Replace(".",","));
Good solution: or use Convert.ToInt32(String, IFormatProvider):
Convert.ToInt32(dataGridView1.SelectedCells[0].Value,CultureInfo.CurrentCulture)

How to store UTF-8 bytes from a C# String in a SQL Server 2000 TEXT column

I have an existing SQL Server 2000 database that stores UTF-8 representations of text in a TEXT column. I don't have the option of modifying the type of the column, and must be able to store non-ASCII Unicode data from a C# program into that column.
Here's the code:
sqlcmd.CommandText =
    "INSERT INTO Notes " +
    "(UserID, LocationID, Note) " +
    "VALUES (" +
    Note.UserId.ToString() + ", " +
    Note.LocationID.ToString() + ", " +
    "@note); " +
    "SELECT CAST(SCOPE_IDENTITY() AS BIGINT) ";
SqlParameter noteparam = new SqlParameter( "@note", System.Data.SqlDbType.Text, int.MaxValue );
At this point I've tried a few different ways to get my UTF-8 data into the parameter. For example:
// METHOD ONE
byte[] bytes = (byte[]) Encoding.UTF8.GetBytes( Note.Note );
char[] characters = bytes.Select( b => (char) b ).ToArray();
noteparam.Value = new String( characters );
I've also tried simply
// METHOD TWO
noteparam.Value = Note.Note;
And
// METHOD THREE
byte[] bytes = (byte[]) Encoding.UTF8.GetBytes( Note.Note );
noteparam.Value = bytes;
Continuing, here's the rest of the code:
sqlcmd.Parameters.Add( noteparam );
sqlcmd.Prepare();
try
{
Note.RecordId = (Int64) sqlcmd.ExecuteScalar();
}
catch
{
return false;
}
Method one (get UTF8 bytes into a string) does something strange -- I think it is UTF-8 encoding the string a second time.
Method two stores garbage.
Method three throws an exception in ExecuteScalar() claiming it can't convert the parameter to a String.
Things I already know, so no need telling me:
SQL Server 2000 is past/approaching end-of-life
TEXT columns are not meant for Unicode text
Seriously, SQL Server 2000 is old. You need to upgrade.
Any suggestions?
If your database collation is SQL_Latin1_General_CP1 (the default for the U.S. edition of SQL Server 2000), then you can use the following trick to store Unicode text as UTF-8 in a char, varchar, or text column:
byte[] bytes = Encoding.UTF8.GetBytes(Note.Note);
noteparam.Value = Encoding.GetEncoding(1252).GetString(bytes);
Later, when you want to read back the text, reverse the process:
SqlDataReader reader;
// ...
byte[] bytes = Encoding.GetEncoding(1252).GetBytes((string)reader["Note"]);
string note = Encoding.UTF8.GetString(bytes);
If your database collation is not SQL_Latin1_General_CP1, then you will need to replace 1252 with the correct code page.
Note: If you look at the stored text in Enterprise Manager or Query Analyzer, you'll see strange characters in place of non-ASCII text, just as if you opened a UTF-8 document in a text editor that didn't support Unicode.
How it works: When storing Unicode text in a non-Unicode column, SQL Server automatically converts the text from Unicode to the code page specified by the database collation. Any Unicode characters that don't exist in the target code page will be irreversibly mangled, which is why your first two methods didn't work.
But you were on the right track with method one. The missing step is to "protect" the raw UTF-8 bytes by converting them to Unicode using the Windows-1252 code page. Now, when SQL Server performs the automatic conversion from Unicode to Windows-1252, it gets back the original UTF-8 bytes untouched.
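The round trip can be checked without a database. The sketch below wraps the two conversions from the answer in a pair of helpers; `Utf8InAnsiColumn` and its method names are invented for the example. It assumes .NET Core 3.0 or later (or .NET 5+), where code page 1252 is only available after registering `CodePagesEncodingProvider`.

```csharp
using System;
using System.Text;

public static class Utf8InAnsiColumn
{
    private static readonly Encoding Ansi = CreateAnsi();

    private static Encoding CreateAnsi()
    {
        // Assumption: running on .NET Core 3.0+ / .NET 5+, where code page
        // encodings must be registered before they can be looked up.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    // Produces the string to assign to the parameter: the UTF-8 bytes of
    // the text, reinterpreted as Windows-1252 characters.
    public static string ToStorage(string text)
    {
        return Ansi.GetString(Encoding.UTF8.GetBytes(text));
    }

    // Reverses ToStorage for a value read back from the column.
    public static string FromStorage(string stored)
    {
        return Encoding.UTF8.GetString(Ansi.GetBytes(stored));
    }
}
```

For "Ø®", `ToStorage` returns the four characters "Ã˜Â®", which is the kind of output Enterprise Manager would show, and `FromStorage` turns them back into "Ø®".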
