My website is displaying Arabic text as question mark symbols - c#

I am building a website that displays an Arabic news RSS feed. I insert the title, body (description), and link of each news item into a SQL Server database, but they are stored as question mark (?) symbols, so when I read the data back to display it on the web page it shows (?) symbols. How can I make it display the Arabic characters?
I tried
<globalization requestEncoding="utf-8" responseEncoding="utf-8" />
but that did not solve it. Any help, please?

Make sure the data type in your database allows insertion of special (e.g. Unicode) characters. In SQL Server, for example, you should use the nvarchar data type instead of varchar. What is your RDBMS?
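The question marks usually appear at the moment the text is converted to a code page that has no Arabic letters, which is what happens when a Unicode string ends up in a varchar column with a Latin collation. The sketch below only illustrates that lossy conversion: it uses ASCII as a stand-in for the column's code page and does not talk to SQL Server. The class and method names are made up for the example.

```csharp
using System;
using System.Text;

static class LossyEncodingDemo
{
    // Hypothetical helper: pushes text through a code page that cannot
    // represent it, the way a non-Unicode column would, and reads it back.
    public static string RoundTripThroughAscii(string text)
    {
        byte[] stored = Encoding.ASCII.GetBytes(text); // unmappable chars become '?'
        return Encoding.ASCII.GetString(stored);
    }

    static void Main()
    {
        // Arabic "marhaban", written with escapes so the source file's
        // own encoding cannot interfere with the demonstration.
        string arabic = "\u0645\u0631\u062D\u0628\u0627";
        Console.WriteLine(RoundTripThroughAscii(arabic)); // prints ?????
    }
}
```

Once the characters have been replaced like this, the original text cannot be recovered from the database, so the fix has to happen before the insert: nvarchar columns, and in T-SQL string literals written with the N prefix (N'...').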

A few suggestions:
Make sure that the database tables that will store the Arabic data have the proper collation.
You'll probably need Arabic_CI_AS instead of the default Latin1_General_CI_AS.
Make sure that the database columns are set to nvarchar.
Make sure that any JavaScript files used on your website are saved with UTF-8 encoding.
I just came across this link in my Smashing Magazine newsletter. It might provide some useful additional information on UTF-8 and the common difficulties people have with it:
http://the-pastry-box-project.net/oli-studholme/2013-october-8/

Related

Unicode characters in Listview c#

Here is the thing:
I need to display Japanese characters in a ListView in a SQL-backed database manager I am currently building for a friendly company. I searched online, but none of the answers got me anywhere. Instead of displaying the characters, it just shows "????". Have a look:
However, I am loading a .csv file that displays properly and comes from a machine that has Japanese installed on it. It has also been saved as UTF-8:
The font I am using is Meiryo UI. I tried Tahoma and the same thing happens. The file is loaded with an explicit encoding:
And finally here's the code responsible for stuffing the data into a listview:
I would really appreciate if someone could help me. Thanks!
You are using a StreamReader to open the file, but you are not using that same StreamReader to read the data. Instead you are instructing SQL Server to open it with the BULK INSERT command. Prior to SQL Server 2014 SP2, BULK INSERT had no support for UTF-8.
If you are using SQL Server 2014 SP2 or later, you might consider Tom-K's answer here:
How to write UTF-8 characters using bulk insert in SQL Server?
Failing that, you must either convert the file to UTF-16 before doing the bulk insert, or use another method.
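Converting the file to UTF-16 before the bulk insert takes only a few lines of C#. This is a minimal sketch that assumes the CSV fits in memory; the class name, method name, and temporary files are illustrative and do not come from the original post.

```csharp
using System;
using System.IO;
using System.Text;

static class CsvTranscoder
{
    // Hypothetical helper: reads a UTF-8 file and rewrites it as UTF-16 LE.
    public static void ConvertUtf8ToUtf16(string sourcePath, string targetPath)
    {
        string text = File.ReadAllText(sourcePath, Encoding.UTF8);
        // Encoding.Unicode is UTF-16 little-endian; File.WriteAllText emits its BOM.
        File.WriteAllText(targetPath, text, Encoding.Unicode);
    }

    static void Main()
    {
        string source = Path.GetTempFileName();
        string target = Path.GetTempFileName();

        // Sample row containing the Japanese word for "Japanese language".
        File.WriteAllText(source, "1,\u65E5\u672C\u8A9E", new UTF8Encoding(false));
        ConvertUtf8ToUtf16(source, target);

        byte[] bytes = File.ReadAllBytes(target);
        Console.WriteLine(bytes[0] == 0xFF && bytes[1] == 0xFE); // BOM check
    }
}
```

The converted file can then be loaded with BULK INSERT using the DATAFILETYPE = 'widechar' option, into nvarchar columns.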
I managed to solve this. While using SQL Server 2014 I simply forgot to change the collation in the database settings. It was set to Latin instead of Japanese-Unicode BIN. Thanks to Ben for pointing me in the right direction.
Fixed

German Letters encoding problem

I fetch HTML from a web page that is in German and have to insert that HTML into a database, but after the insert the German letters do not appear correctly.
E.g. Bundesstraße appears as Bundesstra&szlig;e. I am using C# and a MySQL database.
It seems the special characters are encoded as HTML entities (http://www.w3schools.com/tags/ref_entities.asp) on the website. With UTF-8 this isn't necessary, but many sites still do it.
If you want to have the exact html as it is on the website these encoded entities are correct.
To decode the entities you can use System.Net.WebUtility.HtmlDecode(yourString).
What encoding are you using?
Try switching to UTF-8 and ensure your database supports it. It looks as though your string is being HTML-encoded. This is fine for presentation, but you'll need the original format to store it in the database.
In HTML, ß is encoded as &szlig;.
You say "i have to insert its html in database", and what you're currently getting is correct.
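If the decoded text is what should be stored, the HtmlDecode call mentioned above can be tried on its own first. A minimal sketch, with a made-up class name and a sample string taken from the question:

```csharp
using System;
using System.Net;

static class EntityDemo
{
    // Hypothetical wrapper around the framework call named in the answer.
    public static string Decode(string html)
    {
        return WebUtility.HtmlDecode(html);
    }

    static void Main()
    {
        string fromPage = "Bundesstra&szlig;e";
        string decoded = Decode(fromPage);

        // \u00DF is the letter ß, escaped so the source file encoding does not matter.
        Console.WriteLine(decoded == "Bundesstra\u00DFe");
    }
}
```

Note that decoding a whole HTML document this way also turns entities such as &lt; back into markup characters, so it is safer to apply it to extracted text than to the full page source.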

Microsoft.Jet.OLEDB.4.0 Converting Characters

I'm working with a CSV that contains characters like:
” and •
I am reading the CSV via OleDb and the provider is Microsoft.Jet.OLEDB.4.0. When the data is loaded into the OleDbCommand, the characters are converted to the following respectively:
“ and •
I suspected there might be a collation setting in the connection string but I was unable to find anything about this.
I can confirm the following:
I can see the original character in the CSV when I open it.
If I run a select on the file through OleDb WHERE [field] LIKE '%•%' I get 0 rows but if SELECT WHERE [field] LIKE '%“%' I get rows returned.
Any thoughts?
Finally! Thanks to @HABJAN I was able to get to the resolution, which is as simple as setting the CharacterSet in the Extended Properties of the connection string. In my case it was UTF-8, which is commonly the default in phpMyAdmin, where my data was exported from.
Resulting working connection string:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"{0}\";Extended Properties=\"text;HDR=Yes;FMT=Delimited;CharacterSet=65001;\""
Key is CharacterSet=65001 (Code Page Identifier for UTF-8) which might have been obvious to some collation savvy individuals but I've somehow managed to avoid these issues over the years and never come across it in this respect.
I was also able to get HABJAN's solution to work by following the documentation found at http://msdn.microsoft.com/en-us/library/ms709353%28v=vs.85%29.aspx and setting the CharacterSet to the same value as above.
For my situation, this is the better method as it is a simpler/more maintainable solution, but +1 to HABJAN for helping me get there!
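As a sketch of how the {0} placeholder in that connection string might be filled in: the helper below only builds the string and does not open a connection, so it runs without the Jet provider installed. The class and method names are hypothetical. As far as I know, with the Jet text driver the Data Source is the folder that contains the CSV, and the file name is used as the table name in the query.

```csharp
using System;

static class JetCsv
{
    // Hypothetical helper: formats the connection string from the post.
    // CharacterSet=65001 is the code page identifier for UTF-8.
    public static string BuildConnectionString(string folder)
    {
        return string.Format(
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"{0}\";" +
            "Extended Properties=\"text;HDR=Yes;FMT=Delimited;CharacterSet=65001;\"",
            folder);
    }

    static void Main()
    {
        // The folder path is only an example value.
        Console.WriteLine(BuildConnectionString(@"C:\data"));
    }
}
```

The resulting string would then be passed to an OleDbConnection, with a query such as SELECT * FROM [file.csv] naming the file inside that folder.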
Thanks
You can create a schema.ini file and experiment with the Format and CharacterSet properties.
Take a look at this sample: How to read data from Unicode formatted text file and import to Data Table using .Net
And here is another sample that will show you how to read csv file with schema.ini: Importing CSV file into Database with Schema.ini

Asp.net renders string with wrong encoding, but PHP doesn't (MySQL)

I took over an old PHP application with MySQL as its database. The database has tables containing localized strings (and therefore special characters).
Currently there is a PHP application accessing that database. My job is to create an ASP.NET (C# code-behind) application that accesses those strings as well. That works, as far as encoding goes.
If I try to access these strings, I get an encoding problem: words such as 'Ändern' and 'Prüfzeichen' come out garbled, but only in the ASP.NET application. The PHP app sets utf-8 as the charset and the strings are rendered perfectly. In the ASP.NET application it's gibberish, regardless of the page encoding.
In the MySQL database, the charset for the table 'translations' is set to 'latin1 -- cp1252 West European' and the collation to 'latin1_swedish_ci'.
I can't seem to figure out what PHP apparently does, and ASP.net does not. I traced the php code and could not find any sign of special encoding while getting a string from the database.
The question is: how can I ensure correct encoding inside the ASP.NET application without modifying the database, given that big changes to the PHP code are not possible?
Does anybody have a clue?
The best long-term solution would be to convert the table to use UTF-8 encoding:
ALTER TABLE translations CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
If the data is already in utf-8 format (even though the character set is latin1), you'll need to convert each column to the correct encoding.
This converts a column defined as being latin1 but containing utf8 to a column declared as and containing utf8:
ALTER TABLE translations CHANGE columnNameHere columnNameHere BLOB;
ALTER TABLE translations CHANGE columnNameHere columnNameHere TEXT CHARACTER SET utf8;
"I can't seem to figure out what PHP apparently does,"
The PHP app sets utf-8 as the charset for the database connection, using a SET NAMES <encoding> query, where <encoding> is your page's encoding.
I finally managed to find a way to convert the strings to UTF-8:
System.Text.Encoding.UTF8.GetString(System.Text.Encoding.Default.GetBytes("convert me"))
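The one-liner above depends on Encoding.Default, which is the system ANSI code page on .NET Framework but UTF-8 on .NET Core and later, where the call would change nothing. Below is a sketch of the same idea with the encoding passed in explicitly. It assumes the driver decoded the UTF-8 bytes as ISO-8859-1; if it actually used Windows-1252, that encoding would have to be passed instead. The helper name is made up.

```csharp
using System;
using System.Text;

static class MojibakeRepair
{
    // Hypothetical helper: turns the garbled string back into the bytes the
    // database really held, then decodes those bytes as UTF-8.
    public static string Repair(string garbled, Encoding wronglyAssumed)
    {
        byte[] originalBytes = wronglyAssumed.GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }

    static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        // Simulate the bug: UTF-8 bytes of "Prüfzeichen" read as Latin-1.
        string expected = "Pr\u00FCfzeichen";
        string garbled = latin1.GetString(Encoding.UTF8.GetBytes(expected));

        Console.WriteLine(Repair(garbled, latin1) == expected);
    }
}
```

This only works while the garbled string still carries every original byte. If any character was already replaced by '?', the information is gone, which is one reason fixing the connection charset or the table encoding is the more robust route.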

How to create HTML text from C# application?

I have a C# application that must store some information in MS SQL, which will later be sent by email with DB Mail.
Within the C# application I have a class with several properties, and I need to use it to generate the email text. What I would like is to set up a template with placeholders for the variables. I need to create the text both as HTML and as plain text.
What tools or libraries would you recommend for HTML?
Is String.Format() the best alternative for plain text?
I do this in other applications by having the e-mail body available somewhere (SharePoint list, data table) already in the right format, but with named placeholders, corresponding to the information you have in your application.
Then sending the e-mail means replacing the placeholder with the right information. StringBuilder.Replace works fine.
I would say the most important thing you need to decide is when to encode the text. If you are emailing text supplied by users, you will want to HtmlEncode it before including it in an email. It's probably OK to store it "as received" in the database as long as every consumer encodes it before using it. I typically do this in the data layer that "gets" data from the database.
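The two answers above combine naturally: keep a template with named placeholders, and HTML-encode each value at the moment it is substituted. A minimal sketch, assuming placeholders written as {CustomerName} and {Comment}; those names, the class, and the method are invented for the example.

```csharp
using System;
using System.Net;
using System.Text;

static class EmailTemplate
{
    // Hypothetical helper: fills an HTML template, encoding every value
    // so user-supplied text cannot inject markup into the email.
    public static string RenderHtml(string template, string customerName, string comment)
    {
        var body = new StringBuilder(template);
        body.Replace("{CustomerName}", WebUtility.HtmlEncode(customerName));
        body.Replace("{Comment}", WebUtility.HtmlEncode(comment));
        return body.ToString();
    }

    static void Main()
    {
        string template = "<p>Hello {CustomerName},</p><p>{Comment}</p>";
        Console.WriteLine(RenderHtml(template, "Ann & Bob", "<b>hi</b>"));
    }
}
```

For the plain-text version the same template approach works without the HtmlEncode calls. One caveat of sequential Replace calls is that a value containing the text of a later placeholder would itself be substituted, so placeholder names should be chosen so they are unlikely to occur in real data.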
