Failed to open SQLite database with Persian characters in the path - C#

I am using a SQLite database for my program. Everything works fine when the database path contains only English characters, but when the path contains Persian characters the database fails to open. I searched the internet and found two answers for other languages, but neither worked for Persian.
The two options:
First option:
var dbPath2 =
Path.Combine(Windows.Storage.ApplicationData.Current.RoamingFolder.Path,
"test.db");
string utf8String = String.Empty;
// Get UTF16 bytes and convert UTF16 bytes to UTF8 bytes
byte[] utf16Bytes = Encoding.Unicode.GetBytes(dbPath2);
byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode,
Encoding.UTF8, utf16Bytes);
// Fill UTF8 bytes inside UTF8 string
for (int i = 0; i < utf8Bytes.Length; i++)
{
// Because char always saves 2 bytes, fill char with 0
byte[] utf8Container = new byte[2] { utf8Bytes[i], 0 };
utf8String += BitConverter.ToChar(utf8Container, 0);
}
string dbPath = utf8String;
var db = new SQLite.SQLiteConnection(dbPath);
Second option (in SQLite.cs, which is added when you add the reference):
public SQLiteConnection(string databasePath, bool storeDateTimeAsTicks = false)
{
DatabasePath = databasePath;
Sqlite3DatabaseHandle handle;
var r = SQLite3.Open16(DatabasePath, out handle);
Handle = handle;
if (r != SQLite3.Result.OK)
{
throw SQLiteException.New(r, String.Format("Could not open database file: {0} ({1})", DatabasePath, r));
}
_open = true;
StoreDateTimeAsTicks = storeDateTimeAsTicks;
BusyTimeout = TimeSpan.FromSeconds(0.1);
}
thanks
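For reference, the native sqlite3_open function expects the filename as UTF-8 bytes, while sqlite3_open16 expects UTF-16, which is what the second option switches to. A minimal sketch of producing the null-terminated UTF-8 form of a path directly, instead of packing bytes into chars as the first option does. The class and method names are hypothetical, and whether a given wrapper can accept a byte array depends on its P/Invoke declaration:

```csharp
using System;
using System.Text;

public static class SqlitePathSketch
{
    // Hypothetical helper: encodes a path as UTF-8 and leaves one extra
    // zero byte at the end, as a C string requires.
    public static byte[] ToNullTerminatedUtf8(string path)
    {
        int byteCount = Encoding.UTF8.GetByteCount(path);
        byte[] buffer = new byte[byteCount + 1]; // final byte stays 0
        Encoding.UTF8.GetBytes(path, 0, path.Length, buffer, 0);
        return buffer;
    }
}
```

Persian letters take two bytes each in UTF-8, so the buffer is longer than the path's character count.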

Related

How to convert an already encoded string representation of a byte array to an actual byte array? C#

The problem: We had a system in which events and projections had a Payload column holding a serialized object. This payload was a string, but for performance and disk-space reasons we started saving a compressed version of the string in the database, and we decompress it whenever we fetch it from the database.
Code for compressing and decompressing:
using System.IO;
using System.IO.Compression;
using System.Text;
namespace DemoEFCore.Helpers
{
public class CompressionHelper
{
public static byte[] Compress(string stringData)
{
var stringBytes = Encoding.UTF8.GetBytes(stringData);
using var output = new MemoryStream();
using (DeflateStream dstream = new DeflateStream(output, CompressionLevel.Fastest))
{
dstream.Write(stringBytes, 0, stringBytes.Length);
}
return output.ToArray();
}
public static string Decompress(byte[] data)
{
using var input = new MemoryStream(data);
using var output = new MemoryStream();
using (DeflateStream dstream = new DeflateStream(input, CompressionMode.Decompress))
{
dstream.CopyTo(output);
}
var bytes = output.ToArray();
return Encoding.UTF8.GetString(bytes);
}
}
}
It works perfectly fine and it really gives performance improvements.
But sometimes when you are fixing a bug you go straight to the database to look at the payload of a specific record. Previously I could copy it and paste it into a JSON beautifier, but now I can copy only the encoded representation.
Screenshot from the database:
0x8590416BC3300C85FFCA7867BBC88E9DC4BE6D1D8C5DB6427BEAD82189952D236D83E3144AE97F1F69BBF3408787D07B9FA4335E033C7496DBB6B546EADCE6D2144ECBD29293A1A6D0385B84820C049691ABC4E131CD16D25A2A25C96CA8F4B6F4E4163A3345A1DD1602AB7808539336A781E1151109ACA781E33AC56EFF058FA7581D1902EFF50F376984FFC010BB2327082D529C5840B9822429496A43E4AFB58538E3ADDA31FC2DEF65D633AE6B18DEDA8515584D75DF8DDF1C6FB73559C921AFA42D9D9626574A56D690AC759B99AC0A2E84664EF833C19FB13CEC866A7FBA83C679F187E6D683C0CC1CE1F753DF8BEBFFEE6A5C1FDAF4CC3D270EF06DD58F7CF977E0F3F20B
What we want to do is copy this string, paste it into our application and get the decoded JSON string. It sounds easy because we already have the CompressionHelper.Decompress method, but it accepts a byte array as a parameter. I found 3 solutions for converting such a string to a byte array, but they didn't work for me.
1.
var s = "0xB5945F6FDA3014C5BF0ABACF71E5FF2179AB18D22A552D5A190FDBFAE0C437C11A719013B67515DF7D22A40C06A3ADB63EC6F139E7FADE9FFD08571652A054B0445B4164CE389179C249A6D112ADA842458B38CB044470632A84142675684766B1B8A92BE74DEB6A3F416F9D2FAF5DD3CE1C7E8768B7E735F6336C1A5CF421D339D6B60E8371D364184A0CBD69FFFBAAAA9C2FE7A6EA97BB9CD84A9E6405126D634EE490496292212536CF448C4C67B99610C1A8AE2A0CB9338BDB501AEF7E7667E81C146A9170A949A20D25D2E69464C6E4C42844AE5070610A88608AF9DCBBFC8481502C16BC50246668881446902CE78288213342E499A6F1102298D50FA6C49B559561801418D707755D96E8DBA3E2B2586A5B5843322C349142539220CB89B454243197892AD4799BBE7923D32C310CEEE66EB974BE1C5CB7F60222B8FDEE31FC53F46987F3A9D38725422A28A511DCB5A65D3590F2EE6BEA2ABC2D7ACA36E071CA19619C303165C394A994F10BC919535A7FDA70109CCFDDF2702A903E3E713BFEB2A294EBBBC1D4F8AF181A88B6F4BF68E8EB7D6EDEA1B10BE7B181F4F3234C4C681F2065DBA2BBE3B058D1083E6081017D8E93DAF91652A936877C5FAF4203A9B8A0741D9D5677FB8ED49D7FAF5654FE5DCF9FD7F33372D9059D8D1F9E093FA94EF68B3F17BE05E128FCA59D7B3E5C76B5DFF7B89F9C648FDF1E98AFA9675FBD65E24FF54133B6F5FC06E140FEFC208F4038D0BF10C3FBBDCBF3D491F10FD7B490B66185118CBFA16FB7B45F9665C0D2B4F89AA7FD638361B3DFAF168B087616330C8DAB3DA4DD4DD82DFF4F67B6B9636FE32CE337F41EAEEF23D8F5474670D58CE6C69768B73359FF02";
var stream = new MemoryStream();
var writer = new StreamWriter(stream);
writer.Write(s);
writer.Flush();
stream.Position = 0;
var s1 = CompressionHelper.Decompress(stream.ToArray());
The line var s1 = CompressionHelper.Decompress(stream.ToArray()); throws the exception System.IO.InvalidDataException: The archive entry was compressed using an unsupported compression method.
2.
var s = "0xB5945F6FDA3014C5BF0ABACF71E5FF2179AB18D22A552D5A190FDBFAE0C437C11A719013B67515DF7D22A40C06A3ADB63EC6F139E7FADE9FFD08571652A054B0445B4164CE389179C249A6D112ADA842458B38CB044470632A84142675684766B1B8A92BE74DEB6A3F416F9D2FAF5DD3CE1C7E8768B7E735F6336C1A5CF421D339D6B60E8371D364184A0CBD69FFFBAAAA9C2FE7A6EA97BB9CD84A9E6405126D634EE490496292212536CF448C4C67B99610C1A8AE2A0CB9338BDB501AEF7E7667E81C146A9170A949A20D25D2E69464C6E4C42844AE5070610A88608AF9DCBBFC8481502C16BC50246668881446902CE78288213342E499A6F1102298D50FA6C49B559561801418D707755D96E8DBA3E2B2586A5B5843322C349142539220CB89B454243197892AD4799BBE7923D32C310CEEE66EB974BE1C5CB7F60222B8FDEE31FC53F46987F3A9D38725422A28A511DCB5A65D3590F2EE6BEA2ABC2D7ACA36E071CA19619C303165C394A994F10BC919535A7FDA70109CCFDDF2702A903E3E713BFEB2A294EBBBC1D4F8AF181A88B6F4BF68E8EB7D6EDEA1B10BE7B181F4F3234C4C681F2065DBA2BBE3B058D1083E6081017D8E93DAF91652A936877C5FAF4203A9B8A0741D9D5677FB8ED49D7FAF5654FE5DCF9FD7F33372D9059D8D1F9E093FA94EF68B3F17BE05E128FCA59D7B3E5C76B5DFF7B89F9C648FDF1E98AFA9675FBD65E24FF54133B6F5FC06E140FEFC208F4038D0BF10C3FBBDCBF3D491F10FD7B490B66185118CBFA16FB7B45F9665C0D2B4F89AA7FD638361B3DFAF168B087616330C8DAB3DA4DD4DD82DFF4F67B6B9636FE32CE337F41EAEEF23D8F5474670D58CE6C69768B73359FF02";
var b = Convert.FromBase64String(s);
var jsonString = CompressionHelper.Decompress(b);
The line var b = Convert.FromBase64String(s); throws this exception System.FormatException: The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters.
3.
try
{
var s = "0xB5945F6FDA3014C5BF0ABACF71E5FF2179AB18D22A552D5A190FDBFAE0C437C11A719013B67515DF7D22A40C06A3ADB63EC6F139E7FADE9FFD08571652A054B0445B4164CE389179C249A6D112ADA842458B38CB044470632A84142675684766B1B8A92BE74DEB6A3F416F9D2FAF5DD3CE1C7E8768B7E735F6336C1A5CF421D339D6B60E8371D364184A0CBD69FFFBAAAA9C2FE7A6EA97BB9CD84A9E6405126D634EE490496292212536CF448C4C67B99610C1A8AE2A0CB9338BDB501AEF7E7667E81C146A9170A949A20D25D2E69464C6E4C42844AE5070610A88608AF9DCBBFC8481502C16BC50246668881446902CE78288213342E499A6F1102298D50FA6C49B559561801418D707755D96E8DBA3E2B2586A5B5843322C349142539220CB89B454243197892AD4799BBE7923D32C310CEEE66EB974BE1C5CB7F60222B8FDEE31FC53F46987F3A9D38725422A28A511DCB5A65D3590F2EE6BEA2ABC2D7ACA36E071CA19619C303165C394A994F10BC919535A7FDA70109CCFDDF2702A903E3E713BFEB2A294EBBBC1D4F8AF181A88B6F4BF68E8EB7D6EDEA1B10BE7B181F4F3234C4C681F2065DBA2BBE3B058D1083E6081017D8E93DAF91652A936877C5FAF4203A9B8A0741D9D5677FB8ED49D7FAF5654FE5DCF9FD7F33372D9059D8D1F9E093FA94EF68B3F17BE05E128FCA59D7B3E5C76B5DFF7B89F9C648FDF1E98AFA9675FBD65E24FF54133B6F5FC06E140FEFC208F4038D0BF10C3FBBDCBF3D491F10FD7B490B66185118CBFA16FB7B45F9665C0D2B4F89AA7FD638361B3DFAF168B087616330C8DAB3DA4DD4DD82DFF4F67B6B9636FE32CE337F41EAEEF23D8F5474670D58CE6C69768B73359FF02";
//var substring = s.Substring(2);
var byteArray = new byte[s.Length];
for (var i = 0; i < s.Length; i++)
{
var b = Byte.Parse(s[i].ToString());
byteArray[i] = b;
}
var jsonString = CompressionHelper.Decompress(byteArray);
}
catch (Exception e)
{
Console.WriteLine(e);
throw;
}
throws this exception
System.FormatException: Input string was not in a correct format.
at System.Number.ThrowOverflowOrFormatException(ParsingStatus status, TypeCode type)
at System.Byte.Parse(String s)
Can you please help me to figure out how to solve my problem?
@madmonk46 Thank you indeed. Convert.FromHexString really helped. I couldn't find this method at first because my demo project is on .NET Core 3.1, and Convert.FromHexString is supported only from .NET 5 onward. Luckily our project is on .NET 5 and we are migrating to .NET 6 now.
Also, it does not work with a string that starts with 0x.
var stringFromDataBase = "0x8590416BC3300C85FFCA7867BBC88E9DC4BE6D1D8C5DB6427BEAD82189952D236D83E3144AE97F1F69BBF3408787D07B9FA4335E033C7496DBB6B546EADCE6D2144ECBD29293A1A6D0385B84820C049691ABC4E131CD16D25A2A25C96CA8F4B6F4E4163A3345A1DD1602AB7808539336A781E1151109ACA781E33AC56EFF058FA7581D1902EFF50F376984FFC010BB2327082D529C5840B9822429496A43E4AFB58538E3ADDA31FC2DEF65D633AE6B18DEDA8515584D75DF8DDF1C6FB73559C921AFA42D9D9626574A56D690AC759B99AC0A2E84664EF833C19FB13CEC866A7FBA83C679F187E6D683C0CC1CE1F753DF8BEBFFEE6A5C1FDAF4CC3D270EF06DD58F7CF977E0F3F20B";
var bytesArray = Convert.FromHexString(stringFromDataBase);
var jsonString = CompressionHelper.Decompress(bytesArray);
This code throws
System.FormatException: The input is not a valid hex string as it contains a non-hex character.
so you need to add one line of code to strip the prefix.
var stringFromDataBase = "0x8590416BC3300C85FFCA7867BBC88E9DC4BE6D1D8C5DB6427BEAD82189952D236D83E3144AE97F1F69BBF3408787D07B9FA4335E033C7496DBB6B546EADCE6D2144ECBD29293A1A6D0385B84820C049691ABC4E131CD16D25A2A25C96CA8F4B6F4E4163A3345A1DD1602AB7808539336A781E1151109ACA781E33AC56EFF058FA7581D1902EFF50F376984FFC010BB2327082D529C5840B9822429496A43E4AFB58538E3ADDA31FC2DEF65D633AE6B18DEDA8515584D75DF8DDF1C6FB73559C921AFA42D9D9626574A56D690AC759B99AC0A2E84664EF833C19FB13CEC866A7FBA83C679F187E6D683C0CC1CE1F753DF8BEBFFEE6A5C1FDAF4CC3D270EF06DD58F7CF977E0F3F20B";
var substring = stringFromDataBase.Substring(2);
var bytesArray = Convert.FromHexString(substring);
var jsonString = CompressionHelper.Decompress(bytesArray);
Thank you one more time for your quick help, madmonk46!
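For projects that are still on .NET Core 3.1, where Convert.FromHexString is not available, the same conversion can be sketched by hand. The class and method names below are hypothetical. It also shows why the three earlier attempts failed: the text is neither raw bytes nor Base64, and it is not a sequence of decimal digits either; every pair of hex digits after the 0x prefix is one byte.

```csharp
using System;
using System.Globalization;

public static class HexPayload
{
    // Hypothetical helper: turns "0x0AFF" (or "0AFF") into { 0x0A, 0xFF }.
    public static byte[] ToBytes(string hex)
    {
        // Drop the optional "0x" prefix that database tools put in front.
        if (hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase))
        {
            hex = hex.Substring(2);
        }
        if (hex.Length % 2 != 0)
        {
            throw new FormatException("Hex string must have an even number of digits.");
        }

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            // Two hex digits make one byte.
            bytes[i] = byte.Parse(hex.Substring(i * 2, 2),
                                  NumberStyles.AllowHexSpecifier,
                                  CultureInfo.InvariantCulture);
        }
        return bytes;
    }
}
```

The result can then be passed to CompressionHelper.Decompress in the same way as the output of Convert.FromHexString.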

web service utf8 arabic decoding

I built a C# web service that accepts Unicode characters.
I have a client that consumes this web service from PHP to insert data into an MS SQL database.
It runs correctly with English characters, but when the client pushes Arabic text it inserts "???????" characters into the database.
I tried to decode UTF-8 to Unicode, with no luck.
Here is my conversion code:
private byte[] GetRawBytes(string str)
{
int charcount = str.Length;
byte[] byttemp = new byte[charcount];
for (int i = 0; i < charcount; i++)
{
byttemp[i] = (byte)str[i];
}
return byttemp;
}
private string UTF8toUnicode(string str)
{
byte[] bytUTF8;
byte[] bytUnicode;
string strUnicode = String.Empty;
bytUTF8 = GetRawBytes(str);
bytUnicode = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, bytUTF8);
strUnicode = Encoding.Unicode.GetString(bytUnicode);
return strUnicode;
}
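To make clear what the conversion above can and cannot repair, here is a small sketch. The class and method names are hypothetical. Copying each char into one byte and decoding the result as UTF-8 only recovers text whose UTF-8 bytes were earlier misread one byte per character (for example as ISO-8859-1). Text that has already been replaced by literal question marks carries no information to recover.

```csharp
using System;
using System.Text;

public static class MojibakeSketch
{
    // Same idea as GetRawBytes + UTF8toUnicode: treat every char as one byte,
    // then decode those bytes as UTF-8.
    public static string Repair(string misread)
    {
        var bytes = new byte[misread.Length];
        for (int i = 0; i < misread.Length; i++)
        {
            bytes[i] = (byte)misread[i];
        }
        return Encoding.UTF8.GetString(bytes);
    }

    // Simulates the damage: UTF-8 bytes decoded one byte per character.
    public static string MisreadAsLatin1(string original)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        return latin1.GetString(Encoding.UTF8.GetBytes(original));
    }
}
```

With these helpers, Repair(MisreadAsLatin1("مرحبا")) gives back the original Arabic word, while Repair("?????") still returns "?????". So if the database already contains question marks, the characters were lost before this code ran.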

Making self-extracting executable with C#

I'm creating a simple self-extracting archive, using a magic number to mark the beginning of the content.
For now it is a text file:
MAGICNUMBER .... content of the text file
Next, the text file is copied to the end of the executable:
copy programm.exe/b+textfile.txt/b sfx.exe
I'm trying to find the second occurrence of the magic number (the first one would be a hardcoded constant obviously) using the following code:
string my_filename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
StreamReader file = new StreamReader(my_filename);
const int block_size = 1024;
const string magic = "MAGICNUMBER";
char[] buffer = new Char[block_size];
Int64 count = 0;
Int64 glob_pos = 0;
bool flag = false;
while (file.ReadBlock(buffer, 0, block_size) > 0)
{
var rel_pos = buffer.ToString().IndexOf(magic);
if ((rel_pos > -1) & (!flag))
{
flag = true;
continue;
}
if ((rel_pos > -1) & (flag == true))
{
glob_pos = block_size * count + rel_pos;
break;
}
count++;
}
using (FileStream fs = new FileStream(my_filename, FileMode.Open, FileAccess.Read))
{
byte[] b = new byte[fs.Length - glob_pos];
fs.Seek(glob_pos, SeekOrigin.Begin);
fs.Read(b, 0, (int)(fs.Length - glob_pos));
File.WriteAllBytes("c:/output.txt", b);
}
But for some reason I'm copying almost the entire file, not the last few kilobytes. Is it because of compiler optimization, inlining the magic constant in the while loop, or something similar?
How should I do a self-extracting archive properly?
I guessed I should read the file backwards to avoid problems with the compiler inlining the magic constant multiple times.
So I've modified my code in the following way:
string my_filename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
StreamReader file = new StreamReader(my_filename);
const int block_size = 1024;
const string magic = "MAGIC";
char[] buffer = new Char[block_size];
Int64 count = 0;
Int64 glob_pos = 0;
while (file.ReadBlock(buffer, 0, block_size) > 0)
{
var rel_pos = buffer.ToString().IndexOf(magic);
if (rel_pos > -1)
{
glob_pos = block_size * count + rel_pos;
}
count++;
}
using (FileStream fs = new FileStream(my_filename, FileMode.Open, FileAccess.Read))
{
byte[] b = new byte[fs.Length - glob_pos];
fs.Seek(glob_pos, SeekOrigin.Begin);
fs.Read(b, 0, (int)(fs.Length - glob_pos));
File.WriteAllBytes("c:/output.txt", b);
}
So I scanned the whole file once, found what I thought would be the last occurrence of the magic number, and copied from there to the end of the file. While the file created by this procedure seems smaller than in the previous attempt, it is in no way the same file I attached to my "self-extracting" archive. Why?
My guess is that the position calculation for the beginning of the attached file is wrong because of the conversion from binary to string. If so, how should I modify my position calculation to make it correct?
Also, how should I choose the magic number when working with real files, PDFs for example? I won't be able to easily modify PDFs to include a predefined magic number.
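One detail in both loops above is worth isolating: calling ToString() on a char[] does not return the characters, it returns the type name "System.Char[]", so IndexOf(magic) never finds anything and glob_pos stays at its initial value. The sketch below (hypothetical class and method names) contrasts the two conversions:

```csharp
using System;

public static class BufferToStringSketch
{
    // What the loops above do: yields the type name, not the content.
    public static string ViaToString(char[] buffer)
    {
        return buffer.ToString();
    }

    // Builds a string from the first 'count' chars actually read.
    public static string ViaConstructor(char[] buffer, int count)
    {
        return new string(buffer, 0, count);
    }
}
```

For a buffer holding "xxMAGICxx", ViaToString(buffer).IndexOf("MAGIC") is -1, while ViaConstructor(buffer, 9).IndexOf("MAGIC") is 2. Even with that fixed, StreamReader decodes bytes into text, so a character index in a binary file does not have to equal a byte offset; searching the raw bytes avoids that mismatch.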
Try this out. Some C# Stream IO 101:
public static void Main()
{
String path = @"c:\here is your path";
// Method A: Read all information into a Byte Stream
Byte[] data = System.IO.File.ReadAllBytes(path);
String[] lines = System.IO.File.ReadAllLines(path);
// Method B: Use a stream to do essentially the same thing. (More powerful)
// Using block essentially means 'close when we're done'. See 'using block' or 'IDisposable'.
using (FileStream stream = File.OpenRead(path))
using (StreamReader reader = new StreamReader(stream))
{
// This will read all the data as a single string
String allData = reader.ReadToEnd();
}
String outputPath = @"C:\where I'm writing to";
// Copy from one file-stream to another
using (FileStream inputStream = File.OpenRead(path))
using (FileStream outputStream = File.Create(outputPath))
{
inputStream.CopyTo(outputStream);
// Again, this will close both streams when done.
}
// Copy to an in-memory stream
using (FileStream inputStream = File.OpenRead(path))
using (MemoryStream outputStream = new MemoryStream())
{
inputStream.CopyTo(outputStream);
// Again, this will close both streams when done.
// If you want to hold the data in memory, just don't wrap your
// memory stream in a using block.
}
// Use serialization to store data.
var serializer = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
// We'll serialize a person to the memory stream.
MemoryStream memoryStream = new MemoryStream();
serializer.Serialize(memoryStream, new Person() { Name = "Sam", Age = 20 });
// Now the person is stored in the memory stream (just as easy to write to disk using a
// file stream as well.
// Now lets reset the stream to the beginning:
memoryStream.Seek(0, SeekOrigin.Begin);
// And deserialize the person
Person deserializedPerson = (Person)serializer.Deserialize(memoryStream);
Console.WriteLine(deserializedPerson.Name); // Should print Sam
}
// Mark Serializable stuff as serializable.
// This means that C# will automatically format this to be put in a stream
[Serializable]
class Person
{
public String Name { get; set; }
public Int32 Age { get; set; }
}
The easiest solution is to replace
const string magic = "MAGICNUMBER";
with
static string magic = "magicnumber".ToUpper();
But there are more problems with the whole magic string approach. What if the file contains the magic string? I think that the best solution is to put the file size after the file. The extraction is much easier that way: read the length from the last bytes and then read the required number of bytes from the end of the file.
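The length-trailer idea can be sketched as follows. This is only an illustration under stated assumptions: the class name, method names and the 8-byte little-endian trailer are choices made for the example, and the payload is assumed to be small enough to hold in memory.

```csharp
using System;
using System.IO;

public static class LengthTrailerSketch
{
    // Writes: [stub executable][payload][payload length as 8 bytes].
    public static void Append(string stubPath, string payloadPath, string outputPath)
    {
        byte[] payload = File.ReadAllBytes(payloadPath);
        using (FileStream output = File.Create(outputPath))
        using (var writer = new BinaryWriter(output))
        {
            using (FileStream stub = File.OpenRead(stubPath))
            {
                stub.CopyTo(output);
            }
            writer.Write(payload);               // raw bytes, no prefix
            writer.Write((long)payload.Length);  // trailer
        }
    }

    // Reads the trailer first, then exactly that many bytes before it.
    public static byte[] Extract(string combinedPath)
    {
        using (FileStream input = File.OpenRead(combinedPath))
        using (var reader = new BinaryReader(input))
        {
            input.Seek(-sizeof(long), SeekOrigin.End);
            long length = reader.ReadInt64();
            input.Seek(-sizeof(long) - length, SeekOrigin.End);
            return reader.ReadBytes((int)length);
        }
    }
}
```

Because the extractor never searches for a marker, it does not matter what bytes the payload or the executable contain.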
Update: This should work unless your files are very big. (In that case you'd need to use a revolving pair of buffers to read the file in small blocks.):
string inputFilename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
string outputFilename = inputFilename + ".secret";
string magic = "magic".ToUpper();
byte[] data = File.ReadAllBytes(inputFilename);
byte[] magicData = Encoding.ASCII.GetBytes(magic);
for (int idx = magicData.Length - 1; idx < data.Length; idx++) {
bool found = true;
for (int magicIdx = 0; magicIdx < magicData.Length; magicIdx++) {
if (data[idx - magicData.Length + 1 + magicIdx] != magicData[magicIdx]) {
found = false;
break;
}
}
if (found) {
using (FileStream output = new FileStream(outputFilename, FileMode.Create)) {
output.Write(data, idx + 1, data.Length - idx - 1);
}
}
}
Update2: This should be much faster, use little memory and work on files of all sizes, but your program must be a proper executable (with its size being a multiple of 512 bytes):
string inputFilename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
string outputFilename = inputFilename + ".secret";
string marker = "magic".ToUpper();
byte[] markerData = Encoding.ASCII.GetBytes(marker);
int markerLength = markerData.Length;
const int blockSize = 512; //important!
using(FileStream input = File.OpenRead(inputFilename)) {
long lastPosition = 0;
byte[] buffer = new byte[blockSize];
while (input.Read(buffer, 0, blockSize) >= markerLength) {
bool found = true;
for (int idx = 0; idx < markerLength; idx++) {
if (buffer[idx] != markerData[idx]) {
found = false;
break;
}
}
if (found) {
input.Position = lastPosition + markerLength;
using (FileStream output = File.OpenWrite(outputFilename)) {
input.CopyTo(output);
}
}
lastPosition = input.Position;
}
}
Read about some approaches here: http://www.strchr.com/creating_self-extracting_executables
You can add the compressed file as a resource to the project itself:
Project > Properties
Set the property of this resource to Binary.
You can then retrieve the resource with
byte[] resource = Properties.Resources.NameOfYourResource;
Search backwards rather than forwards (assuming your file won't contain said magic number).
Or append your (text) file and then lastly its length (or the length of the original exe), so you only need to read the last DWORD / few bytes to see how long the file is - then no magic number is required.
More robustly, store the file as an additional data section within the executable file. This is more fiddly without external tools as it requires knowledge of the PE file format used for NT executables, q.v. http://msdn.microsoft.com/en-us/library/ms809762.aspx

Decode quoted printable correct

I have the following string:
=?utf-8?Q?=5Bproconact_=2D_Verbesserung_=23=32=37=39=5D_=28Neu=29_Stellvertretungen_Benutzerrecht_=2D_andere_k=C3=B6nnen_f=C3=BCr_andere_Stellvertretungen_erstellen_=C3=A4ndern_usw=2E_dadurch_ist_der_Schutz_der_Aktivi=C3=A4ten_Mails_nicht_gew=C3=A4hrt=...
which is an encoding of
[proconact-Verbesserung #279] (Neu) Stellvertretungen Benutzerrecht - andere können für andere Stellvertretungen erstellen ändern usw. dadurch ist der Schutz der Aktiviäten Mails nicht gewährt.
I am searching for a way to decode the quoted string.
I have tried:
private static string DecodeQuotedPrintables(string input, string charSet) {
Encoding enc = new ASCIIEncoding();
try {
enc = Encoding.GetEncoding(charSet);
} catch {
enc = new UTF8Encoding();
}
var occurences = new Regex(@"(=[0-9A-Z]{2}){1,}", RegexOptions.Multiline);
var matches = occurences.Matches(input);
foreach (Match match in matches) {
try {
byte[] b = new byte[match.Groups[0].Value.Length / 3];
for (int i = 0; i < match.Groups[0].Value.Length / 3; i++) {
b[i] = byte.Parse(match.Groups[0].Value.Substring(i * 3 + 1, 2), System.Globalization.NumberStyles.AllowHexSpecifier);
}
char[] hexChar = enc.GetChars(b);
input = input.Replace(match.Groups[0].Value, hexChar[0].ToString());
} catch { ;}
}
input = input.Replace("?=", "").Replace("=\r\n", "");
return input;
}
when I call (where s is my string)
var x = DecodeQuotedPrintables(s, "utf-8");
this will return
=?utf-8?Q?[proconact_-_Verbesserung_#_(Neu)_Stellvertretungen_Benutzerrecht_-_andere_können_für_andere_Stellvertretungen_erstellen_ändern_usw._dadurch_ist_der_Schutz_der_Aktiviäten_Mails_nicht_gewährt=...
What can I do so that the _ characters, the leading =?utf-8?Q? and the trailing =... are also removed?
The text you’re trying to decode is typically found in MIME headers, and is encoded according to the specification defined in the following Internet standard: RFC 2047: MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text.
There is a sample implementation for such a decoder on GitHub; maybe you can draw some ideas from it: RFC2047 decoder in C#.
You can also use this online tool for comparing your results: Online MIME Headers Decoder.
Note that your sample text is incorrect. The specification declares:
encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
Per the specification, any encoded word must end in ?=. Thus, your sample must be corrected from:
=?utf-8?Q?=5Bproconact_=2D_Verbesserung_=23=32=37=39=5D_=28Neu=29_Stellvertretungen_Benutzerrecht_=2D_andere_k=C3=B6nnen_f=C3=BCr_andere_Stellvertretungen_erstellen_=C3=A4ndern_usw=2E_dadurch_ist_der_Schutz_der_Aktivi=C3=A4ten_Mails_nicht_gew=C3=A4hrt=
…to (scroll to the far right):
=?utf-8?Q?=5Bproconact_=2D_Verbesserung_=23=32=37=39=5D_=28Neu=29_Stellvertretungen_Benutzerrecht_=2D_andere_k=C3=B6nnen_f=C3=BCr_andere_Stellvertretungen_erstellen_=C3=A4ndern_usw=2E_dadurch_ist_der_Schutz_der_Aktivi=C3=A4ten_Mails_nicht_gew=C3=A4hrt?=
Strictly speaking, your sample is also invalid because it exceeds the 75-character limit imposed on any encoded word; however, most decoders tend to be tolerant of this non-conformity.
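As a rough illustration of the encoded-word grammar quoted above, here is a minimal sketch of a decoder for a single Q-encoded word. It is not a complete RFC 2047 implementation: the class and method names are hypothetical, it handles only the Q encoding, and it assumes the word is well-formed and ends in ?=. The important difference from the code in the question is that all =XX escapes are collected into one byte sequence before decoding, so multi-byte UTF-8 characters and runs such as =23=32=37=39 survive intact.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class EncodedWordSketch
{
    // "=?" charset "?" "Q" "?" encoded-text "?="
    private static readonly Regex Pattern =
        new Regex(@"^=\?([^?]+)\?[Qq]\?([^?]*)\?=$");

    public static string DecodeQ(string word)
    {
        Match m = Pattern.Match(word);
        if (!m.Success) return word; // not a Q-encoded word: leave untouched

        Encoding enc = Encoding.GetEncoding(m.Groups[1].Value);
        string text = m.Groups[2].Value;
        var bytes = new List<byte>(text.Length);

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            byte value;
            if (c == '_')
            {
                bytes.Add(0x20); // underscore stands for a space
            }
            else if (c == '=' && i + 2 < text.Length &&
                     byte.TryParse(text.Substring(i + 1, 2),
                                   NumberStyles.AllowHexSpecifier,
                                   CultureInfo.InvariantCulture, out value))
            {
                bytes.Add(value); // =XX is one raw byte
                i += 2;
            }
            else
            {
                bytes.Add((byte)c); // plain ASCII character
            }
        }
        return enc.GetString(bytes.ToArray());
    }
}
```

For example, DecodeQ("=?utf-8?Q?k=C3=B6nnen_f=C3=BCr?=") returns "können für". On .NET Core and later, charsets such as windows-1254 are only available after registering System.Text.CodePagesEncodingProvider.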
I've tested 5+ code snippets and this is the one that works. I modified the regex part:
Test line:
im sistemlerimizde bak=FDm =E7al=FD=FEmas=FD yap=FDlaca=F0=FDndan; www.gib.=
Sample call:
string encoding = "windows-1254";
string input = "im sistemlerimizde bak=FDm =E7al=FD=FEmas=FD yap=FDlaca=F0=FDndan; www.gib.=";
DecodeQuotedPrintables(input, encoding);
Code snippet:
private static string DecodeQuotedPrintables(string input, string charSet)
{
System.Text.Encoding enc = System.Text.Encoding.UTF7;
try
{
enc = Encoding.GetEncoding(charSet);
}
catch
{
enc = new UTF8Encoding();
}
////parse looking for =XX where XX is hexadecimal
//var occurences = new Regex(@"(=[0-9A-Z]{2}){1,}", RegexOptions.Multiline);
var occurences = new Regex("(\\=([0-9A-F][0-9A-F]))", RegexOptions.Multiline);
var matches = occurences.Matches(input);
foreach (Match match in matches)
{
try
{
byte[] b = new byte[match.Groups[0].Value.Length / 3];
for (int i = 0; i < match.Groups[0].Value.Length / 3; i++)
{
b[i] = byte.Parse(match.Groups[0].Value.Substring(i * 3 + 1, 2), System.Globalization.NumberStyles.AllowHexSpecifier);
}
char[] hexChar = enc.GetChars(b);
input = input.Replace(match.Groups[0].Value, hexChar[0].ToString());
}
catch
{ ;}
}
input = input.Replace("?=", "").Replace("=\r\n", "");
return input;
}
As mentioned, a standard .NET class exists for this purpose.
string unicodeString =
"=?UTF-8?Q?YourText?=";
System.Net.Mail.Attachment attachment = System.Net.Mail.Attachment.CreateAttachmentFromString("", unicodeString);
Console.WriteLine(attachment.Name);
Following my comment I'd suggest
private static string MessedUpUrlDecode(string input, string encoding)
{
Encoding enc = new ASCIIEncoding();
try
{
enc = Encoding.GetEncoding(encoding);
}
catch
{
enc = new UTF8Encoding();
}
// Keep only the encoded text that follows "=?charset?Q?"
string messedup = input.Split('?')[3];
// Turn the quoted-printable escapes into URL escapes and decode those
string cleaned = messedup.Replace("_", " ").Replace("=...", ".").Replace("=", "%");
return System.Web.HttpUtility.UrlDecode(cleaned, enc);
}
assuming that the mutilating of the source strings is consistent.
I am not too sure how to remove the
=?utf-8?Q?
unless it appears every time. If it does, you can do this:
input = input.Split('?')[3];
To get rid of the trailing '=' you can remove it by:
input = input.Remove(input.Length - 1);
You can get rid of the '_' by replacing it with a space:
input = input.Replace("_", " ");
You can use those pieces of code in your DecodeQuotedPrintables function.
Hope this Helps!

asp.net mvc file download --System.FormatException: An invalid character was found in the mail header

Our website has files in a few different languages: French, Spanish, Portuguese, and English. When a user uploads a file whose name contains special characters like ó, ç, or ã and I return File(data, "application/octet-stream", name); in MVC, I get the exception:
System.FormatException: An invalid character was found in the mail header.
I found an article on MSDN showing how to set the MailMessage to UTF-8 encoding to avoid this, but I do not know how to UTF-8 encode the filename when using the MVC File action result. I found an article on the net on UTF-8 encoding a string, but when I try to use it I get a garbage name, so I guess I do not understand what UTF-8 encoding is supposed to do to the string. Here is the sample code found in this blog post: An invalid character was found in the mail header
public static string GetCleanedFileName(string s)
{
char[] chars = s.ToCharArray();
var sb = new StringBuilder();
for (int index = 0; index < chars.Length; index++)
{
string encodedString = EncodeChar(chars[index]);
sb.Append(encodedString);
}
return sb.ToString();
}
private static string EncodeChar(char chr)
{
var encoding = new UTF8Encoding();
var sb = new StringBuilder();
byte[] bytes = encoding.GetBytes(chr.ToString());
for (int index = 0; index < bytes.Length; index++)
{
sb.AppendFormat("%{0}", Convert.ToString(bytes[index], 16));
}
return sb.ToString();
}
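For comparison, the framework already has a method that percent-encodes the UTF-8 bytes of a string: Uri.EscapeDataString. The sketch below (hypothetical class and method names) wraps it and also shows the two-digit hex formatting that the blog code lacks, since Convert.ToString(value, 16) yields a single digit for bytes below 16. Whether a percent-encoded name is what the browser ends up displaying depends on how the Content-Disposition header is built, which this sketch does not cover.

```csharp
using System;

public static class FileNameEncodingSketch
{
    // Percent-encodes the UTF-8 bytes of every character that is not
    // an unreserved ASCII character; letters, digits and '.' stay as-is.
    public static string PercentEncode(string fileName)
    {
        return Uri.EscapeDataString(fileName);
    }

    // Always two uppercase hex digits, e.g. 10 becomes "0A".
    public static string ToTwoDigitHex(byte value)
    {
        return value.ToString("X2");
    }
}
```

PercentEncode("relatório.pdf") returns "relat%C3%B3rio.pdf": only the ó is replaced, by its two UTF-8 bytes, whereas the blog code escapes every character of the name.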
Maybe try another function for encoding to and from UTF-8:
//UTF8
public static string ConvertToUTF8(string inputString)
{
string toReturn = "";
byte[] arr = Encoding.UTF8.GetBytes(inputString);
for (int i = 0; i < arr.Length; i++)
{
toReturn += arr[i].ToString() + " ";
}
return toReturn;
}
public static string ConvertFromUTF8(string inputString)
{
inputString = inputString.Trim();
string result = "";
string[] parts = inputString.Split(' ');
byte[] bytes = new byte[parts.Length];
for (int i = 0; i < parts.Length; i++)
{
if (parts[i] == "")
{
continue;
}
try
{
bytes[i] = Convert.ToByte(parts[i]);
}
catch (Exception)
{
MessageBox.Show("Input string was not in a correct format.");
}
}
try
{
result = Encoding.UTF8.GetString(bytes);
}
catch (Exception)
{
throw;
}
return result;
}
I think I have an idea: you have to convert your string not to UTF-8 but to UTF-16,
because UTF-8 is, as far as I know, an ASCII-compatible encoding.
UTF-16 represents most characters using two bytes (four for characters outside the Basic Multilingual Plane). UTF-8 uses the one-byte ASCII encodings for ASCII characters and represents non-ASCII characters using variable-length encodings of two to four bytes. Keep in mind that while UTF-8 can save space for Western languages, which is an argument often used by proponents, it needs three bytes per character for many other scripts.
And the symbols you wrote are not ASCII.
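The byte counts mentioned above are easy to check. A small sketch, with hypothetical class and method names:

```csharp
using System;
using System.Text;

public static class ByteCountSketch
{
    // Number of bytes the string occupies when encoded as UTF-8.
    public static int Utf8Bytes(string s)
    {
        return Encoding.UTF8.GetByteCount(s);
    }

    // Number of bytes the string occupies when encoded as UTF-16
    // (Encoding.Unicode is little-endian UTF-16 in .NET).
    public static int Utf16Bytes(string s)
    {
        return Encoding.Unicode.GetByteCount(s);
    }
}
```

For "a" the two methods return 1 and 2, for "ó" they return 2 and 2, and for "€" they return 3 and 2.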
