C#: Inserting strings into another string - performance issue

I have a long string and a sorted dictionary of indexes and values. I need to go over the elements in the dictionary and insert each value at the specified index in the string. I wrote the following code, which works correctly but is very slow:
private string restoreText(string text)
{
    StringBuilder sb = new StringBuilder(text);
    foreach (KeyValuePair<int, string> pair in _tags)
    {
        sb.Insert(pair.Key, pair.Value);
    }
    return sb.ToString();
}
The dictionary might be very big and contain 500,000 elements.
I think that what makes this function slow is the Insert() method. For a dictionary of 100,000 elements, it took almost 5 seconds.
Is there a more efficient way to write this method?
Thanks,
Maya

A better way would be to sort the items for insertion and then append them one after another.
Since you didn't comment on how overlapping indexes are handled, maybe your items are already sorted in the first place?

Your original code will give different results depending on the order that items are returned from _tags; I very much suspect this isn't your intent.
Instead, sort the tags into order and then add them into the string builder in correct sequence:
private string restoreText(string text)
{
    StringBuilder sb = new StringBuilder();
    foreach (KeyValuePair<int, string> pair in _tags.OrderBy(t => t.Key))
    {
        sb.Append(pair.Value);
    }
    return sb.ToString();
}
If you really want to make this go as fast as possible, initialise the capacity of the StringBuilder up front:
StringBuilder sb = new StringBuilder(_tags.Sum(k => k.Value.Length));
Update
I missed the text parameter originally used to initialise the StringBuilder.
In order to avoid shuffling text around in memory (as caused by StringBuilder.Insert()), we want to stick with using StringBuilder.Append().
We can do this by converting the original text into another sequence of KeyValuePair instances, merging those with the original list and processing in order.
It would look something like this (note: ad-hoc code):
private string restoreText(string text)
{
    var textPairs
        = text.Select((c, i) => new KeyValuePair<int, string>(i, c.ToString())); // a char can't be cast directly to string
    var fullSequence
        = textPairs.Union(_tags).OrderBy(t => t.Key);
    StringBuilder sb = new StringBuilder();
    foreach (KeyValuePair<int, string> pair in fullSequence)
    {
        sb.Append(pair.Value);
    }
    return sb.ToString();
}
Note - I've made a whole heap of assumptions about your context, so this may not work quite right for you. Particularly be aware that .Union() will discard duplicates, though there are easy workarounds for that.
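One easy workaround (a sketch) is .Concat(), which keeps every element; and since LINQ's OrderBy is a stable sort, pairs with equal keys stay in the order they were concatenated:

var fullSequence = textPairs.Concat(_tags).OrderBy(t => t.Key); // Concat keeps duplicates that Union would discard

As before, the StringBuilder could be pre-sized, here with new StringBuilder(text.Length + _tags.Sum(t => t.Value.Length)), to avoid regrowing.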

What I don't get is whether your indices are set up so that each insert won't shift the others, but since your code implies "yes", I'll assume so too.
Can you test this one:
private string RestoreText(string text)
{
    var sb = new StringBuilder();
    var totalLen = 0;
    var orgIndex = 0;
    foreach (var pair in _tags.OrderBy(t => t.Key))
    {
        var toAdd = text.Substring(orgIndex, pair.Key - totalLen);
        sb.Append(toAdd);
        orgIndex += toAdd.Length;
        totalLen += toAdd.Length;
        sb.Append(pair.Value);
        totalLen += pair.Value.Length;
    }
    if (orgIndex < text.Length) sb.Append(text.Substring(orgIndex));
    return sb.ToString();
}
It only uses Append() while producing the same result as your original code.

I don't know about your data, but in my test it ran fast (564 ms).
Dictionary<int, string> _tags = new Dictionary<int, string>();
for (int i = 0; i < 1000000; i++)
{
    _tags.Add(i, i.ToString().Length + "");
}
string text = new String('a', 50000000);
Console.WriteLine("****************************************");
System.Diagnostics.Stopwatch sw = System.Diagnostics.Stopwatch.StartNew();
StringBuilder sb = new StringBuilder(text);
foreach (KeyValuePair<int, string> pair in _tags)
{
    sb.Insert(pair.Key, pair.Value);
}
sw.Stop();
Console.WriteLine("sw:" + sw.ElapsedMilliseconds);
Console.ReadKey();
If you can use Append() instead of Insert(), it takes only 35 ms...
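For comparison, the Append variant of the timed block would presumably look like this (a sketch: it measures the raw cost of appending the same values, not a faithful reconstruction of the final text):

System.Diagnostics.Stopwatch sw = System.Diagnostics.Stopwatch.StartNew();
StringBuilder sb = new StringBuilder(text);
foreach (KeyValuePair<int, string> pair in _tags)
{
    sb.Append(pair.Value); // no shifting of existing characters
}
sw.Stop();
Console.WriteLine("sw:" + sw.ElapsedMilliseconds);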

Related

convert dictionary to list of objects in C#

I have a dictionary:
Dictionary<string, string> valuesDict = new Dictionary<string, string> {
    {"Q1", "A1"},
    {"Q2", "A2"},
    {"Q3", "A3"},
    {"Q4", "A4"} /* 20,000 Q and A pairs */
};
In order to load this into a third-party interface which only accepts a list of objects (of class QuestionAnswer), I am manually converting it to a list like so:
public class QuestionAnswer {
    public string Question;
    public string Answer;
}
Objects of the QuestionAnswer class are then created within the loop:
List<QuestionAnswer> qaList = new List<QuestionAnswer>();
foreach (var key in valuesDict.Keys) {
    qaList.Add(new QuestionAnswer { Question = key, Answer = valuesDict[key] });
}
I want to know if there is a faster way to populate this list from the dictionary.
What I have found so far:
While looking around for a solution, I came across one for converting a simple Dictionary to a List of simple types: Convert dictionary to List<KeyValuePair>.
Could someone please help me apply this solution to my case?
I am also open to any other solution that can remove this overhead.
You're doing an unnecessary lookup for the key:
foreach (var item in valuesDict) {
    qaList.Add(new QuestionAnswer { Question = item.Key, Answer = item.Value });
}
You can also provide the list capacity when initializing it, to avoid resizes:
List<QuestionAnswer> qaList = new List<QuestionAnswer>(valuesDict.Count);
You can use LINQ-based solutions, but those are slower, and you're asking for the optimal solution.
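Putting both tips together, a minimal sketch of the optimized conversion:

// Pre-sized list, single pass over the dictionary, no key lookups
var qaList = new List<QuestionAnswer>(valuesDict.Count);
foreach (var item in valuesDict)
{
    qaList.Add(new QuestionAnswer { Question = item.Key, Answer = item.Value });
}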
You can create a list with LINQ by projecting each KeyValuePair of the dictionary into your QuestionAnswer object:
var qaList = valuesDict
    .Select(kvp => new QuestionAnswer { Question = kvp.Key, Answer = kvp.Value })
    .ToList();
Faster? Well, yes, absolutely: iterate directly over the dictionary, not over the Keys collection:
foreach (var kv in valuesDict) {
    qaList.Add(new QuestionAnswer { Question = kv.Key, Answer = kv.Value });
}
Or better yet, using System.Linq:
var qaList = valuesDict
    .Select(kv => new QuestionAnswer { Question = kv.Key, Answer = kv.Value })
    .ToList();
In your code you are performing an unnecessary key search on each iteration.
Basically there are two common approaches: a foreach loop or LINQ. To check the performance you can use a Stopwatch and run simple code like this:
Dictionary<string, string> valuesDict = new Dictionary<string, string>();
for (uint i = 0; i < 60000; i++)
{
    valuesDict.Add(i.ToString(), i.ToString());
}
List<QuestionAnswer> qaList;
Stopwatch stp = new Stopwatch();
stp.Start();
//LINQ approach
qaList = valuesDict.Select(kv => new QuestionAnswer { Question = kv.Key, Answer = kv.Value }).ToList();
stp.Stop();
Console.WriteLine(stp.ElapsedTicks);
stp.Restart();
//Foreach approach
qaList = new List<QuestionAnswer>();
foreach (var item in valuesDict)
{
    qaList.Add(new QuestionAnswer { Question = item.Key, Answer = item.Value });
}
stp.Stop();
Console.WriteLine(stp.ElapsedTicks);
My result: the foreach approach performs about 30% faster than the LINQ approach.

How do I find all the distinct keys of JSON records in C#?

Why is this question not a duplicate? (Added after seeing comments)
It is not related to Entity Framework.
It deals with parsing huge JSON files and finding distinct keys, not records!
I have 200+ files, each 2+ GB, so the total size is 400+ GB. Each line in those files is a JSON string. I don't have a JSON schema for the records beforehand. My job is to find all the keys in those files.
I wrote the following code to get all the distinct keys from all those JSON records. I call it from Main using a multi-threaded for loop.
private void GetTokensFromJson(string filePath)
{
    IEnumerable<string> txts = File.ReadLines(filePath, Encoding.UTF8);
    Console.WriteLine(txts.Count());
    List<string> distinctKeys = new List<string>();
    foreach (var text in txts)
    {
        string pattern = "{\"";
        foreach (Match m in Regex.Matches(text, pattern))
        {
            //string matchValue = m.Value;
            int matchIndex = m.Index;
            string subStr = text.Substring(matchIndex + 2, text.Length - matchIndex - 3);
            int quoteIndex = subStr.IndexOf('\"');
            string jsonKey = subStr.Substring(0, quoteIndex);
            if (!distinctKeys.Contains(jsonKey) && !jsonKey.Contains("\\"))
            {
                Console.WriteLine(jsonKey);
                distinctKeys.Add(jsonKey);
            }
        }
        string secondPattern = "\":";
        foreach (Match m in Regex.Matches(text, secondPattern))
        {
            int matchIndex = m.Index;
            string revJsonKKey = "";
            while (matchIndex > 0)
            {
                matchIndex--;
                if (text[matchIndex] != '\"')
                    revJsonKKey += text[matchIndex];
                else
                    break;
            }
            IEnumerable<char> jsonKeyCharArray = revJsonKKey.Reverse();
            string jsonKey = "";
            foreach (char c in jsonKeyCharArray)
            {
                jsonKey += c;
            }
            if (!distinctKeys.Contains(jsonKey) && !jsonKey.Contains("\\"))
            {
                Console.WriteLine(jsonKey);
                distinctKeys.Add(jsonKey);
            }
        }
    }
}
distinctKeys has all the distinct JSON keys. But I'm missing a few keys and adding unwanted ones, not sure why :|. I can't debug for the given input, as it is too huge! Also, this method is too slow.
To make things clearer, let's take an example. If the files have the following JSON,
{"id":"123", "name":"hello, world", "department":[{"name":"dept1", "deptID":"123"}]}
{"id":"456324", "department":[{"name":"dept2", "deptID":"456"}]}
the expected output is id, name, department, department->name, department->deptID. The formatting of the output doesn't matter. Note that not all the JSON records will have all the keys, and a JSON record can contain nested JSON records.
I have two questions:
What am I doing wrong in the code?
Is there a built-in or third-party DLL which will give me the keys of a JSON record as output, when I give a complex JSON record as input?
Try it with Json.NET; the Path property of a JProperty contains the full path of that property:
private static void GetKeys(JObject obj, List<string> keys)
{
    var result = obj.Descendants()
        .OfType<JProperty>() // equivalent to .Where(f => f is JProperty).Select(f => f as JProperty)
        .Select(f => f.Path)
        .Where(f => !keys.Contains(f));
    keys.AddRange(result);
}
static void Main(string[] args)
{
    IEnumerable<string> txts = @"{'id':'123', 'name':'hello, world', 'department':[{'name':'dept1', 'deptID':'123'}]}
{'id':'456324', 'department':[{'name':'dept2', 'deptID':'456'}]}".Split("\r\n".ToArray(), StringSplitOptions.RemoveEmptyEntries);
    List<string> keys = new List<string>();
    foreach (var item in txts)
    {
        var obj = JObject.Parse(item);
        GetKeys(obj, keys);
    }
}
Read the strings into Json.NET and convert them to JObjects.
Then loop through the JObjects:
foreach (var jObject in jObjects)
{
    IList<string> keys = jObject.Properties().Select(p => p.Name).ToList();
}
then call
keys.Distinct();
It will be like:
private void GetTokensFromJson(string filePath)
{
    IEnumerable<string> txts = File.ReadLines(filePath, Encoding.UTF8);
    Console.WriteLine(txts.Count());
    List<JObject> jObjects = new List<JObject>();
    List<string> keysList = new List<string>();
    foreach (var text in txts)
    {
        jObjects.Add(JObject.Parse(text));
    }
    foreach (var jObject in jObjects)
    {
        keysList.AddRange(jObject.Properties().Select(p => p.Name));
    }
    List<string> distinctKeys = keysList.Distinct().ToList();
}
Let's do the math, shall we? You have:
200 files
of at least 2 GB each
where a line is, let's say, 120 characters (240 bytes) on average
That makes for 400 GB of internal memory just to hold all the content, and for
1,789,569,707, i.e. nearly 2 billion, lines.
Clearly your problem here is not one that is related to parsing, but to managing your memory and indexing on keys in an incremental manner, using intermediate results that do not all reside in memory.
Using the simple list that you have now to track your keys, and assuming that 1 in 20 of your keys is unique:
You now have to maintain 125 million key entries in your index list.
If the storage required for a single key index entry is 80 bytes, the list adds up to 9 GB of memory.
Searching the list (125 million items) for duplicates on every new line is going to be very slow.
You may want to look into map/reduce style algorithms to figure out how something like this may be achieved.
A few issues:
Don't do Console.WriteLine(txts.Count());. I believe this actually causes you to read the entire file twice -- once to count, and once to read keys.
Use a HashSet<string> to collect distinct keys, it's much faster than using a list.
As Kenner Dev suggests, install Json.NET and use LINQ to JSON to parse each line of the file without needing to know a schema.
Continue to read the files line by line as you are currently doing; don't try to load the entire thing into memory at once in any representation.
Then, GetTokensFromJson becomes:
public static HashSet<string> GetTokensFromJson(IEnumerable<string> txts)
{
    return new HashSet<string>(
        txts.Select(t => JObject.Parse(t))
            .Where(o => o != null)
            .SelectMany(o => o.Descendants().OfType<JProperty>())
            .Select(p => p.Name));
}
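Since the question mentions calling this from a multi-threaded for loop, here is a hedged sketch of how the per-file calls might be combined (filePaths is a hypothetical collection of the 200+ file paths; it needs System.Collections.Concurrent and System.Threading.Tasks):

// Sketch only: process each file independently, then merge the per-file sets.
var perFileKeys = new ConcurrentBag<HashSet<string>>();
Parallel.ForEach(filePaths, path =>
    perFileKeys.Add(GetTokensFromJson(File.ReadLines(path, Encoding.UTF8))));
var distinctKeys = new HashSet<string>();
foreach (var set in perFileKeys)
{
    distinctKeys.UnionWith(set); // set union keeps only distinct keys
}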

Speedily Read and Parse Data

As of now, I am using this code to open a file, read it into a list, and parse that list into a string[]:
string CP4DataBase =
    "C:\\Program\\Line Balancer\\FUJI DB\\KTS\\KTS - CP4 - Part Data Base.txt";
CP4DataBaseRTB.LoadFile(CP4DataBase, RichTextBoxStreamType.PlainText);
string[] splitCP4DataBaseLines = CP4DataBaseRTB.Text.Split('\n');
List<string> tempCP4List = new List<string>();
string[] line1CP4Components;
foreach (var line in splitCP4DataBaseLines)
    tempCP4List.Add(line + Environment.NewLine);
string concattedUnitPart = "";
foreach (var line in tempCP4List)
{
    concattedUnitPart = concattedUnitPart + line;
    line1CP4PartLines++;
}
line1CP4Components = new Regex("\"UNIT\",\"PARTS\"", RegexOptions.Multiline)
    .Split(concattedUnitPart)
    .Where(c => !string.IsNullOrEmpty(c)).ToArray();
I am wondering if there is a quicker way to do this. This is just one of the files I am opening, so this is repeated a minimum of 5 times to open and properly load the lists.
The minimum file size being imported right now is 257 KB. The largest file is 1,803 KB. These files will only get larger as time goes on as they are being used to simulate a database and the user will continually add to them.
So my question is, is there a quicker way to do all of the above code?
EDIT:
***CP4***
"UNIT","PARTS"
"BLOCK","HEADER-"
"NAME","106536"
"REVISION","0000"
"DATE","11/09/03"
"TIME","11:10:11"
"PMABAR",""
"COMMENT",""
"PTPNAME","R160805"
"CMPNAME","R160805"
"BLOCK","PRTIDDT-"
"PMAPP",1
"PMADC",0
"ComponentQty",180
"BLOCK","PRTFORM-"
"PTPSZBX",1.60
"PTPSZBY",0.80
"PTPMNH",0.25
"NeedGlue",0
"BLOCK","TOLEINF-"
"PTPTLBX",0.50
"PTPTLBY",0.40
"PTPTLCL",10
"PTPTLPX",0.30
"PTPTLPY",0.30
"PTPTLPQ",30
"BLOCK","ELDT+" "PGDELSN","PGDELX","PGDELY","PGDELPP","PGDELQ","PGDELP","PGDELW","PGDELL","PGDELWT","PGDELLT","PGDELCT","PGDELR"
0,0.000,0.000,0,0,0.000,0.000,0.000,0.000,0.000,0.000,0
"BLOCK","VISION-"
"PTPVIPL",0
"PTPVILCA",0
"PTPVILB",0
"PTPVICVT",10
"PENVILIT",0
"BLOCK","ENVDT"
"ELEMENT","CP43ENVDT-"
"PENNMI",1.0
"PENNMA",1.0
"PENNZN",""
"PENNZT",1.0
"PENBLM",12
"PENCRTS",0
"PENSPD1",100
"PTPCRDCT",0
"PENVICT",1
"PCCCRFT",1
"BLOCK","CARRING-"
"PTPCRAPO",0
"PTPCRPCK",0
"PTPCRPUX",0.00
"PTPCRPUY",0.00
"PTPCRRCV",0
"BLOCK","PACKCLS-"
"FDRTYPE","Emboss"
"TAPEWIDTH","8mm"
"FEEDPITCH",4
"REELDIAMETER",0
"TAPEDEPTH",0.0
"DOADVVACUUM",0
"CHKBEFOREFEED",0
"TAPEARMLENGTH",0
"PPCFDPP",0
"PPCFDEC",4
"PPCMNPT",30
"UNIT","PARTS"
"BLOCK","HEADER-"
"NAME","106653"
"REVISION","0000"
"DATE","11/09/03"
"TIME","11:10:42"
"PMABAR",""
"COMMENT",""
"PTPNAME","0603R"
"CMPNAME","0603R"
"BLOCK","PRTIDDT-"
"PMAPP",1
"PMADC",0
"ComponentQty",18
"BLOCK","PRTFORM-"
"PTPSZBX",1.60
"PTPSZBY",0.80
"PTPMNH",0.23
"NeedGlue",0
"BLOCK","TOLEINF-"
"PTPTLBX",0.50
"PTPTLBY",0.34
"PTPTLCL",0
"PTPTLPX",0.60
"PTPTLPY",0.40
"PTPTLPQ",30
"BLOCK","ELDT+" "PGDELSN","PGDELX","PGDELY","PGDELPP","PGDELQ","PGDELP","PGDELW","PGDELL","PGDELWT","PGDELLT","PGDELCT","PGDELR"
0,0.000,0.000,0,0,0.000,0.000,0.000,0.000,0.000,0.000,0
"BLOCK","VISION-"
"PTPVIPL",0
"PTPVILCA",0
"PTPVILB",0
"PTPVICVT",10
"PENVILIT",0
"BLOCK","ENVDT"
"ELEMENT","CP43ENVDT-"
"PENNMI",1.0
"PENNMA",1.0
"PENNZN",""
"PENNZT",1.0
"PENBLM",12
"PENCRTS",0
"PENSPD1",80
"PTPCRDCT",0
"PENVICT",1
"PCCCRFT",1
"BLOCK","CARRING-"
"PTPCRAPO",0
"PTPCRPCK",0
"PTPCRPUX",0.00
"PTPCRPUY",0.00
"PTPCRRCV",0
"BLOCK","PACKCLS-"
"FDRTYPE","Emboss"
"TAPEWIDTH","8mm"
"FEEDPITCH",4
"REELDIAMETER",0
"TAPEDEPTH",0.0
"DOADVVACUUM",0
"CHKBEFOREFEED",0
"TAPEARMLENGTH",0
"PPCFDPP",0
"PPCFDEC",4
"PPCMNPT",30
... the file goes on and on and on.. and will only get larger.
The regex is placing each "UNIT","PARTS" marker and the code that follows it, up to the NEXT "UNIT","PARTS", into a string[].
After this, I am checking each string[] to see if its "NAME" section exists in a different list. If it does exist, I am outputting that "UNIT","PARTS" block at the end of a text file.
This bit is a potential performance killer:
string concattedUnitPart = "";
foreach (var line in tempCP4List)
{
    concattedUnitPart = concattedUnitPart + line;
    line1CP4PartLines++;
}
(See this article for why.) Use a StringBuilder for repeated concatenation:
// No need to use tempCP4List at all
StringBuilder builder = new StringBuilder();
foreach (var line in splitCP4DataBaseLines)
{
    builder.AppendLine(line); // append to the builder, not to the old string
    line1CP4PartLines++;
}
Or even just:
string concattedUnitPart = string.Join(Environment.NewLine,
splitCP4DataBaseLines);
Now the regex part may well also be slow - I'm not sure. It's not obvious what you're trying to achieve, whether you need regular expressions at all, or whether you really need to do the whole thing in one go. Can you definitely not just process it line by line?
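If line-by-line processing is an option, here is a minimal sketch of that approach (it assumes each unit begins with a "UNIT","PARTS" line, as the sample file suggests; the variable names are illustrative):

// Sketch only: group lines into units without building one giant string.
var units = new List<List<string>>();
List<string> current = null;
foreach (string line in File.ReadLines(CP4DataBase))
{
    if (line.StartsWith("\"UNIT\",\"PARTS\""))
    {
        current = new List<string>(); // a new unit starts here
        units.Add(current);
    }
    if (current != null)
    {
        current.Add(line);
    }
}

Each List<string> in units then corresponds to one "UNIT","PARTS" block, and the "NAME" check could be done per unit without any regex.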
You could achieve the same output list 'line1CP4Components' using the following:
Regex StripEmptyLines = new Regex(@"^\s*$", RegexOptions.Multiline);
Regex UnitPartsMatch = new Regex(@"(?<=\n)""UNIT"",""PARTS"".*?(?=(?:\n""UNIT"",""PARTS"")|$)", RegexOptions.Singleline);
string CP4DataBase =
    "C:\\Program\\Line Balancer\\FUJI DB\\KTS\\KTS - CP4 - Part Data Base.txt";
CP4DataBaseRTB.LoadFile(CP4DataBase, RichTextBoxStreamType.PlainText);
List<string> line1CP4Components = new List<string>(
    UnitPartsMatch.Matches(StripEmptyLines.Replace(CP4DataBaseRTB.Text, ""))
        .OfType<Match>()
        .Select(m => m.Value)
);
return line1CP4Components.ToArray();
You may be able to ignore the use of StripEmptyLines, but your original code is doing this via the Where(c => !string.IsNullOrEmpty(c)). Also, your original code is causing the '\r' part of the "\r\n" newline pair to be duplicated; I assumed this was an accident and not intentional.
Also, you don't seem to be using the value in 'line1CP4PartLines', so I omitted its creation. It was seemingly inconsistent with the omission of empty lines later, so I guess you're not depending on it. If you need this value, a simple regex can tell you how many lines are in the string:
int linecount = new Regex("^", RegexOptions.Multiline).Matches(CP4DataBaseRTB.Text).Count;
// example of what your code will look like
string CP4DataBase = "C:\\Program\\Line Balancer\\FUJI DB\\KTS\\KTS - CP4 - Part Data Base.txt";
List<string> Cp4DataList = new List<string>(File.ReadAllLines(CP4DataBase));
//or create a Dictionary<int,string[]> object
string strData = string.Empty; //holds the line item data which is read in line by line
string[] strStockListRecord = null; //string array that holds information from the TFE_Stock.txt file
Dictionary<int, string[]> dctStockListRecords = null; //dictionary that will hold the KeyValuePairs of text file contents
List<string> lstStockListRecord = null; //generic list that will store all the lines from the .prn file being processed
if (File.Exists(strExtraLoadFileLoc + strFileName))
{
    try
    {
        lstStockListRecord = new List<string>();
        List<string> lstStrLinesStockRecord = new List<string>(File.ReadAllLines(strExtraLoadFileLoc + strFileName));
        dctStockListRecords = new Dictionary<int, string[]>(lstStrLinesStockRecord.Count());
        int intLineCount = 0;
        foreach (string strLineSplit in lstStrLinesStockRecord)
        {
            lstStockListRecord.Add(strLineSplit);
            dctStockListRecords.Add(intLineCount, lstStockListRecord.ToArray());
            lstStockListRecord.Clear();
            intLineCount++;
        } //foreach (string strLineSplit in lstStrLinesStockRecord)
        lstStrLinesStockRecord.Clear();
        lstStrLinesStockRecord = null;
        lstStockListRecord.Clear();
        lstStockListRecord = null;
        //Alter the code to fit what you are doing..

In C#, best way to check if a StringBuilder contains a substring

I have an existing StringBuilder object; the code appends some values and a delimiter to it.
I want to modify the code so that before appending the text, it checks whether that text already exists in the StringBuilder. Only if it does not will it append the text; otherwise the text is ignored.
What is the best way to do so? Do I need to change the object to a string type? I need the best approach, one that will not hamper performance.
public static string BuildUniqueIDList(context RequestContext)
{
    string rtnvalue = string.Empty;
    try
    {
        StringBuilder strUIDList = new StringBuilder(100);
        for (int iCntr = 0; iCntr < RequestContext.accounts.Length; iCntr++)
        {
            if (iCntr > 0)
            {
                strUIDList.Append(",");
            }
            // need to do something like:
            // if strUIDList.Contains(RequestContext.accounts[iCntr].uniqueid) then continue
            // otherwise append
            strUIDList.Append(RequestContext.accounts[iCntr].uniqueid);
        }
        rtnvalue = strUIDList.ToString();
    }
    catch (Exception e)
    {
        throw;
    }
    return rtnvalue;
}
I am not sure if having something like this will be efficient:
if (!strUIDList.ToString().Contains(RequestContext.accounts[iCntr].uniqueid.ToString()))
Personally I would use:
return string.Join(",", RequestContext.accounts
.Select(x => x.uniqueid)
.Distinct());
No need to loop explicitly, manually use a StringBuilder etc... just express it all declaratively :)
(You'd need to call ToArray() at the end if you're not using .NET 4, which would obviously reduce the efficiency somewhat... but I doubt it'll become a bottleneck for your app.)
EDIT: Okay, for a non-LINQ solution... if the size is reasonably small, I'd just go for:
// First create a list of unique elements
List<string> ids = new List<string>();
foreach (var account in RequestContext.accounts)
{
    string id = account.uniqueid;
    if (!ids.Contains(id)) // only add ids we haven't seen yet
    {
        ids.Add(id);
    }
}
// Then convert it into a string.
// You could use string.Join(",", ids.ToArray()) here instead.
StringBuilder builder = new StringBuilder();
foreach (string id in ids)
{
    builder.Append(id);
    builder.Append(",");
}
if (builder.Length > 0)
{
    builder.Length--; // Chop off the trailing comma
}
return builder.ToString();
If you could have a large collection of strings, you might use Dictionary<string, string> as a sort of fake HashSet<string>.
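Where HashSet<string> is available (.NET 3.5+), a sketch of that idea, checking for duplicates in one pass while building the string (on older frameworks, a Dictionary<string, string> can stand in as described above):

HashSet<string> seen = new HashSet<string>();
StringBuilder strUIDList = new StringBuilder(100);
foreach (var account in RequestContext.accounts)
{
    if (seen.Add(account.uniqueid)) // Add returns false for duplicates
    {
        if (strUIDList.Length > 0)
        {
            strUIDList.Append(",");
        }
        strUIDList.Append(account.uniqueid);
    }
}
return strUIDList.ToString();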

How can you change a ";" separated string to some kind of dictionary?

I have a string like this:
"user=u123;name=Test;lastname=User"
I want to get a dictionary for this string like this:
user "u123"
name "Test"
lastname "User"
This way I can easily access the data within the string.
I want to do this in C#.
EDIT:
This is what I have so far:
public static Dictionary<string, string> ValueToDictionary(string value)
{
    Dictionary<string, string> result = new Dictionary<string, string>();
    string[] values = value.Split(';');
    foreach (string val in values)
    {
        string[] valueParts = val.Split('=');
        result.Add(valueParts[0], valueParts[1]);
    }
    return result;
}
But to be honest I really think there is a better way to do this.
Cheers,
M.
You can use LINQ:
var text = "user=u123;name=Test;lastname=User";
var dictionary = (from t in text.Split(";".ToCharArray())
                  let pair = t.Split("=".ToCharArray(), 2)
                  select pair).ToDictionary(p => p[0], p => p[1]);
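The resulting dictionary can then be read by key, which is exactly the access pattern the question asks for:

Console.WriteLine(dictionary["user"]);     // u123
Console.WriteLine(dictionary["lastname"]); // User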
Split the string by ";".
Iterate over every element in the resulting array and split every element by "=".
Now:
dictionary.Add(element[0], element[1]);
I hope I made it clear enough.
Dictionary<string, string> d = new Dictionary<string, string>();
string s1 = "user=u123;name=Test;lastname=User";
foreach (string s2 in s1.Split(';'))
{
    string[] split = s2.Split('=');
    d.Add(split[0], split[1]);
}
var dictionary = new Dictionary<string, string>();
var linedValue = "user=u123;name=Test;lastname=User";
var kvps = linedValue.Split(new[] { ';' }); // you may use StringSplitOptions.RemoveEmptyEntries
foreach (var kvp in kvps)
{
    var kvpSplit = kvp.Split(new[] { '=' });
    var key = kvpSplit.ElementAtOrDefault(0);
    var value = kvpSplit.ElementAtOrDefault(1);
    dictionary.Add(key, value);
    // you may check with .ContainsKey whether the key is already present
    // you may check key and value with string.IsNullOrEmpty
}
If you know for sure that there are no separator chars in your input data, the following works:
string input = "user=u123;name=Test;lastname=User";
string[] fragments = input.Split(";=".ToArray());
Dictionary<string, string> result = new Dictionary<string, string>();
for (int i = 0; i < fragments.Length - 1; i += 2)
    result.Add(fragments[i], fragments[i + 1]);
It might perform slightly better than some of the other solutions, since it only calls Split() once. Usually I would go for any of the other solutions here, especially if readability of the code is of any value to you.
I think I would do it like this...
string s = "user=u123;name=Test;lastname=User";
Dictionary<string, string> dict = s.ToDictionary();
The implementation of ToDictionary is the same as yours, except that I would implement it as an extension method. It does look more natural.
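A sketch of that extension method (the parsing is the same as the question's ValueToDictionary, just hung off string):

public static class StringExtensions
{
    public static Dictionary<string, string> ToDictionary(this string value)
    {
        var result = new Dictionary<string, string>();
        foreach (string pair in value.Split(';'))
        {
            string[] parts = pair.Split('=');
            result.Add(parts[0], parts[1]);
        }
        return result;
    }
}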
