How to split string into a dictionary

I have this string
string sx="(colorIndex=3)(font.family=Helvetica)(font.bold=1)";
and am splitting it with
string[] ss = sx.Split(new char[] { '(', ')' },
    StringSplitOptions.RemoveEmptyEntries);
Instead of that, how could I split the result into a Dictionary<string,string>? The
resulting dictionary should look like:
Key Value
colorIndex 3
font.family Helvetica
font.bold 1

This can be done with the LINQ ToDictionary() extension method:
string s1 = "(colorIndex=3)(font.family=Helvetica)(font.bold=1)";
string[] t = s1.Split(new[] { '(', ')' }, StringSplitOptions.RemoveEmptyEntries);
Dictionary<string, string> dictionary =
t.ToDictionary(s => s.Split('=')[0], s => s.Split('=')[1]);
EDIT: The same result can be achieved without splitting twice:
Dictionary<string, string> dictionary =
t.Select(item => item.Split('=')).ToDictionary(s => s[0], s => s[1]);
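Both versions above assume that every segment contains exactly one '=' and that no key repeats: ToDictionary throws an ArgumentException on a duplicate key, and a value that itself contains '=' would be cut short. A minimal sketch of a more defensive variant follows. The names PairParser and Parse are hypothetical, and the choices to let a later duplicate overwrite an earlier one and to skip segments without '=' are assumptions, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;

public static class PairParser
{
    // Parses "(key=value)(key=value)..." into a dictionary.
    // Assumptions: a repeated key keeps its last value,
    // and segments without '=' are silently skipped.
    public static Dictionary<string, string> Parse(string input)
    {
        var result = new Dictionary<string, string>();
        string[] segments = input.Split(new[] { '(', ')' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string segment in segments)
        {
            // Limit the split to two parts so the value may contain '='.
            string[] parts = segment.Split(new[] { '=' }, 2);
            if (parts.Length != 2)
                continue;
            // The indexer overwrites an existing key instead of throwing.
            result[parts[0]] = parts[1];
        }
        return result;
    }
}
```

Calling PairParser.Parse(sx) on the string from the question yields the same three entries as the LINQ versions.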

There may be more efficient ways, but this should work:
string sx = "(colorIndex=3)(font.family=Helvetica)(font.bold=1)";
var items = sx.Split(new[] { '(', ')' }, StringSplitOptions.RemoveEmptyEntries)
.Select(s => s.Split(new[] { '=' }));
Dictionary<string, string> dict = new Dictionary<string, string>();
foreach (var item in items)
{
dict.Add(item[0], item[1]);
}

Randal Schwartz has a rule of thumb: use split when you know what you want to throw away or regular expressions when you know what you want to keep.
You know what you want to keep:
string sx="(colorIndex=3)(font.family=Helvetica)(font.bold=1)";
Regex pattern = new Regex(@"\((?<name>.+?)=(?<value>.+?)\)");
var d = new Dictionary<string,string>();
foreach (Match m in pattern.Matches(sx))
d.Add(m.Groups["name"].Value, m.Groups["value"].Value);
With a little effort, you can do it with ToDictionary:
var d = Enumerable.ToDictionary(
Enumerable.Cast<Match>(pattern.Matches(sx)),
m => m.Groups["name"].Value,
m => m.Groups["value"].Value);
Not sure whether this looks nicer:
var d = Enumerable.Cast<Match>(pattern.Matches(sx)).
ToDictionary(m => m.Groups["name"].Value,
m => m.Groups["value"].Value);

string sx = "(colorIndex=3)(font.family=Helvetica)(font.bold=1)";
var dict = sx.Split(new[] { '(', ')' }, StringSplitOptions.RemoveEmptyEntries)
.Select(x => x.Split('='))
.ToDictionary(x => x[0], y => y[1]);

var dict = (from x in s1.Split(new[] { '(', ')' }, StringSplitOptions.RemoveEmptyEntries)
            select x.Split('=')).ToDictionary(x => x[0], x => x[1]);

This is often used for splitting HTTP query strings.
Usage: Dictionary<string, string> dict = stringToDictionary("userid=abc&password=xyz&retain=false");
public static Dictionary<string, string> stringToDictionary(string line, char stringSplit = '&', char keyValueSplit = '=')
{
    return line.Split(new[] { stringSplit }, StringSplitOptions.RemoveEmptyEntries)
        .Select(s => s.Split(new[] { keyValueSplit }))
        .ToDictionary(x => x[0], y => y[1]);
}

You can try
string sx = "(colorIndex=3)(font.family=Helvetica)(font.bold=1)";
var keyValuePairs = sx.Split(new[] { '(', ')' }, StringSplitOptions.RemoveEmptyEntries)
.Select(v => v.Split('='))
.ToDictionary(v => v.First(), v => v.Last());

You could do this with regular expressions:
string sx = "(colorIndex=3)(font.family=Helvetica)(font.bold=1)";
Dictionary<string,string> dic = new Dictionary<string,string>();
Regex re = new Regex(@"\(([^=]+)=([^=]+)\)");
foreach(Match m in re.Matches(sx))
{
dic.Add(m.Groups[1].Value, m.Groups[2].Value);
}
// extract values, to prove correctness of function
foreach(var s in dic)
Console.WriteLine("{0}={1}", s.Key, s.Value);

I am just putting this here for reference...
For ASP.net, if you want to parse a string from the client side into a dictionary this is handy...
Create a JSON string on the client side either like this:
var args = "{'A':'1','B':'2','C':'" + varForC + "'}";
or like this:
var args = JSON.stringify({ 'A': 1, 'B': 2, 'C': varForC });
or even like this:
var obj = {};
obj.A = 1;
obj.B = 2;
obj.C = varForC;
var args = JSON.stringify(obj);
pass it to the server...
then parse it on the server side like this:
JavaScriptSerializer jss = new JavaScriptSerializer();
Dictionary<String, String> dict = jss.Deserialize<Dictionary<String, String>>(args);
JavaScriptSerializer lives in the System.Web.Script.Serialization namespace (System.Web.Extensions assembly).

Related

Splitting text and putting it into dictionary

I have a text of about 600 words, and I am supposed to remove all quotation marks, numbers (years, dates, ...), digits and so on. Only words should remain, and I have to put them into a dictionary.
So I tried going through the text with a foreach loop, taking the first letter of each word and saving the word in a list. I split every row into words,
e.g.:
You are pretty.
You
are
pretty
The problem is that some words in a row are still treated as the same although they shouldn't be. I've tried to fix it, but I couldn't find a solution.
public Dictionary<string, int> words = new Dictionary<string, int>();
public Dictionary<char, List<string>> firstletter = new Dictionary<char, List<string>>();
public Aufgabe(string filename)
{
string filler = "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ";
foreach (char f in filler)
{
firstletter[f] = new List<string>();
}
Load(filename);
}
public void Load(string filename)
{
List<string> w = new List<string>();
StreamReader r = new StreamReader(filename);
while (!r.EndOfStream)
{
string row = r.ReadLine();
string[] parts = row.Split(' ');
string[] sonderzeichen = new string[] { "#", ",", ".", ";", "'", "1", "2", "3", "4", "5", "6", "7", "8", "9", "0", "(", ")", "{",
"}", "!", "?", "/", "\"", "&", "+", "-", "–" };
string[] list = new string[parts.Length];
for (int i = 0; i < parts.Length; i++)
{
string a = parts[i];
foreach (string s in sonderzeichen)
{
if (s != "-")
{
a = a.Replace(s, string.Empty);
}
else
{
if (a.Length == 1)
{
a = string.Empty;
}
}
}
list[i] = a;
}
parts = list;
foreach (string a in parts)
{
if (words.ContainsKey(a))
{
words[a] += 1;
}
else
{
words.Add(a, 1);
}
string b = a.ToUpper();
if (b == "")
continue;
List<string> letter = firstletter[b[0]];
if (!letter.Contains(a))
{
letter.Add(a);
}
}
}
}

There are some things missing in the other answers:
No validation is done to check if the text is a word
Comparison should not be case-sensitive (i.e. spain, Spain and SPAIN should be considered the same word)
My solution:
StringComparer comparer = StringComparer.OrdinalIgnoreCase;
string text = "The 'rain' in spain falls mainly on the plain. 07 November 2018 20:02:07 - 20180520 I said the Plain in SPAIN. 12345";
var dictionary = Regex.Split(text, @"\W+")
.Where(IsValidWord)
.GroupBy(m => m, comparer)
.ToDictionary(m => m.Key, m => m.Count(), comparer);
Method IsValidWord:
// logic to validate word goes here
private static bool IsValidWord(string text)
{
double value;
bool isNumeric = double.TryParse(text, out value);
// add more validation rules here
return !isNumeric;
}
EDIT
I noticed in your code that you have a Dictionary with the words grouped by first letter. This can be achieved like this (using the previous dictionary):
var lettersDictionary = dictionary.Keys.GroupBy(x => x.Substring(0, 1),
(alphabet, subList) => new {
Alphabet = alphabet,
SubList = subList.OrderBy(x => x, comparer).ToList()
})
.ToDictionary(m => m.Alphabet, m => m.SubList, comparer);

You can just split with a regex, then use LINQ to create your dictionary:
var dictionary = Regex.Split(text, @"\W+")
.GroupBy(m => m, StringComparer.OrdinalIgnoreCase) // Case-insensitive
.ToDictionary(m => m.Key, m => m.Count());
UPDATE
Applied to your example code, your task class could become something like this to build both dictionaries (treating words case-insensitively):
public class Aufgabe
{
const string ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ";
public Dictionary<string, int> words;
public Dictionary<char, List<string>> firstletter;
public Aufgabe(string filename)
{
var text = File.ReadAllText(filename);
words = Regex.Split(text, @"\W+")
    .Where(m => m.Length > 0) // Regex.Split can yield empty strings at the edges
    .GroupBy(m => m, StringComparer.OrdinalIgnoreCase)
    .ToDictionary(m => m.Key, m => m.Count());
firstletter = ALPHABET.ToDictionary(a => a, // First-letter key
a => words.Keys.Where(m => a == char.ToUpper(m[0])).ToList()); // Words
}
}

Here is one way with Regex; note that case sensitivity has not been addressed:
var text = "The 'rain' in spain falls mainly on the plain. I said the plain in spain";
var result = new Dictionary<string,string>();
Regex.Matches(text, @"[^\s]+")
.OfType<Match>()
.Select(m => Regex.Replace(m.Value, @"\W", string.Empty))
.ToList()
.ForEach(word =>
{
if (!result.ContainsKey(word))
result.Add(word, word);
});
result

This is almost certainly a job for regular expressions. Splitting on \W+ (one or more non-word characters) breaks the input string into words, where a word is any run of letters, digits, or underscores. See the documentation.
string sentence = "You are pretty. State-of-the-art.";
string[] words = Regex.Split(sentence, @"\W+");
foreach (string word in words)
{
if (word != "")
{
Console.WriteLine(word);
}
}

String replace in C# but only the exact character set

I have the following string:
string x = "23;32;323;34;45";
and I want to replace 23 with X as below:
x = "X;32;323;34;45";
but when I try it, I get this instead:
x = "X;32;3X;34;45";
Is there a way I can get the expected output?

You will need a regular expression (regexp). The replacement rule here is
word boundary
23
word boundary
so your code would look like this
var result = Regex.Replace(input, @"\b23\b", "X");
An alternative approach would be to split your string, replace the matching elements, and join them into a new string:
var result = string.Join(";", input.Split(';').Select(v => v == "23" ? "X" : v));
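To make the difference concrete, the sketch below puts the plain string.Replace call next to the word-boundary version. The class and method names (ExactTokenReplace, Naive, WholeToken) are made up for illustration; only string.Replace and Regex.Replace come from the framework.

```csharp
using System.Text.RegularExpressions;

public static class ExactTokenReplace
{
    // string.Replace substitutes every occurrence of "23",
    // including the one inside "323".
    public static string Naive(string input)
    {
        return input.Replace("23", "X");
    }

    // \b matches only between a word character and a non-word character
    // (or the start/end of the string), so "323" is left untouched.
    public static string WholeToken(string input)
    {
        return Regex.Replace(input, @"\b23\b", "X");
    }
}
```

For the input "23;32;323;34;45", Naive returns "X;32;3X;34;45" while WholeToken returns "X;32;323;34;45".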
Update: Update value in Dictionary
Assuming you know the key, that's easy:
myDict["thekey"] = Regex.Replace(myDict["thekey"], @"\b23\b", "X");
If you want to do this replacement for all items, I'd do it like this, though I'm not sure it is the best possible solution:
[Fact]
public void Replace_value_in_dict()
{
// given
var mydict = new Dictionary<string, string>
{
{ "key1", "donothing" },
{ "key2", "23;32;323;34;45" },
};
// when
var result = mydict
.Select(kv => (kv.Key, Regex.Replace(kv.Value, @"\b23\b", "X")))
.ToDictionary(x => x.Item1, x => x.Item2);
// then
Assert.Equal(result, new Dictionary<string, string>
{
{ "key1", "donothing" },
{ "key2", "X;32;323;34;45" },
});
}

You should use a regex:
var x="23;32;323;34;45";
var res = Regex.Replace(x, @"\b23\b", "x");
Console.WriteLine(res);

How to find the duplicates in the given string in c#

I want to find the duplicates in a given string. I tried it for collections and it works fine, but I don't know how to do it for a string.
Here is the code I tried for collections:
string name = "this is a a program program";
string[] arr = name.Split(' ');
var myList = new List<string>();
var duplicates = new List<string>();
foreach(string res in arr)
{
if (!myList.Contains(res))
{
myList.Add(res);
}
else
{
duplicates.Add(res);
}
}
foreach(string result in duplicates)
{
Console.WriteLine(result);
}
Console.ReadLine();
But I want to find the duplicates in the string below and store them in an array. How can I do that?
e.g. string aa = "elements";
In the above string I want to find the duplicate characters and store them in an array.
Can anyone help me?

Linq solution:
string name = "this is a a program program";
String[] result = name.Split(' ')
.GroupBy(word => word)
.Where(chunk => chunk.Count() > 1)
.Select(chunk => chunk.Key)
.ToArray();
Console.Write(String.Join(Environment.NewLine, result));
The same principle works for duplicate characters within a string:
String source = "elements";
Char[] result = source
.GroupBy(c => c)
.Where(chunk => chunk.Count() > 1)
.Select(chunk => chunk.Key)
.ToArray();
// result = ['e']
Console.Write(String.Join(Environment.NewLine, result));

string name = "elements";
var myList = new List<char>();
var duplicates = new List<char>();
foreach (char res in name)
{
if (!myList.Contains(res))
{
myList.Add(res);
}
else if (!duplicates.Contains(res))
{
duplicates.Add(res);
}
}
foreach (char result in duplicates)
{
Console.WriteLine(result);
}
Console.ReadLine();

A string is a sequence of chars, so you can use your collection approach.
However, I would recommend a typed HashSet. Just load it with the string and you get the chars without duplicates (in practice in their original order, although HashSet<T> does not guarantee any ordering).
Take a look:
string s = "aaabbcdaaee";
HashSet<char> hash = new HashSet<char>(s);
HashSet<char> hashDup = new HashSet<char>();
foreach (var c in s)
if (hash.Contains(c))
hash.Remove(c);
else
hashDup.Add(c);
foreach (var x in hashDup)
Console.WriteLine(x);
Console.ReadKey();

Instead of a List<> I'd use a HashSet<> because it doesn't allow duplicates, and Add returns false in that case. It's more efficient. I'd also use a Dictionary<TKey,TValue> instead of the list to track the count of each char:
string text = "elements";
var duplicates = new HashSet<char>();
var duplicateCounts = new Dictionary<char, int>();
foreach (char c in text)
{
int charCount = 0;
bool isDuplicate = duplicateCounts.TryGetValue(c, out charCount);
duplicateCounts[c] = ++charCount;
if (isDuplicate)
duplicates.Add(c);
}
Now you have all unique duplicate chars in the HashSet and the count of each unique char in the dictionary. In this example the set only contains e because it appears three times in the string.
So you could output it in the following way:
foreach(char dup in duplicates)
Console.WriteLine("Duplicate char {0} appears {1} times in the text."
, dup
, duplicateCounts[dup]);
For what it's worth, here's a LINQ one-liner which also creates a Dictionary that only contains the duplicate chars and their count:
Dictionary<char, int> duplicateCounts = text
.GroupBy(c => c)
.Where(g => g.Count() > 1)
.ToDictionary(g => g.Key, g => g.Count());
I've shown it as second approach because you should first understand the standard way.

string name = "this is a a program program";
var arr = name.Split(' ').ToArray();
var dup = arr.Where(p => arr.Count(q => q == p) > 1).Select(p => p);
HashSet<string> hash = new HashSet<string>(dup);
string duplicate = string.Join(" ", hash);

You can do this through LINQ:
string name = "this is a a program program";
var d = name.Split(' ').GroupBy(x => x).Select(y => new { word = y.Key, Wordcount = y.Count() }).Where(z => z.Wordcount > 1).ToList();

Use LINQ to group values:
public static IEnumerable<T> GetDuplicates<T>(this IEnumerable<T> list)
{
return list.GroupBy(item => item).SelectMany(group => group.Skip(1));
}
public static bool HasDuplicates<T>(this IEnumerable<T> list)
{
return list.GetDuplicates().Any();
}
Then you use these extensions like this:
var list = new List<string> { "a", "b", "b", "c" };
var duplicatedValues = list.GetDuplicates();
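Note that GetDuplicates as written yields one element per extra occurrence, so a value that appears three times is returned twice. If each duplicated value should be reported only once, a variant along these lines might be used. GetDistinctDuplicates and DuplicateExtensions are hypothetical names, not part of LINQ.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DuplicateExtensions
{
    // Hypothetical variant: returns each duplicated value exactly once,
    // in the order in which the values first appear in the source.
    public static IEnumerable<T> GetDistinctDuplicates<T>(this IEnumerable<T> source)
    {
        return source.GroupBy(item => item)
                     .Where(group => group.Skip(1).Any()) // at least two members
                     .Select(group => group.Key);
    }
}
```

Because a string is an IEnumerable<char>, the same helper also covers the character case: "elements".GetDistinctDuplicates() produces the single char 'e'.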

Split string to Dictionary<string, int>

I have a string like this: "content;123 contents;456 contentss;789 " etc.
I would like to split this string to get a Dictionary, but I don't know how to do it. I tried to split the string, but I only got a List.
The content (before the semicolon) is always a unique string.
After the semicolon there is always a number, up to the next space.
The number is always an int (no floats needed).
Could someone help me please?

You can use the following LINQ expression:
"content;123 contents;456 contentss;789"
.Split(' ')
.Select(x => x.Split(';'))
.ToDictionary(x => x[0], x => int.Parse(x[1]));
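The string in the question ends with a space, so a plain Split(' ') produces an empty last entry and x[1] then throws an IndexOutOfRangeException. The sketch below is one way to tolerate that, assuming malformed pairs should simply be skipped and that a repeated name keeps its last value; ContentParser and Parse are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ContentParser
{
    // Parses "name;number name;number ..." into a dictionary.
    // Assumption: pairs that are not exactly "text;integer" are skipped.
    public static Dictionary<string, int> Parse(string input)
    {
        var result = new Dictionary<string, int>();
        // RemoveEmptyEntries drops the empty piece caused by a trailing space.
        string[] pairs = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string pair in pairs)
        {
            string[] parts = pair.Split(';');
            int number;
            // TryParse avoids a FormatException on non-numeric text.
            if (parts.Length == 2 && int.TryParse(parts[1], out number))
                result[parts[0]] = number;
        }
        return result;
    }
}
```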

string input = "content1;123 content2;456 content3;789";
var dict = Regex.Matches(input, @"(\S+?);(\d+)").Cast<Match>()
.ToDictionary(m => m.Groups[1].Value, m => int.Parse(m.Groups[2].Value));

You can do something like this:
string value = "content;123 contents;456 contentss;789";
Dictionary<string, int> data = new Dictionary<string,int>();
foreach(string line in value.Split(' '))
{
string[] values = line.Split(';');
if (!data.ContainsKey(values[0]))
{
data.Add(values[0], Convert.ToInt32(values[1]));
}
}

// number1, number2, number3 are placeholders for integer text such as 123
var myList = "content1;number1 content2;number2 content3;number3";
var myDictionary = myList.Split(' ')
    .Select(pair => pair.Split(';'))
    .ToDictionary(splitPair => splitPair[0], splitPair => int.Parse(splitPair[1]));

static void Main(string[] args)
{
string content = "content;123 contents;456 contentss;789";
Dictionary<string, int> result = new Dictionary<string, int>();
content.Split(' ').ToList().ForEach(x =>
{
var items = x.Split(';');
result.Add(items[0], int.Parse(items[1]));
});
foreach(var item in result)
{
Console.WriteLine("{0} -> {1}" , item.Key, item.Value);
}
}

create a dictionary using 2 lists using LINQ

I am trying to create a dictionary from 2 lists where one list contains keys and one list contains values. I can do it using a for loop, but I am trying to find out whether there is a way of doing it using LINQ.
Sample code would be helpful. Thanks!

In .NET4 you could use the built-in Zip method to merge the two sequences, followed by a ToDictionary call:
var keys = new List<int> { 1, 2, 3 };
var values = new List<string> { "one", "two", "three" };
var dictionary = keys.Zip(values, (k, v) => new { Key = k, Value = v })
.ToDictionary(x => x.Key, x => x.Value);
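One thing to keep in mind is that Enumerable.Zip stops at the end of the shorter sequence, so a missing value goes unnoticed. A small sketch that checks the lengths first is shown below; ListPairing and Combine are hypothetical names, and throwing on a length mismatch is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListPairing
{
    // Builds a dictionary from parallel key and value lists.
    // Assumption: a length mismatch is an error rather than something to truncate.
    public static Dictionary<TKey, TValue> Combine<TKey, TValue>(IList<TKey> keys, IList<TValue> values)
    {
        if (keys.Count != values.Count)
            throw new ArgumentException("keys and values must have the same length");

        // Zip pairs the elements by position; ToDictionary still throws on duplicate keys.
        return keys.Zip(values, (k, v) => new KeyValuePair<TKey, TValue>(k, v))
                   .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```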

List<string> keys = new List<string>();
List<string> values = new List<string>();
Dictionary<string, string> dict = keys.ToDictionary(x => x, x => values[keys.IndexOf(x)]);
This of course assumes that the length of each list is the same and that the keys are unique.
UPDATE: This answer is far more efficient and should be used for lists of non-trivial size.

You can include the index in a Select expression to make this efficient:
var a = new List<string>() { "A", "B", "C" };
var b = new List<string>() { "1", "2", "3" };
var c = a.Select((x, i) => new {key = x, value = b[i]}).ToDictionary(e => e.key, e => e.value );
foreach (var d in c)
Console.WriteLine(d.Key + " = " + d.Value);
Console.ReadKey();

var dic = keys.Zip(values, (k, v) => new { k, v })
.ToDictionary(x => x.k, x => x.v);

You can use this code; it works as expected.
C# Code:
var keys = new List<string> { "Kalu", "Kishan", "Gourav" };
var values = new List<string> { "Singh", "Paneri", "Jain" };
Dictionary<string, string> dictionary = new Dictionary<string, string>();
for (int i = 0; i < keys.Count; i++)
{
dictionary.Add(keys[i], values[i]);
}
foreach (var data in dictionary)
{
Console.WriteLine("{0} {1}", data.Key, data.Value);
}
Console.ReadLine();