Final Edit: I was able to locate the duplicate field in the ini file. Thanks for your help everyone!
I'm using a regular expression to parse an ini file and LINQ to store it in a Dictionary:
Sample Data:
[WindowSettings]
Window X Pos='0'
Window Y Pos='0'
Window Maximized='false'
Window Name='Jabberwocky'
[Logging]
Directory='C:\Rosetta Stone\Logs'
EDIT: Here is the file actually causing the problem: http://pastebin.com/mQSrkrcP
EDIT2: I've narrowed it down to being caused by the last section in the file: [list_first_nonprintable]
For some reason one of the files that I'm parsing with this is throwing this exception:
System.ArgumentException: An item with the same key has already been added.
Is there any way for me to either find out which key is causing the problem (so I can fix the file), or to just skip the key that's causing this and continue parsing?
Here is the code:
try
{
// Read content of ini file.
string data = System.IO.File.ReadAllText(project);
// Create regular expression to parse ini file.
string pattern = @"^((?:\[)(?<Section>[^\]]*)(?:\])(?:[\r\n]{0,}|\Z))((?!\[)(?<Key>[^=]*?)(?:=)(?<Value>[^\r\n]*)(?:[\r\n]{0,4}))*";
//pattern = @"
//^ # Beginning of the line
//((?:\[) # Section Start
// (?<Section>[^\]]*) # Actual Section text into Section Group
// (?:\]) # Section End then EOL/EOB
// (?:[\r\n]{0,}|\Z)) # Match but don't capture the CRLF or EOB
// ( # Begin capture groups (Key Value Pairs)
// (?!\[) # Stop capture groups if a [ is found; new section
// (?<Key>[^=]*?) # Any text before the =, matched few as possible
// (?:=) # Get the = now
// (?<Value>[^\r\n]*) # Get everything that is not an Line Changes
// (?:[\r\n]{0,4}) # MBDC \r\n
// )* # End Capture groups";
// Parse each file into a Dictionary.
Dictionary<string, Dictionary<string, string>> iniFile
= (from Match m in Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline)
select new
{
Section = m.Groups["Section"].Value,
kvps = (from cpKey in m.Groups["Key"].Captures.Cast<Capture>().Select((a, i) => new { a.Value, i })
join cpValue in m.Groups["Value"].Captures.Cast<Capture>().Select((b, i) => new { b.Value, i }) on cpKey.i equals cpValue.i
select new KeyValuePair<string, string>(cpKey.Value, cpValue.Value)).ToDictionary(kvp => kvp.Key, kvp => kvp.Value)
}).ToDictionary(itm => itm.Section, itm => itm.kvps);
return iniFile;
}
catch (ArgumentException ex)
{
System.Diagnostics.Debug.Write(ex.ToString());
return new Dictionary<string, Dictionary<string, string>>();
}
Thanks in advance.
This just means that when you convert to a Dictionary --
.ToDictionary(itm => itm.Section, itm => itm.kvps);
-- there are duplicate keys (itm.Section). You can use ToLookup instead, which is similar to a dictionary but allows multiple values per key.
Edit
There are a couple of ways to call ToLookup. The simplest is to specify the key selector:
var lookup =
// ...
.ToLookup(itm => itm.Section);
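This should provide a lookup keyed by the section name (a string, since Section is taken from m.Groups["Section"].Value). Getting a lookup value should then return an IEnumerable<T>, where T is the anonymous type:
// "WindowSettings" is one of the section names from the sample data
var lookupvalues = lookup["WindowSettings"];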
If the .NET compiler doesn't like this (sometimes it seems to have trouble figuring out what the various types should be), you can also specify an element selector, for example:
ILookup<string, Dictionary<string, string>> lookup =
// ...
.ToLookup(
itm => itm.Section, // key selector (already a string)
itm => itm.kvps // element selector (the per-section dictionary)
);
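To find out which key is duplicated (the first half of the question), one option is to group the parsed pairs by key and report every group with more than one member. The following is a minimal sketch, not the asker's parser: the `pairs` list is hypothetical sample data standing in for the key/value captures of one ini section.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: pairs parsed from one ini section,
// with "Window X Pos" deliberately appearing twice.
var pairs = new List<KeyValuePair<string, string>>
{
    new KeyValuePair<string, string>("Window X Pos", "'0'"),
    new KeyValuePair<string, string>("Window Y Pos", "'0'"),
    new KeyValuePair<string, string>("Window X Pos", "'10'"),
};

// Group by key and keep only the keys that occur more than once.
List<string> duplicateKeys = pairs
    .GroupBy(kvp => kvp.Key)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key)
    .ToList();

foreach (string key in duplicateKeys)
{
    Console.WriteLine("Duplicate key: " + key);
}
```

The same grouping can be applied to the section names before the outer ToDictionary call, since either level can be the source of the exception.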
You can write your own ToDictionary method that doesn't break with duplicate keys easily enough.
public static Dictionary<K,V> ToDictionary<TSource, K, V>(
this IEnumerable<TSource> source,
Func<TSource, K> keySelector,
Func<TSource, V> valueSelector)
{
//TODO validate inputs for null arguments.
Dictionary<K,V> output = new Dictionary<K,V>();
foreach(TSource item in source)
{
//overwrites previous values
output[keySelector(item)] = valueSelector(item);
//ignores future duplicates, comment above and
//uncomment below to change behavior
//K key = keySelector(item);
//if(!output.ContainsKey(key))
//{
//output.Add(key, valueSelector(item));
//}
}
return output;
}
I assume that you can figure out how to implement the additional overloads (without the value selector).
You can use Tuple to pass multiple keys. Check sample code below:
.ToDictionary(k => new Tuple<string,string>(k.key1,k.key2), v => v.value)
Related
Dictionary1 maps entities to a string, let's say "Def3".
So it looks like:
Ent1, Def3
Ent3, Def3
Dictionary2 maps all entities to another string, which is not important.
Ent1, Unimportant
Ent2, Unimportant
Ent3, Unimportant
I know a default string Def2 I'd like to put into Dictionary1 for every Entity in Dictionary2 that doesn't exist in Dictionary1.
How can I update Dictionary1 such that it looks like:
Ent1, Def3
Ent2, Def2
Ent3, Def3
Fyi: These are short examples for much larger dictionaries, so simple case-by-case insertion wouldn't work here.
Update: Ok, let me clarify. If Dictionary2 Has a Key that is not a key in Dictionary1, add Dictionary2's Key with a string ("Def2").
foreach(var key in Dictionary2.Keys.Where(k => !Dictionary1.Keys.Contains(k)))
{
Dictionary1.Add(key, defaultstring);
}
You can use LINQ for this job:
var Dictionary3 = Dictionary2.ToDictionary(x => x.Key,
x => Dictionary1.ContainsKey(x.Key) ? Dictionary1[x.Key] : "Def2");
So you want all keys of Dictionary2 that are not keys in Dictionary1 to be added to Dictionary1 with the value "Def2".
In small steps:
var allKeysInDict1 = Dictionary1.Keys;
var allKeysInDict2 = Dictionary2.Keys;
var missingKeysInDict1 = allKeysInDict2.Except(allKeysInDict1);
foreach (var missingKey in missingKeysInDict1)
{
Dictionary1.Add(missingKey, "Def2");
}
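The steps above can be put together as a self-contained sketch using the sample data from the question. The variable names and the `ToList()` call are my additions: materializing the missing keys first means `dictionary1` is never modified while a lazy query over its keys is still pending.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var dictionary1 = new Dictionary<string, string>
{
    { "Ent1", "Def3" },
    { "Ent3", "Def3" },
};
var dictionary2 = new Dictionary<string, string>
{
    { "Ent1", "Unimportant" },
    { "Ent2", "Unimportant" },
    { "Ent3", "Unimportant" },
};

// Keys present in dictionary2 but absent from dictionary1,
// copied into a list before dictionary1 is changed.
List<string> missingKeys = dictionary2.Keys.Except(dictionary1.Keys).ToList();

foreach (string key in missingKeys)
{
    dictionary1.Add(key, "Def2");
}

foreach (var pair in dictionary1.OrderBy(p => p.Key))
{
    Console.WriteLine(pair.Key + ", " + pair.Value);
}
```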
If we have a list of strings like in the following code:
List<string> XAll = new List<string>();
XAll.Add("#10#20");
XAll.Add("#20#30#40");
string S = "#30#20"; // same parts as "#20#30", which are all found in "#20#30#40", so S should count as existing in the list
// Check the unordered string S = "#30#20":
// if its parts appear in any entry, in any order (#30#20 or #20#30 ...), return true.
if (XAll.Contains(S))
{
Console.WriteLine("Your String is exist");
}
I would prefer to use LINQ to check whether S exists in this sense: regardless of order, at least one entry in XAll must contain both (#30) and (#20) together.
I am using
var c = item2.Intersect(item1);
if (c.Count() == item1.Length)
{
return true;
}
You should represent your data in a more meaningful way. Don't rely on strings.
For example I would suggest creating a type to represent a set of these numbers and write some code to populate it.
But there are already set types such as HashSet which is possibly a good match with built in functions for testing for sub sets.
This should get you started:
var input = "#20#30#40";
var hashSetOfNumbers = new HashSet<int>(input
.Split(new []{'#'}, StringSplitOptions.RemoveEmptyEntries)
.Select(s=>int.Parse(s)));
This works for me:
Func<string, string[]> split =
x => x.Split(new [] { '#' }, StringSplitOptions.RemoveEmptyEntries);
if (XAll.Any(x => split(x).Intersect(split(S)).Count() == split(S).Count()))
{
Console.WriteLine("Your String is exist");
}
Now, depending on how you want to handle duplicates, this might even be a better solution:
Func<string, HashSet<string>> split =
x => new HashSet<string>(x.Split(
new [] { '#' },
StringSplitOptions.RemoveEmptyEntries));
if (XAll.Any(x => split(S).IsSubsetOf(split(x))))
{
Console.WriteLine("Your String is exist");
}
This second approach uses pure set theory so it strips duplicates.
I am trying to replace occurrences of a property name with a value in a Dictionary in C#.
I have the following Dictionary:
Dictionary<string, string> properties = new Dictionary<string, string>()
{
{ "property1", @"E:\" },
{ "property2", @"$(property1)\Temp"},
{ "property3", @"$(property2)\AnotherSubFolder"}
};
Where the key is the property name, and the value is just a string value. I basically want to iterate over the values until all set properties have been replaced. The syntax is similar to MSBuild property names.
This should eventually evaluate property 3 to E:\Temp\AnotherSubFolder.
It would help if the RegEx part of the functionality worked, which is where I am stuck.
I had tried out editing my RegEx on REFiddle here.
The following regex pattern works here:
/\$\(([^)]+)\)/g
Given the text:
$(property2)\AnotherSubFolder
It highlights the $(property2).
However, putting this together in .NET fiddle, I don't get any matches with the following code:
var pattern = @"\$\(([^)]+)\)/g";
Console.WriteLine(Regex.Matches(@"$(property2)AnotherSubFolder", pattern).Count);
Which outputs 0.
I am not too sure why here. Why is my match returning zero results?
.NET should match globally by default.
I'm not aware of any support for /g in .NET, as that is a Perl-ism. Remove it, along with any leading /, because .NET is trying to match those characters literally.
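As a quick check, here is a minimal sketch of the question's pattern with the trailing /g removed. Regex.Matches already returns every match in the input, so no global flag is needed.

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as in the question, minus the trailing "/g".
string pattern = @"\$\(([^)]+)\)";
MatchCollection matches = Regex.Matches(@"$(property2)\AnotherSubFolder", pattern);

Console.WriteLine(matches.Count); // prints 1

foreach (Match m in matches)
{
    // Groups[1] holds the property name inside $( ... ).
    Console.WriteLine(m.Groups[1].Value); // prints property2
}
```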
Regular Expressions may be overkill here, and may even introduce issues if your properties or values contain special characters, or characters that will be evaluated as regular expressions themselves.
A simple replacement should work:
Dictionary<string, string> properties = new Dictionary<string, string>()
{
{ "property1", @"E:\" },
{ "property2", @"$(property1)\Temp"},
{ "property3", @"$(property2)\AnotherSubFolder"}
};
Dictionary<string, string> newproperties = new Dictionary<string, string>();
// Iterate key value pairs in properties dictionary, evaluate values
foreach ( KeyValuePair<string,string> kvp in properties ) {
string value = kvp.Value;
// Execute replacements on value until no replacements are found
// (Replacement of $(property2) will result in value containing $(property1), must be evaluated again)
bool complete = false;
while (!complete) {
complete = true;
// Look for each replacement token in dictionary value, execute replacement if found
foreach ( string key in properties.Keys ) {
string token = "$(" + key + ")";
if ( value.Contains( token ) ) {
value = value.Replace( "$(" + key + ")", properties[key] );
complete = false;
}
}
}
newproperties[kvp.Key] = value;
}
properties = newproperties;
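If you would still rather use a regular expression, Regex.Replace with a MatchEvaluator can look each token up in the dictionary. This is a sketch under stated assumptions, not a definitive implementation: `Expand` and `maxPasses` are hypothetical names, the pass limit is only a guard against circular references, and property1 is written here as `E:` (no trailing backslash) so that the result matches the path the question expects.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var properties = new Dictionary<string, string>
{
    { "property1", @"E:" }, // assumption: no trailing backslash
    { "property2", @"$(property1)\Temp" },
    { "property3", @"$(property2)\AnotherSubFolder" },
};

// Matches $(name) and captures the name in group 1.
var token = new Regex(@"\$\(([^)]+)\)");

// Replaces tokens repeatedly until the value stops changing.
// maxPasses is a hypothetical guard against circular references.
string Expand(string value, int maxPasses = 10)
{
    for (int pass = 0; pass < maxPasses; pass++)
    {
        string next = token.Replace(value, m =>
            properties.TryGetValue(m.Groups[1].Value, out string replacement)
                ? replacement
                : m.Value); // leave unknown tokens untouched
        if (next == value)
        {
            break;
        }
        value = next;
    }
    return value;
}

Console.WriteLine(Expand(properties["property3"])); // prints E:\Temp\AnotherSubFolder
```

Because the evaluator returns the replacement text directly, special characters in the property values are inserted literally rather than being interpreted as replacement patterns.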
I have a lot of if, else if statements and I know there has to be a better way to do this, but even after searching Stack Overflow I'm unsure of how to do so in my particular case.
I am parsing text files (bills) and assigning the name of the service provider to a variable (txtvar.Provider) based on whether certain strings appear on the bill.
This is a small sample of what I'm doing (don't laugh, I know it's messy). All in all, there are approximately 300 if, else if's.
if (txtvar.BillText.IndexOf("SWGAS.COM") > -1)
{
txtvar.Provider = "Southwest Gas";
}
else if (txtvar.BillText.IndexOf("georgiapower.com") > -1)
{
txtvar.Provider = "Georgia Power";
}
else if (txtvar.BillText.IndexOf("City of Austin") > -1)
{
txtvar.Provider = "City of Austin";
}
// And so forth for many different strings
I would like to use something like a switch statement to be more efficient and readable but I'm unsure of how I would compare the BillText. I'm looking for something like this but can't figure out how to make it work.
switch (txtvar.BillText)
{
case txtvar.BillText.IndexOf("Southwest Gas") > -1:
txtvar.Provider = "Southwest Gas";
break;
case txtvar.BillText.IndexOf("TexasGas.com") > -1:
txtvar.Provider = "Texas Gas";
break;
case txtvar.BillText.IndexOf("Southern") > -1:
txtvar.Provider = "Southern Power & Gas";
break;
}
I'm definitely open to ideas.
I would need the ability to determine the order in which the values were evaluated.
As you can imagine, when parsing for hundreds of slightly different layouts I occasionally run into the issue of not having a distinctly unique indicator as to what service provider the bill belongs to.
Why not use everything C# has to offer? The following use of anonymous types, collection initializers, implicitly typed variables, and lambda-syntax LINQ is compact, intuitive, and maintains your modified requirement that patterns be evaluated in order:
var providerMap = new[] {
new { Pattern = "SWGAS.COM" , Name = "Southwest Gas" },
new { Pattern = "georgiapower.com", Name = "Georgia Power" },
// More specific first
new { Pattern = "City of Austin" , Name = "City of Austin" },
// Then more general
new { Pattern = "Austin" , Name = "Austin Electric Company" },
// And for everything else:
new { Pattern = String.Empty , Name = "Unknown" }
};
txtvar.Provider = providerMap.First(p => txtvar.BillText.IndexOf(p.Pattern) > -1).Name;
More likely, the pairs of patterns would come from a configurable source, such as:
var providerMap =
System.IO.File.ReadLines(@"C:\some\folder\providers.psv")
.Select(line => line.Split('|'))
.Select(parts => new { Pattern = parts[0], Name = parts[1] }).ToList();
Finally, as @millimoose points out, anonymous types are less useful when passed between methods. In that case we can define a trivial Provider class and use object initializers for nearly identical syntax:
class Provider {
public string Pattern { get; set; }
public string Name { get; set; }
}
var providerMap =
System.IO.File.ReadLines(@"C:\some\folder\providers.psv")
.Select(line => line.Split('|'))
.Select(parts => new Provider() { Pattern = parts[0], Name = parts[1] }).ToList();
Since you seem to need to search for the key before returning the value a Dictionary is the right way to go, but you will need to loop over it.
// dictionary to hold mappings
Dictionary<string, string> mapping = new Dictionary<string, string>();
// add your mappings here
// loop over the keys
foreach (KeyValuePair<string, string> item in mapping)
{
// return value if key found
if(txtvar.BillText.IndexOf(item.Key) > -1) {
return item.Value;
}
}
EDIT: If you wish to have control over the order in which elements are evaluated, use an OrderedDictionary and add the elements in the order in which you want them evaluated.
One more using LINQ and Dictionary
var mapping = new Dictionary<string, string>()
{
{ "SWGAS.COM", "Southwest Gas" },
{ "georgiapower.com", "Georgia Power" }
// ...
};
return mapping.Where(pair => txtvar.BillText.IndexOf(pair.Key) > -1)
.Select(pair => pair.Value)
.FirstOrDefault();
If we prefer empty string instead of null when no key matches we can use the ?? operator:
return mapping.Where(pair => txtvar.BillText.IndexOf(pair.Key) > -1)
.Select(pair => pair.Value)
.FirstOrDefault() ?? "";
If the dictionary may contain keys where one is a prefix of another, we add an OrderBy on the key. Ordered alphabetically, a shorter key sorts before any longer key that starts with it, so this will pick 'SCE' before 'SCEC':
return mapping.Where(pair => txtvar.BillText.IndexOf(pair.Key) > -1)
.OrderBy(pair => pair.Key)
.Select(pair => pair.Value)
.FirstOrDefault() ?? "";
To avoid the blatant Schlemiel the Painter's approach that looping over all the keys would involve: let's use regular expressions!
// a dictionary that holds which bill text keyword maps to which provider
static Dictionary<string, string> BillTextToProvider = new Dictionary<string, string> {
{"SWGAS.COM", "Southwest Gas"},
{"georgiapower.com", "Georgia Power"}
// ...
};
// a regex that will match any of the keys of this dictionary
// i.e. any of the bill text keywords
static Regex BillTextRegex = new Regex(
string.Join("|", // to alternate between the keywords
from key in BillTextToProvider.Keys // grab the keywords
select Regex.Escape(key))); // escape any special characters in them
/// If any of the bill text keywords is found, return the corresponding provider.
/// Otherwise, return null.
string GetProvider(string billText)
{
var match = BillTextRegex.Match(billText);
if (match.Success)
// the Value of the match will be the found substring
return BillTextToProvider[match.Value];
else return null;
}
// Your original code now reduces to:
var provider = GetProvider(txtvar.BillText);
// the if is unnecessary if txtvar.Provider should be null in case it can't be
// determined
if (provider != null)
txtvar.Provider = provider;
Making this case-insensitive is a trivial exercise for the reader.
All that said, this does not even pretend to impose an order on which keywords to look for first - it will find the match that's located earliest in the string. (And then the one that occurs first in the RE.) You do however mention that you're searching through largeish texts; if .NET's RE implementation is at all good this should perform considerably better than 200 naive string searches. (By only making one pass through the string, and maybe a little by merging common prefixes in the compiled RE.)
If ordering is important to you, you might want to consider looking for an implementation of a better string search algorithm than .NET uses. (Like a variant of Boyer-Moore.)
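For the case-insensitivity exercise mentioned above, a minimal sketch might look like the following. It assumes two changes to the answer's code: the regex is built with RegexOptions.IgnoreCase, and the dictionary uses a case-insensitive comparer so that a matched substring such as "swgas.com" still finds the "SWGAS.COM" key. The camel-cased names are mine, and the sketch is only meant for plain ASCII keywords.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Case-insensitive keys, so the matched text can be used for the lookup as-is.
var billTextToProvider = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    { "SWGAS.COM", "Southwest Gas" },
    { "georgiapower.com", "Georgia Power" },
};

// One alternation over all escaped keywords, matched without regard to case.
var billTextRegex = new Regex(
    string.Join("|", billTextToProvider.Keys.Select(Regex.Escape)),
    RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

// Returns the provider for the earliest keyword found, or null if none matches.
string GetProvider(string billText)
{
    Match match = billTextRegex.Match(billText);
    return match.Success ? billTextToProvider[match.Value] : null;
}

Console.WriteLine(GetProvider("Pay online at swgas.com today")); // prints Southwest Gas
```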
What you want is a Dictionary:
Dictionary<string, string> mapping = new Dictionary<string, string>();
mapping["SWGAS.COM"] = "Southwest Gas";
mapping["foo"] = "bar";
// ... as many as you need, maybe read from a file ...
Then just:
return mapping[inputString];
Done.
One way of doing it (other answers show very valid options):
void Main()
{
string input = "georgiapower.com";
string output = null;
// an array of string arrays...an array of Tuples would also work,
// or a List<T> with any two-member type, etc.
var search = new []{
new []{ "SWGAS.COM", "Southwest Gas"},
new []{ "georgiapower.com", "Georgia Power"},
new []{ "City of Austin", "City of Austin"}
};
for( int i = 0; i < search.Length; i++ ){
// more complex search logic could go here (e.g. a regex)
if( input.IndexOf( search[i][0] ) > -1 ){
output = search[i][1];
break;
}
}
// (optional) check that a valid result was found.
if( output == null ){
throw new InvalidOperationException( "A match was not found." );
}
// Assign the result, output it, etc.
Console.WriteLine( output );
}
The main thing to take out of this exercise is that creating a giant switch or if/else structure is not the best way to do it.
There are several ways to do this, but for simplicity, the conditional operator may be a good choice:
Func<String, bool> contains=x => {
return txtvar.BillText.IndexOf(x)>-1;
};
txtvar.Provider=
contains("SWGAS.COM")?"Southwest Gas":
contains("georgiapower.com")?"Georgia Power":
contains("City of Austin")?"City of Austin":
// more statements go here
// if none of these matched, txtvar.Provider is assigned to itself
txtvar.Provider;
Note that the result is determined by the first condition that is met, so if txtvar.BillText="City of Austin georgiapower.com"; then the result would be "Georgia Power".
You can use a dictionary.
Dictionary<string, string> textValue = new Dictionary<string, string>();
foreach (KeyValuePair<string, string> textKey in textValue)
{
if(txtvar.BillText.IndexOf(textKey.Key) > -1)
return textKey.Value;
}
I'm just working on a Kata on my lunch and I've come unstuck...
Here are the steps I'm trying to follow:
Given an input string, split the string by the new line character
Given the string array result of the previous step, skip the first element in the array
Given the collection of strings resulting from the previous step, create a collection consisting of every 2 elements
In that last statement what I mean is, given this collection of 4 strings:
{
"string1",
"string2",
"string3",
"string4"
}
I should end up with this collection of pairs (is 'tuples' the right term?):
{
{ "string1","string2" },
{ "string3","string4" }
}
I started looking at ToDictionary, then moved over to selecting an anonymous type but I'm not sure how to say "return the next two strings as a pair".
My code looks similar to this at the time of writing:
public void myMethod() {
var splitInputString = input.Split('\n');
var dic = splitInputString.Skip(1).Select( /* each two elements */ );
}
Cheers for the help!
James
Well, you could use (untested):
var dic = splitInputString.Zip(splitInputString.Skip(1),
(key, value) => new { key, value })
.Where((pair, index) => index % 2 == 0)
.ToDictionary(pair => pair.key, pair => pair.value);
The Zip part will end up with:
{ "string1", "string2" }
{ "string2", "string3" }
{ "string3", "string4" }
... and the Where part using the index will skip every other entry (which would be "value with the next key").
Of course if you really know you've got a List<string> to start with, you could just access the pairs by index, but that's boring...
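For completeness, the "boring" index-based version might look like this. It is a sketch assuming a newline-separated `input` whose first line should be skipped, as the question describes. The sample string is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input: a first line to skip, followed by four strings.
string input = "header\nstring1\nstring2\nstring3\nstring4";

// Split on newlines and skip the first element.
List<string> lines = input.Split('\n').Skip(1).ToList();

var pairs = new Dictionary<string, string>();

// Step two elements at a time; a trailing unpaired element is ignored.
for (int i = 0; i + 1 < lines.Count; i += 2)
{
    pairs.Add(lines[i], lines[i + 1]);
}

foreach (var pair in pairs)
{
    Console.WriteLine(pair.Key + " -> " + pair.Value);
}
```

Note that a Dictionary will throw if the same first element appears in two pairs. A List of tuples avoids that if duplicates are possible.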