Map two string arrays without a switch - c#

I have two arrays that need to be mapped. In code
var result = "[placeholder2] Hello my name is [placeholder1]";
var placeholder = { "[placeholder1]", "[placeholder2]", "[placeholder3]", "[placeholder4]" };
var placeholderValue = { "placeholderValue3", "placeholderValue2", "placeholderValue3" };
Array.ForEach(placeholder , i => result = result.Replace(i, placeholderValue));
Given i, the matching placeholderValue needs to be chosen in an intelligent way. I could implement a switch statement, but the cyclomatic complexity would be unacceptable with 30 elements or so. What is a good pattern, extension method, or other means to achieve my goal?

I skipped null checks for simplicity
string result = "[placeholder2] Hello my name is [placeholder1]";
var placeHolders = new Dictionary<string, string>() {
{ "placeholder1", "placeholderValue1" },
{ "placeholder2", "placeholderValue2" }
};
var newResult = Regex.Replace(result, @"\[(.+?)\]", m => placeHolders[m.Groups[1].Value]);
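Since the null checks were skipped, a placeholder with no dictionary entry would make the indexer throw a KeyNotFoundException. The following is a minimal sketch of one way to guard against that, assuming that unknown placeholders should simply stay in the text. The class name PlaceholderFiller and the method name Fill are hypothetical, not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderFiller
{
    // Replaces every [name] token whose name is a key in `values`.
    // Tokens without an entry are left in the text unchanged (an assumption;
    // throwing or substituting an empty string are equally valid policies).
    public static string Fill(string template, IReadOnlyDictionary<string, string> values)
    {
        return Regex.Replace(template, @"\[(.+?)\]", m =>
            values.TryGetValue(m.Groups[1].Value, out var replacement)
                ? replacement
                : m.Value);
    }
}
```

With only "placeholder1" present in the dictionary, the "[placeholder2]" token survives untouched instead of raising an exception.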

The smallest code change would be to just use a for loop, rather than a ForEach or, in your case, a ForEach taking a lambda. With a for loop you'll have the index of the appropriate value in the placeholderValue array.
The next improvement would be to make a single array of an object holding both a placeholder and its value, rather than two 'parallel' arrays that you need to keep in sync.
Even better than that, and also even simpler to implement, is to just have a Dictionary with the key being a placeholder and the value being the placeholder value. This essentially does the above suggestion for you through the use of the KeyValuePair class (so you don't need to make your own).
At that point the pseudocode becomes:
foreach(key in placeholderDictionary) replace key with placeholderDictionary[key]
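The pseudocode above could be written in C# roughly as follows. This is only a sketch: the names PlaceholderReplacer and ReplaceAll are made up for illustration, and the dictionary keys are assumed to include the square brackets, as in the question's placeholder array.

```csharp
using System.Collections.Generic;

public static class PlaceholderReplacer
{
    // Applies one string.Replace per dictionary entry.
    public static string ReplaceAll(string text, IDictionary<string, string> placeholderValues)
    {
        foreach (KeyValuePair<string, string> pair in placeholderValues)
        {
            // pair.Key is the placeholder, pair.Value is its replacement.
            text = text.Replace(pair.Key, pair.Value);
        }
        return text;
    }
}
```

Because the replacements run one after another, a value that itself contains a placeholder could be replaced again by a later entry; if that matters, a single-pass approach such as Regex.Replace avoids it.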

I think you want to use Zip to combine the placeholders with their values.
var result = "[placeholder2] Hello my name is [placeholder1]";
var placeholder = new[] { "[placeholder1]", "[placeholder2]", "[placeholder3]", "[placeholder4]" };
var placeholderValue = new[] { "placeholderValue1", "placeholderValue2", "placeholderValue3", "placeholderValue4" };
var placeHolderPairs = placeholder.Zip(placeholderValue, Tuple.Create);
foreach (var pair in placeHolderPairs)
{
result = result.Replace(pair.Item1, pair.Item2);
}

Related

Optimizing loop

I'm trying to implement a tool that groups certain strings based on the lemmas of their words. During the initialization I make a dictionary for each possible group containing a list of words that would group into this key. This is what I have so far:
public Dictionary<string, HashSet<string>> Sets { get; set; }
private void Initialize(IStemmer stemmer)
{
// Stemming of keywords and groups
var keywordStems = new Dictionary<string, List<string>>();
var groupStems = new Dictionary<string, List<string>>();
foreach (string keyword in Keywords)
{
keywordStems.Add(keyword, CreateLemmas(keyword, stemmer));
foreach (string subset in CreateSubsets(keyword))
{
if (subset.Length > 1 && !groupStems.ContainsKey(subset))
{
groupStems.Add(subset, CreateLemmas(subset, stemmer));
}
}
}
// Initialize all viable sets
// This is the slow part
foreach (string gr in groupStems.Keys)
{
var grStems = groupStems[gr];
var grKeywords = new HashSet<string>((from kw in Keywords
where grStems.All(keywordStems[kw].Contains)
select kw));
if (grKeywords.Count >= Settings.MinCount)
{
Sets.Add(gr, grKeywords);
}
}
}
Is there any way that I can speed the bottleneck of this method up?
The answer of @mjwills is a good idea. It seems likely that this is the most expensive operation:
var grKeywords = new HashSet<string>((
from kw in Keywords
where grStems.All(keywordStems[kw].Contains)
select kw));
The suggestion is to optimize the Contains by taking advantage of the fact that the stems are a set. But if they're a set then why are we repeatedly asking for containment at all? They're a set; do set operations. The question is "what are the keywords such that every member of the grStem set is contained within the keyword's stem set". "Is every member of this set contained in that set" is the subset operation.
var grKeywords = new HashSet<string>((
from kw in Keywords
where grStems.IsSubsetOf(keywordStems[kw])
select kw));
The implementation of IsSubsetOf is optimized for common scenarios like "both operands are sets". And it takes early outs; if your group stems set is larger than the keyword stem set then you don't need to check every element; one of them is going to be missing. But your original algorithm checks every element anyways, even when you could bail early and save all that time.
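To make the subset version concrete, here is a small self-contained sketch. It assumes both the group stems and the per-keyword stems are stored as HashSet<string> (in the question they are List<string>, so that would need to change). StemMatcher and KeywordsCovering are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class StemMatcher
{
    // Returns the keywords whose stem set contains every stem of the group.
    public static HashSet<string> KeywordsCovering(
        HashSet<string> groupStems,
        IReadOnlyDictionary<string, HashSet<string>> keywordStems)
    {
        return new HashSet<string>(
            from entry in keywordStems
            where groupStems.IsSubsetOf(entry.Value)
            select entry.Key);
    }
}
```

Iterating the dictionary entries directly also avoids a second lookup per keyword. Whether this is actually faster on real data is something to measure rather than assume.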
And again @mjwills has a good idea which I'll suggest some possible improvements to. The idea here is to execute the query, cache the results in an array, and only later realize it as a hash set, if necessary:
foreach (var entry in groupStems)
{
var grStems = entry.Value;
var grKeywords = (WHATEVER).ToArray();
if (grKeywords.Length >= Settings.MinCount)
Sets.Add(entry.Key, new HashSet<string>(grKeywords));
}
First: I actually doubt that avoiding the unnecessary hash set construction by replacing it with unnecessary array constructions is a win. Measure it and see.
Second: ToList can be faster than ToArray because a list can be constructed before you know the size of the query result set. ToArray basically has to do a ToList first, and then copy the results into an exactly-sized array. So if ToArray is not a win, ToList might be. Or not. Measure it.
Third: I note that the whole thing can be rewritten into a query should you prefer that style.
var q = from entry in groupStems
let grStems = entry.Value
let grKeywords = new HashSet<string>(WHATEVER)
where grKeywords.Count >= Settings.MinCount
select (entry.Key, grKeywords);
var result = q.ToDictionary( ... and so on ... )
That's probably not faster, but it might be easier to reason about.
One suggestion would be to change:
var keywordStems = new Dictionary<string, List<string>>();
to:
var keywordStems = new Dictionary<string, HashSet<string>>();
That should have an impact due to your later Contains call:
var grKeywords = new HashSet<string>((from kw in Keywords
where grStems.All(keywordStems[kw].Contains)
select kw));
because Contains is generally faster on a HashSet than a List.
Also consider changing:
foreach (string gr in groupStems.Keys)
{
var grStems = groupStems[gr];
var grKeywords = new HashSet<string>((from kw in Keywords
where grStems.All(keywordStems[kw].Contains)
select kw));
if (grKeywords.Count >= Settings.MinCount)
{
Sets.Add(gr, grKeywords);
}
}
to:
foreach (var entry in groupStems)
{
var grStems = entry.Value;
var grKeywords = (from kw in Keywords
where grStems.All(keywordStems[kw].Contains)
select kw).ToArray();
if (grKeywords.Length >= Settings.MinCount)
{
Sets.Add(entry.Key, new HashSet<string>(grKeywords));
}
}
By shifting the HashSet initialization (which is relatively expensive compared to initializing an Array) into the if statement then you may improve performance if the if is entered relatively rarely (in your comments you state it is entered roughly 25% of the time).

Search a List of string array to find a value in matching element and return another element in same array

So I have
List<string[]> listy = new List<string[]>();
listy.add('a','1','blue');
listy.add('b','2','yellow');
And I want to search through the whole list to find the array containing 'yellow' and return its first element, in this case 'b'.
Is there a way to do this with built-in functions, or am I going to need to write my own search here?
I'm relatively new to C# and not aware of good practice or all the built-in functions. I'm OK with lists and arrays, but lists of arrays baffle me somewhat.
Thanks in advance.
As others have already suggested, the easiest way to do this involves a very powerful C# feature called LINQ ("Language INtegrated Query"). It gives you a SQL-like syntax for querying collections of objects (or databases, or XML documents, or JSON documents).
To make LINQ work, you will need to add this at the top of your source code file:
using System.Linq;
Then you can write:
IEnumerable<string> yellowThings =
from stringArray in listy
where stringArray.Contains("yellow")
select stringArray[0];
Or equivalently:
IEnumerable<string> yellowThings =
listy.Where(strings => strings.Contains("yellow"))
.Select(strings => strings[0]);
At this point, yellowThings is an object containing a description of the query that you want to run. You can write other LINQ queries on top of it if you want, and it won't actually perform the search until you ask to see the results.
You now have several options...
Loop over the yellow things:
foreach(string thing in yellowThings)
{
// do something with thing...
}
(Don't do this more than once, otherwise the query will be evaluated repeatedly.)
Get a list or array :
List<string> listOfYellowThings = yellowThings.ToList();
string[] arrayOfYellowThings = yellowThings.ToArray();
If you expect to have exactly one yellow thing:
string result = yellowThings.Single();
// Will throw an exception if the number of matches is zero or greater than 1
If you expect to have either zero or one yellow things:
string result = yellowThings.SingleOrDefault();
// result will be null if there are no matches.
// An exception will be thrown if there is more than one match.
If you expect to have one or more yellow things, but only want the first one:
string result = yellowThings.First();
// Will throw an exception if there are no yellow things
If you expect to have zero or more yellow things, but only want the first one if it exists:
string result = yellowThings.FirstOrDefault();
// result will be null if there are no yellow things.
Based on your description of the problem, here is the solution I can suggest.
List<string[]> listy = new List<string[]>();
listy.Add(new string[] { "a", "1", "blue"});
listy.Add(new string[] { "b", "2", "yellow"});
var target = listy.FirstOrDefault(item => item.Contains("yellow"));
if (target != null)
{
Console.WriteLine(target[0]);
}
This should solve your issue. Let me know if I am missing any use case here.
You might consider changing the data structure.
Have a class for your data as follows:
public class Myclas
{
public string name { get; set; }
public int id { get; set; }
public string color { get; set; }
}
And then,
static void Main(string[] args)
{
List<Myclas> listy = new List<Myclas>();
listy.Add(new Myclas { name = "a", id = 1, color = "blue" });
listy.Add(new Myclas { name = "b", id = 2, color = "yellow" });
var result = listy.FirstOrDefault(t => t.color == "yellow");
}
Your current situation is
List<string[]> listy = new List<string[]>();
listy.Add(new string[]{"a","1","blue"});
listy.Add(new string[]{"b","2","yellow"});
Now there are Linq methods, so this is what you're trying to do
var result = listy.FirstOrDefault(x => x.Contains("yellow"))?[0];

Check whether a string is in a list at any order in C#

If we have a list of strings like the following code:
List<string> XAll = new List<string>();
XAll.Add("#10#20");
XAll.Add("#20#30#40");
string S = "#30#20";//<- this is same as #20#30 also same as "#20#30#40" means S is exist in that list
//check un-ordered string S= #30#20
// if it is contained at any order like #30#20 or even #20#30 ..... then return true :it is exist
if (XAll.Contains(S))
{
Console.WriteLine("Your String is exist");
}
I would prefer to use LINQ to check whether S exists in this sense: regardless of order, some entry in the list XAll must contain at least both (#30) and (#20) together.
I am using
var c = item2.Intersect(item1);
if (c.Count() == item1.Length)
{
return true;
}
You should represent your data in a more meaningful way. Don't rely on strings.
For example I would suggest creating a type to represent a set of these numbers and write some code to populate it.
There are already set types such as HashSet, which is probably a good match because it has built-in methods for testing for subsets.
This should get you started:
var input = "#20#30#40";
var hashSetOfNumbers = new HashSet<int>(input
.Split(new []{'#'}, StringSplitOptions.RemoveEmptyEntries)
.Select(s=>int.Parse(s)));
This works for me:
Func<string, string[]> split =
x => x.Split(new [] { '#' }, StringSplitOptions.RemoveEmptyEntries);
if (XAll.Any(x => split(x).Intersect(split(S)).Count() == split(S).Count()))
{
Console.WriteLine("Your String is exist");
}
Now, depending on how you want to handle duplicates, this might even be a better solution:
Func<string, HashSet<string>> split =
x => new HashSet<string>(x.Split(
new [] { '#' },
StringSplitOptions.RemoveEmptyEntries));
if (XAll.Any(x => split(S).IsSubsetOf(split(x))))
{
Console.WriteLine("Your String is exist");
}
This second approach uses pure set theory so it strips duplicates.
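The difference between the two approaches only shows up when the searched string repeats a token. The sketch below wraps both in hypothetical helper methods (TokenMatch, ContainsByIntersect, and ContainsBySubset are illustrative names) so they can be compared side by side.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TokenMatch
{
    private static string[] Split(string s) =>
        s.Split(new[] { '#' }, StringSplitOptions.RemoveEmptyEntries);

    // First approach: compare the size of the (distinct) intersection
    // with the raw token count of the searched string.
    public static bool ContainsByIntersect(string candidate, string wanted) =>
        Split(candidate).Intersect(Split(wanted)).Count() == Split(wanted).Length;

    // Second approach: treat both sides as sets, so repeats are ignored.
    public static bool ContainsBySubset(string candidate, string wanted) =>
        new HashSet<string>(Split(wanted)).IsSubsetOf(new HashSet<string>(Split(candidate)));
}
```

For "#30#20" against "#20#30#40" both methods return true. For "#20#20" the first returns false, because Intersect yields one distinct token while the raw count is two, and the second returns true.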

Alternative to if, else if

I have a lot of if, else if statements and I know there has to be a better way to do this but even after searching stackoverflow I'm unsure of how to do so in my particular case.
I am parsing text files (bills) and assigning the name of the service provider to a variable (txtvar.Provider) based on if certain strings appear on the bill.
This is a small sample of what I'm doing (don't laugh, I know it's messy). All in all, there are approximately 300 if / else if branches.
if (txtvar.BillText.IndexOf("SWGAS.COM") > -1)
{
txtvar.Provider = "Southwest Gas";
}
else if (txtvar.BillText.IndexOf("georgiapower.com") > -1)
{
txtvar.Provider = "Georgia Power";
}
else if (txtvar.BillText.IndexOf("City of Austin") > -1)
{
txtvar.Provider = "City of Austin";
}
// And so forth for many different strings
I would like to use something like a switch statement to be more efficient and readable but I'm unsure of how I would compare the BillText. I'm looking for something like this but can't figure out how to make it work.
switch (txtvar.BillText)
{
case txtvar.BillText.IndexOf("Southwest Gas") > -1:
txtvar.Provider = "Southwest Gas";
break;
case txtvar.BillText.IndexOf("TexasGas.com") > -1:
txtvar.Provider = "Texas Gas";
break;
case txtvar.BillText.IndexOf("Southern") > -1:
txtvar.Provider = "Southern Power & Gas";
break;
}
I'm definitely open to ideas.
I would need the ability to determine the order in which the values were evaluated.
As you can imagine, when parsing for hundreds of slightly different layouts I occasionally run into the issue of not having a distinctly unique indicator as to what service provider the bill belongs to.
Why not use everything C# has to offer? The following use of anonymous types, collection initializers, implicitly typed variables, and lambda-syntax LINQ is compact, intuitive, and maintains your modified requirement that patterns be evaluated in order:
var providerMap = new[] {
new { Pattern = "SWGAS.COM" , Name = "Southwest Gas" },
new { Pattern = "georgiapower.com", Name = "Georgia Power" },
// More specific first
new { Pattern = "City of Austin" , Name = "City of Austin" },
// Then more general
new { Pattern = "Austin" , Name = "Austin Electric Company" },
// And for everything else:
new { Pattern = String.Empty , Name = "Unknown" }
};
txtvar.Provider = providerMap.First(p => txtvar.BillText.IndexOf(p.Pattern) > -1).Name;
More likely, the pairs of patterns would come from a configurable source, such as:
var providerMap =
System.IO.File.ReadLines(@"C:\some\folder\providers.psv")
.Select(line => line.Split('|'))
.Select(parts => new { Pattern = parts[0], Name = parts[1] }).ToList();
Finally, as @millimoose points out, anonymous types are less useful when passed between methods. In that case we can define a trivial Provider class and use object initializers for nearly identical syntax:
class Provider {
public string Pattern { get; set; }
public string Name { get; set; }
}
var providerMap =
System.IO.File.ReadLines(@"C:\some\folder\providers.psv")
.Select(line => line.Split('|'))
.Select(parts => new Provider() { Pattern = parts[0], Name = parts[1] }).ToList();
Since you seem to need to search for the key before returning the value a Dictionary is the right way to go, but you will need to loop over it.
// dictionary to hold mappings
Dictionary<string, string> mapping = new Dictionary<string, string>();
// add your mappings here
// loop over the keys
foreach (KeyValuePair<string, string> item in mapping)
{
// return value if key found
if(txtvar.BillText.IndexOf(item.Key) > -1) {
return item.Value;
}
}
EDIT: If you wish to have control over the order in which elements are evaluated, use an OrderedDictionary and add the elements in the order in which you want them evaluated.
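One detail worth knowing with that approach: System.Collections.Specialized.OrderedDictionary is not generic, so enumerating it yields DictionaryEntry values whose Key and Value are typed as object. A minimal sketch, with a hypothetical class name and the two Austin patterns used only as sample entries, might look like this:

```csharp
using System;
using System.Collections;
using System.Collections.Specialized;

public static class OrderedProviderLookup
{
    // Entries are enumerated in insertion order, so more specific
    // patterns are added before more general ones.
    private static readonly OrderedDictionary Mapping = new OrderedDictionary
    {
        { "City of Austin", "City of Austin" },
        { "Austin", "Austin Electric Company" },
    };

    // Returns the provider for the first pattern found in billText, or null.
    public static string Find(string billText)
    {
        foreach (DictionaryEntry item in Mapping)
        {
            if (billText.IndexOf((string)item.Key, StringComparison.Ordinal) > -1)
            {
                return (string)item.Value;
            }
        }
        return null;
    }
}
```

A List of key/value pairs would give the same ordering guarantee with static typing, at the cost of losing lookup by key.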
One more using LINQ and Dictionary
var mapping = new Dictionary<string, string>()
{
{ "SWGAS.COM", "Southwest Gas" },
{ "georgiapower.com", "Georgia Power" }
// ...
};
return mapping.Where(pair => txtvar.BillText.IndexOf(pair.Key) > -1)
.Select(pair => pair.Value)
.FirstOrDefault();
If we prefer empty string instead of null when no key matches we can use the ?? operator:
return mapping.Where(pair => txtvar.BillText.IndexOf(pair.Key) > -1)
.Select(pair => pair.Value)
.FirstOrDefault() ?? "";
If the dictionary may contain similar keys, we can add an OrderBy on the key. With alphabetical ordering, a key that is a prefix of another sorts first, so this will pick 'SCE' before 'SCEC':
return mapping.Where(pair => txtvar.BillText.IndexOf(pair.Key) > -1)
.OrderBy(pair => pair.Key)
.Select(pair => pair.Value)
.FirstOrDefault() ?? "";
To avoid the blatant Schlemiel the Painter's approach that looping over all the keys would involve: let's use regular expressions!
// a dictionary that holds which bill text keyword maps to which provider
static Dictionary<string, string> BillTextToProvider = new Dictionary<string, string> {
{"SWGAS.COM", "Southwest Gas"},
{"georgiapower.com", "Georgia Power"}
// ...
};
// a regex that will match any of the keys of this dictionary
// i.e. any of the bill text keywords
static Regex BillTextRegex = new Regex(
string.Join("|", // to alternate between the keywords
from key in BillTextToProvider.Keys // grab the keywords
select Regex.Escape(key))); // escape any special characters in them
/// If any of the bill text keywords is found, return the corresponding provider.
/// Otherwise, return null.
string GetProvider(string billText)
{
var match = BillTextRegex.Match(billText);
if (match.Success)
// the Value of the match will be the found substring
return BillTextToProvider[match.Value];
else return null;
}
// Your original code now reduces to:
var provider = GetProvider(txtvar.BillText);
// the if is unnecessary if txtvar.Provider should be null in case it can't be
// determined
if (provider != null)
txtvar.Provider = provider;
Making this case-insensitive is a trivial exercise for the reader.
All that said, this does not even pretend to impose an order on which keywords to look for first - it will find the match that's located earliest in the string. (And then the one that occurs first in the RE.) You do however mention that you're searching through largeish texts; if .NET's RE implementation is at all good this should perform considerably better than 200 naive string searches. (By only making one pass through the string, and maybe a little by merging common prefixes in the compiled RE.)
If ordering is important to you, you might want to consider looking for an implementation of a better string search algorithm than .NET uses. (Like a variant of Boyer-Moore.)
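For the case-insensitive variant of the regular-expression lookup, one possible sketch is shown below. It assumes two changes are enough: the regex is built with RegexOptions.IgnoreCase, and the dictionary uses a case-insensitive comparer so that the matched text (which keeps the casing found in the bill) still finds its key. The class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaseInsensitiveProviderLookup
{
    // OrdinalIgnoreCase lets a match such as "swgas.com" find the "SWGAS.COM" key.
    private static readonly Dictionary<string, string> BillTextToProvider =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "SWGAS.COM", "Southwest Gas" },
            { "georgiapower.com", "Georgia Power" },
        };

    // Alternation of the escaped keys, matched without regard to case.
    private static readonly Regex BillTextRegex = new Regex(
        string.Join("|", BillTextToProvider.Keys.Select(Regex.Escape)),
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Returns the provider for the earliest keyword in billText, or null.
    public static string GetProvider(string billText)
    {
        Match match = BillTextRegex.Match(billText);
        return match.Success ? BillTextToProvider[match.Value] : null;
    }
}
```

For plain ASCII keywords like these, the regex and the comparer agree on what counts as equal; with unusual Unicode characters the two notions of case-insensitivity could differ, so TryGetValue would be the safer lookup there.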
What you want is a Dictionary:
Dictionary<string, string> mapping = new Dictionary<string, string>();
mapping["SWGAS.COM"] = "Southwest Gas";
mapping["foo"] = "bar";
... as many as you need, maybe read from a file ...
Then just:
return mapping[inputString];
Done.
One way of doing it (other answers show very valid options):
void Main()
{
string input = "georgiapower.com";
string output = null;
// an array of string arrays...an array of Tuples would also work,
// or a List<T> with any two-member type, etc.
var search = new []{
new []{ "SWGAS.COM", "Southwest Gas"},
new []{ "georgiapower.com", "Georgia Power"},
new []{ "City of Austin", "City of Austin"}
};
for( int i = 0; i < search.Length; i++ ){
// more complex search logic could go here (e.g. a regex)
if( input.IndexOf( search[i][0] ) > -1 ){
output = search[i][1];
break;
}
}
// (optional) check that a valid result was found.
if( output == null ){
throw new InvalidOperationException( "A match was not found." );
}
// Assign the result, output it, etc.
Console.WriteLine( output );
}
The main thing to take out of this exercise is that creating a giant switch or if/else structure is not the best way to do it.
There are several ways to do this, but for simplicity, a chain of conditional operators may be an option:
Func<String, bool> contains=x => {
return txtvar.BillText.IndexOf(x)>-1;
};
txtvar.Provider=
contains("SWGAS.COM")?"Southwest Gas":
contains("georgiapower.com")?"Georgia Power":
contains("City of Austin")?"City of Austin":
// more statements go here
// if none of these matched, txtvar.Provider is assigned to itself
txtvar.Provider;
Note that the first condition in the chain that matches wins, so if txtvar.BillText="City of Austin georgiapower.com"; then the result would be "Georgia Power".
You can use a dictionary.
Dictionary<string, string> textValue = new Dictionary<string, string>();
foreach (KeyValuePair<string, string> textKey in textValue)
{
if(txtvar.BillText.IndexOf(textKey.Key) > -1)
return textKey.Value;
}

Create objects within a foreach to push to an array in C#

What I am trying to achieve is to split a string into multiple addresses like "NL,VENLO,5928PN", for each of which getLocation returns a "POINT( x y)" string value.
This works. Next I need to create a WayPointDesc object for each location, and each of these objects has to be pushed into the WayPointDesc[]. I have tried various methods but I cannot find a feasible option so far. My last resort is to hardcode a maximum number of waypoints, but I would rather avoid such a thing.
Using a list is unfortunately not an option... I think.
This is the function:
/* tour()
* Input: string route
* Output: string[] [0] DISTANCE [1] TIME [2] MAP
* Edited 21/12/12 - Davide Nguyen
*/
public string[] tour(string route)
{
// EXAMPLE INPUT FROM QUERY
route = "NL,HELMOND,5709EM+NL,BREDA,8249EN+NL,VENLO,5928PN";
string[] waypoints = route.Split('+');
// Do something completely incomprehensible
foreach (string point in waypoints)
{
xRoute.WaypointDesc wpdStart = new xRoute.WaypointDesc();
wpdStart.wrappedCoords = new xRoute.Point[] { new xRoute.Point() };
wpdStart.wrappedCoords[0].wkt = getLocation(point);
}
// Put the strange result in here somehow
xRoute.WaypointDesc[] waypointDesc = new xRoute.WaypointDesc[] { wpdStart };
// Calculate the route information
xRoute.Route route = calculateRoute(waypointDesc);
// Generate the map, travel distance and travel time using the route information
string[] result = createMap(route);
// Return the result
return result;
//WEEKEND?
}
Arrays are fixed-length. If you want to add elements dynamically, you need a resizable collection such as List<T>. Also, your wpdStart variable was declared inside the loop, so it was out of scope where you originally tried to add it to the array.
List<xRoute.WaypointDesc> waypointDesc = new List<xRoute.WaypointDesc>();
// Do something completely incomprehensible
foreach (string point in waypoints)
{
xRoute.WaypointDesc wpdStart = new xRoute.WaypointDesc();
wpdStart.wrappedCoords = new xRoute.Point[] { new xRoute.Point() };
wpdStart.wrappedCoords[0].wkt = getLocation(point);
// Put the strange result in here somehow
waypointDesc.Add(wpdStart);
}
If you really want the list as an array later, use: waypointDesc.ToArray()
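If only the final array is needed, the list can be skipped by projecting with LINQ. The sketch below is self-contained, so it declares hypothetical stand-ins for xRoute.Point and xRoute.WaypointDesc and takes getLocation as a delegate; the real types and the real getLocation would be used instead.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for xRoute.Point and xRoute.WaypointDesc,
// declared only so the sketch compiles on its own.
public class Point { public string wkt { get; set; } }
public class WaypointDesc { public Point[] wrappedCoords { get; set; } }

public static class WaypointBuilder
{
    // Builds one WaypointDesc per '+'-separated address in route.
    public static WaypointDesc[] Build(string route, Func<string, string> getLocation)
    {
        return route.Split('+')
            .Select(point => new WaypointDesc
            {
                wrappedCoords = new[] { new Point { wkt = getLocation(point) } }
            })
            .ToArray();
    }
}
```

The resulting array has exactly one element per address, so no maximum number of waypoints needs to be hardcoded.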
