C# regex data from website

I am trying to make an addon to a game named Tibia.
On their website Tibia.com you can search up people and see their deaths.
For example:
http://www.tibia.com/community/?subtopic=characters&name=Kixus
Now I want to read the deaths data by using Regex in my C# application.
But I cannot seem to work it out, I've been spending hours and hours on
http://myregextester.com/index.php
The expression I use is :
<tr bgcolor=(?:"#D4C0A1"|"#F1E0C6") ><td width="25%" valign="top" >(.*?)?#160;CET</td><td>((?:Died|Killed) at Level ([^ ]*)|and) by (?:<[^>]*>)?([^<]*).</td></tr>
But I cannot make it work.
I want the Timestamp, creature / player Level, and creature / player name
Thanks in advance.
-Regards

It's a bad idea to use regular expressions to parse HTML. They're a very poor tool for the job. If you're parsing HTML, use an HTML parser.
For .NET, the usual recommendation is to use the HTML Agility Pack.

As suggested by Joe White, you would have a much more robust implementation if you used an HTML parser for this task. There is plenty of support for this on StackOverflow: see here for example.
If you really have to use regexes
I would recommend breaking your solution down into simpler regexes which can be applied using a top-down parsing approach to get the results.
For example:
use a regex on the whole page which matches the character table
I would suggest matching the shortest unique string before and after the table rather than the table itself, and capturing the table using a group, since this avoids having to deal with the possibility of nested tables.
use a regex on the character table that matches table rows
use a regex on the first cell to match the date
use a regex on the second cell to match links
use a regex on the second cell to match the players level
use a regex on the second cell to match the killers name if it was a creature (there are no links in the cell)
This will be much more maintainable if the site changes its Html structure significantly.
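The top-down steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the "Character Deaths" marker, the two-cell row layout, and the DeathRow and TopDownDeathParser names are hypothetical and not taken from the real page, so the patterns would need adjusting against the actual HTML.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical result type for this sketch
public class DeathRow
{
    public string Date { get; set; }
    public int Level { get; set; }
    public string KilledBy { get; set; }
}

public static class TopDownDeathParser
{
    // Step 1: capture the text between a unique marker and the end of the table.
    // Using "Character Deaths" as the marker is an assumption about the page.
    private static readonly Regex TableBody = new Regex(
        @"Character Deaths(.*?)</table>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Step 2: match each two-cell row inside the captured table text.
    private static readonly Regex Row = new Regex(
        @"<tr[^>]*>\s*<td[^>]*>(.*?)</td>\s*<td[^>]*>(.*?)</td>\s*</tr>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Remaining steps: small patterns applied to one cell at a time.
    private static readonly Regex Tag = new Regex(@"<[^>]+>");
    private static readonly Regex Level = new Regex(@"\blevel (\d+)", RegexOptions.IgnoreCase);
    private static readonly Regex Killer = new Regex(@" by (.+?)\.?$");

    public static List<DeathRow> Parse(string html)
    {
        var rows = new List<DeathRow>();
        Match table = TableBody.Match(html);
        if (!table.Success)
        {
            return rows; // no deaths table found
        }

        foreach (Match row in Row.Matches(table.Groups[1].Value))
        {
            string details = Clean(row.Groups[2].Value);
            Match level = Level.Match(details);
            Match killer = Killer.Match(details);
            rows.Add(new DeathRow
            {
                Date = Clean(row.Groups[1].Value),
                Level = level.Success ? int.Parse(level.Groups[1].Value) : 0,
                KilledBy = killer.Success ? killer.Groups[1].Value : string.Empty
            });
        }
        return rows;
    }

    // Strips tags, decodes entities and turns non-breaking spaces into plain ones.
    private static string Clean(string fragment)
    {
        return WebUtility.HtmlDecode(Tag.Replace(fragment, string.Empty))
            .Replace('\u00A0', ' ')
            .Trim();
    }
}
```

Calling TopDownDeathParser.Parse(pageHtml) returns one DeathRow per table row. Because each pattern only has to understand one level of the page, a layout change should usually break a single small regex rather than one large one.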
A complete working implementation using the HTML Agility Pack
You can download the library from the HTML Agility Pack site on CodePlex.
// This class is used to represent the extracted details
public class DeathDetails
{
public DeathDetails()
{
this.KilledBy = new List<string>();
}
public string DeathDate { get; set; }
public List<String> KilledBy { get; set; }
public int PlayerLevel { get; set; }
}
public class CharacterPageParser
{
public string CharacterName { get; private set; }
public CharacterPageParser(string characterName)
{
this.CharacterName = characterName;
}
public List<DeathDetails> GetDetails()
{
string url = "http://www.tibia.com/community/?subtopic=characters&name=" + this.CharacterName;
string content = GetContent(url);
HtmlDocument document = new HtmlDocument();
document.LoadHtml(content);
HtmlNodeCollection tables = document.DocumentNode.SelectNodes("//div[@id='characters']//table");
HtmlNode table = GetCharacterDeathsTable(tables);
List<DeathDetails> deaths = new List<DeathDetails>();
for (int i = 1; i < table.ChildNodes.Count; i++)
{
DeathDetails details = BuildDeathDetails(table, i);
deaths.Add(details);
}
return deaths;
}
private static string GetContent(string url)
{
using (System.Net.WebClient c = new System.Net.WebClient())
{
string content = c.DownloadString(url);
return content;
}
}
private static DeathDetails BuildDeathDetails(HtmlNode table, int i)
{
DeathDetails details = new DeathDetails();
HtmlNode tableRow = table.ChildNodes[i];
//every row should have two cells in it
if (tableRow.ChildNodes.Count != 2)
{
throw new Exception("Html format may have changed");
}
HtmlNode deathDateCell = tableRow.ChildNodes[0];
details.DeathDate = System.Net.WebUtility.HtmlDecode(deathDateCell.InnerText);
HtmlNode deathDetailsCell = tableRow.ChildNodes[1];
// get inner text to parse for player level and or creature name
string deathDetails = System.Net.WebUtility.HtmlDecode(deathDetailsCell.InnerText);
// get player level using regex
Match playerLevelMatch = Regex.Match(deathDetails, @" level ([\d]+) ", RegexOptions.IgnoreCase);
int playerLevel = 0;
if (int.TryParse(playerLevelMatch.Groups[1].Value, out playerLevel))
{
details.PlayerLevel = playerLevel;
}
if (deathDetailsCell.ChildNodes.Count > 1)
{
// death details contains links which we can parse for character names
foreach (HtmlNode link in deathDetailsCell.ChildNodes)
{
if (link.OriginalName == "a")
{
string characterName = System.Net.WebUtility.HtmlDecode(link.InnerText);
details.KilledBy.Add(characterName);
}
}
}
else
{
// player was killed by a creature - capture creature name
Match creatureMatch = Regex.Match(deathDetails, " by (.*)", RegexOptions.IgnoreCase);
string creatureName = creatureMatch.Groups[1].Value;
details.KilledBy.Add(creatureName);
}
return details;
}
private static HtmlNode GetCharacterDeathsTable(HtmlNodeCollection tables)
{
foreach (HtmlNode table in tables)
{
// Get first row
HtmlNode tableRow = table.ChildNodes[0];
// check to see if contains enough elements
if (tableRow.ChildNodes.Count == 1)
{
HtmlNode tableCell = tableRow.ChildNodes[0];
string title = tableCell.InnerText;
// skip this table if it doesn't have the right title
if (title == "Character Deaths")
{
return table;
}
}
}
return null;
}
And an example of it in use:
CharacterPageParser kixusParser = new CharacterPageParser("Kixus");
foreach (DeathDetails details in kixusParser.GetDetails())
{
Console.WriteLine("Player at level {0} was killed on {1} by {2}", details.PlayerLevel, details.DeathDate, string.Join(",", details.KilledBy));
}

You can also use the Espresso tool to work out a proper regular expression.
To properly escape all special characters that are not part of the regular expression, you can use the Regex.Escape method:
string escapedText = Regex.Escape("<td width=\"25%\" valign=\"top\" >");

try this :
http://jsbin.com/atupok/edit#javascript,html
and continue from there .... I did the most job here :)
edit
http://jsbin.com/atupok/3/edit
and start using this tool
http://regexr.com?2vrmf
not the one you have.

Related

MVVM get data from text file

I cannot work out how to search in a text file and then get the data I need using Model-View-ViewModel.
Basically, I have to make a dictionary app and I have word,language and description in the text file. Like:
cat;e English; it is a four leg animal
In the model I have a text box where the client writes a word and two other boxes, where language and description of the word should be shown.
I just can not get how to search in this file. I tried to search online but nothing seemed to meet my exact question.

Unless your file is going to change you can get away with reading the entire file up front when running your application and putting the data into lists of models for your view models.
As this is essentially a CSV file, and assuming each entry is a line with a semicolon as the delimiter, we can use the .NET TextFieldParser (from the Microsoft.VisualBasic.FileIO namespace) to process your file into your models:
Basic Model:
public class DictionaryEntryModel {
public string Word { get; set; }
public string Language { get; set; }
public string Description { get; set; }
}
Example view model with a constructor to fill out your models:
public class DictionaryViewModel {
// This will be a INotify based property in your VM
public List<DictionaryEntryModel> DictionaryEntries { get; set; }
public DictionaryViewModel () {
DictionaryEntries = new List<DictionaryEntryModel>();
// Create a parser with the [;] delimiter
var textFieldParser = new TextFieldParser(new StringReader(File.ReadAllText(filePath)))
{
Delimiters = new string[] { ";" }
};
while (!textFieldParser.EndOfData)
{
var entry = textFieldParser.ReadFields();
DictionaryEntries.Add(new DictionaryEntryModel()
{
Word = entry[0],
Language = entry[1],
Description = entry[2]
});
}
// Don't forget to close!
textFieldParser.Close();
}
}
You can now bind your view using the property DictionaryEntries and as long as your app is open it will preserve your full file as the list of DictionaryEntryModel.
Hope this helps!
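Once the entries are in memory, the search itself can be a simple LINQ lookup. The following is a minimal sketch, assuming the view model calls it with the text from the search box; DictionarySearch and FindWord are hypothetical names, and DictionaryEntryModel is repeated only so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the model above, repeated so the sketch is self-contained
public class DictionaryEntryModel
{
    public string Word { get; set; }
    public string Language { get; set; }
    public string Description { get; set; }
}

public static class DictionarySearch
{
    // Returns the first entry whose word matches the input, ignoring case
    // and surrounding whitespace, or null when nothing matches.
    public static DictionaryEntryModel FindWord(
        IEnumerable<DictionaryEntryModel> entries, string word)
    {
        if (string.IsNullOrWhiteSpace(word))
        {
            return null;
        }

        string wanted = word.Trim();
        return entries.FirstOrDefault(e =>
            e.Word != null &&
            string.Equals(e.Word.Trim(), wanted, StringComparison.OrdinalIgnoreCase));
    }
}
```

The view model could then copy Language and Description from the returned entry into the properties bound to the other two boxes, and clear them when the result is null.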

I'm not addressing the MVVM part here, but just how to search the text file in order to get resulting data according to a search term, using case insensitive regex.
string dictionaryFileName = @"C:\Test\SampleDictionary.txt"; // replace with your file path
string searchedTerm = "Cat"; // Replace with user input word
string searchRegex = string.Format("^(?<Term>{0});(?<Lang>[^;]*);(?<Desc>.*)$", searchedTerm);
string foundTerm;
string foundLanguage;
string foundDescription;
using (var s = new StreamReader(dictionaryFileName, Encoding.UTF8))
{
string line;
while ((line = s.ReadLine()) != null)
{
var matches = Regex.Match(line, searchRegex, RegexOptions.IgnoreCase);
if (matches.Success)
{
foundTerm = matches.Groups["Term"].Value;
foundLanguage = matches.Groups["Lang"].Value;
foundDescription = matches.Groups["Desc"].Value;
break;
}
}
}
Then you can display the resulting strings to the user.
Note that this will work for typical input words, but it might produce strange results if the user inputs special characters that interfere with the regular expression syntax. Most of this might be corrected by utilizing Regex.Escape(searchedTerm).
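A small sketch of that escaping step, assuming the same line format as above. EscapedTermSearch and BuildPattern are hypothetical helper names; the only change from the pattern above is the call to Regex.Escape.

```csharp
using System.Text.RegularExpressions;

public static class EscapedTermSearch
{
    // Builds the same line pattern as above, but escapes the user's term first
    // so characters such as '+', '.' or '(' are matched literally.
    public static Regex BuildPattern(string searchedTerm)
    {
        string pattern = string.Format(
            "^(?<Term>{0});(?<Lang>[^;]*);(?<Desc>.*)$",
            Regex.Escape(searchedTerm));
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }
}
```

With this in place a term like "c++" matches a line starting with "C++;" instead of producing an invalid pattern, and "a.b" no longer matches "axb".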

Really slow load speed Neo4jClient C# LoadCsv

The code I use now is really slow with about 20 inserts per second and uses a splitter to create multiple csv files to load. Is there a way to use "USING PERIODIC COMMIT 1000" in a proper way using the Neo4jClient for dotnet?
public async Task InsertEdgesByName(List<string> nodeListA, List<string> nodeListB,
List<int> weightList, string type)
{
for (var i = 0; i < nodeListA.Count; i += 200)
{
using (var sw = new StreamWriter(File.OpenWrite($"tempEdge-{type}.csv")))
{
sw.Write("From,To,Weight\n");
for (var j = i;
j < i + 200 &&
j < nodeListA.Count;
j++)
{
sw.Write($"{nodeListA[j]}," +
$"{nodeListB[j]}," +
$"{weightList[j]} + id:{j}" +
$"\n");
}
}
var f = new FileInfo($"tempEdge-{type}.csv");
await Client.Cypher
.LoadCsv(new Uri("file://" + f.FullName), "rels", true)
.Match("(from {label: rels.From}), (to {label: rels.To})")
.Create($"(from)-[:{type} {{weight: rels.Weight}}]->(to);")
.ExecuteWithoutResultsAsync();
_logger.LogDebug($"{DateTime.Now}\tEdges inserted\t\tedges inserted: {i}");
}
}
To create the nodes I use
await Client.Cypher
.Create("INDEX ON :Node(label);")
.ExecuteWithoutResultsAsync();
await Client.Cypher
.LoadCsv(new Uri("file://" + f.FullName), "csvNode", true)
.Create("(n:Node {label:csvNode.label, source:csvNode.source})")
.ExecuteWithoutResultsAsync();
The indexing on label does not seem to change the speed of either insert statement. I have about 200.000 edges to insert, at 20 per second this would take hours. Being able to add the USING PERIODIC COMMIT 1000 would clean up my code but wouldn't improve performance by much.
Is there a way to speed up inserts? I know the neo4jclient is not the fastest but I would really like to stay within the asp.net environment.
SimpleNode class
public class SimpleNodeModel
{
public long id { get; set; }
public string label { get; set; }
public string source { get; set; } = "";
public override string ToString()
{
return $"label: {label}, source: {source}, id: {id}";
}
public SimpleNodeModel(string label, string source)
{
this.label = label;
this.source = source;
}
public SimpleNodeModel() { }
public static string Header => "label,source";
public string ToCSVWithoutID()
{
return $"{label},{source}";
}
}
Cypher code
USING PERIODIC COMMIT 500
LOAD CSV FROM 'file://F:/edge.csv' AS rels
MATCH (from {label: rels.From}), (to {label: rels.To})
CREATE (from)-[:edge {weight: rels.Weight}]->(to);

The Cypher code at the bottom is slow because you're not using labels in your MATCH, so the MATCH never uses the index to find the nodes quickly. Instead it must scan every node in your database TWICE, once for from, and again for to.
Your use of label as a node property is not the same as the node label. Since you created the nodes with the :Node label, please reuse this label in your match:
...
MATCH (from:Node {label: rels.From}), (to:Node {label: rels.To})
...

Periodic commit isn't supported in Neo4jClient in the version you're using.
I've just committed a change that will be published shortly (2.0.0.7) which you can then use:
.LoadCsv(new Uri("file://" + f.FullName), "rels", true, periodicCommit:1000)
which will generate the correct cypher.
It's on its way, and should be 5 mins or so depending on indexing time for nuget.

Trying to get NetSuite Country list with enumeration value linked to code and name

I am implementing an integration with NetSuite in C#. In the external system I need to populate a list of countries that will match NetSuite's country list.
The NetSuite Web Service provides an enumeration called Country
public enum Country {
_afghanistan,
_alandIslands,
_albania,
_algeria,
...
You can also get a list of country Name and Code (in an albeit not so straight forward way) from the web service. (See: http://suiteweekly.com/2015/07/netsuite-get-all-country-list/)
Which gives you access to values like this:
Afghanistan, AF
Aland Islands, AX
Albania, AL
Algeria, DZ
American Samoa, AS
...
But, as you can see, there is no way to link the two together. (I tried to match by index but that didn't work and sounds scary anyway)
NetSuite's "help" files have a list. But this is static, and I really want a dynamic solution that updates as NetSuite updates, because we know countries will change, even if not that often.
Screenshot of Country Enumerations from NetSuite help docs
The only solutions I have found online are people who have provided static data that maps the two sets of data. (ex. suiteweekly.com /2015/07/netsuite-complete-country-list-in-netsuite/)
I cannot (don't want to) believe that this is the only solution.
Anyone else have experience with this that has a better solution?
NetSuite, if you are reading, come on guys, give a programmer a break.

The best solution I have come up with is to leverage the apparent relationship between the country name and the enumeration key to forge a link between the two. I am sure others could improve on this solution, but what I would really like to see is a solution that isn't a hack like this, relying on an apparent pattern, but rather one that is based on an explicit connection. Or better yet, NetSuite should just provide the data all together in one place.
For example you can see the apparent relationship here:
_alandIslands -> Aland Islands
With a little code I can try to forge a match.
I first get the Enumeration Keys into an array. And I create a list of objects of type NetSuiteCountry that will hold my results.
var CountryEnumKeys = Enum.GetNames(typeof(Country));
var countries = new List<NetSuiteCountry>();
I then loop through the list of country Name and Code I got using the referenced code above (not shown here).
For each country name I strip all non-word characters with Regex.Replace, prepend an underscore (_) and convert the string to lowercase. Finally I try to find a match between the enumeration key (converted to lowercase as well) and the matcher string that was created. If a match is found I save all the data together in the countries list.
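In isolation, the name-to-matcher step looks roughly like this. It is a sketch rather than the exact code used here: EnumKeyMatcher and ToMatcher are hypothetical names, and ToLowerInvariant is swapped in for ToLower so the result does not depend on the current culture.

```csharp
using System.Text.RegularExpressions;

public static class EnumKeyMatcher
{
    // "Aland Islands" -> "_alandislands": drop non-word characters,
    // prepend an underscore and lowercase the result.
    public static string ToMatcher(string countryName)
    {
        return "_" + Regex.Replace(countryName, @"\W", "").ToLowerInvariant();
    }
}
```

The matcher is then compared case-insensitively against each enumeration key, so "_alandislands" lines up with _alandIslands.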
UPDATE: Based on the comments I have added additional code/hacks to try to deal with the anomalies without hard-coding exceptions. Hopefully these updates will catch any future updates to the country list as well, but no promises. As of this writing it was able to handle all the known anomalies. In my case I needed to ignore Deprecated countries so those aren't included.
foreach (RecordRef baseRef in baseRefList)
{
var name = baseRef.name;
//Skip Deprecated countries
if (name.EndsWith("(Deprecated)")) continue;
//Use the name to try to find and enumkey match and only add a country if found.
var enumMatcher = $"_{Regex.Replace(name, @"\W", "").ToLower()}";
//Compares Ignoring Case and Diacritic characters
var enumMatch = CountryEnumKeys.FirstOrDefault(e => string.Compare(e, enumMatcher, CultureInfo.CurrentCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) == 0);
//Then try by Enum starts with Name but only one.
if (enumMatch == null)
{
var matches = CountryEnumKeys.Where(e => e.ToLower().StartsWith(enumMatcher));
if (matches.Count() == 1)
{
Debug.Write($"- Country Match Hack 1 : ");
enumMatch = matches.First();
}
}
//Then try by Name starts with Enum but only one.
if (enumMatch == null)
{
var matches = CountryEnumKeys.Where(e => enumMatcher.StartsWith(e.ToLower()));
if (matches.Count() == 1)
{
Debug.Write($"- Country Match Hack 2 : ");
enumMatch = matches.First();
}
}
//Finally try by first half Enum and Name match but again only one.
if (enumMatch == null)
{
var matches = CountryEnumKeys.Where(e => e.ToLower().StartsWith(enumMatcher.Substring(0, (enumMatcher.Length/2))));
if (matches.Count() == 1)
{
Debug.Write($"- Country Match Hack 3 : ");
enumMatch = matches.First();
}
}
if (enumMatch != null)
{
var enumIndex = Array.IndexOf(CountryEnumKeys, enumMatch);
if (enumIndex >= 0)
{
var country = (Country) enumIndex;
var nsCountry = new NetSuiteCountry
{
Name = baseRef.name,
Code = baseRef.internalId,
EnumKey = country.ToString(),
Country = country
};
Debug.WriteLine($"[{nsCountry.Name}] as [{nsCountry.EnumKey}]");
countries.Add(nsCountry);
}
}
else
{
Debug.WriteLine($"Could not find Country match for: [{name}] as [{enumMatcher}]");
}
}
Here is my NetSuiteCountry class:
public class NetSuiteCountry
{
public string Name { get; set; }
public string Code { get; set; }
public string EnumKey { get; set; }
public Country Country { get; set; }
}

Let me start off with a disclaimer that I'm not a coder, and this is the first day I've tried to look at a C# program.
I need something similar for a JavaScript project where I need the complete list of NetSuite country names, codes and their numeric values, and when reading the help it seemed like the only way was through web services.
I downloaded the sample application for webservices from Netsuite and a version of Visual Studio and I was able to edit the sample program provided to create a list of all of the country names and country codes (ex. Canada, CA).
I started out doing something similar to the previous poster to get the list of country names:
string[] countryList = Enum.GetNames(typeof(Country));
foreach (string s in countryList)
{
_out.writeLn(s);
}
But I later got rid of this and started a new technique. I created a class similar to the previous answer:
public class NS_Country
{
public string countryCode { get; set; }
public string countryName { get; set; }
public string countryEnum { get; set; }
public string countryNumericID { get; set; }
}
Here is the new code for getting the list of country names, codes and IDs. I realize that it's not very efficient; as I mentioned before, I'm not really a coder and this is my first attempt with C#, with lots of Google and cutting/pasting ;D.
_out.writeLn(" Attempting to get Country list.");
// Create a list for the NS_Country objects
List<NS_Country> CountryList = new List<NS_Country>();
// Create a new GetSelectValueFieldDescription object to use in a getSelectValue search
GetSelectValueFieldDescription countryDesc = new GetSelectValueFieldDescription();
countryDesc.recordType = RecordType.customer;
countryDesc.recordTypeSpecified = true;
countryDesc.sublist = "addressbooklist";
countryDesc.field = "country";
// Create a GetSelectValueResult object to hold the results of the search
GetSelectValueResult myResult = _service.getSelectValue(countryDesc, 0);
BaseRef[] baseRef = myResult.baseRefList;
foreach (BaseRef nsCountryRef in baseRef)
{
// Didn't know how to do this more efficiently
// Get the type for the BaseRef object, get the property for "internalId",
// then finally get its value as string and assign it to myCountryCode
string myCountryCode = nsCountryRef.GetType().GetProperty("internalId").GetValue(nsCountryRef).ToString();
// Create a new NS_Country object
NS_Country countryToAdd = new NS_Country
{
countryCode = myCountryCode,
countryName = nsCountryRef.name,
// Call to a function to get the enum value based on the name
countryEnum = getCountryEnum(nsCountryRef.name)
};
try
{
// If the country enum was verified in the Countries enum
if (!String.IsNullOrEmpty(countryToAdd.countryEnum))
{
int countryEnumIndex = (int)Enum.Parse(typeof(Country), countryToAdd.countryEnum);
Debug.WriteLine("Enum: " + countryToAdd.countryEnum + ", Enum Index: " + countryEnumIndex);
_out.writeLn("ID: " + countryToAdd.countryCode + ", Name: " + countryToAdd.countryName + ", Enum: " + countryToAdd.countryEnum);
}
}
// There was a problem locating the country enum that was not handled
catch (Exception ex)
{
Debug.WriteLine("Enum: " + countryToAdd.countryEnum + ", Enum Index Not Found");
_out.writeLn("ID: " + countryToAdd.countryCode + ", Name: " + countryToAdd.countryName + ", Enum: Not Found");
}
// Add the countryToAdd object to the CountryList
CountryList.Add(countryToAdd);
}
// Create a JSON - I need this for my javascript
var javaScriptSerializer = new System.Web.Script.Serialization.JavaScriptSerializer();
string jsonString = javaScriptSerializer.Serialize(CountryList);
Debug.WriteLine(jsonString);
In order to get the enum values, I created a function called getCountryEnum:
static string getCountryEnum(string countryName)
{
// Create a dictionary for looking up the exceptions that can't be converted
// Don't know what Netsuite was thinking with these ones ;D
Dictionary<string, string> dictExceptions = new Dictionary<string, string>()
{
{"Congo, Democratic Republic of", "_congoDemocraticPeoplesRepublic"},
{"Myanmar (Burma)", "_myanmar"},
{"Wallis and Futuna", "_wallisAndFutunaIslands"}
};
// Replace "'s" in the country names with "s"
string countryName2 = Regex.Replace(countryName, @"'s", "s");
// Call a function that replaces accented characters with non-accented equivalent
countryName2 = RemoveDiacritics(countryName2);
countryName2 = Regex.Replace(countryName2, @"\W", " ");
string[] separators = {" ","'"}; // "'" required to deal with country names like "Cote d'Ivoire"
string[] words = countryName2.Split(separators, StringSplitOptions.RemoveEmptyEntries);
for (var i = 0; i < words.Length; i++)
{
string word = words[i];
if (i == 0)
{
words[i] = char.ToLower(word[0]) + word.Substring(1);
}
else
{
words[i] = char.ToUpper(word[0]) + word.Substring(1);
}
}
string countryEnum2 = "_" + String.Join("", words);
// return an empty string if the country name contains Deprecated
bool b = countryName.Contains("Deprecated");
if (b)
{
return String.Empty;
}
else
{
// test to see if the country name was one of the exceptions
string test;
bool isExceptionCountry = dictExceptions.TryGetValue(countryName, out test);
if (isExceptionCountry == true)
{
return dictExceptions[countryName];
}
else
{
return countryEnum2;
}
}
}
In the above I used a function, RemoveDiacritics I found here. I will repost the referenced function below:
static string RemoveDiacritics(string text)
{
string formD = text.Normalize(NormalizationForm.FormD);
StringBuilder sb = new StringBuilder();
foreach (char ch in formD)
{
UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(ch);
if (uc != UnicodeCategory.NonSpacingMark)
{
sb.Append(ch);
}
}
return sb.ToString().Normalize(NormalizationForm.FormC);
}
Here are the tricky cases to test any solution you develop with:
// Test tricky names
Debug.WriteLine(getCountryEnum("Curaçao"));
Debug.WriteLine(getCountryEnum("Saint Barthélemy"));
Debug.WriteLine(getCountryEnum("Croatia/Hrvatska"));
Debug.WriteLine(getCountryEnum("Korea, Democratic People's Republic"));
Debug.WriteLine(getCountryEnum("US Minor Outlying Islands"));
Debug.WriteLine(getCountryEnum("Cote d'Ivoire"));
Debug.WriteLine(getCountryEnum("Heard and McDonald Islands"));
// Enums that fail
Debug.WriteLine(getCountryEnum("Congo, Democratic Republic of")); // _congoDemocraticPeoplesRepublic added to exceptions
Debug.WriteLine(getCountryEnum("Myanmar (Burma)")); // _myanmar added to exceptions
Debug.WriteLine(getCountryEnum("Netherlands Antilles (Deprecated)")); // Skip Deprecated
Debug.WriteLine(getCountryEnum("Serbia and Montenegro (Deprecated)")); // Skip Deprecated
Debug.WriteLine(getCountryEnum("Wallis and Futuna")); // _wallisAndFutunaIslands added to exceptions
For my purposes I wanted a JSON object that had all the values for Countries (Name, Code, Enum, Value). I'll include it here in case anyone is searching for it. The numeric values are useful when you have a 3rd party HTML form that has to forward the information to a Netsuite online form.
Here is a link to the JSON object on Pastebin.
My apologies for the lack of programming knowledge (I only really do a bit of JavaScript); hopefully this additional information will be useful for someone.

How do you get the index of a character in a string when it's less than your starting index?

As per my question, I want to get the index of the first comma prior to my current starting index. To give an example of the data I have a string like this:
Bob Green;PD,Andy Richards;BD,Frank Williams;OW,James Clack;PM
The string contains elements setup as [Persons Name];[Role], so the name is separated from the role by a ; (semi-colon) and each element is separated from each other with a , (comma).
The elements in the string can be in any order, so the reason for my question is that I want to get the person's name out for the role OW. My initial thoughts were to get the index of ;OW, and somehow work back from there. I can obviously loop backwards through the string from my starting index checking to see if the character is a comma but that seems inefficient, so is there a better way to achieve this?
EDIT
To clarify, I only want to get the name associated with the role OW. This role SHOULD only occur in the string once. If it doesn't then I'm happy to only get the first occurrence, which I think IndexOf(";OW,") will do. I don't need the other roles or names, just the name associated with OW.
Also, roles will only ever be 2 characters long. As Matt Burland pointed out, if it's at the end of the string it won't have a trailing comma. However, I can amend my indexof to simply search for ";OW" as roles are only 2 characters long.

Use String.Split(',') to split your string on the comma into an Array. Then make a custom object:
public class RoledPerson{
public string Person;
public string Role;
public RoledPerson(string input){
string[] splitInput = input.Split(';');
Person = splitInput[0];
Role = splitInput[1];
}
}
Then you can convert your string into an Enumerable as follows:
var RoledPersons = inputstring.Split(',').Select(element => new RoledPerson(element));
Then you can just find whatever RoledPerson has OW as his role:
var RoledPersonsWithRole = RoledPersons.Where(roledperson => roledperson.Role == "OW");
As Matt Burland said, you can also do this with a Dictionary<string, string>. I'll leave how to work this out to you. However, this doesn't support multiple keys with the same name, so this won't work if you have the same role multiple times.
Disclaimer: there might be errors in here.
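For the keyed-lookup idea, one option that also copes with repeated roles is an ILookup instead of a Dictionary<string, string>. This is a sketch of that substitution, not the Dictionary version mentioned above; RoleLookup and Build are hypothetical names.

```csharp
using System;
using System.Linq;

public static class RoleLookup
{
    // Builds a role -> names lookup. Unlike a Dictionary, an ILookup
    // can hold several names under the same role.
    public static ILookup<string, string> Build(string input)
    {
        return input
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(element => element.Split(';'))
            .Where(parts => parts.Length == 2)   // skip malformed elements
            .ToLookup(parts => parts[1], parts => parts[0]);
    }
}
```

Indexing the result with a role, for example RoleLookup.Build(input)["OW"], yields every name with that role, and an empty sequence when the role is absent.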

How about a regular expression? This should work:
string role = "OW";
string str = "Bob Green;PD,Andy Richards;BD,Frank Williams;OW,James Clack;PM";
string pattern = "([^,]*);" + role;
var match = Regex.Match(str, pattern);
if (match.Success)
{
Console.WriteLine(match.Groups[1].Value);
}

Use the string.LastIndexOf overload that includes the starting position:
string s = "Bob Green;PD,Andy Richards;BD,Frank Williams;OW,James Clack;PM";
int startRole = s.IndexOf(";OW");
int startName = s.LastIndexOf(',',startRole) + 1; // start at the semicolon before the role
string name = s.Substring(startName,(startRole-startName));
Note that there are edge cases that need to be considered:
Are all roles two characters (e.g. could there be a OWX role)?
If the OW role is the first in the list there will be no comma before it
Are there multiple OW roles? If so you could use a while loop and just start the search at the end of the previous role string.
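The loop for multiple roles could be sketched like this, also covering the other two edge cases. RoleScanner and NamesForRole are hypothetical names; the check on the character after the marker is what stops OW from matching a longer role such as OWX.

```csharp
using System;
using System.Collections.Generic;

public static class RoleScanner
{
    // Collects every name whose role equals the given role exactly.
    public static List<string> NamesForRole(string s, string role)
    {
        var names = new List<string>();
        string marker = ";" + role;
        int searchFrom = 0;

        while (searchFrom < s.Length)
        {
            int startRole = s.IndexOf(marker, searchFrom, StringComparison.Ordinal);
            if (startRole < 0)
            {
                break; // no further occurrences
            }

            int endRole = startRole + marker.Length;
            // Accept only whole roles: the marker must be followed by a comma
            // or by the end of the string.
            if (endRole == s.Length || s[endRole] == ',')
            {
                // LastIndexOf returns -1 when no comma precedes the name,
                // so adding 1 gives index 0 for the first element.
                int startName = s.LastIndexOf(',', startRole) + 1;
                names.Add(s.Substring(startName, startRole - startName));
            }

            searchFrom = endRole; // continue after this occurrence
        }
        return names;
    }
}
```

Each pass restarts the search at the end of the previous role string, so the input is scanned once from left to right.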

This is a basic way to get all OW not just the first or the last one. Feed in OW to the second function.
class stringSections
{
private List<string> role = new List<string>();
private List<string> name = new List<string>();
public void Input(string input)
{
string temp = "";
for(int i =0;i<input.Length;i++)
{
if(input[i]==';')
{
name.Add(temp);
temp = "";
} else if(input[i]==',')
{
role.Add(temp);
temp = "";
} else
{
temp += input[i];
}
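}
// The last role has no trailing comma, so add whatever is left over
if (temp.Length > 0)
{
role.Add(temp);
}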
}
public List<string> GetAll(string prole)
{
List<string> reterners = new List<string>();
for(int i = 0; i < role.Count;i++)
{
if (role[i] == prole)
{
reterners.Add(name[i]);
}
}
return reterners;
}
}

To handle the edge cases raised by @D Stanley:
var s = "Bob Williams;OW,Bob Green;PD,Frank Williams;OW,Andy Richards;BD,James Clack;PM,Dave Williams;OW";
var r = new Regex("(;OW,|^OW,|;OW$)");
if (r.IsMatch(s))
{
foreach (Match m in r.Matches(s))
{
var rIdx = m.Index;
var pIdx = s.LastIndexOf(",",rIdx);
var person = s.Substring(pIdx + 1, rIdx - pIdx - 1);
Console.WriteLine(person);
}
}
else
{
Console.WriteLine("Role not found");
}

To ensure my 'solution' worked I coded a quick aspx web form with the following:
<asp:Content ID="Content3" ContentPlaceHolderID="PageContent" runat="server">
<asp:HiddenField ID="TestString" runat="server" Value="Bob Green;PD,Andy Richards;BD,Frank Williams;OW,James Clack;PM" />
<asp:Label ID="Label1" runat="server" Text="Find people in role"></asp:Label>
<asp:TextBox ID="RoleToFind" runat="server"></asp:TextBox><br /><br />
<asp:TextBox ID="Result" runat="server" Rows="10" TextMode="MultiLine" Width="294px"></asp:TextBox>
<asp:Button ID="SearchButton" runat="server" Text="Search" OnClick="SearchButton_Click" />
</asp:Content>
I then created a class as per Nate's answer
[Serializable]
public class RolePerson
{
public string Person { get; set; }
public string Role { get; set; }
}
And finally in the code behind the aspx page I have added the following:
public partial class teststack : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
}
protected void SearchButton_Click(object sender, EventArgs e)
{
List<RolePerson> lrp = new List<RolePerson>();
// Get the string from the hidden field
string strData = this.TestString.Value;
// split the string into an array, each value in the array having name;role
string[] strRecords = strData.Split(new Char[] {','});
// process the array and add to the list
foreach (string s in strRecords)
{
string[] strRecord = s.Split(new Char[] { ';' });
lrp.Add(new RolePerson{
Person = strRecord[0],
Role = strRecord[1]
});
}
// Find the person for the specified role
FindPerson(lrp, this.RoleToFind.Text);
}
//Find the people for the specified role and add to the results textbox
private void FindPerson(List<RolePerson> lrp, string strRole)
{
this.Result.Text = null;
string strResults = string.Empty;
foreach (RolePerson rp in lrp)
{
if (rp.Role == strRole)
strResults = strResults + rp.Person + "\r\n";
}
this.Result.Text = strResults;
}
}
And the result :
Using the above you would be able to find multiple people in the same role and allow the role to be specified by the user.
The FindPerson method could be developed further to use a LINQ query, which would be more efficient on a much larger string.
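A LINQ version of FindPerson might look like the sketch below. RolePersonQueries and FindPeople are hypothetical names, and RolePerson is repeated so the snippet stands alone. The main gain shown here is brevity; whether it is actually faster is not something this sketch demonstrates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

[Serializable]
public class RolePerson
{
    public string Person { get; set; }
    public string Role { get; set; }
}

public static class RolePersonQueries
{
    // Returns the names of everyone in the given role, one per line.
    public static string FindPeople(IEnumerable<RolePerson> people, string role)
    {
        IEnumerable<string> names = people
            .Where(rp => rp.Role == role)
            .Select(rp => rp.Person);

        // Join puts the separator between names only, so unlike the loop
        // above there is no trailing line break.
        return string.Join("\r\n", names);
    }
}
```

In the code-behind, this.Result.Text = RolePersonQueries.FindPeople(lrp, this.RoleToFind.Text); would replace the call to FindPerson.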

Splitting a list<> that each item contains comma separated strings and formatting it out

I have a simple class like this:
class QuickReport
{
public string DeviceName { get; set; }
public string GroupName { get; set; }
public string PinName { get; set; }
public override string ToString()
{
return DeviceName + "," + GroupName + "," + PinName;
}
}
Later I make a list of items with this class:
List<QuickReport> QR = new List<QuickReport>();
Later in my program it will fill up and when I save it in a text file it will be like this example:
HBM\D1,GND,10
HBM\D1,GND,12
HBM\D1,NT_IOp,115
HBM\D1,NT_IOp,117
HBM\D2,GND,8
HBM\D2,NT_IOp,115
HBM\D2,NT_IOp,116
Now I want to make a function to save the text file in more readable manner. That is formatting it by DEVICE, GROUPS and PINS. So the above example would result in:
HBM\D1
GND: 10, 12
NT_IOp: 115, 117
HBM\D2
GND: 8
NT_IOp: 115, 116
can you please help and give some ideas?
Thanks!

var query = QR.ToLookup(i=>i.DeviceName, i => new {i.GroupName, i.PinName})
.Select(i=>
new {DeviceName = i.Key,
Groups = i.ToLookup(g=>g.GroupName, g=>g.PinName)});
var sb = new StringBuilder();
foreach ( var device in query)
{
sb.AppendLine(device.DeviceName);
foreach ( var gr in device.Groups)
{
sb.Append(gr.Key + ": ");
sb.Append(String.Join(", ", gr.ToArray()));
sb.AppendLine();
}
sb.AppendLine();
}
var stringToWrite = sb.ToString();

As I understand it, you have a tree structure, where a Device has child Groups, and each Group has child pins.
You can create custom classes like this:
public class Group
{
public string Name;
// pins that belong to this group
public List<string> Pins;
}
public class Device
{
public string Name;
// groups that belong to this device
public List<Group> Groups;
}
And then just collect them into a List<Device> and serialize it using XML serialization (XmlSerializer only handles public types and members, hence the public modifiers).
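The XML serialization step could look roughly like this. It is a sketch under a few assumptions: the classes use public auto-properties (XmlSerializer only reads public members), and DeviceXml with its two methods is a hypothetical helper.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class Group
{
    public string Name { get; set; }
    // pins that belong to this group
    public List<string> Pins { get; set; } = new List<string>();
}

public class Device
{
    public string Name { get; set; }
    // groups that belong to this device
    public List<Group> Groups { get; set; } = new List<Group>();
}

public static class DeviceXml
{
    // Writes the whole device tree to an XML string.
    public static string Serialize(List<Device> devices)
    {
        var serializer = new XmlSerializer(typeof(List<Device>));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, devices);
            return writer.ToString();
        }
    }

    // Reads the device tree back from an XML string.
    public static List<Device> Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(List<Device>));
        using (var reader = new StringReader(xml))
        {
            return (List<Device>)serializer.Deserialize(reader);
        }
    }
}
```

The resulting XML nests each Group under its Device and each pin under its Group, which is readable but more verbose than the plain-text layout asked for in the question.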

This isn't complete, but it should give you enough to go on. You'll still need to add your newlines, and remove trailing commas, etc.
// Make your key the device name
var qrHash = new Dictionary<string, List<QuickReport>>();
// Populate your QR Dictionary here.
var output = new StringBuilder();
foreach (var keyValuePair in qrHash)
{
output.Append(keyValuePair.Key);
var gnd = new StringBuilder("GND: ");
var nt = new StringBuilder("NT_IOp: ");
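foreach (var qr in keyValuePair.Value)
{
// Append each pin to the builder for its group (anything that is not GND goes to NT_IOp)
var target = qr.GroupName == "GND" ? gnd : nt;
target.Append(qr.PinName).Append(", ");
}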
output.Append(gnd);
output.Append(nt);
}

How about using the XmlSerializer to serialize and deserialize your class? This should provide some readable output.
http://msdn.microsoft.com/en-us/library/system.xml.serialization.xmlserializer.aspx

The quickest ways I can think of to do this would be either to loop over the List<> 3 times, each time checking a separate accessor and writing it out to a StringBuilder, then returning StringBuilder.ToString() from the function.
Or, you could use 3 stringbuilders to hold each accessor type, then push all 3 from the function on return.
