Separating Syllables into an array or list Unity C# - c#

Im trying to seperate each syllable of an english word/name into a list. The code I have is counting the syllables perfectly which I also need, but I cannot figure out how to seperate each syllable from the words given.
Im using an input field but for now the example word is my name.
im very new to programming so please keep that in mind.
preferred example output would be something like {"joh", "nat", "hon"}
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
public class SyllableSeperator : MonoBehaviour
{
string word = "johnathon";
private void Start()
{
SyllableCount(word);
}
public static int SyllableCount(string word)
{
word = word.ToLower().Trim();
List<string> wordList = new List<string>();
bool lastWasVowel = false;
string vowels = "aeiouy";
int count = 0;
foreach (char c in word)
{
if (vowels.Contains(c))
{
if (!lastWasVowel)
{
count++;
lastWasVowel = true;
}
}
else
{
lastWasVowel = false;
}
}
if((word.EndsWith("e") || (word.EndsWith("es") || word.EndsWith("ed"))) && !word.EndsWith("le"))
{
count--;
}
return count;
}
}
what i have tried just does not work at all so I figured I wouldn't post that code.
Again please keep in mind that I am brand new to Unity and programming in general. Someone may have already asked how to do this but the code was too advanced for me.

Assuming you are checking the syllable by going through each character, you can make a temporary string and keep adding characters to the string.
Once you determine that string is good enough to be a syllable, add it to the list.
Like so:
public static int GetSyllableCount(string word, out List<string> syllableList)
{
word = word.ToLower().Trim();
List<string> syllableList = new List<string>();
bool lastWasVowel = false;
string vowels = "aeiouy";
StringBuilder currSyllable = new StringBuilder();
foreach (char c in word)
{
if (vowels.Contains(c))
{
if (!lastWasVowel)
{
lastWasVowel = true;
// Finish this syllable and add to the list
syllableList.Add(currSyllable.ToString());
currSyllable.Clear();
}
}
else
{
lastWasVowel = false;
}
// Add this character to the current syllable
currSyllable.Append(c);
}
if((word.EndsWith("e") || (word.EndsWith("es") || word.EndsWith("ed"))) && !word.EndsWith("le"))
{
// Remove the last syllable?
syllableList.Remove(syllableList.Count - 1);
}
return syllableList.Count;
}
The out parm modifer allows you to output another value.

Related

parsing functions from text file without third party

I’ve been struggling wrapping my head around parsers and lexers, which I’m not even sure is the best way to tackle my challenge. I don’t want to use any third party libraries because of the unnecessary overhead, project size, and possible license issues.
Also there won’t be any arithmetic’s, loops, while’s, for’s or foreach’s
It’s a very simple parser that adds or instantiates objects from a text file.
For instance,
buildings.addWithMaterials([buildable:mansion], {{[materials: brick], [materials: wood]}, {[materials: wood], [materials: brick]}});
Parsing this text would add a Mansion made of two pieces of brick and wood to the buildings Collection.
The object buildable contains the properties Name which in this case is Mansion
and some building components which in this case are the materials Brick and Wood.
Any tip/direction to search for doing this?
I've looked and searched within stackoverflow, most entries I've stumbled upon refer to third parties
like Sprache and more.
If I missed an article, :/ sorry, please point it out to me
Thanks and kind regards, Nick
Providing all of the items in your script are in the same format, you can use something like this to parse it:
public static class ScriptParser
{
public static void Parse(string filename)
{
string[] script = null;
using (StreamReader reader = File.OpenText(filename))
{
// read the whole file, remove all line breaks and split
// by semicolons to get individual commands
script = reader.ReadToEnd().Replace("\r", string.Empty.Replace("\n", string.Empty)).Split(';');
}
foreach (string command in script)
HandleCommand(command);
}
// as arguments seem to be grouped you can't just split
// by commas so this will get the arguments recursively
private static object[] ParseArgs(string argsString)
{
int startPos = 0;
int depth = 0;
List<object> args = new List<object>();
Action<int> addArg = pos =>
{
string arg = argsString.Substring(startPos, pos - startPos).Trim();
if (arg.StartsWith("{") && arg.EndsWith("}"))
{
arg = arg.Substring(1, arg.Length - 2);
args.Add(ParseArgs(arg));
}
else
{
args.Add(arg);
}
}
for (int i = 0; i < argsString.Length; i++)
{
switch (argsString[i])
{
case ',':
if (depth == 0)
{
addArg(i);
startPos = i + 1;
}
break;
case '{':
// increase depth of parsing so that commas
// within braces are ignored
depth++;
break;
case '}':
// decrease depth when exiting braces
depth--;
break;
}
}
// as there is no final comma
addArg(argsString.Length);
return args.ToArray();
}
private static void HandleCommand(string commandString)
{
Command command = new Command();
string prefix = commandString.Substring(0, commandString.IndexOf("("));
command.Collection = prefix.Split('.')[0];
command.Action = prefix.Split('.')[1];
string commandArgs = commandString.Substring(commandString.IndexOf("("));
commandArgs = commandArgs.Substring(1, commandArgs.Length - 2);
command.Args = ParseArgs(commandArgs);
command.Execute();
}
private class Command
{
public string Collection;
public string Action;
public object[] Args;
public void Execute()
{
// need to handle any expected commands
if (Collection == "buildings")
{
if (Action == "addWithMaterials")
{
buildings.AddWithMaterials((string)Args[0], (object[])Args[1]);
}
}
}
}
//Eg.
public static BuildingCollection buildings = new BuildingCollection();
public class BuildingCollection
{
public void AddWithMaterials(string building, object[] materials)
{
Building building = new Building();
foreach (object[] materialsPart in materials)
// more stuff for you to work out
// you will need to match the strings to material classes
}
}
}
Usage:
ScriptParser.Parse("scriptfile.txt");
Note: This lacks error handling, which you will definitely want when parsing a file as it could contain anything.

Find two strings in list with a regular expression

I need to find two strings within a list that contains the characters from another string, which are not in order. To make it clear, an example could be a list of animals like:
lion
dog
bear
cat
And a given string is: oodilgn.
The answer here would be: lion and dog
Each character from the string will be used only once.
Is there a regular expression that will allow me to do this?
You could try to put the given string between []. These brackets will allow choosing - in any order - from these letters only. This may not be a perfect solution, but it will catch the majority of your list.
For example, you could write oodilgn as [oodilgn], then add a minimum number of letters to be found - let's say 3 - by using the curly brackets {}. The full regex will be like this:
[oodilgn]{3,}
This code basically says: find any word that has three of the letters that are located between brackets in any order.
Demo: https://regex101.com/r/MCWHjQ/2
Here is some example algorithm that does the job. I have assumed that the two strings together don't need to take all letters from the text else i make additional commented check. Also i return first two appropriate answers.
Here is how you call it in the outside function, Main or else:
static void Main(string[] args)
{
var text = "oodilgn";
var listOfWords = new List<string> { "lion", "dog", "bear", "cat" };
ExtractWordsWithSameLetters(text, listOfWords);
}
Here is the function with the algorithm. All string manuplations are entirely with regex.
public static void ExtractWordsWithSameLetters(string text, List<string> listOfWords)
{
string firstWord = null;
string secondWord = null;
for (var i = 0; i < listOfWords.Count - 1; i++)
{
var textCopy = text;
var firstWordIsMatched = true;
foreach (var letter in listOfWords[i])
{
var pattern = $"(.*?)({letter})(.*?)";
var regex = new Regex(pattern);
if (regex.IsMatch(text))
{
textCopy = regex.Replace(textCopy, "$1*$3", 1);
}
else
{
firstWordIsMatched = false;
break;
}
}
if (!firstWordIsMatched)
{
continue;
}
firstWord = listOfWords[i];
for (var j = i + 1; j < listOfWords.Count; j++)
{
var secondWordIsMatched = true;
foreach (var letter in listOfWords[j])
{
var pattern = $"(.*?)({letter})(.*?)";
var regex = new Regex(pattern);
if (regex.IsMatch(text))
{
textCopy = regex.Replace(textCopy, "$1*$3", 1);
}
else
{
secondWordIsMatched = false;
break;
}
}
if (secondWordIsMatched)
{
secondWord = listOfWords[j];
break;
}
}
if (secondWord == null)
{
firstWord = null;
}
else
{
//if (textCopy.ToCharArray().Any(l => l != '*'))
//{
// break;
//}
break;
}
}
if (firstWord != null)
{
Console.WriteLine($"{firstWord} { secondWord}");
}
}
Function is far from optimised but does what you want. If you want to return results, not print them just create an array and stuff firstWord and secondWord in it and have return type string[] or add two paramaters with ref out In those cases you will need to check the result in the calling function.
please try this out
Regex r=new Regex("^[.*oodilgn]$");
var list=new List<String>(){"lion","dog","fish","god"};
var output=list.Where(x=>r.IsMatch(x));
result
output=["lion","dog","god"];

Foreach going out of bounds while searching through array c#

The purpose of my program is to take a data txt file and edit it, and/or make additions and subtractions to it.
The text file format is like this:
Name|Address|Phone|# of people coming|isRSVP
The code I have seems to be doing it's job all the way up until I try to click one of the names within a listbox and it needs to search through the multidimensional array to pull information out and place within a set of textboxes where they can be edited. The problem is that the foreach loop I use gives me an out of bounds exception. I tried to do a step into debug to make sure the data is correct in the array and see the process. Everything seems to do correctly but for some reason in the conditional statement person[0]==listbox1.selectedIndex isn't returning true even though both are the same as I seen through the step into process. Any help would be greatly appreciated.
This is my code:
StringBuilder newFile = new StringBuilder();
static string txtList= "guest_list.txt";
static string[] file = File.ReadAllLines(txtList);
static int x = file.Count();
string[,] list = new string[x ,5];
public void loadGuestList()
{
int count2 = 0;
foreach (string line in file)
{
string[] sliced = line.Split('|');
int count = 0;
list[count2, count] = sliced[0];
count++;
list[count2, count] = sliced[1];
count++;
list[count2,count] = sliced[2];
count++;
list[count2,count]= sliced[3];
count++;
list[count2, count] = sliced[4];
count++;
listBox1.Items.Add(list[count2,0]);
count2++;
}
}
private void listBox1_SelectedIndexChanged(object sender, EventArgs e)
{
foreach (string person in list)
{
if ( person[0].ToString()==listBox1.SelectedItem.ToString())
{
addTxt.Text = char.ToString(person[1]);
textPhone.Text = char.ToString(person[2]);
textPeople.Text = char.ToString(person[3]);
if (person[4] == 'n' )
{
}
else
{
chkRSVP.Checked = true;
}
break;
}
}
}
The problem lies in this line:
foreach (string person in list)
The list is defined as being string[,] which when you for each over will do every element, not just the column of data. You really should do something such as:
for(int index = 0; index <= list.GetUpperBound(0); index++)
{
string slice1 = list[index, 0];
string slice2 = list[index, 1];
....
}
or switch to using a Dictionary<string, string[]>().
Try to use a "Person" object and override equals(). Right now you're trying to put your multidimensional array (list[0]) into a string, it'll give you a unwanted result. You should use list[0,0] instead.
In agreement with Adam Gritt, I tested the following code and it seemed to work:
using System;
namespace so_foreach_bounds
{
class MainClass
{
public static void Main (string[] args)
{
//System.Text.StringBuilder newFile = new System.Text.StringBuilder();
string txtList= "guest_list.txt";
string[] file = System.IO.File.ReadAllLines(txtList);
int x = file.Length;
System.Collections.Generic.List<string[]> list = new System.Collections.Generic.List<string[]> ();
foreach (string line in file)
{
string[] sliced = line.Split('|');
list.Add(sliced);
}
foreach (string[] person in list)
{
Console.WriteLine (String.Join(", ", person));
if (person[0] =="Gary")
{
string txtAdd = person[1];
string txtPhone = person[2];
string txtpeople = person[3];
if (person[4] == "n" )
{
}
else
{
bool has_resvped = true;
}
break;
}
}
}
}
}
The issue is how you are iterating over the 2d array. It is usually a better idea to create a "Person" class, than try to manage it via arrays though. Oh yes, and it's usually a good idea to check that a list box item is selected before attempting to use one of its members.

Counting/sorting characters in a text file

I am trying to write a program that reads a text file, sorts it by character, and keeps track of how many times each character appears in the document. This is what I have so far.
class Program
{
static void Main(string[] args)
{
CharFrequency[] Charfreq = new CharFrequency[128];
try
{
string line;
System.IO.StreamReader file = new System.IO.StreamReader(#"C:\Users\User\Documents\Visual Studio 2013\Projects\Array_Project\wap.txt");
while ((line = file.ReadLine()) != null)
{
int ch = file.Read();
if (Charfreq.Contains(ch))
{
}
}
file.Close();
Console.ReadLine();
}
catch (Exception e)
{
Console.WriteLine("The process failed: {0}", e.ToString());
}
}
}
My question is, what should go in the if statement here?
I also have a Charfrequency class, which I'll include here in case it is helpful/necessary that I include it (and yes, it is necessary that I use an array versus a list or arraylist).
public class CharFrequency
{
private char m_character;
private long m_count;
public CharFrequency(char ch)
{
Character = ch;
Count = 0;
}
public CharFrequency(char ch, long charCount)
{
Character = ch;
Count = charCount;
}
public char Character
{
set
{
m_character = value;
}
get
{
return m_character;
}
}
public long Count
{
get
{
return m_count;
}
set
{
if (value < 0)
value = 0;
m_count = value;
}
}
public void Increment()
{
m_count++;
}
public override bool Equals(object obj)
{
bool equal = false;
CharFrequency cf = new CharFrequency('\0', 0);
cf = (CharFrequency)obj;
if (this.Character == cf.Character)
equal = true;
return equal;
}
public override int GetHashCode()
{
return m_character.GetHashCode();
}
public override string ToString()
{
String s = String.Format("'{0}' ({1}) = {2}", m_character, (byte)m_character, m_count);
return s;
}
}
Have a look at this post.
https://codereview.stackexchange.com/questions/63872/counting-the-number-of-character-occurrences
It uses LINQ to achieve your goal
You shouldn't use Contains
first you need to initialize your Charfreq array:
CharFrequency[] Charfreq = new CharFrequency[128];
for (int i = 0; i < Charferq.Length; i++)
{
Charfreq[i] = new CharFrequency((char)i);
}
try
then you can
int ch;
// -1 means that there are no more characters to read,
// otherwise ch is the char read
while ((ch = file.Read()) != -1)
{
CharFrequency cf = new CharFrequency((char)ch);
// This works because CharFrequency overloads the
// Equals method, and the Equals method checks only
// for the Character property of CharFrequency
int ix = Array.IndexOf(Charfreq, cf);
// if there is the "right" charfrequency
if (ix != -1)
{
Charfreq[ix].Increment();
}
}
Note that this isn't the way I would write the program. This is the minimum changes needed to make your program working.
As a sidenote, this program will count the "frequency" of ASCII characters (characters with code <= 127)
CharFrequency cf = new CharFrequency('\0', 0);
cf = (CharFrequency)obj;
And this is an useless initialization:
CharFrequency cf = (CharFrequency)obj;
is enough, otherwise you are creating a CharFrequency just to discard it the line below.
A dictionary is well suited for a task like this. You didn't say which character set and encoding the file was in. So, because Unicode is so common, let's assume the Unicode character set and UTF-8 encoding. (After all, it is the default for .NET, Java, JavaScript, HTML, XML,….) If that's not the case then read the file using the applicable encoding and fix your code because you currently are using UTF-8 in your StreamReader.
Next comes iterating across the "characters". And then incrementing the count for a "character" in the dictionary as it is seen in the text.
Unicode does have a few complex features. One is combining characters, where a base character can be overlaid with diacritics etc. Users view such combinations as one "character", or, as Unicode calls them, graphemes. Thankfully, .NET gives is the StringInfo class that iterates over them as a "text element."
So, if you think about it, using an array would be quite difficult. You'd have to build your own dictionary on top of your array.
The example below uses a Dictionary and is runnable using a LINQPad script. After it creates the dictionary, it orders and dumps it with a nice display.
var path = Path.GetTempFileName();
// Get some text we know is encoded in UTF-8 to simplify the code below
// and contains combining codepoints as a matter of example.
using (var web = new WebClient())
{
web.DownloadFile("http://superuser.com/questions/52671/which-unicode-characters-do-smilies-like-%D9%A9-%CC%AE%CC%AE%CC%83-%CC%83%DB%B6-consist-of", path);
}
// since the question asks to analyze a file
var content = File.ReadAllText(path, Encoding.UTF8);
var frequency = new Dictionary<String, int>();
var itor = System.Globalization.StringInfo.GetTextElementEnumerator(content);
while (itor.MoveNext())
{
var element = (String)itor.Current;
if (!frequency.ContainsKey(element))
{
frequency.Add(element, 0);
}
frequency[element]++;
}
var histogram = frequency
.OrderByDescending(f => f.Value)
// jazz it up with the list of codepoints in each text element
.Select(pair =>
{
var bytes = Encoding.UTF32.GetBytes(pair.Key);
var codepoints = new UInt32[bytes.Length/4];
Buffer.BlockCopy(bytes, 0, codepoints, 0, bytes.Length);
return new {
Count = pair.Value,
textElement = pair.Key,
codepoints = codepoints.Select(cp => String.Format("U+{0:X4}", cp) ) };
});
histogram.Dump(); // For use in LINQPad

Please critique my class

I've taken a few school classes along time ago on and to be honest i never really understood the concept of classes. I recently "got back on the horse" and have been trying to find some real world application for creating a class.
you may have seen that I'm trying to parse a lot of family tree data that is in an very old and antiquated format called gedcom
I created a Gedcom Reader class to read in the file , process it and make it available as two lists that contain the data that i found necessary to use
More importantly to me is i created a class to do it so I would very much like to get the experts here to tell me what i did right and what i could have done better ( I wont say wrong because the thing works and that's good enough for me)
Class:
using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
namespace GedcomReader
{
class Gedcom
{
private string GedcomText = "";
public struct INDI
{
public string ID;
public string Name;
public string Sex;
public string BDay;
public bool Dead;
}
public struct FAM
{
public string FamID;
public string Type;
public string IndiID;
}
public List<INDI> Individuals = new List<INDI>();
public List<FAM> Families = new List<FAM>();
public Gedcom(string fileName)
{
using (StreamReader SR = new StreamReader(fileName))
{
GedcomText = SR.ReadToEnd();
}
ReadGedcom();
}
private void ReadGedcom()
{
string[] Nodes = GedcomText.Replace("0 #", "\u0646").Split('\u0646');
foreach (string Node in Nodes)
{
string[] SubNode = Node.Replace("\r\n", "\r").Split('\r');
if (SubNode[0].Contains("INDI"))
{
Individuals.Add(ExtractINDI(SubNode));
}
else if (SubNode[0].Contains("FAM"))
{
Families.Add(ExtractFAM(SubNode));
}
}
}
private FAM ExtractFAM(string[] Node)
{
string sFID = Node[0].Replace("# FAM", "");
string sID = "";
string sType = "";
foreach (string Line in Node)
{
// If node is HUSB
if (Line.Contains("1 HUSB "))
{
sType = "PAR";
sID = Line.Replace("1 HUSB ", "").Replace("#", "").Trim();
}
//If node for Wife
else if (Line.Contains("1 WIFE "))
{
sType = "PAR";
sID = Line.Replace("1 WIFE ", "").Replace("#", "").Trim();
}
//if node for multi children
else if (Line.Contains("1 CHIL "))
{
sType = "CHIL";
sID = Line.Replace("1 CHIL ", "").Replace("#", "");
}
}
FAM Fam = new FAM();
Fam.FamID = sFID;
Fam.Type = sType;
Fam.IndiID = sID;
return Fam;
}
private INDI ExtractINDI(string[] Node)
{
//If a individual is found
INDI I = new INDI();
if (Node[0].Contains("INDI"))
{
//Create new Structure
//Add the ID number and remove extra formating
I.ID = Node[0].Replace("#", "").Replace(" INDI", "").Trim();
//Find the name remove extra formating for last name
I.Name = Node[FindIndexinArray(Node, "NAME")].Replace("1 NAME", "").Replace("/", "").Trim();
//Find Sex and remove extra formating
I.Sex = Node[FindIndexinArray(Node, "SEX")].Replace("1 SEX ", "").Trim();
//Deterine if there is a brithday -1 means no
if (FindIndexinArray(Node, "1 BIRT ") != -1)
{
// add birthday to Struct
I.BDay = Node[FindIndexinArray(Node, "1 BIRT ") + 1].Replace("2 DATE ", "").Trim();
}
// deterimin if there is a death tag will return -1 if not found
if (FindIndexinArray(Node, "1 DEAT ") != -1)
{
//convert Y or N to true or false ( defaults to False so no need to change unless Y is found.
if (Node[FindIndexinArray(Node, "1 DEAT ")].Replace("1 DEAT ", "").Trim() == "Y")
{
//set death
I.Dead = true;
}
}
}
return I;
}
private int FindIndexinArray(string[] Arr, string search)
{
int Val = -1;
for (int i = 0; i < Arr.Length; i++)
{
if (Arr[i].Contains(search))
{
Val = i;
}
}
return Val;
}
}
}
Implementation:
using System;
using System.Windows.Forms;
using GedcomReader;
namespace WindowsFormsApplication1
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
string path = #"C:\mostrecent.ged";
string outpath = #"C:\gedcom.txt";
Gedcom GD = new Gedcom(path);
GraphvizWriter GVW = new GraphvizWriter("Family Tree");
foreach(Gedcom.INDI I in GD.Individuals)
{
string color = "pink";
if (I.Sex == "M")
{
color = "blue";
}
GVW.ListNode(I.ID, I.Name, "filled", color, "circle");
if (I.ID == "ind23800")
{MessageBox.Show("stop");}
//"ind23800" [ label="Sarah Mandley",shape="circle",style="filled",color="pink" ];
}
foreach (Gedcom.FAM F in GD.Families)
{
if (F.Type == "par")
{
GVW.ConnNode(F.FamID, F.IndiID);
}
else if (F.Type =="chil")
{
GVW.ConnNode(F.IndiID, F.FamID);
}
}
string x = GVW.SB.ToString();
GVW.SaveFile(outpath);
MessageBox.Show("done");
}
}
I am particularly interested in if anything could be done about the structures i don't know if how i use them in the implementation is the greatest but again it works
Thanks alot
Quick thoughts:
Nested types should not be visible.
ValueTypes (structs) should be immutable.
Fields (class variables) should not be public. Expose them via properties instead.
Check passed arguments for invalid values, like null.
It might be more readable. It's hard to read and understand.
You may study SOLID principles (http://butunclebob.com/ArticleS.UncleBob.PrinciplesOfOod)
Robert C. Martin gave good presentation on Oredev 2008 about clean code (http://www.oredev.org/topmenu/video/agile/robertcmartincleancodeiiifunctions.4.5a2d30d411ee6ffd2888000779.html)
Some recomended books to read about code readability:
Kent Beck "Implemetation patterns"
Robert C Martin "Clean Code" Robert C
Martin "Agile Principles, Patterns
and Practices in C#"
I suggest you check this place out: http://refactormycode.com/.
For some quick things, your naming is the biggest thing I would start to change.
No need to use ALL-CAPS or abbreviated terms.
Also, FxCop will help with a lot of suggested changes. For example, FindIndexinArray would be named FindIndexInArray.
EDIT:
I don't know if this is a bug in your code or by-design, but in FindIndexinArray, you don't break from your loop once you find a match. Do you want the first (break) or last (no break) match in the array?

Categories