Loading a TextAsset from file and unable to compare - C#

I am loading a TextAsset from Resources that contains dictionary words and adding them to a List. I want to check whether the list contains the word the user entered. I have tried several methods, but none of them works: the result is always negative. Can anyone help me find out why?
public TextAsset txt;
public List<string> words;
void Awake()
{
    words = new List<string>();
    txt = (TextAsset)Resources.Load("words");
    words = TextAssetExtensionMethods.TextAssetToList(txt);
}
public void Search()
{
    Debug.Log(inputField.text);
    Debug.Log(words.Contains(inputField.text));
    Debug.Log(words.FindAll(s => s.Contains(inputField.text)));
    Debug.Log(words.FindAll(s => s.IndexOf(inputField.text, StringComparison.OrdinalIgnoreCase) >= 0));
    if (words.Contains(inputField.text, StringComparer.CurrentCultureIgnoreCase))
    {
        Debug.Log("Contains");
    }
    else
    {
        Debug.Log("not");
    }
}
public static class TextAssetExtensionMethods
{
    public static List<string> TextAssetToList(this TextAsset ta)
    {
        return new List<string>(ta.text.Split('\n'));
    }
}

I don't know why you created an extension method for the TextAsset class, but now that you have it, you should call it on the TextAsset instance, like this:
words = txt.TextAssetToList();
instead of:
words = TextAssetExtensionMethods.TextAssetToList(txt);
One possible issue here is stray whitespace left in your strings. Just trim your entries:
Array.ConvertAll(ta.text.Split(','), p => p.Trim()).ToList(); // ToList() requires System.Linq
assuming your words are separated by commas.
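If the file instead has one word per line, as the Split('\n') in the question suggests, a likely culprit is a trailing '\r' left on every word by Windows line endings. The following is a minimal sketch under that assumption; WordListParser and ParseWords are hypothetical names, and the raw string would come from txt.text:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordListParser
{
    // Splits raw text into trimmed, non-empty words.
    // Splitting on both '\r' and '\n' handles Unix and Windows line endings.
    public static HashSet<string> ParseWords(string rawText)
    {
        var words = rawText
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Trim())
            .Where(w => w.Length > 0);

        // Assumption: lookups should ignore case.
        return new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);
    }
}
```

With this, ParseWords(txt.text).Contains(inputField.text.Trim()) would be the lookup. A HashSet is used here only because membership tests are the sole operation needed; a List works the same way, just more slowly on a large dictionary.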

Related

Compare 2 lists for partial match

C# folks! I have two lists that I want to compare.
Example:
List<string> ONE contains:
A
B
C
List<string> TWO contains:
B
C
I know I can get the elements that appear only in ONE if I do:
ONE.Except(TWO);
Results: A
How can I do the same if each element in my lists has a file extension?
List<string> ONE contains:
A.pdf
B.pdf
C.pdf
List<string> TWO contains: (will always have .txt extension)
B.txt
C.txt
Results should = A.pdf
I realized that I need to display the full filename (A.pdf) in a report at the end, so I cannot strip the extension, like I originally did.
Thanks for the help!
EDIT:
This is how I went about it. I am not sure it is the "best" or "most performant" way to solve it, but it does seem to work:
foreach (string s in ONE)
{
    // since I know TWO will always be .txt
    string temp = Path.GetFileNameWithoutExtension(s) + ".txt";
    if (TWO.Contains(temp))
    {
        // yes it exists, do something
    }
    else
    {
        // no it does not exist, do something
    }
}
This is simple, straightforward code, though it would need adjusting if your requirement involves more file extensions than .pdf:
List<string> lstA = new List<string>() { "A.pdf", "B.pdf", "C.pdf" };
List<string> lstB = new List<string>() { "B.txt", "C.txt" };
foreach (var item in lstA)
{
    // Print every .pdf that has no matching .txt
    if (!lstB.Contains(item.Replace(".pdf", ".txt")))
    {
        Console.WriteLine(item);
    }
}
You can implement a custom equality comparer:
class FileNameComparer : IEqualityComparer<String>
{
    public bool Equals(String b1, String b2)
    {
        return Path.GetFileNameWithoutExtension(b1).Equals(Path.GetFileNameWithoutExtension(b2));
    }
    public int GetHashCode(String a)
    {
        return Path.GetFileNameWithoutExtension(a).GetHashCode();
    }
}
... and pass it to the Except method:
System.Console.WriteLine(string.Join(", ", list1.Except(list2, new FileNameComparer())));
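Another way to keep the full file name for the report is to collect the base names of the second list in a set and filter the first list against it. This is only a sketch: FileNameMatcher and WithoutCounterpart are hypothetical names, and the case-insensitive comparison is an assumption that may not suit every file system.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileNameMatcher
{
    // Returns the entries of 'first' whose base name (extension stripped)
    // does not occur among the base names of 'second'.
    public static List<string> WithoutCounterpart(
        IEnumerable<string> first, IEnumerable<string> second)
    {
        // Assumption: "a.pdf" and "A.txt" should count as the same base name.
        var baseNames = new HashSet<string>(
            second.Select(s => Path.GetFileNameWithoutExtension(s)),
            StringComparer.OrdinalIgnoreCase);

        return first
            .Where(f => !baseNames.Contains(Path.GetFileNameWithoutExtension(f)))
            .ToList();
    }
}
```

Unlike Except, this keeps duplicates and the original order of the first list, and it makes no assumption about which extensions either list uses.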

Problems with List<string> when adding strings by variable

I want to build a list without duplicates from a file that has many lines, each with an identifier that is sometimes repeated. When I try using List<string>.Contains, it doesn't work. I think this is because I'm adding objects instead of strings directly.
public List<string> obterRelacaoDeBlocos()
{
    List<string> listaDeBlocos = new List<string>();
    foreach (string linhas in arquivos.obterLinhasDoArquivo())
    {
        string[] linhaQuebrada = linhas.Split('|');
        string bloco = linhaQuebrada[1].ToString();
        if (listaDeBlocos.Contains((string)bloco) != true)
        {
            listaDeBlocos.Add(bloco + ":" + listaDeBlocos.Contains(bloco).ToString());
        }
    }
    return listaDeBlocos;
}
You're appending ":" + listaDeBlocos.Contains(bloco).ToString() to the string before you add it to the list. That's not going to match when you encounter the same word again, so Contains will return false and the same word will get added again.
I don't see what purpose it serves to append ":False" to the end of each string in the list anyway (Contains is always false at that point), so just remove that part and it should work.
if (!listaDeBlocos.Contains(bloco))
{
    listaDeBlocos.Add(bloco);
}
Since you're only interested in one part of each string, based on how you're splitting, you could rewrite your method using LINQ. This is untested but should work:
public List<string> obterRelacaoDeBlocos()
{
    return arquivos.obterLinhasDoArquivo().Select(x => x.Split('|')[1]).Distinct().ToList();
}
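If the file has many lines, List<string>.Contains rescans the whole list for every line. A sketch of the same idea with a HashSet tracking what has been seen, which also skips lines that have no second field. BlockLister and DistinctBlocks are hypothetical names, and the '|'-separated line layout is taken from the question:

```csharp
using System.Collections.Generic;

public static class BlockLister
{
    // Collects the second '|'-separated field of each line,
    // keeping first-seen order and skipping duplicates.
    public static List<string> DistinctBlocks(IEnumerable<string> lines)
    {
        var seen = new HashSet<string>();
        var blocks = new List<string>();
        foreach (string line in lines)
        {
            string[] parts = line.Split('|');
            if (parts.Length < 2)
                continue;               // malformed line: no second field

            if (seen.Add(parts[1]))     // Add returns false for a duplicate
                blocks.Add(parts[1]);
        }
        return blocks;
    }
}
```

The method takes the lines as a parameter, so it could be called as DistinctBlocks(arquivos.obterLinhasDoArquivo()).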

Better method for loading a list of items to be parsed into two, and calling the matching item by index?

I have a .txt file with a list of items (U.S. states and capitals), one per line, such as Arizona:Phoenix, Arkansas:Little Rock, California:Sacramento. I'm going to import that list, but I only want to display the states in a ComboBox. After that, if comboBox1.Items[0] is selected, I want to get the corresponding item that was parsed along with it after the : delimiter. My initial solution was to create a class to hold both values, keep the instances in a List, and use the ComboBox index to look up the matching entry in the List. I feel this might be overkill and that I am overthinking something as simple as a ComboBox whose data won't be subjected to any complex manipulation. Is there a simpler method or data type for this? I just want to get the value after the : delimiter that corresponds to the selected ComboBox index.
First of all, build your State and Capital classes like this:
public class State
{
    public string stateName { get; set; }
    public Capital capital { get; set; }
}
public class Capital
{
    public string capitalName { get; set; }
}
Read the text file, generate a list, and populate the ComboBox like this:
List<State> list = new List<State>();
var file = File.ReadAllLines(FilePath).ToList();
foreach (var item in file)
{
    // Split once per line: parts[0] is the state, parts[1] the capital
    var parts = item.Split(':');
    list.Add(new State()
    {
        stateName = parts[0],
        capital = new Capital() { capitalName = parts[1] }
    });
}
StatesCB.DataSource = list.Select(x => x.stateName).ToList();
Then, in the ComboBox's SelectedIndexChanged event handler, get the capital based on the selected state:
private void Sates_SelectedIndexChanged(object sender, EventArgs e)
{
    // SelectedItem is an object, so cast it to string for a value comparison
    string selectedState = (string)StatesCB.SelectedItem;
    capital.Text = list.Where(x => x.stateName == selectedState)
        .Select(x => x.capital.capitalName).FirstOrDefault();
}
This works and addresses your problem.
You can try this. I assume your text file contains the following lines:
Arizona:Phoenix
Arkansas:Little Rock
California:Sacramento
In your code:
List<string> lstResult = new List<string>();
using (StreamReader sr = new StreamReader(@"C:\Stack\file.txt"))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        // Take the second part of the split string (the capital), which is your requirement
        lstResult.Add(line.Split(':').Skip(1).SingleOrDefault() ?? string.Empty);
    }
}
comboBox1.DataSource = lstResult;
This fills the ComboBox with the capitals: Phoenix, Little Rock, Sacramento.
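For the simpler data type the question asks about, a Dictionary keyed by state name may be enough. The following is a sketch, not a drop-in solution: CapitalLookup and Parse are hypothetical names, and it assumes one State:Capital pair per line.

```csharp
using System;
using System.Collections.Generic;

public static class CapitalLookup
{
    // Parses "State:Capital" lines into a dictionary keyed by state name.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var capitals = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            // Split on the first ':' only, into at most two parts
            string[] parts = line.Split(new[] { ':' }, 2);
            if (parts.Length == 2)
                capitals[parts[0].Trim()] = parts[1].Trim();
        }
        return capitals;
    }
}
```

The ComboBox could then be bound to capitals.Keys.ToList() (with System.Linq), and the handler would read capitals[(string)comboBox1.SelectedItem]. A dictionary does not promise any particular key order, so sort the keys if the display order matters.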

Validate file extensions match approved list

I have a list of files that I want to validate against a list of approved extensions.
Each approved extension may be either Mandatory or Optional.
I need to handle two cases:
1. Check that the list of files contains all mandatory extensions
2. The list of files may only contain extensions in the approved list
I have been trying with regex because some cases can be grouped together. For example, .docx and .doc are treated as the same.
Here is what I have so far (pseudo code):
List<string[]> approvedExt = new List<string[]>();
// M - Mandatory
// O - Optional
approvedExt.Add(new[] { "pdf", "M" });
approvedExt.Add(new[] { "(docx|doc)", "M" }); // Handle as one case
approvedExt.Add(new[] { "(txt)", "O" });
// Example list
List<string> fileList = new List<string>();
fileList.Add("123.pdf");
fileList.Add("123.txt");
fileList.Add("123.xlsx");
fileList.Add("123.pdf");
// pseudo code
For each ext in approvedExt (that are Mandatory)
{
    bool checkMandatoryExt = Any file match?
    // Example code I have seen
    fileList.All(f => System.Text.RegularExpressions.Regex.IsMatch(f, pattern, System.Text.RegularExpressions.RegexOptions.IgnoreCase));
}
if (!checkMandatoryExt)
{
    // Handle Error
}
for each file in fileList
{
    bool allApprovedExt = Any patterns match?
}
if (!allApprovedExt)
{
    // Handle Error
}
The example file list above would fail both cases:
1. It contains a .xlsx file (not in the approved extension list)
2. It contains neither a .docx nor a .doc file (a mandatory extension is missing from the file list)
I would like to be able to pass a list of file names and a list of approved extensions and get back true/false depending on whether the list of files passes the two checks above.
Thank you
Here is how I would solve it (pseudo code):
public class Condition
{
    public bool Mandatory { get; set; }
    public string[] Extensions { get; set; }
}
// ...
// NOTE: the returned extensions include the leading '.'
public string[] GetExtensions(IEnumerable<string> files)
{
    // Apply the null fallback before lower-casing, otherwise it can never take effect
    return files.Select(f => (Path.GetExtension(f) ?? "").ToLowerInvariant()).Distinct().ToArray();
}
public bool AllConditionsOk(string[] fileNamesToCheck, Condition[] conditions)
{
    // Extract the extensions only (e.g. Path.GetExtension)
    string[] extensions = GetExtensions(fileNamesToCheck);
    // Check if any existing extension is not allowed
    foreach (string extension in extensions)
    {
        if (!conditions.Any(c => c.Extensions.Contains(extension)))
            return false;
    }
    // Check if every mandatory condition is fulfilled
    foreach (Condition condition in conditions.Where(c => c.Mandatory))
    {
        if (!condition.Extensions.Any(e => extensions.Contains(e)))
            return false;
    }
    return true;
}
Or, if you prefer a short version:
return extensions.All(extension => conditions.Any(c => c.Extensions.Contains(extension))) &&
    conditions.Where(c => c.Mandatory)
        .All(condition => condition.Extensions.Any(e => extensions.Contains(e)));
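To see both checks run against the sample data from the question, here is a self-contained sketch. ExtensionRule and ExtensionValidator are hypothetical names used so the block stands alone; the rule extensions are assumed to include the leading dot, and matching is assumed to ignore case.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class ExtensionRule
{
    public bool Mandatory { get; set; }
    public string[] Extensions { get; set; }   // each with the leading '.', e.g. ".pdf"
}

public static class ExtensionValidator
{
    // True when every file extension is approved by some rule
    // and every mandatory rule is satisfied by at least one file.
    public static bool Validate(IEnumerable<string> fileNames, IEnumerable<ExtensionRule> rules)
    {
        var ruleList = rules.ToList();

        // Distinct extensions present in the file list (case-insensitive)
        var present = new HashSet<string>(
            fileNames.Select(f => Path.GetExtension(f) ?? ""),
            StringComparer.OrdinalIgnoreCase);

        bool onlyApproved = present.All(ext =>
            ruleList.Any(r => r.Extensions.Contains(ext, StringComparer.OrdinalIgnoreCase)));

        bool allMandatory = ruleList
            .Where(r => r.Mandatory)
            .All(r => r.Extensions.Any(e => present.Contains(e)));

        return onlyApproved && allMandatory;
    }
}
```

With rules for .pdf (mandatory), .docx/.doc (mandatory, grouped) and .txt (optional), the question's list of 123.pdf, 123.txt, 123.xlsx, 123.pdf is rejected on both counts, while a list such as a.pdf, a.doc passes.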
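If you need to check whether the extension of each file name is contained in the extension list, you can do it like this:
foreach (var file in fileList)
{
    // Assumes approvedExt is a List<string> of bare extensions such as "pdf"
    bool approved = approvedExt.Contains(file.Split('.').Last());
}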

How to efficiently compare two lists with 500k objects and strings

So I have a main directory with subfolders and around 500k images. I know a lot of these images do not exist in my database, and I want to know which ones so that I can delete them.
This is the code I have so far:
var listOfAdPictureNames = ImageDB.GetAllAdPictureNames();
var listWithFilesFromImageFolder = ImageDirSearch(adPicturesPath);
var result = listWithFilesFromImageFolder.Where(p => !listOfAdPictureNames.Any(q => p.FileName == q));
var differenceList = result.ToList();
listOfAdPictureNames is of type List<string>.
Here is the model that I'm returning from ImageDirSearch:
public class CheckNotUsedAdImagesModel
{
    public List<ImageDirModel> ListWithUnusedAdImages { get; set; }
}
public class ImageDirModel
{
    public string FileName { get; set; }
    public string Path { get; set; }
}
and here is the recursive method to get all images from my folder.
private List<ImageDirModel> ImageDirSearch(string path)
{
    string adPicturesPath = ConfigurationManager.AppSettings["AdPicturesPath"];
    List<ImageDirModel> files = new List<ImageDirModel>();
    try
    {
        foreach (string f in Directory.GetFiles(path))
        {
            var model = new ImageDirModel();
            model.Path = f.ToLower();
            model.FileName = Path.GetFileName(f.ToLower());
            files.Add(model);
        }
        foreach (string d in Directory.GetDirectories(path))
        {
            files.AddRange(ImageDirSearch(d));
        }
    }
    catch (System.Exception excpt)
    {
        throw new Exception(excpt.Message);
    }
    return files;
}
The problem I have is that this row:
var result = listWithFilesFromImageFolder.Where(p => !listOfAdPictureNames.Any(q => p.FileName == q));
takes over an hour to complete. I want to know if there is a better way to check whether my images folder contains images that don't exist in my database.
Here is the method that gets all the image names from my database layer:
public static List<string> GetAllAdPictureNames()
{
    List<string> ListWithAllAdFileNames = new List<string>();
    using (var db = new DatabaseLayer.DBEntities())
    {
        ListWithAllAdFileNames = db.ad_pictures.Select(b => b.filename.ToLower()).ToList();
    }
    if (ListWithAllAdFileNames.Count < 1)
        return new List<string>();
    return ListWithAllAdFileNames;
}
Perhaps Except is what you're looking for. Something like this:
var filesInFolderNotInDb = listWithFilesFromImageFolder.Select(p => p.FileName).Except(listOfAdPictureNames).ToList();
Should give you the files that exist in the folder but not in the database.
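Except is fast because it builds a set from the second sequence, but it returns only the distinct names. If the full path is needed for deleting the files, the same set-based idea can be applied to the model objects directly. A sketch, with ImageFile as a hypothetical stand-in for ImageDirModel and case-insensitive matching assumed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ImageFile              // hypothetical stand-in for ImageDirModel
{
    public string FileName { get; set; }
    public string Path { get; set; }
}

public static class UnusedImageFinder
{
    // Returns the files on disk whose name is not among the database names.
    public static List<ImageFile> FindUnused(
        IEnumerable<ImageFile> filesOnDisk, IEnumerable<string> namesInDb)
    {
        // One pass to build the set, then a constant-time lookup per file
        var known = new HashSet<string>(namesInDb, StringComparer.OrdinalIgnoreCase);
        return filesOnDisk.Where(f => !known.Contains(f.FileName)).ToList();
    }
}
```

Because the comparer ignores case, the ToLower calls in both loading methods would no longer be needed for the comparison.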
Instead of repeating a linear search through the list for every file, it is better to sort the second list, listOfAdPictureNames, first (using any O(n log n) sort). Checking for existence by binary search then takes O(log n) per lookup, which is far more efficient than the current approach, which is quadratic in order.
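A minimal sketch of the sort-then-binary-search idea, using List<T>.Sort and List<T>.BinarySearch with the same comparer for both. SortedLookup and NotInSorted are hypothetical names, and the case-insensitive ordinal comparison is an assumption:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedLookup
{
    // Returns the candidates that do not occur in 'reference'.
    public static List<string> NotInSorted(
        IEnumerable<string> candidates, IEnumerable<string> reference)
    {
        var comparer = StringComparer.OrdinalIgnoreCase;

        // Sort a copy once: O(m log m)
        var sorted = reference.ToList();
        sorted.Sort(comparer);

        // BinarySearch returns a negative number when the item is absent;
        // it must be given the same comparer that was used for sorting.
        return candidates
            .Where(c => sorted.BinarySearch(c, comparer) < 0)
            .ToList();
    }
}
```

It could be called as NotInSorted(listWithFilesFromImageFolder.Select(p => p.FileName), listOfAdPictureNames).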
As I said in my comment, you seem to have recreated the FileInfo class. You don't need to do this, so your ImageDirSearch can become the following:
private IEnumerable<string> ImageDirSearch(string path)
{
    // AllDirectories also searches the subfolders, as the original recursive method did
    return Directory.EnumerateFiles(path, "*.jpg", SearchOption.AllDirectories);
}
There doesn't seem to be much gained by returning the whole file info when you only need the file name. Note that EnumerateFiles returns full paths, so apply Path.GetFileName before comparing against the database names. This version only finds .jpg files, but the search pattern can be changed.
The ToLower calls are quite expensive and a bit pointless, and so is the ToList call when you are planning on querying again, so you can get rid of both and return an IEnumerable from the GetAllAdPictureNames method, as long as the database context stays open while the results are enumerated.
Then your comparison can use Equals and ignore case:
!listOfAdPictureNames.Any(q => p.Equals(q, StringComparison.InvariantCultureIgnoreCase));
One more thing that will probably help is removing items from the list of file names as they are found. This should make each subsequent search quicker, since there is less to iterate through.
