Writing from a list to a text file C# - c#

How do I code the whole list into the text file with commas in between each bit of data? Currently it is creating the file newData, but it is not putting in the variables from the list. Here is what I have so far.
public partial class Form1 : Form {
List<string> newData = new List<string>();
}
Above is where I create my list. Below is where I write it to the file.
private void saveToolStripMenuItem_Click(object sender, EventArgs e) {
TextWriter tw = new StreamWriter("NewData.txt");
tw.WriteLine(newData);
buttonSave.Enabled = true;
textBoxLatitude.Enabled = false;
textBoxLongtitude.Enabled = false;
textBoxElevation.Enabled = false;
}
And below is where the variables are coming from.
private void buttonSave_Click(object sender, EventArgs e) {
newData.Add (textBoxLatitude.Text);
newData.Add (textBoxLongtitude.Text);
newData.Add (textBoxElevation.Text);
textBoxLatitude.Text = null;
textBoxLongtitude.Text = null;
textBoxElevation.Text = null;
}

While you can use String.Join as others have mentioned they're ignoring three important things:
The fact that what you're really trying to do is write a comma-separated values file
The input that you're receiving and whether or not it will have commas in it
If you sanitize your input, what the current culture on the thread is when you write it out to the file
You want to write a comma-delimited file. There's no standardized format for this, but you do have to be careful of string content, especially in your case, where you're getting user input. Consider the following input:
latitude = "39,41"
longitude = "41,20"
There are a number of countries where the comma is used as a decimal separator, so this kind of input is very possible, depending on how distributed your application is (I'd be even more concerned if this was a website, personally).
The elevation has the same problem in the many places that use a comma as the thousands separator:
elevation = 20,000
In all of the other answers, your output for the line in the file will be:
39,41,41,20,20,000
Which when parsed (assuming it will be parsed, you're creating a machine-readable format) will fail.
What you want to do is parse the content first into a decimal and then output that.
Assuming you sanitize your input like so:
decimal latitude = Decimal.Parse(textBoxLatitude.Text);
decimal longitude = Decimal.Parse(textBoxLongtitude.Text); // control name as spelled in the question
decimal elevation = Decimal.Parse(textBoxElevation.Text);
You would then format the values so that there are no commas (if you want).
To that end, I really recommend that you want to use a dedicated CSV writer/parser (try ServiceStack's serializer on NuGet, or others, if you prefer), which accounts for commas within the content you want separated by commas.
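As a rough sketch of the parse-then-format idea above (not a substitute for a dedicated CSV library), the values can be parsed with the user's number format and written back with the invariant culture, quoting any field that still contains a comma. The names `CsvSketch`, `Escape`, and `ToCsvLine` are hypothetical, and passing the input format in explicitly is an assumption made for illustration; in the form you would more likely pass `CultureInfo.CurrentCulture`.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class CsvSketch
{
    // Quote a field only when it contains a comma, a quote, or a line break,
    // doubling embedded quotes (the convention described in RFC 4180).
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0)
            return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Parse each input with the caller's number format, then format it with
    // the invariant culture so the file always uses '.' as the decimal
    // separator and contains no group separators.
    public static string ToCsvLine(IEnumerable<string> inputs, IFormatProvider inputFormat)
    {
        var fields = inputs
            .Select(text => decimal.Parse(text, NumberStyles.Number, inputFormat))
            .Select(value => value.ToString(CultureInfo.InvariantCulture))
            .Select(field => Escape(field));
        return string.Join(",", fields);
    }
}
```

With a format that uses the comma as the decimal separator, an input of "39,41" is written as 39.41, so the separator between fields is no longer ambiguous. `decimal.Parse` throws a `FormatException` on invalid input, so `decimal.TryParse` may be the better fit when validating text boxes.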

private void saveToolStripMenuItem_Click(object sender, EventArgs e)
{
// Dispose the writer so the buffered text is actually flushed to the file
using (TextWriter tw = new StreamWriter("NewData.txt"))
{
tw.WriteLine(String.Join(", ", newData));
}
// Add appropriate error detection
}
The above is not checked for syntax, but the key concept is String.Join.
In response to the discussion in both main answer threads, here is an example from my older code of a more robust way to handle CSV output:
public const string Quote = "\"";
public static void EmitCsvLine(TextWriter report, IList<string> values)
{
List<string> csv = new List<string>(values.Count);
for (var z = 0; z < values.Count; z += 1)
{
csv.Add(Quote + values[z].Replace(Quote, Quote + Quote) + Quote);
}
string line = String.Join(",", csv);
report.WriteLine(line);
}
This could be made slightly more general with an IEnumerable<object>, but in the code I took this from, I didn't have the need to.

You cannot output the list just by calling tw.WriteLine(newData);
But something like this will achieve that:
tw.WriteLine(string.Join(", ", newData));

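You could build the string with a StringBuilder:
StringBuilder b = new StringBuilder();
bool first = true;
foreach (string s in yourList)
{
// Add the separator between items only, so there is no trailing comma
if (!first)
b.Append(", ");
b.Append(s);
first = false;
}
// Verbatim string: in a regular string literal, \m is an invalid escape sequence
string dir = @"c:\mypath";
File.WriteAllText(dir, b.ToString());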

You have to iterate over the list (not tested) or use string.Join, as the other users suggested (before .NET 4 you need to convert your list to an array first).
private void saveToolStripMenuItem_Click(object sender, EventArgs e)
{
TextWriter tw = new StreamWriter("NewData.txt");
for (int i = 0; i < newData.Count; i++)
{
tw.Write(newData[i]);
if(i < newData.Count-1)
{
tw.Write(",");
}
}
tw.Close();
buttonSave.Enabled = true;
textBoxLatitude.Enabled = false;
textBoxLongtitude.Enabled = false;
textBoxElevation.Enabled = false;
}

Related

Load huge txt file for winform quickly

I am going to make a Sinhala-English dictionary, so I have a file that contains the Sinhala meaning of every English word. I thought I would load it while the form is loading, so I used the following code in the Form1_Load method to read the whole file into a string variable:
private string DictionaryWords = "";
private string ss = null;
...
private void Form1_Load(object sender, EventArgs e)
{
this.BackColor = ColorTranslator.FromHtml("#AFC3E0");
string fileName = @"SI-utf8.Txt";
using (StreamReader sr = File.OpenText(fileName))
{
while ((ss = sr.ReadLine()) != null)
{
DictionaryWords += ss;
}
}
}
Unfortunately, that text file has more than 130,000 lines and is larger than 5 MB, so my WinForm does not load.
I need to load this faster so the form can use a regex to get the right meaning for every English word.
Could anybody tell me a method to do this? I have tried everything.
I need to load this huge file into my project within 15, more or less, and then use a regex to find each English word.
Well, there is too little code to analyze. I suspect that
DictionaryWords += ss;
is the culprit: appending to a string 130,000 times means re-creating an ever longer string over and over again, which can well bring the system to its knees, but I have no rigorous proof (I've asked about DictionaryWords in the comments). Another possible culprit is your regular expression, which I haven't seen.
That's why I'll try to solve the problem from scratch.
We have a (long) dictionary in SI-utf8.Txt.
We should load the dictionary without freezing the UI.
We should use the dictionary loaded to translate the English texts.
I have got something like this:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
...
// Loading dictionary (async, since dictionary can be quite long)
// static: we want just one dictionary for all the instances
private static readonly Task<IReadOnlyDictionary<string, string>> s_Dictionary =
Task<IReadOnlyDictionary<string, string>>.Run(() => {
char[] delimiters = { ' ', '\t' };
IReadOnlyDictionary<string, string> result = File
.ReadLines(@"SI-utf8.Txt")
.Where(line => !string.IsNullOrWhiteSpace(line))
.Select(line => line.Split(delimiters, StringSplitOptions.RemoveEmptyEntries))
.Where(items => items.Length == 2)
.ToDictionary(items => items[0],
items => items[1],
StringComparer.OrdinalIgnoreCase);
return result;
});
Then we need a translation part:
// Let it be the simplest regex: English letters and apostrophes;
// you can improve it if you like
private static readonly Regex s_EnglishWords = new Regex("[A-Za-z']+");
// Translation is async, since we have to wait for the dictionary to be loaded
private static async Task<string> Translate(string englishText) {
if (string.IsNullOrWhiteSpace(englishText))
return englishText;
var dictionary = await s_Dictionary;
return s_EnglishWords.Replace(englishText,
match => dictionary.TryGetValue(match.Value, out var translation)
? translation // if we know the translation
: match.Value); // if we don't know the translation
}
Usage:
// Note, that button event should be async as well
private async void button1_Click(object sender, EventArgs e) {
TranslationTextBox.Text = await Translate(OriginalTextBox.Text);
}
Edit: So, DictionaryWords is a string and thus
DictionaryWords += ss;
is the culprit. Please don't append to a string in a (long) loop: each append re-creates the string, which is slow. If you insist on looping, use StringBuilder:
// Let's pre-allocate a buffer for 6 million chars
StringBuilder sb = new StringBuilder(6 * 1024 * 1024);
using (StreamReader sr = File.OpenText(fileName))
{
while ((ss = sr.ReadLine()) != null)
{
sb.Append(ss);
}
}
DictionaryWords = sb.ToString();
Or, why should you loop at all? Let .NET do the work for you:
DictionaryWords = File.ReadAllText(@"SI-utf8.Txt");
Edit 2: If the actual file size is not that huge (that is, if DictionaryWords += ss; alone spoils the fun), you can stick to a simple synchronous solution:
private static readonly Regex s_EnglishWords = new Regex("[A-Za-z']+");
private static readonly IReadOnlyDictionary<string, string> s_Dictionary = File
.ReadLines(@"SI-utf8.Txt")
.Where(line => !string.IsNullOrWhiteSpace(line))
.Select(line => line.Split(new char[] { ' ', '\t' },
StringSplitOptions.RemoveEmptyEntries))
.Where(items => items.Length == 2)
.ToDictionary(items => items[0],
items => items[1],
StringComparer.OrdinalIgnoreCase);
private static string Translate(string englishText) {
if (string.IsNullOrWhiteSpace(englishText))
return englishText;
return s_EnglishWords.Replace(englishText,
match => s_Dictionary.TryGetValue(match.Value, out var translation)
? translation
: match.Value);
}
And then the usage is quite simple:
// No async needed here: the dictionary is loaded synchronously
private void button1_Click(object sender, EventArgs e) {
TranslationTextBox.Text = Translate(OriginalTextBox.Text);
}

Question about file reading and comparing

First off, I have a file that looks like this:
//Manager Ids
ManagerName: FirstName_LastName
ManagerLoginId: 12345
And a text box into which a five-digit code (e.g. 12345) gets entered. When the Enter key is pressed, the code is assigned to a string called "EnteredEmployeeId". What I then need is to search the entire file above for "EnteredEmployeeId"; if it matches, another page opens, and if the number is not found, a message is displayed (telling you that no employee ID was found).
So essentially I'm trying to open a file, search the entire document for the ID, and return true or false so that it either displays an error or opens a new page, and then reset EnteredEmployeeId to nothing.
My Code So Far:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
namespace Rent_a_Car
{
public partial class Employee_Login_Page : Form
{
public Employee_Login_Page()
{
InitializeComponent();
}
string ManagersPath = @"C:\Users\Name\Visual Studios Project Custom Files\Rent A Car Employee Id's\Managers\Manager_Ids.txt"; //Path To Manager Logins
string EnteredEmployeeId;
private void textBox1_TextChanged(object sender, EventArgs e)
{
}
private void Employee_Id_TextBox_KeyPress(object sender, KeyPressEventArgs e)
{
if (!char.IsControl(e.KeyChar) && !char.IsDigit(e.KeyChar) && //Checks Characters entered are Numbers Only and allows them
(e.KeyChar != '0'))
{
e.Handled = true;
}
else if (e.KeyChar == (char)13) //Checks if The "Enter" Key is pressed
{
EnteredEmployeeId = Employee_Id_TextBox.Text; //Assigns EnteredEmployeeId To the Entered Numbers In Text Box
bool result = IsNumberInFile(EnteredEmployeeId, "ManagerLoginId", ManagersPath); //Key without the colon: IsNumberInFile splits each line on ':'
if (result)
{
//open new window
}
else
{
MessageBox.Show("User Not Found");
}
}
}
}
}
This function reads through the whole file and checks whether it contains the entered code. It works with strings (since that is what your text box returns) and returns only true or false (the employee is or is not in the file), not the name, surname, etc.
static bool IsNumberInFile(string numberAsString, string LineName, string FileName)
{
var lines = File.ReadAllLines(FileName);
foreach(var line in lines)
{
var trimmedLine = line.Replace(" ", ""); //To remove all spaces in file. Not expecting any spaces in the middle of number
if (!string.IsNullOrEmpty(trimmedLine) && trimmedLine.Split(':')[0].Equals(LineName) && trimmedLine.Split(':')[1].Equals(numberAsString))
return true;
}
return false;
}
//Example of use
String ManagersPath = @"C:\Users\Name\Visual Studios Project Custom Files\Employee Id's\Managers\Manager_Ids.txt"; //Path To Manager Logins
String EnteredEmployeeId;
private void textBox1_TextChanged(object sender, EventArgs e)
{
}
private void Employee_Id_TextBox_KeyPress(object sender, KeyPressEventArgs e)
{
if (!char.IsControl(e.KeyChar) && !char.IsDigit(e.KeyChar) && //Checks Characters entered are Numbers Only and allows them
(e.KeyChar != '0'))
{
e.Handled = true;
}
else if (e.KeyChar == (char)13) //Checks if The "Enter" Key is pressed
{
EnteredEmployeeId = Employee_Id_TextBox.Text; //Assigns EnteredEmployeeId To the Entered Numbers In Text Box
bool result = IsNumberInFile(EnteredEmployeeId, "ManagerLoginId", ManagersPath);
if (result)
{
//User is in file
}
else
{
//User is not in file
}
}
}
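One weakness of the function above is that `Split(':')[1]` throws if a line matches the key but has no value after a colon. A slightly more defensive variant is sketched below; `ManagerFileSketch` and `ContainsId` are hypothetical names, and taking the lines as a sequence instead of a file name is an assumption made so the logic can be exercised without a file.

```csharp
using System;
using System.Collections.Generic;

public static class ManagerFileSketch
{
    // Returns true if some line has the form "key: id".
    // Lines without a colon (comments, blank lines) are skipped.
    public static bool ContainsId(IEnumerable<string> lines, string key, string id)
    {
        foreach (string line in lines)
        {
            int colon = line.IndexOf(':');
            if (colon < 0)
                continue;

            string name = line.Substring(0, colon).Trim();
            string value = line.Substring(colon + 1).Trim();
            if (name == key && value == id)
                return true;
        }
        return false;
    }
}
```

In the form it could be called as `ManagerFileSketch.ContainsId(File.ReadLines(ManagersPath), "ManagerLoginId", EnteredEmployeeId)`, which also stops reading the file as soon as a match is found.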
Short answer
Is your question about how to read your file?
private bool ManagerExists(int managerId)
{
return this.ReadManagers().Any(manager => manager.Id == managerId);
}
private IEnumerable<Manager> ReadManagers()
{
// managersFileName is a field holding the path of the managers file.
// Assumes the file contains only name / id line pairs (no comment or blank lines).
using (var reader = System.IO.File.OpenText(managersFileName))
{
while (!reader.EndOfStream)
{
string lineManagerName = reader.ReadLine();
string lineManagerId = reader.ReadLine();
string managerName = ExtractValue(lineManagerName);
int managerId = Int32.Parse(ExtractValue(lineManagerId));
yield return new Manager
{
Id = managerId,
Name = managerName,
};
}
}
}
private string ExtractValue(string text)
{
// the value of the read text starts after the space:
const char separator = ' ';
int indexSeparator = text.IndexOf(separator);
return text.Substring(indexSeparator + 1);
}
Long Answer
I see several problems in your design.
The most important thing is that you intertwine your manager handling with your form.
You should separate your concerns.
Apparently you have the notion of a sequence of Managers, each Manager has a Name (first name, last name) and a ManagerId, and in future maybe other properties.
This sequence is persistable: it is saved somewhere, and if you load it again, you have the same sequence of Managers.
In this version you want to be able to see if a Manager with a given ManagerId exists. Maybe in future you might want more functionality, like fetching information of a Manager with a certain Id, or Fetch All managers, or let's go crazy: Add / Remove / Change managers!
You see in this description I didn't mention your Forms at all. Because I separated it from your Forms, you can use it in other forms, or even in a class that has nothing to do with a Form, for instance you can use it in a unit test.
I described what I needed in such a general form that in future I might even change it. Users of my persistable manager collection wouldn't even notice it: I can put it in a JSON file, or XML; I can save the data in a Dictionary, a database, or maybe even fetch it from the internet.
All that users need to know, is that they have to create an instance of the class, using some parameters, and bingo, you can fetch Managers.
You also give users the freedom to decide how the data is to be saved: if they want to save it in a JSON file, changes in your form class will be minimal.
An object that stores sequences of objects is quite often called a Repository.
Let's create some classes:
interface IManager
{
// interface members are implicitly public
int Id {get;}
string Name {get; set;}
}
interface IManagerRepository
{
bool ManagerExists(int managerId);
// possible future extensions: Add / Retrieve / Update / Delete (CRUD)
IManager Add(IManager manager);
IManager Find(int managerId);
void Update(IManager manager);
void Delete(int managerId);
}
class Manager : IManager
{
public int Id {get; set;}
public string Name {get; set;}
}
class ManagerFileRepository : IManagerRepository
{
public ManagerFileRepository(string fileName)
{
// TODO implement
}
// TODO: implement.
}
The ManagerFileRepository saves the managers in a file. It hides for the outside world how the file is internally structured. It could be your file format, it could be a CSV-file, or JSON / XML.
I also separated an interface, so if you later decide to save the data somewhere else, for instance in a Dictionary (for unit tests), or in a database, users of your Repository class won't see the difference.
Let's first see if you can use this class.
class MyForm : Form
{
const string managerFileName = ...
private IManagerRepository ManagerRepository {get;}
public MyForm()
{
InitializeComponent();
this.ManagerRepository = new ManagerFileRepository(managerFileName);
}
public bool ManagerExists(int managerId)
{
return this.ManagerRepository.ManagerExists(managerId);
}
Now let's handle your keyPress:
private void Employee_Id_TextBox_KeyPress(object sender, KeyPressEventArgs e)
{
TextBox textBox = (TextBox)sender;
... // code about numbers and enter key
int enteredManagerId = Int32.Parse(textBox.Text);
bool managerExists = this.ManagerExists(enteredManagerId);
if (managerExists) { ... }
}
This code seems to do what you want in an easy way, and it looks transparent. The ManagerRepository is testable, reusable, and simple to extend or change without its users noticing. So the class looks good. Let's implement it.
Implement ManagerFileRepository
There are several ways to implement reading the file:
(1) Read everything at construction time
and keep the read data in memory. If you add Managers they are not saved until you say so. Advantages: after initial startup it is fast. You can make changes and later decide not to save them anyway, so it is just like editing any other file. Disadvantage: if your program crashes, you have lost your changes.
(2) Read the file every time you need information
Advantage: data is always up to date, even if others edited the file while your program runs. If you change the manager collection, it is saved immediately, so others can use it.
Which solution you choose depends on the size of the file and the importance of never losing data. If your file contains millions of records, then maybe it wasn't very wise to save the data in a file. Consider SQLite to save it in a small, fairly fast database.
class ManagerFileRepository : IManagerRepository, IEnumerable<IManager>
{
private readonly IDictionary<int, IManager> managers;
public ManagerFileRepository(string fileName)
{
// Index the managers by Id for fast lookup (needs using System.Linq)
this.managers = ReadManagers(fileName).ToDictionary(manager => manager.Id);
}
public bool ManagerExists(int managerId)
{
return this.managers.ContainsKey(managerId);
}
private static IEnumerable<IManager> ReadManagers(string fileName)
{
// See the short answer above
}
}
Room for improvement
If you will be using your manager repository for more things, consider letting the repository implement ICollection<IManager> and IReadOnlyCollection<IManager>. This is quite simple:
public IEnumerator<IManager> GetEnumerator()
{
return this.managers.Values.GetEnumerator();
}
public void Add(IManager manager)
{
this.managers.Add(manager.Id, manager);
}
// etc.
If you add functions to change the manager collection you'll also need a Save method:
public void Save()
{
using (var writer = File.CreateText(FullFileName))
{
const string namePrefix = "ManagerName: ";
const string idPrefix = "ManagerLoginId: ";
foreach (var manager in managers.Values)
{
string managerLine = namePrefix + manager.Name;
writer.WriteLine(managerLine);
string idLine = idPrefix + manager.Id.ToString();
writer.WriteLine(idLine);
}
}
}
Another possible improvement: your file structure. Consider using a more standard file format: CSV, JSON, or XML. There are numerous NuGet packages (CsvHelper, Newtonsoft.Json) that make reading and writing Managers much more robust.
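To give an idea of what a JSON file format could look like in code, here is a minimal sketch. Note that it uses System.Text.Json, which ships with .NET Core 3.0 and later, rather than the Newtonsoft.Json package named above; `ManagerRecord` and `ManagerJsonSketch` are hypothetical names introduced only for this illustration.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical record type; the property names mirror the keys of the text file.
public class ManagerRecord
{
    public string ManagerName { get; set; }
    public int ManagerLoginId { get; set; }
}

public static class ManagerJsonSketch
{
    // Serialize the whole list into one indented JSON document.
    public static string ToJson(List<ManagerRecord> managers)
    {
        var options = new JsonSerializerOptions { WriteIndented = true };
        return JsonSerializer.Serialize(managers, options);
    }

    // Read the list back; throws JsonException if the text is not valid JSON.
    public static List<ManagerRecord> FromJson(string json)
    {
        return JsonSerializer.Deserialize<List<ManagerRecord>>(json);
    }
}
```

A Save method could then be reduced to `File.WriteAllText(path, ManagerJsonSketch.ToJson(list))`, and loading to `ManagerJsonSketch.FromJson(File.ReadAllText(path))`, with the serializer taking care of escaping and parsing.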
Summary
Because you separated the concerns of persisting your managers from your form, you can reuse the manager repository, especially if you need functionality to Add / Retrieve / Update / Delete managers.
Because of the separation it is much easier to unit test your functions. And future changes won't hinder users of the repository, because they won't notice that the data has changed.
If your Manager_Ids.txt is in the following format, you can use the StreamReader.ReadLine() method to traverse the text and query it.
ManagerName: FirstName_LastName1
ManagerLoginId: 12345
ManagerName: FirstName_LastName2
ManagerLoginId: 23456
...
Here is a demo that traverses the .txt file.
string ManagersPath = @"D:\Manager_Ids.txt";
string EnteredEmployeeId;
private void textBox_id_KeyDown(object sender, KeyEventArgs e)
{
int counter = 0;
bool exist = false;
string line;
string str = "";
if (e.KeyCode == Keys.Enter)
{
EnteredEmployeeId = textBox_id.Text;
System.IO.StreamReader file =
new System.IO.StreamReader(ManagersPath);
while ((line = file.ReadLine()) != null)
{
str += line + "|";
if (counter % 2 != 0)
{
if (str.Split('|')[1].Split(':')[1].Trim() == EnteredEmployeeId)
{
str = str.Replace("|", "\n");
MessageBox.Show(str);
exist = true;
break;
}
str = "";
}
counter++;
}
if (!exist)
{
MessageBox.Show("No such id");
}
file.Close();
}
}
Besides, I recommend using XML, JSON, or another format to serialize the data. For storing the data in XML, you can refer to the following simple demo.
<?xml version="1.0"?>
<Managers>
<Manager>
<ManagerName>FirstName_LastName1</ManagerName>
<ManagerLoginId>12345</ManagerLoginId>
</Manager>
<Manager>
<ManagerName>FirstName_LastName2</ManagerName>
<ManagerLoginId>23456</ManagerLoginId>
</Manager>
</Managers>
And then use LINQ to XML to query the id.
string ManagersPath = @"D:\Manager_Ids.xml";
string EnteredEmployeeId;
private void textBox_id_KeyDown(object sender, KeyEventArgs e)
{
if (e.KeyCode == Keys.Enter)
{
EnteredEmployeeId = textBox_id.Text;
XElement root = XElement.Load(ManagersPath);
IEnumerable<XElement> manager =
from el in root.Elements("Manager")
where (string)el.Element("ManagerLoginId") == EnteredEmployeeId
select el;
if(manager.Count() == 0)
{
MessageBox.Show("No such id");
}
foreach (XElement el in manager)
MessageBox.Show("ManagerName: " + (string)el.Element("ManagerName") + "\n"
+ "ManagerLoginId: " + (string)el.Element("ManagerLoginId"));
}
}

How can I simulate user input from a console?

I'm doing some challenges on HackerRank. I usually use a Windows Forms project in Visual Studio for debugging, but I realized I lose a lot of time typing in the test cases. So I would like a suggestion for an easy way to simulate Console.ReadLine().
Usually the challenges describe the test cases with something like this:
5
1 2 1 3 2
3 2
This is then read using three ReadLine calls:
static void Main(String[] args) {
int n = Convert.ToInt32(Console.ReadLine());
string[] squares_temp = Console.ReadLine().Split(' ');
int[] squares = Array.ConvertAll(squares_temp,Int32.Parse);
string[] tokens_d = Console.ReadLine().Split(' ');
int d = Convert.ToInt32(tokens_d[0]);
int m = Convert.ToInt32(tokens_d[1]);
// your code goes here
}
Right now I am thinking of creating a file testCase.txt and using StreamReader.
using (StreamReader sr = new StreamReader("testCase.txt"))
{
string line;
// Read and display lines from the file until the end of
// the file is reached.
while ((line = sr.ReadLine()) != null)
{
Console.WriteLine(line);
}
}
This way I can replace Console.ReadLine() with sr.ReadLine(), but this requires having a text editor open, deleting the old case, copying in the new one, and saving the file each time.
So is there a way I can use a TextBox, so that I only need to copy/paste into the text box and use StreamReader or something similar to read from it?
You can use the StringReader class to read from a string rather than a file.
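One way to apply this without touching the Console.ReadLine() calls is to redirect standard input to a StringReader with Console.SetIn. The sketch below is only an illustration; `ConsoleInputSketch` and `WithInput` are hypothetical names.

```csharp
using System;
using System.IO;

public static class ConsoleInputSketch
{
    // Runs `body` with Console.In redirected to `input`,
    // then restores the original reader even if `body` throws.
    public static void WithInput(string input, Action body)
    {
        TextReader original = Console.In;
        using (var reader = new StringReader(input))
        {
            Console.SetIn(reader);
            try
            {
                body();
            }
            finally
            {
                Console.SetIn(original);
            }
        }
    }
}
```

A test case pasted into a text box could then be fed in with something like `ConsoleInputSketch.WithInput(textBox1.Text, () => Main(new string[0]));` (the control name is an assumption), and every Console.ReadLine() inside the challenge code reads the next line of that text.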
The solution you accepted doesn't really emulate Console.ReadLine(), so you can't paste your code directly into HackerRank.
I solved it this way:
Just paste this class above the static Main method, or anywhere inside the main class, to hide the original System.Console:
class Console
{
public static Queue<string> TestData = new Queue<string>();
public static void SetTestData(string testData)
{
TestData = new Queue<string>(testData.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries).Select(x=>x.TrimStart()));
}
public static void SetTestDataFromFile(string path)
{
TestData = new Queue<string>(File.ReadAllLines(path));
}
public static string ReadLine()
{
return TestData.Dequeue();
}
public static void WriteLine(object value = null)
{
System.Console.WriteLine(value);
}
public static void Write(object value = null)
{
// Write, not WriteLine, so no line break is appended
System.Console.Write(value);
}
}
and use it this way.
//Paste the Console class here.
static void HackersRankProblem(String[] args)
{
Console.SetTestData(@"
6
6 12 8 10 20 16
");
int n = int.Parse(Console.ReadLine());
string arrStr = Console.ReadLine();
.
.
.
}
Now your code will look the same, and you can test as much data as you want without changing your code.
Note: If you need more complex Write or WriteLine methods, just add them and forward the calls to the original System.Console methods.
Just set the application arguments to: <input.txt
and put your input text in input.txt.
Be careful to save the file with ANSI encoding.

Array Randomly Splitting String

The string is being split using commas as delimiters. Every time the string is printed, it appears in a different order. The string is variable:
' String: Z1,TA,H999.00,T999.00 '
It splits successfully; however, even if the string is exactly the same, when printing the array we get random new lines and random data missing.
When printed to the text box, it is either correctly split or looks like:
-Z1
-T
-H999.00
-T999.
-00
If the loop runs again, we get different results. On the odd occasion, it is displayed correctly.
I assume it's this code: (EDIT: IT'S NOT)
string[] ArrayCleanDataRX = CleanDataRX.Split(',');
foreach (string EntireList1 in ArrayCleanDataRX)
{
TxtZ1.AppendText(EntireList1);
TxtZ1.AppendText("\n");
}
Any Suggestions would be brilliant.
Thank you.
UPDATE: (Still Unsolved)
Update 2: More Code -
#region Global Strings
public string DirtyDataRX; //String contains Data from Serial
public string Z1 = "Z1"; //String to check if Data from serial Contains Z1
private void FeedbackProcessing(object sender, EventArgs e)
{
TxtDirtyDataRX.AppendText(DirtyDataRX); //Populate TxtDirtyTest with DirtyText String
var CleanDataRX = DirtyDataRX; //Clean Data = Dirty Text
var charstoremove = new string[] { "|", "-", "%", " ", " ", " ", "~", "$", "?", "'", ".,", "..,", "..", "..:", ".:", "...", "....", ".....", "......", "......", "......", "-" }; // Contents of CharsToRemove (Removes Bad Charecters from raw serial)
foreach (var c in charstoremove) //C is Char(s) to remove
{
CleanDataRX = CleanDataRX.Replace(c, string.Empty); //Replace C in CleanDataRX with nothing.
}
TxtCleanDataRX.AppendText(CleanDataRX); //Show DirtyDataRX in DirtyDataRX Textbox
#region IfZones and Array Loops
if (CleanDataRX.Contains(Z1)) // If CleanDataRX Contains "Z1" Run Code
{
string[] ArrayZ1 = CleanDataRX.Split(','); //New String Array from CleanDaraRX. Split using Comma as Delimiter
foreach (string StrArrayZ1 in ArrayZ1) // New string Called StrArrayZ1 in ArrayCleanDataRX
{
TxtZ1.AppendText(StrArrayZ1); //Append Textbox with String Array, Loop untill Empty
}
}
#region DirtyRX
private void serialPort1_DataReceived(object sender, System.IO.Ports.SerialDataReceivedEventArgs e)
{
DirtyDataRX = serialPort1.ReadExisting();
this.Invoke(new EventHandler(FeedbackProcessing));
}
#endregion
Code I think is irrelevant to the problem is left out to simplify the problem.
Note: Some array names have been edited slightly.
This piece of code is not enough to answer the question. However, if you are using multithreading, you have to use locks to avoid unpredictable results.
Example:
lock(TxtZ1)
{
string[] ArrayCleanDataRX = CleanDataRX.Split(',');
foreach (string EntireList1 in ArrayCleanDataRX)
{
TxtZ1.AppendText(EntireList1);
TxtZ1.AppendText("\n");
}
}
TxtZ1 is a textbox?
Then you should rather do something like:
string[] ArrayCleanDataRX = CleanDataRX.Split(',');
StringBuilder sb = new StringBuilder();
foreach (string EntireList1 in ArrayCleanDataRX)
{
sb.AppendLine(EntireList1);
}
TxtZ1.AppendText(sb.ToString());
I think it would also solve it if you use Environment.NewLine instead of \n. Strange things tend to happen with \n in Windows controls...
Thanks for your help. I solved the problem on my own.
It wasn't exactly the loop or the split, sorry.
This may help others, though.
The problem occurred because the string was still being actively built when it was processed. The cause is how the data is read in the serial port read method. This is how it was:
DirtyDataRX = serialPort1.ReadExisting();
this.Invoke(new EventHandler(FeedbackProcessing));
As the data coming through is line oriented .ReadLine should be used....
DirtyDataRX = serialPort1.ReadLine();
this.Invoke(new EventHandler(FeedbackProcessing));
Using ReadLine instead solves the issue at hand.
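The underlying reason is that ReadExisting returns whatever bytes have arrived so far, which may be only part of a line, so the split sometimes ran on fragments such as "T999." and "00". If blocking in ReadLine is not wanted, an alternative is to keep calling ReadExisting and buffer the chunks until a full line is available. The sketch below shows that idea in isolation; `LineAssembler` and `Feed` are hypothetical names, and it assumes each line ends with '\n'.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: collects arbitrary chunks of text and hands back
// only complete, newline-terminated lines.
public sealed class LineAssembler
{
    private readonly StringBuilder pending = new StringBuilder();

    // Appends a chunk and returns every line completed by it (possibly none).
    public List<string> Feed(string chunk)
    {
        var lines = new List<string>();
        pending.Append(chunk);
        string text = pending.ToString();

        int start = 0;
        int end;
        while ((end = text.IndexOf('\n', start)) >= 0)
        {
            // Drop the line terminator, including an optional '\r'
            lines.Add(text.Substring(start, end - start).TrimEnd('\r'));
            start = end + 1;
        }

        // Keep the unfinished tail for the next chunk
        pending.Clear();
        pending.Append(text, start, text.Length - start);
        return lines;
    }
}
```

In the DataReceived handler, each result of ReadExisting would be passed to Feed, and only the returned complete lines would be cleaned and split.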

Getting bibliographic data from text in a PDF and exporting to a window form

I use iText 5 for .NET to extract text from a PDF, using the code below.
private void button1_Click(object sender, EventArgs e)
{
PdfReader reader2 = new PdfReader("Scharfetter1969.pdf");
int pagen = reader2.NumberOfPages;
reader2.Close();
ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy();
for (int i = 1; i < 2; i++)
{
textBox1.Text = "";
PdfReader reader = new PdfReader("Scharfetter1969.pdf");
String s = PdfTextExtractor.GetTextFromPage(reader, i, its);
s = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(s)));
textBox1.Text = s;
reader.Close();
}
}
But I want to get bibliographic data from a research paper PDF.
Here is an example of the data extracted from this PDF (in EndNote format):
%0 Journal Article
%T Repeated temperature modulation epitaxy for p-type doping and light-emitting diode based on ZnO
%A Tsukazaki, A.
%A Ohtomo, A.
%A Onuma, T.
%A Ohtani, M.
%A Makino, T.
%A Sumiya, M.
%A Ohtani, K.
%A Chichibu, S.F.
%A Fuke, S.
%A Segawa, Y.
%J Nature Materials
%V 4
%N 1
%P 42-46
%# 1476-1122
%D 2004
%I Nature Publishing Group
But remember that this is bibliographic information; it is not available in the metadata of this PDF. I want to access the Article Type (%0), Title (%T), Authors (%A), Date (%D) and (%I) and show each in its assigned text box in a Windows form.
I am using C#. Does anyone have code for this, or can anyone guide me on how to do this?
PDF is a one-way format. You put data in so that it renders consistently on all devices (monitors, printers, etc) but the format was never intended to pull data back out. Any and all attempts to do that will be pure guess work. iText's PdfTextExtractor works but you are going to have to piece things together based on your own arbitrary set of rules, and these rules will probably change from PDF to PDF. The supplied PDF was created by InDesign which does such a great job of making text look good that it actually makes it even harder to parse the data back out.
That said, if your PDFs are all visually consistent, you could try to pull the data out while retaining formatting and use the formatting rules to guess what is what. That post will get you some HTML formatting that you could guess at. (If this actually works I'd recommend returning something more specific than HTML but I'll leave that up to you.)
Running it against your supplied PDF shows that the title is using the font HelveticaNeue-LightExt at about 17pts so you could write a rule to look for all lines that use that font at that size and combine them together. Authors are done in HelveticaNeue-Condensed at about 10pts so that's another rule.
The code below is a modified version of the one linked to above. It's a full working C# 2010 WinForms app targeting iTextSharp 5.1.1.0. It pulls out the title and authors for the supplied PDF, but you'll need to tweak it for other PDFs and metadata. See the comments in the code for specific implementation details.
using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;
using iTextSharp.text.pdf.parser;
using iTextSharp.text.pdf;
namespace WindowsFormsApplication1
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
PdfReader reader = new PdfReader(System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "nmat4-42.pdf"));
TextWithFontExtractionStategy S = new TextWithFontExtractionStategy();
string F = iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(reader, 1, S);
//The text is now held in F, so release the file handle
reader.Close();
//Buffers to hold various parts from the PDF
List<string> titles = new List<string>();
List<string> authors = new List<string>();
//Array of lines of text
string[] lines = F.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
//Temporary string
string t;
//Loop through each line in the array
foreach (string line in lines)
{
//See if the line looks like a "title"
if (line.Contains("HelveticaNeue-LightExt") && line.Contains("font-size:17.28003"))
{
//Remove the HTML tags
titles.Add(System.Text.RegularExpressions.Regex.Replace(line, "</?span.*?>", "").Trim());
}
//See if the line looks like an "author"
else if (line.Contains("HelveticaNeue-Condensed") && line.Contains("font-size:9.995972"))
{
//Remove the HTML tags and trim extra characters
t = System.Text.RegularExpressions.Regex.Replace(line, "</?span.*?>", "").Trim(new char[] { ' ', ',', '*' });
//Make sure we have a valid name, probably need some more exceptions here, too
if (!string.IsNullOrWhiteSpace(t) && t != "AND")
{
authors.Add(t);
}
}
}
//Write out the title to the console
Console.WriteLine("Title : {0}", string.Join(" ", titles.ToArray()));
//Write out each author
foreach (string author in authors)
{
Console.WriteLine("Author : {0}", author);
}
Console.WriteLine(F);
this.Close();
}
public class TextWithFontExtractionStategy : iTextSharp.text.pdf.parser.ITextExtractionStrategy
{
//HTML buffer
private StringBuilder result = new StringBuilder();
//Store last used properties
private Vector lastBaseLine;
private string lastFont;
private float lastFontSize;
//http://api.itextpdf.com/itext/com/itextpdf/text/pdf/parser/TextRenderInfo.html
private enum TextRenderMode
{
FillText = 0,
StrokeText = 1,
FillThenStrokeText = 2,
Invisible = 3,
FillTextAndAddToPathForClipping = 4,
StrokeTextAndAddToPathForClipping = 5,
FillThenStrokeTextAndAddToPathForClipping = 6,
AddTextToPathForClipping = 7
}
public void RenderText(iTextSharp.text.pdf.parser.TextRenderInfo renderInfo)
{
string curFont = renderInfo.GetFont().PostscriptFontName;
//Check if faux bold is used
if ((renderInfo.GetTextRenderMode() == (int)TextRenderMode.FillThenStrokeText))
{
curFont += "-Bold";
}
//This code assumes that if the baseline changes then we're on a newline
Vector curBaseline = renderInfo.GetBaseline().GetStartPoint();
Vector topRight = renderInfo.GetAscentLine().GetEndPoint();
iTextSharp.text.Rectangle rect = new iTextSharp.text.Rectangle(curBaseline[Vector.I1], curBaseline[Vector.I2], topRight[Vector.I1], topRight[Vector.I2]);
Single curFontSize = rect.Height;
//See if something has changed, either the baseline, the font or the font size
if ((this.lastBaseLine == null) || (curBaseline[Vector.I2] != lastBaseLine[Vector.I2]) || (curFontSize != lastFontSize) || (curFont != lastFont))
{
//if we've put down at least one span tag close it
if ((this.lastBaseLine != null))
{
this.result.AppendLine("</span>");
}
//If the baseline has changed then insert a line break
if ((this.lastBaseLine != null) && curBaseline[Vector.I2] != lastBaseLine[Vector.I2])
{
this.result.AppendLine("<br />");
}
//Create an HTML tag with appropriate styles
this.result.AppendFormat("<span style=\"font-family:{0};font-size:{1}\">", curFont, curFontSize);
}
//Append the current text
this.result.Append(renderInfo.GetText());
//Set currently used properties
this.lastBaseLine = curBaseline;
this.lastFontSize = curFontSize;
this.lastFont = curFont;
}
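public string GetResultantText()
{
//If we wrote anything then the last span is still open, so close it in the returned copy.
//The buffer itself is left untouched so that repeated calls do not stack up extra closing tags.
if (result.Length > 0)
{
return result.ToString() + "</span>";
}
return result.ToString();
}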
//Not needed
public void BeginTextBlock() { }
public void EndTextBlock() { }
public void RenderImage(ImageRenderInfo renderInfo) { }
}
}
}
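The rules in the code above match the font size as an exact string ("font-size:17.28003"), which is brittle: another PDF may report 17.28 or 17.3, and on a machine whose culture uses a decimal comma the size is likely to be written as "17,28003". As a minimal sketch, assuming the span format produced by the strategy above, you could parse each line into font, size and text and compare the size with a tolerance. StyledLine, StyledLineParser and the tolerance value are hypothetical names chosen for illustration; they are not part of iTextSharp.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical container for one parsed line of the span-based output.
public sealed class StyledLine
{
    public string Font;
    public float Size;
    public string Text;
}

// Hypothetical helper, not part of iTextSharp.
public static class StyledLineParser
{
    // Matches: <span style="font-family:NAME;font-size:NUMBER">TEXT
    private static readonly Regex SpanPattern = new Regex(
        "<span style=\"font-family:(?<font>[^;\"]+);font-size:(?<size>[0-9.,]+)\">(?<text>.*)",
        RegexOptions.Compiled);

    // Returns null when the line does not contain an opening span (for example "<br />").
    public static StyledLine Parse(string line)
    {
        Match m = SpanPattern.Match(line);
        if (!m.Success)
        {
            return null;
        }
        // Accept both "17.28003" and "17,28003" by normalizing to a decimal point
        string rawSize = m.Groups["size"].Value.Replace(',', '.');
        float size;
        if (!float.TryParse(rawSize, NumberStyles.Float, CultureInfo.InvariantCulture, out size))
        {
            return null;
        }
        // Strip any remaining span tags, same regex as in the code above
        string text = Regex.Replace(m.Groups["text"].Value, "</?span.*?>", "").Trim();
        return new StyledLine { Font = m.Groups["font"].Value, Size = size, Text = text };
    }

    // True when the line uses the given font at roughly the given size.
    public static bool Matches(StyledLine line, string font, float size, float tolerance)
    {
        return line != null && line.Font == font && Math.Abs(line.Size - size) <= tolerance;
    }
}
```

Inside the loop, the title test would then become something like StyledLineParser.Matches(StyledLineParser.Parse(line), "HelveticaNeue-LightExt", 17.28f, 0.1f), with a similar call for the authors. The tolerance of 0.1 is a guess; widen or narrow it after looking at the sizes your own PDFs report.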
