Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I have a bunch of HTML files, each about 40k lines long, and I need to extract only the sentences from them, so I want to automate the process. The text is located inside blocks like this:
<div class="text">...</div>
How do I search for these blocks and extract data between them to another file?
If the files are truly HTML files (e.g. they are the source of an actual webpage), your best bet is to use HtmlAgilityPack which, despite its age, is still incredibly robust (https://html-agility-pack.net/).
Your code to load the file and get all divs with the class of text would be:
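// Requires the HtmlAgilityPack NuGet package (using HtmlAgilityPack;)
var doc = new HtmlDocument();
doc.Load(filePath);
// Matches divs whose class attribute is exactly "text"; SelectNodes returns null if none match
var nodes = doc.DocumentNode.SelectNodes("//div[@class='text']");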
SelectNodes simply takes an XPath string, so the query is easy to adapt, and the documentation is pretty good!
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I am trying to create a utility that generates a self-extracting executable containing a pregenerated executable and a dynamically generated text file.
I have searched, possibly with the wrong keywords, but I have not been able to find anything that would help.
Quick-n-dirty way
You can append whatever you want to an exe and it will still run. So you start with a pre-made, fixed unpacker exe (the stub). You append an easy-to-find byte sequence to it, then you append the file. Or better, you append the file to the stub and then append its length as an int64. The unpacker then looks at the last 8 bytes of its own file to see how big the payload is, and reads the payload from just before them. No magic sequence necessary. See appending data to an exe for some suggestions.
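The length-trailer variant could be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (PayloadTrailer, Pack, Unpack) are hypothetical, the length is stored as an 8-byte little-endian value after the payload, and the payload is assumed to be small enough to fit in memory.

```csharp
using System;
using System.IO;

// Hypothetical helper; the layout is stub bytes, then payload, then an
// 8-byte payload length, which is one reading of the scheme described above.
public static class PayloadTrailer
{
    // Copies the stub to outputPath, then appends the payload and its length.
    public static void Pack(string stubPath, string payloadPath, string outputPath)
    {
        File.Copy(stubPath, outputPath, overwrite: true);
        byte[] payload = File.ReadAllBytes(payloadPath);
        using var output = new FileStream(outputPath, FileMode.Append, FileAccess.Write);
        using var writer = new BinaryWriter(output);
        writer.Write(payload);              // raw bytes, no length prefix
        writer.Write((long)payload.Length); // 8 bytes, little-endian
    }

    // Reads the last 8 bytes to learn the payload size, then reads the payload.
    public static byte[] Unpack(string packedPath)
    {
        using var input = new FileStream(packedPath, FileMode.Open, FileAccess.Read);
        using var reader = new BinaryReader(input);
        input.Seek(-sizeof(long), SeekOrigin.End);
        long length = reader.ReadInt64();
        if (length < 0 || length > input.Length - sizeof(long))
            throw new InvalidDataException("Payload length trailer is not valid.");
        input.Seek(-(sizeof(long) + length), SeekOrigin.End);
        return reader.ReadBytes(checked((int)length));
    }
}
```

In this sketch the generator utility would call Pack, and the stub would call Unpack on the path of its own executable at startup.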
Better way
You use Mono.Cecil to modify the exe stub and add the compressed content as a resource. There is an existing question on this topic.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have a C# source file saved as a .txt file. I have to read it dynamically and extract all the data available in it, so I need a parser to identify
C# class instances,
C# class fields,
etc.
Does anyone have an idea how to do this in a simple way?
If your C# file is an actual, valid C# file, you could wrap it in a project inside a solution (very simple, one file project), and then compile it. From the EXE file that got generated, you could use reflection to extract types, fields and methods dynamically during runtime.
Another option is to write a basic text parser that recognizes C# keywords and understands what that metadata is, but I think that the first alternative is easier and faster to implement.
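The reflection half of the first option could look roughly like this. It is a sketch under assumptions: the TypeInspector class and its DescribeFields method are hypothetical names, and the compile step that produces the assembly is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper that walks an already compiled assembly.
public static class TypeInspector
{
    // Lists the fields declared by each type as "Type.Field : FieldType".
    public static List<string> DescribeFields(Assembly assembly)
    {
        const BindingFlags flags =
            BindingFlags.Public | BindingFlags.NonPublic |
            BindingFlags.Instance | BindingFlags.Static |
            BindingFlags.DeclaredOnly;

        var lines = new List<string>();
        foreach (Type type in assembly.GetTypes())
        {
            foreach (FieldInfo field in type.GetFields(flags))
            {
                lines.Add($"{type.Name}.{field.Name} : {field.FieldType.Name}");
            }
        }
        return lines;
    }
}
```

With the generated file in hand, a call such as TypeInspector.DescribeFields(Assembly.LoadFrom(pathToCompiledExe)) would list its fields, where pathToCompiledExe is a placeholder; methods and properties can be enumerated the same way with GetMethods and GetProperties.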
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I currently have an application written in C#, which performs some calculations on electrical lines sag and tensions.
The program can only export a .doc file or print to PDF. As it stands, I cannot save a project in a format that lets me change any data later, because the output is already a Word or PDF document.
I need to set up an intermediary file format that allows the file to be edited while retaining the ability to export the project to Word or PDF.
Thank you in advance.
You'll have to define your own binary file format. How to define it depends on the data to store and this is up to you.
Or you can use an XML file format. Again, you'll have to define which data is written to the file and in what structure. It might be a good idea to provide your own DTD.
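As a rough illustration of the XML route, the project data could be round-tripped with LINQ to XML. Everything specific here is an assumption made for the example: the LineProject record, its three fields, and the element names are placeholders, since the question does not say what the application actually stores.

```csharp
using System.Xml.Linq;

// Hypothetical project model; the real application defines its own fields.
public sealed record LineProject(string Name, double SpanMeters, double TensionNewtons);

public static class ProjectFile
{
    // Writes the project as a small XML document with a format version.
    public static void Save(LineProject project, string path)
    {
        var root = new XElement("LineProject",
            new XAttribute("version", 1),
            new XElement("Name", project.Name),
            new XElement("SpanMeters", project.SpanMeters),
            new XElement("TensionNewtons", project.TensionNewtons));
        root.Save(path);
    }

    // Reads the project back; the casts parse numbers independently of culture.
    public static LineProject Load(string path)
    {
        XElement root = XElement.Load(path);
        return new LineProject(
            (string)root.Element("Name"),
            (double)root.Element("SpanMeters"),
            (double)root.Element("TensionNewtons"));
    }
}
```

The existing Word and PDF export would then work from the loaded LineProject rather than from the calculation screen directly, so an edited project file can be exported again.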
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I want to let my users edit CSS files.
So I need to load (read) one of my stylesheets into a textarea and then save (write) it back to the same stylesheet.
How can I do it?
It's just a file containing plain text. Simply load its contents and then save them back. The simple File methods should be fine; you could also add some kind of syntax highlighter to make editing nicer.
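A minimal sketch of that load and save round trip, assuming the stylesheet path is already known on the server. The StylesheetEditor class and its method names are hypothetical, and wiring the text to the textarea is left to whatever web framework is in use.

```csharp
using System.IO;

// Hypothetical helper around the plain File methods.
public static class StylesheetEditor
{
    // Returns the stylesheet text to show in the textarea.
    public static string Load(string cssPath) => File.ReadAllText(cssPath);

    // Overwrites the same stylesheet with the edited text.
    public static void Save(string cssPath, string editedCss) =>
        File.WriteAllText(cssPath, editedCss);
}
```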
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I am reading an XML file and using it for some processing in my application.
var config = XElement.Load("c:/sample.xml");
Is there a better way to load it? This line of code takes a while to execute.
Look at the XmlReader class; it provides fast, noncached, forward-only access to XML data.
You are using the so-called DOM model, which loads the whole XML document into memory. An alternative is a streaming, SAX-like model, in which you read the data sequentially. In .NET the API for that is XmlReader.
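A forward-only read with XmlReader might look like the sketch below. The StreamingXml class and the ReadElementValues method are hypothetical names, and the sketch assumes the elements of interest contain only text.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper showing forward-only reading with XmlReader.
public static class StreamingXml
{
    // Yields the text of every element with the given name without
    // building a tree of the whole document in memory.
    public static IEnumerable<string> ReadElementValues(TextReader source, string elementName)
    {
        using XmlReader reader = XmlReader.Create(source);
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
            {
                // Returns the text and moves past the end tag, so the reader
                // must not be advanced again here. Throws if the element
                // contains child elements.
                yield return reader.ReadElementContentAsString();
            }
            else
            {
                reader.Read();
            }
        }
    }
}
```

For a file on disk, a StreamReader opened on the path can be passed as source. Whether this is noticeably faster than XElement.Load depends on the size of the file and on how much of it the application actually needs.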