We have a MailMerge docx which has the following table:
_____________________________________________________________________________
Date                      Id        Description        Amount
_____________________________________________________________________________
{{TableStart:Lines}}      {{Id}}    {{Description}}    € {{Amount \# 0,00}}
{{Date \#"dd-MM-yyyy"}}                                {{TableEnd:Lines}}
_____________________________________________________________________________
Total                                                  € {{Total \# 0,00}}
_____________________________________________________________________________
Here is an example result row:
____________________________________________________________________________
Date Id Description Amount
____________________________________________________________________________
03-09-2015 0001 Company Name € 25,00
Buyer Name 1, Buyer Name 2
Product description
Extra description line
As you can see, the description has multiple lines. When the end of a page is reached, it just continues on the next page. So with the example above, the line could be like this at the end of page 1:
03-09-2015 0001 Company Name € 25,00
Buyer Name 1, Buyer Name 2
And like this at the start of page 2:
Product description
Extra description line
What I'd like instead is the following: When an item doesn't fit on the page anymore, the entire item must go to the start of the next page. Basically I want to prevent items from splitting between pages. Is there any way to accomplish this with MailMerge?
Also, we use C# in our project. It may be too ambitious to ask whether the MailMerge libraries have a setting for the behavior I want, but here is the code we use to convert the data & docx to a PDF:
var pdf = _documentService.CreateTableFile(new TableFileData(date, companyId,
    dataList.Select(x => new TableRowData
    {
        Description = x.Description,
        Amount = x.Amount,
        Date = x.Date,
        Id = x.Id
    }).ToList()));

var path = Path.Combine(FileService.GetTemporaryPath(), Path.GetRandomFileName());
var file = Path.ChangeExtension(path, "pdf");

using (var fs = File.OpenWrite(file))
{
    fs.Write(pdf, 0, pdf.Length);
}

Process.Start(file);
The CreateTableFile method:
public byte[] CreateTableFile(TableFileData data)
{
    if (data == null) throw new ArgumentNullException("data");

    const string fileName = "TableFile.docx";
    var path = Path.Combine(_templatePath, fileName);
    using (var fs = File.OpenRead(path))
    {
        var dataSource = new DocumentDataSource(data);
        return GenerateDocument(fs, dataSource);
    }
}
And the GenerateDocument method:
private static byte[] GenerateDocument(Stream template, DocumentDataSource dataSource, IFieldMergingCallback fieldMergingCallback = null)
{
    var doc = new Document(template);
    doc.MailMerge.FieldMergingCallback = fieldMergingCallback;
    doc.MailMerge.UseNonMergeFields = true;
    doc.MailMerge.CleanupOptions = MailMergeCleanupOptions.RemoveContainingFields |
                                   MailMergeCleanupOptions.RemoveUnusedFields |
                                   MailMergeCleanupOptions.RemoveUnusedRegions |
                                   MailMergeCleanupOptions.RemoveEmptyParagraphs;

    doc.MailMerge.Execute(dataSource);
    doc.MailMerge.ExecuteWithRegions((IMailMergeDataSourceRoot)dataSource);
    doc.UpdateFields();

    using (var ms = new MemoryStream())
    {
        var options = new PdfSaveOptions { WarningCallback = new AsposeWarningCallback() };
        doc.Save(ms, options);
        return ms.ToArray();
    }
}
After @bibadia's suggestion in the first comment on the question, I unchecked the suggested checkbox ("Allow row to break across pages") in the table settings of the docx.
This did the trick, so thanks a lot, bibadia!
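For anyone who would rather not edit the template, the same setting can presumably be applied programmatically inside GenerateDocument before saving. This is only a sketch, assuming the Aspose.Words table API (Table, Row, RowFormat.AllowBreakAcrossPages); doc is the Document from the surrounding method:

```csharp
// Sketch: disable row splitting for every table in the document.
// This is the programmatic equivalent of unchecking
// "Allow row to break across pages" in the template.
foreach (Table table in doc.GetChildNodes(NodeType.Table, true))
{
    foreach (Row row in table.Rows)
    {
        row.RowFormat.AllowBreakAcrossPages = false;
    }
}
```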
Related
I have a txt file named fileA.txt that I am trying to validate.
here is an example for fileA.txt
123, joshua, employee
134, vernon, manager
382, lisa, HR
So, what I am trying to do is read the contents of fileA and validate each line. For example, if the first field is supposed to be the employee ID (an int) but contains a string, I want to skip that line and go to the next using try/catch. However, if everything is fine, I will take the values and add them to a new list. Any ideas on how I can do the validation part?
here is what I have for now to read the file and add it to a new list
public static List<Employee> readlist(string path)
{
    var employees = new List<Employee>();
    var content = File.ReadAllText(path);
    var lines = content.Split('\n');
    foreach (var line in lines)
    {
        var info = line.Split(',');
        employees.Add(new Employee
        (
            int.Parse(info[0]),
            info[1],
            info[2]
        ));
    }
    return employees;
}
Hope what I have provided is sufficient, thank you for all the help in advance!
There is no need to use a try/catch; you can simply use the Int32.TryParse method to see whether the expected value is a number. If it is not a number, you just continue with the other lines:
foreach (var line in lines)
{
    var info = line.Split(',');
    var isIdValid = Int32.TryParse(info[0], out int employeeId);
    if (!isIdValid)
    {
        Console.WriteLine($"'{info[0]}' could not be parsed as an Int32.");
        continue;
    }
    employees.Add(new Employee
    (
        employeeId,
        info[1],
        info[2]
    ));
}
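One caveat: TryParse guards against a bad ID, but a line with too few commas would still throw an IndexOutOfRangeException on info[1] or info[2]. A self-contained sketch that also checks the field count (with the Employee type reduced to a tuple for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lines = new[]
{
    "123, joshua, employee",
    "abc, broken, line",   // non-numeric ID -> skipped
    "134, vernon",         // too few fields -> skipped
    "382, lisa, HR"
};

var employees = new List<(int Id, string Name, string Role)>();
foreach (var line in lines)
{
    var info = line.Split(',');
    if (info.Length < 3) continue;                            // not enough fields
    if (!int.TryParse(info[0].Trim(), out int id)) continue;  // ID is not an int
    employees.Add((id, info[1].Trim(), info[2].Trim()));
}

Console.WriteLine(string.Join("; ", employees.Select(e => $"{e.Id}:{e.Name}")));
// 123:joshua; 382:lisa
```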
I want to read a text file dynamically based on the headers. Consider an example like this:
name|email|phone|othername|company
john|john#example.com|1234||example
doe|doe#example.com||pin
jane||98485|
The values should be read like this for the records above:
name email phone othername company
john john#example.com 1234 example
doe doe#example.com pin
jane 98485
I tried using this
using (StreamReader sr = new StreamReader(new MemoryStream(textFile)))
{
    while (sr.Peek() >= 0)
    {
        string line = sr.ReadLine();        // Using the ReadLine method to read the text file.
        string[] strlist = line.Split('|'); // Using the String.Split method to split the string.
        Obj obj = new Obj();
        obj.Name = strlist[0].ToString();
        obj.Email = strlist[1].ToString();
        obj.Phone = strlist[2].ToString();
        obj.othername = strlist[3].ToString();
        obj.company = strlist[4].ToString();
    }
}
The code above works when every delimiter is present, but fails when trailing fields are omitted as in the example above. Any possible solution for this?
If you have any control over this, you should use a better serialization technology, or at least use a CSV parser that can deal with this sort of format. However, if you want to use string.Split, you can also take advantage of ElementAtOrDefault:
Returns the element at a specified index in a sequence or a default
value if the index is out of range.
Given
public class Data
{
    public string Name { get; set; }
    public string Email { get; set; }
    public string Phone { get; set; }
    public string OtherName { get; set; }
    public string Company { get; set; }
}
Usage
var results = File
    .ReadLines(SomeFileName)          // stream the lines from a file
    .Skip(1)                          // skip the header
    .Select(line => line.Split('|'))  // split on pipe
    .Select(items => new Data()       // populate some funky class
    {
        Name = items.ElementAtOrDefault(0),
        Email = items.ElementAtOrDefault(1),
        Phone = items.ElementAtOrDefault(2),
        OtherName = items.ElementAtOrDefault(3),
        Company = items.ElementAtOrDefault(4)
    });

foreach (var result in results)
    Console.WriteLine($"{result.Name}, {result.Email}, {result.Phone}, {result.OtherName}, {result.Company}");
Output
john, john#example.com, 1234, , example
doe, doe#example.com, , pin,
jane, , 98485, ,
When you split the line like string[] strlist = line.Split('|'); you can get undesired results.
For example, jane||98485| generates an array of just 4 elements, as you can check online here: https://rextester.com/WBOT6074
You should check your array strList after generating it, with things like checking its length before indexing into it.
As you haven't given clear details about the problem, I cannot give a more specific answer.
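A minimal sketch of that Split behavior: empty entries between and after delimiters are kept by default, so you should never assume a fixed element count.

```csharp
using System;

var parts = "jane||98485|".Split('|');
// Split keeps empty entries by default, so this yields 4 elements:
// ["jane", "", "98485", ""]
Console.WriteLine(parts.Length); // 4
```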
I'm trying to get the author name from this site. The site simply shows a result of 25 rows, and each row contains different info like the author's name, title, etc.
I tried lots of solutions to select the author name for each tr, but failed to retrieve it. Here is my code, if someone can help me figure out what I missed!
var documentx = new HtmlWeb().Load(post.ExtLink);
var div = documentx.DocumentNode.SelectNodes("//*//table[2]//tr");
if (div != null)
{
    foreach (var item in div)
    {
        Book model = new Book();
        var author = item.SelectSingleNode("//td[1]//a").InnerText.ToString();
        //var title = item.SelectNodes("//td").Skip(2).FirstOrDefault().InnerText;
        //var img = item.Descendants("img").Select(a1 => a1.GetAttributeValue("src", null)).FirstOrDefault();
        model.Book_Description = author;
    }
}
I want to get the author name for each row; this photo explains exactly what I want:
I tried to debug the code. It works fine before the foreach (it shows a 25-row result), but once the foreach starts executing it doesn't produce the expected result or value.
Try using:
var div = documentx.DocumentNode.SelectNodes("//*//table[3]//tr");
instead of:
var div = documentx.DocumentNode.SelectNodes("//*//table[2]//tr");
and use it like this:
var author = item.ChildNodes[0].InnerText;
var series = item.ChildNodes[1].InnerText;
var title = item.ChildNodes[2].InnerText;
I need to open a CSV file. Then I need to filter the data and generate an output file for each distinct value.
◘ Example
•Input file = "full list.csv"
NAME CITY
Mark Venezia
John New York
Lisa San Miguel
Emily New York
Amelia New York
Nicolas Venezia
Bill San Miguel
Steve Venezia
Output will be =
• file1 = "full list_Venezia.csv"
NAME CITY
Mark Venezia
Nicolas Venezia
Steve Venezia
• file2 = "full list_New York.csv"
NAME CITY
John New York
Emily New York
Amelia New York
• file3 = "full list_San Miguel.csv"
NAME CITY
Lisa San Miguel
Bill San Miguel
I'm using C# in a console application in Visual Studio, and I started reading the input file like this:
string inputFile = "full list.csv";
string outputFile;
string line;
string titles = File.ReadLines(inputFile).First();

System.IO.StreamReader file = new System.IO.StreamReader(inputFile);
while ((line = file.ReadLine()) != null)
{
}
file.Close();

System.IO.StreamWriter fileOut = new System.IO.StreamWriter(outputFile);
foreach (DatiOutput objOut in listOutput)
{
}
fileOut.Close();
Is there an algorithm that allows me to filter the data I need?
Here's a non-LINQy approach using a Dictionary to keep a reference to each output file based on the city name as the Key (there's nothing wrong with LINQ, though!):
string[] values;
string header;
string line, city, outputFileName;
string inputFile = "full list.csv";
Dictionary<string, System.IO.StreamWriter> outputFiles = new Dictionary<string, System.IO.StreamWriter>();

using (System.IO.StreamReader file = new System.IO.StreamReader(inputFile))
{
    header = file.ReadLine();
    while ((line = file.ReadLine()) != null)
    {
        values = line.Split(",".ToCharArray());
        city = values[1];
        if (!outputFiles.ContainsKey(city))
        {
            outputFileName = "full list_" + city + ".csv";
            outputFiles.Add(city, new System.IO.StreamWriter(outputFileName));
            outputFiles[city].WriteLine(header);
        }
        outputFiles[city].WriteLine(line);
    }
}

foreach (System.IO.StreamWriter outputFile in outputFiles.Values)
{
    outputFile.Close();
}
You have written most of the good parts yourself, and now you need to fill in the blanks.
Breaking down the steps:
1. Read the CSV into a collection.
2. Group the collection by city.
3. Write each group to a separate file.
The first step, of course, is to read the input file:
var listOutput = new List<DatiOutput>();
while ((line = file.ReadLine()) != null)
{
    var data = line.Split(new[] { ";" }, StringSplitOptions.RemoveEmptyEntries);
    if (!data[0].Trim().Equals("NAME"))
        listOutput.Add(new DatiOutput { Name = data[0].Trim(), City = data[1].Trim() });
}
I have assumed your DatiOutput looks like following as it was not given.
public class DatiOutput
{
    public string City { get; set; }
    public string Name { get; set; }
}
The next step is to group the collection by City and then write each group to a file. You can use LINQ for the grouping:

listOutput.GroupBy(c => c.City)

Once you have the result, you can create a file name with the corresponding city name appended and write the data to it:
foreach (var objOut in listOutput.GroupBy(c => c.City))
{
    var filePath = $"{Path.Combine(Path.GetDirectoryName(inputFile), Path.GetFileNameWithoutExtension(inputFile))}_{objOut.First().City}.csv";
    using (System.IO.StreamWriter fileOut = new System.IO.StreamWriter(File.Open(filePath, FileMode.OpenOrCreate, FileAccess.ReadWrite)))
    {
        fileOut.WriteLine($"NAME;CITY");
        foreach (var items in objOut)
        {
            fileOut.WriteLine($"{items.Name};{items.City}");
        }
    }
}
You would then have the desired result.
foreach (var g in File.ReadAllLines("full list.csv")
    .Skip(1)
    .Select(l => new {
        Name = l.Substring(0, l.IndexOf(',')),
        City = l.Substring(l.IndexOf(',') + 1) })
    .GroupBy(l => l.City))
{
    File.WriteAllLines($"full list_{g.Key}.csv", new[] { "NAME,CITY" }
        .Concat(g.Select(l => $"{l.Name},{l.City}")));
}
The key part your example was missing was GroupBy - this allows you to group the data you have read into groups based on a certain criterion (in our case City).
GroupBy is a powerful LINQ extension. The example above reads in all the data, skips the header, uses Select to transform each line into an instance of an anonymous type containing the name and city, and then uses GroupBy to group these instances by city. Finally, each group is written to its own file.
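A minimal, self-contained sketch of that GroupBy step with a few of the sample rows (the key selector picks the city, the element selector picks the name):

```csharp
using System;
using System.Linq;

var rows = new[] { "Mark,Venezia", "John,New York", "Nicolas,Venezia" };

// GroupBy(keySelector, elementSelector): key = city, elements = names.
var groups = rows
    .Select(r => r.Split(','))
    .GroupBy(p => p[1], p => p[0]);

foreach (var g in groups)
    Console.WriteLine($"{g.Key}: {string.Join(", ", g)}");
// Venezia: Mark, Nicolas
// New York: John
```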
I would take @TVOHM's answer in a slightly cleaner direction by keeping the same code style across the whole solution:
File.ReadAllLines("full list.csv")             // Read the input file
    .Skip(1)                                   // Skip the header row
    .Select(row => row.Split(','))             // Split each row into an array of name and city
    .GroupBy(row => row[1], row => row[0])     // Group by city, selecting the names
    .ToList()                                  // To list, so .ForEach is possible
    .ForEach(group => File.WriteAllLines($"full list_{group.Key}.csv", group)); // Create a file for each group and write the names
I'm trying to get a link and another element from an HTML page, but I don't really know what to do. This is what I have right now:
var client = new HtmlWeb(); // Initialize HtmlAgilityPack's functions.
var url = "http://p.thedgtl.net/index.php?tag=-1&title={0}&author=&o=u&od=d&page=-1&"; // The site/page we are indexing.
var doc = client.Load(string.Format(url, textBox1.Text)); // Load the whole page.
var nodes = doc.DocumentNode.SelectNodes("//a[@href]"); // Get every url.

string authorName = "";
string fileName = "";
string fileNameWithExt;

foreach (HtmlNode link in nodes)
{
    string completeUrl = link.Attributes["href"].Value; // The complete plugin download url.

    #region Get all jars
    if (completeUrl.Contains(".jar")) // Check if the url contains .jar
    {
        fileNameWithExt = completeUrl.Substring(completeUrl.LastIndexOf('/') + 1); // Get the filename with extension.
        fileName = fileNameWithExt.Remove(fileNameWithExt.LastIndexOf('.')); // Get the filename without extension.
        Console.WriteLine(fileName);
    }
    #endregion

    #region Get all Authors
    if (completeUrl.Contains("?author=")) // Check if the url contains ?author=
    {
        authorName = completeUrl.Substring(completeUrl.LastIndexOf('=') + 1); // Get the author name.
        Console.WriteLine(authorName);
    }
    #endregion
}
I am trying to get all the filenames and authors next to each other, but right now everything comes out as if randomly placed. Why?
Can someone help me with this? Thanks!
If you look at the HTML, it is unfortunately not well-formed. There are a lot of unclosed tags, and the way HAP structures it is not like a browser: it interprets the majority of the document as deeply nested. So you can't simply iterate through the rows of the table as you would in a browser; it gets a lot more complicated than that.
When dealing with such documents, you have to change your queries quite a bit. Rather than searching through child elements, you have to search through descendants adjusting for the change.
var title = System.Web.HttpUtility.UrlEncode(textBox1.Text);
var url = String.Format("http://p.thedgtl.net/index.php?title={0}", title);
var web = new HtmlWeb();
var doc = web.Load(url);

// select the rows in the table
var xpath = "//div[@class='content']/div[@class='pluginList']/table[2]";
var table = doc.DocumentNode.SelectSingleNode(xpath);

// unfortunately the `tr` tags are not closed so HAP interprets
// this table as having a single row with multiple descendant `tr`s
var rows = table.Descendants("tr")
    .Skip(1); // skip header row

var query =
    from row in rows
    // there may be a row with an embedded ad
    where row.SelectSingleNode("td/script") == null
    // each row has 6 columns so we need to grab the next 6 descendants
    let columns = row.Descendants("td").Take(6).ToList()
    let titleText = columns[1].Elements("a").Select(a => a.InnerText).FirstOrDefault()
    let authorText = columns[2].Elements("a").Select(a => a.InnerText).FirstOrDefault()
    let downloadLink = columns[5].Elements("a").Select(a => a.GetAttributeValue("href", null)).FirstOrDefault()
    select new
    {
        Title = titleText ?? "",
        Author = authorText ?? "",
        FileName = Path.GetFileName(downloadLink ?? ""),
    };
So now you can just iterate through the query and write out what you want for each of the rows.
foreach (var item in query)
{
    Console.WriteLine("{0} ({1})", item.FileName, item.Author);
}