PDDocument addPage /importPage not working correct - c#

I am using PDFBox 1.8.3 and my need is to take a set of ZPLs and convert them all into one PDF. For the conversion part(ZPL to PDF), we are using labelary and this works correctly. I am basically using PDFBox to "stitch" all these individual PDFs. My source code is very simple as below :
var outputByteStream = new ByteArrayOutputStream();
var destinationDoc = null;
var doc = null;
for each(var img in images.toArray()) {
log.debug("Working with image : " + img);
if("ZPL".equalsIgnoreCase(img.getFormat()) || "ZPL203".equalsIgnoreCase(img.getFormat())) {
try {
var convertorServiceUrl = "http://labelary ...." + img.getId()+"?labelSize=4x6&density=8dpmm";
var urlObject = new URL(convertorServiceUrl);
var conn = urlObject.openConnection();
conn.connect();
if(destinationDoc == null ) {
var tmpfile = java.io.File.createTempFile(pw +"-"+uniqKey, ".pdf");
var raf = new org.apache.pdfbox.io.RandomAccessFile(tmpfile, "rw");
destinationDoc = PDDocument.load(conn.getInputStream(), raf);
}
else {
doc = PDDocument.load(conn.getInputStream());
if (doc != null && doc.getNumberOfPages() > 0) {
var page = doc.getDocumentCatalog().getAllPages().get(0);
destinationDoc.importPage(page);
}
}
} catch (res) {
log.error("Error message retrieved is " + exceptionMsg);
throw new BaseRuntimeException("Unable to convert the PDF for image with id " + img.getId(), res);
}
}
}
try {
if(destinationDoc != null) {
destinationDoc.save(outputByteStream);
destinationDoc.close();
}
} catch (e1) {
log.error("Error in writing the document to the output stream " + pw + "." , e1);
throw e1;
}
return outputByteStream.toByteArray();
Source code runs and generates a PDF but all the pages of the PDF are pointing to the first page. So if my for-loop run 4 times, all 4 pages of the PDF are for first label.
If I use addPage like below
var outputByteStream = new ByteArrayOutputStream();
var destinationDoc = new PDDocument();
var doc = null;
for each(var img in images.toArray()) {
log.debug("Working with image : " + img);
if("ZPL".equalsIgnoreCase(img.getFormat()) || "ZPL203".equalsIgnoreCase(img.getFormat())) {
try {
var convertorServiceUrl = "http://labelary ...." + img.getId()+"?labelSize=4x6&density=8dpmm";
var urlObject = new URL(convertorServiceUrl);
var conn = urlObject.openConnection();
conn.connect();
doc = PDDocument.load(conn.getInputStream());
if (doc != null && doc.getNumberOfPages() > 0) {
var page = doc.getDocumentCatalog().getAllPages().get(0);
destinationDoc.addPage(page);
}
} catch (res) {
log.error("Error message retrieved is " + exceptionMsg);
throw new BaseRuntimeException("Unable to convert the PDF for image with id " + img.getId(), res);
}
}
}
try {
if(destinationDoc != null) {
destinationDoc.save(outputByteStream);
destinationDoc.close();
}
} catch (e1) {
log.error("Error in writing the document to the output stream " + pw + "." , e1);
throw e1;
}
return outputByteStream.toByteArray();
Then the result is a PDF page with all empty pages.
I have already ensured that the content returned from the labelary service is correct and if I simply take the response and save it into a file it works correctly. Even saving the PDDocument one page at a time also produces the PDF correctly.
The problem I have is with the "stitching" of PDFs. It should work as per the documentation but I am not sure what I am doing wrong.

Related

How to save the image while updating the BARCODE?

When I am saving an image, it is getting saved by the Barcode name.
But If I modify the barcode name, then the image is gone, not saving.
I have noticed while debugging that the if(System.IO.File.Exists(imageFilePath)) is not executing when I am updating the barcode. But at the time of saving a new one, its executing.
How to correct this?
Updating code
public object Update(ItemEntity item)
{
var val = _itemRepository.Save(item);
if (item.ImageFileName != string.Empty || item.ImageOriginalFileName != string.Empty)
{
string result = JsonConvert.SerializeObject(val);
var definition = new { Success = false, EntryMode = string.Empty, BarCode = string.Empty, ItemCode = string.Empty };
var info = JsonConvert.DeserializeAnonymousType(result, definition);
if (info.Success)
{
if (!string.IsNullOrEmpty(item.ImageOriginalFileName))
this.SaveImage(item.ImageFileName, (item.ImageOriginalFileName == "BARCODE" ? info.ItemCode : item.ImageOriginalFileName == "BARCODEYes" ? item.Barcode : item.ImageOriginalFileName));
}
}
if (item.DeleteImageFileName != string.Empty && item.ImageFileName == string.Empty) // Image file removed but no new image file is uploaded[Bug Id :34754 -No image is uploaded, when a user re-uploads an image after deleting a previous one.]
{
this.DeleteImage(item.DeleteImageFileName);
}
return val;
}
Saving Code
public void SaveImage(string fileName, string originalFileName)
{
string tempFileName = fileName;
if (!string.IsNullOrEmpty(tempFileName))
{
// move file from temp location to actual loation
var webAppPath = System.Web.Hosting.HostingEnvironment.MapPath("~");
var splitPath = webAppPath.Split("\\".ToCharArray()).ToList();
var upLevel = splitPath.Count - 1; // one level up
splitPath = splitPath.Take(upLevel).ToList();
var UIFolder = "WebUI";
splitPath.AddRange(new string[] { UIFolder, "Temp", "ImageUpload", CurrentSession.UserId.Value.ToString(), "ItemImage" });
var webUIPath = splitPath.Aggregate((i, j) => i + "\\" + j);
var imageFilePath = Path.Combine(webUIPath.EnsureEndsWith('\\'), tempFileName);
if (System.IO.File.Exists(imageFilePath))
{
splitPath = webAppPath.Split("\\".ToCharArray()).ToList();
upLevel = splitPath.Count - 1; // one level up
splitPath = splitPath.Take(upLevel).ToList();
splitPath.AddRange(new string[] { "Shared", "ItemImage" });
webUIPath = splitPath.Aggregate((i, j) => i + "\\" + j);
// Check existence of folder, create if not found and Delete-Create if found
if (!Directory.Exists(webUIPath))
{
Logger.Debug("Create directory :" + webUIPath);
Directory.CreateDirectory(webUIPath);
}
originalFileName = originalFileName + ".jpg";
var imagDestPath = Path.Combine(webUIPath.EnsureEndsWith('\\'), originalFileName);
if (File.Exists(imagDestPath))
{
GC.Collect();
GC.WaitForPendingFinalizers();
Logger.Debug("Delete File :" + imagDestPath);
File.Delete(imagDestPath);
}
byte[] bytes = System.IO.File.ReadAllBytes(imageFilePath);
FileStream fs = new FileStream(imagDestPath, FileMode.Create, FileAccess.Write);
BinaryWriter bw = new BinaryWriter(fs);
bw.Write(bytes);
bw.Close();
}
}
}

How to ignore protected pdf's?

I am writing on my pdf-word converter and I just received a really strange exception witch makes no sens to me.
Error:PdfiumViewer.PdfException:{"Unsupported security scheme"}
Its the first time that such a exception appears. but I have to be honest that I never tried to convert more then 3-4 files from pdf to word and right now I am doing more then 100 files.
Here is my code I am sry if its too long but I simply do not know on which line the error occurs
public static void PdfToImage()
{
try
{
Application application = null;
application = new Application();
string path = #"C:\Users\chnikos\Desktop\Test\Batch1\";
foreach (string file in Directory.EnumerateFiles(path, "*.pdf"))
{
var doc = application.Documents.Add();
using (var document = PdfiumViewer.PdfDocument.Load(file))
{
int pagecount = document.PageCount;
for (int index = 0; index < pagecount; index++)
{
var image = document.Render(index, 200, 200, true);
image.Save(#"C:\Users\chnikos\Desktop\Test\Batch1\output" + index.ToString("000") + ".png", ImageFormat.Png);
application.Selection.InlineShapes.AddPicture(#"C:\Users\chnikos\Desktop\Test\Batch1\output" + index.ToString("000") + ".png");
}
string getFileName = file.Substring(file.LastIndexOf("\\"));
string getFileWithoutExtras = Regex.Replace(getFileName, #"\\", "");
string getFileWihtoutExtension = Regex.Replace(getFileWithoutExtras, #".pdf", "");
string fileName = #"C:\Users\chnikos\Desktop\Test\Batch1\" + getFileWihtoutExtension;
doc.PageSetup.PaperSize = WdPaperSize.wdPaperA4;
foreach (Microsoft.Office.Interop.Word.InlineShape inline in doc.InlineShapes)
{
.....
}
doc.PageSetup.TopMargin = 28.29f;
doc.PageSetup.LeftMargin = 28.29f;
doc.PageSetup.RightMargin = 30.29f;
doc.PageSetup.BottomMargin = 28.29f;
application.ActiveDocument.SaveAs(fileName, WdSaveFormat.wdFormatDocument);
doc.Close();
string imagePath = #"C:\Users\chnikos\Desktop\Test\Batch1\";
Array.ForEach(Directory.GetFiles(imagePath, "*.png"), delegate(string deletePath) { File.Delete(deletePath); });
}
}
}
catch (Exception e)
{
Console.WriteLine("Error: " + e);
}
}
}
}

Create multi-file zip from stream and download on the fly

I'm using DotNetZip library to make multi-file zip archive and download it on the fly (no need to wait for download to start). However I can't make it to download instantly. From browser dev console, in network tab I noticed that first zip file is "transferred" and after being fully transferred it starts downloading.
Here is code fragment:
using (var vZipArchive = new ZipFile())
{
if (vFilesTable.Rows.Count > 0)
{
string vPathFormat = null;
string vKeyName = null;
string vPrimKey = null;
string vFileName = null;
vZipArchive.CompressionLevel = CompressionLevel.BestSpeed;
vZipArchive.CompressionMethod = CompressionMethod.Deflate;
vZipArchive.Comment = "Document Archive";
foreach (DataRow vFile in vFilesTable.Rows)
{
if (vKeyFieldName != null && !string.IsNullOrEmpty(vKeyFieldName))
{
vPathFormat = "{0}/";
}
else
{
vPathFormat = "/";
}
if (vFile.Table.Columns.Contains("FileName") && vFile.Table.Columns.Contains("PrimKey"))
{
vPrimKey = vFile["PrimKey"].ToString();
vFileName = vFile["FileName"].ToString();
if (vKeyFieldName != null)
{
vKeyName = vFile[vKeyFieldName].ToString();
}
string vPath = null;
if (vKeyFieldName != null)
{
vPath = string.Format(vKeyFieldName + " " + vPathFormat + "{1}", vKeyName, vFileName);
}
else
{
vPath = string.Format(vPathFormat + vFileName);
}
using (var vFileStream = vUserContext.GetFileStream(vRecordSource.ViewName, new Guid(vPrimKey)))
{
vFileStream.Position = 0;
vZipArchive.AddEntry(vPath, vFileStream);
}
}
else
{
throw new Exception("Select 'FileName' and 'PrimKey' fields in underlying DataSource");
}
} //end loop
pContext.Response.Clear();
pContext.Response.BufferOutput = false;
pContext.Response.ContentType = "application/zip";
pContext.Response.AddHeader("Content-Disposition", string.Format("attachment; filename=\"{0}.zip\"", vZipName));
vZipArchive.Save(pContext.Response.OutputStream);
}
else
{
return;
}
}
I also tried using ZipOutputStream and SharpCompress.dll library.
So, what I am missing to make it work? Or its impossible?

GoogleDrive C# SDK: Too much items in FileList

When I retrieve all Files (and Folders) of my GoogleDrive Account I should get something like 1500 List elements back, but I get a bit more than 3000 back. I looked into the List and found that some files are 2-3 times in it. Why is that?
Here is the code I use to retrieve the files:
public async Task<List<File>> RetrieveAllFilesAsList(DriveService service, string query = null)
{
List<File> result = new List<File>();
FilesResource.ListRequest request = service.Files.List();
if (query != null)
{
request.Q = query;
}
do
{
try
{
FileList files = await request.ExecuteAsync();
result.AddRange(files.Items);
request.PageToken = files.NextPageToken;
}
catch (Exception e)
{
Console.WriteLine("An error occurred (from RetrieveAllFilesAsList): " + e.Message);
request.PageToken = null;
}
}
while (!String.IsNullOrEmpty(request.PageToken));
return result;
}
Update1:
public async Task<List<File>> RetrieveAllFilesAsList(DriveService service, string query = null)
{
List<File> result = new List<File>();
FilesResource.ListRequest request = service.Files.List();
request.MaxResults = 1000;
if (query != null)
{
request.Q = query + " AND trashed=false";
}
else
{
request.Q = "trashed=false";
}
do
{
try
{
FileList files = await request.ExecuteAsync();
result.AddRange(files.Items);
request.PageToken = files.NextPageToken;
}
catch (Exception e)
{
Console.WriteLine("An error occurred (from RetrieveAllFilesAsList): " + e.Message);
request.PageToken = null;
}
}
while (!String.IsNullOrEmpty(request.PageToken));
int i;
for (i = 0; i < result.Count; i++ )
{
System.IO.File.AppendAllText(#"C:\Users\carl\Desktop\log.txt", result[i].Id + "\t" + result[i].Title + "\t" + result[i].ExplicitlyTrashed.ToString() + "\r\n");
}
// prints 3120 Lines
System.IO.File.AppendAllText(#"C:\Users\carl\Desktop\log.txt", "" + i + Environment.NewLine);
//Count = 3120
System.IO.File.AppendAllText(#"C:\Users\carl\Desktop\log.txt", "" + result.Count);
return result;
}
Word failed to give me the right the linecount, so I did it over my Function.
But I can find the FileId 2-3 times in the File.
I cannot write on the comments yet, so according to the API
from google
"Note: This method returns all files by default. This includes files with trashed=true in the results. Use the trashed=false query parameter to filter these from the results."
So can you check what url of the rest api is actually being called? It seems you need to put some filters on the List method.

How do I use lucene.net for searching file content?

I am currently using lucene.net to search the content of files for keyword search. I am able to get the results correctly but I have a scenario where I need to display the keywords found in a particular file.
There are two different files containing "karthik" and "steven", and if I search for "karthik and steven" I am able to get both the files displayed. If I search only for "karthik" and "steven" separately, only the respective files are getting displayed.
When I search for "karthik and steven" simultaneously I get both the files in the result as I am displaying the filename alone, and now I need to display the particular keyword found in that particular file as a record in the listview.
Public bool StartSearch()
{
bool bResult = false;
Searcher objSearcher = new IndexSearcher(mstrIndexLocation);
Analyzer objAnalyzer = new StandardAnalyzer();
try
{
//Perform Search
DateTime dteStart = DateTime.Now;
Query objQuery = QueryParser.Parse(mstrSearchFor, "contents", objAnalyzer);
Hits objHits = objSearcher.Search(objQuery, objFilter);
DateTime dteEnd = DateTime.Now;
mlngTotalTime = (Date.GetTime(dteEnd) - Date.GetTime(dteStart));
mlngNumHitsFound = objHits.Length();
//GeneratePreviewText(objQuery, mstrSearchFor,objHits);
//Generate results - convert to XML
mstrResultsXML = "";
if (mlngNumHitsFound > 0)
{
mstrResultsXML = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?><Results>";
//Loop through results
for (int i = 0; i < objHits.Length(); i++)
{
try
{
//Get the next result
Document objDocument = objHits.Doc(i);
//Extract the data
string strPath = objDocument.Get("path");
string strFileName = objDocument.Get("name");
if (strPath == null) { strPath = ""; }
string strLastWrite = objDocument.Get("last_write_time");
if (strLastWrite == null)
strLastWrite = "unavailable";
else
{
strLastWrite = DateField.StringToDate(strLastWrite).ToShortDateString();
}
double dblScore = objHits.Score(i) * 100;
string strScore = String.Format("{0:00.00}", dblScore);
//Add results as an XML row
mstrResultsXML += "<Row>";
//mstrResultsXML += "<Sequence>" + (i + 1).ToString() + "</Sequence>";
mstrResultsXML += "<Path>" + strPath + "</Path>";
mstrResultsXML += "<FileName>" + strFileName + "</FileName>";
//mstrResultsXML += "<Score>" + strScore + "%" + "</Score>";
mstrResultsXML += "</Row>";
}
catch
{
break;
}
}
//Finish off XML
mstrResultsXML += "</Results>";
//Build Dataview (to bind to datagrid
DataSet objDS = new DataSet();
StringReader objSR = new StringReader(mstrResultsXML);
objDS.ReadXml(objSR);
objSR = null;
mobjResultsDataView = new DataView();
mobjResultsDataView = objDS.Tables[0].DefaultView;
}
//Finish up
objSearcher.Close();
bResult = true;
}
catch (Exception e)
{
mstrError = "Exception: " + e.Message;
}
finally
{
objSearcher = null;
objAnalyzer = null;
}
return bResult;
}
Above is the code i am using for search and the xml i am binding to the listview, now i need to tag the particular keywords found in the respective document and display it in the listview as recordsss,simlar to the below listview
No FileName KeyWord(s)Found
1 Test.Doc karthik
2 Test2.Doc steven
i hope u guys undesrtood the question,
This depends on how your documents were indexed. You'll need to extract the original content, pass it through the analyzer to get the indexed tokens, and check which matches the generated query.
Just go with the Highlighter.Net package, part of contrib, which does this and more.

Categories