I originally asked this on Adobe's forums but have yet to receive any responses.
I have to merge a large set (100+) of PDF files into a single report on a weekly basis. So far I have been doing this by hand: selecting the files, right-clicking, and choosing "Combine supported files in Acrobat". I would like to replicate this exact process programmatically (preferably in Excel/VBA, but C# or batch commands are acceptable alternatives). I currently have code that will combine PDF files, but it does not keep the bookmark structure the way "Combine supported files in Acrobat" does.
In other words, say I have three files called "A.pdf", "B.pdf", and "C.pdf", and each file contains two bookmarks called "Bkmrk 1" and "Bkmrk 2". I want to programmatically combine these three files into a single file that has 9 bookmarks arranged like the structure below:
A
Bkmrk 1
Bkmrk 2
B
Bkmrk 1
Bkmrk 2
C
Bkmrk 1
Bkmrk 2
At first I tried automating the process via the Acrobat SDK, but from what I understand the SDK does not allow programs to interact with the dialog box that appears when you execute the "Combine Files" menu option, so that did not work. I also tried programmatically inserting pages from one PDF file into another, but that does not produce the bookmark structure I am looking for, nor does it let me manipulate the bookmark hierarchy to create it.
Does anyone have an idea of how to do this? Any help would be greatly appreciated!
This was pure hell to get working, so I'm happy to share what I've got. This was adapted from code I found here, and will merge files, and put bookmarks at each merge point:
Private mlngBkmkCounter As Long
Public Sub updfConcatenate(pvarFromPaths As Variant, _
pstrToPath As String)
Dim origPdfDoc As Acrobat.CAcroPDDoc
Dim newPdfDoc As Acrobat.CAcroPDDoc
Dim lngNewPageCount As Long
Dim lngInsertPage As Long
Dim i As Long
Set origPdfDoc = CreateObject("AcroExch.PDDoc")
Set newPdfDoc = CreateObject("AcroExch.PDDoc")
mlngBkmkCounter = 0
'set the first file in the array as the "new"'
If newPdfDoc.Open(pvarFromPaths(LBound(pvarFromPaths))) = True Then
updfInsertBookmark "Test Start", lngInsertPage, , newPdfDoc
mlngBkmkCounter = 1
For i = LBound(pvarFromPaths) + 1 To UBound(pvarFromPaths)
Application.StatusBar = "Merging " & pvarFromPaths(i) & "..."
If origPdfDoc.Open(pvarFromPaths(i)) = True Then
lngInsertPage = newPdfDoc.GetNumPages
newPdfDoc.InsertPages lngInsertPage - 1, origPdfDoc, 0, origPdfDoc.GetNumPages, False
updfInsertBookmark "Test " & i, lngInsertPage, , newPdfDoc
origPdfDoc.Close
mlngBkmkCounter = mlngBkmkCounter + 1
End If
Next i
newPdfDoc.Save PDSaveFull, pstrToPath
newPdfDoc.Close 'release the merged document once it has been saved
End If
ExitHere:
Set origPdfDoc = Nothing
Set newPdfDoc = Nothing
Application.StatusBar = False
Exit Sub
End Sub
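The insert-bookmark code is below. To reproduce the nested structure, you would need to collect the bookmarks from each source document into an array and then re-create them under that document's bookmark in the merged file: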
Public Sub updfInsertBookmark(pstrCaption As String, _
plngPage As Long, _
Optional pstrPath As String, _
Optional pMyPDDoc As Acrobat.CAcroPDDoc, _
Optional plngIndex As Long = -1, _
Optional plngParentIndex As Long = -1)
Dim MyPDDoc As Acrobat.CAcroPDDoc
Dim jso As Object
Dim BMR As Object
Dim arrParents As Variant
Dim bkmChildsParent As Object
Dim bleContinue As Boolean
Dim bleSave As Boolean
Dim lngIndex As Long
If pMyPDDoc Is Nothing Then
Set MyPDDoc = CreateObject("AcroExch.PDDoc")
bleContinue = MyPDDoc.Open(pstrPath)
bleSave = True
Else
Set MyPDDoc = pMyPDDoc
bleContinue = True
End If
If plngIndex > -1 Then
lngIndex = plngIndex
Else
lngIndex = mlngBkmkCounter
End If
If bleContinue = True Then
Set jso = MyPDDoc.GetJSObject
Set BMR = jso.bookmarkRoot
If plngParentIndex > -1 Then
arrParents = jso.bookmarkRoot.Children
Set bkmChildsParent = arrParents(plngParentIndex)
bkmChildsParent.createchild pstrCaption, "this.pageNum= " & plngPage, lngIndex
Else
BMR.createchild pstrCaption, "this.pageNum= " & plngPage, lngIndex
End If
MyPDDoc.SetPageMode 3 '3 — display using bookmarks'
If bleSave = True Then
MyPDDoc.Save PDSaveIncremental, pstrPath
MyPDDoc.Close
End If
End If
ExitHere:
Set jso = Nothing
Set BMR = Nothing
Set arrParents = Nothing
Set bkmChildsParent = Nothing
Set MyPDDoc = Nothing
End Sub
To use:
Public Sub uTest_pdfConcatenate()
Const cPath As String = "C:\MyPath\"
updfConcatenate Array(cPath & "Test1.pdf", _
cPath & "Test2.pdf", _
cPath & "Test3.pdf"), "C:\Temp\TestOut.pdf"
End Sub
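For anyone adapting the code above to get the nested layout from the question, most of the bookkeeping is about page offsets. Here is a minimal, language-neutral sketch in Python (not VBA, and not tied to the Acrobat API) of how the parent and child bookmarks and their target pages could be planned before calling something like updfInsertBookmark with plngParentIndex. The Bookmark class, plan_merged_outline, and the page counts in the usage note are hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    title: str
    page: int  # zero-based page index in the merged file
    children: list = field(default_factory=list)


def plan_merged_outline(files):
    """Plan the merged outline.

    files is a list of (name, page_count, bookmarks), where bookmarks is a
    list of (title, zero_based_page_within_that_file).
    """
    outline = []
    offset = 0  # number of pages already placed in the merged file
    for name, page_count, bookmarks in files:
        # One parent bookmark per source file, pointing at its first page.
        parent = Bookmark(name, offset)
        for title, page in bookmarks:
            # Shift each original bookmark by the pages that precede its file.
            parent.children.append(Bookmark(title, offset + page))
        outline.append(parent)
        offset += page_count
    return outline
```

With made-up inputs such as ("A", 3, [("Bkmrk 1", 0), ("Bkmrk 2", 2)]) followed by ("B", 2, [("Bkmrk 1", 0), ("Bkmrk 2", 1)]), the parent for "B" lands on page 3 and its second child on page 4. Each planned child could then be created under the parent with the matching index.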
You might need to consider a commercial tool such as Aspose.Pdf.Kit to get the level of flexibility you're after. It does support file concatenation and bookmark manipulation.
There's a 30 day unlimited trial so you can't really lose out other than time if it doesn't work for you.
Use iText# (http://www.itextpdf.com/). imho it is one of the best PDF tools around. Code that does (approximately) what you want can be found here: http://java-x.blogspot.com/2006/11/merge-pdf-files-with-itext.html
Don't worry that all the examples are in Java; the classes and functions are the same in .NET.
hth
Mario
Docotic.Pdf library can merge PDF files while maintaining outline (bookmarks) structure.
Nothing special needs to be done. You just append all the documents one after another, and that's all.
using (PdfDocument pdf = new PdfDocument())
{
string[] filesToMerge = ...
foreach (string file in filesToMerge)
pdf.Append(file);
pdf.Save("merged.pdf");
}
Disclaimer: I work for Bit Miracle, vendor of the library.
The Acrobat SDK does let you create and read bookmarks. Check your SDK API Reference for:
PDDocGetBookmarkRoot()
PDBookmark* (AddChild, AddNewChild, GetNext, GetPrev... lots of functions in there)
If the "combine files" dialog doesn't give you the control you need, make your own dialog.
I have read this topic several times: this SO link about comparing two XLS (Excel) files. I have worked through it and tried it on some small examples.
I want to write high-performance C# code that reads two huge XLS files and compares the first row of file A with every row of file B. If that row of A does not occur anywhere in file B, it should be listed; the code then moves on to the next row of A.xls and again compares it with every row of file B.
Update 1:
(I do as follows):
DataTable dt1 = GetDataTableFromExcel(this.Directory, this.FirstFile, this.FirstFileSheetName);
dtRet = getDifferentRecords(dt1, dt2);
var adapter = new OleDbDataAdapter("SELECT * FROM [" + strSheetName + "$]", connectionString);
Update 2:
My main problem occurs when the XLS file contains 4000 records! (Huge files)
As requested by OP, here's a VBA solution. I'm guessing at a few details, so OP will need to adjust it to suit their specific use case.
This runs for me in <2s over 4000 rows.
Sub Demo()
Dim wb1 As Workbook, wb2 As Workbook
Dim ws1 As Worksheet, ws2 As Worksheet
Dim r1 As Range, r2 As Range
Dim v1 As Variant, v2 As Variant
Dim rw1 As Long, rw2 As Long
Dim cl As Long
Dim Found As Boolean
Const NUM_COLS_COMPARE = 1 'adjust as required
' Get reference to, or open, workbooks
Set wb1 = Application.Workbooks("NameOfBook1.xlsx") 'if already open
Set wb2 = Application.Workbooks.Open("C:\Path\ToWorkbook2.xlsx") 'if not open
'Get reference to sheets
Set ws1 = wb1.Worksheets("NameOfSheet1")
Set ws2 = wb2.Worksheets("NameOfSheet2")
'get reference to ranges
' assuming data in Column A and Row 1 fill the whole range. Adjust if necessary
Set r1 = ws1.Range(ws1.Cells(1, ws1.Columns.Count).End(xlToLeft), _
ws1.Cells(ws1.Rows.Count, 1).End(xlUp))
Set r2 = ws2.Range(ws2.Cells(1, ws2.Columns.Count).End(xlToLeft), _
ws2.Cells(ws2.Rows.Count, 1).End(xlUp))
'Get Data into Array
v1 = r1.Value2
v2 = r2.Value2
For rw1 = 1 To UBound(v1, 1)
Found = False
For rw2 = 1 To UBound(v2, 1)
'A row matches only if every compared column is equal
Found = True
For cl = 1 To NUM_COLS_COMPARE
If v1(rw1, cl) <> v2(rw2, cl) Then
Found = False
Exit For
End If
Next cl
If Found Then Exit For
Next rw2
'List rows of book 1 that have no match in book 2
If Not Found Then
Debug.Print "No Match for " & rw1, v1(rw1, 1)
End If
Next rw1
End Sub
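If the nested loop ever becomes the bottleneck on larger files, one common alternative is to index file B once and then look each row of A up in that index. This is a rough sketch of the idea in Python rather than VBA or C#; rows_missing_from and its parameters are hypothetical names, and it assumes the rows are already loaded into plain lists (for example from the arrays or DataTables used above).

```python
def rows_missing_from(rows_a, rows_b, num_cols_compare=1):
    """Return (row_number, row) for each row of rows_a with no match in rows_b.

    Only the first num_cols_compare columns are compared, and row numbers
    are 1-based to match worksheet rows.
    """
    # Index file B once; set membership tests take constant time on average.
    seen = {tuple(row[:num_cols_compare]) for row in rows_b}
    return [
        (number, row)
        for number, row in enumerate(rows_a, start=1)
        if tuple(row[:num_cols_compare]) not in seen
    ]
```

This does one pass over each file instead of comparing every row of A with every row of B. In C#, a HashSet<string> built from the key columns would play the same role.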
I am facing an issue with an SSRS report where I am trying to generate numbers that show the sequence of fields. I generate the numbers using custom code in the report, but it has two major issues.
First, I am able to get the numbers in my parent report, but in the subreport the numbering does not start from 1; it continues inside the subreport and does not take the parent report's values into account.
My code, given below, generates numbers like:
1 dummy data
1.1 dummy data
1.2 dummy data
2 dummy data
2.1 dummy data
Dim currentValue As Double
Public Function GetCounter(ByVal iCounter As Double, ByVal incrementCounter As Boolean) As Double
If (incrementCounter = true) Then
iCounter = (iCounter + currentValue)
currentValue = (currentValue + 0.1)
End If
Return iCounter
End Function
With my subreport, however, I want to generate numbers like:
1 dummy data
1.1 dummy data
1.1.1 dummy data
1.1.1.1 dummy data
1.1.1.2 dummy data
1.1.2 dummy data
1.1.2.1 dummy data
1.1.2.2 dummy data
1.2 and so on.
I cannot work out how to achieve this with a subreport.
The second issue:
Some fields get the correct numbers in the parent report, but in the PDF export some extra numbers appear, and for some fields the numbers shown in the report and in the PDF are different.
I do not understand why this happens or what solution to apply.
If anyone knows a solution to these issues, please help.
You can use the following custom code
Dim numbers = New Integer() {0, 0, 0, 0}
Public Function Seq(lev as Integer) As String
Select Case lev
Case 0
numbers(0) = numbers(0)+1
numbers(1) = 0
numbers(2) = 0
numbers(3) = 0
Return Cstr(numbers(0))
Case 1
numbers(1) = numbers(1)+1
numbers(2) = 0
numbers(3) = 0
Return Cstr(numbers(0)) & "." & Cstr(numbers(1))
Case 2
numbers(2) = numbers(2)+1
numbers(3) = 0
Return Cstr(numbers(0)) & "." & Cstr(numbers(1)) & "." & Cstr(numbers(2))
Case 3
numbers(3) = numbers(3)+1
Return Cstr(numbers(0)) & "." & Cstr(numbers(1)) & "." & Cstr(numbers(2)) &"." & Cstr(numbers(3))
End Select
End Function
The expression for group1 will be =Code.Seq(0), for group2 =Code.Seq(1), and so on.
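The same counter logic can be written without hard-coding four levels. As a hedged illustration (in Python, so it is not something you can paste into SSRS custom code as-is), the sketch below keeps one counter per nesting level, bumps the counter for the requested level, and drops everything deeper. OutlineNumberer is a hypothetical name.

```python
class OutlineNumberer:
    """Generates labels such as 1, 1.1, 1.1.1 for nested groups."""

    def __init__(self):
        self.counters = []  # one running counter per nesting level

    def seq(self, level):
        # Make sure a counter exists for the requested level.
        while len(self.counters) <= level:
            self.counters.append(0)
        self.counters[level] += 1
        # Entering a new item at this level restarts every deeper level.
        del self.counters[level + 1:]
        return ".".join(str(n) for n in self.counters)
```

Calling seq with levels 0, 1, 2, 3, 3, 2, 3, 3, 1 in that order yields 1, 1.1, 1.1.1, 1.1.1.1, 1.1.1.2, 1.1.2, 1.1.2.1, 1.1.2.2, 1.2, which is the sequence asked for in the question. As with the VB version, skipping a level (say 0 straight to 2) would leave a 0 in the label.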
I need to create a single PDF file from multiple images. For example, if I have 12 images, the PDF should have 3 pages, each containing 4 images arranged 2 to a row.
So, is there any DLL or sample I can use to generate a PDF from images?
There are multiple libraries that have support for this:
iTextSharp - Working with images tutorial
pdfSharp - Working with images tutorial
PDF Clown
Thanks, I have used a table to place 6 images on each page of the PDF.
Public Function CreatePDF(images As System.Collections.Generic.List(Of Byte())) As String
Dim PDFGeneratePath = Server.MapPath("../images/pdfimages/")
Dim FileName = "attachmentpdf-" & DateTime.Now.Ticks & ".pdf"
If images.Count >= 1 Then
Dim document As New Document(PageSize.LETTER)
Try
' Create pdfimages directory in images folder.
If (Not Directory.Exists(PDFGeneratePath)) Then
Directory.CreateDirectory(PDFGeneratePath)
End If
' we create a writer that listens to the document
' and directs a PDF-stream to a file
PdfWriter.GetInstance(document, New FileStream(PDFGeneratePath & FileName, FileMode.Create))
' opens up the document
document.Open()
' Add metadata to the document. This information is visible when viewing the
' Set images in table
Dim imageTable As New PdfPTable(2)
imageTable.DefaultCell.Border = Rectangle.NO_BORDER
imageTable.DefaultCell.HorizontalAlignment = Element.ALIGN_CENTER
For ImageIndex As Integer = 0 To images.Count - 1
If (images(ImageIndex) IsNot Nothing) AndAlso (images(ImageIndex).Length > 0) Then
Dim pic As iTextSharp.text.Image = iTextSharp.text.Image.GetInstance(SRS.Utility.Utils.ByteArrayToImage(images(ImageIndex)), System.Drawing.Imaging.ImageFormat.Jpeg)
' Setting image resolution
If pic.Height > pic.Width Then
Dim percentage As Single = 0.0F
percentage = 400 / pic.Height
pic.ScalePercent(percentage * 100)
Else
Dim percentage As Single = 0.0F
percentage = 240 / pic.Width
pic.ScalePercent(percentage * 100)
End If
pic.Border = iTextSharp.text.Rectangle.BOX
pic.BorderColor = iTextSharp.text.BaseColor.BLACK
pic.BorderWidth = 3.0F
imageTable.AddCell(pic)
End If
If ((ImageIndex + 1) Mod 6 = 0) Then
document.Add(imageTable)
document.NewPage()
imageTable = New PdfPTable(2)
imageTable.DefaultCell.Border = Rectangle.NO_BORDER
imageTable.DefaultCell.HorizontalAlignment = Element.ALIGN_CENTER
End If
If (ImageIndex = (images.Count - 1)) Then
imageTable.AddCell(String.Empty)
document.Add(imageTable)
document.NewPage()
End If
Next
Catch ex As Exception
Throw ' rethrow without resetting the stack trace
Finally
' Close the document object
' Clean up
document.Close()
document = Nothing
End Try
End If
Return PDFGeneratePath & FileName
End Function
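The paging and scaling rules in the function above are independent of iTextSharp, so they can be checked on their own. Below is a small sketch in Python of just that arithmetic: splitting the images into pages and rows, and choosing a scale percentage. layout_pages and scale_percent are hypothetical helper names; the defaults of 4 per page and 2 per row come from the question, and the 400/240 limits from the VB code.

```python
def layout_pages(items, per_page=4, per_row=2):
    """Split items into pages, and each page into rows of per_row items."""
    pages = []
    for start in range(0, len(items), per_page):
        page_items = items[start:start + per_page]
        rows = [
            page_items[i:i + per_row]
            for i in range(0, len(page_items), per_row)
        ]
        pages.append(rows)
    return pages


def scale_percent(width, height, max_width=240, max_height=400):
    """Scale percentage: limit the height for portrait images, else the width."""
    if height > width:
        return max_height / height * 100
    return max_width / width * 100
```

For 12 images this gives 3 pages of 2 rows each. A last row with a single image is the case the VB code pads with an empty cell so the table row is complete.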
Have a look at the book "iText in Action", this more or less also covers iTextSharp, which is a .NET version of the iText PDF library. That is, the C# you must write is almost identical to the Java code samples.
You can download the samples from http://itextpdf.com/book/examples.php. A particularly interesting example (code in Java) is the sample on how to add an image. The corresponding C# examples can be found on SourceForge.
Good luck!
I developed a Windows Service using C# that processes a number of Excel files in a folder to add conditional formatting, adjust page layout and print settings and add a macro to adjust page breaks. The problem I'm having is trying to add a line of code to the ThisWorkbook object in the Workbook_Open routine to automatically run the macro when the file is opened. The code I'm using to add the macro to Module1 is as follows:
using Excel = Microsoft.Office.Interop.Excel;
using VBIDE = Microsoft.Vbe.Interop;
VBIDE.VBComponent oModule;
String sCode;
oModule = wb.VBProject.VBComponents.Add(VBIDE.vbext_ComponentType.vbext_ct_StdModule);
sCode =
@"Sub FixPageBreaks()
On Error GoTo ErrMsg
Dim wb As Workbook
Set wb = ActiveWorkbook
Dim sheet As Worksheet
Set sheet = wb.Worksheets(1)
Dim vBreaks As VPageBreaks
Set vBreaks = sheet.VPageBreaks
If vBreaks.Count > 0 Then
Dim lastCol As Integer
lastCol = ActiveSheet.Cells.Find(What:=""*"", After:=ActiveCell, LookIn:=xlFormulas, LookAt:=xlPart, SearchOrder:=xlByColumns, SearchDirection:=xlPrevious, MatchCase:=False, SearchFormat:=False).Column
Dim lCount As Integer
lCount = 1
Dim brkCol As Integer
Dim brkRng As Range
Dim iReply As VbMsgBoxResult
Do
If vBreaks(lCount).Location.Column = lastCol Then
brkCol = vBreaks(lCount).Location.Column + 1
Else
brkCol = vBreaks(lCount).Location.Column - 1
End If
Set brkRng = Range(sheet.Cells(1, brkCol), sheet.Cells(1, brkCol))
If brkCol Mod 2 = 1 And lastCol > brkCol Then
Set vBreaks(lCount).Location = brkRng
ElseIf brkCol Mod 2 = 1 Then
vBreaks(lCount).DragOff Direction:=xlToRight, RegionIndex:=1
End If
lCount = lCount + 1
Loop While lCount <= vBreaks.Count
sheet.PrintPreview
End If
Exit Sub
ErrMsg:
MsgBox Err.Description
End Sub";
oModule.CodeModule.AddFromString(sCode);
In the line
wb.VBProject.VBComponents.Add(VBIDE.vbext_ComponentType.vbext_ct_StdModule);
wb is the workbook object instantiated earlier in the code. This all works; however, I can't find much documentation on the vbext_ComponentType enumeration to determine which value (if any) represents the ThisWorkbook object in the workbook, or how to add code to it. I would also be happy with C# code that does the same thing to page breaks as the macro in the Excel document. The only reason I'm not doing it in C# like the rest of the processing is that I was unable to make it work. Any help there would be equally welcome.
var workbookMainModule = wkBk.VBProject.VBComponents.Item("ThisWorkbook");
workbookMainModule.CodeModule.AddFromString(sCode);
<script language="VB" runat="server">
Public Data As String = ""
Public Height As Byte = 25
Public WidthMultiplier As Byte = 1
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs)
Dim dictEncoding As StringDictionary
Dim sbBarcodeImgs As StringBuilder
Dim strEncodedData As String
Dim I As Integer
dictEncoding = New StringDictionary()
dictEncoding.Add("0", "101001101101")
dictEncoding.Add("1", "110100101011")
dictEncoding.Add("2", "101100101011")
dictEncoding.Add("3", "110110010101")
dictEncoding.Add("4", "101001101011")
dictEncoding.Add("5", "110100110101")
dictEncoding.Add("6", "101100110101")
dictEncoding.Add("7", "101001011011")
dictEncoding.Add("8", "110100101101")
dictEncoding.Add("9", "101100101101")
dictEncoding.Add("A", "110101001011")
dictEncoding.Add("B", "101101001011")
dictEncoding.Add("C", "110110100101")
dictEncoding.Add("D", "101011001011")
dictEncoding.Add("E", "110101100101")
dictEncoding.Add("F", "101101100101")
dictEncoding.Add("G", "101010011011")
dictEncoding.Add("H", "110101001101")
dictEncoding.Add("I", "101101001101")
dictEncoding.Add("J", "101011001101")
dictEncoding.Add("K", "110101010011")
dictEncoding.Add("L", "101101010011")
dictEncoding.Add("M", "110110101001")
dictEncoding.Add("N", "101011010011")
dictEncoding.Add("O", "110101101001")
dictEncoding.Add("P", "101101101001")
dictEncoding.Add("Q", "101010110011")
dictEncoding.Add("R", "110101011001")
dictEncoding.Add("S", "101101011001")
dictEncoding.Add("T", "101011011001")
dictEncoding.Add("U", "110010101011")
dictEncoding.Add("V", "100110101011")
dictEncoding.Add("W", "110011010101")
dictEncoding.Add("X", "100101101011")
dictEncoding.Add("Y", "110010110101")
dictEncoding.Add("Z", "100110110101")
dictEncoding.Add("-", "100101011011")
dictEncoding.Add(".", "110010101101")
dictEncoding.Add(" ", "100110101101")
dictEncoding.Add("$", "100100100101")
dictEncoding.Add("/", "100100101001")
dictEncoding.Add("+", "100101001001")
dictEncoding.Add("%", "101001001001")
dictEncoding.Add("*", "100101101101")
strEncodedData = dictEncoding("*") & "0"
For I = 1 To Len(Data)
strEncodedData = strEncodedData & dictEncoding(Mid(Data, I, 1)) & "0"
Next I
strEncodedData = strEncodedData & dictEncoding("*")
sbBarcodeImgs = New StringBuilder()
For I = 1 To Len(strEncodedData)
If Mid(strEncodedData, I, 1) = "1" Then
sbBarcodeImgs.Append("<img src=""images/bar_blk.gif"" width=""" & WidthMultiplier & """ height=""" & Height & """ />")
Else
sbBarcodeImgs.Append("<img src=""images/bar_wht.gif"" width=""" & WidthMultiplier & """ height=""" & Height & """ />")
End If
Next I
litBarcode.Text = sbBarcodeImgs.ToString
End Sub
</script>
<asp:Literal ID="litBarcode" runat="server" />
Primarily the MID and dictionary usage are unfamiliar to me. Can this be completely converted to C#?
StringDictionary is just another collection class, so no problem. Mid could still be used as Microsoft.VisualBasic.Strings.Mid() if you're willing to reference the Visual Basic library from your C# app (nothing bad about that), or it could be rewritten fairly easily.
Edit: Actually, the VB.NET code just uses Mid the same way you would use String.Substring, so there's no need for the Visual Basic library at all. Note that Mid is 1-based while Substring is 0-based, so Mid(Data, I, 1) becomes Data.Substring(I - 1, 1). (I was thinking of Mid in VB6, which can be either a function or a statement. The function is similar to String.Substring; the statement has no really easy equivalent, but either way it doesn't matter for this code.)
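To make the structure of the loop clearer before porting it, here is a rough Python sketch of the encoding step only (no HTML output). It uses a small subset of the pattern table copied from the VB code; encode_code39 and PATTERNS are hypothetical names, and a real port would carry over the full table.

```python
# Subset of the bar patterns from the VB code ("1" = black, "0" = white).
PATTERNS = {
    "0": "101001101101",
    "1": "110100101011",
    "2": "101100101011",
    "A": "110101001011",
    "*": "100101101101",  # start/stop character
}


def encode_code39(data):
    """Return start + characters + stop, separated by a single white gap."""
    parts = [PATTERNS["*"]]
    for ch in data:  # plays the role of the 1-based Mid(Data, I, 1) loop
        parts.append(PATTERNS[ch])
    parts.append(PATTERNS["*"])
    # The VB code appends "0" after every pattern except the final stop one.
    return "0".join(parts)
```

In C#, a Dictionary<string, string> (or the same StringDictionary) would hold the table, and each "1" or "0" in the result maps to a black or white image exactly as in the second VB loop.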
Have you tried using the Telerik Converter? I ran your stuff through there and didn't get any errors.
http://www.developerfusion.com/tools/convert/vb-to-csharp/
Yes. Use one of the converter tools mentioned here. Only run the code through it, though, not the markup parts.