-
Mar 19th, 2013, 02:27 AM
#1
Thread Starter
New Member
[RESOLVED] Extracting and merging pages with iText
Dear All,
iText version: 5.3.4
I am using the PdfManipulation2.vb supplied by Stanav.
Code:
''' <summary>
''' Extract selected pages from a source pdf to a new pdf
''' </summary>
''' <param name="sourcePdf">the full path to the source pdf</param>
''' <param name="pageNumbersToExtract">the page numbers to extract (e.g. {1, 3, 5, 6})</param>
''' <param name="outPdf">The full path for the output pdf</param>
''' <remarks>The output pdf will contain the extracted pages in the order of the page numbers listed
''' in the pageNumbersToExtract parameter.</remarks>
Public Overloads Shared Sub ExtractPdfPage(ByVal sourcePdf As String, ByVal pageNumbersToExtract As Integer(), ByVal outPdf As String)
    Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
    Dim doc As iTextSharp.text.Document = Nothing
    Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
    Dim page As iTextSharp.text.pdf.PdfImportedPage = Nothing
    Try
        reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
        doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1))
        pdfCpy = New iTextSharp.text.pdf.PdfCopy(doc, New IO.FileStream(outPdf, IO.FileMode.Create))
        doc.Open()
        'copy the requested pages in the order they are listed
        For Each pageNum As Integer In pageNumbersToExtract
            page = pdfCpy.GetImportedPage(reader, pageNum)
            pdfCpy.AddPage(page)
        Next
        doc.Close()
        reader.Close()
    Catch ex As Exception
        'rethrow without resetting the stack trace
        Throw
    End Try
End Sub
My problem is that when I try to use it, I get an "Expression expected" error at the page numbers {1,3,5,6}.
Code:
PdfManipulation2.ExtractPdfPage("C:\Test.pdf",{1,3,5,6}, "C:\TestExtractMerge.pdf")
Is it a syntax problem? Any ideas would be great!
Thank you in advance!
Glen
-
Mar 19th, 2013, 04:57 AM
#2
Thread Starter
New Member
Re: Extracting and merging pages with iText
I figured it out myself: 
Code:
Private Sub Button5_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button5.Click
    Dim fname As String
    fname = "C:\Test.pdf"
    'create table to hold reports
    Dim dtMerge As New DataTable
    dtMerge.Columns.Add("column 0", GetType(String))
    dtMerge.Columns.Add("column 1", GetType(String))
    'add report to table, using 0 to merge all pages
    dtMerge.Rows.Add(New String() {fname, TextBox2.Text})
    ' TextBox2.Text = 1,3,5,6
    'call pdf merge
    PdfManipulation.ExtractAndMergePdfPages(dtMerge, "C:\TestExtractMerge.pdf")
End Sub
Using the PdfManipulation.vb supplied by Stanav.
Code:
''' <summary>
''' Extract pages from multiple pdf files and merge them into
''' a single pdf
''' </summary>
''' <param name="sourceTable">the datatable containing the source pdf paths and the pages to extract
''' from each of them. This datatable should have 2 datacolumns of type String. The 1st column (column 0)
''' is for the (full) file path, while the 2nd column (column 1) is for the list of pages to extract from
''' the source pdf in column 0. This list is a string of integer values separated by commas
''' (ex: "1, 3, 2, 5 , 8, 7, 9") </param>
''' <param name="outPdf">the path to save the output pdf</param>
''' <remarks>the pdf pages are extracted and merged in the order listed in the source datatable.
''' That is, source pdf files are merged from the top row down, and pages are merged
''' in the order listed in the csv string</remarks>
Public Shared Sub ExtractAndMergePdfPages(ByVal sourceTable As DataTable, ByVal outPdf As String)
    Dim rowCount As Integer = sourceTable.Rows.Count
    Dim sourcePdf As String = String.Empty
    Dim pageNumbersToExtract() As Integer = Nothing
    Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
    Dim doc As iTextSharp.text.Document = Nothing
    Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
    Dim page As iTextSharp.text.pdf.PdfImportedPage = Nothing
    Select Case rowCount
        Case 0 'Nothing to extract and merge
            Exit Sub
        Case 1 'only 1 source pdf
            sourcePdf = CStr(sourceTable.Rows(0).Item(0))
            pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(0).Item(1)))
            ExtractPdfPage(sourcePdf, pageNumbersToExtract, outPdf)
        Case Else 'multiple source pdf's
            Try
                'the first source pdf sets up the output document
                sourcePdf = CStr(sourceTable.Rows(0).Item(0))
                pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(0).Item(1)))
                reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
                doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1))
                pdfCpy = New iTextSharp.text.pdf.PdfCopy(doc, New IO.FileStream(outPdf, IO.FileMode.Create))
                doc.Open()
                For Each pageNum As Integer In pageNumbersToExtract
                    page = pdfCpy.GetImportedPage(reader, pageNum)
                    pdfCpy.AddPage(page)
                Next
                reader.Close()
                'append the pages from the remaining source pdf's
                For i As Integer = 1 To rowCount - 1
                    sourcePdf = CStr(sourceTable.Rows(i).Item(0))
                    pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(i).Item(1)))
                    reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
                    doc.SetPageSize(reader.GetPageSizeWithRotation(1))
                    For Each pageNum As Integer In pageNumbersToExtract
                        page = pdfCpy.GetImportedPage(reader, pageNum)
                        pdfCpy.AddPage(page)
                    Next
                    reader.Close()
                Next
                doc.Close()
            Catch ex As Exception
                'rethrow without resetting the stack trace
                Throw
            End Try
    End Select
End Sub
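The routine above calls ConvertToIntegerArray, which is not shown in this thread. A minimal sketch of what such a helper might look like follows, assuming it only needs to split the comma-separated string and skip anything that is not an integer. The module name and the body are hypothetical and are not Stanav's actual implementation; in particular, any special meaning the original gives to 0 (the "merge all pages" comment above) is not reproduced here.

```vbnet
Imports System
Imports System.Collections.Generic

Module ConvertSketch
    ' Hypothetical stand-in for ConvertToIntegerArray; not the original helper.
    Public Function ConvertToIntegerArray(ByVal csv As String) As Integer()
        Dim result As New List(Of Integer)
        If String.IsNullOrEmpty(csv) Then Return result.ToArray()
        For Each part As String In csv.Split(","c)
            Dim value As Integer
            ' Trim copes with entries such as " 5 "; non-numeric entries are skipped.
            If Integer.TryParse(part.Trim(), value) Then
                result.Add(value)
            End If
        Next
        Return result.ToArray()
    End Function

    Sub Main()
        ' Uses the example string from the doc comment above.
        Dim pages As Integer() = ConvertToIntegerArray("1, 3, 2, 5 , 8, 7, 9")
        Dim asText As String() = Array.ConvertAll(Of Integer, String)(pages, Function(p As Integer) p.ToString())
        Console.WriteLine(String.Join(",", asText))
    End Sub
End Module
```

The page order of the input string is preserved, which matches the remark that pages are merged in the order listed.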
-
Mar 19th, 2013, 03:20 PM
#3
Re: Extracting and merging pages with iText
 Originally Posted by Glenvn
Code:
PdfManipulation2.ExtractPdfPage("C:\Test.pdf",{1,3,5,6}, "C:\TestExtractMerge.pdf")
Is it a syntax problem? Any ideas would be great!
Thank you in advance!
Glen
You're correct, it's a syntax error: the array has to be declared explicitly. That line should look like this:
Code:
Dim myPages() As Integer = New Integer() {1, 3, 5, 6}
PdfManipulation2.ExtractPdfPage("C:\Test.pdf", myPages, "C:\TestExtractMerge.pdf")
If you prefer to cram everything into one line, then:
Code:
PdfManipulation2.ExtractPdfPage("C:\Test.pdf", New Integer() {1,3,5,6}, "C:\TestExtractMerge.pdf")
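For reference, a bare {1,3,5,6} passed as an argument is an array literal, which the compiler accepts only from Visual Basic 2010 onward; an "Expression expected" error at that spot suggests an older compiler, though that is an assumption about the poster's setup. The sketch below shows the two forms that work in every VB.NET version. ShowPages is a hypothetical stand-in for ExtractPdfPage so the example runs without iTextSharp.

```vbnet
Imports System

Module ArrayArgumentSketch
    ' Hypothetical stand-in for ExtractPdfPage, used only to show the call syntax.
    Sub ShowPages(ByVal pageNumbers As Integer())
        For Each pageNum As Integer In pageNumbers
            Console.WriteLine(pageNum)
        Next
    End Sub

    Sub Main()
        ' Explicit array creation passed inline.
        ShowPages(New Integer() {1, 3, 5, 6})

        ' Array declared with an initializer first, then passed by name.
        Dim myPages() As Integer = {1, 3, 5, 6}
        ShowPages(myPages)
    End Sub
End Module
```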