VS 2008 [RESOLVED] Extracting and merging pages with iText

Dear All,

iText version: 5.3.4

I am using the PdfManipulation2.vb supplied by Stanav.

Code:

''' <summary>
''' Extract selected pages from a source pdf to a new pdf
''' </summary>
''' <param name="sourcePdf">the full path to the source pdf</param>
''' <param name="pageNumbersToExtract">the page numbers to extract (e.g. {1, 3, 5, 6})</param>
''' <param name="outPdf">The full path for the output pdf</param>
''' <remarks>The output pdf will contain the extracted pages in the order of the page numbers listed
''' in the pageNumbersToExtract parameter.</remarks>
Public Overloads Shared Sub ExtractPdfPage(ByVal sourcePdf As String, ByVal pageNumbersToExtract As Integer(), ByVal outPdf As String)
    Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
    Dim doc As iTextSharp.text.Document = Nothing
    Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
    Dim page As iTextSharp.text.pdf.PdfImportedPage = Nothing
    Try
        reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
        doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1))
        pdfCpy = New iTextSharp.text.pdf.PdfCopy(doc, New IO.FileStream(outPdf, IO.FileMode.Create))
        doc.Open()
        For Each pageNum As Integer In pageNumbersToExtract
            page = pdfCpy.GetImportedPage(reader, pageNum)
            pdfCpy.AddPage(page)
        Next
        doc.Close()
        reader.Close()
    Catch ex As Exception
        Throw ex
    End Try
End Sub

My problem is that when I try to use it, I get "Expected expression" at the page numbers {1,3,5,6}. :(

Code:

PdfManipulation2.ExtractPdfPage("C:\Test.pdf",{1,3,5,6}, "C:\TestExtractMerge.pdf")

Is it a syntax problem? Any ideas would be great!

Thank you in advance!

Glen

Re: Extracting and merging pages with iText

I figured it out myself:

Code:

Private Sub Button5_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button5.Click
    Dim fname As String
    fname = "C:\Test.pdf"

    'create table to hold reports
    Dim dtMerge As New DataTable
    dtMerge.Columns.Add("column 0", GetType(String))
    dtMerge.Columns.Add("column 1", GetType(String))

    'add report to table, using 0 to merge all pages
    dtMerge.Rows.Add(New String() {fname, TextBox2.Text}) ' TextBox2.Text = "1,3,5,6"

    'call pdf merge
    PdfManipulation.ExtractAndMergePdfPages(dtMerge, "C:\TestExtractMerge.pdf")
End Sub

This uses the PdfManipulation.vb supplied by Stanav.

Code:

''' <summary>
''' Extract pages from multiple pdf files and merge them into
''' a single pdf
''' </summary>
''' <param name="sourceTable">the datatable containing source pdf paths and the pages to extract
''' from each of them. This datatable should have 2 datacolumns of type String. The 1st column (column 0)
''' is for the file (full) path while the 2nd column (column 1) is for the list of pages to extract from
''' the source pdf in column 0. This list is a string of integer values separated by commas
''' (ex: "1, 3, 2, 5 , 8, 7, 9") </param>
''' <param name="outPdf">the path to save the output pdf</param>
''' <remarks>the pdf pages are extracted and merged in the order listed in the source datatable.
''' That is, for source pdf files, they will be merged from top row down, and for pages, they will be merged
''' by the order listed in the csv string</remarks>
Public Shared Sub ExtractAndMergePdfPages(ByVal sourceTable As DataTable, ByVal outPdf As String)
    Dim rowCount As Integer = sourceTable.Rows.Count
    Dim sourcePdf As String = String.Empty
    Dim pageNumbersToExtract() As Integer = Nothing
    Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
    Dim doc As iTextSharp.text.Document = Nothing
    Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
    Dim page As iTextSharp.text.pdf.PdfImportedPage = Nothing
    Select Case rowCount
        Case 0
            'Nothing to extract and merge
            Exit Sub
        Case 1
            'only 1 source pdf
            sourcePdf = CStr(sourceTable.Rows(0).Item(0))
            pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(0).Item(1)))
            ExtractPdfPage(sourcePdf, pageNumbersToExtract, outPdf)
        Case Else
            'multiple source pdf's
            Try
                sourcePdf = CStr(sourceTable.Rows(0).Item(0))
                pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(0).Item(1)))
                reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
                doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1))
                pdfCpy = New iTextSharp.text.pdf.PdfCopy(doc, New IO.FileStream(outPdf, IO.FileMode.Create))
                doc.Open()
                For Each pageNum As Integer In pageNumbersToExtract
                    page = pdfCpy.GetImportedPage(reader, pageNum)
                    pdfCpy.AddPage(page)
                Next
                reader.Close()
                For i As Integer = 1 To rowCount - 1
                    sourcePdf = CStr(sourceTable.Rows(i).Item(0))
                    pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(i).Item(1)))
                    reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
                    doc.SetPageSize(reader.GetPageSizeWithRotation(1))
                    For Each pageNum As Integer In pageNumbersToExtract
                        page = pdfCpy.GetImportedPage(reader, pageNum)
                        pdfCpy.AddPage(page)
                    Next
                    reader.Close()
                Next
                doc.Close()
            Catch ex As Exception
                Throw ex
            End Try
    End Select
End Sub
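
A note for anyone reading along: ExtractAndMergePdfPages calls ConvertToIntegerArray, but that helper's body is not shown in this thread. Purely as a rough sketch of what a helper of that kind might look like (ParsePageList and the PageListParsing module are hypothetical names of mine, not Stanav's actual code, and silently skipping invalid entries is my assumption), something like this would turn "1, 3, 2, 5" into an Integer array. It does not implement any "0 means all pages" convention.

```vb
Imports System
Imports System.Collections.Generic

Public Module PageListParsing

    ' Hypothetical stand-in for ConvertToIntegerArray (the real helper is not shown in the thread).
    ' Splits a comma-separated page list and keeps the entries in the order given.
    Public Function ParsePageList(ByVal csv As String) As Integer()
        Dim pages As New List(Of Integer)

        If String.IsNullOrEmpty(csv) Then
            Return pages.ToArray()
        End If

        For Each token As String In csv.Split(","c)
            Dim pageNum As Integer
            ' Assumption: entries that are not whole numbers, or are below 1, are skipped.
            If Integer.TryParse(token.Trim(), pageNum) AndAlso pageNum >= 1 Then
                pages.Add(pageNum)
            End If
        Next

        Return pages.ToArray()
    End Function

End Module
```

With that in place, ParsePageList(TextBox2.Text) could be passed straight to the Integer() parameter of ExtractPdfPage instead of going through a DataTable.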

Re: Extracting and merging pages with iText

Quote:

Originally Posted by Glenvn

Code:

PdfManipulation2.ExtractPdfPage("C:\Test.pdf",{1,3,5,6}, "C:\TestExtractMerge.pdf")

Is it a syntax problem? Any ideas would be great!
Thank you in advance!
Glen

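You're correct, it's a syntax error. In VS 2008 the braces on their own are only valid as an initializer in a declaration, not as an expression you can pass to a method, so you need to create the array explicitly. That line should look like this: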

Code:

Dim myPages() As Integer = New Integer() {1, 3, 5, 6}
PdfManipulation2.ExtractPdfPage("C:\Test.pdf", myPages, "C:\TestExtractMerge.pdf")

If you prefer to cram everything into one line, then:

Code:

PdfManipulation2.ExtractPdfPage("C:\Test.pdf", New Integer() {1,3,5,6}, "C:\TestExtractMerge.pdf")
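
For what it's worth, a bare {1,3,5,6} is accepted as an array literal only from VB 2010 onwards, which is why the original call fails in VS 2008. If you'd rather not write New Integer() at every call site, one option is a small ParamArray wrapper. This is only a sketch: PdfExtractHelpers and ExtractPages are hypothetical names, not part of PdfManipulation2.vb, and it assumes ExtractPdfPage is reachable as shown in the posts above.

```vb
Public Module PdfExtractHelpers

    ' Hypothetical convenience wrapper (not part of Stanav's PdfManipulation2.vb).
    ' ParamArray must be the last parameter, so outPdf moves ahead of the page numbers.
    Public Sub ExtractPages(ByVal sourcePdf As String, ByVal outPdf As String, ByVal ParamArray pageNumbers() As Integer)
        ' The compiler packs the trailing arguments into an Integer array for us.
        PdfManipulation2.ExtractPdfPage(sourcePdf, pageNumbers, outPdf)
    End Sub

    ' Example call: pages are listed directly, no array syntax needed.
    Public Sub Demo()
        ExtractPages("C:\Test.pdf", "C:\TestExtractMerge.pdf", 1, 3, 5, 6)
    End Sub

End Module
```

The trade-off is that the argument order differs from ExtractPdfPage, so it's a matter of taste whether the shorter call is worth it.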