[RESOLVED] Extracting and merging pages with iText

**Glenvn** · Mar 19th, 2013, 02:27 AM

Dear All,

iText version: 5.3.4

I am using the PdfManipulation2.vb supplied by Stanav.

Code:

    ''' <summary>
    ''' Extract selected pages from a source pdf to a new pdf
    ''' </summary>
    ''' <param name="sourcePdf">The full path to the source pdf</param>
    ''' <param name="pageNumbersToExtract">The page numbers to extract (e.g. {1, 3, 5, 6})</param>
    ''' <param name="outPdf">The full path for the output pdf</param>
    ''' <remarks>The output pdf will contain the extracted pages in the order of the page numbers listed
    ''' in the pageNumbersToExtract parameter.</remarks>
    Public Overloads Shared Sub ExtractPdfPage(ByVal sourcePdf As String, ByVal pageNumbersToExtract As Integer(), ByVal outPdf As String)
        Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
        Dim doc As iTextSharp.text.Document = Nothing
        Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
        Dim page As iTextSharp.text.pdf.PdfImportedPage = Nothing
        Try
            reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
            doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1))
            pdfCpy = New iTextSharp.text.pdf.PdfCopy(doc, New IO.FileStream(outPdf, IO.FileMode.Create))
            doc.Open()
            For Each pageNum As Integer In pageNumbersToExtract
                page = pdfCpy.GetImportedPage(reader, pageNum)
                pdfCpy.AddPage(page)
            Next
            doc.Close()
            reader.Close()
        Catch ex As Exception
            Throw   'rethrow as-is; "Throw ex" would reset the stack trace
        End Try
    End Sub

My problem is that when I try to use it, I get an "Expected expression" error at the page numbers {1,3,5,6}.

Code:

PdfManipulation2.ExtractPdfPage("C:\Test.pdf",{1,3,5,6}, "C:\TestExtractMerge.pdf")

Is it a syntax problem? Any ideas would be great!

Thank you in advance!

Glen

**Glenvn** · Mar 19th, 2013, 04:57 AM

I figured it out myself:

Code:

Private Sub Button5_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button5.Click

        Dim fname As String

        fname = "C:\Test.pdf"

        'create table to hold reports
        Dim dtMerge As New DataTable

        dtMerge.Columns.Add("column 0", GetType(String))
        dtMerge.Columns.Add("column 1", GetType(String))

        'add report to table, using 0 to merge all pages
        dtMerge.Rows.Add(New String() {fname, TextBox2.Text})

        'TextBox2.Text = "1,3,5,6"


        'call pdf merge
        PdfManipulation.ExtractAndMergePdfPages(dtMerge, "C:\TestExtractMerge.pdf")


    End Sub

This uses the PdfManipulation.vb supplied by Stanav.

Code:

    ''' <summary>
    ''' Extract pages from multiple pdf files and merge them into
    ''' a single pdf
    ''' </summary>
    ''' <param name="sourceTable">The datatable containing the source pdf paths and the pages to extract
    ''' from each of them. This datatable should have 2 datacolumns of type String. The 1st column (column 0)
    ''' is for the (full) file path, while the 2nd column (column 1) is for the list of pages to extract from
    ''' the source pdf in column 0. This list is a string of integer values separated by commas
    ''' (ex: "1, 3, 2, 5 , 8, 7, 9") </param>
    ''' <param name="outPdf">The path to save the output pdf</param>
    ''' <remarks>The pdf pages are extracted and merged in the order listed in the source datatable.
    ''' That is, source pdf files are merged from the top row down, and pages are merged
    ''' in the order listed in the csv string.</remarks>
    Public Shared Sub ExtractAndMergePdfPages(ByVal sourceTable As DataTable, ByVal outPdf As String)
        Dim rowCount As Integer = sourceTable.Rows.Count
        Dim sourcePdf As String = String.Empty
        Dim pageNumbersToExtract() As Integer = Nothing
        Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
        Dim doc As iTextSharp.text.Document = Nothing
        Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
        Dim page As iTextSharp.text.pdf.PdfImportedPage = Nothing
        Select Case rowCount
            Case 0  'Nothing to extract and merge
                Exit Sub
            Case 1  'only 1 source pdf
                sourcePdf = CStr(sourceTable.Rows(0).Item(0))
                pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(0).Item(1)))
                ExtractPdfPage(sourcePdf, pageNumbersToExtract, outPdf)
            Case Else   'multiple source pdf's
                Try
                    sourcePdf = CStr(sourceTable.Rows(0).Item(0))
                    pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(0).Item(1)))
                    reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
                    doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1))
                    pdfCpy = New iTextSharp.text.pdf.PdfCopy(doc, New IO.FileStream(outPdf, IO.FileMode.Create))
                    doc.Open()
                    For Each pageNum As Integer In pageNumbersToExtract
                        page = pdfCpy.GetImportedPage(reader, pageNum)
                        pdfCpy.AddPage(page)
                    Next
                    reader.Close()
                    For i As Integer = 1 To rowCount - 1
                        sourcePdf = CStr(sourceTable.Rows(i).Item(0))
                        pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(i).Item(1)))
                        reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
                        doc.SetPageSize(reader.GetPageSizeWithRotation(1))
                        For Each pageNum As Integer In pageNumbersToExtract
                            page = pdfCpy.GetImportedPage(reader, pageNum)
                            pdfCpy.AddPage(page)
                        Next
                        reader.Close()
                    Next
                    doc.Close()
                Catch ex As Exception
                    Throw   'rethrow as-is; "Throw ex" would reset the stack trace
                End Try
        End Select
    End Sub
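
ConvertToIntegerArray is called in the code above but is not shown in this thread. As a rough idea of what such a helper might look like, here is a minimal sketch, assuming it only splits the comma-separated string and parses each piece. This is a hypothetical re-creation, not Stanav's actual code, and it does not handle any special values such as the "0 for all pages" convention mentioned in the comment earlier.

```vbnet
Imports System
Imports System.Collections.Generic

Module PageListParsing
    ' Hypothetical stand-in for ConvertToIntegerArray (the real helper may differ).
    ' Turns a string such as "1, 3, 2, 5" into an Integer array, keeping the order.
    Function ConvertToIntegerArray(ByVal csv As String) As Integer()
        Dim result As New List(Of Integer)
        If String.IsNullOrEmpty(csv) Then Return result.ToArray()

        For Each piece As String In csv.Split(","c)
            Dim value As Integer
            ' Skip empty or non-numeric entries instead of throwing.
            If Integer.TryParse(piece.Trim(), value) Then result.Add(value)
        Next
        Return result.ToArray()
    End Function

    Sub Main()
        ' Same sample string as in the doc comment above.
        For Each pageNum As Integer In ConvertToIntegerArray("1, 3, 2, 5 , 8, 7, 9")
            Console.WriteLine(pageNum)
        Next
    End Sub
End Module
```

Silently skipping bad entries is just one possible choice; throwing on invalid input may be preferable if a mistyped page number should not go unnoticed.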

**stanav** · Mar 19th, 2013, 03:20 PM

Originally Posted by Glenvn

Code:

PdfManipulation2.ExtractPdfPage("C:\Test.pdf",{1,3,5,6}, "C:\TestExtractMerge.pdf")

Is it a syntax problem? Any ideas would be great!
Thank you in advance!
Glen

You need to declare the array, and you're correct: it's a syntax error, because the array isn't declared properly. That line should look like this:

Code:

Dim myPages() As Integer = New Integer() {1, 3, 5, 6}
PdfManipulation2.ExtractPdfPage("C:\Test.pdf", myPages, "C:\TestExtractMerge.pdf")

If you prefer to cram everything into one line, then:

Code:

PdfManipulation2.ExtractPdfPage("C:\Test.pdf", New Integer() {1,3,5,6}, "C:\TestExtractMerge.pdf")
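
For anyone wondering why the bare {1,3,5,6} was rejected: as far as I know, array literals used directly as expressions were only added in Visual Basic 2010, so an older compiler would likely report "Expression expected" there. The sketch below compares the two call styles. DescribePages is a hypothetical stand-in for ExtractPdfPage so that it runs without iTextSharp, and the second call assumes a compiler that supports array literals.

```vbnet
Imports System

Module ArrayArgumentDemo
    ' Hypothetical stand-in for ExtractPdfPage; only the Integer() parameter matters here.
    Function DescribePages(ByVal pages As Integer()) As String
        Return pages.Length.ToString() & " page(s)"
    End Function

    Sub Main()
        ' Explicit array creation expression: accepted by older compilers as well.
        Console.WriteLine(DescribePages(New Integer() {1, 3, 5, 6}))

        ' Bare array literal: assumes Visual Basic 2010 or later.
        Console.WriteLine(DescribePages({1, 3, 5, 6}))
    End Sub
End Module
```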
