Thread: [RESOLVED] Extracting and merging pages with iText

  1. #1

    Thread Starter
    New Member
    Join Date
    Oct 2009
    Posts
    10

    [RESOLVED] Extracting and merging pages with iText

    Dear All,

    iText version: 5.3.4


    I am using the PdfManipulation2.vb supplied by Stanav:

    Code:
     ''' <summary>
        ''' Extract selected pages from a source pdf to a new pdf
        ''' </summary>
        ''' <param name="sourcePdf">the full path to the source pdf</param>
        ''' <param name="pageNumbersToExtract">the page numbers to extract (i.e {1, 3, 5, 6})</param>
        ''' <param name="outPdf">The full path for the output pdf</param>
        ''' <remarks>The output pdf will contain the extracted pages in the order of the page numbers listed
        ''' in pageNumbersToExtract parameter.</remarks>
        Public Overloads Shared Sub ExtractPdfPage(ByVal sourcePdf As String, ByVal pageNumbersToExtract As Integer(), ByVal outPdf As String)
            Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
            Dim doc As iTextSharp.text.Document = Nothing
            Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
            Dim page As iTextSharp.text.pdf.PdfImportedPage = Nothing
            Try
                reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
                doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1))
                pdfCpy = New iTextSharp.text.pdf.PdfCopy(doc, New IO.FileStream(outPdf, IO.FileMode.Create))
                doc.Open()
                For Each pageNum As Integer In pageNumbersToExtract
                    page = pdfCpy.GetImportedPage(reader, pageNum)
                    pdfCpy.AddPage(page)
                Next
                doc.Close()
                reader.Close()
            Catch ex As Exception
                ' Rethrow with a bare Throw so the original stack trace is preserved
                Throw
            End Try
        End Sub

    My problem is that when I call it, I get an "Expected expression" error at the page numbers argument, {1,3,5,6}.

    Code:
    PdfManipulation2.ExtractPdfPage("C:\Test.pdf",{1,3,5,6}, "C:\TestExtractMerge.pdf")
    Is it a syntax problem? Any ideas would be great!


    Thank you in advance!

    Glen

  2. #2

    Thread Starter
    New Member
    Join Date
    Oct 2009
    Posts
    10

    Re: Extracting and merging pages with iText

    I figured it out myself:

    Code:
    Private Sub Button5_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button5.Click
    
            Dim fname As String
    
            fname = "C:\Test.pdf"
    
            'create table to hold reports
            Dim dtMerge As New DataTable
    
            dtMerge.Columns.Add("column 0", GetType(String))
            dtMerge.Columns.Add("column 1", GetType(String))
    
            'add report to table, using 0 to merge all pages
            'TextBox2.Text holds the page list, e.g. 1,3,5,6
            dtMerge.Rows.Add(New String() {fname, TextBox2.Text})
    
    
    
            'call pdf merge
            PdfManipulation.ExtractAndMergePdfPages(dtMerge, "C:\TestExtractMerge.pdf")
    
    
        End Sub

    This uses the PdfManipulation.vb supplied by Stanav:

    Code:
     ''' <summary>
        ''' Extract pages from multiple pdf files and merge them into 
        ''' a single pdf
        ''' </summary>
        ''' <param name="sourceTable">the datatable containing source pdf paths and the pages to extract
        ''' from each of them. This datatable should have 2 datacolumns of type String. The 1st column (column 0)
        ''' is for the file (full) path while the 2nd column (column 1) is for the list of pages to extract from
        ''' the source pdf in column 0. This list is a string of integer values separated by commas 
        ''' (ex: "1, 3, 2, 5 , 8, 7, 9") </param>
        ''' <param name="outPdf">the path to save the output pdf</param>
        ''' <remarks>the pdf pages are extracted and merged in the order listed in the source datatable.
        ''' That is, for source pdf files, they will be merged from top row down, and for pages, they will be merged
        ''' by the order listed in the csv string</remarks>
        Public Shared Sub ExtractAndMergePdfPages(ByVal sourceTable As DataTable, ByVal outPdf As String)
            Dim rowCount As Integer = sourceTable.Rows.Count
            Dim sourcePdf As String = String.Empty
            Dim pageNumbersToExtract() As Integer = Nothing
            Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
            Dim doc As iTextSharp.text.Document = Nothing
            Dim pdfCpy As iTextSharp.text.pdf.PdfCopy = Nothing
            Dim page As iTextSharp.text.pdf.PdfImportedPage = Nothing
            Select Case rowCount
                Case 0  'Nothing to extract and merge
                    Exit Sub
                Case 1  'only 1 source pdf
                    sourcePdf = CStr(sourceTable.Rows(0).Item(0))
                    pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(0).Item(1)))
                    ExtractPdfPage(sourcePdf, pageNumbersToExtract, outPdf)
                Case Else   'multiple source pdf's
                    Try
                        sourcePdf = CStr(sourceTable.Rows(0).Item(0))
                        pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(0).Item(1)))
                        reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
                        doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1))
                        pdfCpy = New iTextSharp.text.pdf.PdfCopy(doc, New IO.FileStream(outPdf, IO.FileMode.Create))
                        doc.Open()
                        For Each pageNum As Integer In pageNumbersToExtract
                            page = pdfCpy.GetImportedPage(reader, pageNum)
                            pdfCpy.AddPage(page)
                        Next
                        reader.Close()
                        For i As Integer = 1 To rowCount - 1
                            sourcePdf = CStr(sourceTable.Rows(i).Item(0))
                            pageNumbersToExtract = ConvertToIntegerArray(CStr(sourceTable.Rows(i).Item(1)))
                            reader = New iTextSharp.text.pdf.PdfReader(sourcePdf)
                            doc.SetPageSize(reader.GetPageSizeWithRotation(1))
                            For Each pageNum As Integer In pageNumbersToExtract
                                page = pdfCpy.GetImportedPage(reader, pageNum)
                                pdfCpy.AddPage(page)
                            Next
                            reader.Close()
                        Next
                        doc.Close()
                    Catch ex As Exception
                        ' Rethrow with a bare Throw so the original stack trace is preserved
                        Throw
                    End Try
            End Select
        End Sub
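
    Note that ConvertToIntegerArray is called above but is not shown in this thread. A minimal sketch of what such a helper might look like, assuming it simply splits the comma-separated string and parses each piece. The class name PageListParser is made up for illustration, and the actual helper in PdfManipulation.vb may behave differently (for example, this sketch does nothing special for a value of 0):

    Code:
    Imports System
    Imports System.Collections.Generic

    ' Hypothetical container class, not part of PdfManipulation.vb
    Public Class PageListParser

        ''' <summary>
        ''' Convert a csv string such as "1, 3, 2, 5 , 8" into an Integer array,
        ''' keeping the order in which the values are listed
        ''' </summary>
        Public Shared Function ConvertToIntegerArray(ByVal csv As String) As Integer()
            Dim pages As New List(Of Integer)
            If String.IsNullOrEmpty(csv) Then Return pages.ToArray()

            For Each token As String In csv.Split(","c)
                Dim pageNum As Integer
                ' Trim copes with stray spaces; tokens that are not whole numbers are skipped
                If Integer.TryParse(token.Trim(), pageNum) Then
                    pages.Add(pageNum)
                End If
            Next
            Return pages.ToArray()
        End Function

    End Class

    Silently skipping bad tokens is only one possible choice; throwing an exception on invalid input may be preferable if a typo in the page list should not go unnoticed.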

  3. #3
    PowerPoster stanav's Avatar
    Join Date
    Jul 2006
    Location
    Providence, RI - USA
    Posts
    9,290

    Re: Extracting and merging pages with iText

    Originally posted by Glenvn:
    Code:
    PdfManipulation2.ExtractPdfPage("C:\Test.pdf",{1,3,5,6}, "C:\TestExtractMerge.pdf")
    Is it a syntax problem.. Any ideas would be great!
    Thank you in advance!
    Glen
    You're correct, it's a syntax error: the array has to be created explicitly. VB compilers before VB 2010 don't accept a bare {1,3,5,6} as an argument, which is why you get "Expected expression". That line should be like this:
    Code:
    Dim myPages() As Integer = New Integer() {1, 3, 5, 6}
    PdfManipulation2.ExtractPdfPage("C:\Test.pdf",myPages, "C:\TestExtractMerge.pdf")
    If you prefer to cram everything into one line, then:
    Code:
    PdfManipulation2.ExtractPdfPage("C:\Test.pdf", New Integer() {1,3,5,6}, "C:\TestExtractMerge.pdf")
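
    For reference, a small standalone sketch of the array forms side by side. The module name ArrayFormsDemo and the CountPages function are made up for illustration; CountPages only stands in for any method that takes an Integer() parameter, as ExtractPdfPage does.

    Code:
    Imports System

    Module ArrayFormsDemo

        ' Hypothetical stand-in for a method with an Integer() parameter
        Function CountPages(ByVal pageNumbers As Integer()) As Integer
            Return pageNumbers.Length
        End Function

        Sub Main()
            ' Array creation expression assigned to a variable
            Dim a() As Integer = New Integer() {1, 3, 5, 6}

            ' Initializer on a declaration: the braces are allowed here
            ' because the declared type already says Integer()
            Dim b() As Integer = {1, 3, 5, 6}

            ' Array creation expression passed inline as an argument
            Console.WriteLine(CountPages(New Integer() {1, 3, 5, 6}))

            Console.WriteLine(CountPages(a) = CountPages(b))
        End Sub

    End Module

    With a VB 2010 or later compiler, a bare {1, 3, 5, 6} should also be accepted directly as the argument, but the New Integer() form works either way.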
