Thread: Removing text from a PDF

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Sep 2018
    Posts
    160

    Removing text from a PDF

    Hi

    I have a PDF file from a client with name and address details on the odd pages (it's a file of two-sided postcards and could be several hundred pages long).

    I need to take a copy of one of the postcards (2 pages), remove the existing address and add a different one. The idea is to have a copy go to a seed address for quality control purposes.

    What I've done so far is lay a white square graphic over the existing text (it's in a consistent area) and then, using iTextSharp, stamp the new address on top. Finally, I append this 'new' PDF to the end of the existing file.

    So far so good, but the old address is still in the file, so when I do a text extract from the PDF it gets confused on this record: there are effectively two addresses mixed together, even though only the one I want is visible.
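
    The check I mean can be sketched roughly like this. It's only a minimal sketch, assuming iTextSharp 5.x is referenced; the file path and page number are placeholders rather than values from the real job. It uses PdfTextExtractor.GetTextFromPage to pull the text of one page, and because extraction ignores what is visually covered, text sitting under a white graphic still comes back.

    ```vb
    Imports System
    Imports iTextSharp.text.pdf
    Imports iTextSharp.text.pdf.parser

    Module ExtractCheck
        Sub Main()
            ' Placeholder path and page number (assumptions for illustration only)
            Dim fileName = "C:\test redact\sample.pdf"
            Dim pageNumber = 1

            Dim reader As New PdfReader(fileName)
            Try
                ' Reads the text operators in the page content, whether or not
                ' something has been drawn over them
                Dim pageText = PdfTextExtractor.GetTextFromPage(reader, pageNumber)
                Console.WriteLine(pageText)
            Finally
                reader.Close()
            End Try
        End Sub
    End Module
    ```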

    Can anyone suggest a way I could make the white area stamp process remove the text, instead of just covering it visually?

    I'm open to other suggestions as to how to tackle this!

    Thanks for reading

  2. #2

    Thread Starter
    Addicted Member
    Join Date
    Sep 2018
    Posts
    160

    Re: Removing text from a PDF

    From a bit more research it looks like redacting the existing address area might be the way to go (then stamping the new address on top).
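
    As a rough sketch of what redacting an area could look like, assuming iTextSharp 5.x with the itextsharp.xtra assembly referenced: the area is handed to the clean-up processor as a PdfCleanUpLocation, and the content inside it is removed rather than just covered. The paths, page number and rectangle below are placeholders, and this is untested against my real file.

    ```vb
    Imports System.Collections.Generic
    Imports System.IO
    Imports iTextSharp.text
    Imports iTextSharp.text.pdf
    Imports iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup

    Module RedactSketch
        Sub Main()
            ' Placeholder values (assumptions for illustration only)
            Dim src = "C:\test redact\sample.pdf"
            Dim dest = "C:\test redact\redacted.pdf"
            Dim pageNumber = 1

            ' Area to clean, given as (llx, lly, urx, ury) in PDF points
            Dim region As New iTextSharp.text.Rectangle(370, 220, 620, 350)

            ' The colour is what gets painted over the cleaned area
            Dim locations As New List(Of PdfCleanUpLocation)
            locations.Add(New PdfCleanUpLocation(pageNumber, region, BaseColor.WHITE))

            Dim reader As New PdfReader(src)
            Using stamper As New PdfStamper(reader, New FileStream(dest, FileMode.Create))
                Dim cleaner As New PdfCleanUpProcessor(locations, stamper)
                cleaner.CleanUp()
            End Using
            reader.Close()
        End Sub
    End Module
    ```

    The new address would then be stamped into the same area as a separate step.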

    If I have any joy with this I'll post details in case it helps someone.

  3. #3

    Thread Starter
    Addicted Member
    Join Date
    Sep 2018
    Posts
    160

    Re: Removing text from a PDF

    Sorry to keep replying to my own post, but I've found what looks to be the answer in the link below. I just need a bit of help with the final stage:

    https://itextpdf.com/en/resources/fa...ply-redactions


    I currently have the following (I've converted it from the C# used in the link above to VB):


    Code:
    Imports iTextSharp.text.pdf
    Imports System.IO
    Imports iTextSharp.xtra.iTextSharp.text.pdf.pdfcleanup
    
    
    Public Class Form1
    
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    
            Dim FileName = "C:\test redact\sample.pdf"
            Dim newFileName = "C:\test redact\redact marked.pdf"
    
            ' Mark area to redact with rectangle
    
    
            Using stream As Stream = New FileStream(FileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
                Dim pdfReader As New PdfReader(stream)
                ' FileMode.Create truncates any existing output file, so no stale bytes are left behind
                Using stamper As New PdfStamper(pdfReader, New FileStream(newFileName, FileMode.Create))
    
                    Dim page As Integer = 2
                    ' Rectangle is (llx, lly, urx, ury); same area as before, with the left edge given first
                    Dim rect As New iTextSharp.text.Rectangle(370, 220, 620, 350)
                    Dim annotation As New PdfAnnotation(stamper.Writer, rect)
                    annotation.Put(PdfName.SUBTYPE, New PdfName("Redact"))
                    annotation.Title = "My Author"
                    annotation.Put(New PdfName("Subj"), New PdfName("Redact"))
                    Dim fillColor As Single() = {0, 0, 0}
                    annotation.Put(New PdfName("IC"), New PdfArray(fillColor))
                    Dim fillColorRed As Single() = {1, 0, 0}
                    annotation.Put(New PdfName("OC"), New PdfArray(fillColorRed))
                    stamper.AddAnnotation(annotation, page)
    
                End Using
    
            End Using
    
    
    
            ' Now redact the marked area
    
            Dim src = newFileName
            Dim dest = "C:\test redact\redact FINAL.pdf"
    
            ' Open by path so no separate FileStream is left undisposed
            Dim reader2 = New PdfReader(src)
    
            'Dim stamper2 As New PdfStamper(reader2, New FileOutputStream(dest))
    
            Dim stamper2 = New PdfStamper(reader2, New FileStream(dest, FileMode.Create))
    
            Dim cleaner As New PdfCleanUpProcessor(stamper2)
    
            cleaner.CleanUp()
            stamper2.Close()
            reader2.Close()
    
            ' open the results to have a look
    
            Process.Start(dest)
            Application.Exit()
    
        End Sub
    
    
    End Class

    With this code the 'marking up' section is OK and produces an outline on page 2 of my PDF in the area I want.

    The problem is I get the error System.NullReferenceException: 'Object reference not set to an instance of an object.' on this line:

    Code:
    Dim cleaner As New PdfCleanUpProcessor(stamper2)

    As part of getting this to work I had to replace this line:

    Code:
    'Dim stamper2 As New PdfStamper(reader2, New FileOutputStream(dest))

    with this one:

    Code:
    Dim stamper2 = New PdfStamper(reader2, New FileStream(dest, FileMode.Create))

    Also, and probably most important, the original link mentions that the process 'requires the itext-xtra.jar'. I've downloaded this file but I don't know how to use it. I'm afraid I know nothing about Java.

    Can anyone give me some help please?

    Thanks!
