Thread: retrieving google documents contents

  1. #1

    Thread Starter
    Lively Member
    Join Date
    Dec 2011
    Posts
    117

    retrieving google documents contents

    I am trying to extract the contents of a Google Drive file, specifically a Google document. I have code that pulls the entire document, including the metadata. I have seen the Google API page on this: https://developers.google.com/drive/...m_google_drive

    I am having trouble parsing the body out of what my code returns. Has anyone done this? Thanks, Victor.


    Code:
    ' Requires: Imports System.IO and Imports System.Net
    Dim request As HttpWebRequest = DirectCast(WebRequest.Create("insert google doc url here"), HttpWebRequest)

    ' Using blocks dispose the response and the reader once the text has been read.
    Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
        Using reader As New StreamReader(response.GetResponseStream())
            MsgBox(reader.ReadToEnd())
        End Using
    End Using

  2. #2
    Frenzied Member
    Join Date
    Oct 2012
    Location
    Tampa, FL
    Posts
    1,187

    Re: retrieving google documents contents

    What do you mean by "parse the body out"? Do you mean read the content? According to the documentation, it looks like this:

    Code:
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(new Uri(downloadUrl));
    auth.ApplyAuthenticationToRequest(request);

    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    System.IO.Stream stream = response.GetResponseStream();
    StreamReader reader = new StreamReader(stream);
    return reader.ReadToEnd(); // this returns the string
    What problem are you seeing, or what exception is thrown? What does reader.ReadToEnd() return for you?

  3. #3

    Thread Starter
    Lively Member
    Join Date
    Dec 2011
    Posts
    117

    Re: retrieving google documents contents

    It returns a huge amount of data. You can find the body contents if you search the document. To clarify: I put the output of 'reader' into a textbox, then copied it into Microsoft Word for easier viewing. I could see the document title and, further down, the contents or 'body'. However, it was not readable because it looked something like: ~3()}$example title()#}}P(#/{}{3$o}example body text,.#*{32}{4uh}tero(#{}E}{}teno!){}.

    To give you an idea of how much extra there is: I had one word in the body, and the returned text fills 19 pages in Word.

    I am not sure how to parse the body from the rest. It seems impossible unless Google can do it on their side and send just the actual contents without the metadata.

  4. #4
    PowerPoster dunfiddlin's Avatar
    Join Date
    Jun 2012
    Posts
    8,245

    Re: retrieving google documents contents

    It's a Google document, so it has to be read in Google's software. You would get a very similar result if you read a Word document, or indeed the much simpler Rich Text Format, as plain text. I would have thought that was obvious. If you want plain text, then you need to either save the original document in that form at the server or simply copy and paste from the document in situ.
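
    For the "save it in that form at the server" route, the Drive REST API can convert a Google document on Google's side. The following is only a rough sketch, assuming the current Drive v3 files.export endpoint and an OAuth 2.0 access token that already has read access to the file. The names BuildExportUrl and DownloadPlainText, and the placeholder file ID and token, are made up for illustration and are not from the thread or the documentation.

    ```vb
    Imports System
    Imports System.IO
    Imports System.Net
    Imports System.Text

    Module DriveExportSketch

        ' Builds the Drive v3 export URL for a file ID and a target MIME type.
        ' (Hypothetical helper; the endpoint path is an assumption about Drive v3.)
        Function BuildExportUrl(fileId As String, mimeType As String) As String
            Return "https://www.googleapis.com/drive/v3/files/" &
                   Uri.EscapeDataString(fileId) &
                   "/export?mimeType=" & Uri.EscapeDataString(mimeType)
        End Function

        ' Requests the document converted to plain text and returns it as a string.
        ' accessToken is assumed to be a valid OAuth 2.0 bearer token.
        Function DownloadPlainText(fileId As String, accessToken As String) As String
            Dim url As String = BuildExportUrl(fileId, "text/plain")
            Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
            request.Headers.Add("Authorization", "Bearer " & accessToken)

            Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
                Using reader As New StreamReader(response.GetResponseStream(), Encoding.UTF8)
                    Return reader.ReadToEnd()
                End Using
            End Using
        End Function

        Sub Main()
            ' Placeholder values: replace with a real file ID and token.
            ' With the placeholders as written, GetResponse throws a WebException.
            Dim text As String = DownloadPlainText("YOUR_FILE_ID", "YOUR_ACCESS_TOKEN")
            Console.WriteLine(text)
        End Sub

    End Module
    ```

    If this works as assumed, the response contains only the document text, so there is nothing left to parse out. Requesting a different export MIME type, such as text/html, should return the same document in that format instead.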
    As the 6-dimensional mathematics professor said to the brain surgeon, "It ain't Rocket Science!"

    Reviews: "dunfiddlin likes his DataTables" - jmcilhinney

    Please be aware that whilst I will read private messages (one day!) I am unlikely to reply to anything that does not contain offers of cash, fame or marriage!
