-
May 3rd, 2013, 09:33 AM
#1
Thread Starter
Lively Member
retrieving google documents contents
I am trying to extract the contents of a Google Drive file. It is a Google Docs document. I have code that pulls the entire document, including metadata, etc. I saw the Google API page concerning this: https://developers.google.com/drive/...m_google_drive
I am having trouble parsing the body out of what my code returns. Has anyone done this? Thanks, Victor.
Code:
Imports System.IO
Imports System.Net

' The URL below is a placeholder; substitute the document's download URL.
Dim request As HttpWebRequest = DirectCast(WebRequest.Create("insert google doc url here"), HttpWebRequest)
Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
    Using reader As New StreamReader(response.GetResponseStream())
        MsgBox(reader.ReadToEnd())
    End Using
End Using
-
May 4th, 2013, 12:54 AM
#2
Re: retrieving google documents contents
What do you mean by "parse the body out"? Do you mean read the content? Per the documentation, it seems you should do something like this:
Code:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(new Uri(downloadUrl));
auth.ApplyAuthenticationToRequest(request);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
System.IO.Stream stream = response.GetResponseStream();
StreamReader reader = new StreamReader(stream);
return reader.ReadToEnd(); // this returns the string
What problem are you getting, or what exceptions are thrown? What does reader.ReadToEnd() return for you?
-
May 6th, 2013, 12:37 PM
#3
Thread Starter
Lively Member
Re: retrieving google documents contents
It returns a huge amount of data. You can find the body contents if you search the document. To clarify: I put the output of 'reader' into a textbox, then moved it over to Microsoft Word for easier viewing. I managed to see the document title, and further down I found the contents or 'body'. However, it was not readable because it looked something like: ~3()}$example title()#}}P(#/{}{3$o}example body text,.#*{32}{4uh}tero(#{}E}{}teno!){}.
To give you an idea of how much extra there is, I had one word in the body and what came back filled 19 pages in Word.
I am not sure how to parse the body from the rest. It seems impossible unless Google can do it on their side and send you just the actual contents with no metadata.
-
May 6th, 2013, 12:48 PM
#4
Re: retrieving google documents contents
It's a Google document so it has to be read in Google's software. You would get a very similar result if you read a Word document or indeed the much simpler Rich Text Format as plain text. I would have thought that was obvious. If you want plain text then you need to either save the original document in that form at the server or simply copy and paste from the document in situ.
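One way to get the plain-text form from the server, offered as a rough sketch rather than a confirmed fix for this thread: as far as I know, the Drive API file metadata for a Google document carries an exportLinks map from MIME type to URL, and a "text/plain" entry in that map should return just the text. The helper names PickPlainTextLink and DownloadText are made up for illustration, and the sketch assumes you already have the exportLinks values and a valid OAuth 2.0 access token.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Net

Module PlainTextExportSketch
    ' Hypothetical helper: pick the plain-text export URL from the file's
    ' exportLinks map (MIME type -> URL). Returns Nothing if there is none.
    Function PickPlainTextLink(exportLinks As IDictionary(Of String, String)) As String
        Dim url As String = Nothing
        If exportLinks IsNot Nothing AndAlso exportLinks.TryGetValue("text/plain", url) Then
            Return url
        End If
        Return Nothing
    End Function

    ' Hypothetical helper: fetch the exported text, sending the access token
    ' as a bearer token (assumed to be obtained elsewhere).
    Function DownloadText(exportUrl As String, accessToken As String) As String
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(exportUrl), HttpWebRequest)
        request.Headers.Add("Authorization", "Bearer " & accessToken)
        Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(response.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function
End Module
```

If PickPlainTextLink returns Nothing, the file probably is not a native Google document (or has no text export), and you would fall back to the normal download URL.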