-
[RESOLVED] How to import a word document into a RichTextBox
The code :
RichTextBox1.LoadFile "C:\xyz.RTF", rtfRTF imports an RTF file fine.
RichTextBox1.LoadFile "C:\xyz.TXT", rtfText imports a text file fine.
However if one tries to import a MS Word file of extension .doc, a whole load of gash symbols appear in addition to the desired text. Some of these are "valid" alpha-numeric characters, so a routine which parses out "weird" characters would nonetheless leave in situ some gash characters.
Is there please a neat way to import a .doc file without first having to save it as a .txt or .rtf format file (using MS Word or whatever)?
camoore
Wales, UK
-
Re: How to import a word document into a RichTextBox
.doc is (or at least was) a proprietary closed format, and is not supported by the RTB or other controls.
The easiest way to do what you want is to automate Word (assuming it is installed) to save it as .rtf, and then load that. For example code etc see our Office Development FAQs (at the top of the Office Development forum)
The much more difficult alternative is to write your own code to interpret the .doc file (using the specifications which have now been made public), and re-build the text etc inside the RTB. This method would be hundreds/thousands of times more difficult.
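As a rough illustration of the automation route, here is a minimal VB6 sketch. It assumes Word is installed and that the form has a RichTextBox named RichTextBox1; the routine name LoadDocViaWord and its two path parameters are made up for this example. It uses late binding, so no project reference to Word is needed, and error handling is left out for brevity.

```vb
' Sketch only: convert a .doc to .rtf with Word, then load the .rtf.
' LoadDocViaWord, sDocPath and sRtfPath are hypothetical names.
Private Sub LoadDocViaWord(ByVal sDocPath As String, ByVal sRtfPath As String)
    Const wdFormatRTF As Long = 6      ' WdSaveFormat value for RTF
    Dim wdApp As Object
    Dim wdDoc As Object

    ' Late binding: start a hidden instance of Word
    Set wdApp = CreateObject("Word.Application")
    wdApp.Visible = False

    ' Open(FileName, ConfirmConversions, ReadOnly)
    Set wdDoc = wdApp.Documents.Open(sDocPath, False, True)

    ' Save a copy in RTF format, then close without saving the original
    wdDoc.SaveAs sRtfPath, wdFormatRTF
    wdDoc.Close False
    wdApp.Quit

    Set wdDoc = Nothing
    Set wdApp = Nothing

    ' The RichTextBox can read the converted file directly
    RichTextBox1.LoadFile sRtfPath, rtfRTF
End Sub
```

It could be called as LoadDocViaWord "C:\xyz.doc", "C:\xyz.rtf". Starting Word is the slow part, so expect a noticeable delay on the first call.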
-
Re: How to import a word document into a RichTextBox
if word is not installed i have been able to do this successfully using wordpad
-
Re: How to import a word document into a RichTextBox
Thank you for such rapid replies, Si the Geek and Westconn1.
I wanted, if possible, to avoid having to use WORD to save the document as a .txt or .rtf file first.
VMT Si for the pointer to those Office Development FAQs - very useful in so many ways. There I found code suggested by RobDog 888 for 2-way transfer. However this requires starting up WORD, and that takes quite a while on my PC. My current application requires near real-time access to files. Nonetheless, for an application which is not so time-critical, this approach is most valuable. Thank you.
Maybe if, having started up WORD, I leave it running, subsequent access to .doc files will be much faster?
Just looking at all the gash characters which come back when I try to use :
RichTextBox1.LoadFile "C:\xyz.RTF", rtfRTF or
RichTextBox1.LoadFile "C:\xyz.TXT", rtfText
to transfer rapidly a .doc file, it does not look too difficult to parse out the wheat from the chaff given that one knows what one is looking for. But to do this will take processing time (to manipulate the string, character by character, checking for a valid ASCII value). I will still be left with various gash characters which I do not want, such as inserted %'s which could however be valid characters in another context.
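The character-by-character check described above might look something like the VB6 sketch below. The function name KeepPrintable is invented for this example, and the choice of what counts as "valid" (printable ASCII plus tab, CR and LF) is an assumption. It only drops non-printable characters, so stray printable ones such as the %'s mentioned will still get through.

```vb
' Sketch only: keep printable ASCII plus tab/CR/LF, drop everything else.
' KeepPrintable is a hypothetical helper name.
Private Function KeepPrintable(ByVal sRaw As String) As String
    Dim i As Long
    Dim n As Long          ' number of characters kept so far
    Dim c As Integer
    Dim sOut As String

    ' Pre-size the output buffer; writing into it with Mid$ is much
    ' faster than repeated string concatenation on long inputs
    sOut = Space$(Len(sRaw))

    For i = 1 To Len(sRaw)
        c = Asc(Mid$(sRaw, i, 1))
        If (c >= 32 And c <= 126) Or c = 9 Or c = 10 Or c = 13 Then
            n = n + 1
            Mid$(sOut, n, 1) = Chr$(c)
        End If
    Next i

    ' Trim the unused tail of the buffer
    KeepPrintable = Left$(sOut, n)
End Function
```

Usage would be along the lines of RichTextBox1.Text = KeepPrintable(RichTextBox1.Text) after the LoadFile call.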
My application would only be time-critical in the .doc to TextBox route. From TextBox to .doc can take much longer if need be.
I will look into Westconn1's suggestion about using wordpad.
For some days am now on walkabout, with laptop. Ability to evaluate will be limited. Will re-post to thread later.
At first sight, the best route is to ensure that source files are of .txt or .rtf format. That is clearly what the RichTextBox is designed to support.
Regards,
camoore
Wales, UK
-
Re: How to import a word document into a RichTextBox
Just one more thought.
Is there a way from within a VB program to make Windows place a .doc document (all of it) with a specified path eg. "C:\TryThis.doc" onto the clipboard?
If so, then I think that GetText would suffice to import it into a text box or a rich text box? That would maybe preserve the text content and eliminate all the .doc chaff overhead. (Images will not be required, just text and / or numeric data).
If this is possible somehow, would it be necessary for MS Word to be running?
I believe it may be regarded as "bad practice" for a program to activate a write to clipboard, but that would be perfectly OK in my application which will be home office specific.
camoore
-
Re: How to import a word document into a RichTextBox
Quote:
Originally Posted by
camoore
Maybe if, having started up WORD, I leave it running, subsequent access to .doc files will be much faster?
It certainly would be faster that way, and is what I would recommend.
All you need to worry about is closing the files when you finish each one, and Word itself when your program closes - both of which are easy enough.
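A minimal VB6 sketch of that pattern, assuming Word is installed: one hidden Word instance is created on first use and kept in a form-level variable, each document is closed as soon as its text has been read, and Word itself is closed when the form unloads. The names mwdApp and GetDocText are made up for this example, and error handling is omitted.

```vb
' Sketch only: reuse a single Word instance for every .doc that is read.
' mwdApp and GetDocText are hypothetical names.
Private mwdApp As Object   ' shared Word instance, created on first use

Private Function GetDocText(ByVal sDocPath As String) As String
    Dim wdDoc As Object

    ' Start Word only once; later calls reuse the running instance
    If mwdApp Is Nothing Then
        Set mwdApp = CreateObject("Word.Application")
        mwdApp.Visible = False
    End If

    ' Open(FileName, ConfirmConversions, ReadOnly)
    Set wdDoc = mwdApp.Documents.Open(sDocPath, False, True)

    ' Plain text of the whole document, without formatting
    GetDocText = wdDoc.Content.Text

    ' Close each file as soon as it has been read
    wdDoc.Close False
    Set wdDoc = Nothing
End Function

Private Sub Form_Unload(Cancel As Integer)
    ' Close Word itself when the program closes
    If Not mwdApp Is Nothing Then
        mwdApp.Quit False
        Set mwdApp = Nothing
    End If
End Sub
```

With this in place, RichTextBox1.Text = GetDocText("C:\TryThis.doc") only pays the Word start-up cost on the first call.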
Quote:
to transfer rapidly a .doc file, it does not look too difficult to parse out the wheat from the chaff given that one knows what one is looking for.... I will still be left with various gash characters which I do not want, such as inserted %'s which could however be valid characters in another context.
Indeed, it certainly isn't a reliable method - and you will get a certain amount of 'bad' characters left over.
It could potentially be faster than keeping an instance of Word open, but probably not by much.
Quote:
I will look into Westconn1's suggestion about using wordpad.
I suspect that would be faster (due to the relative simplicity of WordPad meaning it takes less time to open), but have no idea what functionality is provided - you may not be able to re-use the same instance, in which case Word would probably then be faster.
Quote:
Originally Posted by
camoore
Is there a way from within a VB program to make Windows place a .doc document (all of it) with a specified path eg. "C:\TryThis.doc" onto the clipboard?
Unfortunately not, all Windows can do is put the actual file on the clipboard.
To get the contents of the file you would need to get Word to do it... which would basically be a less reliable version of the other automation options.
The fastest method would almost certainly be to write your own code to properly decode the .doc format (based on the official specifications), but that would take a lot of effort - and I suspect that you can get a decent speed using Word/WordPad.
-
Re: How to import a word document into a RichTextBox
Hating to be beaten, I have today written a pretty simple piece of code which imports a .doc file to a RichTextBox and converts it to "clean" text upon which a VB program can operate. I will of course post the code when it is presentable(ish).
I am working away from home at present, and so am devoid of all my notes and textbooks.
Can anyone help me please with two queries :
1. How can I write a STRING file to a RichTextBox? Text1=ASTRING would work for an ordinary text box, but RichTextBox1=ASTRING generates an error.
2.
-
Re: How to import a word document into a RichTextBox
Oops, above message sent before I completed the second query.
2. I am writing long STRINGS to an ordinary text box. The multiline property is TRUE. How can I make the text stay within the box ie. not go on off the right hand side in an enormous long text line? This seems possible with a LABEL, but I want to use a Text box, not least because of the option of scrollbars.
camoore
-
Re: How to import a word document into a RichTextBox
rtb1.text = astring
a textbox should wordwrap if multiline is true.
-
Re: How to import a word document into a RichTextBox
need to change the scrollbars property... you'll need to play with the values to get the right one you want (can't remember it off the top of my head).
-tg
-
Re: How to import a word document into a RichTextBox
Spot on techgnome, thank you very much.
I had set multiline to TRUE and had asked for both vertical and horizontal SCROLLBARS.
Selecting just the vertical scrollbar option solved the problem. The text now wraps correctly and stays inside the Text Box. A most useful tip.
Now to work on the advice of Westconn1 about the RTBox loading a large STRING variable and I will shortly be able to post my preliminary code for importing direct into VB6 a .doc file.
It seems pretty fast in VB. Hopefully it will be yet faster as a .exe (this tbd.)
In essence it imports the text content of a .doc file, cuts off the considerable initial and terminal character sequences and produces a reasonably clean replica of it in a richtextbox. Of course the full .doc formatting is lost, but much of the layout is preserved and now a VB6 program can act upon it, parse data from it and so forth.
To come shortly, as a beta version!
camoore
-
Re: How to import a word document into a RichTextBox
i doubt this will still work with a word 2007 document
-
1 Attachment(s)
Re: How to import a word document into a RichTextBox
IMPORT of .doc files to a RT Box in VB6
The attached ZIP folder contains, I hope, my program which inter alia does this for MS Office 2000 running under Windows 2000.
It remains to be seen if it works with later OS and later versions of Office / Word.
All comment welcomed please.
Within the Zipped folder is a more descriptive document (in Word!)
camoore
-
Re: How to import a word document into a RichTextBox
This does not work with a single .doc or .docx file on my computer. I tried dozens of different .doc files. Sorry.
-
Re: How to import a word document into a RichTextBox
What about programmatically renaming the doc file as rtf, open in WordPad, copy, paste into your rtb?
For a 2007 docx file you could just read the compressed xml file and enter it into your rtb. I currently dont have 2007 installed since my hd crashed still. Waiting for Win 7 before reinstalling
-
Re: How to import a word document into a RichTextBox
Hi Tom Moran,
Thank you for evaluation. What operating system are you using and what version of MS Word generated the .doc files please?
It works for every single .doc file I have tried in it!
RobDog888 - I wanted a scheme for importing .doc files into VB6 RTBox "automatically" without a need for operator intervention or action.
Strange I have got it running fine here on my Win 2000 machine.
Would be very grateful for any further evaluation info.
Can you, Tom, please describe to me just what happens at each stage of using the program? Maybe by private message if this is getting too intricate for the thread?
Is it possible to send to me a .doc file which does NOT work for you?
camoore
-
Re: How to import a word document into a RichTextBox
Um no user interaction needed as I posted to programmatically do it. ;)
WordPad can open doc files as well as rtf documents.
You could also programmatically open the doc and save it as rtf using WordPad, to clean it out and make the file plain rtf. Then open it in the rtb and set to rtftext.
-
Re: How to import a word document into a RichTextBox
Hi Camoore:
I'm using XP (sp3) and Word 2007. Tried documents saved in Word 2003 (.doc) and 2007 (.docx). I tried over 15 files.
-
Re: How to import a word document into a RichTextBox
Attached is a ZIP of the project file, including the descriptive document.
A few things have been "tidied up" but the basic functionality is unaltered. This has an issue date of 11-07-09.
It may well be that RobDog888's approach of calling up Wordpad from within VB6 will be the best solution (I just have to find out how to do it. I do not have any notes / books with me at present).
However the fact remains that on my Win 2000 machine I have a program which works with every .doc I put into it. Yet Dave Moran can not get it to work. The factors seem to be : 2000 vv XP and Word 2000 vv later versions.
I think it would be useful to get to the bottom of all this, if only to learn why a VB program should be operating system sensitive (if it is?).
To that end, I have asked Dave to kindly help further to investigate, and I will report back to this thread the result. I feel that the blow by blow detail would be too much for thread inclusion, but that the bottom line of this may potentially be of interest beyond just this application.
If anyone else can evaluate the program, their comments would be much appreciated.
camoore
-
1 Attachment(s)
Re: How to import a word document into a RichTextBox
I think that attachment may not have attached. Will try again :
-
Re: How to import a word document into a RichTextBox
Your program can work on different Windows versions, but the dependency is that Word is installed, which it may or may not be. This is your issue - Word.
Outline of the process:
Use the ShellExecute API to launch WordPad, passing the file path and name to it. Also, specify it as a hidden window so it doesn't flash on the screen or allow the user to interact with it.
Use FindWindow and FindWindowEx to locate your newly opened hidden WordPad window. Invoke the Save As menu, hook into the Save As dialog popup box (may be an issue if it's hidden) and pass the file name and file extension to save as.
Open the resulting rtf file into WordPad (easy part).
-
Re: How to import a word document into a RichTextBox
i don't believe that any other program except word (2007) can read 2007 docx files
-
Re: How to import a word document into a RichTextBox
Hi Colin:
Maybe I'm not doing this right but it still errors out saying the document is not recognized as a valid .doc file. I used the test document you sent. Maybe I don't understand your program logic. I copied your doc file to the c:\drive. Give a step by step of how your program is supposed to work, i.e., step one click this button, step two click that button, etc. There are so many Go to's in your code I can't quite follow the logic.
-
Re: How to import a word document into a RichTextBox
Tim,
Step by step :
1. Open a document in word. Then select save as, type MS Word and specify location as C:\Testdoc (the save as routine will add the .doc extension)
2. Open my program. The latest version 11/07/09 has a pale green background.
3. Click Select Test Doc
4. Click loadfile.text. Text should appear in the RT Box
5. Click DOCSORT. This should clean up the "mess" and leave you with plain text.
That's all there is to it on my machine!
Here I now have a version running into which I can import .txt, .rtf or .doc files and a single routine extracts clean text from any of them. Not bad eh? But not a lot of use if it will not run on other machines!
Westconn1 : could you possibly send me a .docx file with which to "play"?
RobDog888 : thank you for the tips about how to use Wordpad from within VB. I will work on this and find out just how to do it when I get home.
-
Re: How to import a word document into a RichTextBox
i only have one (someone sent me), but it will not open in wordpad
there are viewers for word 2007, from microsoft, but whether you can use that for anything, i don't know
-
Re: How to import a word document into a RichTextBox
Am just about to start looking at .docx files. I found this useful link :
http://www.docx2doc.com/docx_details
It seems that .docx files are ZIP folders of a group of associated files which collectively define a document.
But I bet y'all knew that.
camoore
-
Re: How to import a word document into a RichTextBox
Quote:
But I bet y'all knew that.
i had no idea
-
1 Attachment(s)
Re: How to import a word document into a RichTextBox
Westconn1's knowledge of VB6 is exceeded only by his modesty! I am delighted that both a guru and I have learnt something from this thread (doubtless I much more).
I will attach what I feel should be the last ZIPped folder of my program and descriptive paper. It is that dated 12-07-09.
The conclusions I have come to with the (as usual) excellent forum help are :-
1. Files of extension .txt, .rtf or .doc may be imported into a Rich Text Box using the RichTextBox1.Loadfile PATH, rtfTEXT method.
2. It is then not too difficult to extract clean text from all the .rtf and especially .doc file formats. Some reasonable layout is usually retained.
3. This leads to the possibility of a fairly simple subroutine, entirely within a VB program, which can import, clean up and make available (eg. as string variables) the information contained in those various source files.
4. The routine does not, as yet act upon .ZIP or .DOCX files (which I gather share many properties).
5. It is uncertain whether the program will work with source documents containing graphics and pictures. The application was aimed only at text / data. More than this is tbd.
I am very happy to exchange private messages with any interested parties, but feel that the general interest level in this theme is about exhausted. The above 5 points summarise what I have been able to achieve, with which I am well pleased. VMT. to all contributors.
camoore
Wales, UK
-
Re: [RESOLVED] How to import a word document into a RichTextBox
Here is some sample code that you can play with. This will allow you to copy and paste the Word doc into the rich text box:
Code:
Imports Word = Microsoft.Office.Interop.Word ' needs a reference to the Word interop assembly

' The Sub header was missing from the posted code; this name is a placeholder.
Private Sub PasteWordDocIntoRichTextBox()
    Dim wdApp As Word.Application = Nothing
    Dim wdDoc As Word.Document = Nothing
    Dim wdRng As Word.Range = Nothing
    Dim wdWordEditor As Object
    Dim wdApplLiefSchon As Boolean ' True if Word was already running
    Dim bWordWasRunning As Boolean
    Dim xEnd&
    Dim filename As String
    Dim dir As String
    Dim DirFile As String
    Dim prio As String
    Dim dofw As String
    Dim tm As String
    Dim tm2 As String
    Dim dt As Date = CDate(Project_Maintenance.DateEntered.Text)
    Dim time As DateTime = DateTime.Now
    Dim format As String = "dddd, MMMM d, yyyy, HH:mm:ss"
    Dim newdfilename As String
    Dim wdrun As String
    Dim To_Email As String
    Dim CC_Email As String
    Dim Cnt1 As Integer
    Dim Cnt2 As Integer
    Dim tst As String

    tm = (time.ToString(format))
    prio = Project_Maintenance.Priority.Text

    ' Build the full path of the document to load
    filename = "Change Order for Project " & Project_Maintenance.ProTitle.Text & ".docx"
    dir = "K:\Space Planner App 3.0\Projects\" & Project_Maintenance.ProTitle.Text & "\"
    DirFile = dir & filename
    'If Not System.IO.Directory.Exists(DirFile) Then
    'MsgBox("Error", 0, "Error")
    'End If

    ' Note whether any Word process is already running
    Dim p() As Process = Process.GetProcessesByName("WinWord")
    If p.Length = 0 Then ' No word instance opened
        wdrun = "No"
    Else ' Any Word instance opened
        wdrun = "Yes"
    End If
    p = Nothing

    'check if Word is open
    On Error Resume Next
    wdApp = GetObject(, "Word.Application")
    On Error GoTo errorMsgWord
    If wdApp Is Nothing Then
        'with this you prevent Word from opening a second time
        'again with Create Object
        wdApp = CreateObject("Word.Application")
        wdApp.Visible = False 'True to see what's going on
    Else
        wdApplLiefSchon = True
    End If

    ' Open the document, copy its whole content and paste it into the RichTextBox
    wdDoc = wdApp.Documents.Open(dir & filename)
    wdApp.Visible = False
    ' wdDoc.CommandBars("ProWritingAid").Visible = False
    wdDoc.Content.Copy()
    RichTextBox1.Paste()
    wdDoc.Close(Word.WdSaveOptions.wdDoNotSaveChanges)

    ' Only quit Word if this routine started it
    If Not wdApplLiefSchon Then wdApp.Quit()
    Exit Sub

errorMsgWord:
    MsgBox(Err.Description, 16, "Error")
    On Error Resume Next
    wdDoc = Nothing
    wdApp = Nothing
    wdRng = Nothing
End Sub
"We can’t solve problems with the same thinking we used to create them!"