-
Feb 16th, 2018, 03:43 PM
#1
Thread Starter
Hyperactive Member
Read MS Word Document Title
Sorry I named the thread wrong. The proper name is "Read MS Word Document Author Property"
I need to read the Author property of an MS Word document. When the document is created by my VB program, the program sets its Author property and saves the file. Later I need to read the Author back to decide how to process the document. I wrote this code:
Code:
Private Function IsDBBasedProposal(FileName As String) As Boolean
    Dim objWord As Object
    Dim wd As Object

    ' objWord is a local variable, so it is always Nothing at this point;
    ' a GetObject branch would never run, and a new Word instance is
    ' started on every call.
    Set objWord = CreateObject("Word.Application")

    ' Open read-only (third argument), since the document is only inspected
    Set wd = objWord.Documents.Open(FileName, , True)
    IsDBBasedProposal = (wd.BuiltInDocumentProperties("Author").Value = "CoordinatorDB")

    wd.Close False   ' close this document without saving
    objWord.Quit
    Set wd = Nothing
    Set objWord = Nothing
End Function
It gives exactly what I want, but it is slow. On a local drive it takes 2 seconds; over the network it takes even longer.
Is there any way to read the Author property of an MS Word document without creating a Word object and opening the document? I hope that would be faster.
I tried FileSystemObject with no success.
Thank you
Last edited by chapran; Feb 16th, 2018 at 03:58 PM.
-
Feb 16th, 2018, 08:41 PM
#2
Re: Read MS Word Document Title
Well, I can tell you "the hard way" to do it. And this is only if we're talking about DOCX (or DOTX, DOCM, DOTM) type files. These files are actually just zipped files with many other files within them. From VB6, I'd use wqweto's unzip utility (which can be found here). It's a pure VB6 solution with all source code available.
Once you're in a position to unzip a file, if you unzip one of these DOCX files, you'll see a sub-folder named docProps. Within this docProps folder, you'll find a file named core.xml.
This is a plain-text XML file (UTF-8 encoded), and will open with Notepad (or with VB6 Line Input statements, as long as the values are plain ASCII). Also, there are several XML parsers floating around these forums.
Within this core.xml file, you'll find a <dc:creator> tag. Within that tag will be the document's author. For instance, you might find something like:
<dc:creator>Elroy</dc:creator>
That's what you're supposedly looking for.
Best Of Luck,
Elroy
EDIT1: Just to put it all together, you might try opening one of these DOCX files with something like 7Zip and snooping around in it.
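As a minimal sketch of that last step, assuming core.xml has already been unzipped and read into a string: the helper name GetCreatorFromCoreXml is made up for illustration, and the late-bound MSXML2.DOMDocument object is assumed to be available on the machine.

```vb
Option Explicit

' Hypothetical helper: returns the text of <dc:creator> from the contents
' of docProps/core.xml, or "" if the tag is missing or the XML is invalid.
Public Function GetCreatorFromCoreXml(ByVal sCoreXml As String) As String
    Dim oDoc As Object
    Dim oNode As Object

    Set oDoc = CreateObject("MSXML2.DOMDocument")
    oDoc.async = False
    If Not oDoc.loadXML(sCoreXml) Then Exit Function

    ' Map the dc: prefix to the Dublin Core namespace so XPath can find the element
    oDoc.setProperty "SelectionLanguage", "XPath"
    oDoc.setProperty "SelectionNamespaces", _
        "xmlns:dc=""http://purl.org/dc/elements/1.1/"""

    Set oNode = oDoc.selectSingleNode("//dc:creator")
    If Not oNode Is Nothing Then GetCreatorFromCoreXml = oNode.Text
End Function
```

Using a parser rather than searching the raw text for the tag keeps the result correct when the value contains escaped characters such as &amp;.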
Last edited by Elroy; Feb 16th, 2018 at 08:44 PM.
Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.
-
Feb 16th, 2018, 08:52 PM
#3
Thread Starter
Hyperactive Member
Re: Read MS Word Document Title
Symantec Endpoint does not allow me to open the file; it reports a virus.
-
Feb 16th, 2018, 09:07 PM
#4
Thread Starter
Hyperactive Member
Re: Read MS Word Document Title
I found this:
https://support.microsoft.com/en-us/...erties-when-yo
I used the DLL and modified the code from example to this:
Code:
Public Function IsDBBasedProposal(FileName As String) As Boolean
    Dim objDocumentProps As Object
    Dim objSummProps As Object

    ' An empty file name cannot be opened; the function returns False by default
    If Len(FileName) = 0 Then
        Exit Function
    End If

    Set objDocumentProps = CreateObject("DSOFile.OleDocumentProperties")
    ' CBool(cdlOFNFileMustExist And cdlOFNReadOnly) evaluates to False
    ' (the two flags share no bits), so pass True explicitly to open read-only.
    objDocumentProps.Open FileName, True, 2 'dsoOptionOpenReadOnlyIfNoWriteAccess
    Set objSummProps = objDocumentProps.SummaryProperties
    IsDBBasedProposal = (objSummProps.Author = "CoordinatorDB")
    objDocumentProps.Close   ' release the file
End Function
It works perfectly.
-
Feb 16th, 2018, 09:08 PM
#5
Re: Read MS Word Document Title
Open what file? Your DOCX file? Open it how? With 7Zip? Not sure what to tell you. Maybe your DOCX file is infected.
-
Feb 16th, 2018, 09:11 PM
#6
Re: Read MS Word Document Title
It looks like a nice solution, even if it does require a new dependency. I'm sure that Dsofile.dll is just doing what I outlined. But hey ho. Now you don't have to jump through all the hoops I outlined.
Take Care,
Elroy
-
Feb 16th, 2018, 09:43 PM
#7
Thread Starter
Hyperactive Member
Re: Read MS Word Document Title
Originally Posted by Elroy
Open what file? Your DOCX file? Open it how? With 7Zip? Not sure what to tell you. Maybe your DOCX file is infected.
You gave me the link. There is a button to download the UnzipClass-master.zip file. I downloaded it and tried to unzip it. I got a message that one of the internal files is infected, and the unzip process was stopped.
-
Feb 17th, 2018, 01:05 AM
#8
Re: Read MS Word Document Title
Ahhh, you seem to be correct. It seems the Project1.exe in wqweto's example is infected. I've only ever taken the source code and incorporated it into my own project. I'll let wqweto know through a PM that his example is infected.
Take Care,
Elroy
-
Feb 17th, 2018, 02:14 AM
#9
Re: Read MS Word Document Title
Hi,
you could use the Shell.
This will go through all files located in the folder TestText.
I have text files, Excel files and Word files in that folder, and the authors of all of them
are returned.
Code:
Private Sub Command3_Click()
    Dim sFile As Variant
    Dim i As Long
    Dim oShell: Set oShell = CreateObject("Shell.Application")
    Dim oDir: Set oDir = oShell.Namespace("c:\TestText")

    For Each sFile In oDir.Items
        List1.AddItem oDir.GetDetailsOf(sFile, 9) ' column 9 is the Author on this system
    Next
    For i = 0 To 40
        Debug.Print i, oDir.GetDetailsOf(oDir.Items, i) ' list the column names that are available to read
    Next
End Sub
EDIT:
with Listview
Code:
Private Sub Command3_Click() ' same handler as above, ListView version
    Dim sFile As Variant
    Dim i As Long
    Dim Li As ListItem
    Dim oShell: Set oShell = CreateObject("Shell.Application")
    Dim oDir: Set oDir = oShell.Namespace("c:\TestText")

    With ListView1
        .View = lvwReport
        .LabelEdit = lvwManual
        .FullRowSelect = True
        .ListItems.Clear
        .ColumnHeaders.Clear
        .ColumnHeaders.Add , , "Filename", 3000
        .ColumnHeaders.Add , , "File Type", 1500, vbLeftJustify
        .ColumnHeaders.Add , , "Author", 1200, vbLeftJustify
        .ColumnHeaders.Add , , "Kb", 900, vbCenter
        For Each sFile In oDir.Items
            List1.AddItem oDir.GetDetailsOf(sFile, 5) ' column 5; the Author goes into the ListView below
            Set Li = .ListItems.Add(, , sFile.Name) ' use the item's name explicitly
            Li.SubItems(1) = oDir.GetDetailsOf(sFile, 2) ' file type
            Li.SubItems(2) = oDir.GetDetailsOf(sFile, 9) ' Author
            Li.SubItems(3) = oDir.GetDetailsOf(sFile, 1) ' size
        Next
    End With
    For i = 0 To 40
        Debug.Print i, oDir.GetDetailsOf(oDir.Items, i) ' list the available column names
    Next
End Sub
regards
Chris
Last edited by ChrisE; Feb 17th, 2018 at 02:47 AM.
to hunt a species to extinction is not logical !
since 2010 the number of Tigers are rising again in 2016 - 3900 were counted. with Baby Callas it's 3901, my wife and I had 2-3 months the privilege of raising a Baby Tiger.
-
Feb 17th, 2018, 05:19 AM
#10
Re: Read MS Word Document Title
Originally Posted by Elroy
It seems the Project1.exe in wqweto's example is infected. I've only ever taken the source code and incorporated it into my own project. I'll let wqweto know through a PM that his example is infected.
Thanks, Elroy.
It turns out it's the other way around. In the past 6 years this unzip class started being used by lots of droppers and other malware, so its signatures got included in most anti-virus databases, and Project1.exe is now falsely recognized as a virus or related malware.
Rest assured the sample executable is clean and the warning is a false alarm, but I just removed it from the repo -- it's a bad practice to keep binaries under source control anyway.
I personally would never use this unzip class as it's dog slow, being native VB6 implementation. Latest ZipArchive would be much better for the job of browsing & uncompressing .docx files, and much faster for sure.
It's a single class too, and can be trimmed to an extract-only version (that can only unzip) with conditional compilation like ZIP_NOCOMPRESS = 1
cheers,
</wqw>
-
Feb 17th, 2018, 09:42 AM
#11
Re: Read MS Word Document Title
Ahhh, thanks for getting this straightened out wqweto. It's actually your ZipArchive version that I use.
I just did a quick Google search and bumped into the other one to give chapran a way to unzip files in VB6.
Chapran, if you still have your ears on and you're at all interested in digging out the author the "hard" way, use wqweto's latest ZipArchive to get it done. It sounds like you're on your way though with the Dsofile.dll from Microsoft. Sorry about the somewhat old link.
Take Care,
Elroy
-
Feb 17th, 2018, 09:47 AM
#12
Thread Starter
Hyperactive Member
Re: Read MS Word Document Title
Thanks a lot to all who tried to help me.
The way I found is much easier than the way recommended here.
-
Feb 17th, 2018, 09:51 AM
#13
Re: Read MS Word Document Title
Originally Posted by chapran
Thanks a lot to all who tried to help me.
The way I found is much easier than the way recommended here.
and how ?
-
Feb 17th, 2018, 10:06 AM
#14
Thread Starter
Hyperactive Member
Re: Read MS Word Document Title
Originally Posted by ChrisE
and how ?
I posted the code with DSOFile above
-
Feb 17th, 2018, 10:21 AM
#15
Re: Read MS Word Document Title
Originally Posted by chapran
I posted the code with DSOFile above
didn't see that, but adding a DLL just to get the Author from documents ?
Last edited by ChrisE; Feb 17th, 2018 at 10:34 AM.
-
Feb 17th, 2018, 10:27 AM
#16
Thread Starter
Hyperactive Member
Re: Read MS Word Document Title
I tried the download suggested here. It was infected. Then I was informed that the infected file was removed. I downloaded it again. No virus - fine. I started the project, but it cannot load the Form because it doesn't see the , so I created a new project hoping to add the class. When I added the class, its code had many red lines. My knowledge doesn't allow me to find what's wrong with those lines.
So I decided to use the approach I found somewhere.
The way offered here is too complex for my limited knowledge.
Thank you.
-
Feb 17th, 2018, 10:33 AM
#17
Re: Read MS Word Document Title
ah I see,
well glad you got it working now
regards
Chris
-
Feb 17th, 2018, 11:22 AM
#18
Re: Read MS Word Document Title
FYI, here is a code snippet to retrieve docProps using cZipArchive as Elroy described in second post:
vb Code:
Option Explicit

'--- for WideCharToMultiByte
Private Const CP_UTF8 As Long = 65001
Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, lpMultiByteStr As Any, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

Private Sub Form_Load()
    Dim oProps As Object
    Dim vKey As Variant

    Set oProps = GetOpenXmlDocProps("D:\TEMP\Biff12\clip2.xlsb")
    For Each vKey In oProps.Keys
        Debug.Print vKey & ": " & oProps.Item(vKey)
    Next
End Sub

Public Function GetOpenXmlDocProps(FileName As String) As Object
    Dim oRetVal As Object
    Dim baCoreXml() As Byte
    Dim oRoot As Object
    Dim oNode As Object

    Set oRetVal = CreateObject("Scripting.Dictionary")
    oRetVal.CompareMode = vbTextCompare
    With New cZipArchive
        If Not .OpenArchive(FileName) Then
            GoTo QH
        End If
        If Not .Extract(baCoreXml, "docProps/core.xml") Then
            GoTo QH
        End If
    End With
    With CreateObject("MSXML2.DOMDocument")
        .LoadXml FromUtf8Array(baCoreXml)
        .setProperty "SelectionNamespaces", "xmlns:cp=""http://schemas.openxmlformats.org/package/2006/metadata/core-properties"""
        Set oRoot = .selectSingleNode("//cp:coreProperties")
        If oRoot Is Nothing Then
            GoTo QH
        End If
        For Each oNode In oRoot.childNodes
            oRetVal.Item(oNode.baseName) = oNode.Text
        Next
    End With
QH:
    Set GetOpenXmlDocProps = oRetVal
End Function

Public Function FromUtf8Array(baText() As Byte) As String
    Dim lSize As Long

    FromUtf8Array = String$(2 * UBound(baText), 0)
    lSize = MultiByteToWideChar(CP_UTF8, 0, baText(0), UBound(baText) + 1, StrPtr(FromUtf8Array), Len(FromUtf8Array))
    FromUtf8Array = Left$(FromUtf8Array, lSize)
End Function
The `GetOpenXmlDocProps` function usually fetches "creator", "lastModifiedBy", "created" and "modified" as keys of the returned dictionary.
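To tie this back to the original question, here is a minimal sketch of an author check built on `GetOpenXmlDocProps`. The wrapper name IsDBBasedProposalDocx is hypothetical, and it only applies to the zip-based formats (.docx, .xlsx and so on), not to old binary .doc files.

```vb
' Hypothetical wrapper around GetOpenXmlDocProps from the snippet above.
' Returns True when the creator stored in docProps/core.xml is "CoordinatorDB".
Public Function IsDBBasedProposalDocx(FileName As String) As Boolean
    Dim oProps As Object

    Set oProps = GetOpenXmlDocProps(FileName)
    ' The dictionary comes back empty when the file is not a zip archive
    ' or has no docProps/core.xml, so check for the key first.
    If oProps.Exists("creator") Then
        IsDBBasedProposalDocx = (oProps.Item("creator") = "CoordinatorDB")
    End If
End Function
```

The key lookup is case-insensitive because the dictionary is created with vbTextCompare; the comparison of the value itself follows the module's Option Compare setting.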
cheers,
</wqw>
Last edited by wqweto; Feb 17th, 2018 at 11:27 AM.
-
Feb 17th, 2018, 09:44 PM
#19
Re: Read MS Word Document Title
chapran, DSOFile is working for you for newer docx files? Gave me errors for the new zip-based ones.
ChrisE, same for going through the shell for me; it works with docx for you? I tried the code you posted just to be sure; I can't seem to access the properties through the files' IPropertyStore either. Was really hoping to not have to parse xml.
-
Feb 18th, 2018, 04:30 AM
#20
Re: Read MS Word Document Title
Hi ,
see the image for results with different files
you could upload a docx File
I will try to read the Author with the code I have.
regards
Chris
-
Feb 18th, 2018, 09:05 AM
#21
Re: Read MS Word Document Title
Chris,
It'd be much better if you just posted your code. Two reasons: 1) I believe they frown on the uploading of Office files, as they're not ANSI and they're not VB6; and 2) if your code works, then others could use it, specifically for the purposes of this thread.
Best Regards,
Elroy
-
Feb 18th, 2018, 09:26 AM
#22
Re: Read MS Word Document Title
Originally Posted by Elroy
Chris,
It'd be much better if you just posted your code. Two reasons: 1) I believe they frown on the uploading of Office files, as they're not ANSI and they're not VB6; and 2) if your code works, then others could use it, specifically for the purposes of this thread.
Best Regards,
Elroy
Hi Elroy,
see Post#9
EDIT: with Listview
that is the Code I used
I tried this under Win7 64-bit. I only have VB6 installed on that PC; there is no Office installed. As you can see from the image, no Author is shown, but I don't
receive any error running the code.
I wonder why fafalone is getting an error.
Here is an image (Win7 64-bit, no Office installed).
EDIT:
changed the code and now I get the Author from .doc and .xls Files
Code:
Private Sub Command3_Click() ' same handler as in post #9, ListView version
    Dim sFile As Variant
    Dim i As Long
    Dim Li As ListItem
    Dim oShell: Set oShell = CreateObject("Shell.Application")
    Dim oDir: Set oDir = oShell.Namespace("c:\TestText")

    With ListView1
        .View = lvwReport
        .LabelEdit = lvwManual
        .FullRowSelect = True
        .ListItems.Clear
        .ColumnHeaders.Clear
        .ColumnHeaders.Add , , "Filename", 3000
        .ColumnHeaders.Add , , "File Type", 1500, vbLeftJustify
        .ColumnHeaders.Add , , "Author", 1200, vbLeftJustify
        .ColumnHeaders.Add , , "Kb", 900, vbCenter
        For Each sFile In oDir.Items
            List1.AddItem oDir.GetDetailsOf(sFile, 5) ' column 5; the Author goes into the ListView below
            Set Li = .ListItems.Add(, , sFile.Name) ' use the item's name explicitly
            Li.SubItems(1) = oDir.GetDetailsOf(sFile, 2) ' file type
            Li.SubItems(2) = oDir.GetDetailsOf(sFile, 20) ' Author is column 20 on this Win7 system
            Li.SubItems(3) = oDir.GetDetailsOf(sFile, 1) ' size
        Next
    End With
    For i = 0 To 40
        Debug.Print i, oDir.GetDetailsOf(oDir.Items, i) ' list the available column names
    Next
End Sub
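The Author column was 9 in post #9 and 20 here, so the index evidently depends on the system. A minimal sketch of looking the column up by its header text instead of hard-coding it follows. The helper name FindDetailsColumn and the upper bound of 300 are made up for illustration, and the header text ("Authors" is assumed here for an English Win7) is localized and may differ on other systems.

```vb
' Hypothetical helper: finds the details column whose header matches
' sHeader (for example "Authors"); returns -1 if no column matches.
Public Function FindDetailsColumn(ByVal oDir As Object, ByVal sHeader As String) As Long
    Dim i As Long

    FindDetailsColumn = -1
    For i = 0 To 300   ' arbitrary upper bound; the number of columns differs between Windows versions
        ' Passing the Items collection returns the column header, as in the Debug.Print loop above
        If StrComp(oDir.GetDetailsOf(oDir.Items, i), sHeader, vbTextCompare) = 0 Then
            FindDetailsColumn = i
            Exit Function
        End If
    Next
End Function
```

With this, the Author line in the loop could read the index once before the loop (lAuthorCol = FindDetailsColumn(oDir, "Authors")) and call oDir.GetDetailsOf(sFile, lAuthorCol) only when lAuthorCol is not -1.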
regards
Chris
Last edited by ChrisE; Feb 18th, 2018 at 12:41 PM.
-
Feb 18th, 2018, 12:58 PM
#23
Re: Read MS Word Document Title
@ChrisE: Ahhh, very good. I'll get out of the way now. Y'all take care.
-
Feb 18th, 2018, 04:50 PM
#24
Thread Starter
Hyperactive Member
Re: Read MS Word Document Title
Originally Posted by fafalone
chapran, DSOFile is working for you for newer docx files?
My application uses .doc documents as templates because these features were written before Microsoft introduced .docx. So I did not try with .docx, and for my situation it is not important. My current goal is to find a way to identify how a particular document was created. The old documents either have a blank Author property or a value not set by my program. New documents will have special values (for instance "From DB", "Based On Template", etc.) in Author, which will be used to decide how to work with the document.
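A minimal sketch of that decision step, assuming the Author string has already been read by one of the methods in this thread. The marker strings are the examples from the post; the enum and function names are made up for illustration.

```vb
' Hypothetical classification of a document by the marker stored in Author.
Public Enum ProposalSource
    psUnknown = 0      ' blank Author, or a value not set by the program (old document)
    psFromDB
    psFromTemplate
End Enum

Public Function ClassifyByAuthor(ByVal sAuthor As String) As ProposalSource
    Select Case Trim$(sAuthor)
        Case "From DB"
            ClassifyByAuthor = psFromDB
        Case "Based On Template"
            ClassifyByAuthor = psFromTemplate
        Case Else
            ClassifyByAuthor = psUnknown
    End Select
End Function
```

Note that Select Case compares strings case-sensitively under the default Option Compare Binary, so the markers must be written exactly as the program saved them.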
-
Feb 19th, 2018, 07:45 PM
#25
Re: Read MS Word Document Title
@ChrisE, it's still not working for docx/xlsx files created with Office 2013. It's not that a runtime error is raised it just returns blanks for authors (20) and title (21) (the numbers are from the list of properties that your code prints out at the bottom)...
5: 2/17/2018 9:32 PM
2: Microsoft Word Document
20:
21:
1: 29.5 KB
5: 2/17/2018 9:34 PM
2: Microsoft Excel Worksheet
20:
21:
1: 10.6 KB
5: 2/17/2018 9:34 PM
2: Microsoft Word Document
20:
21:
1: 16.5 KB
I'm on Win7 x64 without Office installed as well. Windows Explorer displays, and can edit these properties, I wonder if it's really just manually digging into the xml. (Your code, like my IPropertyStore based method, works fine on some old Office 97-2003 docs I have; just not the new xml based ones)
Last edited by fafalone; Feb 19th, 2018 at 07:49 PM.
-
Feb 20th, 2018, 01:15 AM
#26
Re: Read MS Word Document Title
Originally Posted by fafalone
@ChrisE, it's still not working for docx/xlsx files created with Office 2013. It's not that a runtime error is raised it just returns blanks for authors (20) and title (21) (the numbers are from the list of properties that your code prints out at the bottom)...
5: 2/17/2018 9:32 PM
2: Microsoft Word Document
20:
21:
1: 29.5 KB
5: 2/17/2018 9:34 PM
2: Microsoft Excel Worksheet
20:
21:
1: 10.6 KB
5: 2/17/2018 9:34 PM
2: Microsoft Word Document
20:
21:
1: 16.5 KB
I'm on Win7 x64 without Office installed as well. Windows Explorer displays, and can edit these properties, I wonder if it's really just manually digging into the xml. (Your code, like my IPropertyStore based method, works fine on some old Office 97-2003 docs I have; just not the new xml based ones)
It sure looks like M$ wants us digging. If you think about it... we just want to know the Author...
When I have time I'll get a new version of Office and install it, but I don't think it will change anything.
wqweto has shown us how to do it; it would have been nice the other way.
regards
Chris