[RESOLVED] Removing graphics from RTF data Memo field Access 97 table.
I have searched this forum and have not found anything specific to my needs in thread title searches.
I have a situation at a client site where the Access 97 MDB file is near the file size limit.
It turns out that they have been embedding graphics in some of their Service Description entries.
This Service Description data in RichText format is stored in the Access 97 table in a Memo type field.
Using a one-time fix VB6 program, I want to REMOVE / DELETE all graphics from this memo field and leave the rest of the RTF text data alone.
To narrow down which records I need to update, I was going to first use an Access Qry that would Select records from the tblInvoiceService where the tblInvoiceService.Text field contains "{\pict" characters.
After that I was going to read each of the selected records and REMOVE the graphics from the Memo field and Save the record.
My question is... What is the best way to REMOVE / DELETE the graphics data from the RTF data store in the Memo field?
Re: Removing graphics from RTF data Memo field Access 97 table.
exb1,
It sounds like you're on the right track to me.
If it were me, I'd just read through all the Memo fields with VB6, extract the pictures by analyzing the String, and then stuff it back into the Memo field. There are certainly RTF parcers, but it sounds like you're on the right track by just finding the PICT tags and deleting everything between them. And don't forget that those tags can be nested, so be sure you find the close-tag correctly.
Also, a couple of other thoughts:
1) RTF is ASCII 7-bit. This brings up a couple of thoughts. One, you may want to look into how those Memo fields are stored. Access has a "UnicodeCompression" flag, but it's been a while since I've taken a serious look at it. If all you're storing in these Memo fields is RTF, you're possibly giving 16 bits per character where you only need 7 bits per character. 56% waste.
2) That also brings up another idea. You may want to look at some 7-bit compression/decompression utilities. This has the downside of not being able to "see" the data if just called up in Access, but since it's RTF, you probably can't make much sense of it that way anyway. So that might not be a bad idea. Also, if you're careful, you could leave things as Unicode and do a 7-bit-to-16-bit compression (and vice-versa), and save quite a bit.
3) Also, and you probably know this one, but don't forget to Compact-and-Repair when you're done so that the space is recovered.
Best Of Luck,
Elroy
Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.
Re: Removing graphics from RTF data Memo field Access 97 table.
Elroy,
Thank you for the quick and detailed reply.
Thank you for the Unicode / Compression tips / ideas. Something to look into down the road. I was however, thinking of migrating to a "newer" version of Access that could be used with VB6 - perhaps Jet4?
fwiw... Our vertical market VB6 ap does Invoicing, Inventory, A/R, Customer Follow-Up, etc. The RTF data comes from user entries in a VB6 RichTextBox control.
Not sure what you mean by "those tags can be nested". If I were to use my own "parse logic" for the memo field, I thought that I would be looking for the first occurrence of "}\b\" (after finding the first "{\pict" characters). Is my thinking incorrect?
Re: Removing graphics from RTF data Memo field Access 97 table.
Jet 3.x did not have Unicode Compression anyway.
I have no clue where MS Access comes into the picture at all. Access is a parasite on VB's Jet.
And there hasn't been any excuse for using the creaky old, slow, problematic, non-Unicode, no longer supported Jet 3.x for ages anyway.
It is fairly easy to upgrade an old Jet 3.x MDB, either using JRO or using the JetComp utility from Microsoft. Or if you insist you could also do it via MS Access 2000 or newer... but that dumps in a lot of private MS Access structures, bloating the MDB a bit.
Re: Removing graphics from RTF data Memo field Access 97 table.
Hi exb1,
Well, I still use a Jet 4.0 (Access 2000) database (with the DAO360) and things work peachily. Althought dilettante routinely gives me a hard time about it.
It's just been too long for me to reach back to Jet 3.x, if that's actually what you're using. However, I suspect dilettante may be correct that there's no Unicode. If that's the case, I'm not sure an 8-bit-to-7-bit compression would gain enough to worry about.
That puts you back into deleting your pictures from the RTF strings.
And, regarding nesting, to my eyes, RTF has much the same "flavor" as XML (but not sure that analogy helps). You use open curlibrace ({) to start a section, and close curlibrace (}) to end that section. I'm thinking you can reliably depend on always having the same number of open-curlibraces as you have close-curlibraces. However, there can be sections-within-section, etc. For example:
{some section {nested section {further nesting}} back to main section}
I'm not positive you'll run into that with these embedded pictures, but you may. In other words, when you delete a section, you need to make sure you delete all the way through to the close-curlibrace that corresponds to the open-curlibrace that you found.
Good Luck,
Elroy
EDIT1: Also, you can escape a curli-brace to actually get a true curli-brace into the text. If you're deleting any text, you'd need to watch out for this as well.
Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.
Re: Removing graphics from RTF data Memo field Access 97 table.
Thank you Elroy.
fwiw... I stayed with Access 97 / JET3.5n because developing features for my customers was far more important than spending way too much resources / time keeping up with Microsoft's never-ending version / product upgrades. Doing such would cost me a lot of time where my end users would see "no improvements" from DB engine version updates. I stayed with VB6 because when MS went to VB.Net they got rid of the Variant data type. I heard that they brought back some type of way to use a Variant - but, I don't have the time or desire to research and implement all of the code changes required - especially when my end users would experience no benefits at all from this.
Many thanks again for your input today Elroy. May your weekend be blessed.
Re: Removing graphics from RTF data Memo field Access 97 table.
Never-ending upgrades? Lousy excuse. We've had Jet 4.0 for an incredibly long time and it is nearly as stable as VB6 itself... along with the bonus that like the VB6 runtime it ships as part of Windows.
Aside from fixes delivered via Windows Update as part of routine OS patching Jet 4.0 has been as stable as a rock for almost two decades now.
As for your actual problem though... have you met Tom ("tom" as in Text Object Model). It is pretty spiffy for this sort of thing:
Code:
Private Function ZapImages() As Boolean
'Returns True if zapped any.
With tdocRTB.Range(0, 0)
Do Until .MoveStart(tomObject, 1) = 0
'We found an object, we'll assume it is an image. Delete it. We
'could also find and delete any whitespace around it with a little
'more logic here:
.MoveEnd tomCharacter, 1 'An embedded object is 1 character in length.
.Text = vbNullString
ZapImages = True
Loop
End With
End Function
:
:
OpenDbAndQuery
RTB.Width = 7500 'Just big enough so wrapping doesn't go crazy.
With Recordset
Do Until .EOF
RTB.TextRTF = ![Memo].Value
If ZapImages() Then
![Memo].Value = RTB.TextRTF
.Update
End If
.MoveNext
Loop
End With
CloseDb
CompactDb
So hardly anything to it for cleaning one column of one table. Not a lot more to clean multiple tables or table columns.
Sample program showing RTF with images in it.
Sample program, run again after running a Cleaner on the database
Not perfect, but it ought to be enough to get you started I'd think. Nothing there that hasn't already been covered in all kinds of Q&A and CodeBank threads already though.
Sorry about the size of the attached archive, but it contains a small sample Jet 3.x MDB to work with.
Re: Removing graphics from RTF data Memo field Access 97 table.
dilettante,
Thanks for your additionalfeedback / knowledge.
Right now, that is way too much for me to get up to speed on. I will start with what I am already familiar with for now.
Good to know about the Jet 4.0. I believe that I will look into that if / when time permits. I don't like the idea that Windows Updates can possibly interfere with version of Jet being used. imho... Windows update IS a Virus.
This thread inspired me to go into my older versions of this class and rewrite it for more general use and release it here in the CodeBank.
Writing a program or script using BatchRtb you can "clean out embedded objects" even easier than in my example above, just .OpenDoc() then .SaveDoc() passing WithNoObjects = True.
For WSH script use you'll need to create an ActiveX DLL Project that exposes the BatchRtb class (and optionally the ShStream class).
This isn't tied to Jet MDBs, but should be usable for other DBMSs, files, etc.