-
Mar 13th, 2025, 04:03 PM
#1
[RESOLVED] Replace/Regex.Replace?
I have a large badly formatted wordlist...
ability (n) able
(adj) • be able to
about (adv & prep)
• about 500 students (adv) •
The film is about a small
boy.
(prep) above (adj,
adv & prep) abroad
(adv) absent (adj)
absolutely (adv)
• The movie was absolutely
awful. accent(n)
• She has a beautiful
French accent. accept (v) access
(n)
• disabled access •
internet access accident
(n) accommodation (n)
accompany (v) according to
(prep phr) account (n)
accountant (n) accurate (adj)
ache (n) achieve (v) across (adv
& prep) act (n & v)
• in the second act (of the
play)
(n)
• to act in a play (v)
• to act strangely (v) action
(n) active (adj)
actor (n) actress
(n) actually
(adv)
I need to format that so each (dictionary) word is on its own line, with each line ending in its part-of-speech tag, such as (n) or (v).
There are bullet-point words and sentences in there that I want to remove completely, so the result would be...
ability (n)
able (adj)
about (adv & prep)
above (adj, adv & prep)
abroad (adv)
absent (adj)
absolutely (adv)
accent (n)
accept (v)
access (n)
accident (n)
accommodation (n)
accompany (v)
account (n)
accountant (n)
accurate (adj)
ache (n)
achieve (v)
across (adv & prep)
act (n & v)
action (n)
active (adj)
actor (n)
actress (n)
actually (adv)
Can anyone help with a regex pattern or something which can do that?
- Coding Examples:
- Features:
- Online Games:
- Compiled Games:
-
Mar 14th, 2025, 02:19 AM
#2
Re: Replace/Regex.Replace?
Ouch!
Not going to be easy...
I see some issues:
ability (n) able
(adj) • be able to
about (adv & prep)
• about 500 students (adv) •
The film is about a small
boy.
(prep) above (adj,
adv & prep) abroad
(adv) absent (adj)
absolutely (adv)
• The movie was absolutely
awful. accent(n)
• She has a beautiful
French accent. accept (v) access
(n)
• disabled access •
internet access accident
(n) accommodation (n)
You want to remove "• be able to " --> OK
But you also want to remove "• about 500 students (adv) • " (including "students (adv)"!)
and this: "The film is about a small
boy.
(prep)", stopping there but keeping "above (adj,"
You want to remove "• The movie was absolutely
awful.", stopping there but keeping "accent(n)"
You want to remove "• She has a beautiful
French accent.", stopping there but keeping "accept (v)"
And so on... and that is just at first glance.
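The boundary cases listed above can be made concrete with a small sketch. It is written in Python purely for illustration (the thread names no language beyond Regex.Replace), and the tag vocabulary and the helper name `candidates` are assumptions rather than anything from the posts. It joins the broken lines and lists every word that sits directly in front of a part-of-speech tag, which shows why a single replace is unlikely to be enough: an example phrase such as "about 500 students (adv)" yields a candidate that looks just like a real entry.

```python
import re

# Assumed tag vocabulary; extend it to match the real word list.
POS = r"(?:prep phr|prep|adj|adv|n|v)"
# A tag is one or more parts of speech joined by "," or "&", in parentheses.
TAG = rf"\(\s*({POS}(?:\s*[,&]\s*{POS})*)\s*\)"
# A candidate entry is a single word immediately followed by a tag.
CANDIDATE = re.compile(rf"([A-Za-z]+)\s*{TAG}")


def candidates(raw):
    """Return (word, tag) pairs for every word directly followed by a tag."""
    text = " ".join(raw.split())  # undo the arbitrary line breaks
    return [(m.group(1), m.group(2)) for m in CANDIDATE.finditer(text)]


# Excerpt of the list from post #1.
SAMPLE = """ability (n) able
(adj) • be able to
about (adv & prep)
• about 500 students (adv) •
The film is about a small
boy.
(prep) above (adj,
adv & prep) abroad
(adv)"""

for word, tag in candidates(SAMPLE):
    print(f"{word} ({tag})")
```

On this excerpt the pattern picks up the real entries, but it also reports "students (adv)", so the candidates still need a second filtering step or a manual check.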
Last edited by Zvoni.
-
Mar 14th, 2025, 02:25 AM
#3
Re: Replace/Regex.Replace?
Yeah, it’s tricky. I’m trying a solution based on splitting it word by word, then weeding out the parts I don’t want. I have the complete file as a properly formatted PDF, and I can also convert it to a properly formatted DOCX. The problem is that the words are arranged in columns.
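One possible way to do the weeding out, offered only as a rough sketch and not as the approach actually used in this thread: the real headwords are in alphabetical order, while words picked up from example phrases (students, play, strangely) tend to break that order, so keeping the longest strictly increasing run of candidate words drops them. The code is Python for illustration; the tag vocabulary and the names `longest_increasing` and `extract_entries` are assumptions. It only handles single-word headwords, so a phrase entry such as "according to (prep phr)" would be lost, and a stray word that happens to fit the alphabetical order would survive, so the result still needs a review.

```python
import re
from bisect import bisect_left

# Assumed tag vocabulary; extend it to match the real word list.
POS = r"(?:prep phr|prep|adj|adv|n|v)"
TAG = rf"\(\s*({POS}(?:\s*[,&]\s*{POS})*)\s*\)"
CANDIDATE = re.compile(rf"([A-Za-z]+)\s*{TAG}")


def longest_increasing(keys):
    """Return indices of one longest strictly increasing subsequence."""
    tail_idx, tail_key = [], []   # smallest known tail for each run length
    prev = [None] * len(keys)     # back-pointers used to rebuild the run
    for i, key in enumerate(keys):
        k = bisect_left(tail_key, key)
        prev[i] = tail_idx[k - 1] if k else None
        if k == len(tail_key):
            tail_idx.append(i)
            tail_key.append(key)
        else:
            tail_idx[k] = i
            tail_key[k] = key
    run = []
    i = tail_idx[-1] if tail_idx else None
    while i is not None:
        run.append(i)
        i = prev[i]
    return run[::-1]


def extract_entries(raw):
    """Return 'word (tag)' lines, keeping only alphabetically ordered words."""
    text = " ".join(raw.split())  # undo the arbitrary line breaks
    found = [(m.group(1), m.group(2)) for m in CANDIDATE.finditer(text)]
    keep = longest_increasing([word.lower() for word, _ in found])
    return [f"{found[i][0]} ({found[i][1]})" for i in keep]


# Two excerpts of the list from post #1, joined together.
SAMPLE = """ability (n) able
(adj) • be able to
about (adv & prep)
• about 500 students (adv) •
The film is about a small
boy.
(prep) above (adj,
adv & prep) abroad
(adv) absent (adj)
ache (n) achieve (v) across (adv
& prep) act (n & v)
• in the second act (of the
play)
(n)
• to act in a play (v)
• to act strangely (v) action
(n) active (adj)
actor (n) actress
(n) actually
(adv)"""

print("\n".join(extract_entries(SAMPLE)))
```

The ordering check is a heuristic, not a parser: it works on these excerpts because every stray candidate sorts after the headword that follows it.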
-
Mar 14th, 2025, 02:29 AM
#4
Re: Replace/Regex.Replace?
I might try to copy and paste it into Excel. If it splits it into the fields I want, I think I can work with that…
-
Mar 14th, 2025, 10:33 AM
#5
Re: [RESOLVED] Replace/Regex.Replace?
I tried pasting the tabulated data into Excel and refining the list with VBA. So far, it hasn’t been too bad. I’d prefer a 10-second algorithm, but this text is particularly nasty, with all of those bullet points and with truncations in places where they’re hard to fix… I’ll mark this resolved.