Need your help on this one. I'm trying to extract the user's name from each line of a text file. Below are two sample lines.
Line 1
New Hire: Adeng Adeng Hire Date: February 16,2015 Location: Australia New Starter Position : Collection Specialist Manager: Decena Koh Unique Account ID: AdengDe Contractor/Fulltime: Contractor
Line 2
New Hire Name: Erwin Mar Location: USA New Starter Position : Collection Specialist Manager: Mark Tom Unique Account ID: McnamaDi Contractor/Fulltime: Contractor
Below are my scripts.
Code:
MyFile = FreeFile
Open "C:\NewHire.txt" For Input As #MyFile
Do
    Line Input #MyFile, ReadLine
    Position = InStr(1, ReadLine, ":")
    If Position > 0 Then
        EMail = Mid(ReadLine, Position + 2, Len(ReadLine) - 1)
    End If
Loop Until EOF(MyFile)
Close #MyFile
MsgBox EMail
The output I'm getting is below:
Adeng Adeng Hire Date: February 16,2015 Location: Australia New Starter Position : Collection Specialist Manager: Decena Koh Unique Account ID: AdengDe Contractor/Fulltime: Contractor
and
Erwin Mar Location: USA New Starter Position : Collection Specialist Manager: Mark Tom Unique Account ID: MarkEr Contractor/Fulltime: Contractor
I only want to get the names, which are Adeng Adeng and Erwin Mar. Please help. These users are only examples; there are also other users who have longer names.
Will all 'names' be first and last, or might there also be middle names and/or middle initials? Also, would any first name or last name be in two parts, like 'Rip Van Winkle'?
If so, I am sad to say I could not give you any good advice on how to get it done. Because your input file does not have the same format on each line, I don't see any way to do this. Now, if you already have a database with all the NAMES in it, then you could match strings and figure it out, but without that...
For example, you could capture all characters between the first and second colons, but in the example you would end up with "Adeng Adeng Hire Date" and "Erwin Mar Location". Now, IF the words "Hire Date" OR "Location" always follow each 'name', then you could compare those characters, delete " Hire Date" or " Location", and end up with your names. HOWEVER, if one of those two sets of words did NOT follow each 'name' every time, I would be at a loss.
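A minimal sketch of that idea in VB6, assuming the only labels that can follow a name are "Hire Date" and "Location", as in the two sample lines. ExtractName and its suffix list are hypothetical names, not part of the original script:

```vb
' Returns the text between the first and second colons,
' with a trailing " Hire Date" or " Location" label removed.
' ASSUMPTION: one of those two labels always follows the name.
Function ExtractName(ByVal ReadLine As String) As String
    Dim firstColon As Long, secondColon As Long
    Dim candidate As String
    Dim suffixes As Variant, i As Long

    firstColon = InStr(1, ReadLine, ":")
    If firstColon = 0 Then Exit Function          ' no colon: return ""
    secondColon = InStr(firstColon + 1, ReadLine, ":")
    If secondColon = 0 Then secondColon = Len(ReadLine) + 1

    ' Everything between the two colons, e.g. "Adeng Adeng Hire Date"
    candidate = Trim$(Mid$(ReadLine, firstColon + 1, secondColon - firstColon - 1))

    suffixes = Array(" Hire Date", " Location")
    For i = LBound(suffixes) To UBound(suffixes)
        If Right$(candidate, Len(suffixes(i))) = suffixes(i) Then
            candidate = Left$(candidate, Len(candidate) - Len(suffixes(i)))
            Exit For
        End If
    Next i

    ExtractName = Trim$(candidate)
End Function
```

Calling ExtractName(ReadLine) inside the read loop returns "Adeng Adeng" for the first sample line and "Erwin Mar" for the second. Any line where some other label follows the name would still come back with that label attached.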
Agree with Sam. Your input file leaves a lot to be desired.
Do you have any influence on how this input file is created?
If you do, then getting it in INI format would make your life a lot easier.
Last edited by Zvoni.
I'd guess the real data isn't anywhere near as poorly formed, but all we have to go on are two lines. If we examine that we're left with the conclusion that the "keys" have multiple aliases, perhaps even more than we see above.
I'd probably tackle it using a dictionary of the aliases that maps them to column indexes. Something like:
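A minimal sketch of what such an alias map could look like (this is not the attached demo's code). It assumes the late-bound Scripting.Dictionary object is available; BuildAliasMap and the column numbers are illustrative, and the aliases are taken from the two sample lines:

```vb
' Maps each field-name alias to a column index.
' ASSUMPTION: column numbers are illustrative; aliases come from the sample lines.
Function BuildAliasMap() As Object
    Dim aliases As Object
    Set aliases = CreateObject("Scripting.Dictionary")
    aliases.CompareMode = vbTextCompare   ' match keys case-insensitively

    aliases.Add "New Hire", 0
    aliases.Add "New Hire Name", 0        ' second alias for the same column
    aliases.Add "Hire Date", 1
    aliases.Add "Location", 2
    aliases.Add "New Starter Position", 3
    aliases.Add "Manager", 4
    aliases.Add "Unique Account ID", 5
    aliases.Add "Contractor/Fulltime", 6

    Set BuildAliasMap = aliases
End Function
```

With this in place, BuildAliasMap()("New Hire Name") and BuildAliasMap()("New Hire") both yield column 0, so either spelling of the key ends up in the same field.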
From there it is still pretty ugly. Here's a quick hack that at least seems to work. Demo attached since it's a little ugly to look at posted here as text.
Yeah, this sort of thing can fall apart fast in the wild.
Of course things would be quite a bit easier if there was a unique delimiter between name/value pairs as well as the colon between each name and value. Even at its best this is a poor choice for formatting data, if only because repeating the names eats up a lot of space if there are many rows.
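To illustrate how much simpler a delimited format would be, here is a minimal sketch that assumes a hypothetical "|" between name/value pairs. The sample file does not use this delimiter, and GetField is an illustrative name:

```vb
' Looks up one field in a line whose name/value pairs are separated by "|".
' ASSUMPTION: the "|" delimiter is hypothetical; the sample file does not use it.
Function GetField(ByVal ReadLine As String, ByVal FieldName As String) As String
    Dim pairs() As String
    Dim i As Long, colonPos As Long

    pairs = Split(ReadLine, "|")
    For i = LBound(pairs) To UBound(pairs)
        colonPos = InStr(1, pairs(i), ":")
        If colonPos > 0 Then
            ' Compare the text before the colon with the requested field name.
            If StrComp(Trim$(Left$(pairs(i), colonPos - 1)), FieldName, vbTextCompare) = 0 Then
                GetField = Trim$(Mid$(pairs(i), colonPos + 1))
                Exit Function
            End If
        End If
    Next i
End Function
```

For a line such as "New Hire: Adeng Adeng|Hire Date: February 16,2015|Location: Australia", GetField(line, "New Hire") returns "Adeng Adeng" without any guessing about where the name ends.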
I run into a lot of "garbage feast" data. Almost half of the jobs I get anymore seem to be data cleansing of one kind or another.
Most of the work I do involves a rip and replace on trashbag .Net "applications" created by the Legion of Morts with clean burning, long lasting VB6 alternatives. More often than not the data involved is also a travesty.
Sadly there is no licensing requirement for writing code.
Last edited by dilettante; Feb 25th, 2015 at 08:58 AM.
Your code is really cool. It would be even better if you let the user modify the "keys" so that they could store data in your app. It looks like a small database.
In my data there's a position "Finance Manager" that the script does not accept, because key number 4 is already the word "Manager".
If I change the position name to "Finance" in my data, the script runs properly.
How can I avoid this? It's reading the word "Manager" on each line and it's not accepting it.
Within ParseKeys the program builds a list of "field name keyword aliases" and the column number they map to. This list (a Collection) is built in order, longest alias first. When this list is used within ParseData matches are made from beginning to end.
So previously we had a column 4 with one alias "Manager" but now we need a new column. Call this column 7 with one alias "Finance Manager" and since that is a longer key word it will be detected before just "Manager" and everything comes out fine.
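A minimal sketch of the longest-first idea, not the demo's actual ParseKeys/ParseData code. MatchAlias and the two parallel arrays are hypothetical stand-ins for the Collection described above:

```vb
' Returns the column of the first alias, in list order, that occurs in Text
' followed by a colon, or -1 if none does.
' ASSUMPTION: Aliases is ordered longest first and Columns(i) belongs to
' Aliases(i); both arrays are hypothetical stand-ins for the demo's Collection.
Function MatchAlias(ByVal Text As String, Aliases As Variant, Columns As Variant) As Long
    Dim i As Long
    MatchAlias = -1
    For i = LBound(Aliases) To UBound(Aliases)
        ' The first hit wins, so "Finance Manager" is tried before "Manager".
        If InStr(1, Text, Aliases(i) & ":", vbTextCompare) > 0 Then
            MatchAlias = Columns(i)
            Exit Function
        End If
    Next i
End Function
```

MatchAlias("Finance Manager: Decena Koh", Array("Finance Manager", "Manager"), Array(7, 4)) returns 7. If the aliases were listed shortest first, "Manager:" would be found inside the longer key and the function would return 4 instead, which is why the order matters.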
You can also use this code:
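Option Explicit

Dim prg As String          ' whole file joined into one space-separated string
Dim fullprg() As String    ' prg split into individual words

Sub Main()
    LoadProgram "C:\NewHire.txt"
    FIND
End Sub

Sub LoadProgram(prgName As String)
    Dim temp As String
    Dim fileNum As Integer

    fileNum = FreeFile
    Open prgName For Input As #fileNum   ' use the path passed in, not a hard-coded one
    Do While Not EOF(fileNum)
        Line Input #fileNum, temp
        prg = prg & " " & temp
    Loop
    Close #fileNum

    prg = Replace(prg, vbTab, vbNullString)
    fullprg = Split(prg, " ")
End Sub

Private Sub FIND()
    Dim FINDA As String
    Dim loc As Long

    FINDA = InputBox("Type the key you want to find data stored in:")
    If Len(FINDA) = 0 Then Exit Sub      ' nothing typed or dialog cancelled

    ' Stop one short of the last word so fullprg(loc + 1) is always valid.
    For loc = 0 To UBound(fullprg) - 1
        If fullprg(loc) = FINDA & ":" Then
            ' Shows only the single word that follows the key.
            MsgBox fullprg(loc + 1)
        End If
    Next loc
End Sub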
dilettante, sorry I did not make this clear to you. I have a lot of positions in my data (Finance Manager, HR Manager, IT Manager, etc.) and those go under the Position column. How can I avoid it to be read by