Hey folks. I have a database of names and companies and i am trying to normalize the list against a reference list of companies (the Fortune 500). The data looks like the following:
John Doe Abbott Labs
Jane Doe Abbott Laboratories
Jeff Doe Abbott Medical Devices
and so on...
What I am trying to do is figure out who in my list works for Abbott Laboratories. The name in my Fortune 500 company list is "Abbott Laboratories", but just this one example can be found 10 different ways within my list of people. If I were to shorten Abbott Laboratories to just "Abbott", is there some efficient way for me to run my list of names against my Fortune 500 list such that any time "Abbott" is found *within* the person's company name they will be flagged?
To do this with just one name I would normally use the "Find" function as it would find "Abbott" within any cell that has it (even in a cell that had "Abbott Labs" or other permutations). To bounce a list against another list I would usually use a "Vlookup" function. But that only works for exact matches (so far as I know - maybe I'm wrong). So what I need is sort of a combination of the two: bounces large list against large list, and finds matches in partially matching cells. Help would be appreciated!
Help using Excel to find semi-structured data in a database
Answers
the below function returns a TRUE/FALSE if the cell contains (or not) the Abbott
=ISNUMBER(SEARCH("Abbott",A1))
Yes, that and the "FIND" functions allow me to do lookups one at a time. What I'm hoping for is some function (or set of functions) that allows me to look up a bunch of these *at one time*. Such as when you use VLOOKUP and can run a list of values against another list of values. Thanks, though.
Hey Dan-
Did you find your answer? If not, I'd be happy to help you figure it out. Have you ever worked with Access to handle large data?
Bobby Bluford
BobbyBluford.com
I did something like this with public companies (find all the individuals in my list of 40,000 names and tag them with a normalized company name, industry and SIC code of the public company they work for. (and probably 10,000 of the names worked for private companies). I was only able to consistently get good results by using the company part of the individuals email address. Not sure if you have the email information available but I was able to get accurate results.
The process was to parse the email address for the company portion using
That's a great idea! Thanks, Ric.