Help using Excel to find semi-structured data in a database

Dan Ryan's Profile

Hey folks. I have a database of names and companies and i am trying to normalize the list against a reference list of companies (the Fortune 500). The data looks like the following:

John Doe     Abbott Labs
Jane Doe     Abbott Laboratories
Jeff Doe       Abbott Medical Devices
and so on...

What I am trying to do is figure out who in my list works for Abbott Laboratories. The name in my Fortune 500 company list is "Abbott Laboratories", but just this one example can be found 10 different ways within my list of people. If I were to shorten Abbott Laboratories to just "Abbott", is there some efficient way for me to run my list of names against my Fortune 500 list such that any time "Abbott" is found *within* the person's company name they will be flagged?

To do this with just one name I would normally use the "Find" function as it would find "Abbott" within any cell that has it (even in a cell that had "Abbott Labs" or other permutations). To bounce a list against another list I would usually use a "Vlookup" function. But that only works for exact matches (so far as I know - maybe I'm wrong). So what I need is sort of a combination of the two: bounces large list against large list, and finds matches in partially matching cells. Help would be appreciated!

Topics:

Answers

Spyros Floratos's Profile

the below function returns a TRUE/FALSE if the cell contains (or not) the Abbott

=ISNUMBER(SEARCH("Abbott",A1))

For full access, login or register
Dan Ryan's Profile

Yes, that and the "FIND" functions allow me to do lookups one at a time. What I'm hoping for is some function (or set of functions) that allows me to look up a bunch of these *at one time*. Such as when you use VLOOKUP and can run a list of values against another list of values. Thanks, though.

For full access, login or register
Bobby Bluford's Profile

Hey Dan-
Did you find your answer? If not, I'd be happy to help you figure it out. Have you ever worked with Access to handle large data?

Bobby Bluford
BobbyBluford.com

For full access, login or register
Ric Ratkowski's Profile

I did something like this with public companies (find all the individuals in my list of 40,000 names and tag them with a normalized company name, industry and SIC code of the public company they work for. (and probably 10,000 of the names worked for private companies). I was only able to consistently get good results by using the company part of the individuals email address. Not sure if you have the email information available but I was able to get accurate results.

The process was to parse the email address for the company portion using excel string functions and then do a vlookup. One of the tricks I had to deal with was when companies have multiple "@xxx.com" email addresses, but the bulk was handled with the first pass and I was able to triage the rest.

For full access, login or register
Dan Ryan's Profile

That's a great idea! Thanks, Ric.

For full access, login or register