What you know about Internet Resume searching is wrong

Do you use search engines to look for resumes on the Internet?   Do you use exclusions such as “-jobs”  or  “-submit”?

If you do, stop it. Read on and I’ll tell you why.

First a story about Easter hams.

To understand what I am going to say about searching for resumes, you will need to be in the right frame of mind. Here we go…

A little girl was closely watching her mother prepare the Easter Ham. She was five years old, a great age for asking questions about the world.  She watched her mother prepare the glaze, preheat the oven and bring out the large roasting pan.   In an automatic fashion, her mother took a large knife and sliced off 2 full inches of meat from each end of the ham.

The little girl,  Sarah, smiled as a question came to mind.

“Mommy, why do you cut the ends off the ham?”  she asked.

As if startled, the mother replied unconvincingly, “I don’t know, Sarah, my mom always did it.  Maybe it is so the glaze gets inside.”

Not being satisfied with the answer, Sarah tracked down Grandma.

“Grandma,” she asked.  “I just saw Mommy cut off the two ends of the Easter Ham.  She said that she learned it from you.  Why did you make the Easter Ham that way?”

Grandma answered.  “That is a good question, Sarah, but I learned it from my mom, your great grandma.  I always thought that it was so the Ham cooked faster.”

Again unsatisfied, Sarah tracked down Great Grandma, the family matriarch.

“Great Grandma,” she asked as she crawled up on her lap.  “Mommy cut the ends off the Easter Ham. She thought it was so the glaze flavor got into the ham.  She did this because Grandma did it.  Grandma thought it was so the Ham would cook faster.  Grandma learned it from you.”

With anticipation, Sarah asked her Great Grandma. “Grandma, why did you cut the ends off the Easter Ham?”

Great Grandma, wise as she was old, chuckled and answered.  “Sarah, when I married your great grandfather, the roasting pan we got for our wedding was too small for an Easter Ham.”

“We cut the ends off the Ham so it would fit in the pan”.

Such is the progression of knowledge.  There is no fault when we inherit a practical idea that worked in the past, yet is anachronistic.  In the case of the Easter Ham, a practical, real world solution should have lived and died within a single generation, a single iteration.  However, it continued until one with a child’s mind, a questioning mind, wanted to know why.  When she was not satisfied with the answer, she went on a journey of discovery.

Looking at resume search with a “Beginner’s Mind”

In the past 2 years I’ve taken a bit of a journey in questioning how people use search engines to search the Internet.

Observation:  Top Internet searchers, myself included, had an innate set of beliefs that they held.  These observations eventually evolved into The 8 Laws of Internet Search,  which are a set of axioms for searching the Internet.

At this point I want to make a disclaimer:  I am really, really good at finding things on the Internet. This is not due to any formal training, nor did I have the advantage of a teacher or mentor.  I am self-taught.  I have literally been immersed in searching the Internet for the last 15 years.

Second disclaimer:  I do not include myself as one of the search-string gurus out there.  To be a search string guru, you need to be current, know the latest websites that are out there, as well as the latest capabilities of each of the search engines; you need to be immersed in the searching.  My immersion is in the underlying rules.

I recently had a conversation with a search string guru.  We agreed that the best analogy was that I design the aircraft and the search string gurus are the pilots.  Works for me.

So what about resumes and searching the Internet?

If I had attempted to research the state of resume search without a basis or set of axioms to work from, I would not have known where to start.  Fortunately, I decided to use the 8 Laws of Internet Search as a starting point, with a special emphasis on the first 3.

So the question I decided to ask myself is: How do the commonly taught practices of resume search stack up to the Laws of Internet Search?  This was a definable goal.   Caveat: My focus is “Open web” resume searches and not searches within a controlled environment like Monster.com or CareerBuilder.com.

The Law of Environment. Trainers do an excellent job talking about the various search engines, their capabilities and limitations.

Industry score on the Law of Environment: A+

In taking The Law of Permutation into consideration, I found 2 areas that were very different.

1.  Boolean search methods

Sub-score:  B.   Trainers are clear on the concepts that you must search using multiple permutations such as “VP of Sales”, “Vice President of Sales”,  “VP Sales”, etc.  However, the reality is that you may need 15-20 title combinations to reach all possible results.
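As a rough illustration of working with many title permutations at once, here is a minimal Python sketch that joins a list of titles into one quoted OR group. The function name `build_title_query` and the `keywords` parameter are my own inventions for this example, and the exact OR and quoting syntax is an assumption that varies by search engine.

```python
def build_title_query(titles, keywords=()):
    """Join title permutations into one quoted OR group, then append keywords.

    Hypothetical helper for illustration; OR syntax differs between engines.
    """
    quoted = [f'"{title}"' for title in titles]
    or_group = "(" + " OR ".join(quoted) + ")"
    return " ".join([or_group, *keywords])


if __name__ == "__main__":
    titles = ["VP of Sales", "Vice President of Sales", "VP Sales"]
    # In practice the list may need 15-20 entries, as noted above.
    print(build_title_query(titles, keywords=["resume"]))
```

Keeping the permutations in a plain list makes it easy to grow the set toward the 15-20 combinations mentioned above without hand-editing a long search string.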

2.  Semantic search methods

Sub-score: C.  A good deal of misinformation is being spread about semantic search.  Some of this stems from irresponsible vendors that are trying to make a buck.  It would not be a big deal if trainers actually tested, scientifically, what they started teaching.  The funny thing is that the value proposition is significant with semantic search.  Say what it can (and cannot) do and those vendors will have happy customers with proper expectations.  I shouldn’t be too harsh here; in the early days, I believed the software from Broadlook was meant for everyone.  It is not.  Setting clear expectations of technology capabilities is the mark of a mature vendor.

Semantic search is great when you have a type of resume that is well identified and the rules have been built.  However, throw it a niche area that has not been cataloged and it will fall flat.  Advice:  If you are looking for a commodity position like a .NET programmer, semantic search can work marvels.  If you are working in a niche area, pick a semantic search engine that can be trained by inputting sample resume data.  In the latter case, you may have to do the leg work with good old Boolean search first.  Also, ask your semantic search vendor if they use exclusions when they mine search engines.  If they do, twist their arm until they stop.  It’s an old Easter Ham.

Industry score on the Law of Permutation: C+

The Law of Completeness.    Widely taught methodologies that have not been questioned in years (like the Easter Ham) are yielding approximately 65%.  If you get 65% on a math test, that is not a good grade.   The first example is not using the full available results from a search string query.  If a Google search yields 380 results, the Law of Completeness states that you must work with the entire set of results for maximum yield.

Completeness is not being reached. Why?  When trainers first started teaching how to use search engines (before Google), there were limitations in the technology.  Those limitations were:

(1) No high accuracy method to screen out page results that were NOT a resume.  Therefore search strings needed to be modified to exclude results that were not resumes.

(2) No method to extract all results from a search query.  Therefore search strings needed to be modified to reduce results to a manageable quantity.

In both cases, the strategy worked; unfortunately, there was a side effect:  Many good results were also thrown out.

Industry score on the Law of Completeness: D

Dropping the bomb on search string exclusions.

So where is the proof, where is the science?

First, I want to thank Cory Dickenson at Broadlook Technologies for leading the team of researchers on search string exclusion metrics.  Looking through tens of thousands of resumes, by hand, and then doing it two more times, is not a fun task.  The reality is that someone had to do it.  Hopefully when this study is reviewed both recruiters and technology vendors will have a better foundation on which to build.  I basically hate inefficiency.

Resume Exclusion Metrics (Broadlook project: FRET, Frikken Resume Exclusion Test)

The study was simple.  What was the effect of using exclusions on a resume search string?

The first thing we did for the study was to mine a bunch of social networks and sites that had advice on resume search strings.  We wanted examples, over the past 10 years, that experts were using.  From a few hundred examples, we made a list of all the popular resume search string exclusions that were being used (e.g. -job -jobs -you -your -submit).

Creating the resume data set

To set up the study, we created search strings for about 50 job positions.  The positions were a wide range: IT, biotechnology, health care, sales, business development, financial, etc.  Next, for each search, we made sure that the search string was specific enough so the results from the search engine were <1000. We did not use any exclusions.  Last step:  Hand verification of every single search engine result.  Each result was classified in one of 4 categories: (1) Resume (2) Resume sample page (3) Resume book page (4) Junk: Not a resume.

At this point, we could bring automation into the equation.  Using Broadlook’s Eclipse tool, we automated each of the 50 searches with one of the exclusion terms.  We then repeated each of the 50 searches with each of the remaining exclusion terms.  Since we had already hand-identified which search engine result pages were resumes, we were able to calculate, for each search-exclusion combination, how many REAL resumes were skipped by using each exclusion term. When the searching was done, we had average percentages, across many industries and titles.  We know, with high precision, what percentage of resumes you will lose by using an exclusion term.
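The calculation described above can be sketched in a few lines of Python. This is only my reading of the method, not the actual Broadlook code: the function names are hypothetical, and I am assuming the reported figure is a simple average of per-search percentages (the study could equally have pooled all resumes before dividing).

```python
def percent_missed(verified_resumes, results_with_exclusion):
    """Percent of hand-verified resumes that vanish once an exclusion is added."""
    baseline = set(verified_resumes)
    if not baseline:
        return 0.0
    missed = baseline - set(results_with_exclusion)
    return 100.0 * len(missed) / len(baseline)


def average_percent_missed(runs):
    """Average the per-search miss rate over (verified, filtered) result pairs.

    Assumption: an unweighted mean across searches.
    """
    rates = [percent_missed(verified, filtered) for verified, filtered in runs]
    return sum(rates) / len(rates) if rates else 0.0


if __name__ == "__main__":
    # Toy URLs standing in for search engine results; not real study data.
    verified = ["a.com/cv", "b.com/cv", "c.com/cv", "d.com/cv"]
    with_exclusion = ["a.com/cv", "b.com/cv", "x.com/jobs"]
    print(percent_missed(verified, with_exclusion))
```

The key design point is that the baseline set is fixed by hand verification first, so each exclusion term is judged only on the real resumes it removes.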

Why did I do this study?  Too much time on my hands?..no.  I was interested in making the best open web resume search tool possible.  To accomplish that goal, the tool needs to work within the framework of the Laws of Internet Search.  Specifically the first 3:  Environment, Permutation, Completeness.  The end result was Broadlook Diver 3.0.  The resume search part of the tool *automatically* screens out pages that are not resumes.  In addition, since it is an automation tool, it allows the user to work with complete results from a search engine.   While you can only get Diver from Broadlook, the Resume Exclusion Metrics are free to all.  Enjoy.

The Axioms of Internet Resume Search

1.  Seek <1000 results per search.

You should conduct your search with enough specificity that the search engine reports that there are fewer than 1000 results.  If you are doing a search that yields many thousands, break up the search into a few separate searches.
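One way to break up a large search is to append a different narrowing term to each copy of the base query. The sketch below assumes you choose the narrowing terms yourself (cities are used here purely as an example); `split_query` is a hypothetical helper, and any resume that mentions none of the terms will be missed, so pick terms that cover the pool.

```python
def split_query(base_query, narrowing_terms):
    """Return one narrower query per term, each adding a quoted required term."""
    return [f'{base_query} "{term}"' for term in narrowing_terms]


if __name__ == "__main__":
    base = 'resume "VP of Sales"'
    for query in split_query(base, ["Chicago", "Boston", "Denver"]):
        print(query)
```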

2.  Never use single-phrase exclusions

Otherwise you will miss a good percentage of resumes.  It is reasonable to use multi-word exclusions, as the level of ambiguity is low.
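If you have a library of old search strings, a quick check like the following can flag the exclusions worth removing. This is a sketch under my own assumptions: exclusions are written with a leading minus sign, phrases use straight double quotes, and an operator form such as -intitle:jobs counts as a single term. The name `single_word_exclusions` is hypothetical.

```python
import re

# A minus sign at the start of a token, followed by a quoted phrase or a bare term.
EXCLUSION = re.compile(r'(?<!\S)-(?:"[^"]*"|\S+)')


def single_word_exclusions(query):
    """List exclusions that are a single term rather than a multi-word phrase."""
    found = EXCLUSION.findall(query)
    return [term for term in found if " " not in term]


if __name__ == "__main__":
    query = 'resume "VP of Sales" -job -"how to" -submit self-taught'
    print(single_word_exclusions(query))
```

The lookbehind keeps hyphenated words such as "self-taught" from being mistaken for exclusions.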

3.  Use multiple search engines.

There are varying reports of the crossover in results between search engines being as low as 20%.   (Happy to get comments from additional sources on this)

4.  Use automation to screen out non-resumes

Don’t do it by hand and don’t ignore the data below and use exclusions.  This is not 1998 anymore.  Let automation technology screen out Search Engine Result Pages (SERPS) that are not resumes. This includes sample resume pages, job pages, etc.

And now for the Exclusion metrics.

From a pool of about 50 job descriptions,  100+ searches,  75,000 search engine results, 28,200 resumes, hand verified.  The sort order is based on the worst offending term.  These exclusion terms were pulled from top experts’ answers on forums about resume search.  Remember the Easter Ham: it is not my intention to diminish the tremendous contribution of those people that freely answer questions (every day) about internet resume search.  It is my intention to give more data so that the entire industry has more facts with which to work.

Exclusion % REAL Resumes Missed
-job 49.78%
-jobs 40.89%
-summary 37.33%
-intext:resumes 34.37%
-about 34.07%
-writing 32.74%
-your 29.19%
-you 27.41%
-example 25.78%
-required 25.19%
-require 23.70%
-free 23.26%
-list 19.11%
-“how to” 17.04%
-template 16.15%
-library 14.96%
-intitle:jobs 14.37%
-professor 13.48%
-intitle:job 13.19%
-inurl:aspx 12.74%
-send 12.44%
-write 11.56%
-inurl:php 11.41%
-requirement 10.22%
-apply 9.78%
-intitle:apply 9.78%
-sample 9.78%
-intitle:sample 9.48%
-intitle:career 9.04%
-intitle:example 9.04%
-careers 8.89%
-submit 8.89%
-intitle:examples 8.59%
-intitle:write 8.59%
-intitle:how 8.44%
-intitle:submit 8.44%
-inurl:books 8.44%
-trainings 8.00%
-wizard 7.70%
-samples 7.41%
-inanchor:apply 6.67%
-opening 6.37%
-reply 6.22%
-wanted 6.07%
-applicant 4.89%
-inanchor:sample 4.59%
-inanchor:submit 4.00%
-eoe 3.70%

This resume research project yielded many other interesting facts, such as percentages of doc files vs. pdf, etc.  In the coming weeks, I will be publishing a white paper that breaks down the data in a bunch of categories… after I get back from DisneyWorld!

The 8th law of Internet Search;  The Law of Environment

Stephen Covey published The 7 Habits of Highly Effective People and it was a great book.

When Dr. Covey came out with a new book, The 8th Habit, I was skeptical.  Why didn’t he think up the 8th habit right from the start?

Now I understand it.  Ideas evolve.  You are the sum total of your experiences at any point in time. You create a set of rules that you believe are universal.  In my case, I am the author of The Seven Laws of Internet Search.

The Original Laws …
1. Permutation
2. Completeness
3. Iteration
4. Frequency
5.  Process
6. Taxonomy
7. Measurable Results

It has been about a year and a half and now, guess what?  I came up with another Law of Internet Search.  The 8th law could not have been created by me…unless I was able to observe people learning and implementing the first seven laws in their Internet search activity.

Here is what I observed:  The Internet is “non-homogeneous”.  The idea of homogeneity  also resonated with me as I wrote the original seven laws.  I played with the idea of a Law of Non-homogeneity.  This means that the Internet exists in many different formats and there is no way to query everything, with a single method or game plan.

“Non-Homogeneous” sounds ugly.  To define something with “non” in front of it…it would be like cheating.  Each of the seven laws of Internet Search is meant to be a simple axiom of advice.    I failed to get my concept of Homogeneity into the laws.

Why did I fail?  It is simple.  Each of the seven laws is a solution.  Whereas “non-homogeneous” or “non-homogeneity” was talking about a problem.

What was I trying to get at?  It is also simple.  The Internet is not homogeneous, therefore, many different methods are needed to search it.  It is those very search mechanisms that the 8th Law takes into account.  The 8th law is  The Law of Environment.

In fact, the 8th Law is so important, I have moved it to the top spot in The Laws of Internet Search.  It is now The 1st Law of Internet Search.

To understand the Law of Environment, get your mind around the concept of the Internet having many modalities: many sites, each with its own set of rules or search environment.


Next, there are some simple questions to ask.   What is the access method?  What are the site’s restrictions?  Etc.


In addition to the simple questions about the environment, the more advanced Internet searcher may want to dive in to further understand the full capabilities of the search environment.

Once the simple questions about the environment are answered, the Internet searcher can proceed with quantifiable expectations of their chosen search medium.

For example, it is important to understand that Google will only give you a maximum of 1000 results from any search.  Even if Google reports that there are 2450 results, you only have access to the first 1000.  Understanding this is understanding the limitation of the environment.
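To put numbers on that limitation, here is a small Python sketch of the arithmetic. The 1000-result cap is the figure quoted above and is treated as an assumption here, since limits change over time; `reachable_results` is a hypothetical helper.

```python
GOOGLE_RESULT_CAP = 1000  # figure quoted in the text; an assumption, not a guarantee


def reachable_results(reported, cap=GOOGLE_RESULT_CAP):
    """Return (reachable, unreachable, percent coverage) for a reported count."""
    reachable = min(reported, cap)
    unreachable = reported - reachable
    coverage = 100.0 * reachable / reported if reported else 100.0
    return reachable, unreachable, coverage


if __name__ == "__main__":
    reachable, unreachable, coverage = reachable_results(2450)
    print(f"{reachable} reachable, {unreachable} out of reach, {coverage:.1f}% coverage")
```

For the 2450-result example, 1450 results are out of reach, which is why the environment's limit should shape how specific the search string needs to be.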

Here are The Laws of Internet Search, Reloaded

1.  Environment
2. Permutation
3. Completeness
4. Iteration
5. Frequency
6. Process
7. Taxonomy
8. Measurable Results

Dr. Stephen Covey, now I understand. Looking forward to the ninth law.

Looking for a recruiting domain? Here are 5700, unregistered!

Shorter domain names are better.  One syllable words are simple, short, and memorable. This weekend I was looking for a domain name for a new project I was working on.  Everything I initially tried was taken.  What I did know was that I wanted to add a one syllable word to the end of my “anchor word”.  It had nothing to do with recruiting, but for this example,  I will use the anchor word  RECRUIT.

I needed to create a repeatable, semi automated process to conquer this task.   Here is what I did:

1. Create a list of one-syllable words.  One of Broadlook’s software engineers, Kevin,  had developed an algorithm to do this.  It is a great thing when you have a team with 7 years of software code to pull off the shelf.

2. Pass this list of one-syllable words past a good set of text to get a frequency count.  Kevin suggested running it past the Brown Corpus.   It was a good idea.  Once I had a frequency count, I could remove words of very low frequency from the list.  The end result was about 6100 one-syllable words.

3.  Build a simple Excel spreadsheet.  The spreadsheet allows me to type in a single word and it will create about 6100 lines of potential domain names.    You can get this spreadsheet here at a site I set up.

4.  In batches of 500, paste them into GoDaddy’s bulk registration system.   All the domains that are already registered will be culled out of the results.
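The four steps above can be approximated in Python instead of a spreadsheet. This is a minimal sketch, not the process Kevin or I actually coded: the function names are hypothetical, the tiny corpus stands in for the Brown Corpus, and the availability check itself is left to the registrar's bulk tool.

```python
from collections import Counter


def frequent_words(candidates, corpus_tokens, min_count=2):
    """Keep candidates seen at least min_count times, most frequent first."""
    counts = Counter(token.lower() for token in corpus_tokens)
    kept = [w for w in candidates if counts[w.lower()] >= min_count]
    return sorted(kept, key=lambda w: -counts[w.lower()])


def candidate_domains(anchor, words, tld=".com"):
    """Append each word to the anchor word to form a domain name."""
    return [f"{anchor}{word}{tld}".lower() for word in words]


def batches(items, size=500):
    """Split the list into chunks sized for a bulk lookup form."""
    return [items[i:i + size] for i in range(0, len(items), size)]


if __name__ == "__main__":
    corpus = "who will go and who can go when we go".split()  # stand-in text
    words = frequent_words(["go", "gun", "who"], corpus, min_count=2)
    domains = candidate_domains("Recruit", words)
    for batch in batches(domains, size=500):
        print("\n".join(batch))
```

Sorting by frequency before building the names mirrors the ordering of the list below, so the most common words come first.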

If you are looking for a domain name in the recruiting space, you can try RecruitWho.com, RecruitGo.com or RecruitGun.com.  All of these domains were available as of this writing.  Below is the full list of 5700 available domain names starting with the word RECRUIT.   Within this list are some good domains, and many very bad ones.  They are listed in the order of occurrence of the word in the  English language, starting with “the” being the most used.  While this may seem daunting, check your premises. Try manually thinking up a domain name, checking if it is available, and trying again, again and again  vs.  looking through this list.  This way is much faster.   If someone picks one from this list, let me know, I’d love to hear that I saved you time.  Go get em!



Thoughts on picking a recruiting vertical; building a new desk

What is a good vertical market to recruit in?   I get asked this question every week.    

However, most often the question is more a question of what I “feel” would be a good market to start a new desk specialty in.  

“Donato, what do you feel a good new desk specialty would be?” 

I say. “I feel like a great market would be placing sales reps in recruiting software companies that do real time data mining of contact information.”  

They respond, “but Donato, Broadlook is the only company I know that does this kind of thing.”

“That’s right”, I say.  “..and I don’t pay fees”   <grin>

About this time they realize I am having fun at their expense and I chime in. “You asked me what I feel, not what I think.”

Most recruiters don’t think it through thoroughly when starting a new desk.   Let’s face it, thinking is hard.  A day of designing software wipes me out more than a triathlon (ok I’ve only done one). It’s not their fault.  This is how they were taught.  Or I should say, this is how they learned.  They watched someone who was a big biller and tried to do what they heard.

When discussing the creation of a new desk, I hear a good deal about reading everything from an industry: articles, journals, etc.  Wake up, this type of activity is about learning about the industry, not about whether the niche will support a desk.  If I am going to trust my livelihood to one vertical or another, forget the gut, give me data.

“Hey Donato, are you telling me to ignore my gut instinct?”  


The role of the “gut instinct” in this whole process should precede the data gathering.  The gut should lead you to the top several candidates, which you then expose to the scientific method.  The gut gets excited while reading and learning.  Don’t let it get carried away.

The gut is the emotion, the wind.  Let the data be the rudder and the sail.

To start a new desk, I would prefer solid facts about a potential specialty, such as:

How many open jobs, by state and nationally?   (size of universe)
How many recruiters specialize in the niche?     (competition)
What are the average fees paid to recruiters?    (compensation)
What resources can I use to build a candidate pool?  (sourcing)
What resources can I use to win business?  (marketing)
Will I enjoy working this desk specialty? (mental health)
Can I own the space, can I brand this space as mine?  (branding)

Once you do decide on a new desk specialty, based on the data, the first thing to do is think about branding yourself.  I’ll focus on the other questions in my next few blogs.  A great example of branding is Harry Joiner and his site MarketingHeadhunter.com

How easy is it to brand yourself?   Two areas that I know are hot are Physical Therapists and Nanotechnology.  Very different, but both very hot.  So I went out to GoDaddy.com and checked the following web sites:

BIOFUELRECRUITER.COM                            (FREE)
FUELCELLRECRUITER.COM                          (TAKEN)

Most of them were free and only one was taken. Again, all hot, hot, hot.

Heads up.  Don’t try to go register PHYSICALTHERAPISTRECRUITER.COM or NANOTECHNOLOGYRECRUITER.COM because I just registered them.  The others are free as of this writing.  I may not fetch the 99K that Jason Davis is asking for CEOjobs.com, but they will sell for a hefty profit.

The first Physical Therapist Recruiter to purchase the Broadlook Suite, you can have that site for free.  (I have no passion for that desk).  New clients only.

Don’t ask for nanotechnologyrecruiter.com.  Nanotech excites me.  It’s mine.
