Massive online databases vs. individually siloed data… let’s take a look.

There is a movement going on right now. Companies are starting to abandon large subscription databases and build their own silos of data. Why?

Let’s first examine the general trend of technology. When a new technology is first introduced, it tends to be (1) more complex, (2) more expensive and (3) centralized. Take job boards, for example. First there were the large boards like Headhunter, next niche job boards, and then large corporate job boards. Now even small recruiting firms post their own jobs on their own web sites. Once a technology matures, the trend is toward (1) less complex, (2) less expensive and (3) ubiquitous and decentralized. It is less complex because the technology is streamlined and reengineered, and less expensive and decentralized because of technology improvements and economies of scale.

Watching this trend has been one of my hobbies. It’s universal, like the 80-20 rule. It’s time to give it a name: Expensive-Niche-Decentralized, or E-N-D.

An entire series of technologies follows the END trend. Web-based CRM is also starting to follow this curve. In the last few years, many new CRMs specific to vertical markets have sprung up. Recently there has been a movement toward self-hosted web-based CRM. SugarCRM is open source, and Microsoft CRM can be hosted in-house. CRM, even web-based CRM, is decentralizing.

Ignoring this trend is equivalent to putting your head in the sand regarding Moore’s law, Kryder’s law, or Nielsen’s law.

What are the variables that will cause data siloing to follow END?

(1) Search, Parsing, Extraction and Export Technology. 

(2) Price and speed of bandwidth

(3) Computer specifications (memory, CPU, hard disk space)

Take a look at history. 

1995: 56K modem, 10 MB hard drive, 16 MB RAM

2000: 256K DSL line, 100 MB hard drive, 256 MB RAM

2007: 5 MB/s cable line, 500 GB hard drive, 4 GB RAM

Everything you need to silo your own data is available today.

The siloing of web-based data is starting to follow the END trend. It’s possible now. Companies like Broadlook, PureDiscovery, EmployON, and others are making it possible. Today I can store the resulting spidered data from 24 million web sites on one desktop computer with a terabyte drive and 2 GB of memory to process it. Today. It takes about 30 days to spider that data on a few rented servers at $100 per month each. Total cost is less than $5000 to own that data yourself. You just need to know what to do with it.
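To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (24 million sites, a one-terabyte drive, 30 days, and $5000 as the upper bound on cost). The function names are made up for illustration, and it assumes a decimal terabyte with the whole drive available for data, which is generous.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Assumptions (not from the post): 1 TB = 10**12 bytes, the entire
# drive holds spidered data, and $5000 is treated as the total spend.

SITES = 24_000_000        # web sites spidered
DRIVE_BYTES = 10**12      # one terabyte drive (decimal, assumed)
CRAWL_DAYS = 30           # approximate time to spider everything
TOTAL_COST_USD = 5000     # "less than $5000", used as an upper bound


def storage_budget_per_site(drive_bytes: int, sites: int) -> float:
    """Average bytes of extracted data that can be kept per site."""
    return drive_bytes / sites


def crawl_rate_per_second(sites: int, days: int) -> float:
    """Sites that must be finished per second, across all servers."""
    return sites / (days * 24 * 60 * 60)


def cost_per_site(total_cost_usd: float, sites: int) -> float:
    """Upper bound on dollars spent per site."""
    return total_cost_usd / sites


if __name__ == "__main__":
    kb = storage_budget_per_site(DRIVE_BYTES, SITES) / 1000
    rate = crawl_rate_per_second(SITES, CRAWL_DAYS)
    cost = cost_per_site(TOTAL_COST_USD, SITES)
    print(f"Storage budget: about {kb:.1f} KB per site")
    print(f"Crawl rate: about {rate:.1f} sites per second")
    print(f"Cost: at most ${cost:.5f} per site")
```

Under those assumptions the budget works out to roughly 42 KB of stored data per site and a little over 9 sites per second across all the rented servers. That is why this only works if you keep the extracted data rather than full copies of every page.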

There are a number of online databases that you can pay to access today: Zoominfo, Spoke, Jigsaw, etc. Some of these were the first movers in the space and have been around for a while. These qualify for the Expensive (E) part of END. Even if prices drop, they are still dead center in (E), which stands for Expensive, Complex, Large, and Centralized (the E factor is not just about expense).

In the past year, I’ve seen Niche (N) data silos starting to pop up. I know this because many niche databases are being built with tools from Broadlook and other vendors (both off-the-shelf and custom-built solutions). Cottage industries are being built.

None of the Expensive (E) or Niche (N) databases come near the value of some of the data silos that individual recruiters are building in-house every day. These data silos are the cream of the crop. Forward-thinking recruiting firms are starting to usher in the Decentralized (D) phase of the END trend in siloing web data.

Will the large data silos like Zoominfo still exist once decentralized data siloing becomes the norm? I think there will still be a place for them. However, they will have to keep adding new value that is several years out of the reach of the individual. Innovate or dissipate.

More about END as it relates to CRM in a future post.