Sure, for public data that is a reasonable argument. Back when we were having this conversation, Facebook had very little data that was shared publicly; the vast majority was shared only with friends, and that is still true today. That was the core of the conversation and what the Google folks wanted opened up. That said, I think the key debate here was about the companies’ special sauce and why they wouldn’t share it. Google search’s magic is the ranking algorithm.
I don’t think most people are quite as gracious as you when they think about making something public. If there were a checkbox when you create your profile that said “allow my content to be indexed by marketing databases” I bet no one would ever opt in to that.
IMDB and LinkedIn -> Whitetruffle are great examples of the challenges and nuances here. If IMDB has done all this work to structure and normalize “facts”, why shouldn’t there be business value for them in protecting that, including being able to prevent others from simply scraping and reusing it? You may be right that someone could just write it all down manually. But perhaps the burden of doing that work by hand is enough to deter people from simply leveraging IMDB, whereas automated scraping removes that deterrent.
Muck Rack makes it simple to find people, tweets, or articles that mention any name, keyword, company, hashtag etc. We've compiled this guide to help you make the most of your search.
Selecting a term
Start searching tweets, articles from media outlets, articles mentioned in tweets, journalists'
names, titles and bios with some suggested searches:
Companies or Topics (e.g. iPhone, Microsoft)
Phrases (e.g. "cloud computing") — use quotes to keep the terms together
Twitter handles (e.g. @username) — returns those who have mentioned or replied to that handle
Names (e.g. "David Pogue")
Hashtags (e.g. #sxsw, #london2012)
Bio details (e.g. vegan, Olympics, father)
Muck Rack's Advanced Search allows for many boolean operators.
To find results that mention multiple specified terms, use AND or
+. For example, ensure each result contains both Obama and Romney by
searching Obama AND Romney or Obama + Romney.
Use the operators OR or , to broaden your search when you'd like either of
multiple terms to appear in results. (This is the default behavior of our search when no operators
are used.) For example, search for democrat OR republican to find results that refer to
Democrats and/or Republicans.
Use NOT or - to subtract results from your search. For
example, searching Disney will yield results about the Walt Disney Company as well as Walt Disney
World Resort. To exclude mentions of Disney World, search for Disney -World or Disney NOT World.
When using one of these operators with a phrase, enclose the phrase in quotation marks. For example, you can
find results about smartphones excluding Apple's iPhone 4S by searching smartphone -"iPhone 4S".
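The operator behavior described above can be illustrated with a small Python sketch. It is not Muck Rack's implementation: the `contains` helper and the sample texts are hypothetical, and matching is assumed to be a simple case-insensitive substring test.

```python
def contains(text: str, term: str) -> bool:
    """Case-insensitive substring test (an assumed simplification)."""
    return term.lower() in text.lower()


# Hypothetical sample texts standing in for tweets or headlines.
texts = [
    "Obama and Romney meet for the final debate",
    "Obama speaks in Ohio",
    "New smartphone rivals the iPhone 4S",
    "Budget smartphone launches this week",
]

# Obama AND Romney: both terms must appear.
both = [t for t in texts if contains(t, "Obama") and contains(t, "Romney")]

# Obama OR Romney: at least one term must appear.
either = [t for t in texts if contains(t, "Obama") or contains(t, "Romney")]

# smartphone -"iPhone 4S": the first term must appear, the quoted phrase must not.
no_iphone = [
    t for t in texts
    if contains(t, "smartphone") and not contains(t, "iPhone 4S")
]
```

Here AND narrows the result set, OR widens it, and NOT subtracts from it, which mirrors how the guide describes the three operators.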
Exact case matching or punctuation
If you're searching for a brand name or keyword that relies on specific punctuation marks or capitalization, you can
find results that match your exact query by adding matchcase: before the keyword you're searching for, like matchcase:E*TRADE.
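As a rough illustration of the difference between a default search and an exact match, here is a minimal Python sketch. The `matches` and `normalize` functions are hypothetical, and the assumption that a default search ignores case and punctuation is ours, not something the guide states.

```python
import re


def normalize(s: str) -> str:
    """Lowercase and drop punctuation (assumed default-search behavior)."""
    return re.sub(r"[^a-z0-9 ]", "", s.lower())


def matches(text: str, keyword: str, match_case: bool = False) -> bool:
    """With match_case, require the keyword exactly as typed."""
    if match_case:
        return keyword in text
    return normalize(keyword) in normalize(text)


loose = matches("Shares of etrade rose", "E*TRADE")                    # relaxed match
strict = matches("Shares of etrade rose", "E*TRADE", match_case=True)  # exact match
exact = matches("E*TRADE reports earnings", "E*TRADE", match_case=True)
```

Under these assumptions only the last headline satisfies the exact query, because it preserves both the capitalization and the asterisk.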
Use parentheses to separate multiple
boolean phrases. For example, to find journalists talking about having fun in Disney World or
Disneyland, search for ("disney world" OR disneyland) AND fun.
An asterisk can be used to search for any variation of a root word truncated by the asterisk. For example, searching for admin* will return results for administrator, administration, administer, administered, etc.
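One way to picture the asterisk is as a regular expression that matches the root followed by any run of word characters. The sketch below assumes that interpretation; `wildcard_to_regex` is a hypothetical helper, not part of Muck Rack.

```python
import re


def wildcard_to_regex(term: str) -> re.Pattern:
    """Translate a term with an optional trailing asterisk into a regex."""
    if term.endswith("*"):
        root = re.escape(term[:-1])
        # Root at a word boundary, followed by any further word characters.
        return re.compile(rf"\b{root}\w*", re.IGNORECASE)
    # No asterisk: match the whole word only.
    return re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)


text = "The administrator said the administration would administer the program."
found = wildcard_to_regex("admin*").findall(text)
plain = wildcard_to_regex("admin").findall(text)
```

With the asterisk, all three variations of the root are found; without it, the bare word admin does not occur in the sample sentence, so nothing matches.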
A near operator is an AND operator where you can control the distance between the words. You can set that distance by adding a forward slash and a number from 0 to 99, such as strawberries NEAR/10 "whipped cream", which means strawberries must appear within 10 words of "whipped cream".
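The proximity check can be sketched in Python as follows. This is an illustration under stated assumptions rather than Muck Rack's actual algorithm: `tokenize` and `near` are hypothetical helpers, and distance is assumed to mean the number of words between the two terms, in either order.

```python
import re


def tokenize(text: str) -> list:
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def near(text: str, left: str, right: str, distance: int) -> bool:
    """True if `left` and `right` occur with at most `distance` words between them."""
    words = tokenize(text)
    a, b = tokenize(left), tokenize(right)

    def starts(phrase):
        n = len(phrase)
        return [i for i in range(len(words) - n + 1) if words[i:i + n] == phrase]

    for i in starts(a):
        for j in starts(b):
            # Words between the two phrases; negative when they overlap.
            gap = max(j - (i + len(a)), i - (j + len(b)))
            if 0 <= gap <= distance:
                return True
    return False


sentence = "Fresh strawberries served with a bowl of whipped cream"
within_10 = near(sentence, "strawberries", "whipped cream", 10)
within_3 = near(sentence, "strawberries", "whipped cream", 3)
```

In the sample sentence five words separate the two terms, so the check passes for NEAR/10 and fails for NEAR/3.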