US president Donald Trump is yet to lay the first brick of his proposed wall between the US and Mexico, but some smugglers are already pricing it in. Last week Trump’s Homeland Security secretary, John Kelly, boasted that the administration’s get-tough policies have pushed coyotes, as human smugglers are called, to jack up fees for getting immigrants through certain mountainous areas (he didn’t specify which, and officials didn’t respond to requests for clarification).
There are so many different ways to analyze and parse a dataset—it’s part of what makes data analysis exciting. But working with data can pose major challenges, whether we’re dealing with FOI denials or just trying to free data from (sadly ubiquitous) PDFs. Most time spent on data analysis is devoted to requesting, cleaning, and structuring data, and wrangling it into a format we can actually pipe into a spreadsheet, database, or graphic.
One way to understand long-term trends in medical and health research is to analyze the language used in massive bodies of literature produced in the different fields. To better understand the shifting focus of sex research since the field was established, we downloaded (with permission) 4,545 articles published in the Journal of Sex Research and the Archives of Sexual Behavior from 1970 to 2017, and tracked just over 1,000 of the most-used words in these studies.
@forestgregg @rufuspollock @mckinneyjames I'm certainly interested in the probabilistic approach, but I don't think I'd be inclined to put semantic stuff in core agate. Fwiw, we're using probabilistic semantic typing at Enigma.
Muck Rack makes it simple to find people, tweets, or articles that mention any name, keyword, company, hashtag etc. We've compiled this guide to help you make the most of your search.
Selecting a term
Start by searching tweets, articles from media outlets, articles mentioned in tweets, and journalists'
names, titles, and bios. Some suggested searches:
Companies or Topics (e.g. iPhone, Microsoft)
Phrases (e.g. "cloud computing"): use quotes to keep the terms together
Twitter handles (e.g. @username): returns those who have mentioned or replied to that handle
Names (e.g. "David Pogue")
Hashtags (e.g. #sxsw, #london2012)
Bio details (e.g. vegan, Olympics, father)
Muck Rack's Advanced Search allows for many boolean operators.
To find results that mention multiple specified terms, use AND or
+. For example, ensure each result contains both Elon Musk and Mark Zuckerberg by
searching Musk AND Zuckerberg or Musk + Zuckerberg.
Use the operators OR or , to broaden your search when you'd like any one of
multiple terms to appear in results. (This is the default behavior of our search when no operators
are used.) For example, results will contain either cake or cookie when you search cake OR cookie or cake,cookie.
Use NOT or - to subtract results from your search. For
example, searching Disney will yield results about the Walt Disney Company as well as Walt Disney
World Resort. To exclude mentions of Disney World, search for Disney -World or Disney NOT World.
When using one of these operators with a phrase, enclose the phrase in quotation marks. For example, you can
find results about smartphones excluding Apple's iPhone 4S by searching smartphone -"iPhone 4S".
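The operators above can be combined mechanically. The following is a minimal Python sketch that only assembles query strings of the form described in this guide; the helper names (`quote`, `all_of`, `any_of`, `exclude`) are hypothetical and are not part of any Muck Rack API.

```python
def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they stay together as a phrase."""
    return f'"{term}"' if " " in term else term


def all_of(*terms: str) -> str:
    """Join terms with AND so every term must appear."""
    return " AND ".join(quote(t) for t in terms)


def any_of(*terms: str) -> str:
    """Join terms with OR so at least one term must appear."""
    return " OR ".join(quote(t) for t in terms)


def exclude(query: str, *terms: str) -> str:
    """Append -term for each term that should be subtracted from the results."""
    return " ".join([query] + [f"-{quote(t)}" for t in terms])


print(all_of("Musk", "Zuckerberg"))
print(any_of("cake", "cookie"))
print(exclude("Disney", "World"))
print(exclude("smartphone", "iPhone 4S"))
```

The resulting strings can be pasted into the search box; the sketch does not send anything to Muck Rack.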
Exact case matching or punctuation
If you're searching for a brand name or keyword that relies on specific punctuation marks or capitalization, you can
find results that match your exact query by adding matchcase: before the keyword you're searching for, like matchcase:E*TRADE.
Use parentheses to separate multiple
boolean phrases. For example, to find journalists talking about having fun in Disney World or
Disneyland, search for ("disney world" OR disneyland) AND fun.
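To see why the parentheses matter, here is a small Python sketch that evaluates the grouped query against sample text. It is an illustration of the boolean logic only: the `mentions` helper is hypothetical, it uses naive case-insensitive substring matching, and the ungrouped variant shows just one possible reading, not how Muck Rack resolves a query without parentheses.

```python
def mentions(text: str, term: str) -> bool:
    """Naive case-insensitive substring check (an assumption for illustration)."""
    return term.lower() in text.lower()


def grouped(text: str) -> bool:
    """("disney world" OR disneyland) AND fun"""
    return (mentions(text, "disney world") or mentions(text, "disneyland")) and mentions(text, "fun")


def ungrouped(text: str) -> bool:
    """One possible reading without parentheses: "disney world" OR (disneyland AND fun)"""
    return mentions(text, "disney world") or (mentions(text, "disneyland") and mentions(text, "fun"))


sample = "We spent a week at Disney World."
print(grouped(sample))    # False: the text never mentions fun
print(ungrouped(sample))  # True: "disney world" alone satisfies this reading
```

Grouping makes the intent explicit, so the result does not depend on how operators are prioritized.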
An asterisk can be used to search for any variation of a root word truncated by the asterisk. For example, searching for admin* will return results for administrator, administration, administer, administered, etc.
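The behavior of a trailing asterisk can be approximated with a regular expression. This Python sketch assumes the wildcard stands for any run of word characters after the root; the function name `wildcard_to_regex` is hypothetical, and the real search may treat edge cases differently.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a root with a trailing * into a case-insensitive whole-word regex."""
    root = pattern.rstrip("*")
    # A trailing * becomes "zero or more word characters"; without it, match the word exactly.
    suffix = r"\w*" if pattern.endswith("*") else ""
    return re.compile(rf"\b{re.escape(root)}{suffix}\b", re.IGNORECASE)


text = "The administrator asked the administration to administer the admin panel; badminton is unrelated."
print(wildcard_to_regex("admin*").findall(text))
# ['administrator', 'administration', 'administer', 'admin']
```

Note that "badminton" is not matched, because the root must start at a word boundary.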
A NEAR operator is an AND operator that lets you control the distance between the words. You can set the distance by adding a forward slash and a number (from 0 to 99), such as strawberries NEAR/10 "whipped cream", which means strawberries must appear within 10 words of "whipped cream".
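As a rough illustration of the proximity idea, the Python sketch below checks whether two terms occur within a given number of words of each other. It assumes the distance counts the words that sit between the two terms, in either order; the guide does not spell out the exact counting rule, and the `near` function is hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[\w']+", text.lower())


def near(text: str, left: str, right: str, distance: int) -> bool:
    """Return True if `left` and `right` occur with at most `distance` words between them."""
    words, a, b = tokenize(text), tokenize(left), tokenize(right)
    if not a or not b:
        return False

    def positions(phrase: list[str]) -> list[tuple[int, int]]:
        # (start, end) token indexes of every occurrence of the phrase
        n = len(phrase)
        return [(i, i + n - 1) for i in range(len(words) - n + 1) if words[i:i + n] == phrase]

    for s1, e1 in positions(a):
        for s2, e2 in positions(b):
            # Words between the two occurrences, whichever comes first
            gap = s2 - e1 - 1 if s2 > e1 else s1 - e2 - 1
            if 0 <= gap <= distance:
                return True
    return False


text = "Fresh strawberries served with a bowl of whipped cream"
print(near(text, "strawberries", "whipped cream", 10))  # True: five words in between
print(near(text, "strawberries", "whipped cream", 3))   # False: too far apart
```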