Don't use Hadoop - your data isn't that big - Chris Stucchio
Be afraid when people start saying they need to use Hadoop.

From the comments section on Hacker News: "good rule of thumb is the data is big when it won't fit into the RAM on one machine."

Obviously, the definition of "machine" can change.
It isn't really big. His cutoff for "really big" is something like 5 TB. If it's smaller than that, use other reasonable tools.
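The two rules of thumb above (fits in one machine's RAM, and a "really big" cutoff of roughly 5 TB) can be written down as a small check. This is only an illustrative sketch: the function name `classify_dataset`, the labels it returns, and reading "5 TB" as decimal terabytes are all assumptions, not anything taken from the linked posts.

```python
TB = 10**12  # decimal terabyte (assumption: the source does not say TB vs TiB)


def classify_dataset(dataset_bytes, ram_bytes, really_big_bytes=5 * TB):
    """Label a dataset size using the two rules of thumb quoted above.

    dataset_bytes    -- size of the data on disk
    ram_bytes        -- RAM of the single machine you have in mind
    really_big_bytes -- hypothetical cutoff, defaulting to about 5 TB
    """
    if dataset_bytes <= ram_bytes:
        # HN rule of thumb: it is not "big" if it fits in RAM on one machine.
        return "fits in RAM"
    if dataset_bytes < really_big_bytes:
        # Too large for RAM, but still below the "really big" cutoff.
        return "bigger than RAM, but not really big"
    return "really big"


# Example: a 600 GB dataset on a machine with 64 GB of RAM.
print(classify_dataset(600 * 10**9, 64 * 10**9))
```

Since the definition of "machine" can change, `ram_bytes` is a parameter rather than something read from the current host.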
A Contemporary Delphic Oracle: The Church of Big Data
"We have elevated data to divine standards and have developed a tendency to confuse tools with their creators in the process. Nobody in the 17th Century would have dreamed of claiming a brush and some paint created The Night Watch, or that it's a good idea to spend 18 months on one painting.
The anthropomorphisation of computers was researched in depth by Reeves and Nass in The Media Equation (1996). They show through multiple experiments how people treat computers, television, and new media like real people and places. (..)
Nowadays we turn to data for advice. The oracle of big data functions in a similar way to the oracle of Delphi. Algorithms programmed by humans are fed data and consequently spit out numbers that are then translated and interpreted by researchers into the prophecies the seekers of advice are sent home with. (..)
Can numbers really speak for themselves? (..) Harford (2014) describes these assumptions as four articles of faith. (..) The second article is the belief that not causation, but correlation matters. The biggest issue with this belief is that if you don't understand why things correlate, you have no idea why they might stop correlating either, making predictions very fragile in an ever changing world. Third is the faith in massive data sets being immune to sampling bias, because there is no selection taking place. Yet found data contains a lot of bias, as for example not everyone has a smartphone, and not everyone is on Twitter. (..)
The belief in this oracle has quite far reaching implications. For one, it dehumanises humans by asserting that human involvement through hypotheses and interpretation, is unreliable, and only by removing humans from the equation can we finally see the world as it is. While putting humans and human thought on the sideline, it obfuscates the human hand in the generation of its messages and anthropomorphises the computer by claiming it is able to analyse, draw conclusions, even speak to us. The practical consequence of this dynamic is that it is no longer possible to argue with the outcome of big data analysis. This becomes painful when you find yourself in the wrong category of a social sorting algorithm guiding real world decisions on insurance, mortgage, work, border checks, scholarships and so on. (..)
"Computers, as the experts continually remind us, are nothing more than their programs make them. But as the sentiments above should make clear, the programs may have a program hidden within them, an agenda of values that counts for more than all the interactive virtues and graphic tricks of the technology. The essence of the machine is its software, but the essence of the software is its philosophy" (Roszak, 1986). (..)
In The Empty Brain (2016) research psychologist Robert Epstein writes about the idea that we nowadays tend to view ourselves as information processors, but points out there is a very essential difference between us and computers: humans have no physical representations of the world in their brains."
probably the best lecturer today. About being a slave to the algorithm and GDPR.
: A Brief Breakdown (note: you have to scroll quite far down on the linked page)
