BigData   35281


Don't use Hadoop - your data isn't that big - Chris Stucchio
Be afraid when people start saying they need to use Hadoop.

From the comments section on Hacker News: "good rule of thumb is the data is big when it won't fit into the RAM on one machine."

Obviously, the definition of "machine" can change.
bigdata  Software  statistics 
yesterday by olepig
Don't use Hadoop - your data isn't that big - Chris Stucchio
It isn't really big. His cutoff for "really big" is something like 5 TB. If it's smaller than that, use other reasonable tools.
via:HackerNews  bigdata  hadoop 
yesterday by mcherm
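The rule of thumb in the two notes above (data is "big" when it won't fit in one machine's RAM, and "really big" at something like 5 TB) can be written down as a rough sketch. The function name, parameter names, and returned labels are hypothetical, and the 5 TB default is only the approximate cutoff quoted in the note, not a precise figure from the article.

```python
TB = 10**12  # decimal terabyte, in bytes


def suggest_tooling(data_bytes: int, ram_bytes: int,
                    big_cutoff_bytes: int = 5 * TB) -> str:
    """Classify a dataset using the rule of thumb from the notes above.

    All names and labels here are illustrative assumptions.
    """
    if data_bytes <= ram_bytes:
        # Fits in one machine's RAM: not big by the Hacker News rule.
        return "in-memory"
    if data_bytes < big_cutoff_bytes:
        # Too large for RAM but under the ~5 TB cutoff: the note
        # recommends "other reasonable tools" rather than Hadoop.
        return "single-machine"
    # At or above the cutoff: a cluster starts to be worth considering.
    return "distributed"


if __name__ == "__main__":
    # A 600 GB dataset on a machine with 64 GB of RAM.
    print(suggest_tooling(600 * 10**9, 64 * 10**9))
```

As the first note says, the definition of "machine" can change, so `ram_bytes` is a parameter and not a constant.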
A Contemporary Delphic Oracle: The Church of Big Data
"We have elevated data to divine standards and have developed a tendency to confuse tools with their creators in the process. Nobody in the 17th Century would have dreamed of claiming a brush and some paint created The Night Watch, or that it's a good idea to spend 18 months on one painting.
The anthropomorphisation of computers was researched in depth by Reeves and Nass in The Media Equation (1996). They show through multiple experiments how people treat computers, television, and new media like real people and places. (..)
Nowadays we turn to data for advice. The oracle of big data functions in a similar way to the oracle of Delphi. Algorithms programmed by humans are fed data and consequently spit out numbers that are then translated and interpreted by researchers into the prophecies the seekers of advice are sent home with. (..)
Can numbers really speak for themselves? (..) Harford (2014) describes these assumptions as four articles of faith. (..) The second article is the belief that not causation, but correlation matters. The biggest issue with this belief is that if you don't understand why things correlate, you have no idea why they might stop correlating either, making predictions very fragile in an ever changing world. Third is the faith in massive data sets being immune to sampling bias, because there is no selection taking place. Yet found data contains a lot of bias, as for example not everyone has a smartphone, and not everyone is on Twitter. (..)
The belief in this oracle has quite far reaching implications. For one, it dehumanises humans by asserting that human involvement through hypotheses and interpretation, is unreliable, and only by removing humans from the equation can we finally see the world as it is. While putting humans and human thought on the sideline, it obfuscates the human hand in the generation of its messages and anthropomorphises the computer by claiming it is able to analyse, draw conclusions, even speak to us. The practical consequence of this dynamic is that it is no longer possible to argue with the outcome of big data analysis. This becomes painful when you find yourself in the wrong category of a social sorting algorithm guiding real world decisions on insurance, mortgage, work, border checks, scholarships and so on. (..)
"Computers, as the experts continually remind us, are nothing more than their programs make them. But as the sentiments above should make clear, the programs may have a program hidden within them, an agenda of values that counts for more than all the interactive virtues and graphic tricks of the technology. The essence of the machine is its software, but the essence of the software is its philosophy" (Roszak, 1986). (..)
In The Empty Brain (2016) research psychologist Robert Epstein writes about the idea that we nowadays tend to view ourselves as information processors, but points out there is a very essential difference between us and computers: humans have no physical representations of the world in their brains."
furtherfield  bigdata  data  belief  cybernetics  control  software  counterculture  tool  correlation  brain  machine  memory  roszak 
2 days ago by gohai
probably the best lecturer today. About being a slave to the algorithm and GDPR.
bigdata  from twitter_favs
2 days ago by tnhh
: A Brief Breakdown (note have to scroll quite far down on link)
fintech  tech  AI  bigdata  MachineLearning  ML  from twitter_favs
2 days ago by jonz


