big_data   4163


The Exaggerated Promise of So-Called Unbiased Data Mining | WIRED
"Something extremely unlikely is not unlikely at all if it has already happened."
"The Feynman trap—ransacking data for patterns without any preconceived idea of what one is looking for—is the Achilles heel of studies based on data mining. Finding something unusual or surprising after it has already occurred is neither unusual nor surprising. Patterns are sure to be found, and are likely to be misleading, absurd, or worse."

See also
big_data  statistics  patterns  correlation  random  probability  Feynman 
4 days ago by Tonti
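A rough illustration of the point quoted in the WIRED bookmark above: if you search enough unrelated series for a match, a "strong" correlation turns up by chance. This is only a sketch using pure noise. The function names, sample sizes, and seed are arbitrary choices made here and do not come from the article.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_obs=50, n_candidates=1000, seed=0):
    """Correlate one random target with many random candidates and
    return the correlation with the largest absolute value."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_candidates):
        # Each candidate is independent noise, so any correlation is chance.
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson(target, candidate)
        if abs(r) > abs(best):
            best = r
    return best


best = strongest_spurious_correlation()
print(f"strongest correlation found in pure noise: {best:.2f}")
```

Every series here is independent noise, so whatever correlation the search reports is an artifact of having looked at many candidates, not evidence of a relationship.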
Using pandas with large data
Learn how to use simple techniques to reduce memory usage by almost 90% and work with bigger data using pandas.
python  pandas  big_data 
6 weeks ago by jarechu
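The pandas bookmark above does not say which techniques it covers. The sketch below shows two commonly used ones, downcasting numeric columns and storing low-cardinality text as `category`, on the assumption that these are the kind of thing meant. The helper name `shrink`, the 50% cardinality threshold, and the sample data are illustrative choices made here. Note that downcasting floats to `float32` trades precision for memory.

```python
import numpy as np
import pandas as pd


def shrink(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with smaller dtypes where that is straightforward."""
    out = df.copy()
    for col in out.columns:
        s = out[col]
        if pd.api.types.is_integer_dtype(s):
            # Pick the smallest integer type that can hold the values.
            out[col] = pd.to_numeric(s, downcast="integer")
        elif pd.api.types.is_float_dtype(s):
            # float64 -> float32 where the values survive the cast.
            out[col] = pd.to_numeric(s, downcast="float")
        elif pd.api.types.is_string_dtype(s) and s.nunique() < 0.5 * len(s):
            # Few distinct strings: store each value once plus integer codes.
            out[col] = s.astype("category")
    return out


# Hypothetical sample data, only to show the effect.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "n": rng.integers(0, 100, size=10_000),
    "x": rng.random(10_000),
    "city": rng.choice(["Paris", "Bologna", "Berlin"], size=10_000),
})
small = shrink(df)

before = df.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(f"before: {before} bytes, after: {after} bytes")
```

How much memory this saves depends entirely on the data. Columns with many distinct strings or values that need the full 64-bit range will not shrink.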
Brain-wide Organization of Neuronal Activity and Convergent Sensorimotor Transformations in Larval Zebrafish - ScienceDirect
Simultaneous recordings of large populations of neurons in behaving animals allow detailed observation of high-dimensional, complex brain activity. However, experimental approaches often focus on singular behavioral paradigms or brain areas. Here, we recorded whole-brain neuronal activity of larval zebrafish presented with a battery of visual stimuli while recording fictive motor output. We identified neurons tuned to each stimulus type and motor output and discovered groups of neurons in the anterior hindbrain that respond to different stimuli eliciting similar behavioral responses. These convergent sensorimotor representations were only weakly correlated to instantaneous motor activity, suggesting that they critically inform, but do not directly generate, behavioral choices. To catalog brain-wide activity beyond explicit sensorimotor processing, we developed an unsupervised clustering technique that organizes neurons into functional groups. These analyses enabled a broad overview of the functional organization of the brain and revealed numerous brain nuclei whose neurons exhibit concerted activity patterns.
data  neuroscience  neural_coding_and_decoding  neural_data_analysis  spatio-temporal_statistics  big_data  for_friends  via:? 
6 weeks ago by rvenkat
Algorithmic Government: Automating Public Services and Supporting Civil Servants in using Data Science Technologies
The data science technologies of artificial intelligence (AI), Internet of Things (IoT), big data and behavioral/predictive analytics, and blockchain are poised to revolutionize government and create a new generation of GovTech start-ups. The impact from the ‘smartification’ of public services and the national infrastructure will be much more significant in comparison to any other sector given government’s function and importance to every institution and individual. Potential GovTech systems include Chatbots and intelligent assistants for public engagement, Robo-advisors to support civil servants, real-time management of the national infrastructure using IoT and blockchain, automated compliance/regulation, public records securely stored in blockchain distributed ledgers, online judicial and dispute resolution systems, and laws/statutes encoded as blockchain smart contracts. Government is potentially the major ‘client’ and also ‘public champion’ for these new data technologies. This review paper uses our simple taxonomy of government services to provide an overview of data science automation being deployed by governments world-wide. The goal of this review paper is to encourage the Computer Science community to engage with government to develop these new systems to transform public services and support the work of civil servants.
ai  blockchain  iot  big_data  data_analytics 
8 weeks ago by jcmdlc
Higher patient satisfaction with antidepressants correlates with earlier drug release dates across online user‐generated medical databases
> The advent of large online databases in which patients themselves rate drugs allows for a new Big Data–driven approach to compare the efficacy and patient satisfaction with sample sizes exceeding previous studies. Exemplifying this approach with antidepressants, we show that patient satisfaction with a drug anticorrelates with its release date with high significance, across different online user‐driven databases. This finding suggests that a systematic reevaluation of current, often patent‐protected drugs compared to their older predecessors may be helpful, especially given that the efficacy of newer agents relative to older classes of antidepressants such as monoamine oxidase inhibitors (MAOIs) and tricyclic antidepressants (TCAs) is as yet quantitatively unexplored.
antidepressant  psychiatry  big_data 
9 weeks ago by porejide
The Kinds of Data Scientist
-- they forgot to include the data scientists that call out snake oil salesmanship and the ones who worry about ethics.
big_data  data_science  statistics  machine_learning 
9 weeks ago by rvenkat


related tags

advertising  agriculture  ai  airbnb  airbus  algorithms  analysis  analytics  andrew_connolly  antidepressant  apache_beam  apple  architecture  article  artificial_intelligence  astronomy  async  aws  aws_security  batch_processing  bianca_wylie  blockchain  bologna  business  california  cars  chart  cheatsheet  chris_williams  cleaning  climate  cloud  cloud_computing  cmu  company  computing  connectomics  copernicus  correlation  credit_scoring  data  data_analytics  data_engineering  data_ethics  data_pipeline  data_privacy  data_processing  data_science  data_visualisation  database  datascience  dataset  daten  deep-learning  deep_learning  dias  digital_ethics  distributed  dnn  due_process  economics  edge  education  elasticsearch  esa  ethics  ethics_of_algorithms  european_commission  fairness  farming  feynman  for_friends  forecasting  france  geodata  geospatial  germany  google  google_dataflow  government  gpu  graphs  hacking  hadoop  higher_education  hpc  humanitarian  ict4d  innovation  internet_of_things  iot  is:repo  israel  italy  java  journalismus  justice  lambda_architecture  lang:de  law  learning  lsst  machine_learning  management  mapreduce  marketing  memory  methodology  mitsmr  ml  mobile  navigation  netflix  networking  netz  neural_coding_and_decoding  neural_data_analysis  neuroscience  nosql  open_source  opensource  optimization  ownership  pandas  paper  papers  paris  patterns  policy  privacy  probability  psychiatry  python  random  reading_list  reporting  research  resource  robotics  s3  satellite  security  sharing  sidewalk_labs  siemens  smart_cities  software  software_architecture  spatio-temporal_statistics  statistics  storage  stream_processing  surveillance  swisscom  tech  technology  testomonial  tools  trusted_computing  tutorial  tweet  ucberkeley  usa  video  visualization  weather  wind  zero_copy  ★★★★☆  ★★★☆☆ 
