big_data   4099

Hidden Technical Debt in Machine Learning Systems
Machine learning offers a fantastically powerful toolkit for building useful complex
prediction systems quickly. This paper argues it is dangerous to think of
these quick wins as coming for free. Using the software engineering framework
of technical debt, we find it is common to incur massive ongoing maintenance
costs in real-world ML systems. We explore several ML-specific risk factors to
account for in system design. These include boundary erosion, entanglement,
hidden feedback loops, undeclared consumers, data dependencies, configuration
issues, changes in the external world, and a variety of system-level anti-patterns.
paper  big_data  machine_learning 
4 days ago by istemi
Explained: What is big data? - Malwarebytes Labs
If the pile of manure is big enough, you will find a gold coin in it eventually. This saying is used often to explain why anyone would use big data. Needless to say, in this day and age, the piles of data are so big, you might end up finding a pirate’s treasure.
How big is the pile?
But when is the pile big enough to consider it big data? Per Wikipedia:
“Big data is data sets that are so big and complex that traditional data-processing application software are inadequate to deal with them.”
As a consequence, we can say that it’s not just the size that matters, but the complexity of a dataset. The draw of big data to researchers and scientists, however, is not in its size or complexity, but in how it may be computationally analyzed to reveal patterns, trends, and associations.
When it comes to big data, no mountain is high enough or too difficult to climb. The more data we have to analyze, the more relevant conclusions we may be able to derive. If a dataset is large enough, we can start making predictions about how certain relationships will develop in the future and even find relationships we never suspected to exist.
big_data  data  business  security  privacy 
7 days ago by rgl7194
How Big Data and AI Are Driving Business Innovation in 2018
... 2018 survey is that an overwhelming 97.2% of executives report that their companies are investing in building or launching big data and AI initiatives. Among surveyed executives, a growing consensus is emerging that AI and big data initiatives are becoming closely intertwined, with 76.5% of executives indicating that the proliferation and greater availability of data is empowering AI and cognitive initiatives within their organizations.
big_data  artificial_intelligence  mitsmr 
15 days ago by tom.reeder
Sensor city: Sidewalk Labs’ Toronto project triggers debate over data - The Globe and Mail
Concerns have begun to emerge, however, that uncritically embracing the opportunity of Quayside could set precedents that stifle Canada's potential. And a growing number of Canadian tech leaders are beginning to ask: If data generated by Canadian cities creates value, shouldn't Canadians share in it?...

How Canada's next generation of infrastructure is built will determine who gets the most value and competitive advantage from it - not just now, but in 10, 20, even 50 years, as cities evolve, innovation blossoms and unanticipated revenue streams emerge. Many tech leaders are calling for a national data strategy, similar to those being discussed across Europe, to ensure that Canada doesn't unwittingly sign away the chance for economic spinoffs. And while Ottawa is promising such a strategy in the coming months, it may not come in time to address Alphabet's Toronto conquest....

Sidewalk has discussed creating a trust to own the data generated by the Quayside project, which Mr. Doctoroff said might be a more independent path than having it handled by governments. It has also promised not to commercialize the data, but Mr. Doctoroff said the company has not ruled out "ultimately licensing the technology" developed in Toronto as a way to monetize the project...

Without a cohesive national strategy, Canada's data risks being taken advantage of, says Ben Bergen, executive director of the Council of Canadian Innovators, which is chaired by former Research in Motion Ltd. co-CEO Jim Balsillie...

... forthcoming strategy would address some of the issues Canada's tech community is raising: "Who owns the data? Who will benefit from data? Who monetizes the data? What are some of the ethical issues around that?" In Ontario, Economic Development Minister Steven Del Duca said in an interview that questions around Sidewalk's proposals for Toronto have "focused" the province's need to act, and multiple ministries are working together on a cohesive plan: "We can't afford to wait five or 10 years."
data_privacy  smart_cities  sidewalk_labs  privacy  big_data 
4 weeks ago by shannon_mattern
Input plugins
Get started with the documentation for Elasticsearch, Kibana, Logstash, Beats, X-Pack, Elastic Cloud, Elasticsearch for Apache Hadoop, and our language clients.
4 weeks ago by jarechu
