Hadoop   25584

Mastering Spark with R
With information growing at exponential rates, it’s no surprise that historians are referring to this period of history as the Information Age. The increasing speed at which data is being collected has created new opportunities and is certainly poised to create even more. This chapter presents the tools that have been used to solve large-scale data challenges. First, it introduces Apache Spark as a leading tool that is democratizing our ability to process large datasets. With this as a backdrop, we introduce the R computing language, which was specifically designed to simplify data analysis. Finally, this leads us to introduce sparklyr, a project merging R and Spark into a powerful tool that is easily accessible to all.
r  spark  dataanalytics  datascience  hadoop  databases 
26 days ago by markogara
Download QuickStarts for CDH 5.13 | Cloudera
Cloudera QuickStart virtual machines (VMs) include everything you need to try CDH, Cloudera Manager, Impala, and Cloudera Search. The VM uses a package-based install.
cdh  hadoop  hbase  vmare 
6 weeks ago by ywz
