rafaeldff + algorithms   34

Top 10 algorithms in data mining
This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm.
paper  DataMining  datascience  algorithm  algorithms  list  introduction  pdf 
march 2014 by rafaeldff
reference request - What's new in purely functional data structures since Okasaki? - Theoretical Computer Science - Stack Exchange
Since Chris Okasaki's 1998 book "Purely functional data structures", I haven't seen too many new exciting purely functional data structures appear (snip). Which other new ideas have appeared since 1998 in this area?
DataStructure  DataStructures  functional  PurelyFunctional  Okasaki  persistent  algorithms  stackoverflow  cstheory  stackexchange  question  Q&A 
december 2011 by rafaeldff
Mapreduce & Hadoop Algorithms in Academic Papers (4th update – May 2011)
Learn from the academic literature how the MapReduce parallel model and its Hadoop implementation are used to solve algorithmic problems (with a comprehensive list of papers).
papers  research  MapReduce  parallel  algorithm  algorithms  catalog  paper  academia  hadoop 
may 2011 by rafaeldff
hatful of hollow - Visualising Sorting Algorithms
"Static visualizations of sorting algorithms (clearer than Sedgewick-style animations)"
visualization  algorithm  algorithms  sorting  CS  ComputerScience  graphic  AldoCortesi 
april 2009 by rafaeldff
StringSearch – high-performance pattern matching algorithms in Java
[Java] lacks fast string searching algorithms. StringSearch provides implementations of the Boyer-Moore and the Shift-Or (bit-parallel) algorithms. [They] are easily five to ten times faster than the naïve implementation found in java.lang.String.
project  opensource  utility  String  Strings  Java  search  searching  license:MIT  Boyer-Moore  Boyer-Moore-Horspool  Shift-Or  algorithm  algorithms  wildcards  heuristic  mismatches  EditDistance 
june 2008 by rafaeldff
A Computational Introduction to Number Theory and Algebra
A book introducing basic concepts from computational number theory and algebra, including all the necessary mathematical background.
book  mathematics  cs  computerscience  Algebra  NumberTheory  free  online  VictorShoup  cryptography  theory  algorithms  reference 
may 2008 by rafaeldff
Datawocky: More data usually beats better algorithms
Team A came up with a very sophisticated algorithm using the Netflix data. Team B used a very simple algorithm, but they added in additional data beyond the Netflix set: information [...] from the [...] IMDB. Guess which team did better?
blog  post  data  datamining  AnandRajaraman  algorithm  Stanford  BI  algorithms  competition  efficiency  recommendations 
april 2008 by rafaeldff
Ropes: Theory and practice
"A rope data structure represents an immutable sequence of characters, much like a Java String. But ropes' highly efficient mutations make ropes, unlike Strings and (...) StringBuffer (...), ideal for applications that do heavy string manipulation, especi
toread  algorithm  DataStructure  Java  algorithms  DataStructures  String  Rope  tree  performance  ropes  programming  article  tutorial  AminAhmad  developerworks 
february 2008 by rafaeldff
Public Object: Series Recap: Coding in the small with Google Collections
"several snippets that highlight the carefully designed Google Collections Library:"
Note that, despite the name, the project contains more than just collections; in fact, it looks like an improved, Java 5-enabled subset of Jakarta Commons.
blog  post  tutorial  howto  JesseWilson  Java  Java5  google  GoogleCollectionsLibrary  GoogleCollections  library  collection  collections  containers  DataStructures  api  algorithms  RobertKonigsberg  JeromeMourits 
october 2007 by rafaeldff
Google Code for Educators - Google: Cluster Computing and MapReduce
"This submission contains video lectures and related course materials from a series of lectures that was taught to Google software engineering interns during the Summer of 2007."
site  page  course  academic  Google  distributed  computing  systems  MapReduce  cluster  parallelism  parallel  GFS  video  catalog  clustering  graph  algorithm  algorithms  HPC  lecture  lectures  movie  scalability  towatch 
october 2007 by rafaeldff
Amazon's Dynamo - All Things Distributed
Amazon has also felt the need to develop a semi-general-purpose, highly scalable database system. I wonder whether there will ever be demand for this kind of Megadata(tm) system in the overall market, or whether such systems will remain restricted to the service giants.
toread  blog  post  paper  WernerVogels  Amazon  system  database  storage  data  distributed  systems  scalability  Megadata  GiuseppeDeCandia  DenizHastorun  MadanJampani  GunavardhanKakulapati  AvinashLakshman  AlexPilchin  SwaminathanSivasubramanian  PeterVosshall  reliability  performance  algorithm  algorithms  architecture  availability  consistency  CAP 
october 2007 by rafaeldff
A brief history of Consensus, 2PC and Transaction Commit. Beta Thoughts
Not so brief, in fact. Lots of links to relevant papers, and a superb job of contextualizing those developments. Of course, to really understand the algorithms we have to consult the literature (including those linked papers).
blog  post  MarkMcKeown  distributed  systems  CS  ComputerScience  algorithm  algorithms  transaction  consensus  2PC  3PC  Paxos  fault  availability  safety  research  SOA  REST 
june 2007 by rafaeldff
The Algorithm: Idiom of Modern Science
Bashing of famous physicists, American history jokes, semiotics, and cryptography in an excellent introduction to (algorithmic) computer science
BernardChazelle  Princeton  essay  article  introduction  computerscience  cs  computing  complexity  algorithm  algorithms  math  mathematics  linguistics  history  physics  philosophy  science  cryptography  tractability  decidability  NP  analysis  academia  academic 
september 2006 by rafaeldff
The Calendar FAQ
All you ever didn't want to have to know about calendars and had no one to ask...
site  faq  calendar  date  astronomy  history  programming  time  algorithm  algorithms  i18n  l10n 
august 2006 by rafaeldff
Purely Functional Data Structures
Chris Okasaki's thesis on data structures for lazy functional languages (with special attention to amortized analysis)
paper  thesis  lazy  functional  programming  algorithms  DataStructures  algorithm  DataStructure  data  haskell  ML  pdf 
august 2006 by rafaeldff

