wjy + datamining   16

Introduction to Data Mining
Introduction to Data Mining (Second Edition)
datascience  datamining  books  book  bigdata 
august 2018 by wjy
QMiner
QMiner is a data analytics platform for processing large-scale real-time streams containing structured and unstructured data.
bigdata  datamining  service  api  analytics 
january 2016 by wjy
Weka 3 - Data Mining with Open Source Machine Learning Software in Java
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.
datamining  machinelearning  opensource  java  weka 
november 2014 by wjy
Mining of Massive Datasets
The book is based on Stanford Computer Science course CS246: Mining Massive Datasets (and CS345A: Data Mining).
datamining  bigdata  ebook  datasets  books  machinelearning  mooc  datascience  stanford 
november 2014 by wjy
Orange – Data Mining Fruitful & Fun
Open source data visualization and analysis for novices and experts. Data mining through visual programming or Python scripting. Components for machine learning. Add-ons for bioinformatics and text mining. Packed with features for data analytics.
visualization  datamining  python  machinelearning  ml  software 
september 2014 by wjy
Apache Mahout: Scalable machine learning and data mining
The Apache Mahout™ project's goal is to build a scalable machine learning library.
machinelearning  datamining  apache  mahout  hadoop  ml  library  opensource 
april 2014 by wjy
Jaccard index - Wikipedia, the free encyclopedia
A statistic used for comparing the similarity and diversity of sample sets.
similarity  statistics  algorithm  wikipedia  clustering  datamining  jaccard  math  sets  programming 
march 2014 by wjy
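
As a rough illustration of the Jaccard index bookmarked above: it is the size of the intersection of two sets divided by the size of their union. The sketch below is not taken from the linked page. The function name `jaccard_index` is hypothetical, and returning 1.0 for two empty sets is an assumed convention.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections of hashable items."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


# Two tag sets sharing 2 of 4 distinct tags.
similarity = jaccard_index({"data", "mining", "python"}, {"data", "mining", "java"})
print(similarity)  # 0.5
```

A value of 0 means the sets share nothing and 1 means they are identical. The Jaccard distance is commonly taken as one minus this value.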
clips/pattern
Web mining module for Python

Pattern is a web mining module for the Python programming language. It bundles tools for data mining (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), natural language processing (tagger/chunker, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, k-means clustering, Naive Bayes + k-NN + SVM classifiers) and network analysis (graph centrality and visualization). It is well documented and bundled with 30+ examples and 350+ unit tests. The source code is licensed under BSD and available from http://www.clips.ua.ac.be/pages/pattern.
datamining  text-mining  tools  library  module  mining  scraping  webmining  nlp  python 
october 2012 by wjy
Python Data Analysis Library — pandas: Python Data Analysis Library
pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
library  dataanalysis  data-analysis  pandas  opensource  datamining  analysis  data  statistics  python 
june 2012 by wjy
Waffles
A collection of command-line tools for researchers in machine learning, data mining, and related fields. All of the functionality is also provided in a clean C++ class library. Demo apps are included to show how to use the class library.
research  opensource  library  c++  programming  datamining  machinelearning 
december 2011 by wjy
