indexing   5503

asdine/storm: Simple and powerful toolkit for BoltDB
Simple and powerful toolkit for BoltDB.
github  go  boltdb  indexing 
yesterday by snahor
blevesearch/bleve: A modern text indexing library for go
A modern text indexing library for go.
github  go  indexing  search  text-search 
yesterday by snahor
myui/btree4j: Disk-based B+-tree written in Pure Java
Disk-based B+-tree written in Pure Java.
github  java  btree  data-structures  trees  indexing 
2 days ago by snahor
Against Cleaning
"This may be only a current issue, a tax on those humanities researchers who wish to adopt new methods, asking them to over-explain their work processes in order to hash out new regimes for research in this domain. Once new methods are more widely practiced, the data-intensive humanities researcher may also be able to toss off the shorthand of “data cleaning.” For now, there is value in being arrested by the obfuscation of this phrase. Trying to more precisely say what we mean by “data cleaning” can be fruitful because this effort directs our attention to an unresolved conversation about data and reductiveness. In turn, this might help us to develop new work that blends the tradition of cultural criticism from the humanities with research that is also digital and data-intensive."
data-cleaning  digitalhumanities  indexing 
11 days ago by jschneider
An Alternative to a Keep It All Data Management Strategy – Index Engines Briefing Note | The Home of Storage Switzerland
Index Engines, as the name implies, is built from the ground up to rapidly scan data and create an index based on file attributes and the information inside each file. Index Engines claims its indexers can scan and process up to 1TB of file data per hour and produce indexes that are less than 5% of the size of the data set. An organization can install multiple indexers to scale scanning throughput to meet business needs. Most importantly, the Index Engines solution can also scan across various storage systems, into backup data, and across cloud resources, making it possible to create a universal index and repository for all of the organization’s data.
21 days ago by euler
Consolidate duplicate URLs - Search Console Help
If you have a single page accessible by multiple URLs, or different pages with similar content (for example, a page with both a mobile and a
google  sitemap  indexing 
5 weeks ago by jberkel
