cmillward + search   105

Regular Expression Matching with a Trigram Index
Regular expression matching over large numbers of small documents can be made fast by using an index of the trigrams present in each document. The use of trigrams for this purpose is not new, but neither is it well known.

Despite all their apparent syntactic complexity, regular expressions in the mathematical sense of the term can always be reduced to the few cases (empty string, single character, repetition, concatenation, and alternation) considered above. This underlying simplicity makes it possible to implement efficient search algorithms like the ones in the first three articles in this series. The analysis above, which converts a regular expression into a trigram query, is the heart of the indexed matcher, and it is made possible by the same simplicity.

If you miss Google Code Search and want to run fast indexed regular expression searches over your local code, give the standalone programs a try.
regex  search  programming 
january 2012 by cmillward
Introduction to Information Retrieval
This is the companion website for the following book.

Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2008.
ir  books  programming  algorithms  search 
july 2010 by cmillward
The Britney Spears Problem [American Scientist]
"Tracking who's hot and who's not presents an algorithmic challenge"
via:migurski  algorithms  search  data  popularity 
may 2010 by cmillward
Idea Navigation: Structured Browsing for Unstructured Text
"Don’t search for keywords! Search for ideas! Our system extracts subject-verb-object triples from unstructured text, groups them into hierarchies, and allows iterative refinement to find exactly what you want."
search  semantic_web  nlp  video  presentation  2008 
june 2008 by cmillward
Google Site Search bookmarklet [HubLog]
Here's an update of an existing Google Site Search bookmarklet: it lets you search the current site—using Google—for a) the currently selected text, b) entered search terms or c) all pages.
bookmarklet  search  google  web 
march 2008 by cmillward
kill-ring-search
Copied something important half an hour ago? Tired of hitting M-y 20 times? Now you can search the kill ring incrementally and yank the result!
elisp  emacs  lib  search  yank 
march 2008 by cmillward
ComicSeeker
Search the Internet's most popular online comic book stores and auctions with ComicSeeker.com.
comics  search  backissues 
february 2008 by cmillward
[0712.3360] Compressed Text Indexes: From Theory to Practice!
A compressed full-text self-index represents a text in a compressed form and still answers queries efficiently. This technology represents a breakthrough over the text indexing techniques of the previous decade, whose indexes required several times the si
algorithms  compression  text  programming  search 
january 2008 by cmillward
natural language processing blog: Particle filtering versus beam search
I had a very interesting discussion at NIPS with Vikash Mansingka about beam search and particle filtering.
search  algorithms  nlp  programming  stochiastic  linguistics 
december 2007 by cmillward
TextMap - The Entity Search Engine - Newspaper Analysis
a search engine for entities: the important (and not so important) people, places, and things in the news. Our news analysis system automatically identifies and monitors these entities, and identifies meaningful relationships between them.
2007  nlp  search  ai  toread  news 
august 2007 by cmillward
University Research Program for Google Search
The University Research Program for Google Search is designed to give university faculty and their research teams high-volume programmatic access to Google Search, whose huge repository of data constitutes a valuable resource for understanding the structur
google  api  research  academia  search 
august 2007 by cmillward
Open Library (Open Library)
Imagine a library that collected all the world's information about all the world's books and made it available for everyone to view and update. We're building that library.
oss  library  search  apps  metadata  books 
july 2007 by cmillward
Official Google Blog: 1-800-GOOG-411: now with maps
a free telephone service that lets you search for businesses by voice and get connected to those businesses for free.
phone  google  maps  search 
july 2007 by cmillward
Netscan : Search Newsgroups
useful usenet interface from Microsoft
microsoft  research  usenet  ui  search  viz  web  tools 
november 2006 by cmillward
TicTap: Painless online shopping with a PC, PDA or SMS
SMS your mobile searches to 763-807-3927 to search for items on Amazon via mobile phone (ISBN or UPC or keywords). great for price compares while in a store
amazon  search  mobile  sms  shopping 
november 2006 by cmillward
Lemonodor: Montezuma Begins
I can't wait to see how it fares performance-wise to the Ruby and Java versions.
lisp  search  lucene 
march 2006 by cmillward
Montezuma - Trac
Montezuma might be a fast, useful text search engine library written entirely in pure Lisp.

Montezuma is a Common Lisp port of Ferret. Ferret is a Ruby port of Lucene.
lisp  apps  dev  lucene  search 
march 2006 by cmillward
A Study of Visualisation Tools for the Web (HCI, CHI, web, search engine, search interface, visualization)
Desperately needed are better designed user interfaces that utilise natural language and agent software to guide the user through the iterative process of formulating the query.
web  hci  search  article  toread  research  nlp  ui 
march 2006 by cmillward
Wired News: Here Comes a Google for Coders
Krugle, which launches officially next month, indexes programming code and documentation from open-source repositories like SourceForge and includes corporate sites for programmers like the Sun Developer Network.
toread  programming  search  web 
february 2006 by cmillward
O'Reilly Network: Googling Your Email
What would it be like to Google your email? Raphaël Szwarc's ZOË is a clever piece of software that explores this idea.
toread  zoe  email  search  google 
january 2006 by cmillward
Swik
an autodiscovery wiki about open source software
oss  wiki  search  programming 
september 2005 by cmillward
Microsoft and Google's War of the Worlds
Opinion: Sure, they're eye-catchers, but will local search services really provide useful and timely information?
business  search  local  article  opinion 
september 2005 by cmillward
[WEB4LIB] RE: Yahoo-OCLC toolbar
Open WorldCat engine for Firefox mycroft
mozdev  search  firefox  library 
august 2005 by cmillward
Tag Central
aggregates multiple webapps supporting tags allowing you to search them all at the same time
tagging  webapps  search 
june 2005 by cmillward
Querying Linguistic Databases
UPenn group behind LPath, for linguistic queries on corpora
nlp  ai  corpus  search  xml  lpath  xpath 
june 2005 by cmillward
Blogdigger - Media/RSS Search Engine
Search Webjay playlists with Blogdigger, then play the results
search  rss  mp3 
march 2005 by cmillward
HubLog: A9's street photos
a little discussion and links about search features in A9 and others
search  blogentry 
january 2005 by cmillward

