organizing and analyzing phenotypic data from patients with genetic disorders.
phenotypes  genetics  research  software 
august 2015 by arthegall
John P. A. Ioannidis, "Why Most Published Research Findings Are False" PLoS Medicine (2005)
"Commercially available “data mining” packages actually are proud of their ability to yield statistically significant results through data dredging."
john-ioannidis  research  medicine  research-article  statistics  p-values  via:nikete 
july 2013 by arthegall
Price doesn't always buy prestige in open access : Nature News & Comment
Would totally believe that many inhabitants of the top and bottom of the rankings are outliers due to (small) size. Let's get some multilevel modeling here -- SHHHHHRRRRIIIINKAGE, Eli, you boy!
open-access  journals  science  publishing  costs  research  there-will-be-shrinkage 
january 2013 by arthegall
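A rough sketch of the shrinkage the note above is asking for, assuming a simple normal partial-pooling model with known variances. The function name, the variance parameters, and the size-weighted grand mean are all illustrative assumptions, not anything taken from the Nature piece:

```python
def shrink_group_means(means, sizes, within_var, between_var):
    """Pull each group's observed mean toward the overall mean.

    Groups with few observations are pulled hardest, because their
    sampling variance (within_var / n) is large relative to between_var.
    All names and parameters here are hypothetical.
    """
    total = sum(sizes)
    # Size-weighted overall mean (a simplification of a fitted grand mean).
    grand = sum(m * n for m, n in zip(means, sizes)) / total
    shrunk = []
    for m, n in zip(means, sizes):
        # Weight on the group's own data: near 1 for large n, near 0 for tiny n.
        weight = between_var / (between_var + within_var / n)
        shrunk.append(grand + weight * (m - grand))
    return shrunk
```

Under these assumptions, a tiny journal and a huge journal with the same raw score end up in different places: the tiny one moves most of the way back toward the middle of the rankings.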
An embargo on short read alignment tools | biomickwatson
Gets two things wrong:
1. Pessimism : "What I don’t understand is how, with over 70 short-read aligners out there, you can publish a new one that you can show is better than all of the existing tools. And also, why you would bother?" -- which is a line on the order of "The advancement of the arts, from year to year, taxes our credulity and seems to presage the arrival of that period when human improvement must end."
2. Research != Software Development
And then, adds in a dash of funny -- "Oh, and in answer to the burning question you all have, I use Novoalign" -- which is, of course, closed-source and unable to be branched and modified.
via:?  bioinformatics  aligners  sequence-analysis  research  software  programming 
december 2012 by arthegall
Readings in Databases
Yet another list-of-classic-DB-papers.
list  databases  research  via:?? 
july 2012 by arthegall
From Words to Concepts and Back: Dictionaries for Linking Text, Entities and Ideas | Research Blog
What's a "concept" again? (Is this what they meant, when they were writing about "the end of Models?")
concepts  ontology  words  peter-norvig  google  research  dbpedia  statistics 
may 2012 by arthegall
"Investigation, Study, Assay" infrastructure -- to see how this interacts with ontologies like OBI?
annotation  data  ontology  research  science 
august 2011 by arthegall
FaceL CSU Homepage
Facile Face Labeling -- identifies multiple faces within video, and then recognizes them individually.
research  open  face-recognition  video 
may 2011 by arthegall
Face Recognition Homepage - Algorithms
Collection of papers and references for algorithms to do facial recognition.
facial-recognition  list  index  research  papers 
may 2011 by arthegall
The Initiative | ORCID
Ontology for investigators? A replacement for ILAR? Maybe?
ilar  ontology  data  investigators  authors  science  research 
april 2011 by arthegall
"DocumentCloud is an index of primary source documents and a tool for annotating, organizing and publishing them on the web."
web  journalism  data  archiving  open-access  research  annotation 
august 2010 by arthegall
"Open-Source Pharmaceutical Babble" (In the Pipeline)
"And that's it; that's the payoff. We'll all just hop to it, enabling and facilitating, expanding and evolving, stimulating and focusing. None of those are concrete verbs suggesting real courses of action. Whenever you see someone slip into that sort of talk, you can be sure that (at the very least) they have difficulty communicating whatever specific ideas they have. Or (more likely) that they don't have any specific ideas to tell you about at all."
opensource  buzzwords  pharmaceuticals  research  community  web  futurism  science 
july 2010 by arthegall
"A full-featured web site-creation package solely for the academic community. Scholars create web sites in seconds and can easily manage everything themselves (for free)" -- yet another Drupal-based system for "community creation." (YADSCC.)
drupal  education  academia  research  software  web  community  cms  opensource 
july 2010 by arthegall
"Megan McArdle’s Hack Post on Elizabeth Warren’s Scholarship" (Rortybomb)
"If you made it this far, I feel terrible for you. I feel like Virgil leading you through a Glibertarian Inferno." -- I'm glad that Konczal is writing about this, because otherwise I'd have to read Levenson's post, which would make me want to claw my own eyes out.
mike-konczal  megan-mcardle  elizabeth-warren  bankruptcy  statistics  research  politics 
july 2010 by arthegall
List of important publications in computer science - Wikipedia
Including a few gems I'd never seen before, such as the Kildall paper and the "rendering equation" paper.
index  list  papers  computerscience  research  bibliography 
june 2010 by arthegall
"Does teaching matter at (American) research universities?" (Crooked Timber)
"The problem about teaching is that we do not have good instruments for measuring its quality." -- the heart of the matter, I think.
university  academia  teaching  research  measurement 
may 2010 by arthegall
Lire
"The LIRE (Lucene Image REtrieval) library provides a simple way to retrieve images and photos based on their color and texture characteristics. LIRE creates a Lucene index of image features for content based image retrieval (CBIR)."
lire  lucene  java  library  research  open-source  content-based-image-retrieval  image-retrieval  search  indexing  images 
may 2010 by arthegall
Scholarly Ontologies Project: Knowledge Media Inst., Open U. (UK)
"ClaimBlogger!" -- But their ontology ("schema") at the bottom of the page looks hand-edited and quirky and ad hoc, and has (I would guess) about a 50% chance of being incoherent. For instance, it talks about rdfs:Property, but I think they mean rdf:Property, and that leads to a bunch of other problems later on...
ontologies  research  work  blogging  software  tools  scholonto  via:paoloc 
april 2010 by arthegall
Please do not change your password - The Boston Globe
The Boston Globe article that got me reading papers by Cormac Herley. (As you might expect, the story is a little more complicated than the news article appears to understand, or lets on.) Need to forward some of this to scorlosquet.
security  password  news-article  microsoft  research  internet 
april 2010 by arthegall
Perez, Arenas, and Gutierrez, "Semantics and complexity of SPARQL"
Saved so that I can respond to a snarky email I just received, should the need arise. The short story, as I take it, is that "unrestricted" SPARQL (that is, SPARQL with UNION and OPT, along with FILTER and AND) is PSPACE-complete. SPARQL with just UNION is linear, and SPARQL with UNION + restricted OPT (restricted in a way that most reasonable queries will be) is coNP-complete.
complexity  query-language  query-optimization  sparql  semanticweb  computerscience  research 
january 2010 by arthegall
Stephen Friend, "Five Biotechnologies That Will Fade Away This Decade"
Linked to by Derek Lowe -- Friend is one of the leaders of Sage Bionetworks, which I was just talking about with JAR yesterday. Funny to read point #5, because I was under the impression that that had *already* faded away...
bioinformatics  sage-bionetworks  stephen-friend  research  drug-discovery  pharmaceuticals  biotechnology  futurism 
january 2010 by arthegall
NCI caBIG: "Cancer Data Standards Registry and Repository (caDSR)"
"caDSR is a database and a set of APIs and tools to create, edit, control, deploy, and find common data elements (CDEs) for metadata consumers and information about the UML models and Forms containing CDEs for use in software development." --- More wheel re-invention. <sigh>. But ... okay.
cadsr  data  format  metadata  nci  standard  cancer  research  work 
december 2009 by arthegall
P³G Observatory - Home
The "observatory" is (apparently) a set of hand-curated *pairs* of population genomics studies, reviewed to discover whether they are compatible with each other, that is, whether they can be used in the same meta-analysis.
population-genetics  meta-analysis  genomics  data  science  research 
december 2009 by arthegall
SpringerLink - Book
Machine Learning and Knowledge Discovery in Databases
European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part II
book  springer  research  machinelearning  knowledge-base  database 
november 2009 by arthegall
SpringerLink - Book
Machine Learning and Knowledge Discovery in Databases
European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part I
springer  book  research  knowledge-base  machinelearning 
november 2009 by arthegall
Ailon, Chazelle, Clarkson, Liu, Mulzer, and Seshadri, "Self-Improving Algorithms"
Why do I feel like I should recognize the name Kenneth Clarkson? (I don't know. Maybe I shouldn't.)
research  algorithms  arxiv  research-article 
november 2009 by arthegall
ZIB Optimization Suite
Mixed integer and linear program solvers -- free for academic use, and written in C++.
linear-programming  optimization  operations  research  software  c++ 
november 2009 by arthegall
Michael Trick’s Operations Research Blog : Without Operations Research, Gridlock!
"Of course, without operations research, which determines the correct times and coordinates it across the network, it would be a disaster all the time." -- A rather narrow counterfactual.
operations  research  traffic  optimization  design  via:Vaguery 
november 2009 by arthegall
CRF Project Page
A reimplementation of some of the CRF papers (including the original one).
software  machinelearning  conditional-random-fields  research 
november 2009 by arthegall
"Closed World vs. Open World: the First Semantic Web Battle" (Stefano’s Linotype)
Not only is this a good, informal description of the two terms, but I think it really gets at the differences in approach and assumption between database people who work on "graph databases" and "triple stores," and 'semantic' types (some of whom, I know, don't really like that term) who rely on reasoners and semantic interpretations. The widespread use of ontologies falls mainly within that second camp. Also of note is Stefano M.'s "father" example -- *note well* the point about "inferring the identity of Antonio and Franco"! That's an important thing! And of course, Mazzocchi (and Horrocks, and Hayes) were saying all of this four to five years ago. I wonder where the Linked Data people think they fit into all of this...
linked-data  semanticweb  logic  open-world-assumption  closed-world-assumption  opinion  schism  database  research 
november 2009 by arthegall
Collins et al. "The Human Genome Project: Lessons from Large-Scale Biology"
A 2003 retrospective on the organization, milestones, and management of large-scale (factory-style) genomics research.
human-genome-project  genomics  management  biology  research  science  factory-science 
november 2009 by arthegall
SVN 1.5: Repository Maintenance
svnadmin dump/load and svndumpfilter are useful for svn maintenance, for future reference. Although I don't think I can use this to import things into the Google Code repository, which is a bummer.
google-code  svn  research  stereo  programming  tutorial  reference  via:arolfe 
october 2009 by arthegall
Architecture - NeuroCommons
Jonathan's notes about the project under whose aegis I'm currently working (at least in part). Worth reading, including the notes about the Semantic Web at the end.
neurocommons  work  research  semanticweb  by:jar 
october 2009 by arthegall
HyBrow Home Page
This looked oddly familiar when I stumbled across it again... (via Jonathan)
research  science  semanticweb  via:jar 
october 2009 by arthegall
gse-stereo-t (Google Code)
Code for the (forthcoming) publication will be deposited here...
transcription  research  code  google  publication  recomb  segmentation 
october 2009 by arthegall
GettingStarted - support - A quick guide to getting started with project hosting on Google Code.
Guide to setting up a project on Google Code. The transcription stuff is going to go on here...
transcription  research  google  software 
october 2009 by arthegall
International Semantic Web Conference
Archives of old conferences. Lots of good stuff in here -- in particular, check out the best paper awards for '08.
semanticweb  index  conference  research 
october 2009 by arthegall
"A multimodal, multidimensional atlas of the C57BL/6J mouse brain"
Images and data about the anatomical structure of the mouse brain. (Specifically, the "Black" strain from the JAX lab which was part of the public mouse genome sequencing effort.)
mouse  research  biology  brain  neuroscience  data  work 
october 2009 by arthegall
Center for Roundabouts Research and Training
ZOMG, K-State actually has a "center" for "roundabout research"!!1!
traffic  round-abouts  research 
september 2009 by arthegall
Call for papers, including manuscript prep. guidelines. (<= 10 pg, 11pt font, clearly-marked appendix).
recomb  conference  call-for-papers  research 
september 2009 by arthegall
YouTube Research
The research paper on social network dynamics on YouTube. To re-read (and possibly work on) during my week off.
youtube  social-networks  research  meme 
september 2009 by arthegall
Michiel Smid's research publications on computational geometry and "geometric spanners."
geometry  computational-geometry  research  publications 
september 2009 by arthegall
Gene regulatory networks and conserved noncoding elements : Pharyngula
A veritable orgy of conflating "functional" with "conserved." Plus, lots of throwing around the word "junk." Is this what most people really think?
genomics  genetics  sequence-conservation  functional-genomics  regulatory-networks  research  to-blog 
august 2009 by arthegall
Plagiarism Detection
"Moss (for a Measure Of Software Similarity) is an automatic system for determining the similarity of C, C++, Java, Pascal, Ada, ML, Lisp, or Scheme programs. To date, the main application of Moss has been in detecting plagiarism in programming classes. Since its development in 1994, Moss has been very effective in this role. The algorithm behind moss is a significant improvement over other cheating detection algorithms (at least, over those known to us)." --- The Winnowing paper that it's based on is an interesting read, and looks familiar enough that I think I must have stumbled across it before.
programming  plagiarism  software  research  text  hashing 
july 2009 by arthegall
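For future reference, a minimal sketch of the winnowing step from the paper mentioned in the Moss note above: hash every k-gram, slide a window of w hashes, and keep the minimum of each window (rightmost on ties), recording each selected position once. The function names and the use of CRC32 as the k-gram hash are my own stand-ins; I don't know what hash or text normalization Moss itself uses.

```python
import zlib


def kgram_hashes(text, k):
    """Hash every length-k substring. CRC32 is only a stand-in hash."""
    return [zlib.crc32(text[i:i + k].encode("utf-8"))
            for i in range(len(text) - k + 1)]


def winnow(hashes, w):
    """Select fingerprints as (hash, position) pairs.

    For each window of w consecutive hashes, pick the minimum value,
    preferring the rightmost occurrence on ties, and record it only
    if that position was not already selected by the previous window.
    Returns an empty list when there are fewer than w hashes.
    """
    fingerprints = []
    last_pos = -1
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        # Index of the rightmost occurrence of the minimum in this window.
        offset = w - 1 - window[::-1].index(smallest)
        pos = start + offset
        if pos != last_pos:
            fingerprints.append((smallest, pos))
            last_pos = pos
    return fingerprints
```

Two documents would then be compared by intersecting their fingerprint hashes; that comparison step is left out here.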
"Why Andrew Sullivan is right about Megan McArdle, but not in the way he thinks." (The Inverse Square Blog)
The rest of the criticism may be (probably is) mostly on-target, but his criticism of McArdle's potted one-graf description of academic research vs. pharmaceutical work misses the mark pretty widely. Far from being "laughable," I'd say it's actually a pretty reasonable 100,000 ft. analogy, and I'd be truly surprised if Sue Lindquist or anyone else in her lab disagreed with it. What are laughable are the little blurbs made by his (Levenson's) students at the link -- for instance, "Scientists can cure Parkinson’s Disease in yeast – can they extend this to humans?" To the extent that the Lindquist lab will "cure Parkinson's" in humans, it will be in a theoretical sense. They won't be producing any drugs in our lifetimes. Also, "researchers pounding molecules into receptors" is a pretty poor description of the WIBR, and yeah, I can throw a stone from my office and hit them too. Grrrrr.
idiocy  whitehead  biology  research  science  drug-discovery  markets  health-care  via:cshalizi 
july 2009 by arthegall
Conley, "Capital for College: Parental Assets and Postsecondary Schooling," (2001)
[JSTOR: Sociology of Education, Vol. 74, No. 1 (Jan., 2001), pp. 59-72] -- effects of "parental income" and "parental wealth" on graduation rates for bachelor's degree.
education  wealth  income  college  graduation-rates  jstor  research  sociology 
july 2009 by arthegall
"Log Sum of Exponentials" (LingPipe Blog)
I was literally *just* writing up the portion of my thesis where I'm using this math... and I realize, I'm not checking for over/underflow correctly. Gotta go back and revise that, now. Thanks, LingPipe!!!
thesis  timely  logarithms  numerical-techniques  stability  floating-point-calculations  research  machinelearning 
june 2009 by arthegall
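The over/underflow issue from the LingPipe note above comes down to subtracting the maximum before exponentiating. A minimal sketch in plain Python, assuming a list of log-domain values; the function name is mine, not LingPipe's, and this is not their code:

```python
import math


def log_sum_exp(xs):
    """Compute log(sum(exp(x) for x in xs)) without overflow or underflow."""
    if not xs:
        return -math.inf  # log of an empty sum, i.e. log(0)
    m = max(xs)
    if math.isinf(m):
        # All values are -inf (result -inf), or some value is +inf (result +inf).
        return m
    # After shifting by m, the largest term is exp(0) = 1, so nothing overflows
    # and at least one term survives underflow.
    return m + math.log(sum(math.exp(x - m) for x in xs))
```

The naive version fails in both directions: math.exp(1000) raises OverflowError, and math.exp(-1000) underflows to 0.0, so taking the log of the sum blows up.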
"Reed-Tsochas on "Assembling and disassembling organizational (and other) networks"" (Complexity and Social Networks Blog)
A fascinating talk in a lot of ways -- look for (at the very least) the part, at about the 12 minute mark, where he talks about how (and why) this particular (garment industry) dataset was produced by a workers' union... [at minute 24, "there is clearly a power-law regime,"... blaaaaaahhhh....]
union  garment-industry  network  social-networks  social-science  presentation  video  research  science  data 
may 2009 by arthegall
"Magnatagatune - a new research data set for MIR" (Music Machinery)
"It contains:

* Human annotations collected by Edith Law’s TagATune game.
* The corresponding sound clips from, encoded in 16 kHz, 32kbps, mono mp3. (generously contributed by John Buckman, the founder of every MIR researcher’s favorite label Magnatune)
* A detailed analysis from The Echo Nest of the track’s structure and musical content, including rhythm, pitch and timbre.
* All the source code for generating the dataset distribution."
dataset  music  magnatagatune  research  machinelearning  luis-von-ahn 
april 2009 by arthegall
"Falling for the magic formula" (Earning My Turns)
"Once again, Science falls for a magic formula that purports to answer a contentious question about language: is a certain ancient symbolic system a writing system. They would not, I hope, fall for a similar hypothesis in biology." -- Heh.
language  science  research  biology  genomics  peer-review 
april 2009 by arthegall
