gnat + data   160

Bill Gates is naive, data is not objective « mathbabe
We see that cars are safer for men than women because the crash-test dummies are men.
We see that cars are safer for thin people because the crash-test dummies are thin.
We see drugs are safer and more effective for white people because blacks are underrep
bad  data 
february 2013 by gnat
Big Data: A Workshop Report
This report summarizes that first workshop which explored the phenomenon known as big data.
big  data 
november 2012 by gnat
A Programmer's Guide to Data Mining | The Ancient Art of the Numerati
A guide to practical data mining, collective intelligence, and building recommendation systems
book  data  machinelearning  python  programming  bigdata 
november 2012 by gnat
The Tamiflu story: Why we need access to all data from clinical trials | Open Knowledge Foundation Blog
One consequence of our access to this bonanza of regulatory material has been a comparison between the details and broad message of the few published trials and their regulatory much more detailed reports. Apart from discrepancies in reporting harms and some less-than-detailed aspects of study design, we think the mode of action of the drug is not what the manufacturer says and (like FDA) could not find any evidence supporting a number of effects of the drug (including those for which it was stockpiled).

But we do not know for sure because we do not have all the data. The practical result of all this is our refusal to consider published trials (either on their own or as part of reviews) for inclusion in our reviews. There are signs that this distrust of the published word is spreading.
science  open  data 
november 2012 by gnat
Sourcemap: where things come from
crowdsourced directory of product supply chains and carbon footprints
business  data  maps  visualization 
may 2012 by gnat
Crossfilter is a JavaScript library for exploring large multivariate datasets in the browser. Crossfilter supports extremely fast (<30ms) interaction with coordinated views, even with datasets containing a million or more records; we built it to power analytics for Square Register, allowing merchants to slice and dice their payment history fluidly.
javascript  opensource  big  data 
may 2012 by gnat
Kiwi students making most of Internet access
Percentage of NZ students with computer access at home compared to the OECD average.
* 2000
NZ: Computer 79%, Internet 62%
OECD: Computer 72%, Internet 45%
* 2009
NZ: Computer 96%, Internet 92%
OECD: Computer 94%, Internet 89%
What students use the computer for at home (percentage):
* 79 - Browsing for fun at least once a week.
* 71 - Using email.
* 68 - Doing homework at home more than once a week.
* 63 - Chatting online.
* 60 - Downloading music, films, games or software.
* 52 - Browsing internet for schoolwork frequently.
education  computers  oecd  data 
july 2011 by gnat
use cases for open bibliographic data
stories for libraries about why you'd open your data up
library  open  data 
march 2011 by gnat
Open Knowledge Foundation Blog » Blog Archive » Art Open Data
There are many potential sources of such data:

Institutions produce collection catalogues, show listings, attendance figures, and organisational information.
Writings by artists, critics, theorists and historians can be processed to provide institutional and market data, to discover factual information, or for affective and aesthetic analysis.
Records of art auction prices have been kept for hundreds of years, with older records freely available.
Biographical information about artists can be extracted from digitised historical sources and from modern sources such as Wikipedia.
Institutional, historical and market data about artworks can build up a picture of its production, reception, and provenance.
Reproductions of artworks can be analysed algorithmically or socially.
art  open  data 
february 2011 by gnat
Using Analytics to Intervene with Underperforming College Students (Innovative Practice) | EDUCAUSE
data mining to find "underperforming" students. "However, there is no clear consensus on how to intervene with current students in a way they will accept and not associate with academic "profiling.""
education  data  mining 
february 2011 by gnat
What lies ahead: Data - O'Reilly Radar
The old prediction engine was built on business intelligence; analytics and reports that people study. The new prediction engine is reflex. It's autonomic. The new engine is at work when Google is running a real-time auction, figuring out which ad is going to appear and which one is going to give them the most money. The engine is present when someone on Wall Street is building real-time bid/ask algorithms to identify who they're going to sell shares to. These examples are built on predictive analytics that are managed automatically by a machine, not by a person studying a report.
data  from instapaper
december 2010 by gnat
Women, Men and the New Economics of Marriage | Pew Social & Demographic Trends
Thus, Americans who already have the largest incomes and who have had the largest gains in earnings since 1970 — college graduates — have fortified their financial advantage over less educated Americans because of their greater tendency to be married.
demographics  data  usa  from instapaper
december 2010 by gnat
In Pursuit of a Mind Map, Slice by Slice
attempting to map the mind, model it in computers
science  data  from instapaper
december 2010 by gnat
The (Australian) Govt 2.0 Taskforce – introduction and initial thoughts
"Also, we’ve quickly realised that you are often working with something akin to 6 different interlocking jigsaw puzzles, each missing 20% of the most crucial bits"
open  data  government 
june 2009 by gnat
's Data Transparency Called "Significant Failure" by Watchdog Group
One argument against raw data came out of the woodwork during the successful push to get the US Senate to offer mashup-friendly XML (extensible markup language) feeds for Senate voting history. "The secretary of the Senate has cited a general standing policy," John Wonderlich, policy director at Sunlight, told Politico's Victoria McGrane, "that they're not supposed to present votes in a comparative format, that senators have the right to present their votes however they want to."
open  data  government  transparency 
june 2009 by gnat
No Raw Data on Significant Failure — Sunlight Foundation Blog
“If the Recovery Act is to fulfill President Obama’s promise about taxpayers being able to go online and see how every dime is spent, then we need sub-recipients’ and sub-sub recipients’ data online, too,”
transparency  government  open  data 
june 2009 by gnat
Home |
discover who gets what from the Common Fisheries Policy
fish  economics  europe  open  data 
june 2009 by gnat
» Government Information – does it want to be free?
"ownership" of government information is a thorny question unless you have an ideology that makes it easy ("of course it's not owned by the government, it's taxpayer's information -- we paid for it!" vs "of course it's owned by the government, how else can they justify selling it?").
government  open  data  nz 
june 2009 by gnat
The Nike Experiment: How the Shoe Giant Unleashed the Power of Personal Metrics
And not only can we collect that data, we can analyze it as well, looking for patterns, information that might help us change both the quality and the length of our lives. We can live longer and better by applying, on a personal scale, the same quantitative mindset that powers Google and medical research. Call it Living by Numbers—the ability to gather and analyze data about yourself, setting up a feedback loop that we can use to upgrade our lives, from better health to better habits to better performance.
data  collective  intelligence  life  ipod 
june 2009 by gnat
The Four Hundred--Jeff Jonas Explores the Nature of Data in COMMON Keynote
Aha. Jonas is talking about breaking down the difference between queries, stored queries, and triggers. As you add data, old queries are still waiting for matches, and when matches are found the user's notified even if it's weeks after the original query was entered. Treating database as a jigsaw, not as bucket.
databases  mining  data 
june 2009 by gnat
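A rough sketch of the "queries that keep waiting" idea from the Jonas note above, in Python. This is not Jonas's system: the `StandingQueryStore` class, its method names, and the sample records are all made up for illustration. A query is answered against existing data and then kept, so records added later can still trigger a notification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Predicate = Callable[[Record], bool]


@dataclass
class StandingQueryStore:
    """Hypothetical store where every query is also kept as a trigger."""
    records: List[Record] = field(default_factory=list)
    queries: List[Tuple[str, Predicate]] = field(default_factory=list)
    notifications: List[Tuple[str, Record]] = field(default_factory=list)

    def ask(self, owner: str, predicate: Predicate) -> List[Record]:
        """Answer from the data held now, and keep the query waiting."""
        self.queries.append((owner, predicate))
        return [r for r in self.records if predicate(r)]

    def add(self, record: Record) -> None:
        """Store a record and notify the owner of each stored query it matches."""
        self.records.append(record)
        for owner, predicate in self.queries:
            if predicate(record):
                self.notifications.append((owner, record))


store = StandingQueryStore()
# The query finds nothing today, but stays registered.
initial = store.ask("analyst", lambda r: r.get("phone") == "555-0100")
# Data arriving later (possibly weeks later) is checked against old queries.
store.add({"name": "A. Example", "phone": "555-0100"})
store.add({"name": "B. Example", "phone": "555-0199"})
```

After the two `add` calls, `store.notifications` holds one entry for "analyst": the new record found the old query, rather than the query having to be re-run.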
Yahoo! Developer Network Blog
The first part is a new batch of YQL tables providing data on the U.S. government, earthquake data, and the non-profit micro-lender Kiva. The second part is an incredibly easy way to render YQL queries on websites. After all, what good is data that no one can see?
government  open  data  yahoo  api 
june 2009 by gnat
Lessons Learned: Datablindness
"Have data cause interrupts. We have to invent process mechanisms that force decision makers to regularly confront the results of their decisions. This has to happen with regularity, and without too much time elapsing, or else we might forget what decisions we made."
data  business 
june 2009 by gnat
API Value Creation, Not Monetization « Laura Merling’s Blog
On the side of the unexpected but interesting outcomes, Kevin said they have seen a flurry of internally developed business applications. In the past many valuable, internal-facing projects were turned down because the programs had to meet strict top line to bottom line ratios. With the availability to data and services, many teams within the company now have access to things they didn’t in the past, and project costs have been minimized. Throughout the company, consumers of the API have been able to launch successful projects that have created additional revenue and have reduced the overall development costs for new projects.
api  data  business  strategy 
june 2009 by gnat
Performance comparison: key/value stores for language model counts - Brendan O'Connor's Blog
I’m doing word and bigram counts on a corpus of tweets. I want to store and rapidly retrieve them later for language model purposes. So there’s a big table of counts that get incremented many times. The easiest way to get something running is to use an open-source key/value store; but which? There’s recently been some development in this area so I thought it would be good to revisit and evaluate some options.
data  database  distributed  scaling 
april 2009 by gnat
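For a sense of the workload being benchmarked in the O'Connor post above, here is a minimal Python sketch of word and bigram counting. It assumes whitespace tokenisation and uses an in-memory `Counter` as a stand-in for the key/value store; the `count_ngrams` name and the sample strings are illustrative and not taken from the post.

```python
from collections import Counter
from typing import Iterable


def count_ngrams(tweets: Iterable[str]) -> Counter:
    """Count words and bigrams, keyed by plain strings as a key/value store would be."""
    counts: Counter = Counter()
    for tweet in tweets:
        tokens = tweet.lower().split()
        # Unigram keys are the words themselves.
        counts.update(tokens)
        # Bigram keys are adjacent word pairs joined by a space.
        counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return counts


counts = count_ngrams(["big data is big", "data is not objective"])
```

Every token causes an increment on one or two keys, which is why increment throughput matters when the in-memory `Counter` is swapped for a real store.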
Circos - visualize genomes and genomic data
Circos is designed for visualizing genomic data such as alignments, conservation, and generalized 2D data, such as line, scatter, heatmap and histogram plots. Circos is very flexible — you can use it to visualize any kind of data, not just genomics. Circos has been used to visualize customer flow in the auto industry, volume of courier shipments, database schemas, and presidential debates.
perl  open  source  data  viz 
april 2009 by gnat
» Microsoft offers data mining tools in the cloud
Microsoft offers some data mining functionality of SQL Server 2008 with no local analysis services server in the cloud. The service is offered in two flavors: a cloud service and as a plug-in for Excel.
data  mining  microsoft  cloud  tools 
april 2009 by gnat
OpenSecrets | Goes OpenData - Capital Eye
Center's Researchers Clean Up, Categorize Government Data
government  data 
april 2009 by gnat
SDMX - Statistical Data and Metadata Exchange
SDMX is an initiative to foster standards for the exchange of statistical information.
data  open  statistics  metadata  standards 
march 2009 by gnat
All we want are the facts, ma'am
finally the bullet gets put in Anderson's blather
data  science 
february 2009 by gnat
Visualization Trends For The Noosphere - Articles - MIX Online
"The MIX Online team had to download hundreds of megabytes of data from the Centers for Disease Control, in SAS transport format, and then write an R script to parse it. Mixing the paint shouldn't be so hard"
data  viz 
february 2009 by gnat
Data Mining with R: learning by case studies
abandoned alas in 2003, after 1.5 case studies.
data  programming  mining  r  stats 
january 2009 by gnat
Following the IT crowd - Technology - NZ Herald News
I don't remember blaming the US govt for the lack of EveryBlock in NZ, more that the US geospatial industry is big because the US govt was free with its data in a way that NZ hasn't yet been. Otherwise not a bad summary of my ten minute talk with Anthony on the phone.
nz  data  mobile  me  technology 
january 2009 by gnat
InstantAtlas Data Server
Public Health Intelligence
nz  data  open 
december 2008 by gnat
Why Government should support open and free geospatial data at Gav’s Blog
I don't buy the argument that duplication is bad. I buy the argument that free shitty data isn't as good as free quality data, and the market in NZ is not big enough to make this a business worth protecting, as the repeated near-death of terralink shows. At this point I agree that LINZ ought to have a clearly-defined public-facing mandate for quality data.
geo  nz  open  data 
october 2008 by gnat
Tele Atlas Customers Get Tomtom Data; Let the Crowdsourcing Begin - O'Reilly Radar
Consumer electronics businesses don't have to be about the moment of sale. This ongoing relationship is going to be very valuable.
geo  business  data 
october 2008 by gnat
Government 2.0: Architecting for collaboration - New Zealand E-government Programme
Tara Hunt's translation of O'Reilly's Web 2.0 principles into the government sphere. "shared control with accountability on both sides" is a very interesting phrase.
nz  data  government  web2.0 
september 2008 by gnat
Big data: Welcome to the petacentre : Nature News
fascinating article about big data centers for science: CERN, Sanger Institute, XS4ALL.
data  parallel  science 
september 2008 by gnat
California does it again «
Compare neighbouring schools, hospitals, etc. Absolutely excellent idea -- how do we get the data within NZ to rate and rank DHBs, hospitals, GPs, ...?
local  data  usa  medicine  education 
august 2008 by gnat
Wired Test 2007: Infoporn: The Cost of Living on the Bleeding Edge of Gadgetry
Delicious infopornography. As stamenite said, it's a nifty way of showing several variables changing over time.
data  viz  graphs 
august 2008 by gnat
Avi Ma'ayan's blog - Nature Network
Instruments create massive data sets which require new tools which give rise to new theories. Different disciplines at different stages. A fantastic manifesto for scifoo.
scifoo  science  data 
august 2008 by gnat
start [WaveScope ]
"WaveScope is a system for developing distributed, high-rate applications that need to process streams of data from various sources (e.g., sensors) using a combination of signal processing and database (event stream processing) operations. The execution e
parallel  processing  data 
august 2008 by gnat
Official Google Research Blog: All Our N-gram are Belong to You
wonder whether they'll make an API to this available within G.A.E.
data  language  mining 
july 2008 by gnat