data-access 54
[1104.1605] Efficient Top-K Retrieval in Online Social Tagging Networks
december 2011 by Vaguery
"We consider in this paper top-k query answering in social tagging systems, also known as folksonomies. This problem requires a significant departure from existing, socially agnostic techniques. In a network-aware context, one can (and should) exploit the social links, which can indicate how users relate to the seeker and how much weight their tagging actions should have in the result build-up. We propose an algorithm that has the potential to scale to current applications. While the problem has already been considered in previous literature, this was done either under strong simplifying assumptions or under choices that cannot scale to even moderate-size real world applications. We first consider a key aspect of the problem, which is accessing the closest or most relevant users for a given seeker. We describe how this can be done on the fly (without any pre-computations) for several possible choices - arguably the most natural ones - of proximity computation in a user network. Based on this, our top-k algorithm is sound and complete, while addressing the scalability issues of the existing ones. Importantly, our technique is instance optimal in the case when the search relies exclusively on the social weight of tagging actions. To further reduce response times, we then consider directions for efficiency by approximation. Extensive experiments on real world data show that our techniques can drastically improve the response time, without sacrificing precision."
folksonomy
tagging
network-theory
search-algorithms
nudge-targets
data-access
Prelim Finding the holdouts: Who is Required to publicly archive data but still doesn’t? « Research Remix
june 2011 by Vaguery
"So it seems the specific words in a journal policy that requires data archiving doesn’t matter much, though policies that include a general statement about data sharing and request the sharing of other datatypes have higher rates of data archiving. The highest-impact journals that require data archiving have slightly higher archiving rates than those with impact factors between 4 and 7. Mentioning exceptions in a journal policy may be associated with increased rates of archiving. Core clinical journals tend toward high rates of data archiving (likely overlap with the high impact factor journals).
Disheartening to see again that studies about cancer are least likely to publicly archive data, even when required. Some disciplinary trends: studies on bacteria more likely to follow journal mandates. Perhaps related: studies that archived other types of data were more likely to also archive gene expression microarray data."
open-access
data-access
raw-data-now
academic-culture
publishing
Why Open Source is the New Software Policy in San Francisco
january 2010 by Vaguery
"Since the launch of DataSF last summer, the City’s clearinghouse of government datasets, we have seen our tech community create new services and products never dreamed of within the walls of government. And now we are giving people access to technology systems like our 311 call center through open source, so they can decide how and when they interact with government.
We face many challenges today, none more urgent than the economic crisis, but with it comes an opportunity to seek new ways of governing. In San Francisco, like other cities, we are using this opportunity to engage our greatest resource, the public, to build a government that works better for all of us."
openness
transparency
government2.0
government
data-access
innovation
economics
city-planning
Where is the money going? « Jon Udell
november 2009 by Vaguery
"Recovery.gov can’t bootstrap itself out of this circular trap. But if we use the tags that it has helpfully provided, we might be able to find out a lot more about where the money is going."
government2.0
transparency
data-access
public-policy
funding
democracy
information
government
FOIA
financial-crisis
Rewriting Analyst History
july 2009 by Vaguery
"We document widespread changes to the historical I/B/E/S analyst stock recommendations database. Across seven I/B/E/S downloads, obtained between 2000 and 2007, we find that between 6,580 (1.6%) and 97,582 (21.7%) of matched observations are different from one download to the next. The changes include alterations of recommendations, additions and deletions of records, and removal of analyst names. These changes are nonrandom, clustering by analyst reputation, broker size and status, and recommendation boldness, and affect trading signal classifications and back-tests of three stylized facts: profitability of trading signals, profitability of consensus recommendation changes, and persistence in individual analyst stock-picking ability."
data-access
learning-from-data
analysts
stocks
fudging
what-gets-measured-gets-fudged
LINQ IQueryable Toolkit - Home
july 2009 by sergiopereira
Source code to help someone write their own IQueryable/LINQ provider
data-access
linq
programming
opensource
A2DDA Blocks Asterisk Parking Data | VoIP Tech Chat
march 2009 by Vaguery
“Hi all. Over the last day or so I have talked about your project with a few DDA members and what arose from these conversations was a shared concern that because the project was not an initiative created by/run by the DDA there are no controls in place for this at present. For instance, there is no DDA policy about how to allow /or even if it should allow an outside group to use the DDA’s parking data for a private enterprise. There is a concern about how unsecure/secure the DDA website is made when sharing this data. And finally, a concern that if the project had value to parking patrons, that the DDA itself should consider providing this service as an extension of what it is already doing on-line.”
community
activism
data-access
openness
government
government2.0
local
Ann-Arbor
disintermediation
watershed