via:cshalizi   639

Choosing Your Workflow Applications
As a beginning graduate student in the social sciences, what sort of software should you use to
do your work? More importantly, what principles should guide your choices? This article offers
some answers. The short version is: write using a good text editor (there are several to choose
from); analyze quantitative data with R or Stata; minimize errors by storing your work in a simple
format (plain text is best) and documenting it properly. Keep your projects in a version control
system. Back everything up regularly and automatically. Don’t get bogged down by gadgets, utilities or other accoutrements: they are there to help you do your work, but often waste your time
by tempting you to tweak, update and generally futz with them. To help you get started, I provide
a short discussion of the Emacs Starter Kit for the Social Sciences, a drop-in set of useful defaults
designed to help you get started using Emacs (a powerful, free text-editor) for data analysis and
writing.
emacs  latex  sweave  R  software  statistics  social-science  academia  pdf  via:cshalizi 
2 days ago by dhartunian
Summary of rules from "Elements of Programming Style," 1974 | Beyond The Beyond | Wired.com
"Avoid temporary variables"??? (A lot of the rest is reasonable, but could be subsumed into a reasonable version of 'lint' for whatever language you're working in. Is there an R lint?)
lint  programming  tips  R  via:cshalizi 
3 days ago by arthegall
[1205.2265] Efficient Constrained Regret Minimization
"Online learning constitutes a mathematical framework to analyze sequential decision making problems in adversarial environments. The learner repeatedly chooses an action, the environment responds with an outcome, and then the learner receives a reward for the played action. The goal of the learner is to maximize his total reward. However, there are situations in which, in addition to maximizing the cumulative reward, there are some additional constraints/goals on the sequence of decisions that must be satisfied by the learner. For example, in online marketing, simultaneously maximizing the cumulative reward and the number of buyers to take advantage of word-of-mouth advertising for future marketing seems to be a more ambitious goal than only maximizing cumulative reward. As another example, learning from costly expert advice captures more realistic settings than the original setting in applications such as routing in networks with power constraint. In this paper we study an extension to the online learning where the learner aims to maximize the total reward given that some additional constraints need to be satisfied. We propose Lagrangian exponentially weighted average (LEWA) algorithm, an efficient algorithm to solve constrained online learning, which is a primal dual variant of the well known exponentially weighted average algorithm and inspired by the theory of Lagrangian method in constrained optimization. We establish the regret and the violation of the constraint bounds in full information and bandit feedback models."
online_learning  convex_optimization  via:cshalizi 
13 days ago by dvse
Game of Thrones, US Politics Edition
I disagree on the Daenerys analogy, though -- should be Cersei instead.
politics  satire  via:cshalizi 
28 days ago by mraginsky
Omniscient Gentlemen of The Atlantic | | Notebook | The Baffler
"What mystified Grove was the assertion, voiced by the economist Alan Blinder and others, “that as long as ‘knowledge work’ stays in the U.S., it doesn’t matter what happens to factory jobs.” This was not only inhumane, Grove declared; it was idiotic."
via:cshalizi  corporatism  publishing  social-engineering  journalism  they-say-the-best-astroturf-has-no-color-at-all 
4 weeks ago by Vaguery
Omniscient Gentlemen of The Atlantic | | Notebook | The Baffler
"The din of younger colleagues tapping keyboards is never soothing, but sitting in the press room of the Ideas Forum felt like a human rights violation. What could anyone write about something so tyrannically dull— other than an angry elegy for the massacre of meaning?" --- A little purple, but still pretty funny.
humor  journalism  the-atlantic  mo-tkacik  via:cshalizi  death-of-print 
4 weeks ago by arthegall
Why DH has no future. | The Stone and the Shell
Let me just say that any area of scholarship where, in 20-fucking-12, the idea of moving to open-access, online distribution of writing counts as some kind of radicalism deserves everything that's going to happen to it.
digital  humanities  academia  data-mining  text-analysis  digital-humanities  open-access  via:cshalizi 
5 weeks ago by tsuomela
How Can Herbert Spencer’s 1892 Revisions to his Social Statics Help Us Understand Conservative Opposition to the Individual Mandate? | Rortybomb
"But I think it’s clear what his real objection was: universal suffrage has the potential to advance socialistic causes, interfering with his laissez-faire project. From his autobiography: “Another extension of the franchise since made…will inevitably be followed by a still more rapid growth of socialistic legislation.” When he realized women’s equality could potentially interfere with laissez-faire economics, it was time for women’s equality to get cut from his overall theory of a better world. He would rather mutilate his intellectual project instead of allowing his enemies to continue to build their governance project."
Herbert-Spencer  laissez-faire  corporatism  capitalism  politics  conservatism  via:cshalizi 
5 weeks ago by Vaguery
PopTech : Duncan Watts - Social contagion: What do we really know?
"Again, we don’t know for sure, but we suspect that the analogy with biological disease is badly flawed. For example, whereas it is probably true that most people are susceptible to HIV, our susceptibility to any particular idea, product, musical artists, etc. varies tremendously, depending on our tastes, backgrounds, and circumstances. Unlike for influenza, to which you’re either exposed or not exposed, even the ideas you do encounter have to compete for attention with everything else that you’re exposed to. And unlike models of disease, which assume that disease spreads exclusively from person to person, information can be disseminated by the media and advertising as well as by word of mouth.

All of these differences, along with many others, could dramatically alter the prospects for social epidemics, as well as introduce other mechanisms entirely by which social change can come about, yet models of social influence reflect very little of this added complexity"
social-contagion  ideas  networks  influence  persuasion  society  epidemics  via:cshalizi  from delicious
9 weeks ago by tsuomela
PeteSearch: Keep the web weird
so, two comments:
(1) "computable web" != "canonical names." Common mistake.
(2) the "ambiguity" of reference that [some of] the semantic web people are working to eliminate here isn't the ambiguity he's describing ("My ... apartment has been described as being in the Lower Haight, Duboce Triangle, or Upper Castro, depending on who you ask..."), where one "thing" can have multiple names -- but the exact opposite, where one name refers to many (different) things, in different contexts. Imagine if "Duboce Triangle" was the name of a neighborhood *and* a newspaper about that neighborhood, *and* also the collective name for all the people living within 2 miles of Warden's apartment. A person might even use the same noun (or noun phrase) to refer to all three things, within the space of a single unit of text. It'd get pretty confusing. Using "canonical" names is an (admittedly somewhat simplistic) attempt to get around *that* problem, rather than the one he's describing; and saying that you "embrace" the ambiguity that's latent here is equivalent to saying that you don't care if the web is unusable to certain groups of people (e.g. scientists, researchers) who *are* concerned with avoiding this sort of ambiguity. "People searching for movie times" is just a test-case for "people searching for data about a 'gene'."

Also, I love someone who writes critically about Wolfram (.data, Alpha, and all the rest) as much as the next guy -- but saying, "the web is written for humans to read" is pretty laughable when it's not coming out of the mouth of a guy named "Firefox." The web is written for your web browser, and no number of SXSW presentations will change that.

</rant>
via:cshalizi  web  internet  semanticweb  tagging  rant  folksonomy 
10 weeks ago by arthegall
[1203.0697] Learning High-Dimensional Mixtures of Graphical Models
"We now propose a method for learning the mixture components given n i.i.d. samples y_n drawn from a graphical mixture model P(y). Our method proceeds in two stages. First, we estimate the graph G_∪ := ∪_{h=1}^{r} G_h, which is the union of the Markov graphs of the mixture. This is accomplished via a series of rank tests. Note that in the special case when G_h ≡ G_∪, this also gives the graph estimates of the component models. We then use the graph estimate \hat{G}_∪ to obtain the pairwise marginals of the respective mixture components via a spectral decomposition method. Finally, we use the Chow-Liu algorithm to obtain tree approximations {T_h}_h of the individual mixture components." -- To do: review how this works in the context of gene expression experiments for transcription factor regulatory relationships, which are (presumably) mixtures of a couple different underlying models or modes.
gene-expression  bioinformatics  research-article  arxiv  via:cshalizi  graphical-models  mixture-models  machinelearning 
10 weeks ago by arthegall