rvenkat + algorithms   87

[1506.08527] Symbolic Derivation of Mean-Field PDEs from Lattice-Based Models
Transportation processes, which play a prominent role in the life and social sciences, are typically described by discrete models on lattices. For studying their dynamics a continuous formulation of the problem via partial differential equations (PDE) is employed. In this paper we propose a symbolic computation approach to derive mean-field PDEs from a lattice-based model. We start with the microscopic equations, which state the probability to find a particle at a given lattice site. Then the PDEs are formally derived by Taylor expansions of the probability densities and by passing to an appropriate limit as the time steps and the distances between lattice sites tend to zero. We present an implementation in a computer algebra system that performs this transition for a general class of models. In order to rewrite the mean-field PDEs in a conservative formulation, we adapt and implement symbolic integration methods that can handle unspecified functions in several variables. To illustrate our approach, we consider an application in crowd motion analysis where the dynamics of bidirectional flows are studied. However, the presented approach can be applied to various transportation processes of multiple species with variable size in any dimension, for example, to confirm several proposed mean-field models for cell motility.
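
-- A minimal worked example of the Taylor-expansion step described in the abstract, for the simplest possible case: a single particle doing a symmetric nearest-neighbour random walk in one dimension. The lattice spacing h, time step \Delta t, and diffusion constant D are notation assumed here, not taken from the paper, whose models (multiple species, size exclusion) are more general.

```latex
% Microscopic (master) equation: the particle arrives from the left or right neighbour
\begin{align*}
P(x, t + \Delta t) &= \tfrac{1}{2} P(x - h, t) + \tfrac{1}{2} P(x + h, t) \\
% Taylor-expand the left side in \Delta t and the right side in h
P + \Delta t \, \partial_t P + O(\Delta t^2) &= P + \tfrac{h^2}{2} \, \partial_{xx} P + O(h^4) \\
% Divide by \Delta t and let h, \Delta t \to 0 with h^2 / \Delta t held fixed
\partial_t P &= D \, \partial_{xx} P, \qquad D = \lim_{h, \Delta t \to 0} \frac{h^2}{2 \, \Delta t}
\end{align*}
```

The odd-order terms in h cancel by symmetry, which is why the diffusive scaling h^2 / \Delta t (rather than h / \Delta t) is the limit that yields a non-trivial PDE in this case.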
algorithms  interating_particle_system  macro_from_micro  active_matter  for_friends 
4 weeks ago by rvenkat
Help Wanted
The hiring process is a critical gateway to economic opportunity, determining who can access consistent work to support themselves and their families. Employers have long used digital technology to manage their hiring decisions, and now many are turning to new predictive hiring tools to inform each step of their hiring process.

This report explores how predictive tools affect equity throughout the entire hiring process. We explore popular tools that many employers currently use, and offer recommendations for further scrutiny and reflection. We conclude that without active measures to mitigate them, bias will arise in predictive hiring tools by default.
algorithms  bias  jobs  labor  discrimination  machine_learning  via:eszter  for_friends 
6 weeks ago by rvenkat
Input–output maps are strongly biased towards simple outputs | Nature Communications
Many systems in nature can be described using discrete input–output maps. Without knowing details about a map, there may seem to be no a priori reason to expect that a randomly chosen input would be more likely to generate one output over another. Here, by extending fundamental results from algorithmic information theory, we show instead that for many real-world maps, the a priori probability P(x) that randomly sampled inputs generate a particular output x decays exponentially with the approximate Kolmogorov complexity K̃(x) of that output. These input–output maps are biased towards simplicity. We derive an upper bound P(x) ≲ 2^(−a·K̃(x) − b), which is tight for most inputs. The constants a and b, as well as many properties of P(x), can be predicted with minimal knowledge of the map. We explore this strong bias towards simple outputs in systems ranging from the folding of RNA secondary structures to systems of coupled ordinary differential equations to a stochastic financial trading model.

--interesting but cannot understand the buzz around this paper.
complexity  information_theory  algorithms  macro_from_micro  ? 
10 weeks ago by rvenkat
Fairness and Abstraction in Sociotechnical Systems by Andrew D. Selbst, danah boyd, Sorelle Friedler, Suresh Venkatasubramanian, Janet Vertesi :: SSRN
A key goal of the FAT* community is to develop machine-learning based systems that, once introduced into a social context, can achieve social and legal outcomes such as fairness, justice, and due process. Bedrock concepts in computer science—such as abstraction and modular design—are used to define notions of fairness and discrimination, to produce fairness-aware learning algorithms, and to intervene at different stages of a decision-making pipeline to produce "fair" outcomes. In this paper, however, we contend that these concepts render technical interventions ineffective, inaccurate, and sometimes dangerously misguided when they enter the societal context that surrounds decision-making systems. We outline this mismatch with five "traps" that fair-ML work can fall into even as it attempts to be more context-aware in comparison to traditional data science. We draw on studies of sociotechnical systems in Science and Technology Studies to explain why such traps occur and how to avoid them. Finally, we suggest ways in which technical designers can mitigate the traps through a refocusing of design in terms of process rather than solutions, and by drawing abstraction boundaries to include social actors rather than purely technical ones.
algorithms  machine_learning  ethics  technology  dana.boyd 
11 weeks ago by rvenkat
[1808.00023] The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
The nascent field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Over the last several years, three formal definitions of fairness have gained prominence: (1) anti-classification, meaning that protected attributes (like race, gender, and their proxies) are not explicitly used to make decisions; (2) classification parity, meaning that common measures of predictive performance (e.g., false positive and false negative rates) are equal across groups defined by the protected attributes; and (3) calibration, meaning that conditional on risk estimates, outcomes are independent of protected attributes. Here we show that all three of these fairness definitions suffer from significant statistical limitations. Requiring anti-classification or classification parity can, perversely, harm the very groups they were designed to protect; and calibration, though generally desirable, provides little guarantee that decisions are equitable. In contrast to these formal fairness criteria, we argue that it is often preferable to treat similarly risky people similarly, based on the most statistically accurate estimates of risk that one can produce. Such a strategy, while not universally applicable, often aligns well with policy objectives; notably, this strategy will typically violate both anti-classification and classification parity. In practice, it requires significant effort to construct suitable risk estimates. One must carefully define and measure the targets of prediction to avoid retrenching biases in the data. But, importantly, one cannot generally address these difficulties by requiring that algorithms satisfy popular mathematical formalizations of fairness. By highlighting these challenges in the foundation of fair machine learning, we hope to help researchers and practitioners productively advance the area.
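
-- A toy sketch of definitions (2) and (3) from the abstract, to make concrete how a risk score can be calibrated within each group and still fail classification parity when base rates differ. All data, group labels, and function names below are hypothetical and chosen for illustration; nothing here comes from the paper.

```python
def false_positive_rate(y_true, y_pred):
    """Share of true negatives (y_true == 0) that were flagged positive."""
    flagged = [p for t, p in zip(y_true, y_pred) if t == 0]
    return sum(flagged) / len(flagged)


def fpr_by_group(y_true, y_pred, groups):
    """Classification parity check: false positive rate per group."""
    rates = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates[g] = false_positive_rate(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return rates


def outcome_rate_by_score(y_true, scores, groups):
    """Calibration check: observed outcome rate for each (score, group) cell."""
    cells = {}
    for t, s, g in zip(y_true, scores, groups):
        cells.setdefault((s, g), []).append(t)
    return {cell: sum(v) / len(v) for cell, v in cells.items()}


# Hypothetical data: group "a" has 10 people, group "b" has 15.
# Within each group, 20% of people scored 0.2 and 80% of people
# scored 0.8 have the outcome, so the score is calibrated by construction.
scores = [0.2] * 5 + [0.8] * 5 + [0.2] * 10 + [0.8] * 5
groups = ["a"] * 10 + ["b"] * 15
y_true = [0, 0, 0, 0, 1] + [1, 1, 1, 1, 0] + [0] * 8 + [1] * 2 + [1, 1, 1, 1, 0]

# One shared decision threshold for everyone.
y_pred = [int(s >= 0.5) for s in scores]

calibration = outcome_rate_by_score(y_true, scores, groups)
fpr = fpr_by_group(y_true, y_pred, groups)
print(calibration)
print(fpr)
```

In this toy data the outcome rate at each score is the same in both groups, yet the false positive rates differ because group "b" has more low-score negatives. That is only one small instance of the tension the abstract describes, not a reproduction of the paper's analysis.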
machine_learning  algorithms  bias  ethics  privacy  review  for_friends 
august 2018 by rvenkat
Universal Method to Sort Complex Information Found | Quanta Magazine
-- ignore the typically hyperbolic title! ... contains information on soon to be published results relating graph embedding in normed spaces to nearest neighbor queries in databases.
algorithms  graph_theory  metric_spaces  computational_complexity  quanta_mag 
august 2018 by rvenkat
Statistical and Computational Guarantees for the Baum-Welch Algorithm
The Hidden Markov Model (HMM) is one of the mainstays of statistical modeling of discrete time series, with applications including speech recognition, computational biology, computer vision and econometrics. Estimating an HMM from its observation process is often addressed via the Baum-Welch algorithm, which is known to be susceptible to local optima. In this paper, we first give a general characterization of the basin of attraction associated with any global optimum of the population likelihood. By exploiting this characterization, we provide non-asymptotic finite sample guarantees on the Baum-Welch updates and show geometric convergence to a small ball of radius on the order of the minimax rate around a global optimum. As a concrete example, we prove a linear rate of convergence for a hidden Markov mixture of two isotropic Gaussians given a suitable mean separation and an initialization within a ball of large radius around (one of) the true parameters. To our knowledge, these are the first rigorous local convergence guarantees to global optima for the Baum-Welch algorithm in a setting where the likelihood function is nonconvex. We complement our theoretical results with thorough numerical simulations studying the convergence of the Baum-Welch algorithm and illustrating the accuracy of our predictions.
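
-- For reference, a minimal sketch of one Baum-Welch (EM) update for an HMM with discrete emissions, using unscaled forward-backward recursions. This is the textbook form of the algorithm, not the paper's setting (which treats Gaussian emissions and population-level analysis), and the starting parameters and observation sequence are made up. Without scaling it is only suitable for short sequences.

```python
import math


def forward(obs, pi, A, B):
    """alpha[t][i] = P(o_1..o_t, state_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
            for j in range(n)
        ])
    return alpha


def backward(obs, A, B):
    """beta[t][i] = P(o_{t+1}..o_T | state_t = i)."""
    n, T = len(A), len(obs)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n)
            )
    return beta


def log_likelihood(obs, pi, A, B):
    return math.log(sum(forward(obs, pi, A, B)[-1]))


def baum_welch_step(obs, pi, A, B):
    """One EM update; returns re-estimated (pi, A, B)."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    lik = sum(alpha[-1])
    # gamma[t][i] = P(state_t = i | obs)
    gamma = [[alpha[t][i] * beta[t][i] / lik for i in range(n)] for t in range(T)]
    # xi[i][j] = sum over t of P(state_t = i, state_{t+1} = j | obs)
    xi = [[
        sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
            for t in range(T - 1)) / lik
        for j in range(n)] for i in range(n)]
    new_pi = gamma[0][:]
    new_A = [[xi[i][j] / sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k)
              / sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_pi, new_A, new_B


# Hypothetical two-state, two-symbol example.
obs = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1]
pi, A, B = [0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.8, 0.2], [0.3, 0.7]]
for step in range(10):
    pi, A, B = baum_welch_step(obs, pi, A, B)
    print(step, log_likelihood(obs, pi, A, B))
```

Each update cannot decrease the likelihood, but where it ends up depends on the starting point. That dependence on initialization is the local-optima issue the abstract refers to.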
algorithms  statistics  machine_learning  martin.wainwright 
july 2018 by rvenkat
Handbook of Graphical Models - CRC Press Book
A graphical model is a statistical model that is represented by a graph. The factorization properties underlying graphical models facilitate tractable computation with multivariate distributions, making the models a valuable tool with a plethora of applications. Furthermore, directed graphical models allow intuitive causal interpretations and have become a cornerstone for causal inference.

While there exist a number of excellent books on graphical models, the field has grown so much that individual authors can hardly cover its entire scope. Moreover, the field is interdisciplinary by nature. Through chapters by leading researchers from different areas, this handbook provides a broad and accessible overview of the state of the art.

The handbook is targeted at a wide audience, including graduate students, applied researchers, and experts in graphical models.

graphical_models  book  statistics  algorithms 
july 2018 by rvenkat
On the history of the transportation and maximum flow problems
We review two papers that are of historical interest for combinatorial optimization: an article of A.N. Tolstoi from 1930, in which the transportation problem is studied, and a negative cycle criterion is developed and applied to solve a (for that time) large-scale (10×68) transportation problem to optimality; and an, until recently secret, RAND report of T.E. Harris and F.S. Ross from 1955, that Ford and Fulkerson mention as motivation to study the maximum flow problem. The papers have in common that they both apply their methods to the Soviet railway network.
networks  combinatorics  optimization  algorithms 
may 2018 by rvenkat
Algorithmic Fairness
Concerns that algorithms may discriminate against certain groups have led to numerous efforts to 'blind' the algorithm to race. We argue that this intuitive perspective is misleading and may do harm. Our primary result is exceedingly simple, yet often overlooked. A preference for fairness should not change the choice of estimator. Equity preferences can change how the estimated prediction function is used (e.g., different threshold for different groups) but the function itself should not change. We show in an empirical example for college admissions that the inclusion of variables such as race can increase both equity and efficiency.
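
-- A small sketch of the abstract's central distinction: the prediction function stays the same for everyone, and an equity preference enters only through how its output is used (here, group-specific cutoffs). The scores, group labels, and threshold values are hypothetical and not drawn from the paper's college admissions example.

```python
def admit(scores, groups, thresholds):
    """Apply per-group cutoffs to scores produced by one shared predictor."""
    return [s >= thresholds[g] for s, g in zip(scores, groups)]


# Hypothetical predicted scores from a single estimator trained on all data.
scores = [0.82, 0.55, 0.67, 0.40, 0.74, 0.61]
groups = ["a", "b", "a", "b", "a", "b"]

# Same scores, two different decision rules.
uniform = admit(scores, groups, {"a": 0.65, "b": 0.65})
adjusted = admit(scores, groups, {"a": 0.70, "b": 0.50})

print(uniform)
print(adjusted)
```

The scores never change between the two rules; only the thresholds do, which is the separation between estimation and use that the authors argue for.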
econometrics  algorithms  ethics  machine_learning  sendhil.mullainathan 
may 2018 by rvenkat
[1803.04383] Delayed Impact of Fair Machine Learning
Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time. Conventional wisdom suggests that fairness criteria promote the long-term well-being of those groups they aim to protect.
We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We demonstrate that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not.
We completely characterize the delayed impact of three standard criteria, contrasting the regimes in which these exhibit qualitatively different behavior. In addition, we find that a natural form of measurement error broadens the regime in which fairness criteria perform favorably.
Our results highlight the importance of measurement and temporal modeling in the evaluation of fairness criteria, suggesting a range of new challenges and trade-offs.

-- need to clean up my tag sets to better reflect my understanding and points of view :(
via:randw  algorithms  machine_learning  ethics 
march 2018 by rvenkat
[1711.01134] Accountability of AI Under the Law: The Role of Explanation
The ubiquity of systems using artificial intelligence or "AI" has brought increasing attention to how those systems should be regulated. The choice of how to regulate AI systems will require care. AI systems have the potential to synthesize large amounts of data, allowing for greater levels of personalization and precision than ever before; applications range from clinical decision support to autonomous driving and predictive policing. That said, there exist legitimate concerns about the intentional and unintentional negative consequences of AI systems. There are many ways to hold AI systems accountable. In this work, we focus on one: explanation. Questions about a legal right to explanation from AI systems were recently debated in the context of the EU General Data Protection Regulation, and thus thinking carefully about when and how explanation from AI systems might improve accountability is timely. In this work, we review contexts in which explanation is currently required under the law, and then list the technical considerations that must be addressed if we desire AI systems that could provide the kinds of explanations that are currently required of humans.
algorithms  law  ethics  explanation  regulation  governance  artificial_intelligence  machine_learning  via:? 
march 2018 by rvenkat
YouTube, the Great Radicalizer - The New York Times
--With all necessary respect to Zeynep, implicit in her argument is a lack of agency among a society's denizens. I've noticed that she gives too much credit to the idea of a fair-minded, thoughtful, reasonable, ethical good citizen, and the idea that most of the blame lies with private and public institutions. I remain very suspicious of this idea and it continues to irk me a little bit.

Propensity to radicalize reinforced by recommendation algorithms is the problem. I have a feeling any non-zero propensity to polarize would be sufficient to radicalize certain individuals, no matter how careful we are at designing such algorithms. The question is whether one can demonstrate that such algorithms polarize even the _normal_ denizens. Yes, unregulated artificial intelligence gone wild is a problem but so is extant natural stupidity.
polarization  radicalization  algorithms  ethics  networked_public_sphere  platform_studies  GAFA  zeynep.tufekci 
march 2018 by rvenkat
Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability - Mike Ananny, Kate Crawford, 2018
Models for understanding and holding systems accountable have long rested upon ideals and logics of transparency. Being able to see a system is sometimes equated with being able to know how it works and govern it—a pattern that recurs in recent work about transparency and computational systems. But can “black boxes” ever be opened, and if so, would that ever be sufficient? In this article, we critically interrogate the ideal of transparency, trace some of its roots in scientific and sociotechnical epistemological cultures, and present 10 limitations to its application. We specifically focus on the inadequacy of transparency for understanding and governing algorithmic systems and sketch an alternative typology of algorithmic accountability grounded in constructive engagements with the limitations of transparency ideals.

-- Some of Sunstein's works discuss this albeit with a focus on governance and regulation by _the administrative state_
algorithms  ethics  bureaucracy  platform_studies  agnotology  kate.crawford 
march 2018 by rvenkat
Isomorphism through algorithms: Institutional dependencies in the case of Facebook - Robyn Caplan, danah boyd, 2018
Algorithms and data-driven technologies are increasingly being embraced by a variety of different sectors and institutions. This paper examines how algorithms and data-driven technologies, enacted by an organization like Facebook, can induce similarity across an industry. Using theories from organizational sociology and neoinstitutionalism, this paper traces the bureaucratic roots of Big Data and algorithms to examine the institutional dependencies that emerge and are mediated through data-driven and algorithmic logics. This type of analysis sheds light on how organizational contexts are embedded into algorithms, which can then become embedded within other organizational and individual practices. By investigating technical practices as organizational and bureaucratic, discussions about accountability and decision-making can be reframed.
platform_economics  institutions  algorithms  rational_choice  bureaucracy  cybernetics  platform_studies  dana.boyd 
march 2018 by rvenkat
The Power of Algorithms and Algorithmic Power: Conceptualizing Machine Intelligence and Social Structure | Berkeley Institute for Data Science
It’s been just five years since a journal article about neural networks — a form of computer learning algorithm that uses large datasets to learn to classify input — broke through to the popular press. The article described how Google researchers had connected 16,000 computers and set this network loose on millions of images from YouTube — without supervision. The system invented the concept of a cat, and how to identify it. Since then, there has been an explosion in decision-making software that functions in a similar fashion: churning through large datasets to learn to identify and classify, without being given specific instructions on how to do so, and perhaps more importantly, without the human programmers having an understanding of how it actually functions. The era of machine intelligence has fully arrived, and it is accelerating. Much of the engineering world and scientific press has focused on whether such intelligence is like human intelligence, or if it ever will be. In this talk, I will instead explore what having such types of intelligence in the hands of power — governments, corporations, institutions — means. These systems bring about novel capabilities to the powerful at scale, threaten to displace many human tasks (because they can perform those tasks well enough), create new forms of privacy invasion through their inferential capabilities, introduce new error patterns we have neither statistical tools nor cultural or political institutions to deal with, incentivize massive surveillance because they only work well with massive datasets, and more. I will explore some of the technical aspects of these technologies and connect them directly to core questions of sociology, culture and politics. This event is co-sponsored by CITRIS and BIDS.

-- No video links. The abstract feels like a work/book-in-progress.
zeynep.tufekci  algorithms  agency  authoritarianism  surveillance  inequality  sociology_of_technology 
march 2018 by rvenkat
FAT* - 2018 Program
--includes Arvind Narayanan's talk on Algorithmic Fairness, and others
algorithms  ethics  artificial_intelligence  machine_learning  statistics  moral_philosophy 
february 2018 by rvenkat
AUTOMATING INEQUALITY by Virginia Eubanks | Kirkus Reviews
-- As long as they are open, transparent and regulated, I see no problem with an automated bureaucracy. I understand the concerns but I am becoming increasingly numb to this monotonous tone of the critics. Then again, bureaucracy is never known to operate under openness, transparency and sensible regulations to go along with it.

Technology has been mostly good to humankind and there is no reason to expect that *this* is going to be any different. But who knows, maybe civilization will end in a catastrophic *core dump*...
book  algorithms  machine_learning  big_data  automation  ethics  inequality  critical_theory  phobia  sociology_of_technology  bureaucracy  governance  regulation  via:zeynep 
december 2017 by rvenkat
The Ivory Tower Can’t Keep Ignoring Tech - The New York Times
-- call me jaded but this _start an elite institute_ approach to solving societal problems has never worked. For many of us playing in the third division, it is not our will but our bureaucrat-masters, a.k.a. administrators, that decide.
algorithms  machine_learning  ethics  academia  critique  via:mathbabe  NYTimes  for_friends  i_remain_skeptical 
november 2017 by rvenkat
Someone ‘Accidentally’ Locked Away $300M Worth of Other People's Ethereum Funds - Motherboard
-- what people refuse to understand is that these decentralized systems still have material constraints and are prone to human errors. A promise of a democratic institution without _elites_ or _authority_ should have a way of guaranteeing reliability and immunity from human errors.
blockchain  cryptocurrency  automation  institutions  algorithms  technology  democracy  ?  motherboard 
november 2017 by rvenkat
Surveillance Intermediaries by Alan Z. Rozenshtein :: SSRN
Apple’s 2016 fight against a court order commanding it to help the FBI unlock the iPhone of one of the San Bernardino terrorists exemplifies how central the question of regulating government surveillance has become in American politics and law. But scholarly attempts to answer this question have suffered from a serious omission: scholars have ignored how government surveillance is checked by “surveillance intermediaries,” the companies like Apple, Google, and Facebook that dominate digital communications and data storage, and on whose cooperation government surveillance relies. This Article fills this gap in the scholarly literature, providing the first comprehensive analysis of how surveillance intermediaries constrain the surveillance executive. In so doing, it enhances our conceptual understanding of, and thus our ability to improve, the institutional design of government surveillance.

Surveillance intermediaries have the financial and ideological incentives to resist government requests for user data. Their techniques of resistance are: proceduralism and litigiousness that reject voluntary cooperation in favor of minimal compliance and aggressive litigation; technological unilateralism that designs products and services to make surveillance harder; and policy mobilization that rallies legislative and public opinion to limit surveillance. Surveillance intermediaries also enhance the “surveillance separation of powers”; they make the surveillance executive more subject to inter-branch constraints from Congress and the courts, and to intra-branch constraints from foreign-relations and economics agencies as well as the surveillance executive’s own surveillance-limiting components.

The normative implications of this descriptive account are important and cross-cutting. Surveillance intermediaries can both improve and worsen the “surveillance frontier”: the set of tradeoffs — between public safety, privacy, and economic growth — from which we choose surveillance policy. And while intermediaries enhance surveillance self-government when they mobilize public opinion and strengthen the surveillance separation of powers, they undermine it when their unilateral technological changes prevent the government from exercising its lawful surveillance authorities.
surveillance  big_data  privacy  algorithms  ethics  law  civil_rights  GAFA 
october 2017 by rvenkat
Free Speech in the Algorithmic Society: Big Data, Private Governance, and New School Speech Regulation by Jack M. Balkin :: SSRN
We have now moved from the early days of the Internet to the Algorithmic Society. The Algorithmic Society features the use of algorithms, artificial intelligence agents, and Big Data to govern populations. It also features digital infrastructure companies, large multi-national social media platforms, and search engines that sit between traditional nation states and ordinary individuals, and serve as special-purpose governors of speech.

The Algorithmic Society presents two central problems for freedom of expression. First, Big Data allows new forms of manipulation and control, which private companies will attempt to legitimate and insulate from regulation by invoking free speech principles. Here First Amendment arguments will likely be employed to forestall digital privacy guarantees and prevent consumer protection regulation. Second, privately owned digital infrastructure companies and online platforms govern speech much as nation states once did. Here the First Amendment, as normally construed, is simply inadequate to protect the practical ability to speak.

The first part of the essay describes how to regulate online businesses that employ Big Data and algorithmic decision making consistent with free speech principles. Some of these businesses are "information fiduciaries" toward their end-users; they must exercise duties of good faith and non-manipulation. Other businesses who are not information fiduciaries have a duty not to engage in "algorithmic nuisance": they may not externalize the costs of their analysis and use of Big Data onto innocent third parties.

The second part of the essay turns to the emerging pluralist model of online speech regulation. This pluralist model contrasts with the traditional dyadic model in which nation states regulated the speech of their citizens.

In the pluralist model, territorial governments continue to regulate the speech directly. But they also attempt to coerce or co-opt owners of digital infrastructure to regulate the speech of others. This is "new school" speech regulation. Digital infrastructure owners, and especially social media companies, now act as private governors of speech communities, creating and enforcing various rules and norms of the communities they govern. Finally, end users, civil society organizations, hackers, and other private actors repeatedly put pressure on digital infrastructure companies to regulate speech in certain ways and not to regulate it in others. This triangular tug of war -- rather than the traditional dyadic model of states regulating the speech of private parties -- characterizes the practical ability to speak in the algorithmic society.

The essay uses the examples of the right to be forgotten and the problem of fake news to illustrate the emerging pluralist model -- and new school speech regulation -- in action.

As private governance becomes central to freedom of speech, both end-users and nation states put pressure on private governance. Nation states attempt to co-opt private companies into becoming bureaucracies for the enforcement of hate speech regulation and new doctrines like the right to be forgotten. Conversely, end users increasingly demand procedural guarantees, due process, transparency, and equal protection from private online companies.

The more that end-users view businesses as governors, or as special-purpose sovereigns, the more end-users will expect -- and demand -- that these companies should conform to the basic obligations of governors towards those they govern. These obligations include procedural fairness in handling complaints and applying sanctions, notice, transparency, reasoned explanations, consistency, and conformity to rule of law values -- the “law” in this case being the publicly stated norms and policies of the company. Digital infrastructure companies, in turn, will find that they must take on new social obligations to meet these growing threats and expectations from nation states and end-users alike.
freedom_of_speech  internet  regulation  governance  administrative_state  big_data  algorithms  privacy  data  artificial_intelligence  machine_learning  ethics  philosophy_of_technology  new_media  social_media  networked_public_sphere  public_sphere  GAFA 
september 2017 by rvenkat
Can Robots Be Lawyers? Computers, Lawyers, and the Practice of Law by Dana Remus, Frank S. Levy :: SSRN
We assess frequently-advanced arguments that automation will soon replace much of the work currently performed by lawyers. Our assessment addresses three core weaknesses in the existing literature: (i) a failure to engage with technical details to appreciate the capacities and limits of existing and emerging software; (ii) an absence of data on how lawyers divide their time among various tasks, only some of which can be automated; and (iii) inadequate consideration of whether algorithmic performance of a task conforms to the values, ideals and challenges of the legal profession.

Combining a detailed technical analysis with a unique data set on time allocation in large law firms, we estimate that automation has an impact on the demand for lawyers’ time that, while measurable, is far less significant than popular accounts suggest. We then argue that the existing literature’s narrow focus on employment effects should be broadened to include the many ways in which computers are changing (as opposed to replacing) the work of lawyers. We show that the relevant evaluative and normative inquiries must begin with the ways in which computers perform various lawyering tasks differently than humans. These differences inform the desirability of automating various aspects of legal practice, while also shedding light on the core values of legal professionalism.
automation  algorithms  artificial_intelligence  judgment_decision-making  law  labor  technology 
september 2017 by rvenkat
Algorithmic Labor and Information Asymmetries: A Case Study of Uber’s Drivers by Alex Rosenblat, Luke Stark :: SSRN
Uber manages a large, disaggregated workforce through its ridehail platform, one that delivers a relatively standardized experience to passengers while simultaneously promoting its drivers as entrepreneurs whose work is characterized by freedom, flexibility, and independence. Through a nine-month empirical study of Uber driver experiences, we found that Uber does leverage significant indirect control over how drivers do their jobs. Our conclusions are twofold: First, the information and power asymmetries produced by the Uber application are fundamental to its ability to structure control over its workers; second, the rhetorical invocations of digital technology and algorithms are used to structure asymmetric corporate relationships to labor, which favor the former. Our study of the Uber driver experience points to the need for greater attention to the role of platform disintermediation in shaping power relations and communications between employers and workers.
labor  economics  algorithms  automation  microeconomics  behavioral_economics 
april 2017 by rvenkat
The Taking Economy: Uber, Information, and Power by Ryan Calo, Alex Rosenblat :: SSRN
Sharing economy firms such as Uber and Airbnb facilitate trusted transactions between strangers on digital platforms. This creates economic and other value and raises a set of concerns around racial bias, safety, and fairness to competitors and workers that legal scholarship has begun to address. Missing from the literature, however, is a fundamental critique of the sharing economy grounded in asymmetries of information and power. This Article, coauthored by a law professor and a technology ethnographer who studies the ride-hailing community, furnishes such a critique and indicates a path toward a meaningful response.

Commercial firms have long used what they know about consumers to shape their behavior and maximize profits. By virtue of sitting between consumers and providers of services, however, sharing economy firms have a unique capacity to monitor and nudge all participants — including people whose livelihood may depend on the platform. Much activity is hidden away from view, but preliminary evidence suggests that sharing economy firms may already be leveraging their access to information about users and their control over the user experience to mislead, coerce, or otherwise disadvantage sharing economy participants.

This Article argues that consumer protection law, with its longtime emphasis on asymmetries of information and power, is relatively well positioned to address this under-examined aspect of the sharing economy. But the regulatory response to date seems outdated and superficial. To be effective, legal interventions must (1) reflect a deeper understanding of the acts and practices of digital platforms and (2) interrupt the incentives of sharing economy firms to abuse their position.
algorithms  data  ethics  law  behavioral_economics  labor  automation  dmce  teaching 
april 2017 by rvenkat
Study: Breitbart-led right-wing media ecosystem altered broader media agenda - Columbia Journalism Review
--Interesting, their main argument is that the extreme-right media participated in a concerted strategy and changed the mainstream media's agenda while simultaneously sealing their base's information access.
us_elections  us_politics  polarization  social_media  social_networks  algorithms  right-wing_populism  us_conservative_thought  journalism  report  data  ?  networks  teaching 
march 2017 by rvenkat
Human Decisions and Machine Predictions
We examine how machine learning can be used to improve and understand human decision-making. In particular, we focus on a decision that has important policy consequences. Millions of times each year, judges must decide where defendants will await trial—at home or in jail. By law, this decision hinges on the judge’s prediction of what the defendant would do if released. This is a promising machine learning application because it is a concrete prediction task for which there is a large volume of data available. Yet comparing the algorithm to the judge proves complicated. First, the data are themselves generated by prior judge decisions. We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the single variable that the algorithm focuses on; for instance, judges may care about racial inequities or about specific crimes (such as violent crimes) rather than just overall crime risk. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: a policy simulation shows crime can be reduced by up to 24.8% with no change in jailing rates, or jail populations can be reduced by 42.0% with no increase in crime rates. Moreover, we see reductions in all categories of crime, including violent ones. Importantly, such gains can be had while also significantly reducing the percentage of African-Americans and Hispanics in jail. We find similar results in a national dataset as well. In addition, by focusing the algorithm on predicting judges’ decisions, rather than defendant behavior, we gain some insight into decision-making: a key problem appears to be that judges respond to ‘noise’ as if it were signal.
These results suggest that while machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals.
algorithms  criminal_justice  machine_learning  expert_judgment  ethics  policy  dmce  teaching  sendhil.mullainathan 
february 2017 by rvenkat
The Ecology of Collective Behavior
Similar patterns of interaction, such as network motifs and feedback loops, are used in many natural collective processes, probably because they have evolved independently under similar pressures. Here I consider how three environmental constraints may shape the evolution of collective behavior: the patchiness of resources, the operating costs of maintaining the interaction network that produces collective behavior, and the threat of rupture of the network. The ants are a large and successful taxon that have evolved in very diverse environments. Examples from ants provide a starting point for examining more generally the fit between the particular pattern of interaction that regulates activity, and the environment in which it functions.
deborah.gordon  complex_system  collective_dynamics  collective_animal_behavior  evolution  algorithms 
december 2016 by rvenkat
Distributed nestmate recognition in ants | Proceedings of the Royal Society of London B: Biological Sciences
We propose a distributed model of nestmate recognition, analogous to the one used by the vertebrate immune system, in which colony response results from the diverse reactions of many ants. The model describes how individual behaviour produces colony response to non-nestmates. No single ant knows the odour identity of the colony. Instead, colony identity is defined collectively by all the ants in the colony. Each ant responds to the odour of other ants by reference to its own unique decision boundary, which is a result of its experience of encounters with other ants. Each ant thus recognizes a particular set of chemical profiles as being those of non-nestmates. This model predicts, as experimental results have shown, that the outcome of behavioural assays is likely to be variable, that it depends on the number of ants tested, that response to non-nestmates changes over time and that it changes in response to the experience of individual ants. A distributed system allows a colony to identify non-nestmates without requiring that all individuals have the same complete information and helps to facilitate the tracking of changes in cuticular hydrocarbon profiles, because only a subset of ants must respond to provide an adequate response.
deborah.gordon  complex_system  collective_dynamics  collective_animal_behavior  evolution  algorithms  collective_cognition 
december 2016 by rvenkat
The Evolution of the Algorithms for Collective Behavior: Cell Systems
Collective behavior is the outcome of a network of local interactions. Here, I consider collective behavior as the result of algorithms that have evolved to operate in response to a particular environment and physiological context. I discuss how algorithms are shaped by the costs of operating under the constraints that the environment imposes, the extent to which the environment is stable, and the distribution, in space and time, of resources. I suggest that a focus on the dynamics of the environment may provide new hypotheses for elucidating the algorithms that produce the collective behavior of cellular systems.

-- law of requisite variety for decentralized systems?
deborah.gordon  complex_system  collective_dynamics  collective_animal_behavior  evolution  algorithms  collective_cognition  ?  networks  teaching 
december 2016 by rvenkat
[1611.09414] Split-door criterion for causal identification: Automatic search for natural experiments
Unobserved or unknown confounders complicate even the simplest attempts to estimate the effect of one variable on another using observational data. When cause and effect are both affected by unobserved confounders, methods based on identifying natural experiments have been proposed to eliminate confounds. However, their validity is hard to verify because they depend on assumptions about the independence of variables that, by definition, cannot be measured. In this paper we investigate a particular scenario in time series data that permits causal identification in the presence of unobserved confounders and present an algorithm to automatically find such scenarios. Specifically, we examine what we call the split-door setting, when the effect variable can be split up into two parts: one that is potentially affected by the cause, and another that is independent of it. We show that when both of these variables are caused by the same (unobserved) confounders, the problem of identification reduces to that of testing for independence among observed variables. We discuss various situations in which split-door variables are commonly recorded in both online and offline settings, and demonstrate the method by estimating the causal impact of Amazon's recommender system, obtaining more than 23,000 natural experiments that provide estimates similar to, but more precise than, those of past studies.
causal_inference  experimental_design  observational_studies  algorithms  amazon_turk  intervention  duncan.watts 
december 2016 by rvenkat
[1609.05807] Inherent Trade-Offs in the Fair Determination of Risk Scores
Recent discussion in the public sphere about algorithmic classification has involved tension between competing notions of what it means for a probabilistic classification to be fair to different groups. We formalize three fairness conditions that lie at the heart of these debates, and we prove that except in highly constrained special cases, there is no method that can satisfy these three conditions simultaneously. Moreover, even satisfying all three conditions approximately requires that the data lie in an approximate version of one of the constrained special cases identified by our theorem. These results suggest some of the ways in which key notions of fairness are incompatible with each other, and hence provide a framework for thinking about the trade-offs between them.

-- with Jon Kleinberg
data  ethics  algorithms  big_data  privacy  sendhil.mullainathan 
december 2016 by rvenkat
[1606.04956] Assessing Human Error Against a Benchmark of Perfection
An increasing number of domains are providing us with detailed trace data on human decisions in settings where we can evaluate the quality of these decisions via an algorithm. Motivated by this development, an emerging line of work has begun to consider whether we can characterize and predict the kinds of decisions where people are likely to make errors.
To investigate what a general framework for human error prediction might look like, we focus on a model system with a rich history in the behavioral sciences: the decisions made by chess players as they select moves in a game. We carry out our analysis at a large scale, employing datasets with several million recorded games, and using chess tablebases to acquire a form of ground truth for a subset of chess positions that have been completely solved by computers but remain challenging even for the best players in the world.
We organize our analysis around three categories of features that we argue are present in most settings where the analysis of human error is applicable: the skill of the decision-maker, the time available to make the decision, and the inherent difficulty of the decision. We identify rich structure in all three of these categories of features, and find strong evidence that in our domain, features describing the inherent difficulty of an instance are significantly more powerful than features based on skill or time.
sendhil.mullainathan  judgment_decision-making  algorithms  dmce  teaching 
december 2016 by rvenkat
[1611.04135] Automated Inference on Criminality using Face Images
We study, for the first time, automated inference on criminality based solely on still face images. Via supervised machine learning, we build four classifiers (logistic regression, KNN, SVM, CNN) using facial images of 1856 real persons controlled for race, gender, age and facial expressions, nearly half of whom were convicted criminals, for discriminating between criminals and non-criminals. All four classifiers perform consistently well and produce evidence for the validity of automated face-induced inference on criminality, despite the historical controversy surrounding the topic. Also, we find some discriminating structural features for predicting criminality, such as lip curvature, eye inner corner distance, and the so-called nose-mouth angle. Above all, the most important discovery of this research is that criminal and non-criminal face images populate two quite distinctive manifolds. The variation among criminal faces is significantly greater than that of the non-criminal faces. The two manifolds consisting of criminal and non-criminal faces appear to be concentric, with the non-criminal manifold lying in the kernel with a smaller span, exhibiting a law of normality for faces of non-criminals. In other words, the faces of the general law-abiding public have a greater degree of resemblance compared with the faces of criminals, or criminals have a higher degree of dissimilarity in facial appearance than normal people.

--WTF??? _|_ ....
algorithms  bias  machine_learning  big_data  ethics  for_friends 
november 2016 by rvenkat
All that Glitters Is Not Gold: Comparing Backtest and Out-of-Sample Performance on a Large Cohort of Trading Algorithms by Thomas Wiecki, Andrew Campbell, Justin Lent, Jessica Stauth :: SSRN
When automated trading strategies are developed and evaluated using backtests on historical pricing data, there exists a tendency to overfit to the past. Using a unique dataset of 888 algorithmic trading strategies developed and backtested on the Quantopian platform with at least 6 months of out-of-sample performance, we study the prevalence and impact of backtest overfitting. Specifically, we find that commonly reported backtest evaluation metrics like the Sharpe ratio offer little value in predicting out-of-sample performance (R² < 0.025). In contrast, higher order moments, like volatility and maximum drawdown, as well as portfolio construction features, like hedging, show significant predictive value of relevance to quantitative finance practitioners. Moreover, in line with prior theoretical considerations, we find empirical evidence of overfitting – the more backtesting a quant has done for a strategy, the larger the discrepancy between backtest and out-of-sample performance. Finally, we show that by training non-linear machine learning classifiers on a variety of features that describe backtest behavior, out-of-sample performance can be predicted at a much higher accuracy (R² = 0.17) on hold-out data compared to using linear, univariate features. A portfolio constructed on predictions on hold-out data performed significantly better out-of-sample than one constructed from algorithms with the highest backtest Sharpe ratios.
algorithms  finance  machine_learning  evaluation  for_friends  via:noahpinion 
june 2016 by rvenkat
Algorithms Need Managers, Too
-- a very business-centric article. Nothing new here ...
algorithms  big_data  generalization  via:smullainathan 
march 2016 by rvenkat