ModelDepot - Open, Transparent Machine Learning for Engineers
A platform for discovering, sharing, and discussing easy-to-use, pre-trained machine learning models.
machinelearning 
8 days ago
California falling short on climate change goals because driving is increasing, report finds - Los Angeles Times
The state’s inability to curb the amount of driving puts it at risk of failing to meet overall climate change goals. The state hit its 2020 goal for reducing emissions below 1990 levels four years in advance largely because of major improvements to the electricity grid. But climate regulators warned that the state’s goal to cut emissions 40% below 1990 levels by 2030 won’t be met without a major turnaround in the transportation sector.

Dramatically increasing the number of electric vehicles on the road will not solve the problem, the report said. Even if new car sales of zero-emission vehicles increase nearly tenfold from today, the state would still need to reduce vehicle miles traveled per capita by 25% to meet the 2030 goal.

“California will not achieve the necessary greenhouse gas emissions reductions to meet mandates for 2030 and beyond without significant changes to how communities and transportation systems are planned, funded and built,” the report said.
climatechange  transport  transit 
10 days ago
We are Google employees – Google must drop Dragonfly | Hacker News
Pretty bold. A lot of people are saying this won't work, but speaking from my own experience, you'd be surprised what companies are amenable to when it comes to business.

I'm an engine mechanic by trade, and our shops handle bids for cash-strapped local governments that outsource their motor pool maintenance. We do things like fire trucks and police cars, but we were working on a new regional idea as a "service center" for municipalities that purchased MRAP combat vehicles for their police departments. https://en.wikipedia.org/wiki/MRAP

We all, especially the veterans I work with, hated this idea. MRAPs are for combat, not police work, and have a dangerous propensity to roll over in city streets or escalate already violent situations. Fourteen of us sent a signed letter to the owner and senior management detailing our major concerns and heard nothing back for about a month. Then out of the blue we got a call for a meeting with 3-4 very senior managers at a local Irish bar.

They paid for dinner and tried to explain how the business would be extremely lucrative: we would all see major bonuses, we could hire more workers, and grow the business faster than just large truck repair. It took 3 very emotional hours, but we eventually talked down a handful of people from making a very wrong decision.

For a week after, we were all sort of stunned that it actually worked at all. Tire cages meant for MRAP tires were cut up and turned into random parts holders or new hangers for air lines... one even replaced our mailbox post.
ethics  tech 
11 days ago
Feminism's Tipping Point: Who Wins from Leaning in? | Dissent Magazine
Now, with Sandberg’s Lean In, we have a book that tells the story that she and Facebook want to tell about sexism: women can solve it themselves by working harder. This story works in the first instance to supplant a more structural feminist critique of the workplace, but beyond that it promotes Facebook as a cutting-edge work environment where men and women are encouraged to work “harder better faster stronger” in support of the company’s domination and success.

The loser in the Lean In vision of work isn’t one version of feminism or another—other feminist organizations and publications will continue to flourish alongside Lean In, though they may receive less media attention—but uncapitalized, unmonetized life itself. Just as Facebook relies on users to faithfully upload their data to drive site growth, Facebook relies on its employees to devote ever greater time to growing Facebook’s empire.
feminism  facebook  capitalism 
12 days ago
Reflections on Random Kitchen Sinks – arg min blog
The alchemy talk...

"Batch Norm is a technique that speeds up gradient descent on deep nets. You sprinkle it between your layers and gradient descent goes faster. I think it’s ok to use techniques we don’t understand. I only vaguely understand how an airplane works, and I was fine taking one to this conference. But it’s always better if we build systems on top of things we do understand deeply."
machinelearning 
12 days ago
I hate manager READMEs – Camille Fournier – Medium
If you want to build trust, you do that by showing up, talking to your team both individually and as a team, and behaving in an ethical, reliable manner. Over, and over, and over again. You don’t get it from writing a doc about how you deserve their trust.

One of the worst parts of these docs is the airing of your own perceived personality faults. I suck at niceties. I get heated sometimes in discussions. I don’t give praise very much. If you know you have foibles/quirks that you in fact want to change about yourself, do the work. Don’t put them out there for your team to praise you for the intention to do the work, just do it. And while you get to decide which of your foibles/quirks/challenges you will or will not change about yourself, as the manager, it is on you to make your team effective and that may in fact mean changing some things about yourself that you don’t want to change. Writing them down feels good, like you’ve been honest and vulnerable and no one can be surprised when you behave badly, after all you warned them! But it does not excuse these bad behaviors, and it certainly does not take the sting away when someone feels shut down by your rudeness or unhappy from a lack of positive feedback. If you must write a README, please skip this section. Keep your bad behaviors to yourself, and hold yourself accountable for their impact.
management 
15 days ago
A Brief History of DevOps, Part IV: Continuous Delivery and Continuous Deployment
“Continuous delivery is the practice of ensuring that software is always ready to be deployed. Part of that insurance is testing every change you’ve made (a.k.a. continuous integration). In addition, you’ve also made the effort to package up the actual artifacts that are going to be deployed — perhaps you’ve already deployed those artifacts to a staging environment.”
devops 
15 days ago
Brazil’s Election Is The End Of The Far-Right, Populist Wave. Now We Live With The Results.
“The way the world is using their phones is almost completely dominated by a few Silicon Valley companies. The abuse that is happening is due to their inability to manage that responsibility. All of this has become so normalized in the three years since it first began to manifest that we just assume now that platforms like Facebook, YouTube, WhatsApp, and Twitter will exacerbate political and social instability”

“Chances are, by now, your country has some, if not all, of the following. First off, you probably have some kind of local internet troll problem, like the MAGAsphere in the US, the Netto-uyoku in Japan, Fujitrolls in Peru, or AK-trolls in Turkey. Your trolls will probably have been radicalized online via some kind of community for young men like Gamergate, Jeuxvideo.com ("videogames.com") in France, ForoCoches ("Cars Forum") in Spain, Ilbe Storehouse in South Korea, 2chan in Japan, or banter Facebook pages in the UK.

Then far-right influencers start appearing, aided by algorithms recommending content that increases user watch time. They will use Facebook, Twitter, and YouTube to transmit and amplify content and organize harassment and intimidation campaigns. If these influencers become sophisticated enough, they will try to organize protests or rallies. The mini fascist comic cons they organize will be livestreamed and operate as an augmented reality game for the people watching at home. Violence and doxxing will follow them.”
politics  facebook  twitter 
15 days ago
Faked Out — Real Life
As long as mass media has existed in the West, there have been complaints about social acceleration, uncertainty, and the loss of a real, knowable world. In other words, our current conversations about the loss of reality are familiar; while each writer attempts to sound innovative, the concerns are evergreen. If the term “infocalypse” is useful, it is as a synonym for modernity, where truth is always two decades ago and dying today, and a new dark age always on the horizon.
media  journalism 
17 days ago
Will Oldham Unmasked | GQ
It's not just the question of whether it cheapens art to put it on Spotify. He also actually likes the idea of having made Schrödinger's album. A record that both is and isn't a record.

“It's private. Unspoiled. It serves all these musical purposes. It's me exploring and achieving new things that I hadn't achieved before. But it's like, why should I confuse things by releasing it?”

I laugh, because it sounds so self-defeating, and then I apologize, because I do not need Oldham to explain what it feels like to work for years to get good at doing something, then watch as uncaring and/or straight-up evil forces hollow out your industry, systematically devaluing the thing you do and training people to experience your work in a half-conscious manner that robs it of all meaning. I do not need him to explain this, because it's 2018 and I work in journalism.
music  publishing 
21 days ago
Sears Found Some Useful Bonds - Bloomberg
This is super not investing advice, but if your investment strategy is to email yourself a list of stocks and see if Gmail says “buy” or “sell,” I would be interested to hear about it.
anecdata 
22 days ago
Quote by Italo Calvino: “The inferno of the living is not something that...” | Goodreads
“The inferno of the living is not something that will be; if there is one, it is what is already here, the inferno where we live every day, that we form by being together. There are two ways to escape suffering it. The first is easy for many: accept the inferno and become such a part of it that you can no longer see it. The second is risky and demands constant vigilance and apprehension: seek and learn to recognize who and what, in the midst of inferno, are not inferno, then make them endure, give them space.”
quotation  urbanism  culture 
27 days ago
The Bus Is Still Best
“In almost every public meeting I attend, citizens complain about seeing buses with empty seats, lecturing me about how smaller vehicles would be less wasteful. But that’s not the case. Because the cost is in the driver, a wise transit agency runs the largest bus it will ever need during the course of a shift. In an outer suburb, that empty big bus makes perfect sense if it will be mobbed by schoolchildren or commuters twice a day.”

“How many people’s doors can a driver get to in an hour, including the minute or two that the customer spends grabbing their things and boarding? The intuitively obvious answer is the right one: not very many. An Eno Foundation report promoting microtransit could not cite a case study doing better than four boardings an hour of service. John Urgo, the planner of demand-responsive service for AC Transit in Oakland, California, has said that seven boardings an hour is “the best we hope to achieve.” Few fixed-route buses perform that poorly. Across sprawling Silicon Valley, for example, fixed-route buses carried 12 to 45 people an hour in 2015. In a dense city such as Philadelphia, the number can exceed 80”
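A back-of-the-envelope check of that ceiling, using purely illustrative per-pickup times of my own rather than figures from the article:

```python
# Rough boardings-per-driver-hour for door-to-door service.
# Both per-pickup times below are assumptions for illustration, not data from the piece.
drive_minutes_per_pickup = 8   # assumed average drive between doors
dwell_minutes_per_pickup = 2   # "the minute or two that the customer spends grabbing their things and boarding"
boardings_per_hour = 60 / (drive_minutes_per_pickup + dwell_minutes_per_pickup)
print(boardings_per_hour)      # 6.0 -- in the same range as the 4 to 7 boardings cited above
```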

“In my work as a transit planner, I try to help transit boards think clearly about what balance they want to strike between ridership goals (putting service where lots of people will ride) and coverage goals (providing a little service to everyone). Many citizens demand coverage service and complain if it is removed, but the more coverage service is offered, the less ridership a municipality can expect under a fixed budget. Finding the right balance is a painful process of balancing competing demands, which is the job of elected officials or the board members they appoint.”

“So what technologies make sense in public transit? Efficient transit networks are made of many technologies, each the right one for its own situation. Rail is for high-capacity markets, where you need to move hundreds of people per vehicle. Ferries and aerial gondolas overcome certain obstacles. But everywhere else, the bus is the thing that’s easiest to make abundant. Because labor is the main limit on their quantity, they can be much more abundant after full automation.”
urbanism  transport 
4 weeks ago
The Geomblog: On teaching ethics to tech companies
This seems rather ridiculous. When chemical companies were dumping pesticides on the land by the ton and Rachel Carson wrote Silent Spring, we didn't shake our heads sorrowfully at companies and send them moral philosophers. We founded the EPA!

When the milk we drink was being adulterated with borax and formaldehyde and all kinds of other horrific additives that Deborah Blum documents so scarily in her new book 'The Poison Squad', we didn't shake our heads sorrowfully at food vendors and ask them to grow up. We passed a law that led eventually to the formation of the FDA.

Tech companies are companies. They are not moral agents, or even immoral agents. They are amoral profit-maximizing vehicles for their shareholders (and this is not even a criticism). Companies are supposed to make money, and do it well. Facebook's stock price didn't slip when it was discovered how their systems had been manipulated for propaganda. It slipped when they proposed changes to their newsfeed ratings mechanisms to address these issues.

It makes no sense to rely on tech companies to police themselves, and to his credit, Brad Smith of Microsoft made exactly this point in a recent post on face recognition systems. Regulation, policing, and whatever else we might imagine have to come from the outside. While I don't claim that regulation mechanisms all work as they are currently conceived, the very idea of checks and balances seems more robust than merely hoping that tech companies will get their act together on their own.
ethics  tech 
6 weeks ago
NLP's ImageNet moment has arrived
Word2vec and related methods are shallow approaches that trade expressivity for efficiency. Using word embeddings is like initializing a computer vision model with pretrained representations that only encode edges: they will be helpful for many tasks, but they fail to capture higher-level information that might be even more useful. A model initialized with word embeddings needs to learn from scratch not only to disambiguate words, but also to derive meaning from a sequence of words. This is the core aspect of language understanding, and it requires modeling complex language phenomena such as compositionality, polysemy, anaphora, long-term dependencies, agreement, negation, and many more. It should thus come as no surprise that NLP models initialized with these shallow representations still require a huge number of examples to achieve good performance.

In NLP, models are typically a lot shallower than their CV counterparts. Analysis of features has thus mostly focused on the first embedding layer, and little work has investigated the properties of higher layers for transfer learning. Let us consider the datasets that are large enough, fulfilling desideratum #1. Given the current state of NLP, there are several contenders.

Language modeling (LM) aims to predict the next word given the preceding words. Existing benchmark datasets consist of up to 1B words, but as the task is unsupervised, any number of words can be used for training. (The original post shows examples from the popular WikiText-2 dataset, consisting of Wikipedia articles.)
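As a concrete illustration of the language modeling objective (a simplified sketch of my own, not code from the article), a tiny PyTorch model trained to predict each next token from the preceding ones looks roughly like this:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 10_000, 128, 256

class TinyLM(nn.Module):
    """Minimal next-word predictor: embed -> LSTM -> project back to the vocabulary."""
    def __init__(self):
        super().__init__()
        # In the "shallow" approach described above, this embedding layer is all that
        # gets pretrained, e.g. via nn.Embedding.from_pretrained(word_vectors).
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):           # tokens: (batch, seq_len) word ids
        hidden, _ = self.rnn(self.embed(tokens))
        return self.head(hidden)         # (batch, seq_len, vocab_size) logits

model = TinyLM()
tokens = torch.randint(0, vocab_size, (4, 20))   # stand-in batch of token ids
logits = model(tokens[:, :-1])                   # predict token t+1 from tokens up to t
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1)
)
loss.backward()   # no labels beyond the raw text itself, which is the point of the task
```

Pretraining the whole model this way, rather than only the embedding matrix, is what the post argues gives NLP its ImageNet-style starting point.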

In light of this step change, it is very likely that in a year’s time NLP practitioners will download pretrained language models rather than pretrained word embeddings for use in their own models, similarly to how pre-trained ImageNet models are the starting point for most CV projects nowadays.
nlp  deeplearning 
7 weeks ago
It was raining in the data center
“Although the actual paths of fiber-optic cables are considered state and company secrets, it is not unlikely that most or all of the Facebook facility’s data runs along this route. In The Prehistory of the Cloud, Tung-Hui Hu describes the origin of private data service with telecommunications giant Sprint (Southern Pacific Railroad Internal Network), which sold excess fiber-optic bandwidth along train lines to consumers beginning in 1978. He goes on to state in the same text that “virtually all traffic on the US Internet runs across the same routes established in the 19th century”.”
military  geography  usa  oregon  internet  infrastructure  facebook 
8 weeks ago
The Key to Everything | by Freeman Dyson | The New York Review of Books
Freeman Dyson, May 10, 2018 issue
Scale: The Universal Laws of Growth, Innovation, Sustainability, and the Pace of Life in Organisms, Cities, Economies, and Companies
by Geoffrey West
Penguin, 479 pp., $30.00
maths  astronomy  science  review  biology  complexity  book 
8 weeks ago
David Foster Wallace on John McCain, 2000 Rolling Stone Story – Rolling Stone
By all means stay home if you want, but don’t bullshit yourself that you’re not voting. In reality, there is no such thing as not voting: you either vote by voting, or you vote by staying home and tacitly doubling the value of some Diehard’s vote.
politics  usa 
8 weeks ago
[Easy Chair] | Forget About It, by Corey Robin | Harper's Magazine
“Ever since the 2016 presidential election, we’ve been warned against normalizing Trump. That fear of normalization misstates the problem, though. It’s never the immediate present, no matter how bad, that gets normalized — it’s the not-so-distant past. Because judgments of the American experiment obey a strict economy, in which every critique demands an outlay of creed and every censure of the present is paid for with a rehabilitation of the past, any rejection of the now requires a normalization of the then.”

“Whenever I said this, people got angry with me. They still do. For months, now years, I puzzled over that anger. My wife explained it to me recently: in making the case for continuity between past and present, I sound complacent about the now. I sound like I’m saying that nothing is wrong with Trump, that everything will work out. I thought I was giving people a steadying anchor, a sense that they — we — had faced this threat before, a sense that this is the right-wing monster we’ve been fighting all along, since Nixon and Reagan and George W. Bush. Turns out I was removing their ballast, setting them afloat in the intermittent and inconstant air.”
politics  usa  republican  history 
11 weeks ago
Traffic Jam? Blame 'Induced Demand.' - CityLab
In urbanism, “induced demand” refers to the idea that increasing roadway capacity encourages more people to drive, thus failing to improve congestion.
Since the concept was introduced in the 1960s, numerous academic studies have demonstrated the existence of ID.
But some economists argue that the effects of ID are overstated, or outweighed by the benefits of greater automobility.
Few federal, state, and local departments of transportation are thought to adequately account for ID in their long-term planning.

Many departments of transportation are instead touting the benefits of toll lanes, a more au courant form of roadway capacity expansion.

Such pricing tools can help mitigate induced demand, but these, too, come with their own negative externalities. Tolls and ever-elusive congestion pricing schemes have been criticized for being a regressive form of taxation that is spread among high- and low-income drivers alike. The real solution to induced demand could be freeway removal—call it reduced demand—which has been shown to reduce auto traffic while also stimulating new development.
urbanism  transport 
11 weeks ago
The Annotated Transformer
The Transformer from “Attention is All You Need” has been on a lot of people’s minds over the last year. Besides producing major improvements in translation quality, it provides a new architecture for many other NLP tasks. The paper itself is very clearly written, but the conventional wisdom has been that it is quite difficult to implement correctly.

In this post I present an “annotated” version of the paper in the form of a line-by-line implementation. I have reordered and deleted some sections from the original paper and added comments throughout. This document itself is a working notebook, and should be a completely usable implementation. In total there are 400 lines of library code which can process 27,000 tokens per second on 4 GPUs.
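For a sense of what the notebook builds up, here is the paper's core operation, scaled dot-product attention, in a few lines of PyTorch (a generic paraphrase of the published formula, not an excerpt from the post):

```python
import math
import torch

def scaled_dot_product_attention(query, key, value, mask=None):
    """softmax(QK^T / sqrt(d_k)) V, with an optional mask applied to the scores."""
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = scores.softmax(dim=-1)
    return weights @ value, weights

q = k = v = torch.randn(2, 8, 10, 64)   # (batch, heads, seq_len, d_k)
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape)    # torch.Size([2, 8, 10, 64])
```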
nlp  deeplearning  pytorch 
11 weeks ago
How do we capture structure in relational data?
The key insight behind the DeepWalk algorithm is that random walks in graphs are a lot like sentences.

Grover and Leskovec (2016) generalize DeepWalk into the node2vec algorithm. Instead of “first-order” random walks that choose the next node based only on the current node, node2vec uses a family of “second-order” random walks that depend on both the current node and the one before it.
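To make the "walks are sentences" analogy concrete, here is a minimal sketch of my own (not code from the post) of DeepWalk-style first-order walks over a toy adjacency list. The resulting node sequences would then be fed to a skip-gram model exactly as if they were sentences of words; node2vec would instead bias the choice of the next node using the previous node and its return/in-out parameters p and q:

```python
import random

# Toy undirected graph as an adjacency list.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b"],
}

def random_walk(graph, start, length):
    """First-order walk: the next node depends only on the current node."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

# A few "sentences" per starting node; these sequences play the role of text
# in a word2vec-style skip-gram model.
corpus = [random_walk(graph, node, length=10) for node in graph for _ in range(5)]
print(corpus[0])
```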

Under the structural hypothesis, nodes that serve similar structural functions — for example, nodes that act as a hub — are part of the same neighborhood due to their higher-order structural significance.

For instance, a user’s graph of friends on a social network can grow and shrink over time. We could apply node2vec, but there are two downsides.

It could be computationally expensive to run a new instance of node2vec every time the graph is modified.

Additionally, there is no guarantee that multiple applications of node2vec will produce similar or even comparable matrices.

Node2vec and DeepWalk produce summaries that are later analyzed with a machine learning technique. By contrast, graph convolutional networks (GCNs) present an end-to-end approach to structured learning.
graph  machinelearning  deeplearning 
12 weeks ago
Software 2.0 – Andrej Karpathy – Medium
It turns out that a large portion of real-world problems have the property that it is significantly easier to collect the data (or more generally, identify a desirable behavior) than to explicitly write the program. In these cases, the programmers will split into two teams. The 2.0 programmers manually curate, maintain, massage, clean and label datasets; each labeled example literally programs the final system because the dataset gets compiled into Software 2.0 code via the optimization. Meanwhile, the 1.0 programmers maintain the surrounding tools, analytics, visualizations, labeling interfaces, infrastructure, and the training code.

Software 2.0 is:

Computationally homogeneous. It is much easier to make various correctness/performance guarantees.

Simple to bake into silicon. As a corollary, since the instruction set of a neural network is relatively small, it is significantly easier to implement these networks much closer to silicon, e.g. with custom ASICs, neuromorphic chips, and so on.

Constant running time. Every iteration of a typical neural net forward pass takes exactly the same amount of FLOPS. There is zero variability based on the different execution paths your code could take through some sprawling C++ code base.

Constant memory use. Related to the above, there is no dynamically allocated memory anywhere, so there is also little possibility of swapping to disk, or memory leaks that you have to hunt down in your code.

It is highly portable. A sequence of matrix multiplies is significantly easier to run on arbitrary computational configurations compared to classical binaries or scripts.

In Software 2.0 we can take our network, remove half of the channels, retrain, and there — it runs exactly at twice the speed and works a bit worse.
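The channel-halving point is mostly a one-line change in practice. A hedged sketch of my own (not Karpathy's code): the same architecture at full and half width, where shrinking the model is just a different constructor argument followed by retraining on the same data:

```python
import torch.nn as nn

def make_mlp(width):
    """Same architecture, parameterized only by channel width."""
    return nn.Sequential(
        nn.Linear(784, width),
        nn.ReLU(),
        nn.Linear(width, width),
        nn.ReLU(),
        nn.Linear(width, 10),
    )

full = make_mlp(512)    # original network
half = make_mlp(256)    # "remove half of the channels" -- then retrain

n_params = lambda m: sum(p.numel() for p in m.parameters())
print(n_params(full), n_params(half))   # far fewer parameters: smaller, faster, a bit less accurate
```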

Finally, and most importantly, a neural network is a better piece of code than anything you or I can come up with in a large fraction of valuable verticals, which currently at the very least involve anything to do with images/video and sound/speech.

The 2.0 stack can fail in unintuitive and embarrassing ways, or worse, it can “silently fail”, e.g., by silently adopting biases in its training data, which are very difficult to properly analyze and examine when their sizes are easily in the millions in most cases.

Finally, we’re still discovering some of the peculiar properties of this stack. For instance, the existence of adversarial examples and attacks highlights the unintuitive nature of this stack.

When the network fails in some hard or rare cases, we do not fix those predictions by writing code, but by including more labeled examples of those cases. Who is going to develop the first Software 2.0 IDEs, which help with all of the workflows in accumulating, visualizing, cleaning, labeling, and sourcing datasets? Perhaps the IDE bubbles up images that the network suspects are mislabeled based on the per-example loss, or assists in labeling by seeding labels with predictions, or suggests useful examples to label based on the uncertainty of the network’s predictions.
machinelearning  workbench 
12 weeks ago
1970s 20c abtesting academia adversarial advertising ai algorithms amazon anecdata antarctica api apple architecture art arxiv astro async audio aws backup bash bayes bias bitcoin book books brexit business c california capitalism car causality churn cia climatechange cloudfront concurrency conference crime cryptocurrency cryptography cs culture data database dataengineering datascience deeplearning design devops differentialprivacy diversity docker economics education engineering english espionage ethics eu europe facebook family federatedlearning feminism fiction film finance functional git github golang google h1b hardware haskell health hiring history housing immigration infrastructure internet interpretability interview investments jobs journalism js jupyter kubernetes labour lambda language law legal linearalgebra linux losangeles machinelearning macos make management map mapreduce maps marketing math maths me media module money music neuralnetworks newyork nlp notebook numpy nyc oop optimization package pandas parenting patterns phone physics politics predictivemaintenance presentation privacy probabilisticprogramming probability product professional programming psephology publishing pycon2017 pymc3 pytest python pytorch quant r racism recipe recommendation reinforcementlearning remote republican research review rnn rust s3 sanfrancisco science scientism scifi scikitlearn security sentiment serverless sexism siliconvalley slack socialism socialmedia spark sql ssh ssl stan startup statistics summarization surveillance talk tax tech technology tensorflow testing text timeseries tmux transport travel trump tutorial tv twitter uber uk unix urbanism usa versioncontrol video vim visualization vpn web webdev word2vec writing
