ProPublica Illinois Q&A: Meet Data Reporter Sandhya… — ProPublica
I organize my life with spreadsheets because why not? I actually have several spreadsheets for general life usage. I have a spreadsheet of all the things that I have in my apartment for when I move. I have a list of all the places I’ve visited, a list of all the flights I’ve taken and a general packing list so that every time I don’t have to create a new file. I also have a spreadsheet for all the ice cream places I visit. And I have a list of all the books that I have read in the last year. Spreadsheets are great.
6 days ago
Remove the legend to become one — Remains of the Day
When I started my first job at Amazon.com, as the first analyst in the strategic planning department, I inherited the work of producing the Analytics Package. I capitalize the term because it was both a serious tool for making our business legible, and because the job of its production each month ruled my life for over a year.

Back in 1997, analytics wasn't even a real word. I know because I tried to look up the term, hoping to clarify just what I was meant to be doing, and I couldn't find it, not in the dictionary, not on the internet. You can age yourself by the volume of search results the average search engine returned when you first began using the internet in force. I remember when pockets of wisdom were hidden in eclectic newsgroups, when Yahoo organized a directory of the web by hand, and later when many Google searches returned very little, if not nothing. Back then, if Russians wanted to hack an election, they might have planted some stories somewhere in rec.arts.comics and radicalized a few nerds, but that's about it.
programming  data-visualization 
6 days ago
Data organization in spreadsheets: The American Statistician: Vol 0, No ja

Spreadsheets are widely used software tools for data entry, storage, analysis, and visualization. Focusing on the data entry and storage aspects, this paper offers practical recommendations for organizing spreadsheet data to reduce errors and ease later analyses. The basic principles are: be consistent, write dates like YYYY-MM-DD, don't leave any cells empty, put just one thing in a cell, organize the data as a single rectangle (with subjects as rows and variables as columns, and with a single header row), create a data dictionary, don't include calculations in the raw data files, don't use font color or highlighting as data, choose good names for things, make backups, use data validation to avoid data entry errors, and save the data in plain text files.
spreadsheets  best 
6 days ago
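Two of the principles in the "Data organization in spreadsheets" entry above (write dates like YYYY-MM-DD, don't leave any cells empty) are easy to check mechanically once the data is saved as plain text. The following is only a minimal sketch, not anything from the paper: the function name `check_rows`, the column names, and the sample rows are all made up for illustration.

```python
import csv
import io
import re

# Matches the YYYY-MM-DD shape only; it does not check that the date exists.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def check_rows(csv_text, date_column):
    """Return (row_number, problem) tuples for CSV data given as a string.

    check_rows and its arguments are hypothetical names for this sketch.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column, value in row.items():
            if column is None:
                # DictReader files cells beyond the header under the key None.
                problems.append((row_number, "more cells than header columns"))
            elif value is None or value.strip() == "":
                problems.append((row_number, f"empty cell in {column}"))
        date_value = (row.get(date_column) or "").strip()
        if date_value and not ISO_DATE.match(date_value):
            problems.append((row_number, f"date not YYYY-MM-DD: {date_value}"))
    return problems


# Illustrative data: the second data row has a US-style date and a blank cell.
sample = "subject,visit_date,weight_kg\nA1,2017-11-09,71.2\nA2,11/9/2017,\n"
for problem in check_rows(sample, "visit_date"):
    print(problem)
```

A check like this complements, rather than replaces, data validation at entry time, which the paper also recommends.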
A Guide to Natural Language Processing - Federico Tomassetti - Software Architect
Natural Language Processing (NLP) comprises a set of techniques that can be used to achieve many different objectives. Take a look at the following table to figure out which technique can solve your particular problem.
machine-learning  advice  howto  nlp 
6 days ago
Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning | Dropbox Tech Blog
In this post we will take you behind the scenes on how we built a state-of-the-art Optical Character Recognition (OCR) pipeline for our mobile document scanner. We used computer vision and deep learning advances such as bi-directional Long Short Term Memory (LSTMs), Connectionist Temporal Classification (CTC), convolutional neural nets (CNNs), and more. In addition, we will also dive deep into what it took to actually make our OCR pipeline production-ready at Dropbox scale.
ocr  best 
11 days ago
Making NLP work for Investigative Journalism - YouTube
Speaker: Jonathan Stray, Research Scholar, Columbia Journalism School

Presented at the Berkeley Institute for Data Science on Thursday, November 9, 2017. (Note: Due to an equipment error, the first few seconds of this video are missing. Our apologies for the inconvenience.)
11 days ago
1-day numpy training Numpy exercises
Training page from Facebook research engineer Matthijs Douze
numpy  python  tutorial 
12 days ago
Connecting with the Dots - Learning - Source: An OpenNews project
To illustrate what I mean, here is a similar chart posted to a clickbait Twitter account called @BrilliantMaps of all the car bombings in Baghdad since 2003. It wasn’t originally clear who made this map–lack of attribution is common for these kinds of accounts–but the contrast between this map and the Times and Guardian interactives mentioned above is glaring. The problem is that this map is not only wrong, it’s also terrible. Gawker figured out the origins of this map and discovered that it was actually derived from Guardian data of all fatalities in Baghdad from 2003 to 2009, including accidents, so it exaggerated the data. Brilliant Maps later issued a correction, but still got it wrong. I’m beginning to think clickbait Twitter accounts aren’t entirely reliable.
data-visualization  mapping  best 
13 days ago
USS McCain collision ultimately caused by UI confusion | Ars Technica UK
On November 1, the US Navy issued its report on the collisions of the USS Fitzgerald and USS John S. McCain this summer. The Navy's investigation found that both collisions were avoidable accidents. And in the case of the USS McCain, the accident was in part caused by an error made in switching which control console on the ship's bridge had steering control. While the report lays the blame on training, the user interface for the bridge's central navigation control systems certainly played a role.

With the USS McCain collision, even Navy tech can’t overcome human shortcomings
According to the report, at 5:19am local time, the commanding officer of the McCain, Commander Alfredo J. Sanchez, "noticed the Helmsman (the watchstander steering the ship) having difficulty maintaining course while also adjusting the throttles for speed control." Sanchez ordered the watch team to split the responsibilities for steering and speed control, shifting control of the throttle to another watchstander's station—the lee helm, immediately to the right (starboard) of the Helmsman's position at the Ship’s Control Console. While the Ship's Control Console has a wheel for manual steering, both steering and throttle can be controlled with trackballs, with the adjustments showing up on the screens for each station.

However, instead of switching just throttle control to the Lee Helm station, the Helmsman accidentally switched all control to the Lee Helm station. When that happened, the ship's rudder automatically moved to its default position (amidships, or on center line of the ship). The helmsman had been steering slightly to the right to keep the ship on course in the currents of the Singapore Strait, but the adjustment meant the ship started drifting off course.

The bridge layout on the McCain, with watchstations labeled. The ship should have had its full "sea and anchor" detail on watch.
The Ship Control Console of the McCain, with helm (at left) and lee helm (at right) stations. Both have trackballs to enter commands into the console through control screens.
As the McCain watch team scrambled to figure out what was going on, the ship overtook and steered across the bow of the Alnic MC.
design  ui  ux 
18 days ago
The Bots That Are Changing Politics - Motherboard
A taxonomy of politibots, a swelling force in global elections that cannot be ignored.

Editor's note: This essay is drawn from discussions and writings around a June 2017 convening organized and led by Samuel Woolley, Research Director of the new DigIntel Lab at the Institute for the Future, alongside fellow bot experts* Renee DiResta, John Little, Jonathon Morgan, Lisa Maria Neudert, and Ben Nimmo. The symposium was held at Jigsaw, the Google / Alphabet think-tank and technology incubator. Disclosure: Jigsaw provided space, funded Woolley as a (former) research fellow, and covered travel costs.

Bots and their cousins—botnets, bot armies, sockpuppets, fake accounts, sybils, automated trolls, influence networks—are a dominant new force in public discourse.

You may have heard that bots can be used to threaten activists, swing elections, and even engage in conversation with the President. Bots are the hip new media; Silicon Valley has marketed the chatbot as the next technological step after the app. Donald Trump himself has said he wouldn't have won last November without Twitter, where, researchers found, bots massively amplified his support on the platform.

Scholars have argued that nearly 50 million accounts on Twitter are actually automatically run by bot software. On Facebook, social bots—accounts run by automated software that mimic real users or work to communicate particular information streams—can be used to automate group pages and spread political advertisements. Recent public revelations from Facebook reveal that a Russian "troll farm" with close ties to the Kremlin spent around $100,000 on ads ahead of the 2016 US election and produced thousands of organic posts that spread across Facebook and Instagram. The same firm, the Internet Research Agency, has been known to make widespread use of bots in its attempts to manipulate public opinion over social media.
bots  compciv  twitter  social-media 
18 days ago
Inside The Great Poop Emoji Feud
“Organic waste isn’t cute,” Everson wrote, aghast that the technical committee would even deign to consider additional excremoji. “It is bad enough that the [Emoji Subcommittee] came up with it, but it beggars belief that the [Unicode Technical Committee] actually approved it,” he wrote. Everson continued:

“The idea that our 5 committees would sanction further cute graphic characters based on this should embarrass absolutely everyone who votes yes on such an excrescence. Will we have a CRYING PILE OF POO next? PILE OF POO WITH TONGUE STICKING OUT? PILE OF POO WITH QUESTION MARKS FOR EYES? PILE OF POO WITH KARAOKE MIC? Will we have to encode a neutral FACELESS PILE OF POO?”
emoji  unicode  language 
19 days ago
When Patents Attack... Part Two! | Transcript | This American Life
Chris Crawford then launches into a rather surprising explanation for how the sentence that this business was, quote, "Jack Byrd's idea," doesn't actually mean that the business was Jack Byrd's idea. His explanation? He was using the apostrophe S incorrectly.
19 days ago
The suspect told police ‘give me a lawyer dog.’ The court says he wasn’t asking for a lawyer. - The Washington Post
But that’s not how the courts in Louisiana see it. And when a suspect in an interrogation told detectives to “just give me a lawyer dog,” the Louisiana Supreme Court ruled that the suspect was, in fact, asking for a “lawyer dog,” and not invoking his constitutional right to counsel. It’s not clear how many lawyer dogs there are in Louisiana, and whether any would have been available to represent the human suspect in this case, other than to give the standard admonition in such circumstances to simply stop talking.
punctuation  crime  judicial-system 
19 days ago
The Improbable Origins of PowerPoint - IEEE Spectrum
Walking into the hall to deliver the speech was a “daunting experience,” the speaker later recalled, but “we had projectors and all sorts of technology to help us make the case.” The technology in question was PowerPoint, the presentation software produced by Microsoft. The speaker was Colin Powell, then the U.S. Secretary of State.

Powell’s 45 slides displayed snippets of text, and some were adorned with photos or maps. A few even had embedded video clips. During the 75-minute speech, the tech worked perfectly. Years later, Powell would recall, “When I was through, I felt pretty good about it.”
powerpoint  history 
19 days ago
Information Is Power
It was 1969, and the American War on Vietnam seemed unending. Mass outrage over the war had spilled into the nation’s streets and campuses — outrage over the rising heap of body bags returning home, over the neverending spree of bombs that barreled down from US planes onto rural villages, with the images of fleeing families, their skin seared by napalm, broadcast across the world.

Hundreds of thousands of people had begun to resist the war. The fall of 1969 saw the historic Moratorium protests, the largest protests in US history.
19 days ago
A Minimalist Guide to SQLite
SQLite is a self-contained, serverless SQL database. Dr. Richard Hipp, the creator of SQLite, first released the software on the 17th of August, 2000. Since then it has gone on to be the second most deployed piece of software in the world. It's used in systems as important as the Airbus A350 so it comes as no surprise the tests for SQLite 3 are aviation-grade. The software itself is very small, the amd64 Debian client and library package is 765 KB when compressed for distribution and 2.3 MB when fully installed. The software is licensed under a very promiscuous license: Public Domain.
database  guide  sql  sqlite  tutorial 
19 days ago
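To make "self-contained, serverless" in the SQLite entry above concrete: Python's standard library ships the `sqlite3` module, so a database can be created and queried inside a single process with no server to start. This is a small sketch that is not taken from the guide itself; the `bookmarks` table and its rows are invented for illustration.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM; a file path would persist it.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, used only for this example.
conn.execute("CREATE TABLE bookmarks (title TEXT NOT NULL, tag TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO bookmarks (title, tag) VALUES (?, ?)",
    [
        ("A Minimalist Guide to SQLite", "sqlite"),
        ("A Minimalist Guide to SQLite", "tutorial"),
        ("Learn regex the easy way", "tutorial"),
    ],
)
conn.commit()

# Count bookmarks per tag, ordered so the result is deterministic.
rows = conn.execute(
    "SELECT tag, COUNT(*) FROM bookmarks GROUP BY tag ORDER BY tag"
).fetchall()
print(rows)

conn.close()
```

The `?` placeholders let the driver handle quoting, which is generally safer than building SQL strings by hand.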
How We Found Tom Price’s Private Jets - POLITICO Magazine
That meant we had to recreate Price’s schedule from scratch if we were to have any hope of matching his trips to chartered flights. We reviewed the HHS summaries of Price’s meetings. We scoured news sites for reports of Price speeches outside Washington. We obsessively tracked his appearances on social media. Putting all this information together, we built a database of Price’s trips.
data-journalism  investigations 
6 weeks ago
California Regulators Require Auto Insurers to Adjust Rates
The state changed its approach in response to ProPublica’s finding that minority neighborhoods were paying higher premiums than white areas with the same risk.
compciv  algorithms  transparency  best 
8 weeks ago
A Brief History of Religion and the U.S. Census | Pew Research Center
The U.S. Census Bureau has not asked questions about religion since the 1950s, but the federal government did gather some information about religion for about a century before that. Starting in 1850, census takers began asking a few questions about religious organizations as part of the decennial census that collected demographic and social statistics from the general population as well as economic data from business establishments. Federal marshals and assistant marshals, who acted as census takers until after the Civil War, collected information from members of the clergy and other religious leaders on the number of houses of worship in the U.S. and their respective denominations, seating capacities and property values. Although the census takers did not interview individual worshipers or ask about the religious affiliations of the general population, they did ask members of the clergy to identify their denomination – such as Methodist, Roman Catholic or Old School Presbyterian. The 1850 census found that there were 18 principal denominations in the U.S.
census  data-analysis 
8 weeks ago
The Politics of Last Names - The Atlantic
Last names are deeply personal, a kind of shorthand for expressing family bonds. But they’re also profoundly political, reflecting the machinations of governments in the countries that family has passed through over time. The latest example comes courtesy of Afghanistan, where officials are conducting the first nationwide census in three and a half decades—and confronting a major obstacle: names in the country are malleable, and many Afghans use only one. The government’s solution is to urge its people to take on surnames. “The remote, tribal nature of Afghan villages may have had something to do with the lack of surnames,” The New York Times recently noted. “So perhaps did the historic weakness of national governments, which have tended to require fixed names in the interest of keeping track of people, to draft them or tax them.”
9 weeks ago
Unemployed lumber worker goes with his wife to the bean harvest. Note social security number tattooed on his arm, Oregon, 1939 by Dorothea Lange. [1600x1195] : HistoryPorn
"Oregon, August 1939. "Unemployed lumber worker goes with his wife to the bean harvest. Note Social Security number tattooed on his arm."(And now a bit of Shorpy scholarship/detective work. A public records search shows that 535-07-5248 belonged to one Thomas Cave, born July 1912, died in 1980 in Portland. Which would make him 27 years old when this picture was taken.) Medium format safety negative by Dorothea Lange."

A search of the 1940 census finds his wife's name was Vivian (first wife, it appears, since his wife at the time of his death was Ann Kathryn Bloom). It also looks like he was employed in 1940, as it indicates on the census that he worked 48 hours the week of Mar 24-30, 1940.

9 weeks ago
The Myth Of The Actuary: Life Insurance And Frederick L. Hoffman's Race Traits And Tendencies Of The American Negro
In May 1896, Frederick L. Hoffman, a statistician at the Prudential Life Insurance Company, published a 330-page article in the prestigious Publications of the American Economic Association intended to prove—with statistical reliability—that the American Negro was uninsurable. Race Traits and Tendencies of the American Negro was a compilation of statistics, eugenic theory, observation, and speculation, solicited by the Prudential in response to a wave of state legislation banning discrimination against African Americans.

Race Traits immediately became a key text in one of the central social preoccupations of the turn of the century: the supposed Negro Problem. Numerous turn-of-the-century tracts (including Hoffman's) stipulated that minority racial groups were not only biologically inferior but also barriers to progress. Hoffman, a German immigrant, was one of the leading statisticians of his time and also a strong proponent of racial hierarchy and white supremacy. His application of mathematical tools to a social debate set a precedent for the use of statistics and actuarial science—two fields then in their infancies, which absorbed the biases and errors of their early participants. Though Race Traits was hailed by many as a work of genius, even in its own day critics attacked its racist premise and suppositions, noting that Hoffman's sources were problematical and his mathematical analysis flawed. Hoffman's work embedded racial ideologies within its approach to actuarial data, a legacy that remains with the field today.
data-analysis  dirty-data 
9 weeks ago
How Python does Unicode
As we all (hopefully) know by now, Python 3 made a significant change to how strings work in the language. I’m on the record as being strongly in favor of this change, and I’ve written at length about why I think it was the right thing to do. But for those who’ve been living under a rock the past ten years or so, here’s a brief summary, because it’s relevant to what I want to go into today:

In Python 2, two types could be used to represent strings. One of them, str, was a “byte string” type; it represented a sequence of bytes in some particular text encoding, and defaulted to ASCII. The other, unicode, was (as the name implies) a Unicode string type. Thus it did not represent any particular encoding (or did it? Keep reading to find out!). In Python 2, many operations allowed you to use either type, many comparisons worked even on strings of different types, and str and unicode were both subclasses of a common base class, basestring. To create a str in Python 2, you can use the str() built-in, or string-literal syntax, like so: my_string = 'This is my string.'. To create an instance of unicode, you can use the unicode() built-in, or prefix a string literal with a u, like so: my_unicode = u'This is my Unicode string.'.
unicode  python 
10 weeks ago
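The "How Python does Unicode" excerpt above describes the Python 2 situation, where `str` and `unicode` could often be mixed. For contrast, here is a short Python 3 sketch of the stricter split between text (`str`) and bytes. The sample string is arbitrary, and this is an illustration rather than code from the article.

```python
# Escapes are used so the example does not depend on the file's own encoding.
text = "na\u00efve caf\u00e9"       # str: a sequence of Unicode code points
encoded = text.encode("utf-8")       # bytes: one concrete encoding of that text

print(type(text).__name__, len(text))        # str 10
print(type(encoded).__name__, len(encoded))  # bytes 12 (two 2-byte characters)

# Decoding with the same encoding recovers the original text.
assert encoded.decode("utf-8") == text

# Unlike Python 2, Python 3 refuses to mix the two types implicitly.
try:
    text + encoded
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(mixing_allowed)  # False
```

The length difference comes from the two non-ASCII characters, each of which takes two bytes in UTF-8.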
Billion-Dollar Weather and Climate Disasters: Table of Events | National Centers for Environmental Information (NCEI)

Below is a historical table of U.S. Billion-dollar disaster events, summaries, report links and statistics for the 1980–2017 period of record. In 2017 (as of July 7), there have been 9 weather and climate disaster events with losses exceeding $1 billion each across the United States. These events included 2 flooding events, 1 freeze event, and 6 severe storm events. Overall, these events resulted in the deaths of 57 people and had significant economic effects on the areas impacted.
10 weeks ago
Google's "Director of Engineering" Hiring Test

Recently, I have been interviewed over the phone by a Google recruiter. As I qualified for the (unsolicited) interview but failed to pass the test, this blog post lists the questions and the expected answers. That might be handy if Google calls you one day.
For the sake of the discussion, I started coding 37 years ago (I was 11 years old) and never stopped since then. Beyond having been appointed as R&D Director 24 years ago (I was 24 years old), among (many) other works, I have since then designed and implemented the most demanding parts of TWD's R&D projects* – all of them delivering commercial products:
google  interview-questions 
11 weeks ago
The Life of a South Central Statistic | The New Yorker
What sets the course of a life? Three years before my beloved cousin’s murder—before the weeping, before the raging, before the heated self-recriminations and icy reckonings—I awoke with the most glorious sense of anticipation I’ve ever felt. It was June 29, 2006, the day that Michael was going to be freed. Outside my vacation condo in Hollywood, I climbed into the old white BMW I’d bought from my mother and headed to my aunt’s small stucco home, in South Central. On the corner, a fortified drug house stood like a sentry, but her pale cottage seemed serene, aglow in the morning sun. Poverty never looks quite as bad in the City of Angels as it does elsewhere.
crime  judicial-system 
august 2017
Worldbuilding - Atomic Rockets

There is a grand tradition of scientifically minded science fiction authors creating not just the characters in their novels but also the brass tacks scientific details of the planets they reside on. This is the art and science of Worldbuilding.
august 2017
She Just Won 3 Gold Medals for Her Swimming. She’s Only 73. - The New York Times
“Our bodies are made for being used,” she said. “Physical fitness and activity improves brain function. Anyone who is keeping up physical activity — both the aerobic part, which is really important, and the strength and balance and flexibility — is reducing the risks and buffering the decline that is going on.”

For Mr. Cheek, the nation’s fastest 100-meter sprinter in his age group, there is “a pride and a mental discipline that carries over into your whole lifestyle,” he said. Consistent exercise, said Mr. Cheek, who is a part-time professor of social psychology at California State University, Fresno, allows you to have “a body that can perform for you any time you want.”
august 2017
zeeshanu/learn-regex: Learn regex the easy way
A regular expression is a pattern that is matched against a subject string from left to right. "Regular expression" is a mouthful, so you will usually find the term abbreviated as "regex" or "regexp". Regular expressions are used for replacing text within a string, validating forms, extracting a substring from a string based on a pattern match, and much more.
regex  tutorial 
august 2017
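The three uses named in the learn-regex entry above (replacing text, validating input, extracting substrings) can each be shown in a line or two with Python's built-in `re` module. The patterns and sample strings below are illustrative choices, not examples taken from the tutorial.

```python
import re

# Three capture groups: year, month, day.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")
text = "Presented on 2017-11-09, archived 2018-01-15."

# 1. Replace: swap every date for a placeholder.
replaced = date_pattern.sub("<date>", text)

# 2. Validate: fullmatch succeeds only if the whole string fits the pattern
#    (here, 3 to 16 lowercase letters, digits, underscores, or hyphens).
is_valid = re.fullmatch(r"[a-z0-9_-]{3,16}", "john_doe-97") is not None

# 3. Extract: pull the first capture group (the year) out of each match.
years = [match.group(1) for match in date_pattern.finditer(text)]

print(replaced)  # Presented on <date>, archived <date>.
print(is_valid)  # True
print(years)     # ['2017', '2018']
```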
We Trained A Computer To Search For Hidden Spy Planes. This Is What It Found.
From planes tracking drug traffickers to those testing new spying technology, US airspace is buzzing with surveillance aircraft operated for law enforcement and the military.
compciv  machine-learning  data-journalism 
august 2017
Economic diversity and student outcomes at Stanford University
The median family income of a student from Stanford is $167,500, and 66% come from the top 20 percent. About 2.2% of students at Stanford came from a poor family but became a rich adult.
august 2017
Troy Hunt: Passwords Evolved: Authentication Guidance for the Modern Era

In the beginning, things were simple: you had two strings (a username and a password) and if someone knew both of them, they could log in. Easy.
But the ecosystem in which they were used was simple too, for example in MIT's Time-Sharing Computer, considered to be the first computer system to use passwords:
security  password 
july 2017
What happened to Trump's war on data?
Straightforward as “data collection” may sound, in practice there’s often a strong political component to government data. What information to collect, about whom and how it’s collected are critical questions that don’t always have one objective answer. “Data is inherently political,” said Wonderlich. “And how it’s used depends on who’s collecting it and what they’re representing about the world.”

Sometimes, those questions fall on Congress, such as when lawmakers created the unemployment rate—as unbiased a statistic as exists today—in the 1930’s. According to a history of the U.S. Census, the unemployment rate was the subject of a fierce political fight upon its creation, including how often to collect data on unemployment and who would be counted as unemployed. President Herbert Hoover and his allies thought that the crude unemployment figures, which came from limited Bureau of Labor Statistics surveys and business reports at the time, were adequate measures during the Great Depression. Democrats and many labor economists disagreed and called for additional surveys to determine the true extent of unemployment—and the federal response necessary to alleviate it.
data  padjo 
july 2017
Startup Engineers and Our Mistakes with MongoDB
MongoDB got rave reviews for its usability. But other features mattered too when choosing a database for a growing startup.
mongodb  databases 
july 2017
Improving the Realism of Synthetic Images - Apple Machine Learning Journal
Most successful examples of neural nets today are trained with supervision. However, to achieve high accuracy, the training sets need to be large, diverse, and accurately annotated, which is costly. An alternative to labelling huge amounts of data is to use synthetic images from a simulator. This is cheap as there is no labeling cost, but the synthetic images may not be realistic enough, resulting in poor generalization on real test images. To help close this performance gap, we’ve developed a method for refining synthetic images to make them look more realistic. We show that training models on these refined images leads to significant improvements in accuracy on various machine learning tasks.
july 2017
The limitations of deep learning
The most surprising thing about deep learning is how simple it is. Ten years ago, no one expected that we would achieve such amazing results on machine perception problems by using simple parametric models trained with gradient descent. Now, it turns out that all you need is sufficiently large parametric models trained with gradient descent on sufficiently many examples. As Feynman once said about the universe, "It's not complicated, it's just a lot of it".

In deep learning, everything is a vector, i.e. everything is a point in a geometric space. Model inputs (it could be text, images, etc) and targets are first "vectorized", i.e. turned into some initial input vector space and target vector space. Each layer in a deep learning model operates one simple geometric transformation on the data that goes through it. Together, the chain of layers of the model forms one very complex geometric transformation, broken down into a series of simple ones. This complex transformation attempts to map the input space to the target space, one point at a time. This transformation is parametrized by the weights of the layers, which are iteratively updated based on how well the model is currently performing. A key characteristic of this geometric transformation is that it must be differentiable, which is required in order for us to be able to learn its parameters via gradient descent. Intuitively, this means that the geometric morphing from inputs to outputs must be smooth and continuous—a significant constraint.

The whole process of applying this complex geometric transformation to the input data can be visualized in 3D by imagining a person trying to uncrumple a paper ball: the crumpled paper ball is the manifold of the input data that the model starts with. Each movement operated by the person on the paper ball is similar to a simple geometric transformation operated by one layer. The full uncrumpling gesture sequence is the complex transformation of the entire model. Deep learning models are mathematical machines for uncrumpling complicated manifolds of high-dimensional data.
deep-learning  python 
july 2017
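The picture in "The limitations of deep learning" above, a model as a chain of simple, smooth transformations applied to one point at a time, can be sketched in a few lines of plain Python. This is a toy illustration only: the helper names `dense_layer` and `chain` are made up, the weights are arbitrary numbers, and the training step (gradient descent on the weights) is deliberately left out.

```python
import math


def dense_layer(weights, biases):
    """Build one simple transformation: an affine map followed by tanh.

    Both pieces are smooth, so the layer is differentiable in its weights.
    """
    def apply(point):
        return [
            math.tanh(sum(w * x for w, x in zip(row, point)) + b)
            for row, b in zip(weights, biases)
        ]
    return apply


def chain(layers):
    """Compose layers so the output of each one feeds the next."""
    def apply(point):
        for layer in layers:
            point = layer(point)
        return point
    return apply


# Arbitrary illustrative weights: 2 inputs -> 3 hidden units -> 1 output.
model = chain([
    dense_layer([[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]], [0.0, 0.1, -0.1]),
    dense_layer([[0.4, -0.6, 0.9]], [0.05]),
])

output = model([0.5, -1.0])  # one point in, one point out
print(output)
```

A real model differs mainly in scale and in learning its weights from examples rather than hard-coding them.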
Automatically generate beautiful visualizations from your data
And other bad ideas

I work in data visualization, a loosely defined and rapidly evolving field that is generally about taking data and turning it into something people can understand. There are a lot of different tools and disciplines involved in doing so, and it is impacting almost every field of study, industry and business. I believe it is essentially a new medium for communication, and we are still very much in the formative stages of its development.
july 2017
Cafe Cracks: Attacks on Unsecured Wireless Networks
Mobile users demand high connectivity in today's world, often at the price of security. Requiring Internet access at the airport, public buildings, and restaurants, users will easily sacrifice a secure connection for a fast and reliable one. By broadcasting rogue access points at these compromising locations, crackers can launch effective Man-in-the-Middle attacks. Our developed crack, Cafe Crack, provides a platform built from open source software for deploying rogue access points and sophisticated Man-in-the-Middle attacks. Built around the Untangle Server software, Cafe Crack allows the hacker to dynamically measure, monitor and redirect network traffic. This paper will provide an example of DNS spoofing using the Cafe Crack platform and then provide simple and effective protection techniques against harmful rogue AP attacks.
july 2017
Tracking Campaign Cash in Colorado - Columbia Journalism Review
It was a nightmare. You have to print out every single (independent expenditure) committee filing, and go through it by hand. You might have a committee that spent $800,000, and a lot of the money is in $200 increments spread over 20 races, and you have to add those with your little hand held calculator. There are 24 races in total across the state, so I had to add up for each committee what they spent on each race and then add all that up on both sides. It was a really time-consuming and tedious job.

We’re downloading the data onto our website, and one reason we’re doing that is because of the issues some people have had accessing the Secretary of State’s website—we think it’s a public service. We have the 2010 contributions up there, but hopefully in the next two months we’ll have up the 2010 expenditures, then the 2008 contributions and expenditures, and hopefully we’re going to be able to keep this going for the year.
campaign-finance  journalism 
july 2017
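The hand tally described in the Colorado campaign-finance entry above (adding up what each committee spent on each race, then totaling per committee) is the kind of arithmetic a short script can do once filings exist as data. This sketch uses invented committees, races, and whole-dollar amounts. It does not reflect the actual filings or the tools the reporters used.

```python
from collections import defaultdict

# Hypothetical filings: (committee, race, amount in whole dollars).
filings = [
    ("Committee A", "House District 1", 200),
    ("Committee A", "House District 1", 200),
    ("Committee A", "Senate District 5", 200),
    ("Committee B", "House District 1", 500),
    ("Committee B", "Senate District 5", 200),
]

by_committee_and_race = defaultdict(int)
by_committee = defaultdict(int)

for committee, race, amount in filings:
    by_committee_and_race[(committee, race)] += amount  # per-race subtotal
    by_committee[committee] += amount                   # overall total

for (committee, race), total in sorted(by_committee_and_race.items()):
    print(f"{committee} | {race} | ${total:,}")

for committee, total in sorted(by_committee.items()):
    print(f"{committee} total: ${total:,}")
```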
18F Content Guide - Introduction
How to plan, write, and manage content at 18F.
july 2017