An acclaimed crime reporter leaves her newsroom for police work - Columbia Journalism Review
Part of the answer, the Blade had heard, could be found on a secret police department map that charted the neighborhoods controlled by each of the city’s gangs. In the summer of 2012, after a little more than a year on the crime beat, Dungjen asked the department for a copy. Publishing the map, she thought, would give readers a sense of how big the city’s gang problem was. But the police refused, arguing that the document was involved in ongoing investigations. To get the map, the Blade sued the department, invoking Ohio’s Public Records Act.

Months passed. The Blade’s publisher, John Block, was confident that the newspaper would eventually win its legal battle, but he was growing increasingly frustrated with the slow pace of the judicial process. “One day, I said, ‘I’m tired of waiting,’” Block recalls. “‘Why don’t we just go out on the street and figure out what gangs are operating where?’”

Blade editors assigned Dungjen and photographer Amy Voigt to the story. For the next three months, the Blade took the two journalists off daily assignments so they could focus all their energy on the gang-map project. The first few days were demoralizing. Voigt and Dungjen didn’t have a plan. They drove aimlessly through gang-controlled neighborhoods, asking locals for information.
15 hours ago
Here’s How Scammers Are Using Fake News To Screw With Bitcoin Investors
So they bought. And they bought without noticing the additional "L" in the Twitter username or the missing verification check that distinguished the bogus McAfee account from the real one, @OfficialMcAfee. When the tweet was first broadcast at around 3 p.m. ET, GVT was bought and sold on the market at $30. By 3:04, it was at $45, and trading volume had doubled.
4 days ago
It is *not* possible to detect and block Chrome headless
A few months back, I wrote a popular article called Making Chrome Headless Undetectable in response to one called Detecting Chrome Headless by Antoine Vastel. The one thing that I was really trying to get across in writing that is that blocking site visitors based on browser fingerprinting is an extremely user-hostile practice. There are simply so many variations in browser configurations that you’re inevitably going to end up blocking non-automated access to your website, and–on top of that–you’re really not accomplishing anything in terms of blocking sophisticated web scrapers. To illustrate this, I showed how to bypass all of the suggested “tests” in Antoine’s first post and pointed out that they hadn’t been tested in multiple browser versions and would fail for any users with beta or unstable Chrome builds.
chrome  headless-browser  testing  web-scraping 
5 days ago
Hawaii missile alert: How one employee ‘pushed the wrong button’ and caused a wave of panic - The Washington Post
Around 8:05 a.m., the Hawaii emergency employee initiated the internal test, according to a timeline released by the state. From a drop-down menu on a computer program, he saw two options: “Test missile alert” and “Missile alert.” He was supposed to choose the former; as much of the world now knows, he chose the latter, an initiation of a real-life missile alert.

“In this case, the operator selected the wrong menu option,” HEMA spokesman Richard Rapoza told The Washington Post on Sunday.
design  ux  typo  human-error 
9 days ago
Moira Donegan: I Started the Media Men List
In October, I created a Google spreadsheet called “Shitty Media Men” that collected a range of rumors and allegations of sexual misconduct, much of it violent, by men in magazines and publishing. The anonymous, crowdsourced document was a first attempt at solving what has seemed like an intractable problem: how women can protect ourselves from sexual harassment and assault.

One long-standing partial remedy that women have developed is the whisper network, informal alliances that pass on open secrets and warn women away from serial assaulters. Many of these networks have been invaluable in protecting their members. Still, whisper networks are social alliances, and as such, they’re unreliable. They can be elitist, or just insular. As Jenna Wortham pointed out in The New York Times Magazine, they are also prone to exclude women of color. Fundamentally, a whisper network consists of private conversations, and the document that I created was meant to be private as well. It was active for only a few hours, during which it spread much further and much faster than I ever anticipated, and in the end, the once-private document was made public — first when its existence was revealed in a BuzzFeed article by Doree Shafrir, then when the document itself was posted on Reddit.
13 days ago
The #MeToo Movement Has Worked - Bloomberg
Conor Sen: As a fan of American film, I'm worried about where the industry is going. DVD sales have been declining for years, and streaming revenue looks unlikely to ever replace them. The rise of moviegoers in China has created an incentive for Hollywood to make movies that appeal to Chinese audiences as well as Western ones. There are so many entertainment options now, and marketing costs have become so astronomical, that Hollywood has decided to play it safe and focus on large, well-established franchises for movies and sequels. Put all this together and you get a lot of movies like the latest Transformers installment, which dominated the Chinese box office but barely registered in the U.S.
13 days ago
Facebook’s Virtual Assistant M Is Dead. So Are Chatbots | WIRED
That’s because most of the tasks fulfilled by M required people. Facebook’s goal with M was to develop artificial-intelligence technology that could automate almost all of M’s tasks. But despite Facebook’s vast engineering resources, M fell short: One source familiar with the program estimates M never surpassed 30 percent automation. Last spring, M’s leaders admitted the problems they were trying to solve were more difficult than they’d initially realized.
bots  facebook  automation 
13 days ago
We Used Broadband Data We Shouldn’t Have — Here’s What Went Wrong | FiveThirtyEight
Over the summer, FiveThirtyEight published two stories on broadband internet access in the U.S. that were based on a data set made public by academic researchers who had acquired data from Catalist, a well-known political data firm. After further reporting, we can no longer vouch for the academics’ data set. The preponderance of evidence we’ve collected has led us to conclude that it is fundamentally flawed. That’s because:

The academics’ data does not provide an accurate picture of broadband use at the county level relative to other sources.
Some of the data that the academic researchers received from Catalist originated with a third-party commercial source, and Catalist acknowledged that it did not vet that data itself. The researchers and Catalist also disagree about what Catalist said the data represents and what it could be used for.
retractions  methodology  dirty-data  data-journalism 
13 days ago
Unfreed | The Marshall Project
Orman called the DOC at once. He learned not only that Lima-Marin was free, but that he’d been out more than five years, completing his parole. He checked the state court’s computer system and noticed a strange phrase tacked on to each of Lima-Marin’s eight convictions: “No Consecutive/Concurrent Sentences.” Orman wondered if someone else might have been as confused by that phrase as he was, and decided that his sentences were concurrent.
justice  typo  punctuation 
14 days ago
The Washington Post experiments with automated storytelling to help power 2016 Rio Olympics coverage - The Washington Post

The Washington Post will leverage artificial intelligence technology to report key information from the 2016 Rio Olympics, including results of medal events. “Heliograf,” which was developed in-house, automatically generates short multi-sentence updates for readers. These updates will appear in The Post’s live blog, on Twitter at @WPOlympicsbot, and are accessible via The Post’s Olympics skill on Alexa-enabled devices and The Post’s bot for Messenger.

“Automated storytelling has the potential to transform The Post’s coverage. More stories, powered by data and machine learning, will lead to a dramatically more personal and customized news experience,” said Jeremy Gilbert, director of strategic initiatives at The Washington Post. “The Olympics are the perfect way to prove the potential of this technology. In 2014, the sports staff spent countless hours manually publishing event results. Heliograf will free up Post reporters and editors to add analysis, color from the scene and real insight to stories in ways only they can.”
bots  automated-writing 
18 days ago
How One Major Internet Company Helps Serve Up Hate on the — ProPublica
The widespread use of Cloudflare’s services by racist groups is not an accident. Cloudflare has said it is not in the business of censoring websites and will not deny its services to even the most offensive purveyors of hate.

“A website is speech. It is not a bomb,” Cloudflare’s CEO Matthew Prince wrote in a 2013 blog post defending his company’s stance. “There is no imminent danger it creates and no provider has an affirmative obligation to monitor and make determinations about the theoretically harmful nature of speech a site may contain.”
internet  caching  censorship  compciv-2018 
19 days ago
My Life as a New York Times Reporter in the Shadow of the War on Terror
What angered me most was that while they were burying my skeptical stories, the editors were not only giving banner headlines to stories asserting that Iraq had weapons of mass destruction, they were also demanding that I help match stories from other publications about Iraq’s purported WMD programs. I grew so sick of this that when the Washington Post reported that Iraq had turned over nerve gas to terrorists, I refused to try to match the story. One mid-level editor in the Washington bureau yelled at me for my refusal. He came to my desk carrying a golf club while berating me after I told him that the story was bullshit and I wasn’t going to make any calls on it.
journalism  best 
20 days ago
Feeding the Machine: Policing, Crime Data, & Algorithms by Elizabeth Joh :: SSRN
Discussions of predictive algorithms used by the police tend to assume the police are merely end users of big data. Accordingly, police departments are consumers and clients of big data -- not much different than users of Spotify, Netflix, Amazon, or Facebook. Yet this assumption about big data policing contains a flaw. Police are not simply end users of big data. They generate the information that big data programs rely upon. This essay explains why predictive policing programs can’t be fully understood without an acknowledgment of the role police have in creating its inputs. Their choices, priorities, and even omissions become the inputs algorithms use to forecast crime. The filtered nature of crime data matters because these programs promise cutting edge results, but may deliver analyses with hidden limitations.
algorithms  policing  compciv 
20 days ago
How Facebook’s news feed algorithm works.
Alison steers me through a maze of cubicles and open minikitchens toward a small conference room, where he promises to demystify the Facebook algorithm’s true nature. On the way there, I realize I need to use the bathroom and ask for directions. An involuntary grimace crosses his face before he apologizes, smiles, and says, “I’ll walk you there.” At first I think it’s because he doesn’t want me to get lost. But when I emerge from the bathroom, he’s still standing right outside, and it occurs to me that he’s not allowed to leave me unattended.   
algorithms  facebook  best  compciv 
5 weeks ago
Get data on nonfatal and fatal police shootings in the 50 largest U.S. police departments – VICE News
I'm deeply appreciative of the fine folks at VICE who decided that putting all the standardized police shooting data they meticulously collected in a downloadable format was a good idea. It was.

VICE News spent nine months collecting data on both fatal and nonfatal police shootings from the 50 largest local police departments in the United States. For every person shot and killed by cops in these departments from 2010 through 2016, we found, police shot at two more people who survived. We also found that 20 percent of the people cops fired on were unarmed.

We’re making the data public so that others can explore it too. Find your local police department and download the data below. And if you use our data, we’d love to hear about it. Let us know on Twitter, Facebook, or Instagram, or email policeshootings@vice.com.
dataset  investigations  compciv  collaboration  policing  justice 
5 weeks ago
Death & Dysfunction | An NJ.com Special Investigation
Hey listers,

We published an 18-month data investigation into our state medical examiner system I wanted to share:


We fought for months to acquire a database of all 420,000 cases referred to NJ medical examiners over a 20-year period. Analysis revealed a system that’s on the brink of collapse. Our reporting led us to cases of missing body parts, potential child murders going without investigation, innocent people languishing in jail and major lapses/conflicts of interest in police-involved shooting investigations.

We’ll be posting the data, as well as the replication analysis my colleague did to check my work, on data.world in the coming days. We’ll also post our code to GitHub once I clean it up and make it readable for humans.

If you like it, please share and we would love any feedback.
data-journalism  nicar  investigations 
5 weeks ago
The Trouble with Bias - NIPS 2017 Keynote - Kate Crawford #NIPS2017 - YouTube
Kate Crawford is a leading researcher, academic and author who has spent the last decade studying the social implications of data systems, machine learning and artificial intelligence. She is a Distinguished Research Professor at New York University, a Principal Researcher at Microsoft Research New York, and a Visiting Professor at the MIT Media Lab. https://twitter.com/omojumiller/status/940824325107736576
video  AI  compciv 
5 weeks ago
Under Trump, E.P.A. Has Slowed Actions Against Polluters, and Put Limits on Enforcement Officers - The New York Times
The Times built a database of civil cases filed at the E.P.A. during the Trump, Obama and Bush administrations. During the first nine months under Mr. Pruitt’s leadership, the E.P.A. started about 1,900 cases, about one-third fewer than the number under President Barack Obama’s first E.P.A. director and about one-quarter fewer than under President George W. Bush’s over the same time period.
data-journalism  investigations 
6 weeks ago
Official Toll in Puerto Rico: 62. Actual Deaths May Be 1,052. - The New York Times
A review by The New York Times of daily mortality data from Puerto Rico’s vital statistics bureau indicates a significantly higher death toll after the hurricane than the government there has acknowledged.

The Times’s analysis found that in the 42 days after Hurricane Maria made landfall on Sept. 20 as a Category 4 storm, 1,052 more people than usual died across the island. The analysis compared the number of deaths for each day in 2017 with the average of the number of deaths for the same days in 2015 and 2016.

Officially, just 62 people died as a result of the storm that ravaged the island with nearly 150-mile-an-hour winds, cutting off power to 3.4 million Puerto Ricans. The last four fatalities were added to the death toll on Dec. 2.
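
The comparison described above (each day's deaths in 2017 against the average for the same day in 2015 and 2016) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Times's code: the function name `excess_deaths` is hypothetical and the daily counts are made-up placeholders, not Puerto Rico data.

```python
from statistics import mean


def excess_deaths(current, baseline_years):
    """Sum, over days, of observed deaths minus the baseline-year average.

    current: daily death counts for the year under study.
    baseline_years: one list of daily counts per comparison year,
        aligned day by day with `current`.
    """
    total = 0.0
    for day, observed in enumerate(current):
        # Expected deaths for this day: mean across the baseline years.
        expected = mean(year[day] for year in baseline_years)
        total += observed - expected
    return total


# Placeholder counts for three days (NOT real data).
deaths_2017 = [100, 110, 120]
deaths_2015 = [80, 82, 84]
deaths_2016 = [84, 86, 88]

print(excess_deaths(deaths_2017, [deaths_2015, deaths_2016]))  # 78.0
```

Averaging two baseline years smooths out some year-to-year noise, but the result is still an estimate of deaths above normal, not a count of deaths attributed to the storm.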
best  data-journalism  padjo  compciv  death-data  investigations 
6 weeks ago
ClaimReview schema

Instead of pool days and part-time jobs, Sreya Guha spends her summers with lines and lines of code.

A senior at the Castilleja high school in Palo Alto, California, Guha has spent the past two summers creating software. Her most recent project, Related Fact Checks, lets internet users paste article links and search to see if that topic has been already debunked by a fact-checking organization.

The platform isn’t your typical class project — it’s one of the best uses of existing technology to combat online misinformation, several fact-checking experts told Poynter.
metadata  scheme  compciv 
6 weeks ago
for real? they're gonna get him on "track changes"?
Mueller court filing includes Microsoft Word documents showing edits said to have been made by Manafort --->
metadata  compciv 
6 weeks ago
This Israeli Presentation on How to Make Drone Strikes More “Efficient” Disturbed Its Audience
ZAK OPENED HIS presentation with a startling statement that must have, somehow, felt matter-of-fact:

It has been said that in the upcoming round of combat, for example, the Israel Air Force will knock down some 1,000 buildings or more, so anyone who goes into Gaza won’t even be able to identify what he thought he should be able to see there.

Herein lies the problem confronting Israel’s high-tech air power, as Zak’s team sees it: What happens when you’ve so devastated an urban area that it’s no longer recognizable? How will you navigate, for the purposes of killing and destruction, a place that you’ve been transforming by said killing and destruction? Therein lies a main problem of drone warfare, relying heavily on sensor-laden robots that are still operated by humans with finite memories and with visual processing easily confused by rubble and ruin. This is where Zak’s research comes in. He explained in his remarks that the goal of his research was “at the end of the day, to improve the efficiency of unmanned drone operators in the army in their missions.”

Zak then described the work environment of the drone operator, who has video from the aircraft and a map, typically with some sort of overlay, which might show existing forces. “What he does not have,” Zak said, “is some sort of aggregate information about past missions.”
compciv  algorithm-society 
6 weeks ago
Collecting Data on Amazon Mechanical Turk (AMT)
This page contains details of several interfaces I wrote for collecting image annotations using Amazon Mechanical Turk. These jobs can be launched using the 'external hit' specification of the AMT command line interface. You can download the command line interface for AMT for the Unix platform here. The interfaces listed here are written mostly in Java/JavaScript using the canvas tag and rely on URL-encoding to pass parameters, such as image names, to the GUI. To view the code for the interfaces, you can view-source in any modern browser.
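
As a rough sketch of the URL-encoding idea mentioned above, here is how task parameters might be packed into a query string and read back. It is written in Python rather than the Java/JavaScript the interfaces themselves use, and the base URL, the parameter names (`image`, `label`) and both function names are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical address of an annotation page; not the real interface URL.
BASE_URL = "https://example.com/annotate.html"


def build_task_url(image_name, label):
    """Return the page URL with the task parameters in the query string."""
    query = urlencode({"image": image_name, "label": label})
    return f"{BASE_URL}?{query}"


def read_task_params(url):
    """Recover the parameters, roughly as the GUI would on page load."""
    parsed = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parsed.items()}


url = build_task_url("street scene 01.jpg", "car")
print(url)
print(read_task_params(url))
```

Encoding matters because file names can contain spaces or other characters that are not safe in a URL; `urlencode` escapes them and `parse_qs` reverses the escaping.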
compciv  mechturk  automation  aws 
6 weeks ago
Douglas W. Jones on Bookbinding
This tutorial on bookbinding is oriented towards the preservation of the contents of decaying pulp paperbacks; the first step in this process involves photocopying the decaying book, but most of this applies equally well to making up limited editions based on photocopies of manuscript pages or typewritten material

Assuming you are starting with a decaying paperback, you should ask if you really want to destroy the original! It is very difficult to photocopy an old paperback without destroying what is left of the binding, so it is worth asking if the book can be preserved by other means, for example, by neutralizing the acid in the paper.

If the paper in the book's pages breaks when creased and then reverse creased, the paper is beyond saving. For example, if dog eared corners of pages tend to fall off when they are unfolded or reverse folded, the paper is too brittle to save by any means other than plastic or tissue paper lamination. In the case of the worst of pulp paperbacks, it may only take a decade or two for the paper to reach this state of decay.

Assuming that you have decided to sacrifice the book to be photocopied, you can produce a limited edition of the book on archival paper. With a proper binding and modest care in storage, this should last for centuries.

I don't recommend undertaking this project more than once for any particular book! It is hard work! Read this whole report before trying it yourself. If anyone else has already done the job, you may be able to cut your effort in half if they saved an unbound photocopy that you can copy and bind.
6 weeks ago
Police officers prosecuted for use of deadly force - Washington Post
In 80 percent of the cases, at least one of the following occurred: the victim was shot in the back, there was a video recording of the incident, other officers testified against the shooter or a coverup was alleged.

best  data-journalism  investigations  compciv  padjo  project-idea  policing 
6 weeks ago
How Effective Is Your School District? A New Measure Shows Where Students Learn the Most - The New York Times
CHICAGO — In the Chicago Public Schools system, enrollment has been declining, the budget is seldom enough, and three in four children come from low-income homes, a profile that would seemingly consign the district to low expectations. But students here appear to be learning faster than those in almost every other school system in the country, according to new data from researchers at Stanford.

The data, based on some 300 million elementary-school test scores across more than 11,000 school districts, tweaks conventional wisdom in many ways. Some urban and Southern districts are doing better than data typically suggests. Some wealthy ones don’t look that effective. Many poor school systems do.
data-visualization  education 
7 weeks ago
Google’s New AI Smile Detector Shows How Embracing Race and Gender Can Reduce Bias - MIT Technology Review
A new paper published on arXiv by Google researchers has improved upon state-of-the-art smile detection algorithms by including and training racial and gender classifiers in their model. The racial classifier was trained on four race subgroups (Asian, black, Hispanic, and white) and two for gender.

Their method got to nearly 91 percent accuracy at detecting smiles in the Faces of the World (FotW) data set, a set of 13,000 images of faces collected from the Web that is sometimes used as a benchmark for such algorithms. That represents an improvement of a little over 1.5 percent from the previous mark. The results showed an overall improved accuracy across the board, showing that paying attention to race and gender can yield better results than trying to build an algorithm that is “color blind.”
AI  computer-vision  classifiers  compciv 
7 weeks ago
Why a Generation in Japan Is Facing a Lonely Death - The New York Times
She had been lonely every day for the past quarter of a century, she said, ever since her daughter and husband had died of cancer, three months apart. Mrs. Ito still had a stepdaughter, but they had grown apart over the decades, exchanging New Year’s cards or occasional greetings on holidays.
best  human-interest 
7 weeks ago
The Girl in the Window, 10 years later | Features | Tampa Bay Times
In 2007, a feral child was found starving, covered in her own filth, unable to walk or talk. A new family took in the girl, called her Dani, and tried to make up for years of neglect.
human-interest  best 
7 weeks ago
USDA:NRCS:Geospatial Data Gateway:Home
The Geospatial Data Gateway (GDG) provides access to a map library of over 100 high resolution vector and raster layers in the Geospatial Data Warehouse. It is the One Stop Source for environmental and natural resources data, at any time, from anywhere, to anyone. It allows you to choose your area of interest, browse and select data, customize the format, then review and download.
gis  data 
8 weeks ago
More than a Million Pro-Repeal Net Neutrality Comments were Likely Faked
I used natural language processing techniques to analyze net neutrality comments submitted to the FCC from April-October 2017, and the results were disturbing.
politics  nlp 
8 weeks ago
ProPublica Illinois Q&A: Meet Data Reporter Sandhya… — ProPublica
I organize my life with spreadsheets because why not? I actually have several spreadsheets for general life usage. I have a spreadsheet of all the things that I have in my apartment for when I move. I have a list of all the places I’ve visited, a list of all the flights I’ve taken and a general packing list so that every time I don’t have to create a new file. I also have a spreadsheet for all the ice cream places I visit. And I have a list of all the books that I have read in the last year. Spreadsheets are great.
9 weeks ago
Remove the legend to become one — Remains of the Day
When I started my first job at Amazon.com, as the first analyst in the strategic planning department, I inherited the work of producing the Analytics Package. I capitalize the term because it was both a serious tool for making our business legible, and because the job of its production each month ruled my life for over a year.

Back in 1997, analytics wasn't even a real word. I know because I tried to look up the term, hoping to clarify just what I was meant to be doing, and I couldn't find it, not in the dictionary, not on the internet. You can age yourself by the volume of search results the average search engine returned when you first began using the internet in force. I remember when pockets of wisdom were hidden in eclectic newsgroups, when Yahoo organized a directory of the web by hand, and later when many Google searches returned very little, if not nothing. Back then, if Russians wanted to hack an election, they might have planted some stories somewhere in rec.arts.comics and radicalized a few nerds, but that's about it.
programming  data-visualization 
9 weeks ago
Data organization in spreadsheets: The American Statistician: Vol 0, No ja

Spreadsheets are widely used software tools for data entry, storage, analysis, and visualization. Focusing on the data entry and storage aspects, this paper offers practical recommendations for organizing spreadsheet data to reduce errors and ease later analyses. The basic principles are: be consistent, write dates like YYYY-MM-DD, don't leave any cells empty, put just one thing in a cell, organize the data as a single rectangle (with subjects as rows and variables as columns, and with a single header row), create a data dictionary, don't include calculations in the raw data files, don't use font color or highlighting as data, choose good names for things, make backups, use data validation to avoid data entry errors, and save the data in plain text files.
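
Two of these principles (dates written as YYYY-MM-DD and no empty cells) are easy to check mechanically once the data is saved as plain text. The sketch below is one possible check, not something taken from the paper; the function names and the sample column names (`subject`, `date`, `weight`) are hypothetical.

```python
import csv
import io
import re
from datetime import datetime

ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_iso_date(text):
    """True only for a real calendar date written exactly as YYYY-MM-DD."""
    if not ISO_DATE.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:  # e.g. 2017-02-30
        return False
    return True


def check_rows(csv_text, date_column):
    """Return (row_number, problem) tuples for a CSV with one header row."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column, value in row.items():
            if column is None:  # more fields than headers: not a rectangle
                problems.append((row_number, "extra cells beyond header"))
            elif value is None or value.strip() == "":
                problems.append((row_number, f"empty cell in {column}"))
        date_value = (row.get(date_column) or "").strip()
        if date_value and not is_iso_date(date_value):
            problems.append((row_number, f"date not YYYY-MM-DD: {date_value}"))
    return problems


sample = "subject,date,weight\nA1,2017-11-09,54.2\nA2,11/9/2017,\n"
print(check_rows(sample, "date"))
```

On the sample, the first data row passes and the second is flagged twice: once for the empty `weight` cell and once for the date written as 11/9/2017.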
spreadsheets  best 
9 weeks ago
A Guide to Natural Language Processing - Federico Tomassetti - Software Architect
Natural Language Processing (NLP) comprises a set of techniques that can be used to achieve many different objectives. Take a look at the following table to figure out which technique can solve your particular problem.
machine-learning  advice  howto  nlp 
9 weeks ago
Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning | Dropbox Tech Blog
In this post we will take you behind the scenes on how we built a state-of-the-art Optical Character Recognition (OCR) pipeline for our mobile document scanner. We used computer vision and deep learning advances such as bi-directional Long Short Term Memory (LSTMs), Connectionist Temporal Classification (CTC), convolutional neural nets (CNNs), and more. In addition, we will also dive deep into what it took to actually make our OCR pipeline production-ready at Dropbox scale.
ocr  best 
10 weeks ago
Making NLP work for Investigative Journalism - YouTube
Speaker: Jonathan Stray, Research Scholar, Columbia Journalism School

Presented at the Berkeley Institute for Data Science on Thursday, November 9, 2017. (Note: Due to an equipment error, the first few seconds of this video are missing. Our apologies for the inconvenience.)
10 weeks ago
1-day numpy training Numpy exercises
training page from facebook research engineer Matthijs Douze
numpy  python  tutorial 
10 weeks ago
Connecting with the Dots - Learning - Source: An OpenNews project
To illustrate what I mean, here is a similar chart posted to a clickbait Twitter account called @BrilliantMaps of all the car bombings in Baghdad since 2003. It wasn’t originally clear who made this map–lack of attribution is common for these kinds of accounts–but the contrast between this map and the Times and Guardian interactives mentioned above is glaring. The problem is that this map is not only wrong, it’s also terrible. Gawker figured out the origins of this map and discovered that it was actually derived from Guardian data of all fatalities in Baghdad from 2003 to 2009, including accidents, so it exaggerated the data. Brilliant Maps later issued a correction, but still got it wrong. I’m beginning to think clickbait Twitter accounts aren’t entirely reliable.
data-visualization  mapping  best 
11 weeks ago
USS McCain collision ultimately caused by UI confusion | Ars Technica UK
On November 1, the US Navy issued its report on the collisions of the USS Fitzgerald and USS John S. McCain this summer. The Navy's investigation found that both collisions were avoidable accidents. And in the case of the USS McCain, the accident was in part caused by an error made in switching which control console on the ship's bridge had steering control. While the report lays the blame on training, the user interface for the bridge's central navigation control systems certainly played a role.

According to the report, at 5:19am local time, the commanding officer of the McCain, Commander Alfredo J. Sanchez, "noticed the Helmsman (the watchstander steering the ship) having difficulty maintaining course while also adjusting the throttles for speed control." Sanchez ordered the watch team to split the responsibilities for steering and speed control, shifting control of the throttle to another watchstander's station—the lee helm, immediately to the right (starboard) of the Helmsman's position at the Ship’s Control Console. While the Ship's Control Console has a wheel for manual steering, both steering and throttle can be controlled with trackballs, with the adjustments showing up on the screens for each station.

However, instead of switching just throttle control to the Lee Helm station, the Helmsman accidentally switched all control to the Lee Helm station. When that happened, the ship's rudder automatically moved to its default position (amidships, or on center line of the ship). The helmsman had been steering slightly to the right to keep the ship on course in the currents of the Singapore Strait, but the adjustment meant the ship started drifting off course.

The bridge layout on the McCain, with watchstations labeled. The ship should have had its full "sea and anchor" detail on watch.
The Ship Control Console of the McCain, with helm (at left) and lee helm (at right) stations. Both have trackballs to enter commands into the console through control screens.
As the McCain watch team scrambled to figure out what was going on, the ship overtook and steered across the bow of the Alnic MC.
design  ui  ux 
11 weeks ago
The Bots That Are Changing Politics - Motherboard
A taxonomy of politibots, a swelling force in global elections that cannot be ignored.

Editor's note: This essay is drawn from discussions and writings around a June 2017 convening organized and led by Samuel Woolley, Research Director of the new DigIntel Lab at the Institute for the Future, alongside fellow bot experts* Renee DiResta, John Little, Jonathon Morgan, Lisa Maria Neudert, and Ben Nimmo. The symposium was held at Jigsaw, the Google / Alphabet think-tank and technology incubator. Disclosure: Jigsaw provided space, funded Woolley as a (former) research fellow, and covered travel costs.

Bots and their cousins—botnets, bot armies, sockpuppets, fake accounts, sybils, automated trolls, influence networks—are a dominant new force in public discourse.

You may have heard that bots can be used to threaten activists, swing elections, and even engage in conversation with the President. Bots are the hip new media; Silicon Valley has marketed the chatbot as the next technological step after the app. Donald Trump himself has said he wouldn't have won last November without Twitter, where, researchers found, bots massively amplified his support on the platform.

Scholars have argued that nearly 50 million accounts on Twitter are actually automatically run by bot software. On Facebook, social bots—accounts run by automated software that mimic real users or work to communicate particular information streams—can be used to automate group pages and spread political advertisements. Recent public revelations from Facebook reveal that a Russian "troll farm" with close ties to the Kremlin spent around $100,000 on ads ahead of the 2016 US election and produced thousands of organic posts that spread across Facebook and Instagram. The same firm, the Internet Research Agency, has been known to make widespread use of bots in its attempts to manipulate public opinion over social media.
bots  compciv  twitter  social-media 
11 weeks ago
Inside The Great Poop Emoji Feud
“Organic waste isn’t cute,” Everson wrote, aghast that the technical committee would even deign to consider additional excremoji. “It is bad enough that the [Emoji Subcommittee] came up with it, but it beggars belief that the [Unicode Technical Committee] actually approved it,” he wrote. Everson continued:

“The idea that our 5 committees would sanction further cute graphic characters based on this should embarrass absolutely everyone who votes yes on such an excrescence. Will we have a CRYING PILE OF POO next? PILE OF POO WITH TONGUE STICKING OUT? PILE OF POO WITH QUESTION MARKS FOR EYES? PILE OF POO WITH KARAOKE MIC? Will we have to encode a neutral FACELESS PILE OF POO?”
emoji  unicode  language 
11 weeks ago
When Patents Attack... Part Two! | Transcript | This American Life
Chris Crawford then launches into a rather surprising explanation for how the sentence that this business was, quote, "Jack Byrd's idea," doesn't actually mean that the business was Jack Byrd's idea. His explanation? He was using the apostrophe S incorrectly.
11 weeks ago
The suspect told police ‘give me a lawyer dog.’ The court says he wasn’t asking for a lawyer. - The Washington Post
But that’s not how the courts in Louisiana see it. And when a suspect in an interrogation told detectives to “just give me a lawyer dog,” the Louisiana Supreme Court ruled that the suspect was, in fact, asking for a “lawyer dog,” and not invoking his constitutional right to counsel. It’s not clear how many lawyer dogs there are in Louisiana, and whether any would have been available to represent the human suspect in this case, other than to give the standard admonition in such circumstances to simply stop talking.
punctuation  crime  justice 
11 weeks ago
The Improbable Origins of PowerPoint - IEEE Spectrum
Walking into the hall to deliver the speech was a “daunting experience,” the speaker later recalled, but “we had projectors and all sorts of technology to help us make the case.” The technology in question was PowerPoint, the presentation software produced by Microsoft. The speaker was Colin Powell, then the U.S. Secretary of State.

Powell’s 45 slides displayed snippets of text, and some were adorned with photos or maps. A few even had embedded video clips. During the 75-minute speech, the tech worked perfectly. Years later, Powell would recall, “When I was through, I felt pretty good about it.”
powerpoint  history 
11 weeks ago
Information Is Power
It was 1969, and the American War on Vietnam seemed unending. Mass outrage over the war had spilled into the nation’s streets and campuses — outrage over the rising heap of body bags returning home, over the neverending spree of bombs that barreled down from US planes onto rural villages, with the images of fleeing families, their skin seared by napalm, broadcast across the world.

Hundreds of thousands of people had begun to resist the war. The fall of 1969 saw the historic Moratorium protests, the largest protests in US history.
11 weeks ago
A Minimalist Guide to SQLite
SQLite is a self-contained, serverless SQL database. Dr. Richard Hipp, the creator of SQLite, first released the software on the 17th of August, 2000. Since then it has gone on to be the second most deployed piece of software in the world. It's used in systems as important as the Airbus A350 so it comes as no surprise the tests for SQLite 3 are aviation-grade. The software itself is very small, the amd64 Debian client and library package is 765 KB when compressed for distribution and 2.3 MB when fully installed. The software is licensed under a very promiscuous license: Public Domain.
database  guide  sql  sqlite  tutorial 
11 weeks ago
How We Found Tom Price’s Private Jets - POLITICO Magazine
That meant we had to recreate Price’s schedule from scratch if we were to have any hope of matching his trips to chartered flights. We reviewed the HHS summaries of Price’s meetings. We scoured news sites for reports of Price speeches outside Washington. We obsessively tracked his appearances on social media. Putting all this information together, we built a database of Price’s trips.
data-journalism  investigations 
october 2017
California Regulators Require Auto Insurers to Adjust Rates
The state changed its approach in response to ProPublica’s finding that minority neighborhoods were paying higher premiums than white areas with the same risk.
compciv  algorithms  transparency  best 
september 2017
A Brief History of Religion and the U.S. Census | Pew Research Center
The U.S. Census Bureau has not asked questions about religion since the 1950s, but the federal government did gather some information about religion for about a century before that. Starting in 1850, census takers began asking a few questions about religious organizations as part of the decennial census that collected demographic and social statistics from the general population as well as economic data from business establishments. Federal marshals and assistant marshals, who acted as census takers until after the Civil War, collected information from members of the clergy and other religious leaders on the number of houses of worship in the U.S. and their respective denominations, seating capacities and property values. Although the census takers did not interview individual worshipers or ask about the religious affiliations of the general population, they did ask members of the clergy to identify their denomination – such as Methodist, Roman Catholic or Old School Presbyterian. The 1850 census found that there were 18 principal denominations in the U.S.
census  data-analysis 
september 2017
The Politics of Last Names - The Atlantic
Last names are deeply personal, a kind of shorthand for expressing family bonds. But they’re also profoundly political, reflecting the machinations of governments in the countries that family has passed through over time. The latest example comes courtesy of Afghanistan, where officials are conducting the first nationwide census in three and a half decades—and confronting a major obstacle: names in the country are malleable, and many Afghans use only one. The government’s solution is to urge its people to take on surnames. “The remote, tribal nature of Afghan villages may have had something to do with the lack of surnames,” The New York Times recently noted. “So perhaps did the historic weakness of national governments, which have tended to require fixed names in the interest of keeping track of people, to draft them or tax them.”
september 2017
Unemployed lumber worker goes with his wife to the bean harvest. Note social security number tattooed on his arm, Oregon, 1939 by Dorothea Lange. [1600x1195] : HistoryPorn
"Oregon, August 1939. "Unemployed lumber worker goes with his wife to the bean harvest. Note Social Security number tattooed on his arm."(And now a bit of Shorpy scholarship/detective work. A public records search shows that 535-07-5248 belonged to one Thomas Cave, born July 1912, died in 1980 in Portland. Which would make him 27 years old when this picture was taken.) Medium format safety negative by Dorothea Lange."

A search of the 1940 census finds his wife's name was Vivian (his first wife, it appears, since his wife at the time of his death was Ann Kathryn Bloom). It also looks like he was employed in 1940, as the census indicates that he worked 48 hours the week of Mar 24-30, 1940.

september 2017
The Myth Of The Actuary: Life Insurance And Frederick L. Hoffman's Race Traits And Tendencies Of The American Negro
In May 1896, Frederick L. Hoffman, a statistician at the Prudential Life Insurance Company, published a 330-page article in the prestigious Publications of the American Economic Association intended to prove—with statistical reliability—that the American Negro was uninsurable. Race Traits and Tendencies of the American Negro was a compilation of statistics, eugenic theory, observation, and speculation, solicited by the Prudential in response to a wave of state legislation banning discrimination against African Americans.

Race Traits immediately became a key text in one of the central social preoccupations of the turn of the century: the supposed Negro Problem. Numerous turn-of-the-century tracts (including Hoffman's) stipulated that minority racial groups were not only biologically inferior but also barriers to progress. Hoffman, a German immigrant, was one of the leading statisticians of his time and also a strong proponent of racial hierarchy and white supremacy.1 His application of mathematical tools to a social debate set a precedent for the use of statistics and actuarial science—two fields then in their infancies, which absorbed the biases and errors of their early participants. Though Race Traits was hailed by many as a work of genius, even in its own day critics attacked its racist premise and suppositions, noting that Hoffman's sources were problematical and his mathematical analysis flawed. Hoffman's work embedded racial ideologies within its approach to actuarial data, a legacy that remains with the field today.
data-analysis  dirty-data 
september 2017