database - Relational table naming convention - Stack Overflow
Yes. Beware of the heathens. Plural in the table names are a sure sign of someone who has not read any of the standard materials and has no knowledge of database theory.

https://news.ycombinator.com/item?id=16904088
Some of the wonderful things about Standards are: they are all integrated with each other; they work together; and they were written by minds greater than ours, so we do not have to debate them. The standard table name refers to each row in the table, which is used in all the verbiage, not the total content of the table (we know that the Customer table contains all the Customers).
database  naming-things  sql 
2 hours ago
Rethinking GPS: Engineering Next-Gen Location at Uber
https://news.ycombinator.com/item?id=16887276

Location and navigation using global positioning systems (GPS) is deeply embedded in our daily lives, and is particularly crucial to Uber’s services. To orchestrate quick, efficient pickups, our GPS technologies need to know the locations of matched riders and drivers, as well as provide navigation guidance from a driver’s current location to where the rider needs to be picked up, and then, to the rider’s chosen destination. For this process to work seamlessly, the location estimates for riders and drivers need to be as precise as possible.
gps  gis 
3 days ago
How we identified bots on Twitter | Pew Research Center
You can come up with a list of characteristics like these to try to determine whether an account is a bot or not. Of course, it would be far too time-consuming to try to observe those characteristics for 140,000 different Twitter accounts (roughly the number of accounts included in the study). A more practical approach is to come up with a reasonably large dataset of accounts that are bots and not bots, and then use a machine learning system to “learn” the patterns that characterize bot and human accounts. With those patterns in hand, you can then use them to classify a much larger number of accounts.
news  article  bots  twitter  heuristics  methodology 
4 days ago
Design Doesn’t Care What You Think Information Looks Like | Rob Weychert
Hello! My name is Rob Weychert. I’m an editorial experience designer at ProPublica, which means I work on the overall user experience of the ProPublica site as well as working on custom art direction and layout for some of our big feature stories.
design  css  data 
7 days ago
PEP 8, beautiful code, and the tyranny of guidelines.
I’m no Python developer, but I learned a great deal from Raymond Hettinger’s Pycon 2015 presentation, Beyond PEP 8: Best practices for beautiful intelligible code. It could as well have been called ‘the danger of standards’:
compciv  python  style-guide 
20 days ago
How We Collected Nearly 5,000 Stories of Maternal Harm — ProPublica
Asking if readers knew women who died or almost died in childbirth drew an outpouring that carries lessons for both traditional and engaged journalism.
4 weeks ago
News media offers consistently warped portrayals of black families, study finds - The Washington Post
https://twitter.com/geoffhing/status/973280893514248192

If all you knew about black families was what national news outlets reported, you are likely to think African Americans are overwhelmingly poor, reliant on welfare, absentee fathers and criminals, despite what government data show, a new study says.

Major media outlets routinely present a distorted picture of black families — portraying them as dependent and dysfunctional — while white families are more likely to be depicted as sources of social stability, according to the report released Wednesday by Color of Change, a racial justice organization, and Family Story, an advocate of diverse family arrangements.

“This leaves people with the opinion that black people are plagued with self-imposed dysfunction that creates family instability and therefore, all their problems,” said Travis L. Dixon, a communications professor at the University of Illinois at Urbana-Champaign who conducted the study.

https://twitter.com/bechang8/status/971047341976367104
diversity  crime 
6 weeks ago
NLP Concepts with spaCy. Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
NLP concepts with spaCy
“Natural Language Processing” is a field at the intersection of computer science, linguistics and artificial intelligence which aims to make the underlying structure of language available to computer programs for analysis and manipulation. It’s a vast and vibrant field with a long history! New research and techniques are being developed constantly.

The aim of this notebook is to introduce a few simple concepts and techniques from NLP—just the stuff that’ll help you do creative things quickly, and maybe open the door for you to understand more sophisticated NLP concepts that you might encounter elsewhere.

We'll be using a library called spaCy, which is a good compromise between being very powerful and state-of-the-art and easy for newcomers to understand.

(Traditionally, most NLP work in Python was done with a library called NLTK. NLTK is a fantastic library, but it’s also a writhing behemoth: large and slippery and difficult to understand. Also, much of the code in NLTK is decades out of date with contemporary practices in NLP.)

This tutorial is written in Python 2.7, but the concepts should translate easily to later versions.
nlp  spacy  python 
8 weeks ago
FACT CHECK: Were News Stories About the Florida Mass Shooting Posted Days Before It Happened?
Listing the date and time for news stories is complicated, as there is no universal standard used. Sometimes, stories have no date and time on the webpage. Sometimes, a story may appear on a page that lists related stories, and they all have different dates and times shown. Sometimes a publisher might be in a different time zone and the date gets picked up incorrectly. All these things may cause us to occasionally list the wrong date. We’re looking at ways to improve.
time 
8 weeks ago
Jeremy Burge on Twitter: "…also in emoji news this week: Samsung quietly moved the cheese directly on top of the burger patty 😊… "
…also in emoji news this week: Samsung quietly moved the cheese directly on top of the burger patty 😊
emoji 
9 weeks ago
Revealed: The Pentagon Is Spending Up To $2.2 Billion on Soviet-Style Arms for Syrian Rebels - OCCRP
An important clue lay in seven of the contracts, signed in September 2016 and worth $71.6 million, which did initially cite Syria either by name or by the Pentagon’s internal code – V7 – for the Syria Train and Equip program. These references were deleted from the public record after BIRN and OCCRP asked the Pentagon about these deliveries this March.

Before and after images from a Pentagon procurement database show how the end destinations, "Syria and Iraq," were removed from the procurement records.
Credit: BIRN
Reporters made copies of the documents before they were deleted. The Pentagon has declined to explain the alterations.

Picatinny is circumspect about its role supplying Syrian rebels given the sensitive nature of the conflict. In addition to pitting an array of militias against Syrian government forces, the fighting is described by experts as a complex proxy war involving Saudi Arabia, Iran, Turkey, and Russia.
publicrecords  compciv  military 
9 weeks ago
Some suburbs take only seconds to review red light camera citations, analysis shows - Chicago Tribune
The Tribune sought approval logs for a recent three-month period from a sampling of departments. Those logs chart down to the second when officers approve each ticket. Reporters could determine how long each officer typically spent to review a citation. On the high end, one officer’s median number of seconds for review — the midpoint of his review times — was about 24 seconds between citations.

But some officers were much faster. For Skokie Officer Steven Odeshoo, the median was 7 seconds.

On a recent morning, he showed the Tribune how: sitting in front of a 42-inch flat screen TV that instantly pulled up video after video, with special color-coded cues that let him know if the intersection had unique rules such as no turn on red, allowed him to fast-forward the videos to make a quicker judgment. With two quick clicks of the mouse, a ticket was approved or rejected, and the next suggested violation immediately began playing.

Faster still was Lynwood Officer Stevie Bradich, at 5 seconds. Her deputy chief explained that with the no-turn-on-red, they were relatively easy calls.

“I can see how that can be done in as short as 5 or 6 seconds, and have an approval with what is a true violation,” Shubert said. “Yes, it’s fast, but it’s a pretty fast process. You’re not actually inputting any numbers. It’s just a lot of mouse clicks.”

But one department acknowledged its numbers suggest problems. In Riverdale, one officer typically took 3 seconds to review tickets.
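
The per-officer median described above can be sketched roughly as follows. This is a minimal illustration, not the Tribune's actual method: the log layout, the officer labels, and the timestamps are all invented, and the function name `median_review_seconds` is hypothetical. It assumes each log row carries an officer and the second at which a ticket was approved, and treats the gap between consecutive approvals as the review time.

```python
from datetime import datetime
from statistics import median

# Hypothetical approval log: (officer, approval timestamp). All values invented.
approval_log = [
    ("Officer A", "2017-06-01 09:00:00"),
    ("Officer A", "2017-06-01 09:00:07"),
    ("Officer A", "2017-06-01 09:00:12"),
    ("Officer A", "2017-06-01 09:00:21"),
    ("Officer B", "2017-06-01 10:00:00"),
    ("Officer B", "2017-06-01 10:00:24"),
    ("Officer B", "2017-06-01 10:00:50"),
]


def median_review_seconds(log):
    """Return {officer: median seconds between consecutive approvals}."""
    by_officer = {}
    for officer, stamp in log:
        parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        by_officer.setdefault(officer, []).append(parsed)

    medians = {}
    for officer, stamps in by_officer.items():
        stamps.sort()
        # Gap between each approval and the one before it, in seconds.
        gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
        if gaps:  # an officer with a single approval has no gaps to measure
            medians[officer] = median(gaps)
    return medians


medians = median_review_seconds(approval_log)
```

A median is a reasonable choice here because long pauses between review sessions (breaks, shift changes) would inflate a mean but barely move the midpoint.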
best  investigations  data-journalism  padjo  compciv  foia 
10 weeks ago
Justice, Interrupted - More Perfect - WNYC Studios
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2933016

The first thing that comes to mind is an episode of Radiolab’s More Perfect, on interruptions in the Supreme Court.
https://www.wnycstudios.org/story/justice-interrupted/

Also, here’s a link to the research paper the episode is based on:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2933016

They don’t use either of the sources you mention, but their use of the transcripts is fascinating.

Lucille Sherman
National Data Enterprise Reporter
GateHouse Media
941-361-4903
compciv-2018  text-mining 
10 weeks ago
Floods Are Getting Worse, and 2,500 Chemical Sites Lie in the Water’s Path - The New York Times
The Times analysis focused on facilities on the federal toxic release database, which tracks sites handling chemicals that could be harmful to health and the environment if released. The list does not include properties like Superfund sites or wastewater facilities, or chemical sites where the predominant risks are fire or explosion, as opposed to toxic pollution.

The Times also examined reports of oil and chemical spills tallied by the National Response Center, which is run by the Coast Guard. Companies are required by law to report spills to the N.R.C., although that database has been criticized as incomplete.
environment  geospatial  sdss  padjo 
10 weeks ago
The Shallowness of Google Translate - The Atlantic
One Sunday, at one of our weekly salsa sessions, my friend Frank brought along a Danish guest. I knew Frank spoke Danish well, since his mother was Danish, and he, as a child, had lived in Denmark. As for his friend, her English was fluent, as is standard for Scandinavians. However, to my surprise, during the evening’s chitchat it emerged that the two friends habitually exchanged emails using Google Translate. Frank would write a message in English, then run it through Google Translate to produce a new text in Danish; conversely, she would write a message in Danish, then let Google Translate anglicize it. How odd! Why would two intelligent people, each of whom spoke the other’s language well, do this? My own experiences with machine-translation software had always led me to be highly skeptical about it. But my skepticism was clearly not shared by these two. Indeed, many thoughtful people are quite enamored of translation programs, finding little to criticize in them. This baffles me.
google  language  NLP 
11 weeks ago
specifications - Was the misspelling of the HTTP field name Referer intentional? - Stack Overflow
It's like when I did the referer field. I got nothing but grief for my choice of spelling. I am now attempting to get the spelling corrected in the OED since my spelling is used several billion times a minute more than theirs.
grammar  punctuation 
11 weeks ago
Google vs. Evil | WIRED
The world's biggest, best-loved search engine owes its success to supreme technology and a simple rule: Don't be evil. Now the geek icon is finding that moral compromise is just the cost of doing big business.
google 
11 weeks ago
Why the Toronto Star Stopped identifying an 11-year-old girl who made up Hijab Attack - iMediaEthics
An 11-year-old Toronto girl claimed earlier this month that her hijab had been cut on her way to school. The alleged incident made national news and prompted a police investigation as a possible hate crime. In addition, the girl spoke at a press conference. It turns out, however, that the girl made up the story, something her family admitted after the police said the incident “did not happen.”
forgetme 
11 weeks ago
Why referrer is spelt wrong in http
The misspelling of referrer originated in the original proposal by computer scientist Phillip Hallam-Baker to incorporate the field into the HTTP specification. The misspelling was set in stone by the time of its incorporation into the Request for Comments standards document RFC 1945; document co-author Roy Fielding has remarked that neither "referrer" nor the misspelling "referer" were recognized by the standard Unix spell checker of the period.
11 weeks ago
Asking the Right Questions About AI – Yonatan Zunger – Medium
What happened here wasn’t a bias in Google’s algorithms: it was a bias in the underlying data. This particular bias was a combination of “invisible whiteness” and media bias in reporting: if three white teenagers are arrested for a crime, not only are news media much less likely to show their mug shots, but they’re less likely to refer to them as “white teenagers.” In fact, nearly the only time groups of teenagers were explicitly labeled as being “white” was in stock photography catalogues. But if three black teenagers are arrested, you can count on that phrase showing up a lot in the press coverage.
machine-learning  ethics  compciv  compciv-2018 
11 weeks ago
The invisible hazard afflicting thousands of schools | Center for Public Integrity
Nearly 8,000 U.S. public schools lie within 500 feet of highways, truck routes and other roads with significant traffic, according to a joint investigation by the Center for Public Integrity and Reveal from The Center for Investigative Reporting. That’s about one in every 11 public schools, serving roughly 4.4 million students and spread across every state in the nation. Thousands more private schools and Head Start centers are in the same fix.
health  padjo  geospatial  sdss 
11 weeks ago
3 Smart Data Journalism Techniques that can help you find stories faster
Text processing has never been easier or more powerful. Across industries, analysts increasingly complement close reading with computational approaches to gain insight from large volumes of text. Companies, for instance, assess customer sentiment from millions of reviews or follow topics discussed on social media in real-time.

Meanwhile, the volume of documents available for journalistic inquiry has exploded: reams of information on government operations (Wikileaks Cablegate: 200,000 pages,) private wealth shelters (Paradise Papers: 13.4 million pages,) and public figures’ communication (Sarah Palin’s emails: 24,000 pages) leak, it seems, almost monthly.
data-journalism  data  machine-learning  compciv  machine-journalism 
11 weeks ago
Do ‘Fast and Furious’ Movies Cause a Rise in Speeding? - The New York Times
Using detailed traffic violation data from Montgomery County, Md., we were able to examine all speeding tickets there from 2012 to 2017. This length of time allowed us to investigate the effect of three movies in the “Fast and Furious” series. Looking at the 192,892 speeding tickets recorded, we analyzed the average miles per hour over the speed limit that drivers were charged with going on a given day.
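
A rough sketch of the daily measure described above, for anyone wanting to try it on their own ticket data. The rows, dates, and field order here are invented for illustration and are not the Montgomery County schema; `average_mph_over_by_day` is a hypothetical helper name. It assumes each ticket records a date, the posted limit, and the speed the driver was charged with.

```python
from collections import defaultdict

# Hypothetical tickets: (date, posted limit in mph, charged speed in mph).
tickets = [
    ("2017-04-13", 55, 64),
    ("2017-04-13", 35, 50),
    ("2017-04-14", 55, 75),
    ("2017-04-14", 40, 52),
    ("2017-04-14", 30, 46),
]


def average_mph_over_by_day(rows):
    """Return {date: mean miles per hour over the limit for that day's tickets}."""
    totals = defaultdict(lambda: [0, 0])  # date -> [sum of mph over, ticket count]
    for day, limit, charged in rows:
        totals[day][0] += charged - limit
        totals[day][1] += 1
    return {day: total / count for day, (total, count) in totals.items()}


daily_average = average_mph_over_by_day(tickets)
```

Comparing these daily averages in the days before and after a release date is then a matter of slicing the resulting dictionary by date.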
data-analysis  compciv-2018  publicrecords 
11 weeks ago
What He Did on His Summer Break: Exposed a Global Security Flaw - The New York Times
SYDNEY, Australia — When Nathan Ruser, an Australian university student, posted on Twitter over the weekend that a fitness app had revealed the locations of military sites in Syria and elsewhere, he did not expect much response.

But the news ricocheted across the internet, alarming security experts, who said hostile entities could glean valuable intelligence from the Strava app’s global “heat map,” including the locations of secret bases and the movements of military personnel. The Pentagon said it was reviewing the situation.
compciv  data-analysis 
11 weeks ago
Julian Assange Thought He Was Messaging Sean Hannity When He Offered ‘News’ on Democrat Investigating Trump-Russia
At about 4 a.m. on Saturday morning, a couple hours after she started pretending to be Sean Hannity, Dell Gilliam says she got a direct message back from the head of WikiLeaks, Julian Assange. That’s when she said she “kind of panicked.”

“I felt bad. He really thought he was talking to Sean Hannity,” said Gilliam.

Gilliam, a technical writer from Texas, was bored with the flu when she created @SeanHannity__ early Saturday morning. The Fox News host's real account was temporarily deleted after cryptically tweeting the phrase “Form Submission 1649 | #Hannity” on Friday night. Twitter said the account had been “briefly compromised,” according to a statement provided to The Daily Beast, and was back up on Sunday morning.
naming-things 
11 weeks ago
An acclaimed crime reporter leaves her newsroom for police work - Columbia Journalism Review
Part of the answer, the Blade had heard, could be found on a secret police department map that charted the neighborhoods controlled by each of the city’s gangs. In the summer of 2012, after a little more than a year on the crime beat, Dungjen asked the department for a copy. Publishing the map, she thought, would give readers a sense of how big the city’s gang problem was. But the police refused, arguing that the document was involved in ongoing investigations. To get the map, the Blade sued the department, invoking Ohio’s Public Records Act.

Months passed. The Blade’s publisher, John Block, was confident that the newspaper would eventually win its legal battle, but he was growing increasingly frustrated with the slow pace of the judicial process. “One day, I said, ‘I’m tired of waiting,’” Block recalls. “‘Why don’t we just go out on the street and figure out what gangs are operating where?’”

Blade editors assigned Dungjen and photographer Amy Voigt to the story. For the next three months, the Blade took the two journalists off daily assignments so they could focus all their energy on the gang-map project. The first few days were demoralizing. Voigt and Dungjen didn’t have a plan. They drove aimlessly through gang-controlled neighborhoods, asking locals for information.
journalism 
january 2018
Here’s How Scammers Are Using Fake News To Screw With Bitcoin Investors
So they bought. And they bought without noticing the additional "L" in the Twitter username or the missing verification check that distinguished the bogus McAfee account from the real one, @OfficialMcAfee. When the tweet was first broadcast at around 3 p.m. ET, GVT was bought and sold on the market at $30. By 3:04, it was at $45, and trading volume had doubled.
typo 
january 2018
It is *not* possible to detect and block Chrome headless
A few months back, I wrote a popular article called Making Chrome Headless Undetectable in response to one called Detecting Chrome Headless by Antoine Vastel. The one thing that I was really trying to get across in writing that is that blocking site visitors based on browser fingerprinting is an extremely user-hostile practice. There are simply so many variations in browser configurations that you’re inevitably going to end up blocking non-automated access to your website, and–on top of that–you’re really not accomplishing anything in terms of blocking sophisticated web scrapers. To illustrate this, I showed how to bypass all of the suggested “tests” in Antoine’s first post and pointed out that they hadn’t been tested in multiple browser versions and would fail for any users with beta or unstable Chrome builds.
chrome  headless-browser  testing  web-scraping 
january 2018
Hawaii missile alert: How one employee ‘pushed the wrong button’ and caused a wave of panic - The Washington Post
Around 8:05 a.m., the Hawaii emergency employee initiated the internal test, according to a timeline released by the state. From a drop-down menu on a computer program, he saw two options: “Test missile alert” and “Missile alert.” He was supposed to choose the former; as much of the world now knows, he chose the latter, an initiation of a real-life missile alert.

“In this case, the operator selected the wrong menu option,” HEMA spokesman Richard Rapoza told The Washington Post on Sunday.
design  ux  typo  human-error 
january 2018
Moira Donegan: I Started the Media Men List
In October, I created a Google spreadsheet called “Shitty Media Men” that collected a range of rumors and allegations of sexual misconduct, much of it violent, by men in magazines and publishing. The anonymous, crowdsourced document was a first attempt at solving what has seemed like an intractable problem: how women can protect ourselves from sexual harassment and assault.

One long-standing partial remedy that women have developed is the whisper network, informal alliances that pass on open secrets and warn women away from serial assaulters. Many of these networks have been invaluable in protecting their members. Still, whisper networks are social alliances, and as such, they’re unreliable. They can be elitist, or just insular. As Jenna Wortham pointed out in The New York Times Magazine, they are also prone to exclude women of color. Fundamentally, a whisper network consists of private conversations, and the document that I created was meant to be private as well. It was active for only a few hours, during which it spread much further and much faster than I ever anticipated, and in the end, the once-private document was made public — first when its existence was revealed in a BuzzFeed article by Doree Shafrir, then when the document itself was posted on Reddit.
spreadsheets 
january 2018
The #MeToo Movement Has Worked - Bloomberg
Conor Sen: As a fan of American film, I'm worried about where the industry is going. DVD sales have been declining for years, and streaming revenue looks unlikely to ever replace them. The rise of moviegoers in China has created an incentive for Hollywood to make movies that appeal to Chinese audiences as well as Western ones. There are so many entertainment options now, and marketing costs have become so astronomical, that Hollywood has decided to play it safe and focus on large, well-established franchises for movies and sequels. Put all this together and you get a lot of movies like the latest Transformers installment, which dominated the Chinese box office but barely registered in the U.S.
january 2018
Facebook’s Virtual Assistant M Is Dead. So Are Chatbots | WIRED
That’s because most of the tasks fulfilled by M required people. Facebook’s goal with M was to develop artificial-intelligence technology that could automate almost all of M’s tasks. But despite Facebook’s vast engineering resources, M fell short: One source familiar with the program estimates M never surpassed 30 percent automation. Last spring, M’s leaders admitted the problems they were trying to solve were more difficult than they’d initially realized.
bots  facebook  automation 
january 2018
We Used Broadband Data We Shouldn’t Have — Here’s What Went Wrong | FiveThirtyEight
Over the summer, FiveThirtyEight published two stories on broadband internet access in the U.S. that were based on a data set made public by academic researchers who had acquired data from Catalist, a well-known political data firm. After further reporting, we can no longer vouch for the academics’ data set. The preponderance of evidence we’ve collected has led us to conclude that it is fundamentally flawed. That’s because:

The academics’ data does not provide an accurate picture of broadband use at the county level relative to other sources.
Some of the data that the academic researchers received from Catalist originated with a third-party commercial source, and Catalist acknowledged that it did not vet that data itself. The researchers and Catalist also disagree about what Catalist said the data represents and what it could be used for.
retractions  methodology  dirty-data  data-journalism 
january 2018
Unfreed | The Marshall Project
Orman called the DOC at once. He learned not only that Lima-Marin was free, but that he’d been out more than five years, completing his parole. He checked the state court’s computer system and noticed a strange phrase tacked on to each of Lima-Marin’s eight convictions: “No Consecutive/Concurrent Sentences.” Orman wondered if someone else might have been as confused by that phrase as he was, and decided that his sentences were concurrent.
justice  typo  punctuation 
january 2018
The Washington Post experiments with automated storytelling to help power 2016 Rio Olympics coverage - The Washington Post
https://twitter.com/wpolympicsbot

The Washington Post will leverage artificial intelligence technology to report key information from the 2016 Rio Olympics, including results of medal events. “Heliograf,” which was developed in-house, automatically generates short multi-sentence updates for readers. These updates will appear in The Post’s live blog, on Twitter at @WPOlympicsbot, and are accessible via The Post’s Olympics skill on Alexa-enabled devices and The Post’s bot for Messenger.

“Automated storytelling has the potential to transform The Post’s coverage. More stories, powered by data and machine learning, will lead to a dramatically more personal and customized news experience,” said Jeremy Gilbert, director of strategic initiatives at The Washington Post. “The Olympics are the perfect way to prove the potential of this technology. In 2014, the sports staff spent countless hours manually publishing event results. Heliograf will free up Post reporters and editors to add analysis, color from the scene and real insight to stories in ways only they can.”
bots  automated-writing 
january 2018
How One Major Internet Company Helps Serve Up Hate on the — ProPublica
The widespread use of Cloudflare’s services by racist groups is not an accident. Cloudflare has said it is not in the business of censoring websites and will not deny its services to even the most offensive purveyors of hate.

“A website is speech. It is not a bomb,” Cloudflare’s CEO Matthew Prince wrote in a 2013 blog post defending his company’s stance. “There is no imminent danger it creates and no provider has an affirmative obligation to monitor and make determinations about the theoretically harmful nature of speech a site may contain.”
internet  caching  censorship  compciv-2018 
january 2018
My Life as a New York Times Reporter in the Shadow of the War on Terror
What angered me most was that while they were burying my skeptical stories, the editors were not only giving banner headlines to stories asserting that Iraq had weapons of mass destruction, they were also demanding that I help match stories from other publications about Iraq’s purported WMD programs. I grew so sick of this that when the Washington Post reported that Iraq had turned over nerve gas to terrorists, I refused to try to match the story. One mid-level editor in the Washington bureau yelled at me for my refusal. He came to my desk carrying a golf club while berating me after I told him that the story was bullshit and I wasn’t going to make any calls on it.
journalism  best 
january 2018
Feeding the Machine: Policing, Crime Data, & Algorithms by Elizabeth Joh :: SSRN
Discussions of predictive algorithms used by the police tend to assume the police are merely end users of big data. Accordingly, police departments are consumers and clients of big data -- not much different than users of Spotify, Netflix, Amazon, or Facebook. Yet this assumption about big data policing contains a flaw. Police are not simply end users of big data. They generate the information that big data programs rely upon. This essay explains why predictive policing programs can’t be fully understood without an acknowledgment of the role police have in creating its inputs. Their choices, priorities, and even omissions become the inputs algorithms use to forecast crime. The filtered nature of crime data matters because these programs promise cutting edge results, but may deliver analyses with hidden limitations.
algorithms  policing  compciv 
january 2018
How Facebook’s news feed algorithm works.
Alison steers me through a maze of cubicles and open minikitchens toward a small conference room, where he promises to demystify the Facebook algorithm’s true nature. On the way there, I realize I need to use the bathroom and ask for directions. An involuntary grimace crosses his face before he apologizes, smiles, and says, “I’ll walk you there.” At first I think it’s because he doesn’t want me to get lost. But when I emerge from the bathroom, he’s still standing right outside, and it occurs to me that he’s not allowed to leave me unattended.   
algorithms  facebook  best  compciv 
december 2017
Get data on nonfatal and fatal police shootings in the 50 largest U.S. police departments – VICE News
https://twitter.com/dataeditor/status/940325163794649088
I'm deeply appreciative of the fine folks at VICE who decided that putting all the standardized police shooting data they meticulously collected in a downloadable format was a good idea. It was.
---

VICE News spent nine months collecting data on both fatal and nonfatal police shootings from the 50 largest local police departments in the United States. For every person shot and killed by cops in these departments from 2010 through 2016, we found, police shot at two more people who survived. We also found that 20 percent of the people cops fired on were unarmed.

We’re making the data public so that others can explore it too. Find your local police department and download the data below. And if you use our data, we’d love to hear about it. Let us know on Twitter, Facebook, or Instagram, or email policeshootings@vice.com.
dataset  investigations  compciv  collaboration  policing  justice 
december 2017
Death & Dysfunction | An NJ.com Special Investigation
Hey listers,

We published an 18-month data investigation into our state medical examiner system I wanted to share:

death.nj.com

We fought for months to acquire a database of all 420,000 cases referred to NJ medical examiners over a 20-year period. Analysis revealed a system that’s on the brink of collapse. Our reporting led us to cases of missing body parts, potential child murders going without investigation, innocent people languishing in jail and major lapses/conflicts of interest in police-involved shooting investigations.

We’ll be posting the data, as well as the replication analysis my colleague did to check my work on data.world in the coming days. We’ll also post our code to github once I clean it up and make it readable for humans.

If you like it, please share and we would love any feedback.
data-journalism  nicar  investigations 
december 2017
The Trouble with Bias - NIPS 2017 Keynote - Kate Crawford #NIPS2017 - YouTube
Kate Crawford is a leading researcher, academic and author who has spent the last decade studying the social implications of data systems, machine learning and artificial intelligence. She is a Distinguished Research Professor at New York University, a Principal Researcher at Microsoft Research New York, and a Visiting Professor at the MIT Media Lab. https://twitter.com/omojumiller/status/940824325107736576
video  AI  compciv 
december 2017
Under Trump, E.P.A. Has Slowed Actions Against Polluters, and Put Limits on Enforcement Officers - The New York Times
The Times built a database of civil cases filed at the E.P.A. during the Trump, Obama and Bush administrations. During the first nine months under Mr. Pruitt’s leadership, the E.P.A. started about 1,900 cases, about one-third fewer than the number under President Barack Obama’s first E.P.A. director and about one-quarter fewer than under President George W. Bush’s over the same time period.
data-journalism  investigations 
december 2017
Official Toll in Puerto Rico: 62. Actual Deaths May Be 1,052. - The New York Times
A review by The New York Times of daily mortality data from Puerto Rico’s vital statistics bureau indicates a significantly higher death toll after the hurricane than the government there has acknowledged.

The Times’s analysis found that in the 42 days after Hurricane Maria made landfall on Sept. 20 as a Category 4 storm, 1,052 more people than usual died across the island. The analysis compared the number of deaths for each day in 2017 with the average of the number of deaths for the same days in 2015 and 2016.

Officially, just 62 people died as a result of the storm that ravaged the island with nearly 150-mile-an-hour winds, cutting off power to 3.4 million Puerto Ricans. The last four fatalities were added to the death toll on Dec. 2.
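
The comparison described above (each day's deaths in 2017 against the average for the same day in 2015 and 2016) can be sketched like this. The counts below are made up purely to show the arithmetic and are not Puerto Rico's figures; `excess_deaths` is a hypothetical helper, and keying the data by (month, day) is an assumption about how one might lay it out.

```python
# Hypothetical daily death counts keyed by (month, day). Invented numbers.
deaths_2015 = {(9, 20): 75, (9, 21): 80, (9, 22): 78}
deaths_2016 = {(9, 20): 85, (9, 21): 82, (9, 22): 80}
deaths_2017 = {(9, 20): 100, (9, 21): 120, (9, 22): 115}


def excess_deaths(current, *baselines):
    """Sum, over each day in `current`, the deaths above the baseline-year average."""
    total = 0.0
    for day, count in current.items():
        # Average of the same calendar day across the baseline years.
        baseline = sum(year[day] for year in baselines) / len(baselines)
        total += count - baseline
    return total


excess = excess_deaths(deaths_2017, deaths_2015, deaths_2016)
```

With real data the loop would cover all 42 days after landfall; days missing from a baseline year would need explicit handling, which this sketch leaves out.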
best  data-journalism  padjo  compciv  death-data  investigations 
december 2017