jerryking + data_journalism   12

How 5 Data Dynamos Do Their Jobs
June 12, 2019 | The New York Times | By Lindsey Rogers Cook.
[Times Insider explains who we are and what we do, and delivers behind-the-scenes insights into how our journalism comes together.]
Reporters from across the newsroom describe the many ways in which they increasingly rely on datasets and spreadsheets to create groundbreaking work.

Data journalism is not new. It predates our biggest investigations of the last few decades. It predates computers. Indeed, reporters have used data to hold power to account for centuries, as a data-driven investigation that uncovered overspending by politicians, including then-congressman Abraham Lincoln, attests.

But the vast amount of data available now is new. The federal government’s data repository contains nearly 250,000 public datasets. New York City’s data portal contains more than 2,500. Millions more are collected by companies, tracked by think tanks and academics, and obtained by reporters through Freedom of Information Act requests (though not always without a battle). No matter where they come from, these datasets are largely more organized than ever before and more easily analyzed by our reporters.

(1) Karen Zraick, Express reporter.
NYC's Buildings Department said it was merely responding to a sudden spike in 311 complaints about store signs. But who complains about store signs?....it was hard to get a sense of the scale of the problem just by collecting anecdotes. So I turned to NYC Open Data, a vast trove of information that includes records about 311 complaints. By sorting and calculating the data, we learned that many of the calls were targeting stores in just a few Brooklyn neighborhoods.
(2) John Ismay, At War reporter
He has multiple spreadsheets for almost every article he works on......Spreadsheets helped him organize all the characters involved and the timeline of what happened as the situation went out of control 50 years ago......saves all the relevant location data he later used in Google Earth to analyze the terrain, which allowed him to ask more informed questions.
(3) Eliza Shapiro, education reporter for Metro
After she found out in March that only seven black students won seats at Stuyvesant, New York City’s most elite public high school, she kept coming back to one big question: How did this happen? I had a vague sense that the city’s so-called specialized schools once looked more like the rest of the city school system, which is mostly black and Hispanic.

With my colleague K.K. Rebecca Lai from The Times’s graphics department, I started to dig into a huge spreadsheet that listed the racial breakdown of each of the specialized schools dating to the mid-1970s.
analyzed changes in the city’s immigration patterns to better understand why some immigrant groups were overrepresented at the schools and others were underrepresented. We mapped out where the city’s accelerated academic programs are, and found that mostly black and Hispanic neighborhoods have lost them. And we tracked the rise of the local test preparation industry, which has exploded in part to meet the demand of parents eager to prepare their children for the specialized schools’ entrance exam.

To put a human face to the data points we gathered, I collected yearbooks from black and Hispanic alumni and spent hours on the phone with them, listening to their recollections of the schools in the 1970s through the 1990s. The final result was a data-driven article that combined Rebecca’s remarkable graphics, yearbook photos, and alumni reflections.

(4) Reed Abelson, Health and Science reporter
the most compelling stories take powerful anecdotes about patients and pair them with eye-opening data.....Being comfortable with data and spreadsheets allows me to ask better questions about researchers’ studies. Spreadsheets also provide a way of organizing sources, articles and research, as well as creating a timeline of events. By putting information in a spreadsheet, you can quickly access it, and share it with other reporters.

(5) Maggie Astor, Politics reporter
a political reporter dealing with more than 20 presidential candidates, she uses spreadsheets to track polling, fund-raising, policy positions and so much more. Without them, there’s just no way she could stay on top of such a huge field......The climate reporter Lisa Friedman and she used another spreadsheet to track the candidates’ positions on several climate policies.
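As an illustration of the "sorting and calculating" step Karen Zraick describes in (1), here is a minimal Python sketch that tallies complaints of one type by neighborhood. The rows, field names and neighborhood labels are hypothetical placeholders, not actual NYC Open Data records, and this is not presented as the method The Times used.

```python
from collections import Counter

# Hypothetical sample rows for illustration only; real 311 exports
# may use different column names and many more fields.
complaints = [
    {"complaint_type": "Illegal Sign", "neighborhood": "Neighborhood A"},
    {"complaint_type": "Illegal Sign", "neighborhood": "Neighborhood A"},
    {"complaint_type": "Illegal Sign", "neighborhood": "Neighborhood B"},
    {"complaint_type": "Illegal Sign", "neighborhood": "Neighborhood A"},
    {"complaint_type": "Noise", "neighborhood": "Neighborhood C"},
]


def count_by_neighborhood(rows, complaint_type):
    """Count complaints of one type per neighborhood, most frequent first."""
    counts = Counter(
        row["neighborhood"]
        for row in rows
        if row["complaint_type"] == complaint_type
    )
    return counts.most_common()


ranked = count_by_neighborhood(complaints, "Illegal Sign")
total = sum(n for _, n in ranked)

# Show each neighborhood's share of the sign complaints.
for name, n in ranked:
    print(f"{name}: {n} ({n / total:.0%} of sign complaints)")
```

Sorting by count and showing each area's share makes it easy to see whether a handful of neighborhoods account for most of the calls.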
311  5_W’s  behind-the-scenes  Communicating_&_Connecting  data  datasets  data_journalism  data_scientists  FOIA  groundbreaking  hidden  information_overload  information_sources  journalism  mapping  massive_data_sets  New_York_City  NYT  open_data  organizing_data  reporters  self-organization  systematic_approaches  spreadsheets  storytelling  timelines  tools 
10 weeks ago by jerryking
The Art of Statistics by David Spiegelhalter
May 6, 2019 | Financial Times | Review by Alan Smith.

The Art of Statistics, by Sir David Spiegelhalter, former president of the UK’s Royal Statistical Society and current Winton professor of the public understanding of risk at the University of Cambridge.

The comparison with Rosling is easy to make, not least because Spiegelhalter is humorously critical of his own field which, by his reckoning, has spent too much time arguing with itself over “the mechanical application of a bag of statistical tools, many named after eccentric and argumentative statisticians”.

His latest book, its title,
books  book_reviews  charts  Communicating_&_Connecting  data  data_journalism  data_scientists  Hans_Rosling  listening  massive_data_sets  mathematics  statistics  visualization 
may 2019 by jerryking
Meet Amanda Cox, Who Brings Life to Data on Our Pages
Feb. 28, 2019 | The New York Times | By Jake Lucas

Ms. Cox was stepping into a new role: data editor. She will help coordinate data work across departments, in interactive news, computer-assisted reporting, graphics and The Upshot, and pave the way for journalism using data to play a bigger role throughout the newsroom. She will also act as an adviser when big questions arise about how to think about and use data thoughtfully, without overstating what it supports.
charts  Communicating_&_Connecting  data  data_journalism  infographics  NYT  quantitative  visualization 
march 2019 by jerryking
Piecing Together Narratives From the 0's and 1's: Storytelling in the Age of Big Data - CIO Journal. - WSJ
Feb 16, 2018 | WSJ | By Irving Wladawsky-Berger.

Probabilities are inherently hard to grasp, especially for an individual event like a war or an election, ......Why is it so hard for people to deal with probabilities in everyday life? “I think part of the answer lies with Kahneman’s insight: Human beings need a story,”....Mr. Kahneman explained their research in his 2011 bestseller Thinking, Fast and Slow. Its central thesis is that our mind is composed of two very different systems of thinking. System 1 is the intuitive, fast and emotional part of our mind. Thoughts come automatically and very quickly to System 1, without us doing anything to make them happen. System 2, on the other hand, is the slower, logical, more deliberate part of the mind. It’s where we evaluate and choose between multiple options, because only System 2 can think of multiple things at once and shift its attention between them.

System 1 typically works by developing a coherent story based on the observations and facts at its disposal. Research has shown that the intuitive System 1 is actually more influential in our decisions, choices and judgements than we generally realize. But, while enabling us to act quickly, System 1 is prone to mistakes. It tends to be overconfident, creating the impression that we live in a world that’s more coherent and simpler than the actual real world. It suppresses complexity and information that might contradict its coherent story.

Making sense of probabilities, numbers and graphs requires us to engage System 2, which, for most everyone, takes quite a bit of focus, time and energy. Thus, most people will try to evaluate the information using a System 1 simple story: who will win the election? who will win the football game?.....Storytelling has played a central role in human communications since times immemorial. Over the centuries, the nature of storytelling has significantly evolved with the advent of writing and the emergence of new technologies that enabled stories to be embodied in a variety of media, including books, films, and TV. Everything else being equal, stories are our preferred way of absorbing information.

“It’s not enough to say an event has a 10 percent probability,” wrote Mr. Leonhardt. “People need a story that forces them to visualize the unlikely event – so they don’t round 10 to zero.”.....
in_the_real_world  storytelling  massive_data_sets  probabilities  Irving_Wladawsky-Berger  Communicating_&_Connecting  Daniel_Kahneman  complexity  uncertainty  decision_making  metacognition  data_journalism  sense-making  thinking_deliberatively 
february 2018 by jerryking
Five Steps to Get Started with Data Journalism
May 6, 2015 | ICFJ - International Center for Journalists | by Alexandra Ludka, Communications Officer.
data_journalism  data_driven  data  Communicating_&_Connecting 
may 2015 by jerryking
Sponsor Generated Content: 4 Industries Most in Need of Data Scientists
June 16, 2014 12:00 am ET
NARRATIVES by WSJ. Custom Studios for SAS

Agriculture
Relying on sensors in farm machinery, in soil and on planes flown over fields, precision agriculture is an emerging practice in which growing crops is directed by data covering everything from soil conditions to weather patterns to commodity pricing. “Precision agriculture helps you optimize yield and avoid major mistakes,” says Daniel Castro, director of the Center for Data Innovation, a think tank in Washington, D.C. For example, farmers traditionally have planted a crop, then applied fertilizer uniformly across entire fields. Data models allow them to instead customize the spread of fertilizer, seed, water and pesticide across different areas of their farms—even if the land rolls on for 50,000 acres.

Finance
Big data promises to discover better models to gauge risk, which could minimize the likelihood of scenarios such as the subprime mortgage meltdown. Data scientists, though, also are charged with many less obvious tasks in the financial industry, says Bill Rand, director of the Center for Complexity in Business at the University of Maryland. He points to one experiment that analyzed keywords in financial documents to identify competitors in different niches, helping pinpoint investment opportunities.

Government
Government organizations have huge stockpiles of data that can be applied against all sorts of problems, from food safety to terrorism. Joshua Sullivan, a data scientist who led the development of Booz Allen Hamilton’s The Field Guide to Data Science, cites one surprising use of analytics concerning government subsidies. “They created an amazing visualization that helped you see the disconnect between the locations of food distribution sites and the populations they served,” Sullivan says. “That's the type of thing that isn't easy to see in a pile of static reports; you need the imagination of a data scientist to depict the story in the data.”

Pharma
Developing a new drug can take more than a decade and cost billions. Data tools can help take some of the sting out, pinpointing the best drug candidates by scanning across pools of information, such as marketing data and adverse patient reactions. “We can model data and prioritize which experiments we take [forward],” Sullivan says. “Big data can help sort out the most promising drugs even before you do experiments on mice. Just three years ago that would have been impossible. But that's what data scientists do—they tee up the right question to ask.”
drug_development  precision_agriculture  farming  data_scientists  agriculture  massive_data_sets  data  finance  government  pharmaceutical_industry  product_development  non-obvious  storytelling  data_journalism  stockpiles 
june 2014 by jerryking
Profile of the Data Journalist: The Storyteller and The Teacher
Around the globe, the bond between data and journalism is growing stronger. In an age of big data, the growing importance of data journalism lies in the ability of its practitioners to provide context, clarity and, perhaps most important, find truth in the expanding amount of digital content in the world. In that context, data journalism has profound importance for society.

To learn more about the people who are doing this work and, in some cases, building the newsroom stack for the 21st century, I conducted in-person and email interviews during the 2012 NICAR Conference and published a series of data journalist profiles here at Radar.

Sarah Cohen (@sarahduke), the Knight professor of the practice of journalism and public policy at Duke University, and Anthony DeBarros (@AnthonyDB), the senior database editor at USA Today, were both important sources of historical perspective for my feature on how data journalism is evolving from "computer-assisted reporting" (CAR) to a powerful Web-enabled practice that uses cloud computing, machine learning and algorithms to make sense of unstructured data.

The latter halves of our interviews, which focused upon their personal and professional experience, follow.

What data journalism project are you the most proud of working on or creating?

DeBarros: "In 2006, my USA TODAY colleague Robert Davis and I built a database of 620 students killed on or near college campuses and mined it to show how freshmen were uniquely vulnerable. It was a heart-breaking but vitally important story to tell. We won the 2007 Missouri Lifestyle Journalism Awards for the piece, and followed it with an equally wrenching look at student deaths from fires."

Cohen: "I'd have to say the Pulitzer-winning series on child deaths in DC, in which we documented that children were dying in predictable circumstances after key mistakes by people who knew that their agencies had specific flaws that could let them fall through the cracks.

I liked working on the Post's POTUS Tracker and Head Count. Those were Web projects that were geared at accumulating lots of little bits about Obama's schedule and his appointees, respectively, that we could share with our readers while simultaneously building an important dataset for use down the road. Some of the Post's Solyndra and related stories, I have heard, came partly from studying the president's trips in POTUS Tracker.

There was one story, called "Misplaced Trust," on DC's guardianship system, that created immediate change in Superior Court, which was gratifying. "Harvesting Cash," our 18-month project on farm subsidies, also helped point out important problems in that system.

The last one, I'll note, is a piece of a project I worked on, in which the DC water authority refused to release the results of a massive lead testing effort, which in turn had shown widespread contamination. We got the survey from a source, but it was on paper.

After scanning, parsing, and geocoding, we sent out a team of reporters to neighborhoods to spot check the data, and also do some reporting on the neighborhoods. We ended up with a story about people who didn't know what was near them.

We also had an interesting experience: the water authority called our editor to complain that we were going to put all of the addresses online -- they felt that it was violating people's privacy, even though we weren't identifying the owners or the residents. It was more important to them that we keep people in the dark about their blocks. Our editor at the time, Len Downie, said, "you're right. We shouldn't just put it on the Web." He also ordered up a special section to put them all in print."

Where do you turn to keep your skills updated or learn new things?

Cohen: "It's actually a little harder now that I'm out of the newsroom, surprisingly. Before, I would just dive into learning something when I'd heard it was possible and I wanted to use it to get to a story. Now I'm less driven, and I have to force myself a little more. I'm hoping to start doing more reporting again soon, and that the Reporters' Lab will help there too.

Lately, I've been spending more time with people from other disciplines to understand better what's possible, like machine learning and speech recognition at Carnegie Mellon and MIT, or natural language processing at Stanford. I can't DO them, but getting a chance to understand what's out there is useful. NewsFoo, SparkCamp and NICAR are the three places that had the best bang this year. I wish I could have gone to Strata, even if I didn't understand it all."

DeBarros: For surveillance, I follow really smart people on Twitter and have several key Google Reader subscriptions.

To learn, I spend a lot of time training after work hours. I've really been pushing myself in the last couple of years to up my game and stay relevant, particularly by learning Python, Linux and web development. Then I bring it back to the office and use it for web scraping and app building.

Why are data journalism and "news apps" important, in the context of the contemporary digital environment for information?

Cohen: "I think anything that gets more leverage out of fewer people is important in this age, because fewer people are working full time holding government accountable. The news apps help get more eyes on what the government is doing by getting more of what we work with and let them see it. I also think it helps with credibility -- the 'show your work' ethos -- because it forces newsrooms to be more transparent with readers / viewers.

For instance, now, when I'm judging an investigative prize, I am quite suspicious of any project that doesn't let you see each item, i.e., when they say, 'there were 300 cases that followed this pattern,' I want to see all 300 cases, or all cases with the 300 marked, so I can see whether I agree."

DeBarros: "They're important because we're living in a data-driven culture. A data-savvy journalist can use the Twitter API or a spreadsheet to find news as readily as he or she can use the telephone to call a source. Not only that, we serve many readers who are accustomed to dealing with data every day -- accountants, educators, researchers, marketers. If we're going to capture their attention, we need to speak the language of data with authority. And they are smart enough to know whether we've done our research correctly or not.

As for news apps, they're important because -- when done right -- they can make large amounts of data easily understood and relevant to each person using them."

These interviews were edited and condensed for clarity.
Data  Gov_2.0  Publishing  dataproduct  datascience  nicarinterview  via:rahuldave  show_your_work  narratives  sense-making  unstructured_data  data_driven  data_journalism  visualization  infographics 
february 2013 by jerryking
Change or die: could adland be the new Detroit?
Feb 18, 2011 | Campaign | Amelia Torode (head of strategy and innovation at VCCP and the chair of the IPA Strategy Group) and Tracey Follows (head of planning at VCCP)...

As the world changed with the globalisation of markets, the transformative power of digital technologies and a shift in consumer demand, the automotive industry and the city of Detroit did not. At a fundamental level, nothing changed. Detroit failed to adapt, failed to evolve.

We have started to ask ourselves: is adland the new Detroit?

Data: find stories in numbers

It's time to reimagine our role. We're no longer solving problems but investigating mysteries; no longer taking a brief, rather taking on a case. Like a detective, we start with behaviour, looking for patterns and anomalies. We assume that what we're being told is not entirely the "truth" so search for information that is given from various perspectives and tend to believe our eyes more than our ears.

Imagine the implications for how we approach data. Seen through the lens of "mystery", we're not simply seeing data as a stream of numbers but as a snapshot of behaviour and an insight into human nature. What we do with data is the same thing we do when we sit on a park bench or at a pavement café - people-watching, albeit from desktops. It's human stories hidden within numbers, and it takes away the fear that surrounds "big data".
shifting_tastes  data-driven  data_journalism  Detroit  advertising_agencies  data  storytelling  massive_data_sets  adaptability  evolution  United_Kingdom  Publicis  managing_change  sense-making  insights  behaviours  patterns  anomalies  assumptions  automotive_industry  human_experience  curiosity  consumer_behavior 
december 2012 by jerryking
The Gripping Statistic: How to Make Your Data Matter
Mon Aug 10, 2009 | Fast Company | By Dan Heath & Chip Heath.
A good statistic is one that aids a decision or shapes an opinion. For a stat to do either of those, it must be dragged within the everyday (e.g. using ratios or useful analogies). That's your job -- to do the dragging. In our world of billions and trillions, that can be a lot of manual labor. But it's worth it: A number people can grasp is a number that can make a difference.
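One way to do that "dragging" is to restate a large total as a per-person figure, or a probability as "1 in N." The minimal Python sketch below shows both conversions. The budget and population figures are made up for illustration, and the helper names are hypothetical rather than anything taken from the article.

```python
def per_person(total, population):
    """Spread a large total evenly across a population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total / population


def one_in_n(probability):
    """Restate a probability (0 < p <= 1) as the N in '1 in N'."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return round(1 / probability)


# Made-up figures, used only to show the arithmetic.
budget = 1_000_000_000
residents = 8_000_000

print(f"${budget:,} is about ${per_person(budget, residents):,.0f} per resident")
print(f"A 10% chance is about 1 in {one_in_n(0.10)}")
```

The same arithmetic underlies most "everyday" framings: divide the headline number by something the reader already has a feel for.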
analogies  base_rates  Cisco  Communicating_&_Connecting  contextual  data  data_journalism  high-impact  mathematics  narratives  numeracy  persuasion  probabilities  ratios  statistics  storytelling  sense-making  value_creation 
september 2009 by jerryking

