jerryking + spreadsheets   8

How 5 Data Dynamos Do Their Jobs
June 12, 2019 | The New York Times | By Lindsey Rogers Cook.
[Times Insider explains who we are and what we do, and delivers behind-the-scenes insights into how our journalism comes together.]
Reporters from across the newsroom describe the many ways in which they increasingly rely on datasets and spreadsheets to create groundbreaking work.

Data journalism is not new. It predates our biggest investigations of the last few decades. It predates computers. Indeed, reporters have used data to hold power to account for centuries, as a data-driven investigation that uncovered overspending by politicians, including then-congressman Abraham Lincoln, attests.

But the vast amount of data available now is new. The federal government’s data repository contains nearly 250,000 public datasets. New York City’s data portal contains more than 2,500. Millions more are collected by companies, tracked by think tanks and academics, and obtained by reporters through Freedom of Information Act requests (though not always without a battle). No matter where they come from, these datasets are largely more organized than ever before and more easily analyzed by our reporters.

(1) Karen Zraick, Express reporter.
NYC's Buildings Department said it was merely responding to a sudden spike in 311 complaints about store signs. But who complains about store signs? It was hard to get a sense of the scale of the problem just by collecting anecdotes. So I turned to NYC Open Data, a vast trove of information that includes records about 311 complaints. By sorting and calculating the data, we learned that many of the calls were targeting stores in just a few Brooklyn neighborhoods.
(2) John Ismay, At War reporter
He has multiple spreadsheets for almost every article he works on......Spreadsheets helped him organize all the characters involved and the timeline of what happened as the situation went out of control 50 years ago......saves all the relevant location data he later used in Google Earth to analyze the terrain, which allowed him to ask more informed questions.
(3) Eliza Shapiro, education reporter for Metro
After she found out in March that only seven black students won seats at Stuyvesant, New York City’s most elite public high school, she kept coming back to one big question: How did this happen? I had a vague sense that the city’s so-called specialized schools once looked more like the rest of the city school system, which is mostly black and Hispanic.

With my colleague K.K. Rebecca Lai from The Times’s graphics department, I started to dig into a huge spreadsheet that listed the racial breakdown of each of the specialized schools dating to the mid-1970s.
We analyzed changes in the city’s immigration patterns to better understand why some immigrant groups were overrepresented at the schools and others were underrepresented. We mapped out where the city’s accelerated academic programs are, and found that mostly black and Hispanic neighborhoods have lost them. And we tracked the rise of the local test preparation industry, which has exploded in part to meet the demand of parents eager to prepare their children for the specialized schools’ entrance exam.

To put a human face to the data points we gathered, I collected yearbooks from black and Hispanic alumni and spent hours on the phone with them, listening to their recollections of the schools in the 1970s through the 1990s. The final result was a data-driven article that combined Rebecca’s remarkable graphics, yearbook photos, and alumni reflections.

(4) Reed Abelson, Health and Science reporter
The most compelling stories take powerful anecdotes about patients and pair them with eye-opening data.....Being comfortable with data and spreadsheets allows me to ask better questions about researchers’ studies. Spreadsheets also provide a way of organizing sources, articles and research, as well as creating a timeline of events. By putting information in a spreadsheet, you can quickly access it, and share it with other reporters.

(5) Maggie Astor, Politics reporter
As a political reporter dealing with more than 20 presidential candidates, she uses spreadsheets to track polling, fund-raising, policy positions and so much more. Without them, there’s just no way she could stay on top of such a huge field......She and the climate reporter Lisa Friedman used another spreadsheet to track the candidates’ positions on several climate policies.
311  5_W’s  behind-the-scenes  Communicating_&_Connecting  data  datasets  data_journalism  data_scientists  FOIA  groundbreaking  hidden  information_overload  information_sources  journalism  mapping  massive_data_sets  New_York_City  NYT  open_data  organizing_data  reporters  self-organization  systematic_approaches  spreadsheets  storytelling  timelines  tools 
june 2019 by jerryking
Opinion | The Surprising Benefits of Relentlessly Auditing Your Life
May 25, 2019 | The New York Times | By Amy Westervelt, a journalist and podcaster.

"The unexamined life is not worth living" is a famous dictum apparently uttered by Socrates at his trial for impiety and corrupting youth, for which he was subsequently sentenced to death, as described in Plato's Apology (38a5–6).
analytics  data  evidence_based  happiness  housework  marriage  note_taking  patterns  quality_of_life  quantitative  quantified_self  record-keeping  relationships  relentlessness  self-assessment  self-examination  self-improvement  spreadsheets 
may 2019 by jerryking
Stop Using Excel, Finance Chiefs Tell Staffs
Nov. 22, 2017 | WSJ | By Tatyana Shumsky.

“I don’t want financial planning people spending their time importing and exporting and manipulating data, I want them to focus on what is the data telling us (jk: i.e. "interpretation"),” Mr. Garrett said. He is working on cutting Excel out of this process... for financial planning, analysis and reporting.

Finance chiefs say the ubiquitous spreadsheet software that revolutionized accounting in the 1980s hasn’t kept up with the demands of contemporary corporate finance units. Errors can bloom because data in Excel is separated from other systems and isn’t automatically updated. Older versions of Excel don’t allow multiple users to work together in one document, hampering collaboration. There is also a limit to how much data can be pulled into a single document, which can slow down analysis....Instead, companies are turning to new, cloud-based technologies from Anaplan Inc., Workiva Inc., Adaptive Insights and their competitors....The newer software connects with existing accounting and enterprise resource management systems, including those made by Oracle Corp. or SAP SE. This lets accountants aggregate, analyze and report data on one unified platform, often without additional training.
CFOs  errors  Excel  interpretation  financial_planning  spreadsheets 
november 2017 by jerryking
Novartis’s new chief sets sights on ‘productivity revolution’
SEPTEMBER 25, 2017 | Financial Times | Sarah Neville and Ralph Atkins.

The incoming chief executive of Novartis, Vas Narasimhan, has vowed to slash drug development costs, eyeing savings of up to 25 per cent on multibillion-dollar clinical trials as part of a “productivity revolution” at the Swiss drugmaker.

The time and cost of taking a medicine from discovery to market has long been seen as the biggest drag on the pharmaceutical industry’s performance, with the process typically taking up to 14 years and costing at least $2.5bn.

In his first interview as CEO-designate, Dr Narasimhan says analysts have estimated between 10 and 25 per cent could be cut from the cost of trials if digital technology were used to carry them out more efficiently. The company has 200 drug development projects under way and is running 500 trials, so “that will have a big effect if we can do it at scale”.......Dr Narasimhan plans to partner with, or acquire, artificial intelligence and data analytics companies, to supplement Novartis’s strong but “scattered” data science capability.....“I really think of our future as a medicines and data science company, centred on innovation and access.”

He must now decide where Novartis has the capability “to really create unique value . . . and where is the adjacency too far?”.....Does he need the cash pile that would be generated by selling off these parts of the business to realise his big data vision? He says: “Right now, on data science, I feel like it’s much more about building a culture and a talent base . . . ...Novartis has “a huge database of prior clinical trials and we know exactly where we have been successful in terms of centres around the world recruiting certain types of patients, and we’re able to now use advanced analytics to help us better predict where to go . . . to find specific types of patients.

“We’re finding that we’re able to significantly reduce the amount of time that it takes to execute a clinical trial and that’s huge . . . You could take huge cost out.”...Dr Narasimhan cites one inspiration as a visit to Disney World with his young children where he saw how efficiently people were moved around the park, constantly monitored by “an army of [Massachusetts Institute of Technology-]trained data scientists”.
He has now harnessed similar technology to overhaul the way Novartis conducts its global drug trials. His clinical operations teams no longer rely on Excel spreadsheets and PowerPoint slides, but instead “bring up a screen that has a predictive algorithm that in real time is recalculating what is the likelihood our trials enrol, what is the quality of our clinical trials”.

“For our industry I think this is pretty far ahead,” he adds.

More broadly, he is realistic about the likely attrition rate. “We will fail at many of these experiments, but if we hit on a couple of big ones that are transformative, I think you can see a step change in productivity.”
algorithms  analytics  artificial_intelligence  attrition_rates  CEOs  data_driven  data_scientists  drug_development  failure  Indian-Americans  multiple_targets  Novartis  pharmaceutical_industry  predictive_analytics  productivity  productivity_payoffs  product_development  real-time  scaling  spreadsheets  Vas_Narasimhan 
november 2017 by jerryking
We Survived Spreadsheets, and We’ll Survive AI - WSJ
By Greg Ip
Updated Aug. 2, 2017

History and economics show that when an input such as energy, communication or calculation becomes cheaper, we find many more uses for it. Some jobs become superfluous, but others more valuable, and brand new ones spring into existence. Why should AI be different?

Back in the 1860s, the British economist William Stanley Jevons noticed that when more-efficient steam engines reduced the coal needed to generate power, steam power became more widespread and coal consumption rose. More recently, a Massachusetts Institute of Technology-led study found that as semiconductor manufacturers squeezed more computing power out of each unit of silicon, the demand for computing power shot up, and silicon consumption rose.

The “Jevons paradox” is true of information-based inputs, not just materials like coal and silicon......Just as spreadsheets drove costs down and demand up for calculations, machine learning—the application of AI to large data sets—will do the same for predictions, argue Ajay Agrawal, Joshua Gans and Avi Goldfarb, who teach at the University of Toronto’s Rotman School of Management. “Prediction about uncertain states of the world is an input into decision making,” they wrote in a recent paper. .....Unlike spreadsheets, machine learning doesn’t yield exact answers. But it reduces the uncertainty around different risks. For example, AI makes mammograms more accurate, the authors note, so doctors can better judge when to conduct invasive biopsies. That makes the doctor’s judgment more valuable......Machine learning is statistics on steroids: It uses powerful algorithms and computers to analyze far more inputs, such as the millions of pixels in a digital picture, and not just numbers but images and sounds. It turns combinations of variables into yet more variables, until it maximizes its success on questions such as “is this a picture of a dog” or at tasks such as “persuade the viewer to click on this link.”.....Yet as AI gets cheaper, so its potential applications will grow. Just as better weather forecasting makes us more willing to go out without an umbrella, Mr. Manzi says, AI emboldens companies to test more products, strategies and hunches: “Theories become lightweight and disposable.” They need people who know how to use it, and how to act on the results.
artificial_intelligence  Greg_Ip  spreadsheets  machine_learning  predictions  paradoxes  Jim_Manzi  experimentation  testing  massive_data_sets  judgment  uncertainty  economists  algorithms  MIT  Gilder's_Law  speed  steam_engine  operational_tempo  Jevons_paradox  decision_making 
august 2017 by jerryking
Water Data Deluge: Addressing the California Drought Requires Access to Accurate Data - The CIO Report - WSJ
April 22, 2015 | WSJ | By KIM S. NASH.

California, now in its fourth year of drought, is collecting more data than ever from utilities, municipalities and other water providers about just how much water flows through their pipes....The data-collection process, built on monthly self-reporting and spreadsheets, is critical to informing such policy decisions, which affect California’s businesses and 38.8 million residents. Some say the process, with a built-in lag time of two weeks between data collection and actionable reports, could be better, allowing for more effective, fine-tuned management of water.

“More data and better data will allow for more nuanced approaches and potentially allow the water system to function more efficiently,”...“Right now, there are inefficiencies in the system and they don’t know exactly where, so they have to resort to blanket policy responses.”...the State Water Resources Control Board imports the data into a spreadsheet to tabulate and compare with prior months. Researchers then cleanse the data, find and resolve anomalies and create graphics to show what’s happened with water in the last month. The process takes about 2 weeks....accuracy is an issue in any self-reporting scenario...while data management could be improved by installing smart meters to feed information directly to the Control Board automatically... there are drawbacks to any technology. Smart meters can fail, for example. “The nice thing about spreadsheets is anyone can open it up and immediately see everything there,”
lag_time  water  California  data  spreadsheets  inefficiencies  municipalities  utilities  bureaucracies  droughts  vulnerabilities  self-reporting  decision_making  Industrial_Internet  SPOF  bottlenecks  data_management  data_quality  data_capture  data_collection 
april 2015 by jerryking
Fresh Produce Group Chooses NetSuite Over the Competition
Previous systems provided limited visibility into company financial performance.
Vital information had to be retrieved from multiple sources, leading to frustrating delays in financial and management reporting.
High levels of manual processing were required to maintain spreadsheets for forecasting and inventory management, which was costly and prone to error.
An inefficient paper-based inventory management system meant perishable produce was regularly wasted.
Hours were also lost every week locating pallets on the warehouse floor.
Non-financial staff had very limited access to vital business data needed to be more accountable in their roles.
fresh_produce  ERP  challenges  information  IT  perishables  OPMA  spreadsheets  inefficiencies 
june 2014 by jerryking

