Mary Norris Muses on a Lifetime of Literary Vigilance in ‘Between You & Me’ - The New York Times
In such cases, “I always think: ‘The writer likes that comma. That comma is doing something,’ ” she said. “And sometimes I take it out, and sometimes I leave it in.”
5 days ago
Clause and Effect - The New York Times
Refreshing though it is to see punctuation at the center of a national debate, there could scarcely be a worse place to search for the framers’ original intent than their use of commas. In the 18th century, punctuation marks were as common as medicinal leeches and just about as scientific. Commas and other marks evolved from a variety of symbols meant to denote pauses in speaking. For centuries, punctuation was as chaotic as individual speech patterns.
5 days ago
Zero to 353 Pages: Bringing My Web Book to Print and eBook – journal.stuffwithstuff.com
After a bunch of monkeying around, I found a new grid. Instead of a vertical grid where prose is every three grid lines and code is every two, I bumped the fraction to ¾. This opened up the code and asides a bit relative to the text. I brought down the top margin and gave myself more than enough breathing room near the spine.
publishing  book 
7 days ago
How I Made $70,714.20 Self-Publishing a Book About Ruby on Rails
Summary: A step-by-step walkthrough of how I made a nearly-full-time income from my Ruby on Rails course in 2016, and how it radically transformed my freelancing career. (7,121 words/35 minutes)
7 days ago
Code is not literature
As I prepared my presentation, I found myself falling into my usual pattern when trying to really understand a piece of code—in order to grok it I have to essentially rewrite it. I’ll start by renaming a few things so they make more sense to me and then I’ll move things around to suit my ideas about how to organize code. Pretty soon I’ll have gotten deep into the abstractions (or lack thereof) of the code and will start making bigger changes to the structure of the code. Once I’ve completely rewritten the thing I usually understand it pretty well and can even go back to the original and understand it too. I have always felt kind of bad about this approach to code reading but it's the only thing that's ever worked for me.
8 days ago
Secure computing for journalists – A Few Thoughts on Cryptographic Engineering
No, this is not a stupid question. Actually it’s an extremely important question, and judging by some of the responses to this Tweet there are a lot of other people who are confused about the answer.
security  privacy  journalism 
8 days ago
Readings in Database Systems, 5th Edition
Readings in Database Systems (commonly known as the "Red Book") has offered readers an opinionated take on both classic and cutting-edge research in the field of data management since 1988. Here, we present the Fifth Edition of the Red Book — the first in over ten years.
database  sql  databases  book 
8 days ago
[1703.03107] Online Human-Bot Interactions: Detection, Estimation, and Characterization
Increasing evidence suggests that a growing amount of social media content is generated by autonomous entities known as social bots. In this work we present a framework to detect such entities on Twitter. We leverage more than a thousand features extracted from public data and meta-data about users: friends, tweet content and sentiment, network patterns, and activity time series. We benchmark the classification framework by using a publicly available dataset of Twitter bots. This training data is enriched by a manually annotated collection of active Twitter users that include both humans and bots of varying sophistication. Our models yield high accuracy and agreement with each other and can detect bots of different nature. Our estimates suggest that between 9% and 15% of active Twitter accounts are bots. Characterizing ties among accounts, we observe that simple bots tend to interact with bots that exhibit more human-like behaviors. Analysis of content flows reveals retweet and mention strategies adopted by bots to interact with different target groups. Using clustering analysis, we characterize several subclasses of accounts, including spammers, self promoters, and accounts that post content from connected applications.
8 days ago
Logic Gates - Building an ALU
The goal of this tutorial is to understand the basics of building complex circuits from simple AND, OR, NOT and XOR logical gates. (We have studied in class the functionalities of the corresponding bitwise operators.) This tutorial will teach you how to build an Arithmetic Logic Unit (ALU) from scratch, using these simple logic gates and other components. Read each tutorial step carefully and complete the activities listed in each step.
The ALU will take in two 32-bit values, and 2 control lines. Depending on the value of the control lines, the output will be the addition, subtraction, bitwise AND or bitwise OR of the inputs. Schematically, here is what we want to build:
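As a rough software sketch of the behavior described above (not the tutorial's own circuit), the following models a 32-bit ALU whose adder uses only AND, OR and XOR on single bits. The names `full_adder`, `ripple_add` and `alu`, and the mapping of the two control lines to operations, are assumptions made for illustration; the excerpt does not say which control value selects which operation.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits


def full_adder(a, b, carry_in):
    """Add three single bits using only XOR, AND and OR."""
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (partial & carry_in)
    return total, carry_out


def ripple_add(x, y, carry_in=0):
    """Chain 32 full adders; the carry out of the top bit is dropped."""
    result = 0
    carry = carry_in
    for i in range(32):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result


def alu(x, y, c1, c0):
    """Hypothetical control encoding: 00 add, 01 subtract, 10 AND, 11 OR."""
    x &= MASK32
    y &= MASK32
    if (c1, c0) == (0, 0):
        return ripple_add(x, y)
    if (c1, c0) == (0, 1):
        # Two's-complement subtraction: x - y == x + NOT(y) + 1
        return ripple_add(x, ~y & MASK32, 1)
    if (c1, c0) == (1, 0):
        return x & y
    return x | y
```

Subtraction reuses the adder by inverting the second input and feeding a 1 into the first carry, which is one common way to get both operations from a single chain of full adders.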
computer-engineering  logic 
8 days ago
The Elements of Computing Systems / Nisan & Schocken
And of the book The Elements of Computing Systems, MIT Press, By Noam Nisan and Shimon Schocken

The site contains all the software tools and project materials necessary to build a general-purpose computer system from the ground up. We also provide a set of lectures designed to support a typical course on the subject.

The materials are aimed at students, instructors, and self-learners. Everything is free and open-source; as long as you operate in a non-profit educational setting, you are welcome to modify and use our materials as you see fit.

The materials also support two courses that we now teach in Coursera:

Nand2Tetris Part I (hardware, projects/chapters 1-6) is offered as an on-demand course that learners take at their own pace. Here is a two-minute video promo of this course.

Nand2Tetris Part II (software, projects/chapters 7-12) is also offered on Coursera, in the same format.
tutorial  programming  hardware  computer  book 
8 days ago
Trump Hires Three Men for Every Woman - Bloomberg
Women have been named to 27 percent of the appointed roles filled by President Donald Trump so far, according to a Bloomberg News analysis of records newly released by the federal government. That number falls far short of overall representation in the U.S. labor force, where women account for 47 percent.
11 days ago
Easy Does It: More Usable CAPTCHAs
Websites present users with puzzles called CAPTCHAs to curb abuse caused by computer algorithms masquerading as people. While CAPTCHAs are generally effective at stopping abuse, they might impair website usability if they are not properly designed. In this paper we describe how we designed two new CAPTCHA schemes for Google that focus on maximizing usability. We began by running an evaluation on Amazon Mechanical Turk with over 27,000 respondents to test the usability of different feature combinations. Then we studied user preferences using Google’s consumer survey infrastructure. Finally, drawing on the insights gleaned during those studies, we tested our new CAPTCHA schemes first on Mechanical Turk and then on a fraction of production traffic. The resulting scheme is now an integral part of our production system and is served to millions of users. Our scheme achieved a 95.3% human accuracy, a 6.7% improvement.
computer-vision  captcha 
13 days ago
Election DataBot - ProPublica
The most interesting campaign data in near-real-time, including campaign finance filings, Google search trends, vote activity from sitting members of congress, new polls, forecasts from 538, and Cook Political Report race ratings. Here's more on how to use the Election DataBot and our sources and methodology.
politics  journalism  data  bots 
18 days ago
The O.R. factory: High volume, big dollars, rising tension at Swedish’s Cherry Hill hospital
From NICAR email:

The main story documents the rise of the Swedish Neuroscience Institute and its top surgeons. We mined the state’s Comprehensive Abstract Reporting System, which provided case-level data for all patients admitted to hospitals and allowed us to quantify caseloads and billed charges for the top neuro and spine specialists. Data also show some patients have undergone more invasive surgeries than available alternatives, particularly in the treatment of brain aneurysms...

Hey folks, just wanted to draw your attention to our latest project. Over the weekend, the Seattle Times began publishing “Quantity of Care,” an ongoing investigation into one of Seattle’s most respected hospitals and its neuroscience institute. The Times spent a year digging into the inner workings of Swedish-Cherry Hill and found an institution in turmoil. The hospital had shifted its business approach to incentivize high volume, big dollar surgical procedures, enriching its star brain and spine surgeons in the process. As caseloads increased, so did the concerns of other doctors and medical staff from inside the building: unnecessary surgeries, high rates of complications, issues of patient safety.
compciv  padjo  data-journalism  compjoproject 
4 weeks ago
Learn to Live with Academic Rankings | November 2016 | Communications of the ACM
As an academic, I also produce such numbers. I assign grades to my students. I strive to have the assigned grade accurately reflect a student's grasp of the material in my course. But I know this is imperfect. At best, the grade reflects the student's knowledge today. When a prospective employer looks at it two years later, it is possible an A student had crammed for the exam and has since completely forgotten the material, while a B student deepened his or her understanding substantially through a subsequent internship. The employer must learn to get past the grade to develop a richer understanding of the student's strengths and weaknesses.
7 weeks ago
One Dataset, Visualized 25 Ways | FlowingData
Looking at more advanced visualization, you might find yourself wanting to do the same or some variation. That’s good. But if you’re brand new to the practice, programming, or the software, it might feel like a long path to get to where you want to go. That’s fine too.
advice  data  visualization 
8 weeks ago
Kissinger’s Files and Invisible Ink Recipes: C.I.A. Trove Has It All - The New York Times
For those who believe the truth is out there, the website has a collection of reports on unidentified flying objects, and capitalized on interest in last year’s “X-Files” reboot by posting the “top five documents Mulder would love to get his hands on.”

After journalists at MuckRock, a news site, filed Freedom of Information Act requests for access to the Crest database, the C.I.A. said in 2015 that it would take 28 years to publish. In 2015, the agency cut its estimate to six years, and said the documents would be delivered on 1,200 compact discs at the price of $108,000.

Put off by what he perceived as stalling, Mr. Best crowdfunded $15,000 to print, scan and publish files himself. In October, the C.I.A. said it would post the files.

“C.I.A. made significant architectural and procedural changes to load and index the Crest documents more quickly,” said Heather Fritz Horniak, a spokeswoman for the C.I.A. “This means that we were able to post the entire Crest collection, totaling nearly 13 million pages, online much earlier than anticipated.”
8 weeks ago
Orphan Drug Rules Manipulated By Industry To Create Prized Monopolies : Shots - Health News : NPR
More than 30 years ago, Congress overwhelmingly passed a landmark health bill aimed at motivating pharmaceutical companies to develop new drugs for people whose rare diseases had been ignored.

By the drugmakers' calculations, the markets for such diseases weren't big enough to bother with.

But lucrative financial incentives created by the Orphan Drug Act signed into law by President Reagan in 1983 succeeded far beyond anyone's expectations. More than 200 companies have brought almost 450 so-called orphan drugs to market since the law took effect.

Yet a Kaiser Health News investigation shows that the system intended to help desperate patients is being manipulated by drugmakers to maximize profits and to protect niche markets for medicines already being taken by millions. The companies aren't breaking the law but they are using the Orphan Drug Act to their advantage in ways that its architects say they didn't foresee or intend. Today, many orphan medicines, originally developed to treat diseases affecting fewer than 200,000 people, come with astronomical price tags.
9 weeks ago
How Many People Will Attend Trump's Inauguration? Why to Take Turnout Estimates With a Grain of Salt | NBC4 Washington
To attempt an accurate estimate, analysts must take into account differences in crowd density in different places of the Mall. For example, far from the Capitol, crowds are often clustered in front of Jumbotrons but more sparse elsewhere.
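The point about uneven density can be made concrete with a small sketch: split the space into zones, multiply each zone's area by its own density, and add the results. The function name `estimate_crowd` and all areas and densities below are made-up illustrative values, not measurements of the Mall.

```python
def estimate_crowd(zones):
    """Sum area * density over zones given as (area_m2, people_per_m2)."""
    return round(sum(area * density for area, density in zones))


# Hypothetical figures: a packed zone near a screen and a sparse zone elsewhere.
zones = [
    (10_000, 2.0),  # dense cluster
    (20_000, 0.5),  # sparse area
]

per_zone_estimate = estimate_crowd(zones)
# Applying the dense figure to the whole 30,000 m2 instead:
uniform_estimate = estimate_crowd([(30_000, 2.0)])
```

With these invented numbers, assuming the dense figure everywhere doubles the estimate, which is why per-zone densities matter.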
9 weeks ago
Today, you are an Astronaut. You are floating in inner space 100 miles above the surface of Earth. You peer through your window and this is what you see. You are people watching. These are fleeting moments.

These videos come from YouTube. They were uploaded in the last week and have titles like DSC 1234 and IMG 4321. They have almost zero previous views. They are unnamed, unedited, and unseen by anyone but YOU.

The Astronaut video stream starts when you press GO. Videos change periodically. If you wish to linger, tap the button.
apis  best  project-idea 
9 weeks ago
Biopharma Can't Keep Getting Blindsided by Trump - Bloomberg Gadfly
This dream was a fiction. An early clue was a Time magazine interview in December, when Trump pledged to bring down drug prices, a warning that took 3 percent off the NBI. Then, in a press conference on Wednesday, Trump accused the industry of "getting away with murder" and pledged to save billions of dollars in government health-care spending by forcing companies to bid for business. That caused the NBI to immediately drop another 3 percent.
politics  pharmalot 
9 weeks ago
How The Chicago Reporter Made 'Settling for Misconduct' - Features - Source: An OpenNews project
The Chicago Reporter was a few days from publishing a major investigation into lawsuits against Chicago police when we learned we needed to revise a number in our story.

A city bond issue, used to pay for two years of settlements and judgments of police misconduct, would end up costing Chicagoans $530 million after interest payments—a number higher than we previously thought.

Our conclusion? “We’re gonna need a bigger chart.” Five hundred and thirty million wouldn’t fit the Y-axis of our bar graph.

It was a fitting wrap to the years-long project, which, in a lot of ways, seemed outsized for a six-person nonprofit newsroom.

In researching Settling for Misconduct, we had to account for details from hundreds of county and federal court filings, identify thousands of officers named in civil complaints and tally hundreds of millions of dollars in monetary awards.
data-journalism  best 
10 weeks ago
From Python to Numpy
There are already a fair number of books about Numpy (see Bibliography) and a legitimate question is to wonder if another book is really necessary. As you may have guessed by reading these lines, my personal answer is yes, mostly because I think there is room for a different approach concentrating on the migration from Python to Numpy through vectorization. There are a lot of techniques that you don't find in books and such techniques are mostly learned through experience. The goal of this book is to explain some of these techniques and to provide an opportunity for making this experience in the process.
numpy  python  book  best 
10 weeks ago
Doctors & Sex Abuse: About the AJC’s investigation of doctor misconduct
At that point, our data journalism team wrote computer programs to “crawl” regulators’ websites – a process known as scraping – and obtain board orders. This required building about 50 such programs tailored to agencies across the country. That collected more than 100,000 disciplinary documents. To assist us in identifying those involving sexual misconduct, we then created a computer program based on “machine learning” to analyze each case and, based on keywords, give each a probability rating that it was related to a case of physician sexual misconduct.
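The excerpt does not describe the AJC's model, so the following is only a minimal sketch of what a keyword-based probability rating can look like: a logistic function applied to a weighted count of keywords. The keywords, weights, bias and the function name `misconduct_probability` are all hypothetical; in a real machine-learning setup the weights would be fitted to labeled documents rather than written by hand.

```python
import math
import re

# Hypothetical, hand-picked values for illustration only.
KEYWORD_WEIGHTS = {"sexual": 2.0, "misconduct": 1.5, "boundary": 1.0}
BIAS = -3.0  # keeps the score low for documents with no keywords


def misconduct_probability(text):
    """Return a score in (0, 1) from a logistic function of keyword weights."""
    words = re.findall(r"[a-z]+", text.lower())
    score = BIAS + sum(KEYWORD_WEIGHTS.get(word, 0.0) for word in words)
    return 1.0 / (1.0 + math.exp(-score))


flagged = misconduct_probability(
    "Order cites sexual misconduct and boundary violations"
)
routine = misconduct_probability("License renewed after late fee payment")
```

A rating like this only ranks documents for human review; it does not decide what a case is about.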
compciv  best  investigations 
10 weeks ago
Deep Text Correcter
While context-sensitive spell-check systems (such as AutoCorrect) are able to automatically correct a large number of input errors in instant messaging, email, and SMS messages, they are unable to correct even simple grammatical errors. For example, the message “I’m going to store” would be unaffected by typical autocorrection systems, when the user most likely intended to communicate “I’m going to the store”.
10 weeks ago
Here’s why the fourth member of the Spotlight team has been pretty quiet
Aside from one compelling scene, Carroll’s role in the film isn’t that glamorous. Like in real life, Carroll’s character creates a spreadsheet of the dozens of pastors suspected of sexual abuse, and he and the team go about filling in those data cells with victims’ stories.
compciv  data-journalism 
10 weeks ago
What Death Penalty Opponents Don’t Get | The Marshall Project
There are fates worse than death.

In many states, the expansion—and the very existence—of life without parole sentences can be directly linked to the struggle to end capital punishment. Death penalty opponents often accept—and even zealously promote—life without parole as a preferable option, in the process becoming champions of a punishment that is nearly unknown in the rest of the developing world.
compciv  best 
10 weeks ago
Spotlight, the movie: A personal view – 3 to read – Medium
Me, I was the geek. (Brian d’Arcy James) I reported and wrote, but also created the database of bad priests. It was an effective tool for developing leads about abusive priests who were placed on sick leave or were transferred frequently because of complaints. Creating the spreadsheet is a nice scene in the movie, and among the few (maybe the first) that makes a database a key part of a journalism movie. Go geeks! Brian later said: “Until I met you, I thought a spreadsheet was something you bought at Bed Bath & Beyond.”

What’s obvious now is the huge appetite for entire databases of primary source, searchable documentation. It wasn’t quite as clear then how deep the hunger for that information would be. Well, live and learn. I can only sigh as I think about the dozens of boxes of court documents that are now buried in an Iron Mountain facility somewhere. If it happened now, every page would be online and searchable by the end of the business day.
spreadsheets  databases  best  data-journalism 
10 weeks ago