paulbradshaw + python   288

Module 3 - Advanced Python Concepts | engMRK
Module 3 – Advanced Python Concepts will introduce the concepts of functions and classes in the simplest possible manner. You will learn and practise with various examples. Before starting Module 3, you should complete Module 1 – Refresh Your Python Basics and Module 2 – Practice Python While Learning Advanced Concept.
python  classes  functions 
august 2018 by paulbradshaw
Python has brought computer programming to a vast new audience - Programming languages
Not all Pythonistas are so ambitious, though. Zach Sims, Codecademy’s boss, believes many visitors to his website are attempting to acquire skills that could help them in what are conventionally seen as “non-technical” jobs. Marketers, for instance, can use the language to build statistical models that measure the effectiveness of campaigns. College lecturers can check whether they are distributing grades properly. (Even journalists on The Economist, scraping the web for data, generally use programs written in Python to do so.)
python  scraping  Economist 
august 2018 by paulbradshaw
Web Scraping Using Python (article) - DataCamp
In this tutorial, you will learn about the following:

• Data extraction from the web using Python's Beautiful Soup module

• Data manipulation and cleaning using Python's Pandas library

• Data visualization using Python's Matplotlib library
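The first two steps the tutorial lists can be sketched in a few lines. This is a minimal illustration, not the tutorial's own code: the inline HTML table stands in for a fetched page, and the column names and values are made up.

```python
from bs4 import BeautifulSoup
import pandas as pd

# A small inline HTML table stands in for a page fetched from the web
html = """
<table>
  <tr><th>name</th><th>score</th></tr>
  <tr><td>alice</td><td>3</td></tr>
  <tr><td>bob</td><td>5</td></tr>
</table>
"""

# Extract the rows with Beautiful Soup...
soup = BeautifulSoup(html, "html.parser")
rows = [[cell.get_text() for cell in tr.find_all(["th", "td"])]
        for tr in soup.find_all("tr")]

# ...then load them into pandas for cleaning and analysis
df = pd.DataFrame(rows[1:], columns=rows[0])
df["score"] = df["score"].astype(int)
```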
scraping  python  beautifulsoup  tutorial 
august 2018 by paulbradshaw
Naïve Bees: Image Loading and Processing - Python - Online | DataCamp
Can a machine distinguish between a honey bee and a bumble bee? Being able to identify bee species from images, while challenging, would allow researchers to more quickly and effectively collect field data. In this Project, you will use the Python image library Pillow to load and manipulate image data. You'll learn common transformations of images and how to build them into a pipeline.
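The kind of Pillow pipeline the project describes looks roughly like this. A generated in-memory image stands in for a real bee photo (which you would load with `Image.open`), and the specific transformations chosen here are illustrative, not the project's own.

```python
from PIL import Image

# An in-memory image stands in for a loaded photo, e.g. Image.open("bee.jpg")
img = Image.new("RGB", (100, 100), "yellow")

# Two common transformations chained into a simple pipeline:
# resize the image, then convert it to greyscale
processed = img.resize((50, 50)).convert("L")
```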
ml  tutorial  python  images  bees 
july 2018 by paulbradshaw
Meet the open-source Twitter bot to help you surface stories on anything
Maybe you want to monitor a collection of local news sources for reporting on a contentious development project; or top tech news sites mentioning a particular group of influential investors; or political blogs acknowledging a long-shot primary challenger. With Track The News, we’ve built a tool that allows the kind of extensive configuration necessary to run such an application, and a mechanism to share the results immediately and automatically with the public.
bot  python  rss  tools  automation 
july 2018 by paulbradshaw
JupyterHub — JupyterHub documentation
JupyterHub, a multi-user Hub, spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server. JupyterHub can be used to serve notebooks to a class of students, a corporate data science group, or a scientific research group.
jupyter  notebooks  Python  Tools  hosting 
may 2018 by paulbradshaw
How To Rotate Proxies using Python 3 and Requests for Web Scraping
If you are using Python-Requests, you can send requests through a proxy by configuring the proxies argument. For example:

import requests

proxies = {
    'http': '',   # proxy URL (stripped from the original snippet)
    'https': '',  # proxy URL (stripped from the original snippet)
}
requests.get('', proxies=proxies)  # target URL also stripped
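The rotation itself, which the article's title refers to, can be sketched with `itertools.cycle`. The proxy addresses below are hypothetical placeholders, and this is an illustration of the idea rather than the article's code.

```python
import itertools

# Hypothetical proxy addresses for illustration
proxy_list = ['http://10.0.0.1:3128', 'http://10.0.0.2:3128']
proxy_pool = itertools.cycle(proxy_list)

def next_proxies():
    """Return a fresh proxies dict, rotating through the pool."""
    proxy = next(proxy_pool)
    return {'http': proxy, 'https': proxy}

# Each request then uses the next proxy in turn, e.g.:
# requests.get(url, proxies=next_proxies())
```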
Scraping  Python  proxies  blocking 
may 2018 by paulbradshaw
Getting started with Python for R developers
RT @rforjournalists: A nice overview of Python from an R perspective:
python  r 
may 2018 by paulbradshaw
Using Databases with Python | Coursera
About this course: This course will introduce students to the basics of the Structured Query Language (SQL) as well as basic database design for storing data as part of a multi-step data gathering, analysis, and processing effort. The course will use SQLite3 as its database. We will also build web crawlers and multi-step data gathering and visualization processes. We will use the D3.js library to do basic data visualization. This course will cover Chapters 14-15 of the book “Python for Everybody”. To succeed in this course, you should be familiar with the material covered in Chapters 1-13 of the textbook and the first three courses in this specialization. This course covers Python 3.
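The database part of the course uses SQLite3, which ships with Python's standard library. A minimal sketch of the pattern (table name and data are made up for illustration):

```python
import sqlite3

# In-memory database for illustration; a file path would persist the data
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE Counts (org TEXT, count INTEGER)')
cur.execute('INSERT INTO Counts (org, count) VALUES (?, ?)',
            ('example.org', 1))
conn.commit()
row = cur.execute('SELECT org, count FROM Counts').fetchone()
```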
python  mooc  databases  sql  d3 
april 2018 by paulbradshaw
Using Python to Access Web Data | Coursera
About this course: This course will show how one can treat the Internet as a source of data. We will scrape, parse, and read web data as well as access data using web APIs. We will work with HTML, XML, and JSON data formats in Python. This course will cover Chapters 11-13 of the textbook “Python for Everybody”. To succeed in this course, you should be familiar with the material covered in Chapters 1-10 of the textbook and the first two courses in this specialization. These topics include variables and expressions, conditional execution (loops, branching, and try/except), functions, Python data structures (strings, lists, dictionaries, and tuples), and manipulating files. This course covers Python 3.
scraping  python  mooc 
april 2018 by paulbradshaw
RT : Data from the original PDF separated with , addresses scraped with , locations extracted from the Go… API
Python  Excel  from twitter
april 2018 by paulbradshaw
reticulate: R interface to Python | RStudio Blog
We are pleased to announce the reticulate package, a comprehensive set of tools for interoperability between Python and R. The package includes facilities for:

Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session.

Translation between R and Python objects (for example, between R and Pandas data frames, or between R matrices and NumPy arrays).

Flexible binding to different versions of Python including virtual environments and Conda environments.

Reticulate embeds a Python session within your R session, enabling seamless, high-performance interoperability. If you are an R developer who uses Python for some of your work, or a member of a data science team that uses both languages, reticulate can dramatically streamline your workflow!
r  python  conversion  library 
april 2018 by paulbradshaw
pypdfocr 0.9.1 : Python Package Index
Take a scanned PDF file and run OCR on it (using the Tesseract OCR software from Google), generating a searchable PDF
Optionally, watch a folder for incoming scanned PDFs and automatically run OCR on them
Optionally, file the scanned PDFs into directories based on simple keyword matching that you specify
Evernote auto-upload and filing based on keyword search
Email status when it files your PDF
More links:
cleaning  pdfs  tools  Python  library  ocr 
march 2018 by paulbradshaw
Using textual analysis to quantify a cast of characters
The code explained in this post will yield a dictionary of proper nouns and the number of times they are used in the text, like the excerpt shown below for the text of Harry Potter and the Philosopher’s Stone. It also links instances of two proper nouns appearing one after the other, like ‘Albus Dumbledore’ or ‘Aunt Petunia’.
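The core idea (counting proper nouns and linking consecutive ones) can be sketched with a regex and a Counter. This is a simplified stand-in for the post's approach, with a made-up snippet of text:

```python
import re
from collections import Counter

text = ("Aunt Petunia looked up. Albus Dumbledore smiled. "
        "Aunt Petunia frowned as Dumbledore left.")

# Treat runs of consecutive capitalised words as a single proper noun,
# so 'Albus Dumbledore' and 'Aunt Petunia' each count as one name
names = re.findall(r'(?:[A-Z][a-z]+)(?:\s[A-Z][a-z]+)*', text)
counts = Counter(names)
```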
python  nlp  ai  text  dj  tools  tutorial 
february 2018 by paulbradshaw
Cut down database imports by a third using this one weird trick
Version 2.2 of django-postgres-copy, now available on the Python Package Index, boosts the performance of PostgreSQL’s COPY command by automatically dropping indexes and constraints on tables prior to the loading.

The result is significantly faster ingestion. Our speed tests – using tens of millions of state records – found the change reduced load time of large tables by nearly one third.
sql  python  tools  django  speed  postgresql 
january 2018 by paulbradshaw
Anaconda Navigator crashes OS Sierra · Issue #4902 · ContinuumIO/anaconda-issues · GitHub
Please update to the latest version of Navigator to include
the latest fixes.

Open a terminal (on Linux or Mac) or the Anaconda Command Prompt (on windows)
and type:

$ conda update anaconda-navigator
anaconda  python  bugs 
december 2017 by paulbradshaw
Linguistic Features | spaCy Usage Documentation
Processing raw text intelligently is difficult: most words are rare, and it's common for words that look completely different to mean almost the same thing. The same words in a different order can mean something completely different. Even splitting text into useful word-like units can be difficult in many languages. While it's possible to solve some problems starting from only the raw characters, it's usually better to use linguistic knowledge to add useful information. That's exactly what spaCy is designed to do: you put in raw text, and get back a Doc object, that comes with a variety of annotations.
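Even the first hard step the excerpt mentions (splitting text into useful word-like units) is handled by spaCy out of the box. A blank English pipeline tokenises without a trained model; loading a trained pipeline such as `en_core_web_sm` would add the richer annotations (POS tags, entities, and so on) to the returned Doc.

```python
import spacy

# A blank pipeline gives tokenisation only, with no model download needed
nlp = spacy.blank('en')
doc = nlp("Processing raw text intelligently is difficult.")
tokens = [token.text for token in doc]
```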
text  tools  Python  nlg  nltk  api 
november 2017 by paulbradshaw
Applying NLP and Entity Extraction To The Russian Twitter Troll Tweets In Neo4j (and more Python!) · William Lyon
Our previous post focused on scraping Internet Archive to retrieve the data and import into Neo4j. We also looked at some Cypher queries we could use to analyze the data. In this post we make use of a natural language processing technique called entity extraction to enrich our graph data model and help us explore the dataset of Russian Twitter Troll Tweets. For example, can we try to see what people, places, and organizations these accounts were tweeting about in the months leading up to the 2016 election?
nlg  neo4j  Networkanalysis  tutorial  Twitter  text  Python 
november 2017 by paulbradshaw
Colaboratory – Google
Jupyter is the open source project on which Colaboratory is based. Colaboratory allows you to use and share Jupyter notebooks with others without having to download, install, or run anything on your own computer other than a browser.
notebooks  tools  google  Python  jupyter  ddj 
november 2017 by paulbradshaw
The Python Graph Gallery – Visualizing data – with Python
vis  python  gallery  Directory  t 
october 2017 by paulbradshaw
The Internet Archive Python Library — internetarchive 1.7.4 documentation
internetarchive is a command-line and Python interface to the Internet Archive. Please report any issues on GitHub.
Python  Library  archives  internetarchive  data  t 
september 2017 by paulbradshaw
About AudioBoom and – KOZ
It would be pretty easy for anyone to script a way to grab all your AudioBoom files and dump them in (what’s called) an ‘Item’ on (Amazon S3 calls Items ‘buckets’).

According to their docs, ‘Items’ should not be over 100GB and should not contain over 10,000 files, so I think one would do! You should also be able to script the attachment of the original metadata (locations, etc., even comments and text).
audioboom  archiving  tutorial  t  commandline  python 
september 2017 by paulbradshaw
Top 5 Python IDEs For Data Science (Article)
ide  Python  t 
june 2017 by paulbradshaw
Factors correlating with predicted turnout change in England & Wales · GitHub
Using numpy, a scientific computing package for Python, I used the numpy.corrcoef method to output correlation coefficients for the features in the dataset of England & Wales constituencies.
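`numpy.corrcoef` returns a symmetric correlation matrix; the off-diagonal entries are the coefficients between pairs of features. A minimal sketch with made-up numbers (not the constituency data from the gist):

```python
import numpy as np

# Two toy 'features' across five constituencies (made-up values)
age = np.array([30, 35, 40, 45, 50])
turnout_change = np.array([1.0, 2.1, 2.9, 4.2, 5.0])

# The [0, 1] entry of the matrix is the correlation between the two
r = np.corrcoef(age, turnout_change)[0, 1]
```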
correlation  Python  elections  data  voting 
june 2017 by paulbradshaw
ParlAI | About
ParlAI (pronounced “par-lay”) is a framework for dialog AI research, implemented in Python.

Its goal is to provide researchers:

a unified framework for training and testing dialog models
multi-task training over many datasets at once
seamless integration of Amazon Mechanical Turk for data collection and human evaluation
Python  ai  Tools  text 
may 2017 by paulbradshaw
Keras Documentation
Keras is a high-level neural networks API, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
api  neuralnetwork  python  machinelearning  tools  ml 
may 2017 by paulbradshaw
Python Machine Learning | PACKT Books
Learn how to build powerful Python machine learning algorithms to generate useful data insights with this data analysis tutorial
book  python  machinelearning  ai  coding  ml 
may 2017 by paulbradshaw
This repo contains Jupyter notebooks, along with related code and example data, for a tutorial on the use of D3 in Jupyter.
Python  d3  jupyter  notebooks  library 
may 2017 by paulbradshaw
Zero to JupyterHub — Zero to JupyterHub with Kubernetes 0.1 documentation
JupyterHub is a tool that allows you to quickly utilize cloud computing infrastructure to manage a hub that enables users to interact remotely with a computing environment that you specify. JupyterHub offers a useful way to standardize the computing environment of a group of people (e.g., for a class of students), as well as allowing people to access the hub remotely.

This growing collection of information will help you set up your own JupyterHub instance. It is in an early stage, so the information and tools may change quickly. If you see anything that is incorrect or have any questions, feel free to reach out at the issues page.
jupyter  servers  Python  hosting  cloud  tools 
may 2017 by paulbradshaw
Python for Data Journalists: Analyzing Money in Politics
You will learn just enough of the Python computer programming language to work with the pandas library, a popular open-source tool for analyzing data. The course will teach you how to use pandas to read, filter, join, group, aggregate and rank structured data.

You will also learn how to record, remix and republish your analysis using the Jupyter Notebook, a browser-based application for writing code that is emerging as the standard for sharing reproducible research in the sciences.
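The filter-group-aggregate-rank workflow the course teaches looks roughly like this in pandas. The inline data stands in for a CSV you would load with `pd.read_csv`, and the column names are invented for illustration:

```python
import pandas as pd

# Inline data stands in for pd.read_csv('contributions.csv')
df = pd.DataFrame({
    'candidate': ['A', 'A', 'B', 'B', 'B'],
    'amount': [100, 250, 50, 300, 150],
})

# Filter, group, aggregate and rank
big = df[df['amount'] >= 100]
totals = big.groupby('candidate')['amount'].sum()
ranked = totals.rank(ascending=False)
```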
python  mooc 
may 2017 by paulbradshaw
Building a Keyword Monitoring Pipeline with Python, Pastebin and Searx | Automating OSINT Blog
Searx is this really cool project that provides you a self-hosted interface to search multiple search engines at once. This is called a meta-search engine and was all the rage in the late 90s. Yeah, I am that old.

One thing that Searx also provides is the ability to query it and receive the results back in JSON. This gives us the ability to write Python code to talk to it and to process the results without having to use web scraping techniques or paying for an expensive API key.
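Querying a Searx instance for JSON results is just an HTTP GET with `format=json` in the query string. A sketch, where the instance URL and search endpoint path are hypothetical placeholders:

```python
import requests

def searx_params(keyword):
    # Searx returns JSON when format=json is passed alongside the query
    return {'q': keyword, 'format': 'json'}

# A self-hosted instance URL would go here, e.g.:
# results = requests.get('https://searx.example/search',
#                        params=searx_params('data breach')).json()
```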
Search  searchengine  tools  searx  python  tutorial  api 
april 2017 by paulbradshaw