psychemedia + tm351   217

Interesting - autopandas Give it an input dataframe, and a target dataframe you want to tra…
pandas  pd  programGeneration  automation  tm351 
6 weeks ago by psychemedia
Apache Flink: Stateful Computations over Data Streams
I so don't have a handle on how streaming data processing works...
tm351  streamingdata 
10 weeks ago by psychemedia
QB4ST: RDF Data Cube extensions for spatio-temporal components
Hmm... interesting.. QB4ST: RDF Data Cube extensions for spatio-temporal components A bit m…
data  datatype  TM351  StevensNOIR 
10 weeks ago by psychemedia
How can we describe different types of dataset? Ten dataset archetypes – Lost Boy
Ever an interesting read, @ldodds on "How can we describe different types of dataset? Ten dataset archetypes"
data  archetypes  persona  TM351 
10 weeks ago by psychemedia
Why a data scientist is not a data engineer - O'Reilly Media
Interesting... "Why a data scientist is not a data engineer" I think they are both distinct…
tm351  tm358  dataScience  dataEngineering  data  jobs  dataJobs 
april 2019 by psychemedia
Python Record Linkage Toolkit Documentation — Python Record Linkage Toolkit 0.12 documentation
This could be handy for ETL / reconciling data files / partial matching datasets: #ddj
TM351  dataPipeline  fuzzymatch  ETL  ddj 
march 2019 by psychemedia
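A rough sketch of the kind of partial matching meant here, for reference. It does not use the Record Linkage Toolkit API; it swaps in the standard library's `difflib` as a stand-in, and the `fuzzy_link` helper, the 0.8 cutoff, and the sample names are all made up for illustration.

```python
from difflib import get_close_matches


def fuzzy_link(left, right, cutoff=0.8):
    """Map each name in `left` to its closest match in `right`, or None.

    Hypothetical helper: `cutoff` is the minimum difflib similarity ratio
    (0..1) a candidate needs before it counts as a match.
    """
    links = {}
    for name in left:
        hits = get_close_matches(name, right, n=1, cutoff=cutoff)
        links[name] = hits[0] if hits else None
    return links


# Invented sample data: two lists that refer to overlapping organisations.
left = ["Open University", "Milton Keynes Council", "Acme Ltd"]
right = ["The Open University", "Milton Keynes Counci", "Widget Co"]

links = fuzzy_link(left, right)
for name, match in links.items():
    print(f"{name!r} -> {match!r}")
```

Names with no candidate above the cutoff map to `None`, so unmatched records stay visible instead of being silently dropped.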
Garbage Collection in Python - GeeksforGeeks
That's why I sometimes run out of time or out of memory in Jupyter Notebook, locally or on Kaggle :D
tm351  bestpractice  py  resourceUsage  memory  garbageCollection  from twitter_favs
february 2019 by psychemedia
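For reference, a minimal sketch of what the cyclic collector does, using only the standard-library `gc` and `weakref` modules. The `Node` class and `make_cycle` helper are made up for illustration and are not taken from the linked article; the behaviour shown assumes CPython.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only to build a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    # Two objects that point at each other: reference counting alone
    # can never free them once the local names go out of scope.
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)  # a weak reference does not keep `a` alive


gc.disable()   # stop automatic collection so the effect is visible
gc.collect()   # clear any garbage left over from earlier code

ref = make_cycle()
alive_before = ref() is not None   # cycle is unreachable but still in memory

collected = gc.collect()           # run the cycle detector by hand
alive_after = ref() is not None    # cycle has now been reclaimed

gc.enable()
print(alive_before, alive_after, collected)
```

In a long-running notebook, an explicit `gc.collect()` after deleting large objects that may sit in reference cycles is one way to get memory back sooner.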
Convert VDI (VirtualBox) to raw, qcow2, qed, vmdk, vhd in Windows
Generate a raw image for upload to OpenStack from a VirtualBox box, e.g.:

VBoxManage clonemedium ~/VirtualBox\ VMs/tm351_18J-student/box-disk001.vmdk tm351_18J-student.raw --format RAW
virtualbox  openstack  tm351  VM 
november 2018 by psychemedia
About — Deon
"command line tool that allows you to easily add an ethics checklist to your data science projects"
ethics  dataEthics  TM351  checklist  guide  data 
october 2018 by psychemedia
Closing issues using keywords - User Documentation
Close issues automatically when a PR branch that addresses the issue is merged into default branch.
github  issues  workflow  TM351 
october 2018 by psychemedia
deathbeds/importnb: notebook files as source
@psychemedia imports notebooks as modules, and it has a ton of notebook tests.…
ipynb  workflow  tm351 
may 2018 by psychemedia
testing jupyterhub unicode errors in logging
"I just set up a docker container that runs something in the background! (it was jupyterhub).
It was this gist.
Starting something in the background wasn't my goal; I was aiming to reproduce a bug in a testable environment, and that meant using supervisor.
But you can do the same: either launch postgres as a service, or launch it with supervisor, or even put it in the background with nohup postgres &.
I think you'll need to set the ENTRYPOINT of the image to spawn postgres prior to launching the command passed by binder."
jupyter  startup  postgres  binderhub  tm351 
february 2018 by psychemedia
Linked Data Templates
@fantasticlife interesting take on structuring triple soup as an API via an ontology… Or something…
linkedData  LD  tm351  publishing 
january 2018 by psychemedia
betatim/openrefineder: 💠 + 📚 OpenRefine on Binder!
Thinking this recipe from @betatim to run OpenRefine via BinderHub could be tweaked to run datasette? [@simonw]
openrefine  binder  binderhub  mybinder  tm351 
january 2018 by psychemedia
Simpler alternative to pandas
tm351  pandas  data  csv  py  package 
january 2018 by psychemedia
Death by Pokémon GO by Mara Faccio, John McConnell :: SSRN
Faccio, Mara and McConnell, John J., Death by Pokémon GO (November 18, 2017). Available at SSRN:
tm351  accidentData  geodata 
november 2017 by psychemedia
» As a researcher…I’m a bit bloody fed up with Data Management
“As a researcher…I’m a bit bloody fed up with Data Management”, via @cameronneylon Yep…
researchData  data  management  dataManagement  RDM  researchDataManagement  TM351  library 
june 2017 by psychemedia

