csv   8499


saulpw/visidata: A console spreadsheet tool for discovering and arranging data
visidata - A console spreadsheet tool for discovering and arranging data
python  cli  data  spreadsheet  csv 
2 days ago by geetarista
csvquote
Enables common Unix utilities like cut, head, and tail to work correctly with CSV data containing delimiters and newlines. ---- From:
https://news.ycombinator.com/item?id=7474600

Related: This tool:
https://github.com/dbro/csvquote

will convert all the record/field separators (such as tabs/newlines for TSV) that appear inside quoted fields into non-printing characters, and then reverse the substitution at the end. Example:
csvquote foobar.csv | cut -d ',' -f 5 | sort | uniq -c | csvquote -u

====
Thanks for bringing up csvquote. I wrote it last year, and am happy to hear that other people find it useful. ---- It is indeed a simple state machine (see https://github.com/dbro/csvquote/blob/master/csvquote.c), and it translates CSV/TSV files into files which follow the spirit of what's described in the original article in this thread. ---- But instead of using control characters as separators, it uses them INSIDE the quoted fields. This makes it easy to work with the standard UNIX text manipulation tools, which expect tabs and newlines to be the field and record separators. ---- The motivation for writing the tool was to work with CSV files (usually from Excel) that were hundreds of megabytes. These files came from outside my organization, and often from nontechnical people - so it would have been difficult to get them into a more convenient format. That's the killer feature of the CSV/TSV format: it's readable by the large number of nontechnical information workers, in almost every application they use. I can't think of a file format that is more widely recognized (even if it's not always consistently defined in practice).
====
[ snip ] ... That's exactly what https://github.com/dbro/csvquote does for commas and newlines both. ---- Why use this instead of sed, awk, flex, lua, etc.? ---- sed does the job and on almost all UNIX clones it never needs to be installed. -- Because it's already there.
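
A minimal sketch of the idea described in the comments above: a two-state machine that swaps delimiters and newlines inside quoted fields for non-printing characters, plus the reverse step. This is not csvquote's actual code. The function names `sanitize` and `restore` are hypothetical, and the substitute bytes 0x1F and 0x1E are assumptions chosen for illustration.

```python
# Assumed substitute characters; the real tool's choices may differ.
FIELD_SUB = "\x1f"   # stands in for a delimiter found inside quotes
RECORD_SUB = "\x1e"  # stands in for a newline found inside quotes


def sanitize(text, delimiter=",", quote='"'):
    """Replace delimiters and newlines that occur inside quoted fields."""
    out = []
    in_quotes = False
    for ch in text:
        if ch == quote:
            # An escaped quote ("") toggles the state twice, which is a no-op.
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch == delimiter:
            out.append(FIELD_SUB)
        elif in_quotes and ch == "\n":
            out.append(RECORD_SUB)
        else:
            out.append(ch)
    return "".join(out)


def restore(text, delimiter=","):
    """Undo sanitize(), like the -u step at the end of the pipeline."""
    return text.replace(FIELD_SUB, delimiter).replace(RECORD_SUB, "\n")


raw = 'id,note\n1,"a, b\nc"\n2,plain\n'
safe = sanitize(raw)
# Every physical line is now one record, so naive splitting is safe.
# split("\n") is used because str.splitlines() also breaks on 0x1E.
first_column = [row.split(",")[0] for row in safe.split("\n") if row]
print(first_column)
print(restore(safe) == raw)
```

After `sanitize`, line-oriented tools can treat each line as one record and each comma as a real field boundary; `restore` puts the original characters back once the filtering is done.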
csv  tsv  data  dataanalysis  sed  awk 
4 days ago by dusko


