cli   48446

youtube-dl
youtube, soundcloud, etc - command line downloader

brew install youtube-dl
youtube  downloads  cli  business  processing 
yesterday by cvn
XML::Twig - A perl module for processing huge XML documents in tree mode. - metacpan.org - https://metacpan.org/
So the 'xml_grep --text_only --html' options seem to work a treat; these come with the 'xml-twig-tools' package on Debian/Ubuntu.

Note that the '/text()' XPath selector doesn't seem to be supported (maybe that's a v2 thing?), and I can't comprehend the XML::Twig documentation enough to figure out what's up there. What does "a subset of XPath" mean?
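
One way to get a feel for what "a subset of XPath" can mean in practice: the sketch below uses Python's standard-library ElementTree, which also implements only part of XPath. It is an analogy, not a description of what XML::Twig supports. The sample document and the helper name supports_text_selector are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, loosely shaped like a list of links.
DOC = "<ul><li><a href='/x'>first</a></li><li><a href='/y'>second</a></li></ul>"
root = ET.fromstring(DOC)

# Inside the subset: descendant search, child steps, attribute predicates.
matched = [a.text for a in root.findall(".//li/a[@href='/y']")]


def supports_text_selector(tree):
    """Return True if the engine accepts a trailing text() step."""
    try:
        tree.findall(".//a/text()")
        return True
    except (SyntaxError, KeyError):
        # ElementTree rejects the expression; the exact exception type
        # is an implementation detail, so both are caught here.
        return False


# Workaround within the subset: select the elements, then read .text.
all_texts = [a.text for a in root.findall(".//a")]

print(matched, supports_text_selector(root), all_texts)
```

The same pattern (select the element, then ask the tool for its text) is what 'xml_grep --text_only' appears to do in place of a '/text()' step.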

The '--html' is pretty essential, because I think it runs the input through a more liberal HTML parser and produces a valid (X)HTML tree, so you don't get the XML parse errors that you otherwise get for just about every page on the Internet, it seems like.
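
A rough illustration of the strict-versus-liberal difference, using only Python's standard library rather than xml_grep itself. The markup string and the names parses_as_xml, TextCollector, and extract_text are hypothetical; the point is only that a strict XML parser rejects typical tag soup while a forgiving HTML parser still yields the text.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Typical sloppy HTML: unclosed <li> and <br>, unquoted attribute value.
SLOPPY = "<ul><li>one<br><li class=x>two</ul>"


def parses_as_xml(markup):
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Collect non-blank text chunks, tolerating malformed markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def extract_text(markup):
    parser = TextCollector()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return parser.chunks


print(parses_as_xml(SLOPPY), extract_text(SLOPPY))
```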

Side note: now I /really/ don't understand why XHTML ceased to be a thing.

Example:
curl -q https://packages.debian.org/wheezy/keepassx | xml_grep --text_only --html '//*[@id="pdeps"]/ul/li//a'
xpath  xslt  xml  parser  parsing  perl  module  library  commandline  cli  fuckina  solution 
yesterday by kme
html - ignore malformed XML with Perl-XML - Stack Overflow - https://stackoverflow.com/
xml_grep, a command line tool which comes with XML::Twig, can be used to extract data from HTML using XPath. Normally it works on XML, but you can use the -html option to process HTML (under the hood it uses HTML::TreeBuilder to convert the HTML to XML).

For example:

> xml_grep -html -t 'a[@class="genu"]' http://stackoverflow.com
> Stack Exchange
xpath  xml  parser  textprocessing  webdevel  commandline  cli  maybesolution 
yesterday by kme
memcache-info
- Simple and efficient way to show information about Memcache.
memcache  cli 
yesterday by lenciel
Grep and Sed Equivalent for XML Command Line Processing - Stack Overflow - https://stackoverflow.com/
Accepted answer recommends XMLStarlet, but links to this handy tutorial: https://www.ibm.com/developerworks/library/x-starlet/index.html.

Also:
To Joseph Holsten's excellent list, I add the xpath command-line script which comes with Perl library XML::XPath. A great way to extract information from XML files:

xpath -q -e '/entry[@xml:lang="fr"]' *xml
xml  xpath  xslt  cli  commandline  textprocessing  list  recommendation 
yesterday by kme
xml - How to execute XPath one-liners from shell? - Stack Overflow - https://stackoverflow.com/
Nokogiri. If I write this wrapper I could call the wrapper in the way described above:

#!/usr/bin/ruby

require 'nokogiri'

Nokogiri::XML(STDIN).xpath(ARGV[0]).each do |row|
  puts row
end
XML::XPath. Would work with this wrapper:

#!/usr/bin/perl

use strict;
use warnings;
use XML::XPath;

my $root = XML::XPath->new(ioref => 'STDIN');
for my $node ($root->find($ARGV[0])->get_nodelist) {
    print($node->getData, "\n");
}


Also:
xmllint --xpath '//element/@attribute' file.xml
xmlstarlet sel -t -v "//element/@attribute" file.xml
saxon-lint --xpath '//element/@attribute' file.xml
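
For comparison, a minimal standard-library Python sketch of the same '//element/@attribute' lookup. ElementTree does not accept a trailing '/@attribute' step, so this takes the element and attribute names as two separate arguments instead of a full XPath expression. The function name attribute_values and the command-line shape are assumptions for illustration, not an existing tool.

```python
import sys
import xml.etree.ElementTree as ET


def attribute_values(xml_text, element, attribute):
    """Return the given attribute from every matching element, in document order."""
    root = ET.fromstring(xml_text)
    # root.iter() visits the root and all descendants, like '//' in XPath.
    return [
        el.get(attribute)
        for el in root.iter(element)
        if el.get(attribute) is not None
    ]


# Hypothetical usage: python attrs.py element attribute < file.xml
if __name__ == "__main__" and len(sys.argv) == 3:
    for value in attribute_values(sys.stdin.read(), sys.argv[1], sys.argv[2]):
        print(value)
```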
xpath  xml  xslt  parsing  textprocessing  webdevel  commandline  cli  oneliner  list  recommendation  samplecode 
yesterday by kme
