image-processing   1743

Object tracking with dlib - PyImageSearch
This tutorial will teach you how to perform object tracking using dlib and Python. After reading today’s blog post you will be able to track objects in real-time video with dlib.
python  image-processing  computer-vision  programming  tutorial 
22 hours ago by ssorc
SOD - An Embedded, Modern Computer Vision and Machine Learning Library
SOD is an embedded, modern, cross-platform computer vision and machine learning software library that exposes a set of APIs for deep learning and advanced media analysis and processing, including real-time, multi-class object detection and model training, on embedded systems with limited computational resources and on IoT devices.
computer-vision  machinelearning  image-processing  api  embedded  opensource 
yesterday by ssorc
[1711.04226] AON: Towards Arbitrarily-Oriented Text Recognition
Recognizing text from natural images is a hot research topic in computer vision due to its various applications. Despite several decades of research on optical character recognition (OCR), recognizing text from natural images is still a challenging task. This is because scene text is often in irregular (e.g. curved, arbitrarily-oriented or seriously distorted) arrangements, which have not yet been well addressed in the literature. Existing methods for text recognition mainly work with regular (horizontal and frontal) text and cannot be trivially generalized to handle irregular text. In this paper, we develop the arbitrary orientation network (AON) to directly capture the deep features of irregular text, which are fed into an attention-based decoder to generate a character sequence. The whole network can be trained end-to-end using only images and word-level annotations. Extensive experiments on various benchmarks, including the CUTE80, SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed AON-based method achieves state-of-the-art performance on irregular datasets, and is comparable to major existing methods on regular datasets.
OCR  image-processing  image-segmentation  machine-learning  invariant-models  generalization  to-write-about  to-simulate 
16 days ago by Vaguery
[1608.02153] OCR of historical printings with an application to building diachronic corpora: A case study using the RIDGES herbal corpus
This article describes the results of a case study that applies Neural Network-based Optical Character Recognition (OCR) to scanned images of books printed between 1487 and 1870 by training the OCR engine OCRopus [@breuel2013high] on the RIDGES herbal text corpus [@OdebrechtEtAlSubmitted]. Training specific OCR models was possible because the necessary *ground truth* is available as error-corrected diplomatic transcriptions. The OCR results have been evaluated for accuracy against the ground truth of unseen test sets. Character and word accuracies (percentage of correctly recognized items) for the resulting machine-readable texts of individual documents range from 94% to more than 99% (character level) and from 76% to 97% (word level). This includes the earliest printed books, which were thought to be inaccessible by OCR methods until recently. Furthermore, OCR models trained on one part of the corpus consisting of books with different printing dates and different typesets *(mixed models)* have been tested for their predictive power on the books from the other part containing yet other fonts, mostly yielding character accuracies well above 90%. It therefore seems possible to construct generalized models trained on a range of fonts that can be applied to a wide variety of historical printings still giving good results. A moderate postcorrection effort of some pages will then enable the training of individual models with even better accuracies. Using this method, diachronic corpora including early printings can be constructed much faster and cheaper than by manual transcription. The OCR methods reported here open up the possibility of transforming our printed textual cultural heritage into electronic text by largely automatic means, which is a prerequisite for the mass conversion of scanned books.
OCR  image-processing  natural-language-processing  algorithms  machine-learning  rather-interesting  commodity-software  digital-humanities  to-write-about  consider:swarms  consider:stochastic-resonance 
19 days ago by Vaguery
[1812.07933] Dynamic Programming Approach to Template-based OCR
In this paper we propose a dynamic programming solution to the template-based recognition task in OCR. We formulate a problem of optimal position search for complex objects consisting of parts forming a sequence. We limit the distance between every two adjacent elements with predefined upper and lower thresholds. We choose the sum of penalties for each part in a given position as the function to be minimized. We show that such a choice of restrictions allows a faster algorithm to be used than the one for the general form of deformation penalties. We named this algorithm Dynamic Squeezeboxes Packing (DSP) and applied it to solve two OCR problems: text field extraction from an image of a document's Visual Inspection Zone (VIZ) and license plate segmentation. The quality and the performance of the resulting solutions were experimentally shown to meet the requirements of state-of-the-art industrial recognition systems.
OCR  image-segmentation  image-processing  algorithms  optimization  mathematical-programming  to-write-about  to-compare 
19 days ago by Vaguery
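
The problem statement in the abstract above (a sequence of parts, bounded gaps between neighbours, a sum of per-part penalties to minimize) can be illustrated with a small dynamic program. This is only a sketch of that formulation under assumptions of my own: positions are integers on a 1-D grid, and the function name `place_parts` and the `penalty`, `lo`, `hi` parameters are hypothetical. It scans each allowed window naively and is not the paper's DSP algorithm, whose faster scheme the abstract does not spell out.

```python
from typing import List, Tuple


def place_parts(penalty: List[List[float]], lo: int, hi: int) -> Tuple[float, List[int]]:
    """Choose one position per part, minimizing the summed penalty.

    penalty[k][p] is the (hypothetical) cost of placing part k at position p.
    Adjacent parts must satisfy lo <= pos[k+1] - pos[k] <= hi, with 0 <= lo <= hi.
    """
    n_parts, n_pos = len(penalty), len(penalty[0])
    inf = float("inf")

    cost = [list(penalty[0])]   # cost[k][p]: best total with part k at p
    back = [[-1] * n_pos]       # back[k][p]: chosen position of part k-1

    for k in range(1, n_parts):
        row = [inf] * n_pos
        brow = [-1] * n_pos
        for p in range(n_pos):
            # Previous part may sit anywhere in [p - hi, p - lo].
            for q in range(max(0, p - hi), min(n_pos - 1, p - lo) + 1):
                candidate = cost[k - 1][q] + penalty[k][p]
                if candidate < row[p]:
                    row[p], brow[p] = candidate, q
        cost.append(row)
        back.append(brow)

    last = min(range(n_pos), key=lambda p: cost[-1][p])
    if cost[-1][last] == inf:
        raise ValueError("no placement satisfies the distance limits")

    # Walk the back-pointers from the last part to the first.
    positions = [last]
    for k in range(n_parts - 1, 0, -1):
        positions.append(back[k][positions[-1]])
    positions.reverse()
    return cost[-1][last], positions


penalty = [
    [5, 1, 5, 5, 5, 5],
    [5, 5, 0, 5, 1, 5],
    [5, 5, 5, 5, 5, 0],
]
total, positions = place_parts(penalty, lo=2, hi=3)
print(total, positions)
```

In this toy input the cheapest spot for the first part (position 1) is not used, because no feasible chain from it reaches the cheap spot for the last part; the distance limits couple the choices, which is why a per-part greedy pick is not enough.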
[1711.03247] The nonsmooth landscape of phase retrieval
We consider a popular nonsmooth formulation of the real phase retrieval problem. We show that under standard statistical assumptions, a simple subgradient method converges linearly when initialized within a constant relative distance of an optimal solution. Seeking to understand the distribution of the stationary points of the problem, we complete the paper by proving that as the number of Gaussian measurements increases, the stationary points converge to a codimension two set, at a controlled rate. Experiments on image recovery problems illustrate the developed algorithm and theory.
inverse-problems  feature-extraction  approximation  rather-interesting  nonlinear-programming  to-write-about  image-processing  signal-processing 
4 weeks ago by Vaguery
Krazy Kat Comics
This page goes into detail on how I used Machine Learning to find hundreds of Krazy Kat comics that are now in the public domain.

As a result of this project, several hundred high resolution scans of Krazy Kat comics are now easily available online, including a comic that I couldn't find in any published book!

What follows is a detailed description of what I did to find these comics in online newspaper archives.
OCR  archives  newspapers  rather-interesting  digitization  to-write-about  deep-learning  machine-learning  image-processing 
5 weeks ago by Vaguery
Contribute to netb258/clj-ppm development by creating an account on GitHub.
clojure  computer-graphics  image-processing 
6 weeks ago by id1
