Debiprasad Ghosh's answer to Why does batch normalization help? - Quora
When you have fewer than a million parameters, you have the luxury of global optimization or curvature-aided learning. With more than a million parameters, you have to work with gradient descent alone. Whitening makes the error surface spherical where it was originally ellipsoidal, and gradient descent works better on a spherical error surface than on an ellipsoidal one.
11 weeks ago
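
The claim about spherical versus ellipsoidal error surfaces can be checked on a toy problem. The sketch below is only an illustration, not something from the quoted answer: it assumes a linear model with a least-squares loss, so the Hessian is the input covariance matrix, and it applies full PCA whitening (batch normalization itself only standardizes each feature, which is a weaker transform). The data, the mixing matrix `A`, and the helper `gd_steps` are all made up for the example.

```python
import numpy as np

def gd_steps(H, tol=1e-6, max_steps=100_000):
    """Count gradient-descent steps needed to minimise 0.5 * w^T H w.

    Assumption for illustration: step size 1 / (largest eigenvalue of H).
    """
    lr = 1.0 / np.linalg.eigvalsh(H).max()
    w = np.ones(H.shape[0])
    for step in range(1, max_steps + 1):
        w = w - lr * (H @ w)          # gradient of 0.5 * w^T H w is H w
        if np.linalg.norm(w) < tol:
            return step
    return max_steps

rng = np.random.default_rng(0)

# Correlated, badly scaled inputs (hypothetical data).
A = np.array([[10.0, 0.0],
              [3.0, 0.5]])
X = rng.normal(size=(1000, 2)) @ A.T
X -= X.mean(axis=0)

# Hessian of the least-squares loss for a linear model: the input covariance.
H_raw = X.T @ X / len(X)

# PCA whitening: rotate onto the eigenvectors, then rescale to unit variance.
eigvals, eigvecs = np.linalg.eigh(H_raw)
W = eigvecs / np.sqrt(eigvals)        # scales column j by 1 / sqrt(eigvals[j])
X_white = X @ W
H_white = X_white.T @ X_white / len(X_white)

cond_raw = np.linalg.cond(H_raw)      # large: elongated, ellipsoidal bowl
cond_white = np.linalg.cond(H_white)  # close to 1: spherical bowl
steps_raw = gd_steps(H_raw)
steps_white = gd_steps(H_white)

print(f"condition number  raw: {cond_raw:.1f}  whitened: {cond_white:.3f}")
print(f"gradient steps    raw: {steps_raw}  whitened: {steps_white}")
```

On the whitened inputs the Hessian is (numerically) the identity, so every direction has the same curvature and a single step size suits all of them; on the raw inputs the step size is capped by the steepest direction and progress along the shallow direction is slow.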