lecture-notes   816
Optimization and Control
This is a home page of resources for Richard Weber's course of 16 lectures to third year Cambridge mathematics students in winter 2016, starting January 14, 2016 (Tue/Thu @ 11 in CMS meeting room 5). This material is provided for students, supervisors (and others) to freely use in connection with this course.
teaching  lecture-notes  reference  control-theory  optimization
9 weeks ago by mraginsky
Intelligent control through learning and optimization (AMATH/CSE 579)
Design of near-optimal controllers for complex dynamical systems, using analytical techniques, machine learning, and optimization. Topics from deterministic and stochastic optimal control, reinforcement learning and dynamic programming, numerical optimization in the context of control, and robotics. Prerequisite: vector calculus, linear algebra, and Matlab. Recommended: differential equations, stochastic processes, and optimization.
teaching  lecture-notes  reference  control-theory  optimization
9 weeks ago by mraginsky
Section 10 Chi-squared goodness-of-fit test.
- pf that chi-squared statistic for Pearson's test (multinomial goodness-of-fit) actually has chi-squared distribution asymptotically
- the gotcha: terms Z_j in sum aren't independent
- solution:
- compute the covariance matrix of the terms to be E[Z_iZ_j] = -sqrt(p_ip_j) for i != j (and E[Z_i^2] = 1 - p_i)
- note that an equivalent way of sampling the Z_j is to take a random standard Gaussian and project onto the plane orthogonal to (sqrt(p_1), sqrt(p_2), ..., sqrt(p_r))
- that is equivalent to just sampling a Gaussian w/ 1 less dimension (hence df=r-1)
QED
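A small numerical check of the projection step, as a sketch in pure Python. The probability vector `p` is an arbitrary illustrative choice and the helper names are hypothetical. It builds P = I - vv^T for v = (sqrt(p_1), ..., sqrt(p_r)) and confirms that the off-diagonal entries are -sqrt(p_i p_j), that P is idempotent (so it is a projection), and that its trace, which equals its rank for a projection, is r - 1.

```python
import math

def projection_matrix(p):
    """Return P = I - v v^T with v_i = sqrt(p_i); v is a unit vector because sum(p) = 1."""
    v = [math.sqrt(pi) for pi in p]
    r = len(p)
    return [[(1.0 if i == j else 0.0) - v[i] * v[j] for j in range(r)]
            for i in range(r)]

def matmul(a, b):
    """Plain square-matrix product, enough for this small check."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

p = [0.1, 0.2, 0.3, 0.4]  # illustrative cell probabilities (assumption)
r = len(p)
P = projection_matrix(p)
P_squared = matmul(P, P)

# A projection satisfies P @ P == P, and its trace equals its rank.
idempotent = all(abs(P[i][j] - P_squared[i][j]) < 1e-12
                 for i in range(r) for j in range(r))
trace = sum(P[i][i] for i in range(r))

print("idempotent:", idempotent)
print("trace (expect about r - 1):", trace)
print("P[0][1] vs -sqrt(p_0 p_1):", P[0][1], -math.sqrt(p[0] * p[1]))
```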
pdf  nibble  lecture-notes  mit  stats  hypothesis-testing  acm  probability  methodology  proofs  iidness  distribution  limits  identity  direction  lifts-projections
october 2017 by nhaliday
Early History of Electricity and Magnetism
The ancient Greeks also knew about magnets. They noted that on rare occasions "lodestones" were found in nature, chunks of iron-rich ore with the puzzling ability to attract iron. Some were discovered near the city of Magnesia (now in Turkey), and from there the words "magnetism" and "magnet" entered the language. The ancient Chinese discovered lodestones independently, and in addition found that after a piece of steel was "touched to a lodestone" it became a magnet itself.

...

One signpost of the new era was the book "De Magnete" (Latin for "On the Magnet") published in London in 1600 by William Gilbert, a prominent medical doctor and (after 1601) personal physician to Queen Elizabeth I. Gilbert's great interest was in magnets and the strange directional properties of the compass needle, always pointing close to north-south. He correctly traced the reason to the globe of the Earth being itself a giant magnet, and demonstrated his explanation by moving a small compass over the surface of a lodestone trimmed to a sphere (or supplemented to spherical shape by iron attachments?)--a scale model he named "terrella" or "little Earth," on which he was able to duplicate all the directional properties of the compass.
nibble  org:edu  org:junk  lecture-notes  history  iron-age  mediterranean  the-classics  physics  electromag  science  the-trenches  the-great-west-whale  discovery  medieval  earth
september 2017 by nhaliday
Lecture 14: When's that meteor arriving
- Meteors as a random process
- Limiting approximations
- Derivation of the Exponential distribution
- Derivation of the Poisson distribution
- A "Poisson process"
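One standard route to the Poisson distribution is as the limit of Binomial(n, lambda/n) as n grows. That may or may not be the exact route the lecture takes, so treat the following as an illustrative sketch. The rate `lam`, the count `k`, and the function names are assumptions chosen for the example.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam, k = 2.0, 3  # illustrative rate and count (assumptions)
target = poisson_pmf(k, lam)

# Split the interval into n slots, each with arrival probability lam / n.
gaps = {}
for n in (10, 100, 10_000):
    gaps[n] = abs(binomial_pmf(k, n, lam / n) - target)
    print(n, gaps[n])
```

The printed gap between the two probabilities shrinks as n increases.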
nibble  org:junk  org:edu  exposition  lecture-notes  physics  mechanics  space  earth  probability  stats  distribution  stochastic-processes  closure  additive  limits  approximation  tidbits  acm  binomial  multiplicative
september 2017 by nhaliday
Recitation 25: Data locality and B-trees
The same idea can be applied to trees. Binary trees are not good for locality because a given node of the binary tree probably occupies only a fraction of a cache line. B-trees are a way to get better locality. As in the hash table trick above, we store several elements in a single node -- as many as will fit in a cache line.

B-trees were originally invented for storing data structures on disk, where locality is even more crucial than with memory. Accessing a disk location takes about 5ms = 5,000,000ns. Therefore if you are storing a tree on disk you want to make sure that a given disk read is as effective as possible. B-trees, with their high branching factor, ensure that few disk reads are needed to navigate to the place where data is stored. B-trees are also useful for in-memory data structures because these days main memory is almost as slow relative to the processor as disk drives were when B-trees were introduced!
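A rough way to see the effect of branching factor on disk reads, as a sketch under a simplified model: one disk read per tree level, every node full, and the 5 ms access time quoted above. The key count and the fan-out of 1024 are illustrative assumptions, and `levels_needed` is a hypothetical helper.

```python
DISK_READ_MS = 5  # per-access disk latency quoted in the text

def levels_needed(n_keys, branching):
    """Smallest h with branching**h >= n_keys (idealized, fully packed tree)."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= branching
        levels += 1
    return levels

n_keys = 10**9  # illustrative data size (assumption)
binary_levels = levels_needed(n_keys, 2)
wide_levels = levels_needed(n_keys, 1024)  # hypothetical B-tree fan-out

print("binary tree:", binary_levels, "reads,", binary_levels * DISK_READ_MS, "ms")
print("fan-out 1024:", wide_levels, "reads,", wide_levels * DISK_READ_MS, "ms")
```

Under these assumptions the binary tree needs 30 levels (about 150 ms of reads per lookup) while the wide tree needs 3 (about 15 ms).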
nibble  org:junk  org:edu  cornell  lecture-notes  exposition  programming  engineering  systems  dbs  caching  performance  memory-management  os
september 2017 by nhaliday
Introduction to Scaling Laws
http://galileo.phys.virginia.edu/classes/304/scaling.pdf

Galileo’s Discovery of Scaling Laws: https://www.mtholyoke.edu/~mpeterso/classes/galileo/scaling8.pdf
Days 1 and 2 of Two New Sciences

An example of such an insight is “the surface of a small solid is comparatively greater than that of a large one” because the surface goes like the square of a linear dimension, but the volume goes like the cube. Thus as one scales down macroscopic objects, forces on their surfaces like viscous drag become relatively more important, and bulk forces like weight become relatively less important. Galileo uses this idea on the First Day in the context of resistance in free fall, as an explanation for why similar objects of different size do not fall exactly together, but the smaller one lags behind.
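The square-versus-cube argument can be made concrete with a cube of side L, where surface area is 6L^2 and volume is L^3, so their ratio is 6/L. The cube shape and the side lengths below are illustrative assumptions, not taken from the linked notes.

```python
def surface_to_volume(side):
    """Surface-area-to-volume ratio of a cube with the given side length."""
    surface = 6 * side**2   # grows like the square of the linear dimension
    volume = side**3        # grows like the cube of the linear dimension
    return surface / volume

for side in (0.1, 1.0, 10.0):  # illustrative sizes (assumption)
    print(side, surface_to_volume(side))
```

Shrinking the side by a factor of 10 raises the ratio by a factor of 10, which is why surface forces such as drag matter more for small objects.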
nibble  org:junk  exposition  lecture-notes  physics  mechanics  street-fighting  problem-solving  scale  magnitude  estimate  fermi  mental-math  calculation  nitty-gritty  multi  scitariat  org:bleg  lens  tutorial  guide  ground-up  tricki  skeleton  list  cheatsheet  identity  levers  hi-order-bits  yoga  metabuch  pdf  article  essay  history  early-modern  europe  the-great-west-whale  science  the-trenches  discovery  fluid  architecture  oceans  giants  tidbits
august 2017 by nhaliday
Subgradients - S. Boyd and L. Vandenberghe
If f is convex and x ∈ int dom f, then ∂f(x) is nonempty and bounded. To establish that ∂f(x) ≠ ∅, we apply the supporting hyperplane theorem to the convex set epi f at the boundary point (x, f(x)), ...
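As a concrete instance, take f(x) = |x| at x = 0, where the subdifferential is the interval [-1, 1], which is nonempty and bounded. The sketch below checks the subgradient inequality f(z) >= f(x) + g(z - x) on a finite grid. That is only a necessary check, and the grid and the helper name `is_subgradient` are assumptions made for illustration.

```python
def is_subgradient(f, x, g, test_points, tol=1e-12):
    """Check f(z) >= f(x) + g * (z - x) at every test point (necessary, not sufficient)."""
    return all(f(z) >= f(x) + g * (z - x) - tol for z in test_points)

points = [k / 10 for k in range(-50, 51)]  # illustrative grid on [-5, 5]

print(is_subgradient(abs, 0.0, 0.5, points))   # slope inside [-1, 1]
print(is_subgradient(abs, 0.0, -1.0, points))  # endpoint of [-1, 1]
print(is_subgradient(abs, 0.0, 1.5, points))   # slope outside [-1, 1]
```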
pdf  nibble  lecture-notes  acm  optimization  curvature  math.CA  estimate  linearity  differential  existence  proofs  exposition  atoms  math  marginal  convexity-curvature
august 2017 by nhaliday
Analysis of variance - Wikipedia
Analysis of variance (ANOVA) is a collection of statistical models used to analyze the differences among group means and their associated procedures (such as "variation" among and between groups), developed by statistician and evolutionary biologist Ronald Fisher. In the ANOVA setting, the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the t-test to more than two groups. ANOVAs are useful for comparing (testing) three or more means (groups or variables) for statistical significance. It is conceptually similar to multiple two-sample t-tests, but is more conservative (results in less type I error) and is therefore suited to a wide range of practical problems.

good pic: https://en.wikipedia.org/wiki/Analysis_of_variance#Motivating_example

tutorial by Gelman: http://www.stat.columbia.edu/~gelman/research/published/econanova3.pdf

so one way to think of partitioning the variance:
y_ij = alpha_i + beta_j + eps_ij
Var(y_ij) = Var(alpha_i) + Var(beta_j) + 2 Cov(alpha_i, beta_j) + Var(eps_ij)   (taking eps_ij independent of alpha_i and beta_j)
and alpha_i, beta_j are independent, so Cov(alpha_i, beta_j) = 0
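A quick simulation of this decomposition, as a sketch: it draws independent Gaussian effects, checks that the sample variance of y equals the component variances plus twice the pairwise sample covariances, and shows that those cross terms are small when the effects are independent. The standard deviations, sample size, and seed are arbitrary assumptions, and each observation gets its own alpha and beta draw rather than shared group effects.

```python
import random

def cov(x, y):
    """Population-style sample covariance; cov(x, x) is the variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

random.seed(0)
n = 20_000  # illustrative sample size (assumption)
alpha = [random.gauss(0.0, 1.0) for _ in range(n)]  # row effect
beta = [random.gauss(0.0, 2.0) for _ in range(n)]   # column effect
eps = [random.gauss(0.0, 0.5) for _ in range(n)]    # noise
y = [a + b + e for a, b, e in zip(alpha, beta, eps)]

var_y = cov(y, y)
sum_of_variances = cov(alpha, alpha) + cov(beta, beta) + cov(eps, eps)
cross_terms = 2 * (cov(alpha, beta) + cov(alpha, eps) + cov(beta, eps))

print("Var(y):", var_y)
print("sum of component variances:", sum_of_variances)
print("cross terms (near 0 under independence):", cross_terms)
```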

can you make this work w/ interaction effects?
data-science  stats  methodology  hypothesis-testing  variance-components  concept  conceptual-vocab  thinking  wiki  reference  nibble  multi  visualization  visual-understanding  pic  pdf  exposition  lecture-notes  gelman  scitariat  tutorial  acm  ground-up  yoga
july 2017 by nhaliday
Stat 260/CS 294: Bayesian Modeling and Inference
Topics
- Priors (conjugate, noninformative, reference)
- Hierarchical models, spatial models, longitudinal models, dynamic models, survival models
- Testing
- Model choice
- Inference (importance sampling, MCMC, sequential Monte Carlo)
- Nonparametric models (Dirichlet processes, Gaussian processes, neutral-to-the-right processes, completely random measures)
- Decision theory and frequentist perspectives (complete class theorems, consistency, empirical Bayes)
- Experimental design
unit  course  berkeley  expert  michael-jordan  machine-learning  acm  bayesian  probability  stats  lecture-notes  priors-posteriors  markov  monte-carlo  frequentist  latent-variables  decision-theory  expert-experience  confidence  sampling
july 2017 by nhaliday