Kolmogorov's zero–one law - Wikipedia
In probability theory, Kolmogorov's zero–one law, named in honor of Andrey Nikolaevich Kolmogorov, specifies that a certain type of event, called a tail event, will either almost surely happen or almost surely not happen; that is, the probability of such an event occurring is zero or one.

tail events include limsup E_i, i.e. the event that infinitely many of the E_i occur
Dvoretzky's theorem - Wikipedia
In mathematics, Dvoretzky's theorem is an important structural theorem about normed vector spaces proved by Aryeh Dvoretzky in the early 1960s, answering a question of Alexander Grothendieck. In essence, it says that every sufficiently high-dimensional normed vector space will have low-dimensional subspaces that are approximately Euclidean. Equivalently, every high-dimensional bounded symmetric convex set has low-dimensional sections that are approximately ellipsoids.

http://mathoverflow.net/questions/143527/intuitive-explanation-of-dvoretzkys-theorem
http://mathoverflow.net/questions/46278/unexpected-applications-of-dvoretzkys-theorem
Wald's equation - Wikipedia
important identity that simplifies the calculation of the expected value of the sum of a random number of random quantities: under suitable conditions, E[X_1 + ... + X_N] = E[N] * E[X_1]
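A small exact check of the identity, as a sketch only. It assumes the simplest setting, where N is independent of the i.i.d. X_i (Wald's equation also covers suitable stopping times, which this does not exercise). The die, the range of N, and the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    """Exact mean of a uniform distribution on `values`."""
    return Fraction(sum(values), len(values))


def expected_random_sum(x_values, n_values):
    """Exact E[X_1 + ... + X_N], where the X_i are i.i.d. uniform on
    x_values and N is uniform on n_values, independent of the X_i
    (an assumption of this sketch)."""
    total = Fraction(0)
    for n in n_values:
        # Average the sum over all equally likely outcomes (x_1, ..., x_n).
        outcomes = list(product(x_values, repeat=n))
        mean_sum = Fraction(sum(sum(o) for o in outcomes), len(outcomes))
        total += mean_sum / len(n_values)
    return total


die = [1, 2, 3, 4, 5, 6]   # hypothetical X: a fair die
counts = [1, 2, 3]         # hypothetical N: uniform on {1, 2, 3}

lhs = expected_random_sum(die, counts)
rhs = mean(counts) * mean(die)
print(lhs, rhs)  # 7 7
```

Here E[N] = 2 and E[X_1] = 7/2, so both sides come out to 7.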
probability - How to prove Bonferroni inequalities? - Mathematics Stack Exchange
- integrated version of inequalities for alternating sums of (N choose j), where r.v. N = # of events occurring
- inequalities for alternating binomial coefficients follow from a general property of unimodal (increasing then decreasing) sequences, which can be shown by handling the increasing and decreasing cases separately
- the final alternating zero sum property follows for binomial coefficients from expanding (1 - 1)^N = 0 (for N >= 1)
- The idea of proving inequality by integrating simpler inequality of r.v.s is nice. Proof from CS 150 was more brute force from what I remember.
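The pointwise inequalities behind the notes above can be checked numerically. This is a sketch, not the proof from the linked answer: for each value n of N it compares the truncated alternating sums of (n choose j) with the indicator of "at least one event occurs". Function names are my own.

```python
from math import comb


def alternating_partial_sum(n, k):
    """sum_{j=1}^{k} (-1)^(j+1) * C(n, j)."""
    return sum((-1) ** (j + 1) * comb(n, j) for j in range(1, k + 1))


def check_pointwise_bonferroni(max_n):
    """Check, for every n <= max_n, that odd truncations over-count and
    even truncations under-count the indicator 1{n >= 1}, and that the
    full sum equals the indicator exactly."""
    for n in range(max_n + 1):
        indicator = 1 if n >= 1 else 0
        for k in range(1, max_n + 1):
            s = alternating_partial_sum(n, k)
            if k % 2 == 1 and s < indicator:
                return False
            if k % 2 == 0 and s > indicator:
                return False
        # Full sum: 1 - (1 - 1)^n for n >= 1, and 0 for n = 0.
        if alternating_partial_sum(n, n) != indicator:
            return False
    return True


print(check_pointwise_bonferroni(12))  # True
print(alternating_partial_sum(4, 1), alternating_partial_sum(4, 2))  # 4 -2
```

Taking expectations of these pointwise inequalities with n replaced by the random count N gives the Bonferroni inequalities, which is the "integrating" step mentioned above.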
Computational Complexity: Favorite Theorems: The Yao Principle
The Yao Principle applies when we don't consider the algorithmic complexity of the players. For example in communication complexity we have two players who each have a separate half of an input string and they want to compute some function of the input with the minimum amount of communication between them. The Yao principle states that the best probabilistic strategies for the players will achieve exactly the same communication bounds as the best deterministic strategy over a worst-case distribution of inputs.

The Yao Principle plays a smaller role where we measure the running time of an algorithm since applying the Principle would require solving an extremely large linear program. But since so many of our bounds are in information-based models like communication and decision-tree complexity, the Yao Principle, though not particularly complicated, plays an important role in lower bounds in a large number of results in our field.
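A toy illustration of the principle as a two-player zero-sum game, as a sketch under made-up numbers: rows are deterministic algorithms, columns are inputs, and the 2x2 cost matrix is hypothetical. For any input distribution q, the best deterministic algorithm's average cost lower-bounds the worst-case expected cost of any randomized algorithm p, and the two meet at the optimal p and q.

```python
from fractions import Fraction


def distributional_cost(cost, q):
    """Expected cost of the best deterministic algorithm (row) when the
    input (column) is drawn from distribution q."""
    return min(sum(qx * c for qx, c in zip(q, row)) for row in cost)


def randomized_worst_case(cost, p):
    """Worst-case (over inputs) expected cost of the randomized algorithm
    that picks row a with probability p[a]."""
    n_inputs = len(cost[0])
    return max(
        sum(pa * cost[a][x] for a, pa in enumerate(p))
        for x in range(n_inputs)
    )


# Hypothetical cost matrix: cost[a][x] = cost of algorithm a on input x.
cost = [[1, 3],
        [4, 2]]

half = Fraction(1, 2)
p_opt = [half, half]                       # optimal mix of algorithms
q_opt = [Fraction(1, 4), Fraction(3, 4)]   # hardest input distribution
q_uniform = [half, half]                   # some other input distribution

print(distributional_cost(cost, q_uniform))   # 2, a valid lower bound
print(distributional_cost(cost, q_opt))       # 5/2
print(randomized_worst_case(cost, p_opt))     # 5/2
```

Finding the optimal p and q in general means solving a linear program over all deterministic algorithms, which is why the matrix here is kept tiny.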
Carathéodory's theorem (convex hull) - Wikipedia
- any convex combination in R^d can be pared down to at most d+1 points
- e.g., in R^2, any point in the convex hull of a set lies in a triangle whose vertices belong to that set
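A brute-force sketch of the R^2 case: given a point in the convex hull of a few points, try every triple and return the first triangle that contains it, using barycentric coordinates. This only illustrates the statement and is not the usual proof (which removes points one at a time via an affine dependence). The square and the query point are made up, and integer coordinates are assumed.

```python
from fractions import Fraction
from itertools import combinations


def barycentric(p, a, b, c):
    """Barycentric coordinates of p with respect to triangle (a, b, c),
    or None if the triangle is degenerate. Assumes integer coordinates."""
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    if det == 0:
        return None
    # Solve p - a = wb * (b - a) + wc * (c - a) by Cramer's rule.
    wb = Fraction((p[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (p[1] - a[1]), det)
    wc = Fraction((b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1]), det)
    return (1 - wb - wc, wb, wc)


def find_triangle(p, points):
    """Return (triangle, weights) for the first triple of points whose
    convex hull contains p, or None if no triple does."""
    for tri in combinations(points, 3):
        w = barycentric(p, *tri)
        if w is not None and all(x >= 0 for x in w):
            return tri, w
    return None


square = [(0, 0), (4, 0), (4, 4), (0, 4)]  # hypothetical point set
triangle, weights = find_triangle((1, 2), square)
print(triangle)  # ((0, 0), (4, 0), (0, 4))
print(weights)   # (Fraction(1, 4), Fraction(1, 4), Fraction(1, 2))
```

The point (1, 2) is a convex combination of all four corners, but three of them (d + 1 = 3 for d = 2) already suffice.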
