{
 "metadata": {
  "name": ""
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Hoeffding Inequality"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Run a computer simulation for flipping 1,000 virtual fair coins. Flip each coin independently 10 times. Focus on 3 coins as follows: \n",
      "\n",
      "* $c_1$ is the first coin flipped, \n",
      "* $c_{rand}$ is a coin chosen randomly from the 1,000, \n",
      "* $c_{min}$ is the coin which had the minimum frequency of heads (pick the earlier one in case of a tie). \n",
      "\n",
      "Let $\\nu_1$, $\\nu_{rand}$, and $\\nu_{min}$ be the fraction of heads obtained for the 3 respective coins out of the 10 tosses. Run\n",
      "the experiment 100,000 times in order to reduce variability in your results (note that $c_{rand}$ and $c_{min}$ will change from run to run) and compute average values for the $\\nu$'s.\n",
      "\n"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "1\\. $\\nu_{min}$ is closed to:\n",
      "\n",
      "- [a] 0\n",
      "- [b] 0.01\n",
      "- [c] 0.1\n",
      "- [d] 0.5\n",
      "- [e] 0.67\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import numpy as np\n",
      "\n",
      "def experiment(times, coins=1000, flip_count=10):\n",
      "    nu = np.zeros(3, dtype=float)\n",
      "    for i in xrange(times):\n",
      "        flips = np.random.binomial(flip_count, 0.5, coins)\n",
      "        nu += [flips[0], np.min(flips), np.random.choice(flips)]\n",
      "    return nu / times / flip_count\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 28
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "np.abs(experiment(100000)[1] - [0, 0.01, 0.1, 0.5, 0.67])"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 31,
       "text": [
        "array([ 0.037523,  0.027523,  0.062477,  0.462477,  0.632477])"
       ]
      }
     ],
     "prompt_number": 31
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "    Answer: [b] second difference is the smallest"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "2\\. Which coin or coins satisfy Hoeffding's Inequality?\n",
      "\n",
      "- [a] $c_1$ only\n",
      "- [b] $c_{rand}$ only\n",
      "- [c] $c_{min}$ only\n",
      "- [d] $c_1$ and $c_{rand}$\n",
      "- [e] $c_{min}$ and $c_{rand}$\n"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Hoeffding's Inequality:\n",
      "\n",
      "$P[|\\nu - \\mu| > \\epsilon] \\leq 2e^{-2\\epsilon^2 N}$"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "    Answer: [d] c_min is not a random draw from a single \"bin\" and as such is bounded by the union rule and not Hoeffding"
     ]
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Error and Noise"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Consider the bin model for a hypothesis $h$ that makes an error with probability $\\mu$ in approximating a deterministic target function $f$ (both $h$ and $f$ are binary functions). if we use same $h$ to approximate a noisy version of f given by:\n",
      "\n",
      "$P(y|x) = \\left\\{\n",
      "\\begin{array}{11}\n",
      "\\lambda   & y = f(x) \\\\\n",
      "1-\\lambda & y \\neq f(x)\n",
      "\\end{array}\n",
      "\\right.$\n",
      "\n"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "3\\. What is the probability of error that $h$ makes in approximating $y$? _Hint: Two wrongs can make a right!_\n",
      "\n",
      "- [a] $\\mu$\n",
      "- [b] $\\lambda$\n",
      "- [c] $1-\\mu$\n",
      "- [d] $(1-\\lambda) * \\mu + \\lambda * (1-\\mu)$\n",
      "- [e] $(1-\\lambda) (1-\\mu) + \\lambda * \\mu$\n"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "4\\. At what value of $\\lambda$ will the performance of $h$ be independent of $\\mu$?\n",
      "\n",
      "- [a] $0$\n",
      "- [b] $0.5$\n",
      "- [c] $1/\\sqrt{2}$\n",
      "- [d] $1$\n",
      "- [e] no values of $\\lambda$\n"
     ]
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Linear Regression"
     ]
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Nonlinear Transformation"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
}