6.2 Activity 1

Prompt

Calculate the cost:

\[ J(\mathbf{W}) = -\frac{1}{|\text{data}|} \sum_{(x,y) \in \text{data}} y \odot \log_2 p_{\text{model}(\mathbf{W})}(\mathbf{y} | \mathbf{x}) \]

where:

  • \(\mathbf{y}\) is the ground truth label.

  • \(p_{\text{model}}(y|x)\) is the predicted probability.

  • \(\odot\) denotes element-wise multiplication.

Given the data and model predictions:

\[ \mathbf{X} = \begin{bmatrix} 5 & 7 \\ 2 & 9 \\ 1 & 1 \end{bmatrix} \quad \mathbf{Y} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \quad p_{\text{model}(\mathbf{W})}(\mathbf{y}|\mathbf{x})= \begin{bmatrix} 0.5 & 0.5 \\ 0.9 & 0.1 \\ 0.9 & 0.1 \end{bmatrix} \]

Solution

Step 1: Compute \(y \odot \log_2 p_{\text{model}}(y|x)\) for each row

We first compute \(\log_2 p_{\text{model}}(y|x)\) for each probability:

\[ \begin{array}{ccc}
\text{Ground truth } y & \text{Predicted probabilities } p_{\text{model}}(y|x) & \log_2 p_{\text{model}}(y|x) \\ \hline
(0, 1) & (0.5, 0.5) & (\log_2 0.5, \log_2 0.5) = (-1, -1) \\
(1, 0) & (0.9, 0.1) & (\log_2 0.9, \log_2 0.1) = (-0.15, -3.32) \\
(0, 1) & (0.9, 0.1) & (\log_2 0.9, \log_2 0.1) = (-0.15, -3.32)
\end{array} \]
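The logarithms above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names `P` and `log_P` are illustrative and not part of the exercise.

```python
import numpy as np

# Predicted probabilities from the prompt, one row per data point.
P = np.array([[0.5, 0.5],
              [0.9, 0.1],
              [0.9, 0.1]])

# Element-wise base-2 logarithm of every probability.
log_P = np.log2(P)

# Rounded to two decimals for comparison with the hand-computed values.
print(np.round(log_P, 2))
```

Rounded to two decimals, the entries should agree with the last column of the table: \(\log_2 0.5 = -1\) exactly, while \(-0.15\) and \(-3.32\) are rounded values.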

Now, applying element-wise multiplication:

\[ \begin{array}{ccc}
\text{Ground truth } y & \log_2 p_{\text{model}}(y|x) & \text{Element-wise product } y \odot \log_2 p_{\text{model}}(y|x) \\ \hline
(0, 1) & (-1, -1) & (0 \times -1, 1 \times -1) = (0, -1) \\
(1, 0) & (-0.15, -3.32) & (1 \times -0.15, 0 \times -3.32) = (-0.15, 0) \\
(0, 1) & (-0.15, -3.32) & (0 \times -0.15, 1 \times -3.32) = (0, -3.32)
\end{array} \]
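Because each row of \(y\) is one-hot, the element-wise product keeps only the log-probability of the true class and zeroes out the other entry. A small sketch of this masking step, assuming NumPy (the names `Y`, `P`, and `masked` are illustrative):

```python
import numpy as np

# One-hot ground-truth labels and predicted probabilities from the prompt.
Y = np.array([[0, 1],
              [1, 0],
              [0, 1]])
P = np.array([[0.5, 0.5],
              [0.9, 0.1],
              [0.9, 0.1]])

# The * operator on NumPy arrays is element-wise, matching the ⊙ in the formula.
masked = Y * np.log2(P)

print(np.round(masked, 2))
```

The masked-out entries may print as `-0.` rather than `0.`, since multiplying a negative float by zero yields negative zero in floating-point arithmetic; numerically it equals 0.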

Step 2: Compute the cost for each dimension

Summing over all data points:

\[ \sum_{(x,y) \in \text{data}} y \odot \log_2 p_{\text{model}}(y|x) = (0 + (-0.15) + 0, -1 + 0 + (-3.32)) \\ = (-0.15, -4.32) \]

Dividing by \(|\text{data}| = 3\) and negating:

\[ J(\mathbf{W}) = -\frac{1}{3} (-0.15, -4.32) \\ = (0.05, 1.44) \]

Final Answer:

\[ J(\mathbf{W}) = (0.05, 1.44) \]
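The whole calculation can be reproduced in a few lines. This is a sketch under the assumption that NumPy is available; the helper `per_class_cost` is a hypothetical name introduced here for illustration, and it follows the solution in returning one cost entry per label dimension.

```python
import numpy as np

def per_class_cost(Y, P):
    """Return -(1/N) * sum over data points of Y * log2(P), one entry per column."""
    Y = np.asarray(Y, dtype=float)
    P = np.asarray(P, dtype=float)
    n_points = Y.shape[0]  # |data|
    # Sum the element-wise products over rows (data points), then negate and average.
    return -(Y * np.log2(P)).sum(axis=0) / n_points

# One-hot labels and model predictions from the prompt.
Y = [[0, 1],
     [1, 0],
     [0, 1]]
P = [[0.5, 0.5],
     [0.9, 0.1],
     [0.9, 0.1]]

J = per_class_cost(Y, P)
print(np.round(J, 2))
```

The code uses unrounded logarithms, so intermediate values differ slightly from the hand calculation, but the result rounded to two decimals should match \((0.05, 1.44)\).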