6.2 Activity 1
Prompt
Calculate the cost:
\[ J(\mathbf{W}) = -\frac{1}{|\text{data}|} \sum_{(x,y) \in \text{data}} y \odot \log_2 p_{\text{model}(\mathbf{W})}(\mathbf{y} | \mathbf{x}) \]
where:
\(\mathbf{y}\) is the ground truth label.
\(p_{\text{model}}(y|x)\) is the predicted probability.
\(\odot\) denotes element-wise multiplication.
Given the data and model predictions:
\[ \mathbf{X} = \begin{bmatrix} 5 & 7 \\ 2 & 9 \\ 1 & 1 \end{bmatrix} \quad \mathbf{Y} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \quad p_{\text{model}(\mathbf{W})}(\mathbf{y}|\mathbf{x})= \begin{bmatrix} 0.5 & 0.5 \\ 0.9 & 0.1 \\ 0.9 & 0.1 \end{bmatrix} \]
Solution
Step 1: Compute \(y \odot \log_2 p_{\text{model}}(y|x)\) for each row
We first compute \(\log_2 p_{\text{model}}(y \mid x)\) for each probability:
| Ground truth \(y\) | Predicted probabilities \(p_{\text{model}}(y \mid x)\) | \(\log_2 p_{\text{model}}(y \mid x)\) |
|---|---|---|
| (0, 1) | (0.5, 0.5) | \((\log_2 0.5, \log_2 0.5) = (-1, -1)\) |
| (1, 0) | (0.9, 0.1) | \((\log_2 0.9, \log_2 0.1) \approx (-0.15, -3.32)\) |
| (0, 1) | (0.9, 0.1) | \((\log_2 0.9, \log_2 0.1) \approx (-0.15, -3.32)\) |
Now, applying element-wise multiplication:
| Ground truth \(y\) | \(\log_2 p_{\text{model}}(y \mid x)\) | Element-wise product \(y \odot \log_2 p_{\text{model}}(y \mid x)\) |
|---|---|---|
| (0,1) | (-1, -1) | (0 × -1, 1 × -1) = (0, -1) |
| (1,0) | (-0.15, -3.32) | (1 × -0.15, 0 × -3.32) = (-0.15, 0) |
| (0,1) | (-0.15, -3.32) | (0 × -0.15, 1 × -3.32) = (0, -3.32) |
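The two tables in Step 1 can be reproduced with a short script. This is a minimal sketch using only Python's standard library; the names `Y`, `P`, `log_p`, and `products` are illustrative and not part of the activity.

```python
import math

# Ground-truth labels and predicted probabilities from the prompt.
Y = [[0, 1], [1, 0], [0, 1]]
P = [[0.5, 0.5], [0.9, 0.1], [0.9, 0.1]]

# Base-2 logarithm of every predicted probability.
log_p = [[math.log2(p) for p in row] for row in P]

# Element-wise product y * log2(p) for each row.
products = [
    [y * lp for y, lp in zip(y_row, lp_row)]
    for y_row, lp_row in zip(Y, log_p)
]

for y_row, prod_row in zip(Y, products):
    # Adding 0.0 turns a floating-point -0.0 into 0.0 for cleaner display.
    print(y_row, [round(v, 2) + 0.0 for v in prod_row])
```

Because each label row is one-hot, the product keeps only the log-probability that the model assigned to the true class and zeroes out the other entry.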
Step 2: Compute the cost for each dimension
Summing over all data points:
\[ \sum_{(x,y) \in \text{data}} y \odot \log_2 p_{\text{model}}(y|x) = (0 + (-0.15) + 0,\; -1 + 0 + (-3.32)) = (-0.15, -4.32) \]
Dividing by \(|\text{data}| = 3\) and applying the leading minus sign:
\[ J(\mathbf{W}) = -\frac{1}{3} (-0.15, -4.32) \approx (0.05, 1.44) \]
Final Answer:
\[ J(\mathbf{W}) \approx (0.05, 1.44) \]
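As a check on the arithmetic, the whole calculation can be sketched in a few lines of standard-library Python. The function name `cross_entropy_per_column` is hypothetical, chosen here for illustration; it returns one cost value per label dimension, matching the vector-valued answer above.

```python
import math

def cross_entropy_per_column(Y, P):
    """Return -(1/N) * sum over rows of y * log2(p), one value per column."""
    n = len(Y)
    totals = [0.0] * len(Y[0])
    for y_row, p_row in zip(Y, P):
        for j, (y, p) in enumerate(zip(y_row, p_row)):
            if y:  # zero labels contribute nothing to the sum
                totals[j] += y * math.log2(p)
    return [-t / n for t in totals]

# Data from the prompt.
Y = [[0, 1], [1, 0], [0, 1]]
P = [[0.5, 0.5], [0.9, 0.1], [0.9, 0.1]]

J = cross_entropy_per_column(Y, P)
print([round(v, 2) for v in J])  # prints [0.05, 1.44]
```

The large second component comes almost entirely from the third data point, where the model assigns probability 0.1 to the true class.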