Machine Learning - Feature Scaling and Learning Rate (Multi-variable)#
Tools#
import numpy as np
import matplotlib.pyplot as plt
from lab_utils_multi import load_house_data, run_gradient_descent
from lab_utils_multi import norm_plot, plt_equal_scale, plot_cost_i_w
from lab_utils_common import dlc
np.set_printoptions(precision=2)
plt.style.use('./deeplearning.mplstyle')
Problem Statement#
We reuse the housing price prediction example. The training dataset contains three examples with four features (size, number of bedrooms, number of floors, and age), shown in the table below.
Dataset:#
| Size (sqft) | Number of Bedrooms | Number of Floors | Age of Home | Price (1000s dollars) |
|---|---|---|---|---|
| 952 | 2 | 1 | 65 | 271.5 |
| 1244 | 3 | 2 | 64 | 232 |
| 1947 | 3 | 2 | 17 | 509.8 |
| … | … | … | … | … |
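The load_house_data helper below comes from the course's lab_utils_multi file. If that file is not on your path, a minimal stand-in built from the three rows shown above can be used instead; note that the real dataset contains more examples, and the function name below is only illustrative:
# Illustrative stand-in for load_house_data() when lab_utils_multi is unavailable.
# Contains only the three examples shown in the table above, not the full dataset.
def load_house_data_subset():
    X_train = np.array([[952.0, 2, 1, 65],
                        [1244.0, 3, 2, 64],
                        [1947.0, 3, 2, 17]])   # size(sqft), bedrooms, floors, age
    y_train = np.array([271.5, 232.0, 509.8])  # price in 1000s of dollars
    return X_train, y_train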
# load the dataset
X_train, y_train = load_house_data()
X_features = ['size(sqft)', 'bedrooms', 'floors', 'age']

# plot each feature against the target price
fig, ax = plt.subplots(1, 4, figsize=(12, 3), sharey=True)
for i in range(len(ax)):
    ax[i].scatter(X_train[:, i], y_train)
    ax[i].set_xlabel(X_features[i])
ax[0].set_ylabel("Price (1000's)")
plt.show()
Gradient Descent With Multiple Variables#
Here are the equations developed in the previous lab for gradient descent with multiple variables:
\[\begin{align*} \text{repeat}&\text{ until convergence:} \; \lbrace \newline\;
& w_j := w_j - \alpha \frac{\partial J(\mathbf{w},b)}{\partial w_j} \tag{1} \; & \text{for j = 0..n-1}\newline
&b\ \ := b - \alpha \frac{\partial J(\mathbf{w},b)}{\partial b} \newline \rbrace
\end{align*}\]
where \(n\) is the number of features, the parameters \(w_j\), \(b\) are updated simultaneously, and where
\[\begin{split}
\begin{align}
\frac{\partial J(\mathbf{w},b)}{\partial w_j} &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})x_{j}^{(i)} \tag{2} \\
\frac{\partial J(\mathbf{w},b)}{\partial b} &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}) \tag{3}
\end{align}
\end{split}\]
- \(m\) is the number of training examples in the data set
- \(f_{\mathbf{w},b}(\mathbf{x}^{(i)})\) is the model's prediction, while \(y^{(i)}\) is the target value
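To make equations (1)–(3) concrete, here is a minimal NumPy sketch of the gradient computation and the update loop. The names compute_gradient and gradient_descent are illustrative and not necessarily identical to the run_gradient_descent helper imported above:
def compute_gradient(X, y, w, b):
    """Compute dJ/dw and dJ/db per equations (2) and (3)."""
    m = X.shape[0]
    err = X @ w + b - y          # f_wb(x^(i)) - y^(i) for every example i
    dj_dw = (X.T @ err) / m      # equation (2), vectorized over all j
    dj_db = err.sum() / m        # equation (3)
    return dj_dw, dj_db

def gradient_descent(X, y, w, b, alpha, num_iters):
    """Repeat the simultaneous update of equation (1) for num_iters steps."""
    for _ in range(num_iters):
        dj_dw, dj_db = compute_gradient(X, y, w, b)
        w = w - alpha * dj_dw    # both updates use gradients from the same step,
        b = b - alpha * dj_db    # so w and b are updated simultaneously
    return w, b
With unscaled features like these, a learning rate \(\alpha\) that works well for one weight can cause another to diverge, which is the motivation for the feature scaling explored in this lab.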