NTU Hung-yi Lee Machine Learning 2021

news/2024/9/20 21:23:40

ML 2021 Spring (ntu.edu.tw): https://speech.ee.ntu.edu.tw/~hylee/ml/2021-spring.html

Discussion: ML2021Spring-hw1 | Kaggle

Different types of functions. How do we find a function?

Regression: the function outputs a scalar.

Classification: given a set of options (classes), the function outputs the correct one.

Structured Learning: the function creates something with structure (an image, a document).

1、Training

1.1、Model

1.2、Loss

Define the loss from the training data.

The loss is a function of the parameters, $L(b,w)$.

It measures how good a set of parameter values is.

If $y$ and $\hat{y}$ are both probability distributions, use cross-entropy.
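To make "loss is a function of the parameters" concrete, here is a minimal sketch. It assumes a linear model $y = b + wx$ and a mean-absolute-error loss, neither of which is stated in these notes; the data and the function names are hypothetical, chosen only for illustration. It also shows cross-entropy between a one-hot label and a predicted distribution.

```python
import math

def mean_abs_error_loss(b, w, xs, ys):
    """L(b, w): mean absolute error of the assumed model y = b + w * x."""
    return sum(abs(y - (b + w * x)) for x, y in zip(xs, ys)) / len(xs)

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy -sum_i y_i * log(y_hat_i) between two distributions."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# Hypothetical training data generated by y = 1 + 2x.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

loss_good = mean_abs_error_loss(1.0, 2.0, xs, ys)  # parameters that fit exactly
loss_bad = mean_abs_error_loss(0.0, 1.0, xs, ys)   # parameters that fit poorly

# Classification case: one-hot label vs. two predicted distributions.
label = [0.0, 1.0, 0.0]
ce_confident = cross_entropy(label, [0.1, 0.8, 0.1])
ce_unsure = cross_entropy(label, [0.3, 0.4, 0.3])
```

A smaller value means a better set of parameters: `loss_good` is below `loss_bad`, and the prediction that puts more probability on the correct class gets the lower cross-entropy.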

1.3、Optimization

Find the parameters $w^{*}$, $b^{*}$ that minimize $L$.

Method: Gradient Descent.
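The optimization step can be sketched as follows. This is only an illustration under stated assumptions: a linear model $y = b + wx$, a mean-squared-error loss (picked here because its gradient is simple to write by hand), hypothetical data, and a hand-picked learning rate and step count.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Minimize the MSE of the assumed model y = b + w * x by gradient descent."""
    w, b = 0.0, 0.0  # arbitrary starting point
    n = len(xs)
    for _ in range(steps):
        # Prediction error of the current parameters on every example.
        errors = [(b + w * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of L(b, w) = (1/n) * sum(error^2).
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient, scaled by the learning rate.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data generated by y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_star, b_star = gradient_descent(xs, ys)
```

On this noise-free data the loop settles close to $w = 2$, $b = 1$. A learning rate that is too large would make the updates diverge instead, which is why it has to be chosen by hand.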

In machine learning, the things you have to set yourself are called hyperparameters: they are decided by a person, not found by the machine.

Examples of hyperparameters: learning rate, batch size.

Batch and Epoch:

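($L$ is the loss computed over all $N$ training examples together; $L^{1}$ is the loss computed on the first batch alone. The gradient is computed from $L^{1}$ and the parameters are updated; then the next batch is taken and the same is done with $L^{2}$, and so on.)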

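So we do not compute the gradient from the full loss $L$; in practice we compute it from the per-batch losses $L^{1}$, $L^{2}$, $L^{3}$, and so on. Each time the parameters are updated is called an Update, and seeing every batch once (computing on each of them once) is called an Epoch. Update and Epoch are therefore different things: with $N$ examples and batch size $B$, one Epoch is not one parameter update but $N/B$ updates.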
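The difference between an Update and an Epoch can be checked by counting. The sketch below uses a hypothetical dataset of N = 1000 examples and batch size B = 100; the actual gradient step is left as a comment because only the counts matter here.

```python
import math

def iterate_batches(data, batch_size):
    """Yield consecutive slices of `data`, each with at most `batch_size` items."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

data = list(range(1000))  # N = 1000 hypothetical examples
batch_size = 100          # B = 100
num_epochs = 3

updates = 0
for epoch in range(num_epochs):
    for batch in iterate_batches(data, batch_size):
        # Here one would compute the batch loss, its gradient,
        # and update the parameters once.
        updates += 1

updates_per_epoch = updates // num_epochs
expected_per_epoch = math.ceil(len(data) / batch_size)
```

With these numbers there are 10 updates per epoch and 30 updates in total. When N is not a multiple of B, the last batch is smaller and the count is N/B rounded up.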

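PS: when the data is split into batches, a step called Shuffle is applied. Shuffling can be done in many ways, but a common one is to re-split the data into batches before each Epoch starts, so the batches differ from one Epoch to the next. Which examples end up in the same batch therefore changes every Epoch; this is what shuffle means.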
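One common way to shuffle is to draw a fresh random order of the examples before every epoch and then cut it into batches. The sketch below assumes that approach; the helper name `shuffled_batches` and the tiny dataset are hypothetical.

```python
import random

def shuffled_batches(data, batch_size, rng):
    """Randomly permute the examples, then split them into batches."""
    indices = list(range(len(data)))
    rng.shuffle(indices)  # new random order for this epoch
    return [
        [data[i] for i in indices[start:start + batch_size]]
        for start in range(0, len(indices), batch_size)
    ]

rng = random.Random(0)   # fixed seed so the run is reproducible
data = list(range(8))    # 8 hypothetical examples
batch_size = 4

# Re-split the data at the start of each of two epochs.
epoch_batches = [shuffled_batches(data, batch_size, rng) for _ in range(2)]
```

Every epoch still sees each example exactly once; only the grouping into batches changes.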