
Download "#17. Гауссовский байесовский классификатор | Машинное обучение"

Video tags
Bayesian classifier
Bayesian method
Bayesian approach
Bayesian inference
bayesian classification in data mining
bayesian classification
bayesian statistics
bayesian optimization
machine learning
machine learning python
machine learning from scratch
machine learning python lessons
machine learning lectures
machine learning course
artificial intelligence
artificial intelligence in python
machine learning tutorial
00:00:01
…Balakirev, and we are continuing the course on machine learning. In the previous lesson we looked at the naive Bayes classifier and generally made an introduction to the theory of the optimal Bayes classifier. In this lesson we will continue this topic and talk about the Gaussian Bayes classifier, that is, no longer the naive one, but the full-fledged Gaussian Bayes classifier.
00:00:25
Let's first assume that the set of objects in the training sample obeys the Gaussian (normal) distribution law; such a sample can look like this. Here we have, as it were, a two-dimensional feature space, and each point is an object of the training sample: these are the objects of class 1 and these are the objects of class 2.
00:00:46
Then, in accordance with Bayes' theorem, the inference should be built based on this formula. Here P(y) is the a priori probability of the appearance of class y, p(x|y) is the conditional distribution density, that is, the likelihood function linking inputs and outputs, and P(x) can be perceived as a kind of normalizing factor that does not depend on the class y.
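For reference, a plausible LaTeX reconstruction of the on-screen formula (the equation itself did not survive in the transcript):

    P(y \mid x) = \frac{P(y)\, p(x \mid y)}{P(x)}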
00:01:14
Since in our problem we assume that this sample obeys the Gaussian distribution, we can formally write this conditional probability distribution density in the form of a multidimensional Gaussian distribution. Here μ_y is the vector of mathematical expectations for the images x that belong to a strictly defined class, and Σ_y is the covariance matrix of the images that belong to a strictly defined class. The letter E is nothing more than the mathematical expectation operator: we take the mathematical expectation of the product of two vectors and in the end obtain the covariance matrix.
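In LaTeX, the density being described is presumably the standard multivariate normal:

    p(x \mid y) = \frac{1}{(2\pi)^{n/2} |\Sigma_y|^{1/2}} \exp\!\left( -\frac{1}{2} (x - \mu_y)^T \Sigma_y^{-1} (x - \mu_y) \right), \qquad \Sigma_y = E\left[ (x - \mu_y)(x - \mu_y)^T \right]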
00:01:57
Under these conditions, the classification decision algorithm for the multidimensional Gaussian probability distribution density is written in exactly the same way as for the optimal Bayes classifier; the difference is purely formal, only the form of the conditional distribution density itself changes, and everything else remains the same. That is, at the level of such general mathematics absolutely nothing changes. Let me remind you that λ_y is the penalty that we impose on the model for the incorrect classification of class y.
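The decision rule, reconstructed from this description (a sketch consistent with the lesson, not copied from the video):

    a(x) = \arg\max_y \, \lambda_y P(y)\, p(x \mid y)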
00:02:32
Look: from this formula follows the special case of the naive Bayes classifier, when this covariance matrix is diagonal, that is, it takes a form like this: the variances stand on the diagonal and all the rest are 0. If the covariance matrix is exactly like this, then in the case of the multidimensional Gaussian distribution we automatically obtain the probability distribution density written as the product of the corresponding one-dimensional ones. Here x with the index i at the top is the i-th feature of the vector x, that is, of the image x, and with such a covariance matrix we get that the features are independent of each other.
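In LaTeX, the factorization presumably shown on screen:

    \Sigma_y = \mathrm{diag}(\sigma_1^2, \dots, \sigma_n^2) \;\Rightarrow\; p(x \mid y) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\, \sigma_i} \exp\!\left( -\frac{(x^{(i)} - \mu_i)^2}{2\sigma_i^2} \right)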
00:03:13
But, as you understand, this assumption about the covariance matrix is a very strong, almost unrealistic assumption, hence the name naive Bayes classifier. Features in machine learning problems quite often turn out to be linearly dependent to some extent, and as we know from probability theory, the degree of linear dependence of any two random variables is determined by the covariance (this expression here), and by normalizing the covariance we obtain the correlation coefficient. The correlation coefficient varies in the range from minus one to one. If the correlation value is zero, then for Gaussian random variables this automatically means complete independence, and for other distributions, linear independence.
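In LaTeX, the normalization just mentioned:

    \rho = \frac{\mathrm{cov}(X_1, X_2)}{\sigma_1 \sigma_2}, \qquad -1 \le \rho \le 1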
00:04:01
If the correlation coefficient is equal to one or minus one, then the random variables are completely linearly dependent and will line up like this; in this case, from the value of one random variable it is possible to calculate the value of the second random variable, because they have a strict linear relationship. But in most practical problems the correlation coefficient in absolute value lies between zero and one, that is, the random variables in this case are only linearly dependent to some extent.
00:04:32
This means that for a certain value of one feature we can only indicate a certain range of values, a narrower range of values, for the second feature, and the use of the naive Bayes classifier in such a case may be questionable, especially if the correlation coefficient in absolute value is close to one. Good, but then the question is: how can we in practice take into account these covariance relationships between features and build not a naive but a full-fledged Bayes classifier? The good news is that in the case of a multidimensional Gaussian distribution this is relatively easy to do.
00:05:11
As you may have guessed, for this we just need to construct estimates of the mathematical expectation and of the covariance matrix from the training sample, for each class separately, and the formulas for calculating these estimates are generally simple and obvious: the mathematical expectation is estimated by the arithmetic mean, and the elements of the covariance matrix can be calculated in this way. There are theorems proving that these estimates correspond to the maximum likelihood estimates for normal Gaussian distributions, that is, these estimates will be adequate, and in the general case we cannot come up with anything better here.
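In LaTeX, the standard sample estimates for a class y with N_y training objects (my reconstruction of the formulas being referred to):

    \hat{\mu}_y = \frac{1}{N_y} \sum_{i=1}^{N_y} x_i, \qquad \hat{\Sigma}_y = \frac{1}{N_y} \sum_{i=1}^{N_y} (x_i - \hat{\mu}_y)(x_i - \hat{\mu}_y)^T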
00:05:43
Of course, for us to be able to trust these estimates, the number of objects of each class in the training set should be as large as possible, otherwise we simply won't collect sufficient statistics. It is believed that the minimum is 100 objects of one specific class, but it is desirable to have 1000 or more.
00:06:04
To better understand all this, let's apply this Gaussian Bayes classifier approach to a simulated binary classification problem, in which the training sample will be generated based on a bivariate normal distribution with the following parameters. The correlation coefficient and the variances for class 1 will be equal, respectively, to 0.8 and one; we take this vector of mathematical expectations, and the covariance matrix will be determined by these parameters. The same goes for the second class. In total we will generate a thousand images, that is, 1000 points for each class, and using this data we will then calculate these estimates of the mathematical expectation and the covariance matrix, and then apply the algorithm of the Gaussian Bayes classifier, which can be written like this.
00:06:57
This form we get when we take the natural logarithm of this optimal Bayes classifier, that is, we take the natural logarithm separately of this factor and of this factor, which represents the multidimensional probability distribution density. From the distribution density, the first term is singled out here, and the second term is what stands under the exponent (only the power remains), and in the end we come to this formula.
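A plausible LaTeX reconstruction of the resulting log-form rule (dropping the constant term that does not depend on y):

    a(x) = \arg\max_y \left[ \ln\!\big(\lambda_y P(y)\big) - \frac{1}{2} \ln |\Sigma_y| - \frac{1}{2} (x - \mu_y)^T \Sigma_y^{-1} (x - \mu_y) \right]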
00:07:26
Let's see how all this works in a Python program. It is quite simple; it will be posted on GitHub, and each of you can download it via the link under this video. At the beginning we import the necessary libraries, then we set the seed of the random number generator so that it produces the same values for all of you. Then we specify the variances and the correlation coefficient for the 1st class; here we have the vector of mathematical expectations and the covariance matrix, and the same for the second class. Then we model a thousand random variables of each class in accordance with the multivariate normal distribution, and we calculate the experimental estimates of the vector of mathematical expectations and of the covariance matrices, that is, one covariance matrix for class 1 and another covariance matrix for class 2.
00:08:17
Then we define the necessary parameters of the Gaussian Bayes classifier model: the probability of each class appearing and the penalty that we impose for incorrect classification (I took the same penalty everywhere). Then we write out this formula, the expression that stands under the argmax, and then we take the argmax for the input image x. Here we specify an arbitrary input image x with two components and calculate this formula for it, for class 1 and then for class 2; that is, for each class this vector of mathematical expectations and the covariance matrix will be different. Then we select the class whose probability is maximum, that is, this is the argmax. We display the class number in the console, and here we display the set of points that were generated. Let's run the program and see how it works.
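The actual machine_learning_17.py is in the author's GitHub repository linked in the description below; the following is only a minimal sketch of the same pipeline, where the mean vectors and variable names are my own illustrative assumptions, not the author's exact code:

    import numpy as np
    import matplotlib.pyplot as plt

    np.random.seed(0)  # fix the random seed so runs are reproducible

    # class parameters: correlation 0.8 and unit variances, as in the lesson;
    # the mean vectors are hypothetical
    r, D = 0.8, 1.0
    V = np.array([[D, r], [r, D]])  # covariance built from variance D and correlation r
    mean1, mean2 = np.array([1.0, -2.0]), np.array([1.0, 3.0])

    N = 1000  # points per class
    x1 = np.random.multivariate_normal(mean1, V, N)
    x2 = np.random.multivariate_normal(mean2, V, N)

    # experimental estimates of the mean vectors and covariance matrices
    mm1, mm2 = x1.mean(axis=0), x2.mean(axis=0)
    VV1, VV2 = np.cov(x1, rowvar=False), np.cov(x2, rowvar=False)

    Py = [0.5, 0.5]  # a priori class probabilities
    L = [1.0, 1.0]   # equal misclassification penalties

    def b(x, m, V, P, lam):
        # ln(lambda * P) - 0.5 * ln|V| - 0.5 * (x - m)^T V^{-1} (x - m)
        d = x - m
        return np.log(lam * P) - 0.5 * np.log(np.linalg.det(V)) - 0.5 * d @ np.linalg.inv(V) @ d

    x = np.array([0.0, -4.0])  # arbitrary input image with two components
    a = np.argmax([b(x, mm1, VV1, Py[0], L[0]), b(x, mm2, VV2, Py[1], L[1])])
    print(a)  # predicted class index: 0 or 1

    plt.scatter(x1[:, 0], x1[:, 1], s=10)
    plt.scatter(x2[:, 0], x2[:, 1], s=10)
    plt.show()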
00:09:10
Look at how the points are distributed. The class number we got is 0, since the input vector x here takes the values 0 and minus 4, that is, for X we take 0 and for Y we take -4, landing in the area of these blue dots, and the area of the blue dots is just the first class, which here has the number 0. Now let's put plus 4 instead of minus 4 and run the program again: our class has changed, that is, when we put 0 and plus 4 we ended up in the area of the orange dots, and accordingly the class number also changed. That is, the Gaussian Bayes classifier works.
00:09:52
But now let's take a closer look at how this Gaussian Bayes classifier differs from its naive implementation, when we consider the features independent. Of course, I already said that in this case, with the naive Bayes classifier, our covariance matrices take this form; but still, what does this actually give?
00:10:12
If you look again at the distribution of points of a Gaussian distribution with a correlation of 0.7, you can see the so-called scatter ellipse, and within this ellipse we can identify two principal axes, here marked in orange. The points are essentially distributed along these axes; this is how random variables behave in a Gaussian distribution. In the general case, the covariance matrix Σ that describes this behavior of the points can be represented in the form of such a spectral decomposition, where this is a matrix consisting of the eigenvectors of the matrix Σ, that is, of the covariance matrix (the eigenvectors are exactly the vectors that determine the directions of the principal axes of the scatter ellipse), and this diagonal matrix determines the variance of the scatter along each of the coordinates.
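In LaTeX, the spectral decomposition being described:

    \Sigma = Q \Lambda Q^T, \qquad \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)

where the columns of Q are the eigenvectors of Σ (the directions of the principal axes of the scatter ellipse) and the eigenvalues λ_i in Λ are the variances along those axes.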
00:11:07
In fact, this covariance matrix defines a linear transformation that maps an uncorrelated set of points into a correlated one, that is, into a set distributed approximately in this way. So, by writing the inverse of this covariance matrix into the classifier, we essentially move to a new space, to a new coordinate system, and in this new coordinate system the set of these points turns out to be uncorrelated. And here, in this new coordinate system, we process them in the usual naive way, that is, we simply sum them up and get the value. That is exactly how the Gaussian Bayes classifier works.
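A quick numerical illustration of that decorrelation idea (my own sketch, not code from the lesson):

    import numpy as np

    np.random.seed(0)
    V = np.array([[1.0, 0.7], [0.7, 1.0]])  # correlated covariance, rho = 0.7
    X = np.random.multivariate_normal([0, 0], V, 5000)

    w, Q = np.linalg.eigh(V)            # spectral decomposition: V = Q diag(w) Q^T
    Z = (X @ Q) / np.sqrt(w)            # rotate onto the principal axes and rescale
    print(np.cov(Z, rowvar=False).round(2))  # ~identity matrix: features are decorrelated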
00:12:00
We will talk in more detail about spectral decomposition, eigenvectors and eigenvalues in future classes, but I think the general principle of how Bayes classifiers with Gaussian distributions work is now clear to you.
[music]

Description:

The principle of construction and operation of the Gaussian Bayesian classifier in a multidimensional feature space, and how it differs from the naive Bayesian classifier. Info site: https://proproprogs.ru/ml Telegram channel: https://t.me/machine_learning_selfedu machine_learning_17.py: https://github.com/selfedu-rus/machine_learning
