A Preliminary Study of Machine Learning-Linear Regression

Digression

I have been interested in artificial intelligence for a long time. My college graduation thesis used a genetic algorithm to solve a classic pathfinding problem.
I have always been in awe of classic human ideas such as traditional data structures and algorithms, for example the classic sorting algorithms or dynamic programming, which solve seemingly complicated problems in a dozen lines of code or even a single for loop. To me this has a kind of beauty, and it makes me admire the great ideas of human beings.
But with traditional algorithms, people write the code, and people solve the problem through a complete problem-solving idea of their own. If a machine could have thoughts of its own, if it could "learn" how to solve a problem, wouldn't that be cool? From what I know so far, though, artificial intelligence is more like a tool, a "mathematical tool" or a "statistical tool", that sums up "rules" from a large amount of data in order to solve practical problems. It is still far from a computer that really thinks, and at present the two may not even be the same thing. For machines to truly think, breakthroughs in other disciplines, such as human cognition and brain science, are probably needed. Haha, that is a long way off.

Let me first go over some simple background knowledge.

Linear

What is linear?

There is a class of geometric objects, such as straight lines, planes, and cubes, that have clear edges and corners and are all "straight". In mathematics they are called linear.

Problems involving them are easy to handle. For example, as we learned in high school, two straight lines can be described by two linear equations. To find their intersection, combine the two equations into a system and solve it; the solution is the point of intersection.
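
As a small illustration of the paragraph above, here is a minimal Python sketch that finds the intersection of two lines by solving their two linear equations with Cramer's rule. The function name intersect and the two example lines are made up for illustration; they do not come from the original figure.

```python
def intersect(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 for (x, y).

    Returns None when the lines are parallel (or identical),
    because then there is no single intersection point.
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Hypothetical example: y = 2x + 1 and y = -x + 4,
# rewritten as -2x + y = 1 and x + y = 4.
point = intersect(-2, 1, 1, 1, 1, 4)
print(point)
```

Because both equations are linear, the whole problem reduces to a handful of multiplications and one division, which is exactly why linear problems are so convenient.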

Why study linearity

(1) The world and universe we live in are too complicated. Many phenomena are hard to understand, let alone describe mathematically.

(2) Some complex problems that meet certain conditions can be transformed into simple linear problems. Linear problems can be fully understood and fully described by mathematics.

Regression

From what I know so far, machine learning has two main kinds of tasks.
The first is classification, for example:

Judge whether a picture shows a cat or a dog (binary classification, because the target I defined has only two possible outcomes: cat or dog)
Determine whether a stock will rise or fall tomorrow
Determine which digit a picture of a number shows (multi-class classification, because the target I defined has 10 possible outcomes, 0 to 9)

In other words, the result of a classification task always falls within a predefined set of outcomes.

The second type of task is a regression task, and its result is a continuous numeric value, not a category.
For example:

Predict house price
Forecast stock price

What is machine learning

This is my current, simple understanding: machine learning is a mathematical tool. We feed a large amount of learning material to the machine, which runs a machine learning algorithm to train a model. We then hand a new problem to the machine, and it computes the result with that model.

An intuitive first look at linear regression

For example, suppose I have collected two sets of data, x and y (such as age and height), and I want to know whether there is a linear relationship between the two variables. I first draw a scatter plot with one variable on the x-axis and the other on the y-axis.

Then I can look for a straight line with the following property: it is as close as possible to all the scattered points, or in other words, the sum of the distances between the points and the line is as small as possible.
With that line, I can predict an unknown y value from a known x value.
If there really is a linear relationship between x and y, the predictions will be quite good. So the main task of linear regression is to find this straight line.

Univariate linear regression

Let's start with univariate linear regression, that is, assume x has only one feature (such as the nitric oxide concentration) and y is the housing price.
Following the intuition above, our goal is to find the best straight line:

$$y = ax + b$$

This amounts to finding the parameters a and b.
More precisely, given the m sample points (x_i, y_i), we want the a and b that make

$$\sum_{i=1}^{m} (y_i - a x_i - b)^2$$

as small as possible. This expression is called the loss function.
You may ask why we minimize the sum of the squared differences, rather than the sum of their absolute values, or of their third or fourth powers.
Minimizing the sum of squared differences is known in mathematics as the method of least squares. Here is a link, so I won't go into details here:
https://www.zhihu.com/question/24095027
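
To make the loss function concrete, here is a minimal Python sketch that evaluates the sum of squared differences for two candidate lines. The sample points and the function name squared_loss are made up for illustration.

```python
def squared_loss(xs, ys, a, b):
    """Sum of squared differences between each true y and the prediction a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Made-up sample points that lie roughly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

loss_good = squared_loss(xs, ys, a=2.0, b=0.0)  # a line close to the points
loss_bad = squared_loss(xs, ys, a=0.5, b=1.0)   # a line far from the points
print(loss_good, loss_bad)
```

The line that passes close to the points gets a much smaller loss than the one that does not, so "finding the best line" becomes "finding the a and b with the smallest loss".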

This is the basic idea behind a whole class of machine learning algorithms: define a loss function for the problem, then optimize that loss function to obtain the model.
How do we find the minimum of this loss function, that is, the values of a and b? We take the partial derivatives with respect to a and b; the point where both derivatives are 0 is the extreme point.
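First we take the partial derivative with respect to b and set it to 0 (using the chain rule for composite functions), where J(a, b) denotes the loss function and m the number of samples:

$$\frac{\partial J}{\partial b} = \sum_{i=1}^{m} 2\,(y_i - a x_i - b)(-1) = 0$$

Simplifying:

$$\sum_{i=1}^{m} y_i - a \sum_{i=1}^{m} x_i - m b = 0 \quad\Longrightarrow\quad b = \bar{y} - a\bar{x}$$

Here \bar{x} and \bar{y} are the means of the x_i and the y_i. Following the same process for a (the simplification is omitted) gives:

$$a = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m} (x_i - \bar{x})^2}$$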
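
As a sanity check on the derivation, the following sketch computes a and b with the standard closed-form least-squares formulas (a as a ratio of two sums over the centered data, b as the mean of y minus a times the mean of x) and then confirms numerically that both partial derivatives of the loss vanish at that point. The data and the function names fit_line and gradient are made up for illustration.

```python
def fit_line(xs, ys):
    """Closed-form least-squares estimates of a (slope) and b (intercept)."""
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    a = num / den
    b = y_mean - a * x_mean
    return a, b


def gradient(xs, ys, a, b):
    """Partial derivatives of sum((y - a*x - b)^2) with respect to a and b."""
    d_a = sum(-2 * x * (y - a * x - b) for x, y in zip(xs, ys))
    d_b = sum(-2 * (y - a * x - b) for x, y in zip(xs, ys))
    return d_a, d_b


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 2.0, 3.0, 5.0]
a, b = fit_line(xs, ys)
print(a, b)
print(gradient(xs, ys, a, b))
```

For these five points the fit is approximately a = 0.8 and b = 0.4, and both derivatives come out as zero up to floating-point rounding.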

Now let's implement this in Python.
In short, I need to define two methods:

fit: the fitting method, or what we usually call the training method. The training data is passed in as arguments, and the method computes the parameters of the model.
predict: the prediction method. Pass in a value of x and it returns the predicted value.
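
The original code listing is not available here, so the following is only a minimal sketch of what such a class might look like, assuming NumPy is installed. The class name SimpleLinearRegression, the attribute names a_ and b_, and the sample data are my own choices for illustration, and the formulas used are the standard closed-form least-squares estimates.

```python
import numpy as np


class SimpleLinearRegression:
    """Univariate linear regression y = a*x + b fitted by least squares."""

    def __init__(self):
        self.a_ = None
        self.b_ = None

    def fit(self, x_train, y_train):
        """Learn a and b from 1-D training arrays of equal length."""
        x_train = np.asarray(x_train, dtype=float)
        y_train = np.asarray(y_train, dtype=float)
        x_mean = x_train.mean()
        y_mean = y_train.mean()
        # Numerator and denominator of a, each written as a dot product.
        num = (x_train - x_mean).dot(y_train - y_mean)
        den = (x_train - x_mean).dot(x_train - x_mean)
        self.a_ = num / den
        self.b_ = y_mean - self.a_ * x_mean
        return self

    def predict(self, x):
        """Return a*x + b for a scalar or an array of x values."""
        return self.a_ * np.asarray(x, dtype=float) + self.b_


# Made-up training data.
model = SimpleLinearRegression().fit([1, 2, 3, 4, 5], [1, 3, 2, 3, 5])
print(model.a_, model.b_)
print(model.predict([6, 7]))
```

Returning self from fit is a design choice that allows the construct-and-fit one-liner shown in the usage lines.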


One thing to note here: a is computed with vectorized operations instead of a loop. The numerator and denominator of a could each be computed with a loop.
But each of them can also be seen as a dot product of two vectors (corresponding components of the two vectors are multiplied, and the products are summed).
This has two advantages:

The code is clearer
Vector operations run in optimized, low-level code that can process many elements at once (and can even be offloaded to a GPU by libraries that support it), which is much faster than looping element by element in Python
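
To illustrate the two advantages, here is a rough sketch that computes the numerator of a both ways and times them, assuming NumPy is installed. The random data and the function names are made up, and the timings will vary from machine to machine, so no specific figures are claimed.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
x = rng.random(100_000)
y = rng.random(100_000)
x_mean, y_mean = x.mean(), y.mean()


def numerator_loop():
    """Numerator of a, accumulated one element at a time."""
    total = 0.0
    for x_i, y_i in zip(x, y):
        total += (x_i - x_mean) * (y_i - y_mean)
    return total


def numerator_dot():
    """Numerator of a, written as a single dot product."""
    return (x - x_mean).dot(y - y_mean)


# Both versions compute the same number (up to rounding).
print(np.isclose(numerator_loop(), numerator_dot()))

# Rough timings; the exact figures depend on the machine.
print("loop:", timeit.timeit(numerator_loop, number=5))
print("dot: ", timeit.timeit(numerator_dot, number=5))
```

Note that plain NumPy runs on the CPU; its speed comes from compiled, optimized routines rather than from a GPU.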

Once a and b have been calculated, we have a model (in this example, y = ax + b) and can make predictions: plug x into the equation to get the predicted y value.

Multiple linear regression

Having understood univariate linear regression, the next question is how to make predictions when there are multiple features.
That is multiple linear regression.
What multiple linear regression looks for is an equation of this form:

$$\hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$

That is, each feature is multiplied by a constant coefficient, and a constant (the intercept) is added.
We organize these coefficients into a (column) vector:

$$\theta = (\theta_0, \theta_1, \theta_2, \ldots, \theta_n)^T$$

Then, for convenience, we introduce x_0 and fix it to 1, so that the prediction simplifies to the dot product of two vectors:

$$\hat{y} = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n = (x_0, x_1, \ldots, x_n) \cdot \theta$$

Next, stack all the x vectors (samples, each with x_0 = 1 in front) as the rows of a matrix X_b, and keep theta as a column vector. Then the vector \hat{y} holds the predicted values for all samples. This uses matrix-vector multiplication (haha, if you have forgotten it, it is time to review linear algebra).

$$\hat{y} = X_b \cdot \theta$$
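
The matrix form can be sketched in a few lines of NumPy. The sample values and the coefficients in theta below are made up purely to show the shapes involved, and the variable name X_b (the sample matrix with a leading column of ones) is my own label.

```python
import numpy as np

# Three made-up samples with two features each.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# theta = (theta_0, theta_1, theta_2): intercept first, then one coefficient per feature.
theta = np.array([0.5, 2.0, -1.0])

# Prepend the x0 = 1 column so the intercept becomes part of the product.
X_b = np.hstack([np.ones((X.shape[0], 1)), X])

# One matrix-vector product yields the predictions for all samples at once.
y_hat = X_b.dot(theta)
print(y_hat)
```

Each entry of y_hat is the dot product of one row of X_b with theta, so the first prediction here is 0.5 + 2.0 * 1.0 - 1.0 * 2.0 = 0.5.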

Then, following the least squares method, our goal is to make

$$\sum_{i=1}^{m} (y_i - \hat{y}_i)^2 = (y - X_b \theta)^T (y - X_b \theta)$$

as small as possible. This means differentiating with respect to the whole vector theta. The detailed derivation is omitted; here is the final solution for theta, where X_b is the sample matrix with the extra column of ones and y is the vector of true values:

$$\theta = (X_b^T X_b)^{-1} X_b^T y$$

In other words, we obtain a closed-form solution for the parameters directly through mathematical derivation. In general, however, relatively few machine learning methods have such a closed-form solution; for the others, methods such as gradient descent can be used to find the parameters.
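
For comparison, here is a minimal sketch of gradient descent applied to the univariate case, in plain Python. The learning rate, the number of steps, the starting point, and the data are all assumptions chosen so that this small example converges; they are not tuned values from the article.

```python
def gradient_descent(xs, ys, learning_rate=0.01, steps=5000):
    """Minimize the sum of squared errors of y = a*x + b by gradient descent."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        # Partial derivatives of sum((y - a*x - b)^2) with respect to a and b.
        grad_a = sum(-2 * x * (y - a * x - b) for x, y in zip(xs, ys))
        grad_b = sum(-2 * (y - a * x - b) for x, y in zip(xs, ys))
        # Step against the gradient, updating both parameters together.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 2.0, 3.0, 5.0]
a, b = gradient_descent(xs, ys)
print(a, b)
```

On these points the iteration settles at roughly a = 0.8 and b = 0.4, which is the least-squares line for this data. A learning rate that is too large would make the iteration diverge instead.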

Implementing multiple linear regression

Next, let's implement it based on this closed-form solution.
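
The original listing is not available here, so this is only a sketch of one possible implementation, assuming NumPy is installed. It uses the standard normal-equation solution theta = (X_b^T X_b)^(-1) X_b^T y, where X_b is the sample matrix with a leading column of ones. The class name, the attribute names intercept_ and coef_, and the data are my own choices for illustration.

```python
import numpy as np


class LinearRegression:
    """Multiple linear regression fitted with the normal equation."""

    def __init__(self):
        self.intercept_ = None
        self.coef_ = None

    def fit(self, X_train, y_train):
        """Learn the intercept and one coefficient per feature column."""
        X_train = np.asarray(X_train, dtype=float)
        y_train = np.asarray(y_train, dtype=float)
        # Prepend the x0 = 1 column so theta[0] is the intercept.
        X_b = np.hstack([np.ones((X_train.shape[0], 1)), X_train])
        # theta = (X_b^T X_b)^(-1) X_b^T y
        theta = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y_train)
        self.intercept_ = theta[0]
        self.coef_ = theta[1:]
        return self

    def predict(self, X):
        """Predict y for a 2-D array with one sample per row."""
        X = np.asarray(X, dtype=float)
        return X.dot(self.coef_) + self.intercept_


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]

model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)
print(model.predict([[3, 2]]))
```

Inverting the matrix explicitly mirrors the formula most directly. In practice, np.linalg.lstsq or np.linalg.pinv is usually a more numerically robust choice, especially when features are strongly correlated.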

Simple linear regression in practice (Boston housing price prediction)

The Boston housing price data set is one of the data sets bundled with sklearn (a machine learning framework), in versions before 1.2.

To be honest, I was confused when I first saw this data set. Is this example supposed to teach us to predict housing prices? To predict tomorrow's housing prices in Shenzhen?
I think it can be understood this way: by collecting the features described below (the learning material) together with the average housing prices in certain areas of Boston (the target), we can work out how you, or a real estate agent, should price a house. In other words, the data set helps us understand which factors have a greater impact on housing prices.

Data introduction

The data set contains housing data for the suburbs of Boston, Massachusetts. It comes from the UCI Machine Learning Repository (where it has since been taken offline), and its statistics date from 1978. It includes 506 samples, each with 13 feature variables plus the average house price of the area.

Field meaning

The fields show that the researchers hoped to identify the important factors affecting housing prices, such as environmental factors (nitric oxide concentration) and location factors (weighted distance to five central areas of Boston), among others. (I believe the factors affecting housing prices in China are far more complicated than this.)

Once the model is solved, the value of each parameter is obtained (or learned). If a real estate agent then wants to set a price, they can collect these features for the house and use the model's predict method to get a reference price.

We can also see which factors are positively correlated with housing prices and which are negatively correlated. The larger the absolute value of a parameter, the more that feature affects the housing price, provided the features are measured on comparable scales. This is the interpretability of linear regression results (which some other machine learning methods do not offer).
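
Reading the signs of the parameters can be sketched as follows. The feature names and coefficient values below are hypothetical placeholders, not results fitted on the Boston data; they only show how one might sort coefficients and split them into positive and negative groups.

```python
# Hypothetical coefficients, NOT results fitted on the real data set.
coefficients = {
    "rooms": 3.5,
    "nox": -12.0,
    "distance": -1.2,
    "river": 2.4,
}

# Sort from most negative to most positive.
ranked = sorted(coefficients.items(), key=lambda item: item[1])

for name, value in ranked:
    direction = "positive" if value > 0 else "negative"
    print(f"{name:10s} {value:7.2f}  ({direction} correlation with price)")

positive = [name for name, value in ranked if value > 0]
negative = [name for name, value in ranked if value < 0]
```

Comparing magnitudes across features is only meaningful when the features share a comparable scale, so standardizing the inputs before fitting is a common preparation step.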

You are welcome to follow the blog of Aotu Labs (凹凸实验室): aotu.io

Or follow the AOTULabs official account (AOTULabs), which publishes articles from time to time.
