Digression
I have been interested in artificial intelligence for a long time. My college graduation thesis used genetic algorithms to solve a classic pathfinding problem.
I have always been in awe of classic human ideas, such as traditional data structure and algorithm problems, classic sorting algorithms, or dynamic programming, where seemingly complicated problems can be solved with a dozen lines of code or even a single for loop. I find a kind of beauty in this, and it makes me admire the great ideas of human beings.
But a traditional computer algorithm works because people write the code, and people solve the problem with a complete problem-solving idea. If the machine could have thoughts of its own, if it could "learn" how to solve the problem, wouldn't that be cool? From what I know so far, though, artificial intelligence is more like a tool, a "mathematical tool" or a "statistical tool": it sums up a "rule" from a large amount of data in order to solve practical problems. It is still far from computers really thinking, and at present the two may not even be the same thing. For machines to think, breakthroughs in other disciplines, such as human cognition and brain science, may be needed. Haha, that is a long way off.
Let me first introduce some of my own simple understanding.
Linear
- What is linear?
There is a class of geometric objects, such as straight lines, planes, and cubes, that have edges and corners and are all "straight". In mathematics they are called linear.
Problems involving them are simple to handle. For example, as I learned in high school, two straight lines can each be expressed by a linear equation. To find their intersection,
combine the two equations into a system and solve it; the solution is the point of intersection.
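As a small illustration of solving such a system, here is a sketch in Python with NumPy. The two lines, x + y = 3 and x - y = 1, are made-up examples of mine and do not come from the text above.

```python
import numpy as np

# Two example lines (hypothetical, chosen only for illustration):
#   x + y = 3
#   x - y = 1
# Written in matrix form as A @ [x, y] = c.
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
c = np.array([3.0, 1.0])

# Solving the linear system gives the intersection point (x, y).
intersection = np.linalg.solve(A, c)
print(intersection)
```

If the two lines were parallel, the matrix A would be singular and np.linalg.solve would raise an error, which matches the geometric fact that parallel lines have no single intersection point.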
- Why study linearity
(1) The world and universe we live in are too complicated, and many phenomena cannot be understood, let alone described with mathematics;
(2) Some complex problems, when they meet certain conditions, can be transformed into simple linear problems. Linear problems can be fully understood and fully described with mathematics.
Regression
From what I know so far, machine learning has two main kinds of tasks.
The first is classification, for example:
- Judge whether a picture shows a cat or a dog (binary classification, because the target I defined has two possible conclusions: cat or dog)
- Determine whether a stock will rise or fall tomorrow
- Determine which digit a picture of a number shows (multi-class classification, because the target I defined has 10 possible conclusions: 0-9)
In other words, the result of classification falls within a predefined set of outcomes.
The second is regression, whose result is a continuous numeric value rather than a category.
For example:
- Predict a house price
- Forecast a stock price
What is machine learning
This is my current, simple understanding. For now, I think of machine learning as a mathematical tool. We feed a large amount of learning material to the machine, and the machine runs a machine learning algorithm on it to train a model. Then a new problem is given to the machine, and the machine computes the result with this model.
An intuitive first look at linear regression
For example, suppose I have collected two sets of data, x and y (such as age and height), and I want to know whether there is a linear relationship between the two variables. I first draw a scatter plot with one variable on the x-axis and the other on the y-axis.
Then I look for a straight line that lies as close as possible to all of the scattered points; in other words, the line for which the total distance between the points and the line is smallest.
With that line, I can predict an unknown y value from a known x value.
If x and y really do have a linear relationship, the predictions are quite good. So the main task of linear regression is to find this straight line.
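To make this idea concrete, here is a minimal sketch that fits a line to a few points and predicts a new value. The data is made up for illustration, and np.polyfit is simply a ready-made least-squares line fit from NumPy, not necessarily what the rest of this article uses.

```python
import numpy as np

# Made-up sample data lying roughly on a straight line (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit a degree-1 polynomial, i.e. a straight line y = slope * x + intercept.
# np.polyfit returns the coefficients from the highest degree down.
slope, intercept = np.polyfit(x, y, 1)

# Predict y for an unseen x value using the fitted line.
x_new = 6.0
y_pred = slope * x_new + intercept
print(slope, intercept, y_pred)
```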
Univariate linear regression
Let's start with univariate linear regression, that is, assume x has only one feature (such as the concentration of nitric oxide) and y is the housing price.
Following the intuition above, our goal is to find the best linear equation:
$$y = ax + b$$
In other words, it is the process of finding the parameters a and b.
More precisely, given the m sample points $(x^{(i)}, y^{(i)})$, our goal is to make
$$\sum_{i=1}^{m} \left( y^{(i)} - a x^{(i)} - b \right)^2$$
as small as possible. This expression is called the loss function.
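The loss described here, the sum of squared differences between each observed y and the line's prediction ax + b, can be sketched as a small function. The function name squared_error_loss and the data are my own illustrative choices.

```python
import numpy as np

def squared_error_loss(a, b, x, y):
    """Sum of squared differences between y and the line a * x + b."""
    residuals = y - (a * x + b)
    return float(np.sum(residuals ** 2))

# Made-up data that lies exactly on y = 2x + 1 (illustrative only).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

loss_true_line = squared_error_loss(2.0, 1.0, x, y)   # the line that generated the data
loss_worse_line = squared_error_loss(1.0, 1.0, x, y)  # a line with the wrong slope
print(loss_true_line, loss_worse_line)
```

The line that generated the data gives a loss of 0, and any other choice of a and b gives a larger value, which is why minimizing this function picks out the best line.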
You may ask why we minimize the sum of the squared differences, rather than the sum of the absolute values of the differences, or the sum of the differences raised to the 3rd or 4th power.
Minimizing the sum of squared differences is called the least squares method in mathematics. Here is a link with a discussion:
https://www.zhihu.com/question/24095027 . I won't go into details here.
So the basic idea of one class of machine learning algorithms is to determine a loss function for the problem and then optimize that loss function to obtain a model.
How do we find the minimum of this loss function, that is, the values of a and b? We take the derivative with respect to a and with respect to b; the point where both derivatives are 0 is the extreme point.
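Writing the loss as $J(a, b) = \sum_{i=1}^{m} ( y^{(i)} - a x^{(i)} - b )^2$, we first differentiate with respect to b (using the chain rule for composite functions):
$$\frac{\partial J}{\partial b} = \sum_{i=1}^{m} 2 \left( y^{(i)} - a x^{(i)} - b \right)(-1) = 0$$
Simplifying, with $\bar{x}$ and $\bar{y}$ denoting the means of x and y:
$$b = \bar{y} - a\bar{x}$$
Following the same process, a is obtained (the simplification is omitted):
$$a = \frac{\sum_{i=1}^{m} (x^{(i)} - \bar{x})(y^{(i)} - \bar{y})}{\sum_{i=1}^{m} (x^{(i)} - \bar{x})^2}$$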
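For readers who want the omitted steps, here is a sketch of the standard least-squares derivation. The notation (m samples, superscript (i) for the i-th sample, bars for means) is my own assumption about how the original formulas were written.

```latex
% Loss over m samples (x^{(i)}, y^{(i)}):
J(a, b) = \sum_{i=1}^{m} \left( y^{(i)} - a x^{(i)} - b \right)^2

% Derivative with respect to b, set to zero:
\frac{\partial J}{\partial b}
  = -2 \sum_{i=1}^{m} \left( y^{(i)} - a x^{(i)} - b \right) = 0
  \;\Longrightarrow\; b = \bar{y} - a \bar{x}

% Derivative with respect to a, set to zero, then substitute b = \bar{y} - a\bar{x}:
\frac{\partial J}{\partial a}
  = -2 \sum_{i=1}^{m} \left( y^{(i)} - a x^{(i)} - b \right) x^{(i)} = 0
  \;\Longrightarrow\;
  \sum_{i=1}^{m} \left( y^{(i)} - \bar{y} \right) x^{(i)}
  = a \sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right) x^{(i)}

% Since \sum_i (y^{(i)} - \bar{y}) = 0 and \sum_i (x^{(i)} - \bar{x}) = 0,
% subtracting \bar{x} times each of these sums changes nothing, which gives:
a = \frac{\sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)\left( y^{(i)} - \bar{y} \right)}
         {\sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)^2}
```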
Then implement it in Python:
In short, I need to define two methods.
- fit: the fitting method, also commonly called the training method. The training data is passed into this method as arguments, and the model's parameters are obtained.
- predict: the prediction method. Pass a value of x into this method to get the predicted value.
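The two methods could look something like the sketch below. It uses the standard least-squares solution: the slope a is the ratio of two dot products of mean-centered vectors, and the intercept is b = mean(y) - a * mean(x). The class name SimpleLinearRegression and the attribute names a_ and b_ are my own assumptions, not necessarily those of the original code.

```python
import numpy as np

class SimpleLinearRegression:
    """Univariate least-squares fit of y = a * x + b (illustrative sketch)."""

    def __init__(self):
        self.a_ = None  # slope, set by fit
        self.b_ = None  # intercept, set by fit

    def fit(self, x_train, y_train):
        """Compute a and b from 1-D training data."""
        x_train = np.asarray(x_train, dtype=float)
        y_train = np.asarray(y_train, dtype=float)
        if x_train.ndim != 1 or x_train.shape != y_train.shape:
            raise ValueError("x_train and y_train must be 1-D arrays of equal length")

        x_mean = x_train.mean()
        y_mean = y_train.mean()

        # Numerator and denominator of a, each written as a dot product.
        numerator = (x_train - x_mean).dot(y_train - y_mean)
        denominator = (x_train - x_mean).dot(x_train - x_mean)

        self.a_ = numerator / denominator
        self.b_ = y_mean - self.a_ * x_mean
        return self

    def predict(self, x):
        """Return a * x + b for each value in x."""
        if self.a_ is None:
            raise RuntimeError("call fit before predict")
        return self.a_ * np.asarray(x, dtype=float) + self.b_

# Made-up data lying exactly on y = 2x + 1 (illustrative only).
model = SimpleLinearRegression().fit([1, 2, 3, 4], [3, 5, 7, 9])
print(model.a_, model.b_, model.predict([5]))
```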
One thing to note here: vectorization is used instead of a loop to compute a. The numerator and denominator of a could each be computed with a loop.
But each of them can also be viewed as a dot product of two vectors (that is, the corresponding components of the two vectors are multiplied together and the products are summed).
This has two advantages:
- The code is clearer
- Vector operations run in optimized low-level routines that can work on many elements in parallel (on the CPU's vector units or, with a suitable library, on a GPU), which is much faster than an explicit loop in Python
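As a quick check that the two approaches agree, the sketch below computes the numerator of a both ways. The data is made up, and the variable names are mine.

```python
import numpy as np

# Made-up data (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])
x_mean, y_mean = x.mean(), y.mean()

# Loop version: accumulate one product per sample.
num_loop = 0.0
for x_i, y_i in zip(x, y):
    num_loop += (x_i - x_mean) * (y_i - y_mean)

# Vectorized version: one dot product of the mean-centered vectors.
num_vec = (x - x_mean).dot(y - y_mean)

print(num_loop, num_vec)
```

Both versions give the same number; the vectorized one is shorter and, on large arrays, usually much faster because the loop runs inside NumPy's compiled code.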
Once a and b have been computed, we have a model (in this example, y = ax + b) and can make predictions: substitute x into this equation to get the predicted y value.
Multiple linear regression
Now that we understand univariate linear regression, the next question is how to make predictions when there are multiple features.
That is multiple linear regression.
What multiple linear regression looks for is an equation of the form
$$\hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$$
That is, each feature has a constant coefficient in front of it, and a constant (the intercept) is added.
Here we organize these coefficients into a (column) vector
$$\theta = (\theta_0, \theta_1, \theta_2, \dots, \theta_n)^T$$
Then, for convenience, we introduce $x_0$ and set it equal to 1, so that the equation simplifies to the dot product of two vectors
$$\hat{y} = x_b \cdot \theta, \quad x_b = (x_0, x_1, x_2, \dots, x_n), \quad x_0 = 1$$
Then stack all of the x vectors (samples) as the rows of a matrix $X_b$, and keep θ as a column vector. The vector $\hat{y} = X_b \theta$ then holds the predicted values for all samples. This uses matrix-vector multiplication (haha, if you have forgotten it, you will need to review linear algebra).
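Here is a minimal sketch of that matrix-vector product in NumPy. The sample matrix and the parameter values are made up, and the names X_b and theta are my own labels for the matrix with the extra column of ones and the parameter vector.

```python
import numpy as np

# Made-up data: 3 samples, 2 features each (illustrative only).
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Hypothetical parameters: theta[0] is the intercept, the rest are coefficients.
theta = np.array([0.5, 2.0, -1.0])

# Prepend the column x0 = 1 so the intercept becomes part of the product.
X_b = np.hstack([np.ones((X.shape[0], 1)), X])

# One matrix-vector multiplication gives the predictions for all samples.
y_hat = X_b @ theta
print(y_hat)
```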
Then, according to the least squares method, our goal is to make
$$(y - X_b\theta)^T (y - X_b\theta)$$
as small as possible, where $X_b$ is the matrix of samples with the extra column $x_0 = 1$ and y is the vector of true values. This means taking the derivative with respect to the whole vector θ. The derivation is omitted; here is the final solution for θ:
$$\theta = (X_b^T X_b)^{-1} X_b^T y$$
That is, we have obtained a closed-form mathematical solution for the parameters directly through derivation. In general, however, relatively few machine learning methods admit such a closed-form solution; for the others, the parameters have to be found with other methods, such as gradient descent.
Implementing multiple linear regression
Next, we implement it based on this closed-form solution.
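A possible implementation is sketched below. It relies on the standard least-squares solution theta = (X_b^T X_b)^(-1) X_b^T y, where X_b is the sample matrix with a leading column of ones. The class name NormalEquationRegression and the attributes intercept_ and coef_ are my own assumptions. As a design choice, the sketch solves the linear system with np.linalg.solve instead of forming the inverse explicitly, which gives the same theta when X_b^T X_b is invertible.

```python
import numpy as np

class NormalEquationRegression:
    """Multiple linear regression via the closed-form solution (illustrative sketch)."""

    def __init__(self):
        self.intercept_ = None  # theta_0
        self.coef_ = None       # theta_1 ... theta_n

    def fit(self, X_train, y_train):
        """Compute theta from a 2-D sample matrix and a 1-D target vector."""
        X_train = np.asarray(X_train, dtype=float)
        y_train = np.asarray(y_train, dtype=float)
        if X_train.ndim != 2 or X_train.shape[0] != y_train.shape[0]:
            raise ValueError("X_train must be 2-D with one row per target value")

        # Prepend the column x0 = 1 for the intercept.
        X_b = np.hstack([np.ones((X_train.shape[0], 1)), X_train])

        # Solve (X_b^T X_b) theta = X_b^T y, equivalent to applying the inverse.
        theta = np.linalg.solve(X_b.T @ X_b, X_b.T @ y_train)

        self.intercept_ = theta[0]
        self.coef_ = theta[1:]
        return self

    def predict(self, X):
        """Return the predicted value for each row of X."""
        if self.coef_ is None:
            raise RuntimeError("call fit before predict")
        return np.asarray(X, dtype=float) @ self.coef_ + self.intercept_

# Made-up data generated from y = 1 + 2*x1 - 3*x2 (illustrative only).
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

model = NormalEquationRegression().fit(X, y)
print(model.intercept_, model.coef_, model.predict([[3.0, 2.0]]))
```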
Simple linear regression in practice (Boston housing price prediction)
The Boston housing price data set shipped with older versions of sklearn (a machine learning framework); the load_boston loader was removed in scikit-learn 1.2.
To be honest, I was confused when I first saw this data set. Is this example meant to teach us to predict housing prices? To predict tomorrow's housing prices in Shenzhen?
I think it can be understood this way. By collecting some features (the learning material), as shown in the figure below, together with the average housing prices in certain areas of Boston (the target), we can infer how you, or a real estate agent, should price a house more sensibly. In other words, the data set lets us understand which factors have a greater impact on housing prices.
Data introduction
The data set contains housing data for the suburbs of Boston, Massachusetts. It comes from the UCI machine learning repository (where the data set is now offline). The data dates from 1978. It includes 506 samples, each with 12 feature variables and the average house price of the area.
Field meaning
It can be seen that the researchers hoped to find the important factors affecting housing prices, such as environmental factors (nitric oxide concentration) and location factors (weighted distance to the 5 central areas of Boston). (I believe, though, that the factors affecting housing prices in China are much more complicated than this.)
After solving, we obtain (or learn) the value of each parameter. Then, if a real estate agent wants to set a price, they can collect these features and use the model's predict method to get a reference value for the house price.
We can also see which factors are positively correlated with housing prices and which are negatively correlated. The larger the absolute value of a parameter, the more that feature affects the housing price, provided the features are measured on comparable scales. This is the interpretability of linear regression results (which some machine learning methods do not offer).
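To illustrate reading signs and sizes of parameters, here is a sketch on synthetic data. It does not use the Boston data set: the two features, their names, and the coefficients that generate the target are all invented for this example. The features are standardized first so that the sizes of the fitted parameters can be compared.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-ins (NOT the Boston data): one feature that raises the
# target and one that lowers it, measured on very different scales.
feature_pos = rng.normal(6.0, 1.0, n)
feature_neg = rng.normal(0.5, 0.1, n)
target = 5.0 * feature_pos - 20.0 * feature_neg + rng.normal(0.0, 0.5, n)

X = np.column_stack([feature_pos, feature_neg])

# Standardize each feature so coefficient magnitudes are comparable.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Least-squares fit with an intercept column of ones.
X_b = np.hstack([np.ones((n, 1)), X_std])
theta, *_ = np.linalg.lstsq(X_b, target, rcond=None)

for name, value in zip(["intercept", "feature_pos", "feature_neg"], theta):
    print(name, value)
```

On the raw scale, feature_neg has the larger coefficient in absolute value (20 versus 5), but after standardization feature_pos comes out larger, because a typical change in feature_pos moves the target more. This is why comparing parameter sizes only makes sense on comparable scales.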
You are welcome to follow the Aotu Labs blog: aotu.io
Or follow the AOTULabs official account (AOTULabs), which publishes articles from time to time.