The Linear Equation

The linear equation (1):

$$ y = w_0 + w_1 x_1 + w_2 x_2 + \cdots + w_k x_k + \varepsilon \tag{1} $$

Written in matrix form, equation (1) becomes (2):

$$ y = Xw + \varepsilon \tag{2} $$

That is:

$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & x_{11} & \cdots & x_{1k} \\
1 & x_{21} & \cdots & x_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & \cdots & x_{nk}
\end{bmatrix}
\begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_k \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
$$
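As a small illustration of the matrix form (2), the following sketch builds a design matrix with a leading column of ones and evaluates y = Xw + ε. The weight and noise values are made up for the example.

```cpp
#include <iostream>
#include <Eigen/Dense>

using namespace Eigen;

int main() {
    // Design matrix X with a leading column of ones for the intercept w0 (n = 3, k = 2)
    MatrixXd X(3, 3);
    X << 1, 1, 2,
         1, 2, 4,
         1, 3, 6;

    // Hypothetical weights w = (w0, w1, w2) and noise vector eps (values made up for illustration)
    VectorXd w(3), eps(3);
    w << 1.0, 0.5, 1.0;
    eps << 0.1, -0.2, 0.05;

    // Equation (2): y = Xw + eps
    VectorXd y = X * w + eps;
    std::cout << "y =\n" << y << std::endl;
    return 0;
}
```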
Linear Regression (LR)

Solution
Assumptions of linear regression:
- The independent variables $x_i$ ($i = 1, \dots, k$) are not strictly linearly correlated with one another
- The sample size n satisfies $n \ge 30$ or $n \ge 3(k+1)$
- The disturbance term $\varepsilon$ is not linearly correlated with x, and its mean is zero

From the linear equation (2), the residual is:

$$ \varepsilon = y - Xw $$

The goal of linear regression is to find a $\hat{w}$ that minimizes the sum of squared residuals, so that the predicted y values are as close as possible to the observations. Setting the gradient of $\lVert y - Xw \rVert^2$ to zero gives the normal-equation solution:

$$ \hat{w} = \arg\min_{w} \lVert y - Xw \rVert^2 = \big(X^{\top} X\big)^{-1} X^{\top} y $$

After solving, the prediction formula is:

$$ \hat{y} = X\hat{w} $$
Code
```cpp
#include <iostream>
#include <Eigen/Dense>

using namespace Eigen;

int main() {
    // Input data
    MatrixXd X(5, 2);
    X << 1, 2,
         2, 4,
         3, 6,
         4, 8,
         5, 10;
    VectorXd y(5);
    y << 3, 6, 9, 12, 15;

    // Add a column of ones to X for the intercept term
    MatrixXd X_ones(X.rows(), X.cols() + 1);
    X_ones << MatrixXd::Ones(X.rows(), 1), X;

    // Least-squares solution. The normal equation beta = (X'X)^-1 X'y assumes X'X is
    // invertible; in this toy data the second feature is exactly twice the first, so
    // X'X is singular. A rank-revealing decomposition still returns a valid
    // (minimum-norm) least-squares solution.
    VectorXd beta = X_ones.completeOrthogonalDecomposition().solve(y);

    // Print the coefficients (intercept and slopes)
    std::cout << "Intercept: " << beta(0) << std::endl;
    std::cout << "Slope 1: " << beta(1) << std::endl;
    std::cout << "Slope 2: " << beta(2) << std::endl;
    return 0;
}
```
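The program above only prints the fitted coefficients. A minimal continuation (placed just before `return 0;` in the program above) that applies the prediction formula $\hat{y} = x_*^{\top}\hat{w}$ to a hypothetical new input (6, 12) could look like this:

```cpp
    // Predict y for a hypothetical new input (6, 12); the leading 1 matches the intercept column
    VectorXd x_star(3);
    x_star << 1, 6, 12;
    double y_star = beta.dot(x_star);  // y_hat = x*' * beta
    std::cout << "Prediction at (6, 12): " << y_star << std::endl;
```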
Bayesian Regression (BLR)

Solution
Assumptions of Bayesian linear regression:
- The residual (noise) $\varepsilon$ follows a normal distribution, and its variance $\sigma^2$ follows an inverse-Gamma distribution

Under these assumptions, for a fixed X, the distribution of the observed y, i.e. the likelihood $p(y \mid X)$, is:

$$ p(y \mid X, w, \sigma^2) = \mathcal{N}\big(y \mid Xw,\, \sigma^2 I\big) = \prod_{i=1}^{n} \mathcal{N}\big(y_i \mid x_i^{\top} w,\, \sigma^2\big) $$
Combined with Bayes' theorem:

$$ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} $$

- P(A) is the prior probability of A; it does not take any information about B into account
- P(A|B) is the conditional probability of A given that B has occurred
- P(B|A) is the conditional probability of B given that A has occurred
- P(B) is the prior probability of B; it does not take any information about A into account
Working through the derivation yields the posterior distribution (3). With a zero-mean Gaussian prior $p(w) = \mathcal{N}(w \mid 0, \alpha^{-1} I)$ on the weights and the noise precision $\beta = 1/\sigma^2$ treated as fixed (the simplification used in the code below), the posterior over the weights is:

$$ p(w \mid y, X) = \mathcal{N}(w \mid \mu, \Sigma), \qquad \Sigma = \big(\alpha I + \beta X^{\top} X\big)^{-1}, \qquad \mu = \beta\, \Sigma X^{\top} y \tag{3} $$
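The derivation is a standard completing-the-square argument; a brief sketch under the simplification above (Gaussian prior with precision $\alpha$, fixed noise precision $\beta$):

$$ p(w \mid y, X) \;\propto\; p(y \mid X, w)\, p(w) \;\propto\; \exp\!\Big(-\tfrac{\beta}{2}\lVert y - Xw\rVert^2\Big)\exp\!\Big(-\tfrac{\alpha}{2}\, w^{\top} w\Big) $$

Collecting the terms that are quadratic and linear in w and completing the square gives exactly the Gaussian in (3), with $\Sigma^{-1} = \alpha I + \beta X^{\top} X$ and $\mu = \beta\, \Sigma X^{\top} y$.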
Prediction

For a new input $x_*$, the posterior predictive distribution under the same formulation is Gaussian:

$$ p(y_* \mid x_*, y, X) = \mathcal{N}\big(y_* \mid x_*^{\top}\mu,\; x_*^{\top}\Sigma\, x_* + \beta^{-1}\big) $$
Code
```cpp
#include <iostream>
#include <Eigen/Dense>

using namespace Eigen;

int main() {
    // Input data
    MatrixXd X(5, 2);
    X << 1, 2,
         2, 4,
         3, 6,
         4, 8,
         5, 10;
    VectorXd y(5);
    y << 3, 6, 9, 12, 15;

    // Add a column of ones to X for the intercept term
    MatrixXd X_ones(X.rows(), X.cols() + 1);
    X_ones << MatrixXd::Ones(X.rows(), 1), X;

    // Prior parameters
    double alpha = 1.0;  // Prior precision for the coefficients
    double beta = 1.0;   // Noise precision (1 / noise variance), treated as fixed

    // Posterior parameters: Sigma = (alpha*I + beta*X'X)^-1, mu = beta * Sigma * X'y
    Index M = X_ones.cols();
    MatrixXd XtX = X_ones.transpose() * X_ones;
    VectorXd Xty = X_ones.transpose() * y;
    MatrixXd Sigma_inv = alpha * MatrixXd::Identity(M, M) + beta * XtX;
    MatrixXd Sigma = Sigma_inv.inverse();
    VectorXd mu = beta * Sigma * Xty;

    // Predictive mean for a new input point
    MatrixXd X_star(1, 2);  // New input point for prediction
    X_star << 6, 12;
    MatrixXd X_star_ones(X_star.rows(), X_star.cols() + 1);
    X_star_ones << MatrixXd::Ones(X_star.rows(), 1), X_star;
    VectorXd y_pred_vec = X_star_ones * mu;  // 1-element vector
    double y_pred = y_pred_vec(0);

    // Print the posterior mean coefficients (intercept and slopes)
    std::cout << "Intercept: " << mu(0) << std::endl;
    std::cout << "Slope 1: " << mu(1) << std::endl;
    std::cout << "Slope 2: " << mu(2) << std::endl;

    // Print the predicted output
    std::cout << "Predicted Output: " << y_pred << std::endl;
    return 0;
}
```
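The program above reports only the predictive mean. Under the same Gaussian formulation, the predictive variance at $x_*$ is $x_*^{\top}\Sigma\, x_* + 1/\beta$; as a sketch, it could be computed just before `return 0;` in the program above:

```cpp
    // Posterior predictive variance: var(y*) = x*' Sigma x* + 1/beta
    MatrixXd pred_cov = X_star_ones * Sigma * X_star_ones.transpose();  // 1x1 matrix
    double y_var = pred_cov(0, 0) + 1.0 / beta;
    std::cout << "Predictive Variance: " << y_var << std::endl;
```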
Gaussian Process Regression (GPR)

Solution
Assumptions of Gaussian process regression:
- The function values at any finite collection of inputs x follow a joint Gaussian distribution
According to the Gaussian process model, equation (4):

$$ y = f(X), \qquad f \sim \mathcal{GP}\big(m(\cdot),\, k(\cdot, \cdot)\big) \tag{4} $$

and the prior distribution, equation (5):

$$ f(X) \sim \mathcal{N}\big(m(X),\, K(X, X)\big) \tag{5} $$

where:
- m(X) is the mean function
- K(X, X) is the covariance matrix of X with itself, whose entries are given by a kernel function $k(x_i, x_j)$
Conditioning on the observed y, the posterior distribution is:

$$ f(X_*) \mid X, y, X_* \;\sim\; \mathcal{N}\big(\bar{f}_*,\; \operatorname{cov}(f_*)\big) $$
Prediction

$$ \bar{f}_* = K(X_*, X)\, K(X, X)^{-1}\, y $$

is the predictive mean, and

$$ \operatorname{cov}(f_*) = K(X_*, X_*) - K(X_*, X)\, K(X, X)^{-1}\, K(X, X_*) $$

is the predictive variance.
Code
```cpp
#include <iostream>
#include <Eigen/Dense>

using namespace Eigen;

// Squared exponential (RBF) kernel
double kernel(const VectorXd& x1, const VectorXd& x2, double lengthScale, double variance) {
    double squaredDistance = (x1 - x2).squaredNorm();
    return variance * std::exp(-squaredDistance / (2.0 * lengthScale * lengthScale));
}

int main() {
    // Input data
    MatrixXd X(5, 1);
    X << 1,
         2,
         3,
         4,
         5;
    VectorXd y(5);
    y << 3, 6, 9, 12, 15;

    // New input point for prediction
    VectorXd X_star(1);
    X_star << 6;

    // Kernel hyperparameters
    double lengthScale = 1.0;
    double variance = 1.0;

    // Calculate the kernel (covariance) matrix K(X, X)
    int n = X.rows();
    MatrixXd K(n, n);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            K(i, j) = kernel(X.row(i).transpose(), X.row(j).transpose(), lengthScale, variance);
        }
    }
    // Add small jitter to the diagonal of the kernel matrix for numerical stability
    K.diagonal().array() += 1e-6;

    // Kernel vector k(X, x*) between the training points and the new input
    VectorXd k_star(n);
    for (int i = 0; i < n; i++) {
        k_star(i) = kernel(X.row(i).transpose(), X_star, lengthScale, variance);
    }

    // Predictive mean and variance:
    //   mu    = k*' K^-1 y
    //   sigma = k(x*, x*) - k*' K^-1 k*
    MatrixXd K_inv = K.inverse();
    double mu = k_star.dot(K_inv * y);
    double sigma = kernel(X_star, X_star, lengthScale, variance) - k_star.dot(K_inv * k_star);

    // Print the predictive mean and variance
    std::cout << "Predicted Mean: " << mu << std::endl;
    std::cout << "Predicted Variance: " << sigma << std::endl;
    return 0;
}
```

Artificial Neural Network Linear Regression (ANNR)
Solution
An artificial neural network solves for the weight parameter values that minimize the loss L by propagating the prediction error (residual) backward through the network.
For a neural network, the solution proceeds in the following steps:
1. Choose the network structure: for linear regression, a single-layer network is usually sufficient.
2. Choose the activation function: for linear regression, use a linear (identity) activation.
3. Forward pass: compute the prediction $\hat{y}$.
4. Compute the loss value L (here, the mean squared error).
5. Backward pass: use gradient descent or another optimizer to compute the weight updates (see the gradient expression after this list).
6. Update the weight parameters w, and repeat steps 3 to 6 until the loss L is minimized.
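For the mean squared error loss used in the code below, the backward pass reduces to a closed-form gradient (the code drops the constant factor 2, which only rescales the learning rate):

$$ L = \frac{1}{n}\lVert \hat{y} - y \rVert^2, \qquad \frac{\partial L}{\partial w} = \frac{2}{n} X^{\top}(\hat{y} - y), \qquad w \leftarrow w - \eta\, \frac{\partial L}{\partial w} $$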
Prediction

Once trained, the network's prediction for a new input is just the forward pass, $\hat{y} = X\hat{w}$, the same linear form as before.
Code
```cpp
#include <iostream>
#include <Eigen/Dense>

using namespace Eigen;

// Linear activation function (identity), matching step 2 above
MatrixXd linearActivation(const MatrixXd& x) {
    return x;
}

// Neural network forward pass: a single linear layer
VectorXd predict(const MatrixXd& X, const MatrixXd& W) {
    return linearActivation(X * W);
}

int main() {
    // Input data (X) and true output (y)
    MatrixXd X(5, 2);
    X << 1, 2,
         2, 4,
         3, 6,
         4, 8,
         5, 10;
    VectorXd y(5);
    y << 3, 6, 9, 12, 15;

    // Add a column of ones to X for the bias term
    MatrixXd X_bias(X.rows(), X.cols() + 1);
    X_bias << MatrixXd::Ones(X.rows(), 1), X;

    // Randomly initialize the weights (including the bias weight)
    MatrixXd W(X_bias.cols(), 1);
    W.setRandom();

    // Hyperparameters
    double learningRate = 0.01;
    int epochs = 1000;

    // Training loop (batch gradient descent)
    for (int epoch = 0; epoch < epochs; ++epoch) {
        // Forward pass: prediction
        VectorXd y_pred = predict(X_bias, W);

        // Residual and mean squared error loss
        VectorXd loss = y_pred - y;
        double meanSquaredError = loss.squaredNorm() / X.rows();

        // Backward pass: gradient of (1/2)*MSE with respect to W
        // (dropping the factor of 2 only rescales the learning rate)
        MatrixXd gradient = X_bias.transpose() * loss / X.rows();
        W -= learningRate * gradient;

        // Report progress occasionally
        if (epoch % 200 == 0) {
            std::cout << "Epoch " << epoch << ", MSE: " << meanSquaredError << std::endl;
        }
    }

    // Print the learned weights
    std::cout << "Learned Weights:" << std::endl;
    std::cout << W << std::endl;
    return 0;
}
```
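As in the other sections, prediction with the trained network is just another forward pass. A minimal continuation for a hypothetical new input (6, 12), placed just before `return 0;` in the program above:

```cpp
    // Forward pass with the learned weights for a hypothetical new input (6, 12)
    MatrixXd x_new(1, 3);
    x_new << 1, 6, 12;  // leading 1 for the bias term
    VectorXd y_new = predict(x_new, W);
    std::cout << "Prediction at (6, 12): " << y_new(0) << std::endl;
```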