Linear Fitting
March 13, 2025 · root

Linear Equation

Linear equation (1):

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + \epsilon$$

Equation (1) written in matrix form gives (2):

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix}_{n \times 1} = \begin{bmatrix} 1 & x_{11} & x_{12} & x_{13} & \dots & x_{1k} \\ 1 & x_{21} & x_{22} & x_{23} & \dots & x_{2k} \\ 1 & x_{31} & x_{32} & x_{33} & \dots & x_{3k} \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & x_{n3} & \dots & x_{nk} \end{bmatrix}_{n \times (k+1)} \times \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}_{(k+1) \times 1} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \vdots \\ \epsilon_n \end{bmatrix}_{n \times 1}$$

That is:

$$y = X\beta + \epsilon$$

Linear Regression (LR)

Solution

Assumptions of linear regression:

  • The predictors x (x1, x2, x3, ..., xk) are not strictly linearly dependent on one another
  • The sample size n satisfies n >= 30 or n >= 3*(k+1)
  • The noise term ϵ is linearly uncorrelated with x and has zero mean

From linear equation (2):

$$\epsilon = y - X\beta$$

The goal of linear regression is to find the β that minimizes the mean squared error of ϵ. Ordinary least squares (OLS) gives the closed-form solution:

$$\beta = (X^T X)^{-1} X^T y$$
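
This closed form requires X^T X to be invertible, which is exactly what the first two assumptions above guarantee. Below is a minimal sketch of checking those assumptions with Eigen before fitting; the data values are illustrative and not from the original example.

cpp
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;

int main() {
    // Illustrative design matrix with n samples and k predictors
    MatrixXd X(5, 2);
    X << 1, 2,
         2, 1,
         3, 4,
         4, 3,
         5, 6;

    int n = X.rows();
    int k = X.cols();

    // Sample-size rule from the text: n >= 30 or n >= 3*(k+1)
    // (with only 5 toy samples this check reports "no")
    bool enoughSamples = (n >= 30) || (n >= 3 * (k + 1));

    // No strict linear dependence among predictors: X must have full column rank
    // (a second column that is an exact multiple of the first would fail this)
    ColPivHouseholderQR<MatrixXd> qr(X);
    bool fullRank = (qr.rank() == k);

    std::cout << "Enough samples: " << (enoughSamples ? "yes" : "no") << std::endl;
    std::cout << "Predictors independent: " << (fullRank ? "yes" : "no") << std::endl;
    return 0;
}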

Prediction

After solving for β, the prediction formula is:

$$\hat{y} = \hat{x}^T \beta$$

Code

cpp
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;
int main() {
    // Input data: the two predictors are chosen so that neither column is an
    // exact multiple of the other, otherwise X'X below would be singular
    MatrixXd X(5, 2);
    X << 1, 2,
         2, 1,
         3, 4,
         4, 3,
         5, 6;
    VectorXd y(5);
    y << 3, 6, 9, 12, 15;

    // Add a column of ones to X for the intercept term
    MatrixXd X_ones(X.rows(), X.cols() + 1);
    X_ones << MatrixXd::Ones(X.rows(), 1), X;

    // Calculate the least squares solution: beta = (X'X)^-1 X'y
    VectorXd beta = (X_ones.transpose() * X_ones).inverse() * X_ones.transpose() * y;

    // Print the coefficients (intercept and slopes)
    std::cout << "Intercept: " << beta(0) << std::endl;
    std::cout << "Slope 1: " << beta(1) << std::endl;
    std::cout << "Slope 2: " << beta(2) << std::endl;
    return 0;
}

Bayesian Linear Regression (BLR)

Solution

Assumptions of Bayesian linear regression:

  • The residual (noise) ϵ follows a normal distribution, and its variance σ² follows an inverse-Gamma distribution

Under this assumption, for a given X, the distribution of the observed y, i.e. the likelihood p(y|X, β, σ²), is:

$$p(y|X,\beta,\sigma^2) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{1}{2\sigma^2}(y - X\beta)^T (y - X\beta)\right)$$
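
As a small illustration, this likelihood can be evaluated directly (in log form, for numerical stability) for a candidate β and σ². The sketch below uses illustrative data and candidate parameter values that are not part of the original derivation.

cpp
#include <iostream>
#include <cmath>
#include <Eigen/Dense>
using namespace Eigen;

int main() {
    // Illustrative data and candidate parameters (not from the original text)
    MatrixXd X(3, 2);
    X << 1, 1,
         1, 2,
         1, 3;            // first column is the intercept term
    VectorXd y(3);
    y << 3, 6, 9;
    VectorXd beta(2);
    beta << 0.0, 3.0;     // candidate coefficients
    double sigma2 = 1.0;  // candidate noise variance

    const double PI = 3.14159265358979323846;
    int n = X.rows();
    VectorXd r = y - X * beta;  // residuals y - X*beta

    // Gaussian log-likelihood: -(n/2)*log(2*pi*sigma^2) - r'r / (2*sigma^2)
    double logLik = -0.5 * n * std::log(2.0 * PI * sigma2)
                    - r.squaredNorm() / (2.0 * sigma2);
    std::cout << "Log-likelihood: " << logLik << std::endl;
    return 0;
}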

Combining this with Bayes' theorem:

$$P(A|B) = \frac{P(A)\,P(B|A)}{P(B)}$$
  • P(A) is the prior probability of A; it does not take B into account
  • P(A|B) is the conditional probability of A given that B has occurred
  • P(B|A) is the conditional probability of B given that A has occurred
  • P(B) is the prior probability of B; it does not take A into account

we can derive the posterior distribution (3):

$$p(\beta,\sigma^2|X,y) \propto p(y|X,\beta,\sigma^2)\, p(\beta)\, p(\sigma^2)$$

Prediction

$$p(\hat{y}|\hat{x},X,y) = \int p(\hat{y}|\hat{x},\beta,\sigma^2)\, p(\beta,\sigma^2|X,y)\, d\beta\, d\sigma^2$$

Code

cpp
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;

int main() {
    // Input data
    MatrixXd X(5, 2);
    X << 1, 2,
         2, 4,
         3, 6,
         4, 8,
         5, 10;
    VectorXd y(5);
    y << 3, 6, 9, 12, 15;

    // Add a column of ones to X for the intercept term
    MatrixXd X_ones(X.rows(), X.cols() + 1);
    X_ones << MatrixXd::Ones(X.rows(), 1), X;

    // Prior / noise hyperparameters (treated as fixed, known values here)
    double alpha = 1.0; // Prior precision of the coefficients
    double beta = 1.0;  // Noise precision (1 / sigma^2)

    // Calculate posterior parameters
    int M = X_ones.cols();
    MatrixXd XtX = X_ones.transpose() * X_ones;
    MatrixXd Xty = X_ones.transpose() * y;
    MatrixXd Sigma_inv = alpha * MatrixXd::Identity(M, M) + beta * XtX;
    MatrixXd Sigma = Sigma_inv.inverse();
    VectorXd mu = beta * Sigma * Xty;

    // Calculate predictive distribution for new input data points
    MatrixXd X_star(1, 2); // New input point for prediction
    X_star << 6, 12;

    MatrixXd X_star_ones(X_star.rows(), X_star.cols() + 1);
    X_star_ones << MatrixXd::Ones(X_star.rows(), 1), X_star;

    double y_pred = (X_star_ones * mu).value();

    // Print the coefficients (intercept and slope)
    std::cout << "Intercept: " << mu(0) << std::endl;
    std::cout << "Slope 1: " << mu(1) << std::endl;
    std::cout << "Slope 2: " << mu(2) << std::endl;

    // Print the predicted output
    std::cout << "Predicted Output: " << y_pred << std::endl;

    return 0;
}

Gaussian Process Regression (GPR)

Solution

Assumption of Gaussian process regression:

  • The function values at any finite set of inputs x follow a joint Gaussian distribution

From the Gaussian model equation (4):

$$y = f(X) + \epsilon$$

and the prior distribution equation (5):

$$p(f(X)) \sim \mathcal{N}(m(X), K(X,X))$$
  • m(X) is the mean function
  • K(X,X) is the covariance matrix of X with itself

the posterior distribution is:

$$p(f(X)|X,y) \sim \mathcal{N}(\mu(X), \Sigma(X,X))$$

Prediction

$$p(\hat{y}|\hat{x},X,y) \sim \mathcal{N}(\mu(\hat{x}), \sigma^2(\hat{x}))$$
  • μ(x̂) is the predictive mean
  • σ²(x̂) is the predictive variance

Code

cpp
#include <iostream>
#include <Eigen/Dense>
#include <Eigen/Cholesky>

using namespace Eigen;

// Define the kernel function (Squared Exponential Kernel)
double kernel(const VectorXd& x1, const VectorXd& x2, double lengthScale, double variance) {
    double squaredDistance = (x1 - x2).squaredNorm();
    return variance * exp(-squaredDistance / (2.0 * lengthScale * lengthScale));
}

int main() {
    // Input data
    MatrixXd X(5, 1);
    X << 1,
         2,
         3,
         4,
         5;

    VectorXd y(5);
    y << 3, 6, 9, 12, 15;

    // New input point for prediction
    VectorXd X_star(1);
    X_star << 6;

    // Kernel hyperparameters
    double lengthScale = 1.0;
    double variance = 1.0;

    // Calculate the kernel matrix (covariance matrix)
    int n = X.rows();
    MatrixXd K(n, n);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            K(i, j) = kernel(X.row(i).transpose(), X.row(j).transpose(), lengthScale, variance);
        }
    }

    // Add small noise to the diagonal of the kernel matrix for numerical stability
    K.diagonal().array() += 1e-6;

    // Compute the kernel vector for the new input point
    VectorXd k_star(n);
    for (int i = 0; i < n; i++) {
        k_star(i) = kernel(X.row(i).transpose(), X_star, lengthScale, variance);
    }

    // Calculate the predictive mean and variance
    VectorXd mu = k_star.transpose() * K.inverse() * y;
    double sigma = kernel(X_star, X_star, lengthScale, variance) - (k_star.transpose() * K.inverse() * k_star).value();

    // Print the predictive mean and variance
    std::cout << "Predicted Mean: " << mu(0) << std::endl;
    std::cout << "Predicted Variance: " << sigma << std::endl;

    return 0;
}

Artificial Neural Network Linear Regression (ANNR)

Solution

An artificial neural network finds the weight parameters w that minimize the loss L by repeatedly propagating the prediction residuals backward through the network:

$$y_{pred} = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_k x_k$$

$$L = \frac{1}{n}\sum_{i=1}^{n}\left(y_{pred,i} - y_{true,i}\right)^2$$

Solving the problem with a neural network proceeds in the following steps:

  1. Choose the network structure: for linear regression a single-layer network is usually sufficient.
  2. Choose the activation function: for linear regression use a linear (identity) activation.
  3. Forward pass: compute y_pred.
  4. Compute the loss value L.
  5. Backward pass: update the weights w using gradient descent or another optimizer.
  6. With the updated weights w, repeat steps 3-6 until the loss L is minimized.

Prediction

$$y_{pred} = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_k x_k$$

Code

cpp
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;

// Linear Activation Function (Identity Function)
MatrixXd linearActivation(const MatrixXd& x) {
    return x;
}

// Neural Network Prediction: linear combination of the inputs followed by the
// linear (identity) activation
VectorXd predict(const MatrixXd& X, const MatrixXd& W) {
    return linearActivation(X * W);
}

int main() {
    // Input data (X) and true output (y)
    MatrixXd X(5, 2);
    X << 1, 2,
         2, 4,
         3, 6,
         4, 8,
         5, 10;

    VectorXd y(5);
    y << 3, 6, 9, 12, 15;

    // Add a column of ones to X for the bias term
    MatrixXd X_bias(X.rows(), X.cols() + 1);
    X_bias << MatrixXd::Ones(X.rows(), 1), X;

    // Randomly initialize the weights (including bias weight)
    MatrixXd W(X_bias.cols(), 1);
    W.setRandom();

    // Hyperparameters
    double learningRate = 0.01;
    int epochs = 1000;

    // Neural Network Training (Gradient Descent)
    for (int epoch = 0; epoch < epochs; ++epoch) {
        // Forward pass: Prediction
        VectorXd y_pred = predict(X_bias, W);

        // Calculate the loss (Mean Squared Error) and report it periodically
        VectorXd loss = y_pred - y;
        double meanSquaredError = loss.squaredNorm() / X.rows();
        if (epoch % 100 == 0) {
            std::cout << "Epoch " << epoch << ", MSE: " << meanSquaredError << std::endl;
        }

        // Backpropagation: gradient of the MSE with respect to W
        // (the constant factor of 2 is absorbed into the learning rate)
        MatrixXd gradient = X_bias.transpose() * loss / X.rows();
        W -= learningRate * gradient;
    }

    // Print the learned weights
    std::cout << "Learned Weights:" << std::endl;
    std::cout << W << std::endl;
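
    // Prediction at a new input using y_pred = w0 + w1*x1 + w2*x2
    // (the point (x1, x2) = (6, 12) is an added illustration that follows the
    //  pattern of the training data, not part of the original example)
    MatrixXd x_new(1, 3);
    x_new << 1, 6, 12;  // leading 1 matches the bias column
    std::cout << "Prediction at (6, 12): " << predict(x_new, W)(0) << std::endl;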

    return 0;
}
