前情提要
向函数传递Eigen变量
一般而言,往函数里面传矩阵是通过传值进行的,这种方式有下面两种缺点
- 需要生成临时矩阵
- 不能修改原矩阵
例子
Eigen中的四种基本base
- MatrixBase
- ArrayBase
- DenseBase:Matrix+Array
- EigenBase:全部
EigenBase例子
#include <iostream>
#include <Eigen/Core>
using namespace Eigen;
template <typename Derived>
void print_size(const EigenBase<Derived>& b)
{
std::cout << "size (rows, cols): " << b.size() << " (" << b.rows()
<< ", " << b.cols() << ")" << std::endl;
}
int main()
{
Vector3f v;
print_size(v);
// v.asDiagonal() returns a 3x3 diagonal matrix pseudo-expression
print_size(v.asDiagonal());
}
DenseBase例子
template <typename Derived>
void print_block(const DenseBase<Derived>& b, int x, int y, int r, int c)
{
std::cout << "block: " << b.block(x,y,r,c) << std::endl;
}
ArrayBase例子
template <typename Derived>
void print_max_coeff(const ArrayBase<Derived> &a)
{
std::cout << "max: " << a.maxCoeff() << std::endl;
}
MatrixBase例子
template <typename Derived>
void print_inv_cond(const MatrixBase<Derived>& a)
{
const typename JacobiSVD<typename Derived::PlainObject>::SingularValuesType&
sing_vals = a.jacobiSvd().singularValues();
std::cout << "inv cond: " << sing_vals(sing_vals.size()-1) / sing_vals(0) << std::endl;
}
多模板参数例子
template <typename DerivedA,typename DerivedB>
typename DerivedA::Scalar squaredist(const MatrixBase<DerivedA>& p1,const MatrixBase<DerivedB>& p2)
{
return (p1-p2).squaredNorm();
}
通用但不是模板函数
#include <iostream>
#include <Eigen/SVD>
using namespace Eigen;
using namespace std;
float inv_cond(const Ref<const MatrixXf>& a)
{
const VectorXf sing_vals = a.jacobiSvd().singularValues();
return sing_vals(sing_vals.size()-1) / sing_vals(0);
}
int main()
{
Matrix4f m = Matrix4f::Random();
cout << "matrix m:" << endl << m << endl << endl;
cout << "inv_cond(m): " << inv_cond(m) << endl;
cout << "inv_cond(m(1:3,1:3)): " << inv_cond(m.topLeftCorner(3,3)) << endl;
cout << "inv_cond(m+I): " << inv_cond(m+Matrix4f::Identity()) << endl;
}
void cov(const Ref<const MatrixXf> x, const Ref<const MatrixXf> y, Ref<MatrixXf> C)
{
const float num_observations = static_cast<float>(x.rows());
const RowVectorXf x_mean = x.colwise().sum() / num_observations;
const RowVectorXf y_mean = y.colwise().sum() / num_observations;
C = (x.rowwise() - x_mean).transpose() * (y.rowwise() - y_mean) / num_observations;
}
...
MatrixXf m1, m2, m3
cov(m1, m2, m3);
cov(m1.leftCols<3>(), m2.leftCols<3>(), m3.topLeftCorner<3,3>());
在什么情况下函数可以直接传引用呢
只传入矩阵或矩阵的表达式的时候
对于表达式会创建临时矩阵以存储表达式的值
MatrixXf cov(const MatrixXf& x, const MatrixXf& y)
{
const float num_observations = static_cast<float>(x.rows());
const RowVectorXf x_mean = x.colwise().sum() / num_observations;
const RowVectorXf y_mean = y.colwise().sum() / num_observations;
return (x.rowwise() - x_mean).transpose() * (y.rowwise() - y_mean) / num_observations;
}
什么情况下直接传引用会失效呢
传进去不是const的block()
// Note: This code is flawed!
void cov(const MatrixXf& x, const MatrixXf& y, MatrixXf& C)
{
const float num_observations = static_cast<float>(x.rows());
const RowVectorXf x_mean = x.colwise().sum() / num_observations;
const RowVectorXf y_mean = y.colwise().sum() / num_observations;
C = (x.rowwise() - x_mean).transpose() * (y.rowwise() - y_mean) / num_observations;
}
一种解决方法
template <typename Derived, typename OtherDerived>
void cov(const MatrixBase<Derived>& x, const MatrixBase<Derived>& y, MatrixBase<OtherDerived> const & C)
{
typedef typename Derived::Scalar Scalar;
typedef typename internal::plain_row_type<Derived>::type RowVectorType;
const Scalar num_observations = static_cast<Scalar>(x.rows());
const RowVectorType x_mean = x.colwise().sum() / num_observations;
const RowVectorType y_mean = y.colwise().sum() / num_observations;
const_cast< MatrixBase<OtherDerived>& >(C) =
(x.rowwise() - x_mean).transpose() * (y.rowwise() - y_mean) / num_observations;
}
const cast只能用于模板函数,因为Block依然无法转换成Matrix
如何通用地改变矩阵的大小
MatrixBase并不能被改变大小,因为Block之类的不能改变大小,所以下面的代码不能使
MatrixXf x = MatrixXf::Random(100,3);
MatrixXf y = MatrixXf::Random(100,3);
MatrixXf C;
cov(x, y, C);
解决方法:改变derived之后的变量的大小
template <typename Derived, typename OtherDerived>
void cov(const MatrixBase<Derived>& x, const MatrixBase<Derived>& y, MatrixBase<OtherDerived> const & C_)
{
typedef typename Derived::Scalar Scalar;
typedef typename internal::plain_row_type<Derived>::type RowVectorType;
const Scalar num_observations = static_cast<Scalar>(x.rows());
const RowVectorType x_mean = x.colwise().sum() / num_observations;
const RowVectorType y_mean = y.colwise().sum() / num_observations;
MatrixBase<OtherDerived>& C = const_cast< MatrixBase<OtherDerived>& >(C_);
C.derived().resize(x.cols(),x.cols()); // resize the derived object
C = (x.rowwise() - x_mean).transpose() * (y.rowwise() - y_mean) / num_observations;
}
现在函数可以接受表达式、(任意大小的)矩阵
总结
- 只读的参数无论怎么写都没太大关系,但可能会引入不必要的临时变量
- 读写的参数一定要用const引用,并在函数内const cast掉
- 需要改变矩阵或数组大小的时候要改变derived出来的变量的大小
宏
参见:https://eigen.tuxfamily.org/dox/TopicPreprocessorDirectives.html
Assertions
控制编译或运行时的输出信息:https://eigen.tuxfamily.org/dox/TopicAssertions.html
多线程
让Eigen多线程运行
下列算法可通过多线程提高性能
- general dense matrix – matrix products
- PartialPivLU
- row-major-sparse * dense vector/matrix products
- ConjugateGradient with Lower|Upper as the UpLo template parameter.
- BiCGSTAB with a row-major sparse matrix format.
- LeastSquaresConjugateGradient
启用方法
- GCC:
-fopenmp
- ICC:
-openmp
- Visual Studio:右键项目-属性-C/C++-语言-OpenMP支持-是
核数设置(n应小于物理核数)
- 环境变量添加OMP_NUM_THREADS=n
- 代码中添加omp_set_num_threads(n);
- 代码中添加Eigen::setNbThreads(n);
/*
* 多线程测试
* CPU物理核数:6
* 1核:25.6638s
* 2核:14.5245s
* 3核:11.9047s
* 4核:10.8315s
* 5核:9.84828s
* 6核:10.7498s
*/
#include <iostream>
#include "Eigen/Dense"
#include <chrono>
#include <omp.h>
using namespace std;
using namespace Eigen;
const int dim = 100;
int main()
{
std::chrono::time_point<std::chrono::system_clock> start, end;
int n;
n = Eigen::nbThreads();
cout << n << "\n";
Matrix<double, Dynamic, Dynamic> m1(dim, dim);
Matrix<double, Dynamic, Dynamic> m2(dim, dim);
Matrix<double, Dynamic, Dynamic> m_res(dim, dim);
start = std::chrono::system_clock::now();
m1.setRandom(dim, dim);
m2.setRandom(dim, dim);
for (int i = 0; i < 1000; ++i) {
m_res.noalias() = m1 * m2;
}
end = std::chrono::system_clock::now();
std::chrono::duration<double> elapsed_seconds = end - start;
std::cout << "elapsed time: " << elapsed_seconds.count() << "s\n";
return 0;
}
在多线程程序中使用Eigen
在代码前面添加一段
#include <Eigen/Core>
int main(int argc, char** argv)
{
Eigen::initParallel();
...
}
如果使用Eigen 3.3和完全符合C++11的编译器,则这个步骤是可选的
如果程序也是使用OpenMP的话,可能需要关掉Eigen本身的多线程
其他多线程方案
- BLAS/LAPACK:https://eigen.tuxfamily.org/dox/TopicUsingBlasLapack.html
- Intel® MKL:https://eigen.tuxfamily.org/dox/TopicUsingIntelMKL.html
- CUDA:https://eigen.tuxfamily.org/dox/TopicCUDA.html
常见问题
Aliasing、Alignment和传值相关
前面已经讲过
C++11和auto声明
不要用auto,除非你知道不会出问题!
头文件相关
有的时候,类对应的头文件不止一个
比如矢量叉积的函数cross()除了Dense之外还需要Geometry
三元运算符(COND?THEN:ELSE)
三元操作符里面不要用Eigen的表达式
含布尔元素的矩阵
目前含布尔元素的矩阵的行为不统一,需要小心使用
template<int Size>
void foo() {
Eigen::Matrix<bool, Size, Size> A, B, C;
A.setOnes();
B.setOnes();
C = A * B - A * B;
std::cout << C << "\n";
}
这个函数输入foo<3>()的话返回0矩阵,输入foo<10>()的话返回单位阵
C++中的template和typename关键字
使用不恰当会出现奇怪问题,如
expected expression
no match for operator<
用法1:定义模板
template <typename T> // 或template <class T>
bool isPositive(T x)
{
return x > 0;
}
用法2:指定表达式引用模板函数或类型
#include <Eigen/Dense>
#include <iostream>
using namespace Eigen;
void copyUpperTriangularPart(MatrixXf& dst, const MatrixXf& src)
{
dst.triangularView<Upper>() = src.triangularView<Upper>();
}
int main()
{
MatrixXf m1 = MatrixXf::Ones(4,4);
MatrixXf m2 = MatrixXf::Random(4,4);
std::cout << "m2 before copy:" << std::endl;
std::cout << m2 << std::endl << std::endl;
copyUpperTriangularPart(m2, m1);
std::cout << "m2 after copy:" << std::endl;
std::cout << m2 << std::endl << std::endl;
}
但是,如前面函数那部分讲的,这样写不是很好,改进的写法如下
#include <Eigen/Dense>
#include <iostream>
using namespace Eigen;
template <typename Derived1, typename Derived2>
void copyUpperTriangularPart(MatrixBase<Derived1>& dst, const MatrixBase<Derived2>& src)
{
/* Note the 'template' keywords in the following line! */
dst.template triangularView<Upper>() = src.template triangularView<Upper>();
}
int main()
{
MatrixXi m1 = MatrixXi::Ones(5,5);
MatrixXi m2 = MatrixXi::Random(4,4);
std::cout << "m2 before copy:" << std::endl;
std::cout << m2 << std::endl << std::endl;
copyUpperTriangularPart(m2, m1.topLeftCorner(4,4));
std::cout << "m2 after copy:" << std::endl;
std::cout << m2 << std::endl << std::endl;
}
据说掉了template会出问题,但是实测好像没啥毛病,感觉跟C++标准和编译器有关,教程的解释如下(假装懂了)
在上一个示例中必须使用template关键字的原因与模板应该如何在C++中编译的规则有关。编译器必须在定义模板时检查代码的语法是否正确,而不知道模板参数的实际值(示例中的Derived1和Derived2)。这意味着编译器无法知道dst.triangularView是成员模板,并且以下<符号是模板参数分隔符的一部分。另一种可能性是dst.triangularView是一个成员变量,其中<符号引用operator<()函数。事实上,编译器应该根据标准选择第二种可能性。如果dst.triangularView是成员模板(在我们的例子中),程序员应该使用template关键字显式指定它并编写dst.template triangularView。
- 依赖名称指的是依赖于模板参数的名称,这里的dst和src就是依赖名称,依赖于Derived1、Derived2
- 如果代码中包含xxx.yyy或xxx->yyy,xxx是依赖名称,yyy是成员模板,那么yyy前面应该加template,于是有xxx.template yyy或xxx->template yyy
- 如果代码中含有xxx::yyy,xxx是依赖名称,yyy是成员typedef,那么xxx前面应该加上typename,即typename xxx::yyy
所以,下面这个例子
SparseMatrixType mat(rows,cols);
for (int k=0; k<mat.outerSize(); ++k)
for (SparseMatrixType::InnerIterator it(mat,k); it; ++it)
{
/* ... */
}
应该写成
template <typename T>
void iterateOverSparseMatrix(const SparseMatrix<T>& mat;
{
for (int k=0; k<m1.outerSize(); ++k)
for (typename SparseMatrix<T>::InnerIterator it(mat,k); it; ++it)
{
/* ... */
}
}
延伸阅读
- David Abrahams和Aleksey Gurtovoy的《C++ Template Metaprogramming》
- http://pages.cs.wisc.edu/~driscoll/typename.html
- http://www.parashift.com/c++-faq-lite/templates.html#faq-35.18
- http://www.comeaucomputing.com/techtalk/templates/#templateprefix
- http://www.comeaucomputing.com/techtalk/templates/#typename