博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Multidimensional Scaling (MDS)
阅读量:4040 次
发布时间:2019-05-24

本文共 3095 字,大约阅读时间需要 10 分钟。

MDS aims to embed data in a lower dimensional space in such a way that pair-wise distances between data points are preserved.

Say we have N points xiRn for i[1,N] , let X=[x1,x2,,xN] , we don’t know the postion of xi . We are only supplied with the pair-wise Euclidean distances among these points. Now the objection is to find out N points yiRk,k<n , let Y=[y1,y2,,yN] , such that the distance in pairs of X is the same as these of Y.

Given the distance matrix DX , each element of DX can be written as:

(DXij)2=(xixj)T(xixj)=xi22xTixj+xj2
we can easily see that
DX=Z2XTX+ZT

Here, Z=zeT and z=[x12x22xN2]T . Therefore Z takes the form

Z=x12x12x12x12x12x12x12x12x12

Now, let’s translate the mean of the set of hypothetical point set X to the origin. Note that this operation does not change the Euclidean distance between any pairs of points.

For better understanding, we introduce

1NAeeT
and 1NeeTA . Here, A is a N-by-N matrix which taks the form:

A=A11A21AN1A12A22AN2A1NA2NANN

Hence,

1NAeeT=1NA11A21AN1A12A22AN2A1NA2NANN111111111=1Nj=1NA1j1Nj=1NA2j1Nj=1NANj1Nj=1NA1j1Nj=1NA2j1Nj=1NANj1Nj=1NA1j1Nj=1NA2j1Nj=1NANj=mean of first row of Amean of second row of Amean of Nth row of Amean of first row of Amean of second row of Amean of Nth row of Amean of first row of Amean of second row of Amean of Nth row of A
similiarly,
1NeeTA=1N111111111A11A21AN1A12A22AN2A1NA2NANN=1Ni=1NAi11Ni=1NAi11Ni=1NAi11Ni=1NAi21Ni=1NAi21Ni=1NAi21Ni=1NAiN1Ni=1NAiN1Ni=1NAiN=mean of first column of Amean of first column of Amean of first column of Amean of second column of Amean of second column of Amean of second column of Amean of Nth column of Amean of Nth column of Amean of Nth column of A

The centering matrix is defined as:

H=IN1NeeT
Let’s now apply double centering to
DX
to get
AX=HDXH=(IN1NeeT)(Z2XTX+ZT)(IN1NeeT)=(IN1NeeT)Z(IN1NeeT)2(IN1NeeT)XTX(IN1NeeT)+(IN1NeeT)ZT(IN1NeeT)=2(IN1NeeT)XTX(IN1NeeT)=2(X(IN1NeeT))TX(IN1NeeT)=2X~TX~
where
X~=X(IN1NeeT)

BX=12AX=12HDXH=X~TX~

Remember, the task was to find a concrete set of N points Y in k dimensions so that the pairwise Euclidean distances betwwen all the pairs in the concrete set

Y
is a close approximation to the pair-wise distances given to us in the matrix DX i.e. we want to find DY such that

DY=argminDXDY2F
Note that after applying the “double centering” operation to both
X
and
Y
, equation above yields
BY=argminBXBY2F=DY=X~TX~Y~TY~2F

The above equation is a well known optimization problem that can be solved via Singular Value Decomposition(SVD) of BX .

BXUDUT=(UD12)(D12UT)=YTY~

Here, U is N by k matrix and

D
is k by k diagonal matrix with k largest singular values on the diagonal and Y~=D12UT is k by N matrix. Finally, we get N embedding points in k dimension as the column vectors of Y~

Reference:

转载地址:http://mnxdi.baihongyu.com/

你可能感兴趣的文章
iOS 序列化与反序列化(runtime) 01
查看>>
iOS AFN 3.0版本前后区别 01
查看>>
iOS ASI和AFN有什么区别
查看>>
iOS QQ侧滑菜单(高仿)
查看>>
iOS 扫一扫功能开发
查看>>
iOS app之间的跳转以及传参数
查看>>
iOS __block和__weak的区别
查看>>
Android(三)数据存储之XML解析技术
查看>>
Spring JTA应用之JOTM配置
查看>>
spring JdbcTemplate 的若干问题
查看>>
Servlet和JSP的线程安全问题
查看>>
GBK编码下jQuery Ajax中文乱码终极暴力解决方案
查看>>
Oracle 物化视图
查看>>
PHP那点小事--三元运算符
查看>>
解决国内NPM安装依赖速度慢问题
查看>>
Brackets安装及常用插件安装
查看>>
Centos 7(Linux)环境下安装PHP(编译添加)相应动态扩展模块so(以openssl.so为例)
查看>>
fastcgi_param 详解
查看>>
Nginx配置文件(nginx.conf)配置详解
查看>>
标记一下
查看>>