
anqiang's Column

Don't ask how the details were done; the source code explains everything.

 
 
 


 
 

Some Questions about LWL (Locally Weighted Learning)

2009-10-28 16:00:15 | Category: Weka Study Series


On 28/10/09 1:51 AM, Richard Cubek wrote:
> Mark Hall wrote:
>> On 27/10/09 8:35 AM, Richard Cubek wrote:
>>> Hello,
>>>
>>> 2.5 hours ago, I sent a mail with questions about LWL, but I have found the
>>> error now. So, the admin does not need to forward my previous mail (it
>>> was > 40 KB and therefore not forwarded automatically).
>>>
>>> Now, I have another question. I want to learn a model, or more precisely,
>>> a state prediction for a particular real physical process with
>>> locally weighted regression, so I started playing with LWL in
>>> Weka, generating a simple data set (20 instances) from the simple
>>> quadratic function y = x^2 with a little bit of noise. This works fine:
>>> when generating a test set (50 instances) to test the prediction for each
>>> instance, I get a nice "approximation" over the data set. Plot 1 (red
>>> dots: known data, green dots: predictions) shows the result using all
>>> nearest neighbours (the whole data set, lwl.setKNN(0)) and a linear
>>> weighting kernel (lwl.setWeightingKernel(LINEAR)). The result makes
>>> sense. When using only the nearest 5 points for the local regression
>>> (lwl.setKNN(5)), I get plot 2, with more realistic predictions.
>>>
>>> The problem is that, when using a Gaussian weighting kernel
>>> (lwl.setWeightingKernel(GAUSS)), the result is independent of the
>>> number of nearest neighbours I use; it is always the same and looks
>>> very similar to plot 1. So it seems that, when using a Gaussian
>>> weighting kernel, it is always as if I had set lwl.setKNN(0). I know
>>> that is the default value, but of course I tried to override it with
>>> lwl.setKNN(2/5/whatever), but there is no effect. A bug? I don't think
>>> so... any help?
> Well, first of all, before starting to "criticize" - I have only tried it for
> one day now and I'm impressed - I like Weka and I'm sure we will use it :-)
>
> I think I didn't understand setKNN:
>
>  From the source: "Sets the number of neighbours used for kernel bandwidth
> setting. The bandwidth is taken as the distance to the kth neighbour."
>
> Does 1 neighbour mean 1 instance here? If I call setKNN(3), what is the kth
> neighbour? The 3rd? In the end, can I understand the method as "setting
> the number of nearest instances taken into account for the local
> regression"?

Yes, neighbour means instance. If k = 3, then the three nearest neighbours
(according to the distance function) are returned. Actually, more than three
might get returned as ties in distance are counted as one instance.

LWL passes a weighted version of the training data to the base learning
algorithm. Each training instance is assigned a weight according to the selected
weighting function (which in turn uses the distance of the training instance to
the current test instance - hence the "local" part of locally weighted
learning). Typically, instances further from the test instance receive a lower
weight. In the case of functions with bounded support, k determines the support,
which, in turn, has the effect that some training instances receive zero weight
(and can effectively be ignored). The linear kernel is bounded and gives a
weight of 1.0001 - distance[i], for training instance i. Since distances are
normalized to 0 - 1 range, you can see that this function decreases to zero.
Setting a support value based on k, and then scaling the distances by this
value, effectively means that the kth closest instance will receive the lowest
weight and all instances further away can be ignored (i.e. weights <= 0). The
Gaussian kernel is not bounded (exp(-1 * 1) = 0.3679). So, scaling by the kth
nearest distance is not going to have the effect that some instances receive
zero weight. All instances will always be used. However, at the moment no
scaling is done for the Gaussian, and I should change it to allow the k
parameter to scale the Gaussian.

> Finally,
>
> /** The available kernel weighting methods. */
> protected static final int LINEAR = 0;
> protected static final int EPANECHNIKOV = 1;
> protected static final int TRICUBE = 2;
> protected static final int INVERSE = 3;
> protected static final int GAUSS = 4;
> protected static final int CONSTANT = 5;
>
> these constants in LWL should naturally all be public!

Absolutely! Good call.

Cheers,
Mark.
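For anyone who wants to reproduce the experiment from the thread, here is a minimal sketch of my own (not code from the post), assuming the Weka 3.x API of weka.classifiers.lazy.LWL, weka.classifiers.functions.LinearRegression and weka.core.DenseInstance; the class name LwlParabolaDemo, the random seed and the noise level are illustrative only:

import java.util.ArrayList;
import java.util.Random;

import weka.classifiers.functions.LinearRegression;
import weka.classifiers.lazy.LWL;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;

public class LwlParabolaDemo {

    public static void main(String[] args) throws Exception {
        // Build a tiny numeric data set: 20 noisy samples of y = x^2.
        ArrayList<Attribute> atts = new ArrayList<>();
        atts.add(new Attribute("x"));
        atts.add(new Attribute("y"));
        Instances train = new Instances("parabola", atts, 20);
        train.setClassIndex(1); // predict y from x

        Random rnd = new Random(42);
        for (int i = 0; i < 20; i++) {
            double x = -1.0 + 2.0 * i / 19.0;
            double y = x * x + 0.05 * rnd.nextGaussian(); // a little noise
            train.add(new DenseInstance(1.0, new double[] {x, y}));
        }

        // Locally weighted regression: LWL with a linear base learner.
        LWL lwl = new LWL();
        lwl.setClassifier(new LinearRegression());
        lwl.setKNN(5);             // bandwidth = distance to the 5th neighbour
        lwl.setWeightingKernel(0); // 0 = LINEAR (see the constants quoted above);
                                   // with 4 = GAUSS, the k value has no visible effect
        lwl.buildClassifier(train);

        // Predict over a grid of 50 test points, as in the plots from the mail.
        for (int i = 0; i < 50; i++) {
            double x = -1.0 + 2.0 * i / 49.0;
            DenseInstance test = new DenseInstance(2);
            test.setDataset(train);
            test.setValue(0, x);
            System.out.printf("x=%.3f  predicted y=%.3f%n", x, lwl.classifyInstance(test));
        }
    }
}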

 

In the locally weighted learning algorithm, when the Gaussian kernel is selected, why is the classification result the same no matter what value of K is chosen?

Mark's answer: the Gaussian kernel never shrinks a weight to zero. When the (scaled) distance lies between 0 and 1, the weight stays roughly within (0.37, 1], since even at a scaled distance of 1 it is still exp(-1) ≈ 0.368. The KNN step therefore cannot drop any samples (those beyond the kth neighbour), so once the Gaussian kernel is chosen, every prediction performs weighted learning over the entire training set. (This should now be fairly clear; if not, have a look at the source code.)
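To see this numerically, here is a small standalone sketch of my own, mirroring the weighting formulas quoted in the thread (1.0001 - d for the linear kernel, exp(-d*d) for the Gaussian); the class and method names, the bandwidth and the distance values are made up for illustration:

public class KernelWeightDemo {

    // Linear kernel: bounded support, drops to ~0 at a scaled distance of 1.
    static double linearWeight(double scaledDist) {
        return Math.max(0.0, 1.0001 - scaledDist);
    }

    // Gaussian kernel: unbounded support, still exp(-1) ~= 0.3679 at distance 1.
    static double gaussWeight(double scaledDist) {
        return Math.exp(-scaledDist * scaledDist);
    }

    public static void main(String[] args) {
        double bandwidth = 0.4;                          // hypothetical distance to the kth neighbour
        double[] distances = {0.1, 0.3, 0.4, 0.8, 1.0};  // raw distances to the test point

        for (double d : distances) {
            double scaled = d / bandwidth;
            System.out.printf("d=%.2f  linear=%.4f  gauss=%.4f%n",
                    d, linearWeight(scaled), gaussWeight(scaled));
        }
        // Beyond the bandwidth the linear weight is (almost) zero, so those
        // instances are effectively ignored; the Gaussian weight never reaches
        // zero, so every training instance keeps some influence.
    }
}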

 

While we are at it, the basic procedure of locally weighted learning (LWL) is as follows (a small code sketch follows the list):

1. Use the KNN algorithm to find the K nearest neighbours of the test sample;

2. Compute the weights of these K nearest neighbours with the chosen kernel;

3. Train the given base learning algorithm on the K weighted neighbours;

4. Classify the test sample with the locally learned model.
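Here is a compact sketch of these four steps for a one-dimensional regression problem. It is an illustration only, not Weka code: it uses a linear kernel and a weighted average as the "local model" to keep the example short, and all names in it are made up.

import java.util.Arrays;
import java.util.Comparator;

public class LocalWeightedPredict {

    // Predict the target for query x, using the k nearest training points.
    static double predict(double[] xs, double[] ys, double x, int k) {
        Integer[] idx = new Integer[xs.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;

        // Step 1: find the k nearest neighbours of the query point.
        Arrays.sort(idx, Comparator.comparingDouble(i -> Math.abs(xs[i] - x)));
        double bandwidth = Math.abs(xs[idx[k - 1]] - x); // distance to the kth neighbour
        if (bandwidth == 0.0) bandwidth = 1e-12;         // guard against a zero bandwidth

        // Steps 2-3: weight the neighbours with a kernel and fit a local model
        // (here simply a weighted average instead of a weighted regression).
        double num = 0.0, den = 0.0;
        for (int j = 0; j < k; j++) {
            double d = Math.abs(xs[idx[j]] - x) / bandwidth; // scaled distance
            double w = Math.max(0.0, 1.0001 - d);            // linear kernel
            num += w * ys[idx[j]];
            den += w;
        }

        // Step 4: use the local model to predict for the query point.
        return num / den;
    }

    public static void main(String[] args) {
        double[] xs = {-1.0, -0.5, 0.0, 0.5, 1.0};
        double[] ys = { 1.0, 0.25, 0.0, 0.25, 1.0}; // y = x^2, no noise
        System.out.println(predict(xs, ys, 0.3, 3)); // local estimate near x = 0.3
    }
}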

LWL is a lazy learning method. The main idea is to improve classification performance by building a locally optimized classification model for each individual test sample.

PS: An ordinary classifier learns a single, globally optimized model on the training set and uses it to classify every test sample; understandably, for a given test sample, a locally optimized model can fit the problem at hand better.

PS2: The kernels here are different from the kernels used in SVMs; please do not confuse the two.