netoops fails at startup with "XX is a slave device, aborting"

Source: 岁月联盟  Editor: exp  Date: 2012-06-27

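To monitor newly deployed kernels, we backported Google's netoops to our own kernel. When a kernel panic happens in production, the panic stack trace is sent to a log server, which makes debugging and fixing easier.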


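The day before yesterday, Hongchuan reported a problem. Our production netoops setups had always used a slave interface of a bond as the dev for sending messages, but after the new 2.6.32-220 kernel went live, netoops failed to start and the system reported: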


"eth0 is a slave device, aborting."


I went through Red Hat's changes between 2.6.32-131 and 2.6.32-220 and found this patch from WANG Cong:

 


commit 0c1ad04aecb975f2a2014e1bc5a2fa23923ecbd9
Author: WANG Cong
Date:   Thu Jun 9 00:28:13 2011 -0700


    netpoll: prevent netpoll setup on slave devices
   
    In commit 8d8fc29d02a33e4bd5f4fa47823c1fd386346093
    (netpoll: disable netpoll when enslave a device), we automatically
    disable netpoll when the underlying device is being enslaved,
    we also need to prevent people from setuping netpoll on
    devices that are already enslaved.
   
    Signed-off-by: WANG Cong
    Signed-off-by: David S. Miller


diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 2d7d6d4..42ea4b0 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -792,6 +792,12 @@ int netpoll_setup(struct netpoll *np)
                return -ENODEV;
        }
 
+       if (ndev->master) {
+               printk(KERN_ERR "%s: %s is a slave device, aborting.\n",
+                      np->name, np->dev_name);
+               return -EBUSY;
+       }
+
        if (!netif_running(ndev)) {
                unsigned long atmost, atleast;
 


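From this commit on, netpoll can no longer be set up on a slave device, and netoops is built on netpoll. I was puzzled that something which used to work was now forbidden, so I emailed WANG Cong to ask why slave devices can no longer be used. His answer was: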


"Because a slave device has no IP address."


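WANG Cong also ran into the same problem with netconsole at Red Hat, and the only option there was to switch to the master interface.
Our netoops has to follow the same rule, so we now use bond0 as the dev everywhere.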
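
Since the kernel now rejects enslaved devices, it can help to check the configured interface from userspace before netoops or netconsole is started. The following is only a minimal sketch, not part of netoops: the helper names `flags_are_slave` and `iface_is_slave` are made up for illustration. It assumes a Linux system with glibc and uses the standard `SIOCGIFFLAGS` ioctl to test the `IFF_SLAVE` flag, which is a userspace approximation of the `ndev->master` test in the patch above.

```c
#define _DEFAULT_SOURCE /* expose struct ifreq and IFF_* in <net/if.h> */
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the flag word marks an enslaved
 * interface (IFF_SLAVE set), 0 otherwise. */
static int flags_are_slave(unsigned int flags)
{
    return (flags & IFF_SLAVE) != 0;
}

/* Hypothetical helper: returns 1 if the named interface is a slave,
 * 0 if it is not, and -1 if the flags could not be read (for example,
 * the interface does not exist). */
static int iface_is_slave(const char *name)
{
    struct ifreq ifr;
    int fd, ret;

    /* Any socket works as a handle for interface ioctls. */
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1); /* memset keeps it NUL-terminated */

    ret = ioctl(fd, SIOCGIFFLAGS, &ifr);
    close(fd);
    if (ret < 0)
        return -1;

    /* ifr_flags is a short; cast to avoid sign extension. */
    return flags_are_slave((unsigned short)ifr.ifr_flags);
}
```

A startup script or wrapper could call `iface_is_slave("eth0")` and, when it returns 1, refuse to use that interface and fall back to the bond master (bond0 in our case) instead of letting the kernel abort the setup.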