The first box: 2 x 6-core Xeon with a dual-port Intel (igb) NIC.
Role: RADIUS, mpd5, netgraph (shaper).
Under load:
netstat -w1 -h -Iigb0
            input          (igb0)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       15k     0      0        14M        14k     0       6.8M     0
       14k     0      0        12M        13k     0       6.4M     0
       16k     0      0        15M        15k     0       6.6M     0
       13k     0      0        11M        14k     0       6.9M     0
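For reference, the per-queue interrupt load behind these numbers can be seen with standard tools (pid 12 is the intr kernel process in the top output below; the IRQ numbers are the ones that show up there):
 vmstat -i | grep igb0        # totals and rate for each igb0:que interrupt
 procstat -t 12 | grep igb0   # which CPU each igb0 interrupt thread last ran on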
top -SPHI
 last pid: 71625; load averages: 6.33, 7.94, 8.85 up 0+11:43:08 13:45:39
 289 processes: 29 running, 198 sleeping, 2 zombie, 60 waiting
 CPU 0: 0.0% user, 0.0% nice, 2.3% system, 22.1% interrupt, 75.6% idle
 CPU 1: 0.8% user, 0.0% nice, 2.3% system, 29.8% interrupt, 67.2% idle
 CPU 2: 0.0% user, 0.0% nice, 4.6% system, 27.5% interrupt, 67.9% idle
 CPU 3: 0.0% user, 0.0% nice, 1.5% system, 24.4% interrupt, 74.0% idle
 CPU 4: 0.0% user, 0.0% nice, 2.3% system, 13.0% interrupt, 84.7% idle
 CPU 5: 0.0% user, 0.0% nice, 0.0% system, 22.1% interrupt, 77.9% idle
 CPU 6: 0.0% user, 0.0% nice, 2.3% system, 23.7% interrupt, 74.0% idle
 CPU 7: 0.0% user, 0.0% nice, 0.0% system, 23.7% interrupt, 76.3% idle
 CPU 8: 0.0% user, 0.0% nice, 4.6% system, 6.1% interrupt, 89.3% idle
 CPU 9: 0.0% user, 0.0% nice, 9.2% system, 0.0% interrupt, 90.8% idle
 CPU 10: 0.0% user, 0.0% nice, 10.7% system, 0.0% interrupt, 89.3% idle
 CPU 11: 0.0% user, 0.0% nice, 17.6% system, 0.0% interrupt, 82.4% idle
 CPU 12: 0.0% user, 0.0% nice, 13.0% system, 0.0% interrupt, 87.0% idle
 CPU 13: 0.0% user, 0.0% nice, 20.6% system, 0.0% interrupt, 79.4% idle
 CPU 14: 0.0% user, 0.0% nice, 16.8% system, 0.0% interrupt, 83.2% idle
 CPU 15: 0.8% user, 0.0% nice, 22.1% system, 0.0% interrupt, 77.1% idle
 CPU 16: 0.0% user, 0.0% nice, 19.8% system, 0.0% interrupt, 80.2% idle
 CPU 17: 0.0% user, 0.0% nice, 18.3% system, 0.0% interrupt, 81.7% idle
 CPU 18: 0.0% user, 0.0% nice, 18.3% system, 0.0% interrupt, 81.7% idle
 CPU 19: 0.0% user, 0.0% nice, 18.3% system, 0.0% interrupt, 81.7% idle
 CPU 20: 3.1% user, 0.0% nice, 16.8% system, 0.0% interrupt, 80.2% idle
 CPU 21: 0.8% user, 0.0% nice, 23.7% system, 0.0% interrupt, 75.6% idle
 CPU 22: 0.0% user, 0.0% nice, 21.4% system, 0.0% interrupt, 78.6% idle
 CPU 23: 0.0% user, 0.0% nice, 20.6% system, 0.0% interrupt, 79.4% idle
 Mem: 130M Active, 705M Inact, 955M Wired, 417M Buf, 2130M Free
 Swap: 4096M Total, 4096M Free
 PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
 11 root 155 ki31 0K 384K CPU10 10 665:48 93.36% idle{idle: cpu10}
 11 root 155 ki31 0K 384K CPU9 9 672:06 89.89% idle{idle: cpu9}
 11 root 155 ki31 0K 384K RUN 8 661:47 88.57% idle{idle: cpu8}
 11 root 155 ki31 0K 384K CPU11 11 664:58 88.28% idle{idle: cpu11}
 11 root 155 ki31 0K 384K RUN 15 624:01 81.98% idle{idle: cpu15}
 11 root 155 ki31 0K 384K CPU19 19 622:58 80.57% idle{idle: cpu19}
 11 root 155 ki31 0K 384K CPU22 22 623:50 79.59% idle{idle: cpu22}
 11 root 155 ki31 0K 384K CPU0 0 588:43 79.39% idle{idle: cpu0}
 11 root 155 ki31 0K 384K CPU16 16 624:53 79.30% idle{idle: cpu16}
 11 root 155 ki31 0K 384K CPU12 12 626:01 78.47% idle{idle: cpu12}
 11 root 155 ki31 0K 384K CPU13 13 624:44 77.49% idle{idle: cpu13}
 11 root 155 ki31 0K 384K CPU21 21 622:56 77.10% idle{idle: cpu21}
 11 root 155 ki31 0K 384K CPU4 4 593:10 77.10% idle{idle: cpu4}
 11 root 155 ki31 0K 384K CPU14 14 625:34 76.76% idle{idle: cpu14}
 11 root 155 ki31 0K 384K CPU18 18 624:14 76.46% idle{idle: cpu18}
 11 root 155 ki31 0K 384K CPU23 23 622:01 75.39% idle{idle: cpu23}
 11 root 155 ki31 0K 384K RUN 20 624:02 74.85% idle{idle: cpu20}
 11 root 155 ki31 0K 384K RUN 17 623:24 74.56% idle{idle: cpu17}
 11 root 155 ki31 0K 384K CPU5 5 596:37 74.17% idle{idle: cpu5}
 11 root 155 ki31 0K 384K CPU2 2 599:30 69.48% idle{idle: cpu2}
 11 root 155 ki31 0K 384K CPU3 3 572:30 66.99% idle{idle: cpu3}
 11 root 155 ki31 0K 384K CPU7 7 581:33 66.80% idle{idle: cpu7}
 11 root 155 ki31 0K 384K CPU1 1 594:19 62.79% idle{idle: cpu1}
 11 root 155 ki31 0K 384K CPU6 6 599:50 62.60% idle{idle: cpu6}
 12 root -92 - 0K 960K WAIT 1 85:58 37.79% intr{irq257: igb0:que}
 12 root -92 - 0K 960K WAIT 6 85:29 37.35% intr{irq262: igb0:que}
 12 root -92 - 0K 960K WAIT 2 86:38 29.20% intr{irq258: igb0:que}
 12 root -92 - 0K 960K WAIT 7 101:38 29.05% intr{irq263: igb0:que}
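Judging by this output, essentially all of the igb0 queue interrupts land on CPUs 0-7, while CPUs 8-23 only see netgraph/system load. If it comes to experiments, IRQ-to-CPU binding can be changed on the fly with cpuset(1); a minimal sketch using the IRQ numbers visible above (untested here, just for illustration):
 cpuset -l 8  -x 257   # move igb0:que irq257 to CPU 8
 cpuset -l 9  -x 258   # irq258 -> CPU 9
 cpuset -l 10 -x 262   # irq262 -> CPU 10
 cpuset -l 11 -x 263   # irq263 -> CPU 11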
Meanwhile there is a second NAS, an i3 box running the em driver:
top -SPHI
 last pid: 98036; load averages: 1.39, 1.49, 1.54 up 0+13:05:00 13:51:13
 111 processes: 6 running, 87 sleeping, 18 waiting
 CPU 0: 0.0% user, 0.0% nice, 54.8% system, 0.0% interrupt, 45.2% idle
 CPU 1: 0.0% user, 0.0% nice, 16.1% system, 6.5% interrupt, 77.4% idle
 CPU 2: 0.0% user, 0.0% nice, 32.3% system, 0.0% interrupt, 67.7% idle
 CPU 3: 0.0% user, 0.0% nice, 35.5% system, 0.0% interrupt, 64.5% idle
 Mem: 34M Active, 777M Inact, 495M Wired, 6080K Cache, 209M Buf, 616M Free
 Swap: 4096M Total, 4096M Free
 PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
 11 root 155 ki31 0K 64K RUN 1 683:14 77.39% idle{idle: cpu1}
 11 root 155 ki31 0K 64K RUN 3 645:47 71.78% idle{idle: cpu3}
 11 root 155 ki31 0K 64K CPU2 2 642:18 67.77% idle{idle: cpu2}
 11 root 155 ki31 0K 64K RUN 0 570:15 51.46% idle{idle: cpu0}
 0 root -92 0 0K 224K CPU0 0 192:43 44.97% kernel{em0 que}
 13 root -16 - 0K 64K sleep 2 84:00 23.00% ng_queue{ng_queue1}
 13 root -16 - 0K 64K sleep 1 84:08 22.75% ng_queue{ng_queue0}
 13 root -16 - 0K 64K sleep 2 83:57 22.36% ng_queue{ng_queue2}
 13 root -16 - 0K 64K sleep 1 83:59 21.19% ng_queue{ng_queue3}
 12 root -92 - 0K 288K WAIT 2 36:13 9.47% intr{irq265: em1:rx 0}
 12 root -92 - 0K 288K WAIT 2 7:31 1.66% intr{irq266: em1:tx 0}
 927 root 20 0 202M 41504K select 1 5:22 0.49% mpd5{mpd5}
 995 root 20 0 31668K 4420K select 3 0:13 0.10% zebra
 12 root -60 - 0K 288K WAIT 1 3:57 0.00% intr{swi4: clock}
 0 root -92 0 0K 224K - 1 2:14 0.00% kernel{dummynet}
 1101 root 20 0 37996K 9240K uwait 0 1:25 0.00% utm5_radius{utm5_radius}
 15 root -16 - 0K 16K - 1 1:18 0.00% yarrow
Under load:
netstat -w1 -h -Iem0
            input           (em0)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
       20k     0      0        21M        16k     0       6.7M     0
       21k     0      0        23M        16k     0       7.1M     0
       22k     0      0        24M        17k     0       7.7M     0
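On this box the work is clearly split between the em0 kernel taskqueue and the four ng_queue threads. All netgraph-related knobs can be listed in one go to compare the two machines (plain sysctl, nothing exotic):
 sysctl net.graph      # maxalloc, maxdata and the rest of the netgraph subtree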
After seeing this I started tweaking the tuning, even though these traffic levels are laughably low.
During peak hours (up to 300-400 Mbit/s through the router) the load average on the beefier server climbs to about 20 and everything gets buried in queues.
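To see where exactly packets start piling up at peak, the standard queue and drop counters are probably worth watching before and after any tuning (sysctl names as on FreeBSD 8/9):
 netstat -Q                               # netisr queue lengths and drops per protocol
 sysctl net.inet.ip.intr_queue_drops      # drops in the IP input queue
 sysctl net.isr.dispatch                  # direct vs. deferred netisr dispatch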
I have ended up with the following settings:
/boot/loader.conf
 netgraph_load="YES"
 ng_ipfw_load="YES"
 ng_car_load="YES"
 autoboot_delay="1"
 kern.ipc.maxpipekva="32000000"
 hw.igb.rxd=4096
 hw.igb.txd=4096
 hw.igb.max_interrupt_rate=32000
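While in loader.conf: the old igb(4) driver also has a tunable for the number of MSI-X queues per port; I have not touched it, but it is often mentioned alongside rxd/txd, so for completeness (the value 6 is just an example):
 hw.igb.num_queues=6   # 0 = autodetect (default), >0 = fixed number of queues per port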
less /etc/sysctl.conf
 net.inet.ip.forwarding=1
 net.inet.ip.fastforwarding=1
 net.inet.ip.redirect=0
 net.inet.ip.intr_queue_maxlen=10240
 net.inet.tcp.delayed_ack=0
 net.inet.tcp.sendspace=3217968
 net.inet.tcp.recvspace=3217968
 net.inet.tcp.blackhole=2
 net.inet.tcp.drop_synfin=1
 net.inet.udp.recvspace=65535
 net.inet.udp.maxdgram=57344
 net.inet.udp.blackhole=1
 net.inet.icmp.drop_redirect=1
 net.inet.icmp.icmplim=200
 net.inet.icmp.drop_redirect=1
 kern.ipc.nmbclusters=400000
 kern.ipc.somaxconn=8192
 kern.ipc.maxsockbuf=83886080
 kern.ipc.maxsockets=204800
 kern.ipc.shmall=8388608
 net.local.stream.recvspace=65535
 net.local.stream.sendspace=65535
but even that did not help.
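To check whether the mbuf-related limits above are anywhere near the bottleneck, the usage and failure counters can be compared against the limits (standard commands):
 netstat -m                    # mbuf/cluster usage, denied/delayed requests
 sysctl kern.ipc.nmbclusters   # the 400000 limit set above
 vmstat -z | grep -i mbuf      # per-zone failure counters, if any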
ipfw.conf
 00010 181884 16382428 deny ip from any to table(50) dst-port 135-139,445
 00010 187498 17600048 deny ip from table(50) to any dst-port 135-139,445
 00030 6 264 deny ip from any to table(20) dst-port 22
 00060 20178202 5843214445 allow ip from 172.16.0.0/16 to 10.0.0.0/8
 00060 4935741 349274219 allow ip from 10.0.0.0/8 to 172.16.0.0/16
 00099 4601461 549651491 allow ip from any to any via lo0
 00180 919184129 425389336692 allow ip from 10.0.0.0/8 to 10.0.0.0/8 in recv igb1
 00180 1196334153 1065094085564 allow ip from 10.0.0.0/8 to 10.0.0.0/8 out xmit igb1
 00210 6833708 685236677 allow ip from 172.16.0.0/16 to 172.16.0.0/16
 00240 1495928 399786619 allow ip from any to me
 00250 2149835 230655428 allow ip from me to any
 01000 34531407 30906027365 allow ip from any to table(30)
 01000 30950224 14061136330 allow ip from table(30) to any
 03000 280389 135646215 netgraph tablearg ip from table(105) to any in recv ng*
 03000 301370 289195850 netgraph tablearg ip from any to table(110) out xmit ng*
 03500 849038581 374112825085 allow ip from table(100) to any
 03500 1057548376 1046505656451 allow ip from any to table(100)
 04000 1385327 1911201279 allow ip from table(99) to any
 04000 731285 51537058 allow ip from any to table(99)
 04100 33872 2621244 allow ip from 192.168.0.254 to any
 04100 4 200 allow ip from any to 192.168.0.254
 65535 34711897 5723493059 deny ip from any to any
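In case it is relevant, the sizes of the tables referenced by the busiest rules are easy to check (table numbers from the listing above):
 ipfw table 100 list | wc -l   # authorized clients (rules 3500)
 ipfw table 105 list | wc -l   # upstream shaper table (rule 3000)
 ipfw table 110 list | wc -l   # downstream shaper table (rule 3000)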
I also added:
 net.graph.maxalloc=65536
 net.graph.maxdata=65536
The failures then disappeared from
vmstat -z | grep Graph
 NetGraph items: 104, 65540, 7, 3328, 9485623, 0, 0
 NetGraph data items: 104, 65540, 0, 2813, 22953496, 0, 0
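As far as I understand, net.graph.maxalloc and net.graph.maxdata are boot-time tunables, i.e. they belong in /boot/loader.conf and only take effect after a reboot; whether they really applied is easy to confirm:
 sysctl net.graph.maxalloc net.graph.maxdata   # both should report 65536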
When the shaper is turned off, everything goes back to normal; these are the rules in question:
 ${fwcmd} add 3000 netgraph tablearg ip from table\(105\) to any in recv ng\*
 ${fwcmd} add 3000 netgraph tablearg ip from any to table\(110\) out xmit ng\*
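One detail that directly affects how expensive these two rules are: if I read ng_ipfw(4) correctly, a packet coming back from the netgraph node is either accepted immediately or continues down the ruleset after rule 3000, depending on a single sysctl:
 sysctl net.inet.ip.fw.one_pass   # 1 = accept on return from netgraph, 0 = re-enter the ruleset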
We shape traffic like this:
 [Thu Jan 3 08:16:46 2013] Action : open
 [Thu Jan 3 08:16:46 2013] Executing '/usr/sbin/ngctl mkpeer ipfw: car 9652 upper', exit code = 0
 [Thu Jan 3 08:16:46 2013] Executing '/usr/sbin/ngctl name ipfw:9652 sh4826', exit code = 0
 [Thu Jan 3 08:16:46 2013] Executing '/usr/sbin/ngctl connect sh4826: ipfw: lower 9653', exit code = 0
 [Thu Jan 3 08:16:46 2013] Executing '/usr/sbin/ngctl msg sh4826: setconf { upstream={ cbs=2176000 ebs=2176000 cir=17408000 greenAction=1 yellowAction=1 redAction=2 mode=0 } downstream={ cbs=2176000 ebs=2176000 cir=17408000 greenAction=1 yellowAction=1 redAction=2 mode=0 } }', exit code = 0
 [Thu Jan 3 08:16:46 2013] Executing '/sbin/ipfw -q table 105 add 172.16.18.218 9652', exit code = 0
 [Thu Jan 3 08:16:46 2013] Executing '/sbin/ipfw -q table 110 add 172.16.18.218 9653', exit code = 0
 [Thu Jan 3 08:16:46 2013] Executing '/sbin/ipfw -q table 100 add 172.16.18.218', exit code = 0
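For debugging a single shaper, the node can be queried the same way it is configured (node name sh4826 is from the log above, message names as documented in ng_car(4)):
 ngctl show sh4826:            # node type, hooks, and what they are connected to
 ngctl msg sh4826: getconf     # current cbs/ebs/cir for both directions
 ngctl msg sh4826: getstats    # green/yellow/red counters per direction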
Can someone tell me where the number 104 comes from?
# vmstat -z | grep Graph
 NetGraph items: 104, 65540, 7, 3328, 9485623, 0, 0
 NetGraph data items: 104, 65540, 0, 2813, 22953496, 0, 0
On the other machine (where everything is fine) it is 72. Could that be the issue?
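For what it's worth, the meaning of that column can be read off the header line: on FreeBSD 8/9 the first numeric column of vmstat -z appears to be the per-item SIZE of the UMA zone in bytes, so a different kernel or architecture can legitimately show a different value, while exhaustion would show up in the later USED/FAIL columns:
 vmstat -z | head -1            # header: ITEM, SIZE, LIMIT, USED, FREE, REQ, FAIL, ...
 vmstat -z | grep -i graph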