Posted by naoliv on Wed 6 Aug 2008 at 23:19
We are having a strange problem where I work, and I don't know how to debug it.
Our central switch is a stacked 3Com 5500G-EI SFP + 5500G-EI. Leaf switches are 3Com 2948-SFP, connected to the 5500G via optical fiber.
One of those 2948s has further 3Com Baseline switches connected to it.

It's more or less this (

The network is "big" (about 300 machines). (Yes, I know: "break up this network", "create subnets", etc. If everything goes well, we will have a better network structure someday.)

But well, what is happening these days is that the network stops working. We work at [1], and we can't ping the other machines connected to the same switch at [1], nor can we reach the other locations. The same happens for people at other locations, such as [2].

It seems to be something that spreads across the entire network, but I have no idea what it could be.
Some days it lasts only 10 seconds and then everything gets back to normal. Today the network was down for almost 1 hour. The strangest thing is that it seems to stop around 5:00 PM.
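
To check whether the outages really cluster around 5:00 PM, a reachability log could be reduced to outage windows. This is only a minimal sketch in Python: it assumes a hypothetical log of (timestamp, reachable) samples, for example one ping per second to a machine on the same switch, and the function name `outage_windows` is made up for illustration.

```python
from datetime import datetime, timedelta


def outage_windows(samples):
    """Collapse (timestamp, reachable) samples into (start, end) outage windows.

    `samples` must be sorted by timestamp. A window starts at the first
    failed sample and ends at the first successful sample after it. If the
    log ends while the host is still unreachable, the window ends at the
    last failed sample.
    """
    windows = []
    start = None
    last_down = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_down = ts
        elif start is not None:
            windows.append((start, ts))
            start = None
    if start is not None:
        windows.append((start, last_down))
    return windows


if __name__ == "__main__":
    # Hypothetical data: one sample per second, unreachable for seconds 3..5.
    t0 = datetime(2008, 8, 6, 17, 0, 0)
    log = [(t0 + timedelta(seconds=i), not (3 <= i <= 5)) for i in range(10)]
    for begin, end in outage_windows(log):
        print(begin.isoformat(), "->", end.isoformat(), "duration", end - begin)
```

Comparing the start times of the windows over several days would show whether the 5:00 PM pattern is real or a coincidence.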

Do you have any idea what could cause something like this? A worm, somebody running a malicious program, a faulty network cable, a broken switch? What can we use to debug this, please?

Thank you very much! Edit: See comment #5 for more info, please.


Posted by naoliv on Tue 12 Jun 2007 at 13:40
We are having some difficulty finding a good solution to a problem. There is a computer connected to 3 ADSL lines (two 8M and one 2M), with one card on the internal network and, in the future, a new card connected to a radio link (giving 4 connections to the world and one connection to our network).

What we want is some kind of load balancing and failover across the available links (so that all links get used and, if one has a problem, traffic continues on the other links). We would also like a priority system, where traffic is sent preferably over the two 8M links, then over the 2M link and lastly over the radio link.
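
To make the desired policy concrete, here is a minimal sketch in Python of the selection logic only, not of any actual routing setup. It assumes one possible reading of the requirement: traffic is balanced round-robin over the live links of the most preferred tier, and falls back to the next tier only when every link in the better tier is down. The link names, the `up` flags and the functions `usable_links` and `pick` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str  # hypothetical label for the uplink
    tier: int  # lower number = more preferred
    up: bool   # result of some health check (not shown here)


def usable_links(links):
    """Return the live links in the most preferred tier that has a live link."""
    live = [link for link in links if link.up]
    if not live:
        return []
    best = min(link.tier for link in live)
    return [link for link in live if link.tier == best]


def pick(links, connection_number):
    """Round-robin a new connection over the currently usable links."""
    candidates = usable_links(links)
    if not candidates:
        return None  # every uplink is down
    return candidates[connection_number % len(candidates)]


if __name__ == "__main__":
    uplinks = [
        Link("adsl8m-a", 0, True),
        Link("adsl8m-b", 0, True),
        Link("adsl2m", 1, True),
        Link("radio", 2, True),
    ]
    for n in range(4):
        print(n, pick(uplinks, n).name)
```

Under this reading, the 2M and radio links carry traffic only during a failure of the better links; if all links should carry traffic all the time, the tiers would instead have to become weights.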

Lokiwall seems to be the tool we need, but it requires two patches to be applied to the kernel (and, if possible, we don't want to modify our firewall's kernel).

Does anybody know if it's possible to do this, please?

Thank you!