Earlier, I wrote about using m0n0wall as a captive portal in an environment with an extremely large number of users. From what I could see remotely at the load balancer, the system appeared to be running fine, since all the WAN links were in use. But the manager of the apartment complex called to tell me that users had been complaining that they get logged out of the captive portal and have to re-enter their passwords. This often happens in the evenings, when usage peaks.
I asked the manager if the users all called at the same time to complain about this, which could indicate that m0n0wall had crashed and rebooted, thus kicking all users out from the portal. The manager wasn't sure, but he said that he gets calls every single evening.
So I drove out 100 miles to see it again, in person.
It turns out that there still aren't that many users on the system, but those who are using it are hogging the bandwidth by running P2P software. I searched a bit and found that m0n0wall has a hard-coded limit of 30,000 connections (firewall states), which can't be changed unless the kernel is recompiled. It also appears that when the limit is reached, m0n0wall crashes. Hmm.
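To put rough numbers on that limit, here is a back-of-the-envelope sketch in Python. The 30,000 figure is the one I found above; the per-user connection counts are made-up assumptions for illustration, not measurements from this site, and `users_to_fill` is just a helper name I picked.

```python
# Back-of-the-envelope: how many heavy P2P users does it take to fill
# the firewall state table?
STATE_LIMIT = 30_000  # state limit quoted in the post


def users_to_fill(limit: int, states_per_user: int) -> int:
    """Smallest number of users whose combined states reach the limit."""
    if states_per_user <= 0:
        raise ValueError("states_per_user must be positive")
    # Ceiling division without importing math.
    return -(-limit // states_per_user)


if __name__ == "__main__":
    # Hypothetical states held open by one P2P client (assumed values).
    for per_user in (200, 500, 1000):
        print(per_user, "states/user ->", users_to_fill(STATE_LIMIT, per_user), "users")
```

If a single P2P client really does hold several hundred states open, a few dozen such users would be enough to hit the limit, which would fit with seeing the problem even though not many people are on the system.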
I could switch to pfSense and run it on a PC. pfSense doesn't seem to have this problem, since it has a configurable maximum states option based on the amount of RAM installed, but the current release is tagged ALPHA-ALPHA, which scares me a little bit. Or I could switch to some other system not based on m0n0wall, but I really like the captive portal, and the apartment complex people also intend to use the vouchers in m0n0wall.
So I changed the TCP idle timeout value and gave P2P connections the minimum weight in the traffic shaper. I also remembered to set m0n0wall to allow remote access, so I don't have to drive out again if all I need to do is look at users and change some settings.
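The reasoning behind touching the idle timeout can be sketched with Little's law: the number of entries sitting in the state table is roughly the rate of new connections times how long each entry lives. The sketch below assumes the worst case, where every connection lingers for the full idle timeout after its last packet (as happens when a P2P peer vanishes without closing). All the numbers are hypothetical, and the function name is my own.

```python
# Little's law sketch: entries in table ~= arrival rate * time in table.
def steady_state_entries(new_per_sec: float, active_secs: float,
                         idle_timeout_secs: float) -> float:
    """Estimate state-table occupancy.

    Assumes each connection is active for active_secs and then stays
    in the table for the full idle timeout before being expired.
    """
    return new_per_sec * (active_secs + idle_timeout_secs)


if __name__ == "__main__":
    rate = 10      # assumed new connections per second across all users
    active = 30    # assumed seconds a connection actually carries traffic
    for timeout in (3600, 600):  # assumed idle timeouts, in seconds
        print("timeout", timeout, "s ->",
              steady_state_entries(rate, active, timeout), "states")
```

Under these assumptions, a shorter idle timeout shrinks the table roughly in proportion, which is why it seemed worth trying before anything more drastic.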
But I forgot to take photos of the hardware setup again.