We started using the built-in firewall in our Proxmox cluster, and from time to time VMs began to time out. At first we could not find the cause. When it happened again, we finally tracked down the root of the problem.
The following line started to appear in the syslog (on both the VMs and the host machine):
nf_conntrack: table full, dropping packet
You can check the current conntrack count with the following command:
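A minimal sketch, assuming the counters live in the standard Linux location /proc/sys/net/netfilter and the nf_conntrack module is loaded. The helper name conntrack_usage and the CONNTRACK_DIR variable are made up for illustration:

```shell
# Directory holding the conntrack counters (assumption: standard Linux path).
# Overridable so the helper can be pointed at another location.
CONNTRACK_DIR="${CONNTRACK_DIR:-/proc/sys/net/netfilter}"

# Print the current number of tracked connections against the maximum.
conntrack_usage() {
    count=$(cat "$CONNTRACK_DIR/nf_conntrack_count") || return 1
    max=$(cat "$CONNTRACK_DIR/nf_conntrack_max") || return 1
    echo "conntrack: $count / $max ($((count * 100 / max))% used)"
}
```

Calling conntrack_usage on the host prints one line. If the count is at or close to the maximum, the table is full and new connections get dropped.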
Our problem was that the table was already maxed out. There are ways to raise the maximum from the CLI, but they did not stick for us: within a few seconds the value was reset to the default, which is 65536.
Solution: increase the setting via Proxmox’s web interface. You can find it under Datacenter/VM/Firewall/Options; the option is named nf_conntrack_max. Set it to the value you need. If you have more than one machine in the cluster, you have to edit this option on every one of them!
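To check that no machine was missed, the stored setting can be listed for the whole cluster. This is a sketch under assumptions: that each node's firewall options are kept in /etc/pve/nodes/<node>/host.fw and that the option is written there as a line of the form "nf_conntrack_max: <number>". The helper name list_conntrack_max and the PVE_NODES_DIR variable are made up for illustration:

```shell
# Base directory of the per-node configuration (assumption: default Proxmox layout).
PVE_NODES_DIR="${PVE_NODES_DIR:-/etc/pve/nodes}"

# Print "<node>: <value>" for every node, or "<node>: not set" when the
# option is missing from that node's host.fw (or the file does not exist).
list_conntrack_max() {
    for dir in "$PVE_NODES_DIR"/*/; do
        [ -d "$dir" ] || continue
        node=$(basename "$dir")
        value=$(sed -n 's/^[[:space:]]*nf_conntrack_max:[[:space:]]*\([0-9][0-9]*\).*/\1/p' \
            "$dir/host.fw" 2>/dev/null | head -n 1)
        echo "$node: ${value:-not set}"
    done
}
```

Running list_conntrack_max on any cluster member should show the same number for every node. A node reported as "not set" is still on the default.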