Sometimes It’s Not the Network
Marek Majkowski published an awesome real-life story on the CloudFlare blog: users experienced occasional short-term sluggish performance, and while everything pointed to a network problem, it turned out to be a garbage collection problem in the Linux kernel.
Takeaway: It might not be the network's fault.
Also: How many people would be able to troubleshoot that problem and fix it? Technology is becoming way too complex, and I don’t think software-defined-whatever is the answer.
http://blamethenetwork.com/the-moment-you-prove-its-not-the-network/
sysctl net.ipv4.tcp_rmem
net.ipv4.tcp_rmem = 4096 87380 4194304
So it seems that the root cause was someone changing the default settings in an attempt to "optimize".
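For anyone who wants to check their own hosts: the three numbers in the tcp_rmem output are the minimum, default, and maximum TCP receive buffer sizes in bytes. Below is a minimal Python sketch that parses such a line and flags values that differ from the ones quoted above. The names `TcpRmem`, `parse_tcp_rmem`, `differs_from_reference`, and `REFERENCE` are made up for illustration, and the reference values are simply the ones from the sysctl output above; they are not necessarily the defaults on every kernel or machine.

```python
from typing import NamedTuple


class TcpRmem(NamedTuple):
    """The three tcp_rmem values, in bytes."""
    min_bytes: int
    default_bytes: int
    max_bytes: int


def parse_tcp_rmem(line: str) -> TcpRmem:
    """Parse 'net.ipv4.tcp_rmem = 4096 87380 4194304' or a bare '4096 87380 4194304'."""
    # Keep only what follows '=' (if present), then split on whitespace.
    values = line.split("=", 1)[-1].split()
    if len(values) != 3:
        raise ValueError(f"expected 3 values, got {len(values)}: {line!r}")
    return TcpRmem(*(int(v) for v in values))


# Assumption: the values quoted in the sysctl output above serve as the reference.
REFERENCE = parse_tcp_rmem("net.ipv4.tcp_rmem = 4096 87380 4194304")


def differs_from_reference(line: str) -> bool:
    """Return True if the parsed settings differ from the reference values."""
    return parse_tcp_rmem(line) != REFERENCE


if __name__ == "__main__":
    # On Linux the live values can be read from procfs instead of calling sysctl.
    try:
        with open("/proc/sys/net/ipv4/tcp_rmem") as f:
            current = f.read()
    except OSError:
        print("tcp_rmem not readable here (not Linux?)")
    else:
        print(parse_tcp_rmem(current))
        print("differs from reference:", differs_from_reference(current))
```

A host that reports something other than the reference values is not automatically misconfigured, but it is a hint that someone tuned the buffers and that the change is worth a second look.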