Affecting System - Fast Cloud Host
One of our host servers has an as-yet-undetermined hardware failure. Some customer VMs are affected, so we are investigating the problem to determine the best course of action. All other hosts in our Fast Cloud clusters are operating correctly. We will update this notice as we gather more information.
Update 1 - The server is still operational, but has lost network connectivity. It is also running very slowly. We are investigating why its network hardware is not responding.
Update 2 - We have determined that two dual-port NICs are not functioning on this server. Since more than one NIC is affected, we do not believe this to be a hardware issue. We are now investigating the software to determine why both NICs are not functioning. We have also engaged our software vendor/partner VMware to help identify any issues on their end.
Update 3 - After troubleshooting the software side of the connectivity issue with VMware, we found the NICs to be working from a hardware and driver perspective. The distributed virtual switch that carries the VM network traffic was not working. Other distributed virtual switches on the same host were working correctly. VMware narrowed the problem down to a software glitch in that particular networking configuration on the affected host. We have removed the host from production and have scheduled it to be reloaded from scratch to regenerate all of its configuration. VMs that were on this host have been restarted and they all appear to be fully functional as of 1AM on 10/14/2015. If you still have any outstanding issues with your products, please call support and we will gladly review them.
Affecting System - Network Packet Loss
10:00AM - We have investigated reports from customers, mainly those on Time Warner Telecom internet lines, who are having trouble reaching Dallas destinations that sit behind an Abovenet router. The router that appears to be at fault is about three hops away from our data center. We called Abovenet and they confirmed that the router is congested. It is in the Equinix data center inside the Dallas InfoMart carrier hotel. Any customer trying to reach a destination over an internet service that has that particular Abovenet router in its path will experience intermittent packet loss. Given how close that router is to our network, some traffic destined for or coming from our network will traverse it. Several customers on Time Warner Telecom, Cox and Road Runner internet services have reported packet loss. Customers on Comcast, ATT, Level 3 and various other internet service providers have confirmed they are not having any issues reaching our networks, since their traffic takes a different path over the internet. We have been calling Abovenet throughout the day to check on their progress. We have checked many of the other Abovenet routers and paths in the vicinity and they all appear to be operating correctly, so we believe the problem is limited to that one Abovenet router.
12:00PM - Abovenet has removed their problematic router from the local peering. Internet traffic that used to traverse that router now follows a different path. They plan to re-add this router to the local peering network at 12:00AM today, after they resolve the problem it is having.
12:00AM - Abovenet has re-added their problematic router to their peering network. We have monitored traffic that goes through this router and it is no longer experiencing the previous packet loss. We will continue to monitor the router Friday morning to see how it performs under heavier load.
Affecting System - vCenter 6.0 Servers
We will be applying our monthly security updates to our vCenter 6.0 infrastructure. vCenter services will be offline while they are restarted after we apply the patches. Customers will be unable to use vCenter resources for a window of 10 to 30 minutes.
- Maintenance was completed without incident.