Director's Report: DEBRIEF PART 2: Technical Details and the Future

(Updated: Friday 5th March 5:00PM)

5 new in last 7 days

14/03/2010
Mail Access Servers Temporarily Unavailable 
14:50 - Mail access is temporarily suspended. Investigating now.
15:10 - It seems both mail access servers went down simultaneously. Engineers are on the scene now and should have things running again shortly. Update within 20 minutes at the latest.
15:15 - We're back online.

12/03/2010
Lightning Strike - Service Outage
16:35
All storage servers online. Any email sent to your account would have been accepted by our other servers during the outage and will be delivered to you shortly as the servers quickly process the backlog.
No further updates planned, service is now fully online.

16:30
2nd mailserver has been restored to working status and the failed storage server has been attended to and is now online.
Remounting into the mail cluster is taking place now. Full access should be restored within minutes.

16:15
Still no news on the fate of the unresponsive server. Disaster recovery has started (just in case) by restoring a backup of the data usually stored on it to a reserve server. No estimate on time at this stage.

16:00
Datacenter has confirmed there was a power surge, which appears to have damaged one of the industrial-sized UPS units that keep power running to the building in the event of power failures.

15:45
Datacenter has confirmed unresponsive server is actually on but we're unable to reach it externally.
They're investigating possible causes now.

15:30
As with any outage, inbound email will have been queued and will be delivered in due course.
Still awaiting information on the one server that has gone down.

15:15
A lightning strike hit the power lines entering the building, causing the power breakers to overload. The UPS systems appear to have worked, in that the servers themselves remained online, but the strike knocked out portions of the datacenter network.

15:00
Our datacenter has been directly struck by lightning, causing the entire power system to trip. Power is back on; however, we're still trying to determine the status of at least one of the servers, which hasn't powered back up.

14:55
The network issue has caused two storage servers to need remounting. Should be fully online in a few minutes.

14:50 - 14:55
An intermittent issue with the network caused access to be restricted during this 5 minute period.


12/03/2010
Month Subscription
As announced in the Director's Report following the recent storage failures, all subscribed accounts will be receiving the additional month (30 day) subscription today. It will be listed under 'My Account' -> 'Account Status / Billing' once applied.
[Update] 11:35am - All accounts have now been updated.

11/03/2010
Overnight Storage [Resolved]
One of the storage servers still hadn't mounted after last night's power breaker failure, despite appearing to have done so. As a result, those of you with email on this server were unable to access your email from 20:36 last night until 08:15 this morning.


We sincerely apologise for this further outage. This is the 4th week in a row we've suffered some sort of hardware failure that has affected the service, and we're seriously considering all available options. Up until now these simply included new machines built with dual power supplies and feeds, but discussions now include moving to a new datacenter provider, as years of trust have been wiped out by the constant stream of failures outside of our control.

11/03/2010
Brief Outage [Resolved]
20:36 - 20:42
Brief outage caused by 8 servers all appearing to lose power at the same time, despite being on battery/generator backups in case of failure. Currently investigating the exact cause with the datacenter.


20:55
One of the storage servers isn't responding, so some of you will be unable to log in at the current time. Should have it online in a few minutes.


20:56
Storage remounted, service fully online. Still determining original cause of the outage.

22:57
Datacenter has confirmed that a faulty power breaker tripped and had to be replaced with an on-site spare. Power was down for under 5 minutes but it took a few further minutes for the servers to boot back up. As you know we have been investigating A/B power feeds to prevent any single power fault from taking down the service.

04/03/2010
Datacenter Power Failure [Resolved]
19:37 - Power circuit failed in the datacenter, knocking out 3 servers. Service was slow as two of the servers were webservers, leaving web access restricted in speed. Client access and inbound email are both unaffected as no mail servers were on the failed circuit.


20:07 - All affected servers back online, web access stabilising across the webservers. Service may be a little slow for the next 10-15 minutes as users online are redistributed over available webservers.

03/03/2010
Temporary access issues
There was a temporary problem with access this morning between 9.45am and 10.15am due to a database server issue.


This has now been resolved.


10:25am - Updated, issue has been found and fixed.  If you have any problems, try resending your email.

03/03/2010
Thursday Emails Redelivery + Service Maintenance Notice [COMPLETED]
To those members concerned about the Thursday redelivery, please be advised that this is now scheduled to happen tonight from 10pm. Other members, please be aware that we are scheduling this off-peak because there will be some delays on inbound mail during this time as we process about 30,000 emails through the system. We will update this post when this has completed.
22:15 - This work is starting now. Update in approx 45 minutes.
23:05 - Mail redelivery is under way. This should hopefully complete within 1-2 hours. We will give a progress update at midnight.
00:05 - Redelivery is progressing well and should not take too much longer. Next update within 30 minutes.
00:30 - Restoration complete and full email flow restored. The held mail backlog will be cleared shortly.
01:00 - Mail backlog cleared and all operating as normal.

01/03/2010
Attachment Issues (Resolved)
Attachments could not be sent over the weekend. This was caused by new attachments being unable to be written while sending. The problem was fixed at around 7pm on Sunday evening.

28/02/2010
Mail between Saturday 20th - Thursday 25th
A small portion of members will have noticed that emails received between the 20th and 25th are not showing up. These have not been deleted but need to be resynced with your account following last night's server fault. This has been happening throughout the day but may take some time. Please bear with us. We will let you know when all accounts have finished syncing.

[Update]
The bulk of the resync is complete. Some users may find that messages from Thursday are not present; these will be resynced on Monday morning.

System Status
16th Mar @ 10:15
  • Incoming Mail
  • Web Access
  • Client Access
  • Hosting Service

Welcome to PidgeMe

This service has been set up to provide University of Oxford graduates with an independent service that extends that of the OUCS. PidgeMe now also acts as an email hosting service for Oxford Society oxon.org forwarding-only addresses. All new applicants must first obtain one of these addresses from www.oxon.org

We give a FULL email solution that is tailored towards ease of use and functionality and comes with top-class, responsive technical support at a level you will never find with a standard public email service.
Free Webmail
  • @pidgeme.com address
  • Email Forwarding
  • Auto-responders
  • Spam + Virus Filters
  • 10MB Storage
  • Webmail Access
Free for Life

Enhanced Accounts
ALL features of Free accounts PLUS:
  • Remote mail access
       - Outlook / Mobile via IMAP/POP
  • Send mail via private servers
       - Authenticated SMTP
  • Storage capacities from 250MB-11000MB
  • Personal Web hosting option
From £4.99 / year
