audio9.broadcastify.com server downtime on 5/31/23 0330 PDT?

W5EBC

Newbie
Feed Provider
Joined
May 11, 2018
Messages
9
Location
Honey Grove, TX
All,

Follow up: Broadcastify emailed me and said "I received word from our network provider that one of our servers (audio9) was experiencing problems. Can you confirm that us failing it over has resolved the issue?" Kind of funny. The word that he received was from me telling him that they had been offline for over eight hours. Anyway, for future reference and for Mr. Google to index: if you get these error codes, start looking at server-side issues.

Phillip
 

SOUTHBAYSCANNER

Welcome to SouthBayScanner.com
Feed Provider
Joined
Feb 28, 2008
Messages
34
Location
Lawndale, CA
Update at 4:59pm PDT - Looks like the audio9 server is working again, as the feed is finally broadcasting. I didn't even need to reboot.
 

blantonl

Founder and CEO
Staff member
Super Moderator
Joined
Dec 9, 2000
Messages
11,120
Location
San Antonio, Whitefish, New Orleans
Sorry folks, I was traveling during the outage and wasn't able to initiate the recovery process until later that afternoon.

Unfortunately audio master servers (1,3,9) are single points of failure if they decide to die, and they must be manually recovered. So if the team isn't available to do it right away some feeds might be down for a bit.

Don't worry, we're investigating hardening up the architecture to provide for better auto-failover etc. But in the meantime take comfort in knowing that this very very very rarely happens and we provide excellent uptime and resiliency for your feeds overall during the year.
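
For readers wondering what "auto-failover" means in practice, here is a minimal sketch of the general idea. It is not Broadcastify's actual architecture. The class name, the standby name "audio9-standby", and the threshold of three failed probes are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Hypothetical monitor: switch to a standby after repeated failed probes."""
    primary: str
    standby: str
    threshold: int = 3   # assumed: consecutive failed probes before failing over
    failures: int = 0
    active: str = ""

    def __post_init__(self) -> None:
        # Traffic starts on the primary server.
        self.active = self.primary

    def record_probe(self, healthy: bool) -> str:
        """Record one health probe of the primary and return the active server."""
        if self.active != self.primary:
            # Already failed over; moving back to the primary stays a manual step.
            return self.active
        self.failures = 0 if healthy else self.failures + 1
        if self.failures >= self.threshold:
            self.active = self.standby
        return self.active


monitor = FailoverMonitor(primary="audio9", standby="audio9-standby")
history = [monitor.record_probe(ok) for ok in (True, False, False, False, True)]
```

In this sketch a single failed probe does nothing. Only three failures in a row move traffic to the standby, which avoids flapping on a brief network blip.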
 

blantonl

Founder and CEO
Staff member
Super Moderator
Joined
Dec 9, 2000
Messages
11,120
Location
San Antonio, Whitefish, New Orleans
W5EBC said: "Kind of funny. The word that he received was from me telling him that they had been offline for over eight hours."
Actually, the word that we received was numerous alerts and messages from our monitoring system, in addition to 180 trouble tickets.

Everyone here can take comfort in knowing that the SECOND something happens on this site or RadioReference, we receive hundreds of alerts and complaints, and we are intimately aware of the issue, almost certainly before you are.

The issue arises where we might not be in a position to recover it within minutes, or our ability to respond to a problem is limited due to travel, hardware availability, or us not knowing which of the 4,205 different possible issues it could be at the time. It's technology and stuff breaks sometimes. All I can ask is for your patience if we have an outage or something happens. In the grand scheme of things, we do a pretty good job of keeping things running smoothly around here.

This extended outage for one server was the result of me flying an airplane and traveling when the outage occurred. Murphy's law dictates that I can work in my office for 6 months straight and the entire infrastructure will hum along nicely with no problems, but the second I go travel or get on an airplane something will break. 😀
 

hruskacha

Member
Premium Subscriber
Joined
Nov 9, 2020
Messages
244
Location
Muskegon
We appreciate your dedication to your company and the RR community, especially during your own personal time. You already go above and beyond for us, and it is awesome of you to communicate with us directly via the forums. It's already been said: we feed providers do take pride in our uptime stats, but tech is tech and things still break without any interaction. Let's look forward to what scanner calls lie ahead and hope this never happens again.

Also, some brief feedback: if a widespread issue is discovered across numerous feeds, it would be nice to have an automatic error message instead of each feed provider having to type their own description of the "outage". For example, once it is known that audio9 is down, have all audio9 feeds display a server error message instead of just "offline". I think it could reduce panic across the board and reduce ticket requests.
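
To make the suggestion concrete, here is a rough sketch of how a per-server status message could be derived. The `Feed` class, the feed IDs, and the message wording are all made up for illustration. I have no idea how Broadcastify actually stores feed or server state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feed:
    """Hypothetical feed record; field names are assumptions."""
    feed_id: int
    server: str   # e.g. "audio9"
    online: bool


def status_message(feed: Feed, servers_down: set) -> str:
    """Return the status text to show for a feed, given the servers known to be down."""
    if feed.online:
        return "online"
    if feed.server in servers_down:
        # The outage is on the server side, so say so instead of a bare "offline".
        return f"offline: server issue on {feed.server}, no action needed by the feed provider"
    return "offline"


feeds = [
    Feed(101, "audio9", False),
    Feed(102, "audio3", False),
    Feed(103, "audio9", True),
]
messages = {f.feed_id: status_message(f, {"audio9"}) for f in feeds}
```

Here feed 101 would show the server-issue message, feed 102 would still show plain "offline" because its server is not flagged, and feed 103 stays "online".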
 

iceman977th

Member
Feed Provider
Joined
Dec 25, 2009
Messages
388
Location
Catlettsburg, KY
Genuine question: is there not more support staff that can handle these types of failures if you’re not available? I’m sure RR/BCFY isn’t a one-man show, or I at least hope it’s not, in that regard. I understand the entire site is full of volunteers for feed providers, and things are up 99.95% of the time, but that 0.05% of the time could coincide with a major event. You just never know.

I promise I’m not being demanding and picky, I’m just curious if it is just you or if there are more people involved & it was just a poorly timed event overall.

Mike
 