About Torren

I'm a Microsoft Certified Solution Master for Communication, aka Skype for Business. As a consultant I work in all areas of SfB design, deployments, and support, with a particular focus on voice workloads, networking, and high availability.

How emergency calling works from various deployment scenarios: SfB Server

At a recent user group meeting, an attendee asked about all of the various call flows for emergency calls, from Skype for Business Server, Skype for Business Online, hybrids, and Teams. Over the next couple of posts, I’ll cover how the various scenarios work. First up, Skype for Business server.

SfB Server has native emergency calling functionality, using Location Policies, Sites, and Subnets. You can use this functionality to override the caller ID (thus setting an ELIN, or Emergency Location Identification Number) and route the call out via the appropriate gateway. You can also turn on alerting via IM, though this doesn’t do a lot of good unless you go to the next step…

SfB Server also has a Location Information Service, or LIS. This service uses the BSSID (the identifier of the wireless access point radio a device is connected to), subnet, switch port, or switch to locate the endpoint. Additionally, you can integrate with 3rd party servers/services that perform this LIS role. Some of the 3rd party services may perform better if you need to get down to the switch/switchport level to determine a location. The various locations (ERL, or Emergency Response Location) are programmed by you into the LIS, and associated with the subnet/BSSID/switch/switchport data. Yeah, this is a LOT of work to keep up! You enter the location in a format called MSAG, or Master Street Address Guide. This is a strict format that helps avoid confusion, especially when you get into suite numbers, floor numbers, and things like “east”. I have seen addresses like “235 East Highway 16 West”. It’s important to get things in the right place!
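To make the LIS idea concrete, here is a minimal Python sketch of a location lookup. It is only an illustration: the table contents, the field names, and the "most specific identifier wins" order are my assumptions, not the actual SfB LIS schema or matching rules.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Location:
    """One ERL entry. Field names are illustrative, not the LIS schema."""
    street: str
    city: str
    detail: str  # floor, suite, wing, and so on


# Hypothetical LIS tables, one per network identifier type.
BSSID_TABLE = {
    "00-11-22-33-44-55": Location("235 Highway 16", "Exampleville", "Floor 2, east wing"),
}
SWITCH_PORT_TABLE = {
    ("aa-bb-cc-dd-ee-ff", "Gi1/0/12"): Location("235 Highway 16", "Exampleville", "Floor 1, room 104"),
}
SWITCH_TABLE = {
    "aa-bb-cc-dd-ee-ff": Location("235 Highway 16", "Exampleville", "Floor 1"),
}
SUBNET_TABLE = {
    ip_network("10.20.30.0/24"): Location("235 Highway 16", "Exampleville", "Building A"),
}


def find_location(bssid: Optional[str] = None,
                  switch: Optional[str] = None,
                  port: Optional[str] = None,
                  client_ip: Optional[str] = None) -> Optional[Location]:
    """Return the first match, trying the most specific identifier first (assumed order)."""
    if bssid in BSSID_TABLE:
        return BSSID_TABLE[bssid]
    if (switch, port) in SWITCH_PORT_TABLE:
        return SWITCH_PORT_TABLE[(switch, port)]
    if switch in SWITCH_TABLE:
        return SWITCH_TABLE[switch]
    if client_ip is not None:
        addr = ip_address(client_ip)
        for subnet, location in SUBNET_TABLE.items():
            if addr in subnet:
                return location
    return None  # no match: the client has no known location
```

Every row in those tables is something a human has to enter and keep current, which is where the "LOT of work" comes from.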

Once SfB knows your location from the LIS process, it includes the address as a PIDF-LO (Presence Information Data Format Location Object) in the SIP INVITE that it sends to the gateway. There are a couple of connectivity options here:

If you have a gateway with ELIN capability (and licenses), the gateway can use the PIDF-LO to select an ELIN, then send the call via the PSTN to the PSAP. If the PSAP needs to call back, the ELIN gateway maintains the translation for 30 minutes (usually configurable if 30 minutes doesn’t work for you).
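The callback translation is essentially a lookup table with an expiry timer. Here is a rough Python sketch of the idea; the class and method names are made up for illustration and don't correspond to any gateway vendor's implementation.

```python
import time
from typing import Dict, Optional, Tuple


class ElinCallbackTable:
    """Remembers which caller used an ELIN so a PSAP callback can reach them."""

    def __init__(self, ttl_seconds: float = 30 * 60):
        # 30 minutes, matching the typical default mentioned above.
        self.ttl_seconds = ttl_seconds
        self._entries: Dict[str, Tuple[str, float]] = {}

    def record_emergency_call(self, elin: str, caller_number: str,
                              now: Optional[float] = None) -> None:
        """Store the ELIN-to-caller mapping when the emergency call goes out."""
        now = time.time() if now is None else now
        self._entries[elin] = (caller_number, now + self.ttl_seconds)

    def resolve_callback(self, elin: str, now: Optional[float] = None) -> Optional[str]:
        """Return the original caller for an inbound call to the ELIN, if still valid."""
        now = time.time() if now is None else now
        entry = self._entries.get(elin)
        if entry is None:
            return None
        caller_number, expires_at = entry
        if now >= expires_at:
            del self._entries[elin]  # translation has aged out
            return None
        return caller_number
```

Once the entry expires, a call to the ELIN no longer maps back to the person who dialed 911, which is why the timer length matters.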

If you’re in multiple PSAP jurisdictions, you’ll need a SIP trunk PSTN service that covers all of them. If you can’t get a SIP trunk that does that, or if you’re using PRIs, you may need to route the emergency call from gateway to gateway within your organization until it reaches the gateway in the correct location. You can make these routing decisions based on the ELIN (oh, 425-123-xxxx is Redmond, so send the call to that gateway, then send it to 911). You can’t route using 911 as the destination address, so this can turn into a bit of a routing mess.
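As a sketch of what routing on the ELIN rather than on 911 looks like, here is a small prefix table in Python. The prefixes and gateway names are hypothetical, and a real gateway would express this in its own routing tables rather than in code.

```python
from typing import Optional

# Hypothetical mapping of ELIN prefixes to the gateway at that location.
ELIN_PREFIX_TO_GATEWAY = {
    "+1425123": "gw-redmond.example.com",
    "+1212555": "gw-newyork.example.com",
}


def gateway_for_elin(elin: str) -> Optional[str]:
    """Pick the gateway whose prefix matches the ELIN (longest prefix wins)."""
    matches = [prefix for prefix in ELIN_PREFIX_TO_GATEWAY if elin.startswith(prefix)]
    if not matches:
        return None  # unknown ELIN: needs a default emergency route
    return ELIN_PREFIX_TO_GATEWAY[max(matches, key=len)]
```

The routing decision keys off the calling number (the ELIN), because the called number is 911 for every site.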

An emergency call goes from a user to SfB Servers to a gateway, then via the PSTN to the PSAP

Routing emergency calls is easy when you only have one site.

If you want to avoid the routing headaches just described, you can also use 3rd party solutions from companies like West and RedSky. They have the LIS systems described above, but can also handle the ELIN translation function, and add enhanced notification/alerting options. Both offer services where your emergency calls are sent to their response centers, and then routed to the appropriate PSAP. This routing takes place automatically if the information included (think PIDF-LO) is valid and matches their records. If it’s not, an operator answers the call, gathers location information, and sends the call to the appropriate PSAP.

SfB users in two sites place calls, via the same SfB Servers, but then to different PSAPs via different gateways and PSTN services

Emergency call routing is more complex with multiple sites and/or multiple PSTN services. You MUST route emergency calls to the correct PSAP!

When you use these services, you also gain the option to have your receptionist/security desk conferenced into the call. This may be listen-only, or they may be able to speak. Listen-only keeps the call taker at the PSAP from getting confused as to what’s going on at the scene. Call taking is stressful: take the stress of a help desk employee trying to decipher a technology problem over the phone, and now add the pressure of time and safety. The conference function allows the reception/security desk personnel to take action locally – send a security or first aid team to the location, evacuate the building, or meet emergency responders to direct them to the site.

Next up, we’ll hop online and see what Skype for Business Online and MS Teams can do for us, when using PSTN Calling services from Microsoft.

Teams, e911 dynamic locations, and Location Based Routing

Two features that have been noticeably missing from Microsoft Teams are LBR (Location Based Routing) and dynamic location support for e911. Both have been available for on-prem deployments since the days of Lync. With this announcement https://techcommunity.microsoft.com/t5/Microsoft-Teams-Blog/Additional-Voice-Features-for-the-New-Year/ba-p/295062 LBR is now in preview and is expected to be generally available by the end of Q1.

What is LBR
In Skype for Business, users are assigned a voice policy. That policy links usages and routes. Together, these determine whether the user is permitted to call a number, and what path through the system the call will take. If William from New York is in his office and calls a customer down the street, that call will travel through the SfB system and exit to the PSTN in New York. If William travels to Los Angeles and calls that same customer, the call will flow back to the New York office (via the WAN if William is on the corporate network, otherwise via the Internet and Edge server) and will exit to the PSTN in New York.

In some countries, this type of routing isn’t permitted. When William is in Los Angeles, the call to his customer in New York must flow via the PSTN. There may also be restrictions on when you can blend PSTN and SIP calls in a conference. For example, you may be able to have PSTN callers join a SfB meeting, but only from one location. Thus, instead of call routing being done via the policies assigned to the user, we have Location Based Routing – the call routing is determined by the location of the caller.

In SfB, configuring LBR meant entering your IP subnets and assigning them to sites. Each site would then be configured to route PSTN calls via a particular gateway. Further policies within SfB would do things like block two PSTN sites from joining a conference.
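Here is a small Python sketch contrasting the two routing models, using William's New York / Los Angeles example. The subnets, site names, and gateway names are invented, and the fallback for a caller on an unknown subnet is a simplification rather than documented SfB behavior.

```python
from ipaddress import ip_address, ip_network
from typing import Optional

# Hypothetical subnets, sites, and gateways.
SUBNET_TO_SITE = {
    ip_network("10.1.0.0/16"): "NewYork",
    ip_network("10.2.0.0/16"): "LosAngeles",
}
SITE_TO_GATEWAY = {
    "NewYork": "gw-nyc.example.com",
    "LosAngeles": "gw-lax.example.com",
}


def site_for(client_ip: str) -> Optional[str]:
    """Map the caller's current IP address to a configured site."""
    addr = ip_address(client_ip)
    for subnet, site in SUBNET_TO_SITE.items():
        if addr in subnet:
            return site
    return None


def egress_gateway(home_site: str, client_ip: str, lbr_enabled: bool) -> str:
    """Choose the PSTN gateway for an outbound call."""
    if lbr_enabled:
        current_site = site_for(client_ip)
        if current_site is not None:
            # LBR: where the caller is right now decides the exit point.
            return SITE_TO_GATEWAY[current_site]
    # Simplified fallback: the user's own voice policy decides.
    return SITE_TO_GATEWAY[home_site]
```

With LBR off, William's call from a Los Angeles subnet still exits in New York. With LBR on, the same call exits in Los Angeles.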

The challenge in trying to build something like LBR in Teams versus in SfB comes down to the uniqueness of the IP address, which is used to establish the user’s location. In SfB, your office and favorite coffee shop might share the same IP subnet. However, SfB knew if you were on the corporate network or not based on whether your client was connected directly to your Front-End pool, or was connecting via the Edge pool.

With Teams, the Edge and Front-End infrastructure isn’t there to help disambiguate the subnet that a user is on. Reading through the LBR documentation https://docs.microsoft.com/en-us/microsoftteams/location-based-routing-configure-network-settings we can see a new cmdlet

New-CsTenantTrustedIPAddress

This cmdlet lets you define your external IP addresses and assign them to your tenant. For example:

New-CsTenantTrustedIPAddress -IPAddress 198.51.100.0 -MaskBits 30 -Description "HQ Internet"

When your Teams client or device traverses a NAT firewall and has a matching public IP, the tenant now knows that this Teams client/device is on an internal network, and it can apply LBR according to the internal subnets and sites that you’ve defined.
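The trusted IP check boils down to "is this public address inside one of the ranges I registered?". Here is a tiny Python sketch of that test, reusing the 198.51.100.0/30 range from the example above. This is my reading of the logic, not Microsoft's implementation, and the function name is made up.

```python
from ipaddress import ip_address, ip_network

# Ranges registered with New-CsTenantTrustedIPAddress (example range from above).
TRUSTED_RANGES = [ip_network("198.51.100.0/30")]


def is_on_corporate_network(public_ip: str) -> bool:
    """True if the client's public (post-NAT) address is in a trusted range."""
    addr = ip_address(public_ip)
    return any(addr in trusted for trusted in TRUSTED_RANGES)
```

Only when this check passes does it make sense to look at the client's internal subnet, since the same private subnet could just as easily belong to the coffee shop.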

What about e911?

Emergency calling (e911) and LBR both require the same underlying technology to be able to identify a user’s location. With this basic foundation in place, we can likely expect to see subnet-based location policies for e911 soon. There’s still some additional work to be done, as at a minimum Teams will need to provide for masking/translating a user’s DID and replacing it with a number that’s unique to the location of the user when 911 is called. Subnets may not meet legal requirements for the granularity of the location that’s reported. In Skype for Business Server, there’s the LIS (Location Information Service) database and the ability to embed PIDF-LO (Presence Information Data Format Location Object) – aka your location – into SIP packets. These allow a client to be located by the access point, switch port, or switch that they’re connected to. SfB Server can also talk to external LIS databases that may be provided by vendors like West or RedSky, who take on the task of determining the user’s location and providing it to SfB.

None of this functionality exists in Teams yet, and it’s all required to do proper granular, dynamic location determination for emergency calling, natively in Teams.

LPE vs 3PIP vs Teams Native Phones

When OCS (or was it OCS 2007 R2?) first really featured enterprise voice and had deskphones, there were two types. One was the Tanjay, a futuristic, right-angled obelisk of discomfort, and the other was Aries.

Tanjay was interesting in that it would allow you to type your AD credentials on the touch screen, if you had the patience. It was a terrible interface for sure. More often than not, the USB “better together” over ethernet function was used. The receiver (the part you hold in your hand, for those that might not share this same terminology) was uncomfortable to hold in your hand or to press against your head, and squishing it between your head and shoulder was agony. I screamed in terror when one customer advised me that they had found 50 of these units on sale on eBay or some other site for “cheap cheap cheap!”.

The Aries phone was less edgy (literally and figuratively), and the Polycom CX500 and CX600 series were perhaps the most deployed (and most loved) phones of the group. They worked well, felt comfortable, had enough features without being rocketships, and just worked.

Both Tanjay and Aries devices are part of the Lync Phone Edition, or LPE, family of devices. The software for these devices was provided by Microsoft – as was some of the initial physical design – with the manufacture and support of the devices left to the manufacturers – Polycom, Aastra (since acquired by Mitel), and HP. I think it’s fair to say, however, that these devices were underspec and thus underpowered when it came to doing true enterprise level telephony. Specifically, they lacked the ability to do any sort of programming or customization, and had woeful response times for call setup in a response group.

Under the hood, the LPE devices ran Windows CE. Win CE (or wince, if you weren’t a fan) hasn’t been a desired or supported development platform for some time now. It doesn’t support TLS 1.2 and above, which means that it’s soon to be no longer supported with Office 365 (to be clear, this means “will no longer work” versus “don’t call us if something doesn’t work right, but it might be ok”), nor in most on-prem or hybrid environments with a basic understanding of the risks of running out-of-date security protocols like TLS 1.1.

LPE devices were managed natively through Skype for Business. Initially, this was via a HORRIBLE nightmare of SharePoint team services ugliness, then directly supported through later versions of Lync/SfB. I should clarify that by “Managed”, I mean chunky and clunky firmware update processes that were about as graceful as an elephant on figure skates. Reporting was iffy, and embarrassing, to be honest.

Rewinding a few years, we saw the introduction of 3PIP, or 3rd Party IP Phone, in late 2011. These devices didn’t run the Microsoft provided Win CE software, or use the Microsoft dictated hardware under the hood. Instead, Microsoft provided a framework for IP Phone vendors to develop their own devices that would be certified to function with Lync and then Skype for Business. These devices offered a much increased level of customization, were faster, and with a number of different manufacturers producing a number of different models, 3PIP provided more than an adequate spectrum of devices for your telephony pleasure.

3PIP phones can mostly be managed the same as LPE devices, or instead you could (and should) manage them through the manufacturer’s management software. This approach can be hit and miss. For the ability to configure all kinds of functionality on the devices from a central platform, we give two thumbs up. For those who had to suffer with multiple vendors, and thus multiple management servers and differing feature sets, feature support, and ability to not hate your job, we are obliged to give two thumbs down, and more if we could.

With the introduction of MS Teams, we’re about to see the introduction of phones that again run Microsoft provided code, like the LPE, on vendor provided devices. The really interesting part here is that the platform for this code is Android. Yup – the platform for “Microsoft Teams” phones is Android. Microsoft has indeed changed!

When it comes to Microsoft phone systems online, there may be some gotchas. First, LPE is soon to be out of the picture. It’s unrealistic to update the decade old software on the decade old hardware to support modern encryption and provide a user experience that feels good and doesn’t involve smashing the device with a hammer.

Next up, let’s talk about 3PIP. The 3PIP devices all talk SIP, and work very well natively with Skype for Business – online and on-premises. When it comes to MS Teams support, you can expect basic phone functionality, but not much more than that. 3PIP devices run SIP, and MS Teams runs MNP24 with SIP for backward compatibility. 3PIP devices connect to MS Teams through a gateway. There will be some MS Teams functions that will likely never reach these devices. Specifically, modern portal device management, media bypass (Teams, not SfB Server media bypass), and less mainstream calling features like call park/retrieve.

And now on to MS Teams devices. These are the future of the Microsoft phone product line, including desktop phones, conference room phones (“starfish”), and some room system devices. I expect lots of feature growth here, including when beefier hardware is needed – this is just a Teams app running on Android, which is a much more agile platform than the legacy days of Win CE.

The moral of this post is that you should avoid LPE devices at all costs, including for on-prem (how long until the OS/Skype for Business Server app no longer supports that laughable “security” that TLS 1.1 and below provide?). 3PIP devices are great, but going forward I’d be cautious. While they’ll work with SfB and SfBO and MS Teams, the fact that they connect via a gateway to MS Teams and have limited MS Teams modern portal support means that some caution is warranted. Make sure that they’ll be able to do what you require in terms of functionality, but also management now and in the future.

Trunks, gateways and ports (oh my!)

I was commiserating with a co-worker recently about how challenging SIP trunks can be when they’re new to you. Specifically we were talking about what port and protocol a device listens on, and what port and protocol the peer device will listen on. Somehow, I agreed that I’d write a blog post about this topic, so here it is!

Here’s a snippet from the SfB topology builder for a gateway:

sfb tb gateway

 

Our Gateway has an FQDN, which must resolve to the IP of the peer device. You could also use the IP address here. In larger environments, a meaningful FQDN can be helpful vs trying to memorize or find your (probably outdated) list of IP addresses and gateways.

At the bottom, we see a list of trunks that are configured for this gateway in SfB. You can have more than one, but that’s a more advanced configuration and we’re shooting for basic understanding. Similarly, you can ignore the IPv4 addresses and Alternate Media entries.

Here is the associated topology builder entry for the trunk:

sfb tb trunks

We’ve got a name for the trunk, which also happens to be the FQDN of the gateway. This is the default that SfB creates; you can change it during (but not after) the creation process.

The PSTN Gateway line has the FQDN and SfB Site (in brackets, HQ here) of the gateway/peer device.

The Listening port and SIP Transport Protocol indicate what protocol and port the gateway device is listening on. We’ll see that in the gateway configuration in a minute.

Mediation Server indicates the FQDN of the mediation server (or mediation pool, if you’ve got a pool of them).

As promised, here’s the configuration on the other end of the trunk, in this case from an AudioCodes SBC, showing what port it’s listening on:

audc sip interface

Here you can see that the device is listening on TCP port 5060. Note that it also appears to be listening on port 0 for UDP and TLS. That’s just the AudioCodes configuration for “not listening”.


And here’s the entry from the Proxy Set on the AudioCodes SBC showing the SfB mediation pool, and what port and protocol it’s listening on.  Proxy Sets on the AudioCodes devices are the equivalent of the Gateway in SfB topology.

 

audc proxy set

 

Here you can see that the AudioCodes expects the peer device at 10.12.1.101 to be listening on TCP port 5068.

Okay, so why do I have an IP address here, but an FQDN in SfB? Well, you can use either, as we discussed above. However, one thing that we do stumble across (too) often, is someone poking the DNS records for the SfB mediation pool. They don’t understand that “FEPool1.example.com” *should* in fact have one entry per front-end server, so they decide to do some “cleanup”.

There are some common (incorrect) assumptions that we often see. The first is that the port and protocol need to be the same on the devices at both ends of a trunk. This isn’t true. They need to be the same protocol, but it’s perfectly normal for two devices to be listening on different ports.
The other issue we see is the assumption that a specific port/protocol has to be used. You’re restricted to using the protocol(s) that your device supports. That’s always TCP (plain or with TLS) for SfB, and UDP and TCP for most SBCs and gateways. You can use any available port that you wish, though there are some conventions: 5060 for SIP, and 5061 for SfB’s encrypted SIP. 5068 shows up by default for the Mediation pool to listen on; you can change that if you want to, just don’t get too creative. Somewhere down the road, someone will need to support what you’re creating, and a port of 8765 for SIP traffic is going to seem odd.
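To put the "same protocol, any ports" rule in concrete terms, here is a small Python sketch that checks the two ends of a trunk. The `Listener` type and `trunk_problems` function are invented for this example. The values in the usage note come from the screenshots discussed above (SBC listening on TCP 5060, Mediation pool at 10.12.1.101 listening on TCP 5068).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Listener:
    """What one end of the trunk listens on."""
    host: str
    transport: str  # "TCP", "TLS", or "UDP"
    port: int


def trunk_problems(side_a: Listener, side_b: Listener) -> List[str]:
    """Return a list of configuration problems (empty means the trunk looks sane)."""
    problems = []
    # The transport protocol must match on both ends...
    if side_a.transport.upper() != side_b.transport.upper():
        problems.append(
            f"transport mismatch: {side_a.host} uses {side_a.transport}, "
            f"{side_b.host} uses {side_b.transport}"
        )
    # ...but the ports only need to be valid, not identical.
    for side in (side_a, side_b):
        if not 1 <= side.port <= 65535:
            problems.append(f"{side.host} is not listening (port {side.port})")
    return problems
```

Checking `Listener("sbc", "TCP", 5060)` against `Listener("10.12.1.101", "TCP", 5068)` returns no problems, even though the ports differ.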

AudioCodes Configuration for SfB

I’ve worked with a couple of organizations in the past several months that have had… inconsistent configuration on their AudioCodes gateways and SBCs for proper connectivity to multiple (or in some cases single) SfB pools. For simplicity, I’ll use the term “SBC” to mean both the SBC and Gateway, unless there’s a particular need to call out one versus the other.

In an AudioCodes SBC, you can route traffic to/from your SfB system using a variety of uncivilized methods, including specifying IP addresses. Yuck. Do yourself – and those who’ll look at your config after you’re done – a favor and use Proxy Sets. You’ll find those under “Signaling & Media”, then “Core Entities”, then “Proxy Sets”.

A Proxy Set lets you specify a number of characteristics for a system that you want your SBC to communicate with, allowing for much less clumsy and random configuration. Here’s a Proxy Set configured for one simple SfB server:

ProxySetSE

This could be a standalone mediation server, a Standard Edition server, or a solo Enterprise Edition Server. Let’s have a more detailed look at the configuration.

General

ProxySetGeneral

The General section is pretty straightforward. You need to give your Proxy Set a name. There aren’t any particular requirements here, so use something that makes sense to you. The first part of your SfB pool name is a good choice, “LAXSfBPool”, for example. The name of your telco provider is a good choice for connections to the PSTN. For the rest of this post, I’ll assume you’re connecting to an SfB Standard Edition server.

Next you need to specify the SIP interface that your SfB server will connect on.  A SIP interface is a network interface plus a few other characteristics, like what port and protocol the SBC will listen on. The TLS context is the group of TLS settings that you’ll use to talk to your SfB server.

Keep Alive

ProxySetKeepAlive

SfB uses SIP OPTIONS (yeah, it’s capitalized, I’m not yelling) for two purposes. One is to advertise capabilities (“options”) and the other is as a heart-beat or keep alive mechanism. For connecting an SBC to SfB, you’ll want to set the Proxy Keep-Alive to Using Options. There’s a timer for how often you want these sent – the default is every 60 seconds, and you can leave that in place.  The remainder of the variables here are for tuning the specific behavior of your connection if OPTIONS aren’t received, and what criteria need to be met for the connection to be “up” again. Generally, you can leave these alone – I’ve never had to tweak them in a couple decades of doing this stuff.
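If you're curious what those tuning variables are doing, here is a rough Python sketch of a keep-alive monitor. The threshold names are hypothetical and don't correspond to actual AudioCodes parameters. The idea is simply that a run of unanswered OPTIONS marks the proxy down, and a run of answered ones brings it back up.

```python
class KeepAliveMonitor:
    """Tracks whether a proxy is up, based on answers to periodic OPTIONS."""

    def __init__(self, failures_to_mark_down: int = 1, successes_to_mark_up: int = 1):
        # Hypothetical thresholds, for illustration only.
        self.failures_to_mark_down = failures_to_mark_down
        self.successes_to_mark_up = successes_to_mark_up
        self.is_up = True
        self._failures = 0
        self._successes = 0

    def record(self, answered: bool) -> bool:
        """Record one keep-alive result and return the current up/down state."""
        if answered:
            self._failures = 0
            self._successes += 1
            if not self.is_up and self._successes >= self.successes_to_mark_up:
                self.is_up = True
        else:
            self._successes = 0
            self._failures += 1
            if self.is_up and self._failures >= self.failures_to_mark_down:
                self.is_up = False
        return self.is_up
```

With the default 60 second interval, a monitor that needs two failures to mark a proxy down would take up to two minutes to notice an outage, which is the kind of trade-off those settings control.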

Advanced

ProxySetAdvanced

The advanced section has only two settings. Classification Input determines whether the SBC will listen to anything from the specified IP addresses (we’ll see that part later) or if the port and transport type (TCP, UDP, or TLS) must match too. Go ahead and change this to include Port & Transport Type if you’re super security aware, otherwise leave it at IP address. For DNS resolve method, you can leave this blank to use the global parameter “Proxy DNS Query Type”. You can do some funky things here, but I’ll talk about FQDNs and IPs later when we specify those.

Redundancy

ProxySetRedundancy

This section is where I see the most misconfiguration. For a Standard Edition server, however, you don’t need any settings here, so leave them at their defaults (having them set won’t affect anything, but sure is confusing to look at when troubleshooting).

Proxy Address

ProxySetProxyAddress

This is the spot where you add the FQDN or IP address, port, and transport type for your SfB servers. You can use server FQDNs if you like, especially for more advanced configurations that use the various DNS Resolve Methods in the Advanced section… However, for SfB, most everyone that I know uses the IP addresses of the servers – they’re unlikely to ever change, and using IPs means you’re not dependent on DNS infrastructure. This also guards against changes or deletions of DNS records, though you have other problems if that happens! I always specify the port and transport type, to prevent any gotchas should the system defaults ever be changed.

Next post, we’ll have a look at the changes that need to be made when you have an Enterprise Edition pool with multiple front-ends, as well as multiple pools. Just for fun, there are some differences in configuration here depending on whether you have a gateway or an SBC.

Branch Offices in Hybrid and Online Environments

In my previous two posts, I’ve covered branch office solutions including the SBA and the alternatives. No discussion on branch offices could be called complete without including hybrid environments.

The hybrid conversation is the same as on-prem when your user is homed on-prem and not online. As soon as your user is homed online, you have two separate considerations:

  1. User connectivity to the Cloud for all functions but PSTN
  2. Cloud connectivity to Cloud Connector Edition (CCE), Direct Routing (DR), or On-Premises Call Handling (OPCH)

For the first point, local Internet is what Microsoft recommends. Recall from earlier posts that you can use your router/firewall to direct O365 traffic out locally, and send all other traffic across a WAN, if that’s what your organization requires. If this Internet connection fails, your options are to route across the WAN, a 2nd Internet connection, mobile clients with LTE, or head elsewhere to work.

For the second point, things can get a bit more complex. The routing and high-availability of CCE, OPCH, and DR varies. Other factors will be centralized PSTN breakout vs a more localized approach – you’re more likely to have a business class or redundant connection for a centralized service than if you have distributed services.

Cloud Branches

One common solution that I see a lot of, is a hybrid scenario with branch office users hosted online, and main office users hosted in the on-prem pools. This eliminates the cost and administrative overhead of running branch office solutions, while keeping some infrastructure around for financial, compliance, interoperability with other services/devices and other reasons. Cloud-homed branches are also a great stepping off point when you’re moving a larger organization to a pure online environment.

Pure Cloud Considerations

At this point it shouldn’t be surprising to you that there really aren’t any new or unique considerations for branch offices when your entire organization is cloud based. From a branch office perspective, the local infrastructure is no different from the hybrid scenarios.

Edge Cases and Wrap-up

In the past couple of posts, I’ve covered branch office considerations for high-availability. These range from SBAs, redundant WANs, redundant Internet, full pools, and more. While comprehensive, I didn’t cover every use case. When considering the solutions that best apply to you, draw up a simplified map of your environment, get a bunch of copies of it, and have at them with a red pen to indicate failure points. Work through these outages using your most important use cases to establish what works, what’s limited or hobbled, and what’s entirely broken. If a scenario doesn’t work for your use cases, put it aside.

You’ll now have two piles – works for me, and doesn’t work for me. Next, review the scenarios that do work, and establish which one best fits your business needs, including pricing. If the mighty dollar sign knocks all of these scenarios out of contention, you now need to sort through the “doesn’t work” scenarios, and work through them to find “the best of the worst” that does the best job of fitting your business needs and budget.

Branch Office Options

In my earlier post, I covered the SBA and what I feel are some pretty significant downsides given the technology changes in the past 10 or so years. So what are the options?

Redundant WAN

The simplest option, from an SfB point of view, is to have redundant connectivity from your branch office to your main office. How you go about this can vary. You could get a 2nd line from the same carrier, but that doesn’t help you if that carrier suffers an outage. A different carrier would guard against that, though watch out for the 2nd carrier simply using the first carrier for all or part of their services. Even with two different carriers, you could wind up with fibre in the same conduit, and you may suffer the dreaded backhoe fading.

Backup VPN

A backup VPN might make more sense. An Internet connection is less expensive, and there’s a good chance that it’s not sharing much or any infrastructure with the WAN link. The first issue to watch out for with VPNs is that you may not have sufficient upstream bandwidth. The second is that you may not have sufficient bandwidth at all. If you are using a lower capacity link as a backup, you can use the DSCP markings that you applied to your SfB traffic for QoS (you did do QoS, right?) to help you out. Your firewall/VPN device can be set to prioritize voice traffic based on these markings, and potentially block video altogether.
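As a sketch of that policy, here is the decision a firewall/VPN device would make per packet while running on the backup link. The DSCP values are assumptions based on the commonly used markings (46/EF for audio, 34/AF41 for video), so substitute whatever you actually configured in your QoS policies.

```python
# Assumed markings: check your own QoS policy for the real values.
DSCP_AUDIO = 46  # EF, commonly used for voice
DSCP_VIDEO = 34  # AF41, commonly used for video


def backup_link_action(dscp: int) -> str:
    """Decide what to do with a packet while on the low capacity backup link."""
    if dscp == DSCP_AUDIO:
        return "prioritize"  # voice goes to the front of the queue
    if dscp == DSCP_VIDEO:
        return "drop"        # video is blocked to protect voice
    return "best-effort"     # everything else takes its chances
```

In practice you'd express this in your firewall's own policy language. The point is that the markings you already applied for QoS give the device everything it needs to make the call.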

Use SfB Server Standard Edition instead of an SBA

If redundant connections aren’t feasible, using a Standard Edition server may be. This moves all of your users’ functionality to their location, preventing the ugliness of limited functionality mode. However, you now have to license this server, and you’re no longer dealing with an appliance – though you’ll recall that some SBAs were just servers with PRI cards anyway.

More downsides here are that if a user homed on this server hosts a meeting with a large number of participants from outside their office, all of that traffic is going to hit the WAN. Also, if the users in this office work remotely a lot, all of that traffic transits the WAN to reach the Edge servers at the main office… unless you deploy an edge in the branch office, and now it seems like we’re boiling the ocean and building rocket ships to guard against a branch office WAN failure.

Get Out of the Office

The last option to deal with a branch office outage would be to get out of the office, either virtually or physically.

SfB has excellent mobile clients. If your users are homed in a central office or a datacentre, they can use the mobile clients to connect to their pool over LTE. There may be some limits here, like not being able to be a member of a Response Group, but as a backup option this one is pretty simple, and your staff may all already have company phones or subsidized company phones.

Lastly, the users can find a different place to work. This could be home, it could be a co-working space, or a coffee shop.

What about Hybrid and Cloud?

Finally, if you’re in a hybrid or cloud deployment, I’ll provide some thoughts on how to handle branch offices in the next two posts.