When Your IP Subnets Span Sites and Break Teams Location Things

Perhaps the most time-consuming portion of the Teams voice projects that I’m involved in relates to location. A Teams user’s location can be used for emergency calling, for managing bandwidth consumption over constrained WAN or internet links, or for things like location based routing and local media optimization.

One of the first questions I ask is: “Do you have subnets that span multiple buildings, especially for wifi?” For emergency calling, each building or facility may need to be broken up into “dispatchable locations” if you’re reading legislation or “places” in Teams lingo. Essentially, if you’re a large building like a school or factory, how do the first responders find you? That’s your “emergency response location” or your “place”. You’ll need a network characteristic to establish which place a Teams client is in. That could be the subnet, the switch chassis ID, the switch port you are connected to, or the BSSID, which reflects which access point you’re connected to.

If you have IP subnets that span buildings, you will need to re-subnet in order to get the proper physical granularity that’s needed for location-based functions in Teams.  There are a couple of exceptions to this.

First, if the only spanned subnets you have are for server/storage infrastructure, you’re fine. Just ensure that phones can’t be plugged into this network (which you should be doing anyway!), especially a server room phone.

The second exception would be if you don’t need any Network Site based policies other than Emergency Calling Policy. This is the one that notifies someone at the facility that an emergency call has been placed. In the US, the FCC website says (this is Kari’s law):

“When a 911 call is placed on a MLTS system, the system must be configured to notify a central location on-site or off-site where someone is likely to see or hear the notification.”

From https://www.fcc.gov/mlts-911-requirements

Typically, you would need to notify someone at that site about the emergency call, and “that site” has IP subnets associated with it, and an Emergency Calling Policy associated with it. The Emergency Calling Policy handles the notifications, so if a subnet spans multiple sites, you can see how this makes it impossible to notify someone only at the site from which the call was placed.

There are two options or scenarios here. The first is one in which you can notify parties at both/all sites, and train them how to tell when a call is from their site, and to take appropriate action. The second would be when you have something like a central security desk that can be notified, and that security desk can then take appropriate action – commonly this is facilitating first responder access to the building, but it could also be first aid or security teams responding to the incident.

A university campus is an example of a site where I commonly encounter IP subnets that span sites/buildings/facilities. It’s also a great example of a site where this may not matter, as every university that I’ve encountered has security and/or police departments who can handle responses to these notifications.

An example of where things don’t work with spanned subnets would be a wifi subnet that spans the facilities for a municipality that doesn’t have a security desk (though they may have someone “in charge” of security). A call to emergency services from a library would need to notify someone at the library, and not HR, IT or city hall reception. This isn’t so much because they couldn’t in turn notify someone else at the library to take action. My concern here would mainly be around availability and hours of work. If the library is open until 9pm and HR, IT, and city hall are all closed, you’re not notifying anyone.

In a case like this, I would recommend that the city re-subnet their wifi. In some cases, depending on their wifi controller(s), this could be a painful exercise, as some wifi controllers cannot have a single SSID across the organization (which is desirable) use a different subnet at each location. Instead, one SSID would need to be created per location, and then all of the wifi clients at that location would need to have their wireless connections updated to the new SSID. That’s no fun but is necessary in many cases to properly meet emergency calling legislation, or to use other location-based policies like Network Roaming (WAN/Internet bandwidth consumption limits for Teams, based on the site).

When You Do and Do Not Need an Emergency Call Routing Policy

When you’re configuring emergency calling, there are a number of spots where you need to configure things. If you’re on Direct Routing, one of the things you need to configure is the Emergency Call Routing Policy. This lets Teams know what your emergency numbers are for a given location, and where those calls should be routed to.

If you’re using Calling Plans or Operator Connect however, you do not need to configure anything for Emergency Call Routing. This is all handled by Microsoft (for Calling Plans) and your operator (for Operator Connect). You don’t even need to flip the “use dynamic 911” switch.

If you’re concerned over this, you can test out emergency calling before you deploy. Assign a Calling Plan or Operator Connect number to a user (your Operator Connect operator can either get you a test number, or they’ll likely allow you to call out from Teams before a number port, if you’re porting in). Call 933, which is the test number for 911 on these platforms. If you want to test your dynamic address capabilities, you’ll need to at least configure your Trusted IP address and a subnet for the location you’re testing from.

Heads-up! If you have notifications turned on in the Emergency Calling Policy (Calling, not Call Routing, Policy), the users that you have configured to receive notifications will see the exact same notification if you call 933 as if you had called 911. From the Teams perspective, there is NO difference between 933 as a test, and 911 as a true emergency call. The only difference is at your carrier/operator, in which case they’ll direct 933 calls to their test bot rather than to a live emergency call taker/operator. You should give those people that are configured for notifications enough notice of the test so they don’t panic, and then you should advise them when your testing is complete, so they know any subsequent notifications are true emergencies.

Troubleshooting Emergency Location not displaying in the Teams Client

All Microsoft Teams clients are capable of using location information from the Network, OS, or User for emergency calls. When the network is the desired source of the location, there are 3 key places that need to be configured.

First, the Trusted IP Address. This address is used by the Teams client to understand if it is on your organization’s network, or outside/remote. Trusted IP Address is not used beyond that.

Second, the client’s IP address is checked against the subnets/supernets configured in the Network Topology to establish if there’s a match for a network topology site. If there is a match, the policies from that site are applied. For emergency calls, there is the Emergency Call Routing Policy, which establishes the route a call will take to reach emergency call takers if Direct Routing is being used, and there is the Emergency Calling Policy to configure notifications. There is one more policy not related to emergency calling, and that’s the Roaming Policy, which you can read about here.

Third, the client’s IP address and BSSID for wireless, and IP address, switch chassis, and switch port for wired, are compared against the LIS database. LIS stands for Location Information Services, and its job is to match your network information to the various places in your organization that first responders should respond to.
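
The order of those three checks can be sketched as a small decision flow. This is only an illustrative model and not how the Teams client is actually implemented: the function name, the data shapes, and the sample addresses are all assumptions I made up for the example, and the LIS step is simplified to a plain subnet lookup (no BSSID, chassis, or port matching).

```python
import ipaddress

def resolve_location(public_ip, lan_ip, trusted_ips, site_subnets, lis_subnets):
    """Illustrative model of the three lookups; every name here is hypothetical."""
    result = {"trusted": False, "site": None, "location": None}

    # Step 1: the public IP must be in the tenant's Trusted IP list.
    if not any(ipaddress.ip_address(public_ip) in ipaddress.ip_network(t)
               for t in trusted_ips):
        return result  # treated as external, so the network lookups stop here
    result["trusted"] = True

    addr = ipaddress.ip_address(lan_ip)

    # Step 2: match the LAN IP to a Network Topology site (site policies).
    for subnet, site in site_subnets.items():
        if addr in ipaddress.ip_network(subnet):
            result["site"] = site
            break

    # Step 3: match the LAN IP against LIS entries (emergency location).
    for subnet, place in lis_subnets.items():
        if addr in ipaddress.ip_network(subnet):
            result["location"] = place
            break
    return result
```

The point of modelling it this way is the early return: a LAN subnet that happens to match a site or a LIS entry counts for nothing if the public IP isn’t trusted.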

Ultimately, the goal is to have the correct location displayed in the user’s Teams client. This location is what will be passed through to the SBC in the Emergency Call Routing Policy, and off to an emergency call taker. It’s not uncommon for an address to not appear, or for the wrong address to appear, especially in new deployments. Fret not, there’s an easy way to see what’s happening.

Right click on the Teams icon in the systray – you may need to click on the ^ icon beside the clock to see it. Select “Collect Support Files”. You’ll see a flurry of activity as the client writes files to the downloads folder, and then zips them.

Grab that zip file and open it. We’re just reading a text file in the zip, so you don’t need to unzip and then find the file. Click on the folder called “web”, and then double click the file named “MSTeams Diagnostics Log 7_7_2022__9_31_02_PM_calling.txt” or similar, depending on the date/time you created the log file.

Search for the phrase “trustedIpMatchInfo”. That’s the actual case used in the file, but make life easy on yourself and make sure your search isn’t case sensitive. The first block you’ll see will look like this:

"debugInfo": {
  "ncsDebugInfo": {
    "trustedIpMatchInfo": {
      "publicIp": "203.0.113.5",
      "reason": "NotMatched",
      "_comment": "Match Client Public IP to Tenant Trusted IP"
    },

Here, the public IP shown is the public IP address that the Teams client used to connect to Teams. This IP needs to be in the “Trusted IP Address” list within Teams for the Teams client to be considered “internal”, and for the rest of the internal network policies to take effect.

The next section is used to apply policies if the Teams client is within a given site internal to the organization.

    "siteMatchInfo": {
      "ipv4": "192.168.20.107",
      "subnetLengthIPv4": "24",
      "enableLocationBasedRouting": false,
      "reason": "NotMatched",
      "_comment": "Used to match endpoint subnet to Tenant site if trustedIpMatchInfo matches"
    },

In this case, the LAN IP is 192.168.20.107. Even if there is a Network Site defined with this subnet, the policies for that Network Site will not apply since the Public IP/Trusted IP did not match.

Common policies applied based on the Network Site include Location Based Routing, Roaming Policies, Emergency Calling Policy, and Emergency Call Routing Policy. See this post for details.

The next section is used to establish the Teams Client’s location within the organization for emergency calling purposes.

    "networkLocationMatchInfo": {
      "matchedNetworkType": "CLS",
      "matchingIdentity": "50.6583, -120.383",
      "ipv4": "192.168.20.107",
      "reason": "Matched",
      "_comment": "Used to find emergency address, against Tenant Location Network Information (LIS), otherwise against Client Geo Location Information (CLS) if available"
    }
  },

Note here that my LAN IP did match against the LIS database, however Teams would not use this information for my emergency location as my Public IP didn’t match. There’s no way for Teams to know that I’m not in a coffee shop or at home with the same 192.168.20.0/24 subnet as the office.

For the Teams client to “find” your location for emergency calling, ideally all 3 of the above should result in “Matched”: the client’s public IP must be in the Trusted IP Address list, the siteMatchInfo must find your subnet in a Network Site, and the networkLocationMatchInfo must find your IP subnet, wireless BSSID, switch chassis ID, or switch chassis ID and port name in the LIS database. You don’t necessarily need to have “Matched” for the site section, as you could have valid Global policies that would apply, or perhaps user-level policies assigned to the user. That may be the case in a very small organization with just one location, but would indicate a bad design in a larger organization with multiple locations.
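
If you’re checking a lot of these logs, pulling out the three “reason” values by script can save some scrolling. Here’s a minimal sketch, assuming the calling log contains sections shaped like the excerpts above (section name, opening brace, then a "reason" key before the closing brace). The function name is my own, and the log layout may well differ between Teams client versions, so treat it as a starting point.

```python
import re

SECTIONS = ("trustedIpMatchInfo", "siteMatchInfo", "networkLocationMatchInfo")

def match_reasons(log_text):
    """Return {section: reason} for the first occurrence of each section.

    Assumes each section looks like the excerpts above; returns None for a
    section that is missing or shaped differently.
    """
    # Normalise curly quotes in case the text was copied from a web page.
    text = log_text.replace("“", '"').replace("”", '"')
    reasons = {}
    for name in SECTIONS:
        pattern = r'"%s"\s*:\s*\{[^{}]*?"reason"\s*:\s*"([^"]*)"' % name
        found = re.search(pattern, text, flags=re.IGNORECASE)
        reasons[name] = found.group(1) if found else None
    return reasons
```

Read the calling log text out of the support zip, pass it to match_reasons, and you get a small dictionary telling you which of the three lookups failed.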

Requirements for and Constraints of Site based Teams Policies

If you need to implement Site Based policies in Teams, there are several pre-requisites and design decisions that you need to be aware of and consider in your design.

If you are in Canada, the United States, or India then this absolutely applies to you. If you are in any other location, it probably applies to you unless you are a very small Teams deployment in a single office.

Here’s a list of the site based policies in Teams:

  • Location Based Routing
  • Emergency Call Routing Policy
  • Emergency Calling Policy
  • Local Media Optimization
  • Roaming Policies

Location Based Routing

Location Based Routing is mostly used in India, though there are a few other scenarios where it may be useful. The general idea here is that you can’t mix IP based calls/meetings with more than one PSTN site.

Emergency Call Routing Policy

This policy controls where Teams will route emergency calls to when using dynamic emergency calling for Direct Routing and Operator Connect scenarios. You use dynamic emergency calling in Canada and the US by law, and it makes sense to me to use it in other locations as well.

Emergency Calling Policy

This should really be called the Emergency Call Notification Policy. When an emergency call is placed from a site, this policy controls who is notified via IM, who is conferenced in by phone, and whether the phone call is muted or unmuted (you should ALWAYS configure muted, you do not want to interfere with the emergency call taker getting the information they require).

Also tucked into this policy is “External location lookup mode”. I have yet to establish why this feature is in this policy. The External location lookup mode turns on the “work from home” location experience for mobiles and computers, to use the device’s native location services and/or user input to establish a location for emergency calls. If you have the network information for this policy to kick in for a site, you also have the network information to dynamically establish the caller’s location. Realistically, you need to turn on the External location lookup mode in an Emergency Calling Policy that is assigned to the user, so that it triggers when they’re outside of your organization. (Note: there is currently a bug where if you set the External location lookup mode on a global Emergency Calling Policy, it does not work. You need to assign the policy to each user individually).

Local Media Optimization

LMO allows a client device to send media directly to the inside interface of an SBC, avoiding travelling up to Teams and back down. The previous couple of posts here cover LMO.

Roaming Policies

Here’s another policy name that doesn’t make sense to me. Roaming Policies apply bandwidth restrictions to users that are present in a site, replacing the same two parameters that are in the policy that is assigned to the user. In the SfB world, this was called Call Admission Control, and was handled in an entirely different fashion.

Requirements

The requirements for these site based policies to trigger are:

First, the client must egress to the Internet from an IP address that is static and dedicated to the organization. This is generally the IPv4 address that the client NATs to, but it could be an IPv6 address or an IPv4 address assigned to the device that isn’t NAT’d. These addresses are called Trusted IPs.

Second, you must define a Network Site under Locations > Network Topology and assign the subnets in use at this site to the Network Site.

Constraints

There are some technical constraints to these requirements.

For the Trusted IPs, static means that the IP address(es) do not change. A DHCP assigned address with a reservation is okay. If the address were to change, you would need a way to be notified of this change, then you’d have to update this information in Teams, then policies would have to replicate and be pulled by clients. That’s no good for emergency calling, and isn’t that great for other scenarios either.

Also for the Trusted IPs, dedicated to the organization means that you can’t use cloud proxy/firewall services such as those provided by Zscaler, as the address could be shared between multiple organizations and the address(es) assigned to your clients may change.

For the Network Sites, the subnets must be assigned to only one site. You cannot have a subnet assigned to or spanning multiple sites. If you need your sites to be more granular than your subnets permit, you will need to resubnet. A typical place this happens is with centralized wifi controllers that lay one very large subnet over an entire organization.
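
Before you start entering sites into Teams, it can help to sanity check your subnet plan for anything that lands in more than one site. Here’s a minimal sketch using Python’s standard ipaddress module. The site names and addresses are invented for illustration, the function name is my own, and this only looks at a plan you supply as a dictionary (it doesn’t talk to Teams at all).

```python
import ipaddress
from itertools import combinations

def find_spanning_subnets(site_subnets):
    """Return (site_a, subnet_a, site_b, subnet_b) for overlapping entries
    that belong to different sites. Input maps site name -> list of subnets."""
    entries = [(site, ipaddress.ip_network(s))
               for site, subnets in site_subnets.items() for s in subnets]
    conflicts = []
    for (site_a, net_a), (site_b, net_b) in combinations(entries, 2):
        # overlaps() is true when one subnet contains or equals the other
        if site_a != site_b and net_a.overlaps(net_b):
            conflicts.append((site_a, str(net_a), site_b, str(net_b)))
    return conflicts

# Hypothetical plan: one big wifi subnet laid over two buildings
plan = {
    "City Hall": ["10.1.0.0/24"],
    "Library": ["10.1.1.0/24"],
    "City wifi": ["10.1.0.0/16"],
}
for conflict in find_spanning_subnets(plan):
    print(conflict)
```

Anything this reports is a subnet you’ll need to split or reassign before the site based policies can behave predictably.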

There is also a large design constraint to be aware of. These five policies (above) are typically assigned to sites. You define the site, then assign the policies to the site. You do not get to define different sites for different policy types.

Let’s use an example to clarify that. A university campus has several buildings in a campus, and you’d like to set the same LMO policy and Emergency Call Routing policy to the entire campus. That makes sense, the whole campus is probably served by the same PSTN service, and likely needs to send all emergency calls to the same spot. Plus, the university has a wifi system that uses a large /20 subnet for the entire campus. You can define the entire campus as one site and meet these objectives (sorry for making this sound like a certification exam!)

However, in the US Kari’s law states that you must notify someone within the building with the details of any emergency call from that building. You need one Emergency Calling Policy for each building. This means that you must define each building as a site, and you must break the wifi subnet up into smaller subnets that only belong to one building. You don’t need multiple Emergency Call Routing policies though, you can assign the same “campus” policy to each site.
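
To give a feel for what breaking up that wifi subnet looks like, here’s a small sketch that carves a campus /20 into equal per-building subnets. The addresses, building names, and function name are all made up for the example. A real design would size each subnet by the number of wifi clients in the building rather than handing out equal slices.

```python
import ipaddress

def split_for_buildings(campus_subnet, buildings, new_prefix):
    """Assign one equally sized subnet per building, in order."""
    net = ipaddress.ip_network(campus_subnet)
    pieces = list(net.subnets(new_prefix=new_prefix))
    if len(buildings) > len(pieces):
        raise ValueError("not enough subnets for every building")
    return {building: str(piece) for building, piece in zip(buildings, pieces)}

# Hypothetical campus: a /20 split into /24s yields 16 candidate subnets
assignments = split_for_buildings(
    "10.20.0.0/20", ["Library", "Science", "Admin"], new_prefix=24
)
for building, subnet in assignments.items():
    print(building, subnet)
```

Each resulting subnet can then be assigned to its own building site, with the same “campus” Emergency Call Routing policy on all of them.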

This type of scenario makes network engineers scream, and threatens to delay the project until they can sort the wifi. As an interim measure, I have seen some organizations configure each building as a site and assign the wired subnets to them. They then assign the giant wifi subnet to a “campus wifi” site, and configure an Emergency Calling Policy to notify staff who can relay the notification information to the appropriate parties. Don’t make a decision to do this on your own, you are not a lawyer (probably). Give this decision to your legal/risk/executives to deal with.

How emergency calling works from various deployment scenarios: SfB Server

At a recent user group meeting, an attendee asked about all of the various call flows for emergency calls, from Skype for Business Server, Skype for Business Online, hybrids, and Teams. Over the next couple of posts, I’ll cover how the various scenarios work. First up, Skype for Business server.

SfB Server has native emergency calling functionality, using Location Policies, Sites, and Subnets. You can use this functionality to override the caller ID (thus setting an ELIN, or Emergency Location Identification Number), and to route the call out via the appropriate gateway. You can also turn on alerting via IM, though this doesn’t do a lot of good unless you go to the next step…

SfB Server also has a Location Information Service, or LIS. This service uses the BSSID (the access point and channel a device is connected to), subnet, switch port, or switch to locate the endpoint. Additionally, you can integrate with 3rd party servers/services that perform this LIS role. Some of the 3rd party services may perform better if you need to get down to the switch/switchport level to determine a location. The various locations (ERL, or Emergency Response Location) are programmed by you into the LIS, and associated with the subnet/BSSID/switch/switchport data. Yeah, this is a LOT of work to keep up! You enter the location in a format called MSAG, or Master Street Address Guide. This is a strict format that helps avoid confusion, especially when you get into suite numbers, floor numbers, and things like “east”. I have seen addresses like “235 East Highway 16 West”. It’s important to get things in the right place!

Once SfB knows your location from the LIS process, it includes the address in PIDFLO (Presence Information Data Format Location Object) in the SIP header that it sends to the gateway. There are a couple of connectivity options here:

If you have a gateway with ELIN capability (and licenses), the gateway can use the PIDFLO to select an ELIN, then send the call via the PSTN to the PSAP. If the PSAP needs to call back, the ELIN gateway maintains the translation for 30 minutes (usually configurable if 30 minutes doesn’t work for you).
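
The callback behaviour is easier to picture as a lookup table with an expiry. Here’s a toy model of it. This is not any vendor’s API: the class, method names, and phone numbers are invented, and real gateways will have their own way of storing and aging out these entries.

```python
import time

class ElinTable:
    """Toy model of an ELIN gateway's callback table; not a vendor API."""

    def __init__(self, ttl_seconds=30 * 60):
        self.ttl = ttl_seconds
        self._calls = {}  # ELIN -> (caller's real number, time the call was placed)

    def record_call(self, elin, caller_number, now=None):
        """Remember which user placed an emergency call behind this ELIN."""
        placed_at = time.time() if now is None else now
        self._calls[elin] = (caller_number, placed_at)

    def callback_target(self, elin, now=None):
        """Return the original caller for a PSAP callback, or None if expired."""
        now = time.time() if now is None else now
        entry = self._calls.get(elin)
        if entry is None or now - entry[1] > self.ttl:
            return None
        return entry[0]
```

If the PSAP calls the ELIN back within the window, the call goes to the person who dialed 911. After the window closes there’s no translation left, so the callback lands wherever the ELIN number normally rings.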

If you’re in multiple PSAP jurisdictions, you’ll need to have a SIP trunk PSTN service that covers these, or if you can’t get a SIP trunk that does that, or if you’re using PRIs, you may need to route the emergency call gateway to gateway within your organization to reach the gateway that is in the correct location. You can make these routing decisions based on the ELIN (oh, 425-123-xxxx is Redmond, so send the call to that gateway, then send it to 911). You can’t route using 911 as the destination address, so this can turn into a bit of a routing mess.
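
Routing on the ELIN amounts to a prefix match against the calling number. A minimal sketch of the idea follows, with hypothetical gateway names and prefixes. In a real deployment this logic lives in your voice routes and gateway configuration, not in a script.

```python
def pick_gateway(elin, routes, default=None):
    """Longest-prefix match of an ELIN's digits against a routing table."""
    digits = "".join(ch for ch in elin if ch.isdigit())
    best = None
    for prefix, gateway in routes.items():
        if digits.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
            best = (prefix, gateway)
    return best[1] if best else default

# Hypothetical table: ELIN prefix -> gateway that sits in that PSAP's jurisdiction
routes = {"425123": "redmond-gw", "604": "vancouver-gw"}
print(pick_gateway("425-123-4567", routes))
```

The routing key is the ELIN (the calling number), because every one of these calls has the same destination of 911.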

An emergency call goes from a user to SfB Servers to a gateway, then via the PSTN to the PSAP

Routing emergency calls is easy when you only have one site.

If you want to avoid the routing headaches just described, you can also use 3rd party solutions from companies like West and RedSky. They have the LIS systems described above, but can also handle the ELIN translation function, and add enhanced notification/alerting options. Both offer services where your emergency calls are sent to their response centers, and then routed to the appropriate PSAP. This routing takes place automatically if the information included (think PIDFLO) is valid and matches their records. If it’s not, an operator answers the calls, gathers location information, and sends the call to the appropriate PSAP.

SfB users in two sites place calls, via the same SfB Servers, but then to different PSAPs via different gateways and PSTN services

Emergency call routing is more complex with multiple sites and/or multiple PSTN services. You MUST route emergency calls to the correct PSAP!

When you use these services, you also gain the option to have your receptionist/security desk conferenced into the call. This may be listen-only, or they may be able to speak (listen only keeps the call taker at the PSAP from getting confused as to what’s going on at the scene… Call taking is stressful. Take the stress of a help desk employee trying to decipher a technology problem over the phone, and now add the pressure of time and safety.) The conference function allows the reception/security desk personnel to take action locally – send a security or first aid team to the location, evacuate the building, meet emergency responders to direct them to the site.

Next up, we’ll hop online and see what Skype for Business Online and MS Teams can do for us, when using PSTN Calling services from Microsoft.

911 and Mobile Phones

In my previous post I talk about how e911 services can use ANI and ALI information to know where you’re calling from when you’re on an analog circuit. Now let’s consider what happens when you call from your mobile phone instead of an analog landline.

Mobile Challenges

When you call 911 from a mobile phone, several challenges arise when trying to determine your location. Since you could be just about anywhere, it cannot be assumed that you are at your home or “billing” address. Instead, the telco needs to sort out your location. There are two different location determinations that need to be made.

Routing your call to the correct PSAP

The telco needs to connect your emergency call to the correct PSAP. This could be a simple determination if you’re in the middle of a large geographic region serviced by one PSAP. It could be a losing cause if you’re near a jurisdictional boundary, especially if your mobile is connected to a tower in the neighbouring jurisdiction. Two neighbouring counties may know how to re-route your call if you wind up talking to the wrong PSAP, however that may be more complicated if national boundaries are involved.

Providing your location to first responders

Once the telco has routed your call to the (hopefully!) correct PSAP, the PSAP needs to know where you’re located so that first responders can be dispatched.

PSAPs and Pizza are not the same

John Oliver has a brilliant segment on YouTube where they stand in a PSAP and order pizza using a smartphone app. The app knows exactly where they are. The PSAP call taker doesn’t get the same results.

Why the different results? When you use an app to order pizza, the app has full access to the location gizmos like GPS in your smartphone. Your mobile call to 911 doesn’t, since there are no standards or protocols in place for your phone to report its location to the PSAP. The 911 infrastructure just isn’t advanced enough to receive that location information.

Instead, it’s up to the telco to try to establish your location. It does this by calculating the length of time it takes a signal from your phone to reach a cell tower. This tells the tower how far away from it you are, but does not give a reasonable direction. Assuming you are near multiple towers, up to 4 are used. More towers means better accuracy.

A tower will have a very rough idea of your direction. Most towers have 3 or 4 antennas pointing in different directions, and their coverage areas are far too broad to be of use.
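
The distance part of this is simple arithmetic: a radio signal travels at roughly the speed of light, so distance is delay multiplied by speed. Here’s a sketch of that calculation. It assumes a clean one-way delay measurement and ignores processing delays, reflections, and everything else that makes real-world positioning harder, so it’s only meant to show the scale of the numbers involved.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # metres per second, in a vacuum

def distance_from_delay(one_way_delay_seconds):
    """Distance in metres implied by a one-way signal delay (idealised)."""
    return SPEED_OF_LIGHT_M_PER_S * one_way_delay_seconds

# A delay of 10 microseconds works out to about 3 km from the tower
print(round(distance_from_delay(10e-6)))
```

Note how small the times are. An error of a single microsecond moves you by about 300 metres, which goes some way to explaining why more towers means better accuracy.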

Once the telco has calculated your latitude and longitude, this information needs to be provided to the PSAP. The methodology for this seems like a Rube Goldberg machine: the telco creates a fake phone number for you, called a Pseudo-ANI. It provides this Pseudo-ANI to the PSAP as your phone number. Then, the telco performs an emergency update of the ALI database with a Pseudo-ALI that contains your latitude, longitude, and as much civic address information as can be determined. The PSAP uses the Pseudo-ANI to look up the Pseudo-ALI, and now they have a location for you. All of this takes time to determine, so there will be a period where the PSAP doesn’t have an address for you. Typically the telco has 45 seconds to populate the Pseudo-ALI, and must update it every 30 seconds.
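
Stripped of the telephony, the Pseudo-ANI/Pseudo-ALI dance is a key and a record that shows up late. Here’s a toy model, with an in-memory dictionary standing in for the ALI database. The function names, the Pseudo-ANI value, and the coordinates are all invented for illustration.

```python
ali_db = {}  # stand-in for the ALI database, keyed by ANI or Pseudo-ANI

def telco_update(pseudo_ani, latitude, longitude, civic_address=None):
    """Telco side: write (or refresh) the Pseudo-ALI record for a call."""
    ali_db[pseudo_ani] = {
        "lat": latitude,
        "lon": longitude,
        "civic_address": civic_address,
    }

def psap_lookup(ani):
    """PSAP side: look up the location for the number presented with the call."""
    return ali_db.get(ani)  # None until the telco has populated the record

pseudo_ani = "511-555-0001"  # made-up number handed to the PSAP as the caller's ANI
before = psap_lookup(pseudo_ani)       # the call has arrived, no location yet
telco_update(pseudo_ani, 49.25, -123.10)
after = psap_lookup(pseudo_ani)        # the location is now available
```

The gap between the first lookup and the second is the period where the call taker has you on the line but nothing on the screen.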

Mobile 911 service is really a kludge to be “backward compatible” with existing 911 infrastructure designed for analog phones.

Here’s what the PSAP will see once the telco has provided the information (my labels are in italics, and refer to the nearest coloured box):

PSAP terminal

Next up, we’ll have a look at how VoIP Systems can use 3rd party “next generation” services and ugly workarounds to also provide location information to PSAPs.

And do make sure you watch that John Oliver segment, it’s great stuff.

 

911 Background and Basics

Emergency calling is a part of every SfB deployment I’m part of, and yet it seems to be an area that most people don’t have a good background on. This makes understanding capabilities and limitations a bit of a challenge. Over the next handful of posts, I’ll cover the background behind 911, some basics on how 911 works with boring analog home phones, traditional TDM business phones, mobiles, and then we’ll throw SfB into the mix.

Let’s start by talking about an analog home phone, or a single business analog line like for a fax machine. When you sign up with your telco for this line, you provide them with an address for service, and that address is where the line terminates. There is no option for you to move the line, so the telco can build a large list of names, addresses, and phone numbers, and be very confident in the static nature of that list, and thus its accuracy.

I won’t address MLTS, or Multi-Line Telephone Systems, aka a traditional PBX connected to the telco via PRIs. For the purposes of how 911 works, they’re largely treated as a collection of virtual analog lines, and there’s not much different about how that’s handled – you assign locations to numbers, and the telco enters that information into their databases.

In a 911 (vs e911) scenario, when you call 911, no address or telephone number information is provided. The caller has to provide this information. Once your location is established, that operator has to transfer your call to the correct PSAP – Public Safety Answering Point – local to you. That’s never fun when you’re in distress, so in most regions of North America we’ve progressed to e911 – e for enhanced – where your information can be automatically passed on.

With e911, the address that you provided to the telco is put into a database along with your phone number. Now when you call 911, the telco can use this information to route you to the correct PSAP, and the operator at the PSAP can see your information on their screen. As a safety measure, they’ll almost always verify your address with you. When your number is automatically displayed at the PSAP, it’s referred to as the ANI, or Automatic Number Identification. The PSAP then takes your ANI and performs a database lookup to retrieve the civic address. This is called the ALI, or Automatic Location Identification.

Though very similar, ANI and Caller ID aren’t the same thing. Caller ID is for customers and can be blocked. ANI is for telco use. You can block your Caller ID, you cannot block ANI.

This system works very well with static analog systems, where a pair of wires physically terminates at the address you provided. It falls apart completely when the phone becomes mobile, either VoIP or cellular. In my next post, we’ll consider how cellular/mobile e911 can work (and fail!) and why the pizza company knows more about your location than your PSAP.