Technology Integrated Services core service level expectations

Service level expectations (SLE) – Technology Integrated Services (TIS), Information Systems & Technology (IST)

Purpose of this document

This document defines the services to which these service level expectations apply. It also provides the other information, either directly or by reference to public web pages or other documents, required to interpret and implement them effectively.

Services included

All services listed in the IST Service Catalog which are provided by the Technology Integrated Services group in IST.

Core services

Core services, for the purposes of this SLE, are a subset of TIS provided production services, as follows:

  • ADFS
  • DHCP
  • DNS
  • Exchange
  • External Network
  • Skype for Business
  • Mailservices
  • MS SQL Cluster
  • NEXUS
  • Office 365
  • Storage
  • Physical Security systems
  • SharePoint
  • Telephones
  • VMware cluster
  • WCMS
  • Wired Network
  • Wireless
  • VPN

Service description

The services are as described in the Service Catalog.

Client responsibilities

For the purposes of this service level agreement, clients are expected:

Technology integrated services responsibilities

  • As described herein

Governance

The Director of Technology Integrated Services has direct responsibility for ownership and maintenance of this document and providing guidance and advice on its interpretation and implementation.

This document may be changed by the Director of Technology Integrated Services with the approval of the Chief Information Officer (CIO).

Availability and uptime

IT services provided by Technology Integrated Services are generally offered on a 7x24 basis with 99.9%[1] or better availability (measured monthly), with exceptions as noted:

  • Downtime for scheduled maintenance excluded (see Maintenance, below)
  • Outages due to power failures excluded
  • Partial degradation of service excluded
  • Non service affecting failures excluded
  • Failures of individual access level (edge) switches, individual Wi-Fi access points, and individual telephones excluded
  • Services which involve dealing with or interacting with a human being (as opposed to interacting with computer systems or technology) are provided during regular business hours only.  E.g. personal switchboard service, service desk activities, manual activation of services
  • Other exceptions as noted in the individual service items in the Service Catalog.

[1] – See metrics and reporting, below.  Availability is tracked and reported for specific core services only.  99.9% is provided as a guide for client expectations, and as a guide to Technology Integrated Services staff on the selection of technical infrastructure components, and on the design, implementation, and operation of services and related technical infrastructure.  99.9% availability, measured monthly, is 43.8 or fewer minutes of downtime per month (with exceptions as noted above). 
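The 43.8 minute figure in the footnote above can be reproduced with a short calculation. The sketch below is illustrative only: the function name and the use of an average month length (365.25 / 12 days) are assumptions, since the document does not state which month length the figure is based on.

```python
def downtime_budget_minutes(availability_pct: float, days_in_period: float) -> float:
    """Return the maximum downtime, in minutes, that still meets the target."""
    total_minutes = days_in_period * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# Assumption: an "average" month of 365.25 / 12 days (about 30.44 days).
AVERAGE_MONTH_DAYS = 365.25 / 12

budget = downtime_budget_minutes(99.9, AVERAGE_MONTH_DAYS)
print(round(budget, 1))  # 43.8
```

For a specific calendar month the budget shifts slightly, for example about 43.2 minutes for a 30-day month.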

Notification of incidents and maintenance

Definitions, for the purposes of this agreement:

  • Incident – Unexpected degradation or loss of service
  • Scheduled Maintenance – Activity that is required for the effective ongoing security and operation of service, that does not fundamentally alter the service from the perspective of the user.
  • Emergency maintenance – Activity required as a result of an Incident, to restore service, but which may result in additional service deterioration, temporarily.

Physical security systems

Notification of service interruptions, scheduled maintenance, and emergency maintenance on Physical Security systems is at the discretion of the applicable Manager in Technology Integrated Services.

Core services (with the exception of Physical Security systems)

Service interruptions, scheduled maintenance, and emergency maintenance on core services (with the exception of Physical Security systems) are announced using the IST NETWORK/SERVICE ALERT tool.

The IST NETWORK/SERVICE ALERT tool is pre-configured to send email to approximately 30 individual email addresses and lists, and to send a tweet @UWNetworkAlert. The distribution list can be reduced at the discretion of the TIS staff member, depending on the scope and severity of the incident, or the impact of the maintenance. Alerts that are not flagged as ‘Private’ are available at University of Waterloo Network and Service Alerts.

TIS staff members using the alert tool should use the default full distribution for all Network incidents and maintenance. All other incidents and maintenance may, at the discretion of the staff member, use the reduced distribution list (labeled Main), which includes Twitter, IST Management, IST help desk list, IST TIS list, and CTSC. Twitter may be excluded, at the discretion of the staff member, if the notification is believed to be sensitive and not suitable for public distribution.

All other TIS services

Notification of service interruptions, scheduled maintenance, and emergency maintenance on non-core systems is at the discretion of the applicable Manager in Technology Integrated Services.

Note

Exceptions to the above will be noted in the individual service items in the Service Catalog.

Maintenance

(Note Business Critical Services and Service Maintenance Strategy may need review)

  • When required, Technology Integrated Services performs service affecting scheduled maintenance:
    • Between 22:00 – 08:00 weekdays, and anytime on weekends, statutory and University holidays, with:
      • At least 24 hours notice for maintenance window
      • At least 48 hours notice for maintenance window 5 minutes to 1 hour
      • At least 7 days notice for maintenance window 1 to 2 hours
      • At least 14 days notice for maintenance window 2 to 6 hours
      • At least 30 days notice for maintenance window 6 to 12 hours
    • At other mutually agreed times if maintenance activities are requested by a department, and impact is limited to that department.
    • All other times with approval of CIO
  • Technology Integrated Services will perform unscheduled emergency maintenance at its discretion, depending on the severity of the emergency.
  • Non service affecting maintenance may be performed during or outside regular business hours, with or without notice, based on the level of risk of unexpected service interruption caused by the maintenance, at the discretion of the applicable Manager in Technology Integrated Services, using the following table as a guide:
Risk of non service affecting maintenance = likelihood of unintended impact x impact

Impact

  • LOW (1). Example: service to individual users is degraded
  • MODERATE (2). Examples: service to a department degraded; service to individual users down
  • HIGH (3). Examples: service to campus is down or degraded; service to a department is down

Likelihood of unintended impact

  • POSSIBLE (3). Examples: activation of new functionality; addition of new software modules; restarting of services or servers providing core services
  • UNLIKELY (2). Example: hot swapping of network modules or disk arrays
  • RARE (1). Example: routine maintenance which rarely failed, historically

Risk by likelihood and impact

  • POSSIBLE (3): LOW impact is MODERATE (3); MODERATE impact is HIGH (6); HIGH impact is HIGH (9)
  • UNLIKELY (2): LOW impact is LOW (2); MODERATE impact is MODERATE (4); HIGH impact is HIGH (6)
  • RARE (1): LOW impact is LOW (1); MODERATE impact is LOW (2); HIGH impact is MODERATE (3)

Handling by risk of non service affecting maintenance

  • LOW (1-2): No announcement or maintenance window required
  • MODERATE (3-4): Announcement required. Can be performed during regular business hours with manager approval.
  • HIGH (6-9): Treated as service affecting maintenance

Exceptions to the above will be noted in the individual service items in the Service Catalog.
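The maintenance rules above can be expressed as two small lookups, shown here as a hedged sketch rather than an official tool. All names are hypothetical. Two points are assumptions not stated in the document: the 24 hour notice tier is taken to apply to windows of up to 5 minutes, and each window range is treated as inclusive of its upper bound.

```python
from datetime import timedelta

# Scores taken from the risk guide above.
LIKELIHOOD = {"rare": 1, "unlikely": 2, "possible": 3}
IMPACT = {"low": 1, "moderate": 2, "high": 3}


def risk_level(likelihood: str, impact: str) -> tuple[int, str]:
    """Return (score, level) where score = likelihood x impact."""
    score = LIKELIHOOD[likelihood] * IMPACT[impact]
    if score <= 2:
        level = "LOW"        # no announcement or window required
    elif score <= 4:
        level = "MODERATE"   # announcement required
    else:
        level = "HIGH"       # treated as service affecting maintenance
    return score, level


# (longest window covered by the tier, minimum notice)
NOTICE_TIERS = [
    (timedelta(minutes=5), timedelta(hours=24)),  # assumption: up to 5 minutes
    (timedelta(hours=1), timedelta(hours=48)),
    (timedelta(hours=2), timedelta(days=7)),
    (timedelta(hours=6), timedelta(days=14)),
    (timedelta(hours=12), timedelta(days=30)),
]


def minimum_notice(window: timedelta) -> timedelta:
    """Return the minimum notice for a service affecting maintenance window."""
    for longest, notice in NOTICE_TIERS:
        if window <= longest:
            return notice
    raise ValueError("windows longer than 12 hours are not covered by the list")


print(risk_level("possible", "low"))
print(minimum_notice(timedelta(minutes=90)))
```

Windows longer than 12 hours raise an error because the list does not define a notice period for them.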

Response times

Service outages are given the highest priority by Technology Integrated Services staff, and response to outages takes precedence over other activities.

The expected response times for service outages vary depending on the impact, and whether it occurs during regular business hours[1].  The response time is from the time the outage was detected or reported.

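Critical Business Impact

  • Business hours[1]: response time immediate; target resolution time[3] within (4) hours
  • Outside of business hours: response time 1 hour; target resolution time[3] within (12) hours

Moderate Business Impact

  • Business hours[1]: response time immediate; target resolution time[3] within (4) hours
  • Outside of business hours: response time within next business day[2]; target resolution time[3] within next business day

Low Business Impact

  • Business hours[1]: response time within current business day; target resolution time[3] within next business day
  • Outside of business hours: response time within next business day; target resolution time[3] within next business day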

Impact definitions:

Critical business impact

  • Work in the entire University or campus is stopped or interrupted.
  • A core service is unavailable or significantly degraded.
  • A critical business process (e.g. convocation, registration, payroll, etc.) is stopped or interrupted for entire University.

Moderate business impact

  • Work in a department is stopped or interrupted.
  • A core service is partially unavailable.

Low business impact

  • Work for an individual is stopped or interrupted

Exceptions to the above will be noted in the individual service items in the Service Catalog.

[1] – Business Hours - Monday through Friday, 8:30 a.m. to 4:30 p.m., excluding statutory and University holidays

[2] – Next business day – The next Monday through Friday, excluding statutory and University holidays

[3] – Target Resolution Time – Resolution time shown is expected resolution time for repair or replacement of physical components, and resolution or workaround of software bugs in most cases.
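The response targets above depend on two inputs: the impact level and whether the outage is detected during business hours. A minimal sketch of that lookup follows. The function and variable names are hypothetical, statutory and University holidays are not modeled, and treating 4:30 p.m. itself as outside business hours is an assumption.

```python
from datetime import datetime, time

BUSINESS_START = time(8, 30)
BUSINESS_END = time(16, 30)


def is_business_hours(moment: datetime) -> bool:
    """Monday to Friday, 8:30 a.m. to 4:30 p.m. Holidays are NOT modeled."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and BUSINESS_START <= moment.time() < BUSINESS_END


# (impact, during business hours) -> (response time, target resolution time)
TARGETS = {
    ("critical", True): ("Immediate", "Within 4 hours"),
    ("critical", False): ("1 hour", "Within 12 hours"),
    ("moderate", True): ("Immediate", "Within 4 hours"),
    ("moderate", False): ("Within next business day", "Within next business day"),
    ("low", True): ("Within current business day", "Within next business day"),
    ("low", False): ("Within next business day", "Within next business day"),
}


def targets_for(impact: str, detected_at: datetime) -> tuple[str, str]:
    """Return (response time, target resolution time) for an outage."""
    return TARGETS[(impact, is_business_hours(detected_at))]


# 8 January 2024 was a Monday; 6 January 2024 was a Saturday.
print(targets_for("critical", datetime(2024, 1, 8, 10, 0)))
print(targets_for("critical", datetime(2024, 1, 6, 10, 0)))
```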

Requesting support

Support is available as follows:

Support method and TIS services supported:

  • IST Service Desk: All
  • Faculty Service Desk: Varies[1]
  • resnet desk: Wi-Fi and residence network
  • after hours help desk: Wi-Fi
  • Online request: All

Exceptions to the above will be noted in the individual service items in the Service Catalog.

[1] - Faculty service desks support some TIS provided services, upon mutual agreement of the Faculty IT group, and IST, and subject to:

  • Use of IST Request Tracker
  • Use of IST established processes for incident reporting and escalation
  • Faculty IT staff have taken IST provided, or IST approved, training to support the service, where needed

Escalation

Escalation of support requests, in order:

  • Customer representative
  • Manager, Service Desk, IST
  • Director, Client Services, IST

Escalation of service outage response:

  • Director, Technology Integrated Services, IST

Disaster recovery planning

Information on IST’s Disaster Recovery Planning (DRP), including Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs), is included in IST’s Disaster Recovery Framework and Disaster Recovery Plan.

Metrics and reporting

TIS tracks the start and end times of maintenance and incidents on core services only (with the exception of Physical Security systems), through use of the IST NETWORK/SERVICE ALERT tool, which records details in a database.

Additional reporting and interpretation of incident/maintenance data may be released at the discretion of the Director, Technology Integrated Services.
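Recorded start and end times of this kind are enough to estimate monthly availability for a core service. The sketch below is one possible approach, not a description of how the alert tool or its database works. The record layout (start, end, kind) and the kind labels are hypothetical. It excludes scheduled maintenance as described under Availability and uptime, does not model the other exclusions (power failures, partial degradation, and so on), and assumes recorded outages do not overlap each other.

```python
from datetime import datetime


def monthly_availability(records, period_start, period_end):
    """Return availability (percent) for the period.

    records: iterable of (start, end, kind) tuples, where kind is a
    hypothetical label such as "incident" or "scheduled_maintenance".
    """
    period_seconds = (period_end - period_start).total_seconds()
    downtime = 0.0
    for start, end, kind in records:
        if kind == "scheduled_maintenance":
            continue  # excluded from the availability measure
        # Count only the part of the outage that falls inside the period.
        overlap_start = max(start, period_start)
        overlap_end = min(end, period_end)
        if overlap_end > overlap_start:
            downtime += (overlap_end - overlap_start).total_seconds()
    return 100 * (1 - downtime / period_seconds)


records = [
    (datetime(2024, 6, 3, 14, 0), datetime(2024, 6, 3, 14, 30), "incident"),
    (datetime(2024, 6, 15, 22, 0), datetime(2024, 6, 16, 0, 0), "scheduled_maintenance"),
]
june = monthly_availability(records, datetime(2024, 6, 1), datetime(2024, 7, 1))
print(round(june, 2))  # 99.93
```

In this example the 30 minute incident is counted and the two hour scheduled maintenance window is not, so the month stays above the 99.9% guide.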