IEEE Communications Society (ComSoc)
Technical Committee on Communications Quality & Reliability (CQR)

Emerging Technology Reliability Roundtable 2016
(ETR-RT16)

Monday, May 9, 2016

In Conjunction with IEEE CQR 2016 International Workshop
Skamania Lodge
1131 SW Skamania Lodge Way, Stevenson, WA 98648
http://www.skamania.com/

Scope of the Roundtable

 

Introduction - Program Agenda

  1. David Lu – Reliability Challenges in the Software Defined Everything Network

  2. Martin Taylor – Telco-grade Availability from an IT-grade Cloud

  3. Xuewen (Sean) Gong – Leveraging the New change to meet with the Carrier Grade Requirements of NFV

  4. Raj Savoor – Performance and Resiliency Considerations while scaling Virtualized Network Functions

  5. Steven Wright – Reliability Considerations in a Virtualized World

  6. Ying (Bob) C. Yeh – Airbus Fly-By-Wire Computers: A320-A350

  7. David Kinsey – ECOMP Enhancements on NFV-MANO

  8. Eileen Healy – IEEE SDN Initiative

  9. Spilios Makris – Reliability Challenges of Cloud-based Networks Used for World-Class Events

  10. Mike Tortorella – New Reliability and Performance Challenges in Rearrangeable Networks


David Lu, Vice President, Business Solutions Development, Technology Development, AT&T Services, Inc. USA

David, Vice President – Business Solutions Development, is responsible for Global Service Assurance, Network Fault and Rule Based Automation Platforms, Network Performance/Traffic/Capacity/Testing Management Platforms, Managed Services Platforms, and Field Operation Dispatching at AT&T. He leads an organization with more than 3,000 people across the globe.

David is a well-respected leader in software architecture and engineering, network performance and traffic management, business solutions, large DB and big data implementation/mining/analytics, software reliability and quality, and network operations process engineering.

Since joining AT&T Bell Labs in 1987, he has served in various leadership positions at AT&T. He holds 35 patents and has frequently appeared as a guest speaker at technical and leadership seminars and conferences throughout the world. He has received numerous industry awards, including the 2015 Chairman’s Award from the IEEE Communications Society for Network and Systems Quality and Reliability.

He was admitted to the world-renowned Shanghai Conservatory of Music but came to the US to complete his college education. He has an undergraduate degree in music, majoring in cello performance, and a graduate degree in Computer Science.

Topic: Reliability Challenges in the Software Defined Everything Network


Martin Taylor, Chief Technical Officer, Metaswitch Networks, USA

Martin Taylor is chief technical officer of Metaswitch Networks. He joined the company in 2004, and headed up product management prior to becoming CTO. Previous roles have included founding CTO at CopperCom, a pioneer in Voice over DSL, where he led the ATM Forum standards initiative in Loop Emulation; VP of Network Architecture at Madge Networks, where he led the company’s successful strategy in Token Ring switching; and business general manager at GEC-Marconi, where he introduced key innovations in Passive Optical Networking. Martin has a degree in Engineering from the University of Cambridge. In January 2014, Martin was recognized by Light Reading as one of the top five industry “movers and shakers” in Network Functions Virtualization.

Topic: Telco-grade Availability from an IT-grade Cloud

Abstract

No current cloud solutions promise better than 99.95% availability, but network operators deploying virtualized network functions are, quite reasonably, unwilling to relax their traditional requirement that services they deliver from their virtualized network should achieve five-nines availability.  In this session, Martin Taylor explores the different dimensions of network service availability requirements and explains how those requirements can be met using IT-grade cloud technology with the right approach to VNF design, cloud design, deployment practice and orchestration.
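As a rough illustration of the gap the abstract describes, the arithmetic can be sketched as follows. This is only a sketch under stated assumptions and not the approach presented in the session: the function names are hypothetical, a year is taken as 365 days, and the redundancy formula assumes replicas that fail independently, which real cloud deployments do not guarantee.

```python
# Illustrative availability arithmetic (assumptions: 365-day year,
# independent replica failures; function names are hypothetical).

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, leap years ignored


def downtime_minutes_per_year(availability: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def parallel_availability(availability: float, replicas: int) -> float:
    """Availability of N redundant replicas, where the service is up if
    at least one replica is up and failures are assumed independent."""
    return 1.0 - (1.0 - availability) ** replicas


for label, a in [("99.95% (IT-grade cloud)", 0.9995),
                 ("99.999% (five nines)", 0.99999)]:
    print(f"{label}: about {downtime_minutes_per_year(a):.1f} minutes of downtime per year")

print(f"Two independent 99.95% replicas: {parallel_availability(0.9995, 2):.8f}")
```

Under the independence assumption, two replicas already exceed five nines on paper; correlated failures (shared infrastructure, software faults, operator error) are exactly what erodes that figure in practice.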


Gengui Xie, Vice President of R&D Competence Center, Huawei Technologies, China

Gengui Xie joined Huawei in 1996 and has more than 20 years of experience in the telecom area. He is a leading expert on network management systems and design for RAS (Reliability, Availability, and Serviceability). Gengui is currently the VP of the Huawei R&D Competence Center, responsible for designing product architecture, reliability, serviceability, energy saving and emission reduction, technical planning and solutions, etc.

Gengui graduated from South-East University in China.

Topic: Leveraging the New change to meet with the Carrier Grade Requirements of NFV

Abstract

NFV has introduced new technologies into the telecom industry, such as COTS hardware, virtualization, and decoupling. These changes will bring big challenges for network and service reliability while bringing dramatic benefits for the industry. Instead of just addressing the reliability challenges, this presentation discusses some of our understanding and practice in leveraging the new changes to achieve high reliability and availability.


Raj Savoor, Assistant Vice President, AT&T Labs, USA

Raj Savoor is Assistant Vice President at AT&T Labs with over 25 years of industry experience. Raj leads a department responsible for Domain 2.0 systems engineering modeling and analysis of network performance, including design of instrumented network software solutions across AT&T’s Wireline and Wireless networks. Raj has over 40 awarded patents and is a recipient of the AT&T Labs Science and Technology Medal for Innovation in Performance Management for Access Technologies. Raj has an M.S. in Computer Science from the University of California at Davis.

Topic: Performance and Resiliency Considerations while scaling Virtualized Network Functions

Abstract:

Performance and resiliency have been cornerstone requirements for carrier grade network functions. SDN controlled cloud platform architectural principles of open source use, homogeneity, reusability and optimal asset utilization complement the traditional carrier grade network principles. To comply with all these objectives, VNF software design needs to take into consideration several new criteria. These criteria include deterministic workload categorization, disaggregation or microservice support in VNFs, VNF software efficiency estimation, and measurability. This talk will cover some of the foundational steps in scaling VNFs for performance and resiliency.


Steven Wright, MBA, PhD, JD., Chair, ETSI NFV Industry Specification Group (ISG)

Dr. Wright has negotiated technology agreements at organizations including ITU-T (a UN agency), NTIA, IEEE, IETF/ISOC, ATIS, ETSI, P4PWG, CEA, and OIF. He has also conducted research and developed IP network architectures considering technologies including MPLS, QoS, IPTV and access infrastructure.

Dr. Wright is a named inventor in more than 40 US patents. He has taught as adjunct university faculty; presented his research at international conferences in Europe, Asia, Australia and the USA; and continues to serve on committees for several conferences. His peer-reviewed publications range across topics including Quality of Service APIs, Open Source, Online Games, Network Architecture, Marketing Professional Services, Investment Risk Analysis, Internet Domain Name Trademarks, and Patent Licensor’s Obligations during Bankruptcy. He is a member of the State Bar of Georgia. He also received an Award of Merit from BSA for service as a Scoutmaster.

Topic: Reliability Considerations in a Virtualized World

Abstract

This presentation provides an introduction to some of the key reliability concepts in the context of deployments of virtualized network functions.


Dr. Ying Chin (Bob) Yeh, IEEE Fellow; Technical Fellow, Boeing Commercial Airplanes

Ying Chin (Bob) Yeh joined Boeing Commercial Airplanes in 1981 and has worked on Boeing Fly-By-Wire (FBW) computer development programs (FFM, 7J7, 777, 7E7, 787, 777X) since 1984, focusing on computer architectures and redundancy management schemes to achieve and certify safety-critical electronic systems with E-10 per hour functional integrity and availability.

He received his Ph.D. from University of Ottawa; M.S. from National Taiwan University, Taiwan; B.S. from National Cheng Kung University, Taiwan, all in Electrical Engineering.

Bob is an IEEE Fellow for contributions to ultra-reliable real-time embedded systems, and a member of IFIP (International Federation for Information Processing) Working Group 10.4: Dependability and Fault Tolerance.

Topic: Airbus Fly-By-Wire Computers: A320-A350

Abstract

Airbus developed the first commercial airplane fly-by-wire (FBW) computers in the 1980s, with certification of the A320 airplane in 1988. The Airbus FBW computer design philosophy can be stated below:

The A320 FBW computers evolved into those of the A330/A340, A380, and A350, retaining a similar design philosophy while changing microprocessors, compilers, programming languages, design tools, and data networks.

Readers can appreciate that, under the same certification requirements, the two competing FBW computer systems, Airbus and Boeing, differ considerably.


Eileen Healy, Co-Chair, IEEE SDN Initiative

Eileen Healy is Co-Chair of the IEEE SDN Initiative, a worldwide program addressing the main softwarization aspects of SDN, NFV, and 5G. She is currently the CEO of Healy & Co, a leading-edge engineering services company. Working with U.S. telecom network operators, her company supports the implementation of capacity delivery, network migration and business planning services. She also founded and sold Telecompetition Inc., a market research and data analytics company that provided data-specific strategic insights into market adoption of a host of mobile services products. Eileen has held senior positions in several telecommunications companies including Pacific Bell Mobile Services and AT&T. Eileen has honed a collaborative style, cultivating trust management while facilitating key group contributions to strategic technology development. She obtained a B.S. degree in Electrical Engineering from the University of California at Berkeley.

Topic: IEEE SDN Initiative


Michael Tortorella, PhD, Managing Director, Assured Networks

Dr. Tortorella is a leading communications industry expert in reliability management, engineering, modeling, and life data analysis.  Over a 26-year career at Bell Laboratories he was responsible for research and implementations in fundamental system, network, and service reliability engineering methodologies as well as for management of reliability in such critical projects as the SL-280 undersea cable system, the world's first application of fiber-optic technology in an intercontinental, undersea system.  He played a major role in many AT&T and Lucent product and service reliability studies, culminating in the creation of CADRE, a reliability modeling system for circuit packs that encompasses circuit simulation, thermal analysis, and uncertainty modeling in a single package fully integrated with computer-aided design systems used for circuit pack creation.

Formerly technical manager and a Distinguished Member of Technical Staff in the Design for Reliability Processes and Technologies Group in Bell Laboratories, Dr. Tortorella is now a research professor of industrial and systems engineering at Rutgers University. In addition to teaching courses in operations research and statistics, he maintains a robust research program that has direct impact on the concerns of the CQR. This program includes investigations into how the stochastic flows in an IP network determine the performance and reliability of services carried on those networks, design for network resiliency, developing modeling frameworks for control of IP networks under stressed conditions, and foundational issues in queueing theory. Additional current research interests include stochastic flows; network performance, management, and control; stochastic processes and their applications to reliability, life data analysis, and next-generation networks; and design for reliability methods and technologies. Dr. Tortorella has published extensively in these areas. He received the Ph.D. degree in mathematics from Purdue University in 1973. He is Advisory Editor for Quality Technology and Quantitative Management, where he has worked to increase the number of publications pertaining to the communications industry. His recently written book, Reliability, Maintainability, and Supportability: Best Practices for Systems Engineers, has just been published by John Wiley and Sons.

Topic: New Reliability and Performance Challenges in Rearrangeable Networks

Abstract:

Telecom networks are no longer fixed structures.  Software-defined networking (SDN) causes deliberate rearrangements in response to changing traffic or other network management demands.  Failures and repairs cause changes in capacity of network elements (routers and transport links).  Recognizing that packet delay and packet loss are the primary drivers of service reliability, it is necessary to understand how network rearrangements affect the flow of packets.  This talk discusses new reliability and performance challenges posed by rearrangeability and introduces some ideas for quantitative modeling of flows in rearrangeable networks.  We review flows in transportation networks as a simple introduction and move on to discuss briefly queueing network models and the Gale-Hoffman theorem in this context.


Panel Moderator: David Kinsey, Lead Architect, Domain 2.0 Architecture & Design, AT&T Labs

David has been in software development for over 31 years, including over 25 years in communications, 22 of those in telecommunications. His career encompasses the SDLC from planning through integration test, on applications in the following domains: large scale databases, military communications systems, cryptographic and key management systems, submarine warfare systems, and telecommunications. In telecommunications he has worked on applications for call center software, fraud management systems, CDR collection and analysis, performance statistics collection and analysis, marketing campaign management, and mediation solutions which included revenue assurance. Between 2004 and 2014 David supported AT&T in architecture by assessing the technology and network realization process of the 2G/3G and 4G networks in order to determine OSS/BSS impacts and providing roadmap planning for tools and business processes. Since 2014 David has been involved in developing the framework for automating the management of AT&T activities for SDN/NFV.

Topic: ECOMP Enhancements on NFV-MANO

Abstract:

There are striking differences between the ETSI NFV-MANO model and the Enhanced Control, Orchestration, Management, & Policy (ECOMP) model published by AT&T. In this session David Kinsey compares these differences and discusses why they were made in order to improve network quality and resiliency.


Roundtable Chair: Spilios Makris, PhD, Director, Palindrome Technologies, USA and Chairman of the IEEE SRPSDVE Study Group

Spilios Makris is currently the Director of Network Resilience and Business Continuity Management (BCM) at Palindrome Technologies. Spilios has extensive experience in BCM and network resilience, having served as Director and Senior Consultant at Telcordia Technologies (formerly Bellcore) for over 28 years, conducting studies and developing methodologies along with industry Best Practices for over 50 Tier 1&2 telecom companies, telecom vendors, and Telecom Regulatory Authorities (TRAs) worldwide. Spilios served as Chair, Vice-Chair, and Lead Contributor of the Standards T1A1.2 WG on “Network Survivability Performance” (later renamed PRQC Reliability Task Force) for 20 years. He successfully managed the development and regular update of Telcordia Generic Reliability Requirements documents, establishing them as the “de facto” industry standards (e.g., SR-332 on Reliability Prediction Procedure for Electronic Equipment).

Spilios recently served as the Chair of the IEEE Study Group for Security, Reliability, and Performance for Software Defined and Virtualized Ecosystems (e.g., SDN, NFV) (http://grouper.ieee.org/groups/srpsdv/meeting_information.html).

Spilios received his PhD in Industrial Engineering & Operations Research from the University of Massachusetts at Amherst, Mass., MS in Engineering Management from Northeastern University, Boston, Mass., and Diploma (equiv. to MS) in Electrical & Mechanical Engineering from the National Technical University of Athens, Greece.

He is a Certified Business Continuity Professional (CBCP) by the Disaster Recovery Institute International (DRII) and a Senior Member of IEEE.

Topic: Reliability Challenges of Cloud-based Networks Used for World-Class Events
(Case Study: The first-ever cloud-based network used for the 2015 European Games in Baku, Azerbaijan)

Abstract

There is a growing concern within the telecom community, and especially among the Organizing Committees of world-class events (e.g., Olympic Games, European/Asian/Pan-American Games, etc.), regarding the reliability of cloud-based telecommunications networks, including the services provided under failure conditions. This presentation will give a high-level description of the first-ever cloud-based network used for the 2015 European Games in Baku, Azerbaijan, and discuss its reliability challenges. That network served as a "network experiment" to see whether it is a viable network design for future world-class events.


Roundtable Advisor: Chi-Ming Chen, PhD, AT&T Labs, USA

Chi-Ming Chen joined AT&T in 1995. His current responsibility is the operations support system (OSS) architecture. Prior to joining AT&T, Chi-Ming was with Bell Communications Research (Bellcore) from 1985 to 1995. He was a faculty member at Tsing Hua University, Hsinchu, Taiwan from 1975 to 1979.

He received his Ph.D. in Computer and Information Science from the University of Pennsylvania in 1985; M.S. in Computer Science from the Pennsylvania State University in 1981; M.S. and B.S. in Physics from Tsing Hua University, Taiwan, in 1973 and 1971 respectively.

Chi-Ming Chen is a Life Senior Member of IEEE and Senior Member of ACM. He is an Advisory Board Member of IEEE Communications Society (ComSoc) Technical Committee on Communications Quality & Reliability (CQR), a member of the IEEE GLOBECOM & ICC Management & Strategy (GIMS) Standing Committee, and a member of the Industry Content and Exhibits Committee (ICEC). He has chaired several GLOBECOM and ICC Industry Forums and served as an IF&E (Industry Forum & Exhibits) Advisor for ICC 2015 and GLOBECOM 2015.


Resource Links:


Last updated on Tuesday, May 17, 2016