IBM Cloud Site Reliability Engineer in Toronto, Ontario

Job Description

Ready to grow your career in the cloud? Like the feeling that you are making a difference?

This is your chance to be an integral part of a dynamic team of talented professionals developing and deploying innovative, industry-leading, cloud-based software.

While at IBM, you’ll be expected to draw upon your technical skill set and make important contributions to our popular Business Analytics cloud offerings, collaborating daily in an agile development environment with an extended team of experienced cloud engineers.

The Cloud Site Reliability Engineer is a key role in the growing and dynamic IBM Business Analytics organization. This technical role focuses on developing and deploying cloud applications, automating a wide range of operational tasks, and working with product management, development teams company-wide, and end users to solve complex problems.

Work Environment:

You will be part of a strong, modern team culture driven to create world-class development and deployment environments, delivering an industry-leading user experience for our customers. You will be valued for your contributions in a rapidly growing organization with dynamic opportunities. You will attend daily team scrums and project meetings, contributing to the development and architecture of automated solutions that build and optimize our cloud and deployment infrastructure.

Your passion for problem solving and simplifying complex tasks will have an immediate impact on our IBM Cloud offerings, and you will have a true (and rewarding) taste of what it takes to deliver an industry-leading Software as a Service (SaaS) offering.

Primary Responsibilities include but are not limited to:

  • Developing best-in-class deployment technology to balance velocity and reliability in the delivery of our Software as a Service (SaaS) offerings

  • Driving technical and architectural excellence across all our Business Analytics offerings

  • Supporting service development through the creation of software frameworks, capacity planning, design consulting, and deployment process improvements

  • Measuring service KPIs, providing live data on availability, performance and system health

  • Identifying novel solutions to challenging operational problems and developing interfaces between separate SaaS offerings

  • Documenting and sharing your experience, mentoring others

Skills and Attributes

  • Experience developing product pipelines to optimize the process of deploying Software as a Service (SaaS) within large-scale, cloud-based infrastructure

  • Understanding of 12-factor app development and the ability to recommend architectural changes to convert existing apps

  • Natural drive and proven ability to automate complex parallel tasks

  • Ability to propose, design, and develop solutions that scale

  • Relentless drive to eliminate toil through automation

  • Strong background in Unix/Linux administration

  • Passionate about learning new technologies

  • Keen troubleshooting skills and hands-on experience with agile development methodology

  • Familiarity with load balancing, geo routing, and proxying

  • Strong verbal and written communication skills, for both internal and customer-facing audiences

This position will be based out of our downtown Toronto office.


Required Technical and Professional Expertise

  • Mastery of at least one programming language (Java, C/C++, C#, Go, JavaScript)

  • Strong background in Windows and Unix/Linux administration

  • Solid understanding of either PowerShell or Python

  • Experience with network appliances (firewalls and load balancers)

  • Keen troubleshooting skills and hands-on experience with agile development methodology

Preferred Technical and Professional Experience

  • Scripting languages (Ruby, Python, Perl, Shell)

  • Configuration management (Ansible, Chef, Rundeck)

  • Data Warehousing and Analysis (Cognos Analytics, Planning Analytics, Controller)

  • Virtualization and Container orchestration (Xen, Docker, Kubernetes)

  • Monitoring and logging tools (Nagios, QRadar, New Relic, Prometheus)

  • Continuous Integration platforms (Jenkins, TeamCity, Travis CI)

  • NoSQL databases, key-value stores, and other data-structure solutions (MongoDB, Redis)

  • Single sign-on solutions and the Security Assertion Markup Language (SAML) 2.0 standard

  • Source and project control (GitHub Enterprise, ZenHub)

  • Virtual application and web servers (Apache, NGINX, WebSphere, IIS)

EO Statement

IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.