Redwood City, CA, USA
17 days ago
Sr Principal SRE

About Oracle SaaS Cloud SRE

Oracle SaaS Cloud SRE plays a critical role in delivering and supporting best-of-breed cloud solutions to Oracle customers.

Oracle Cloud is the industry's broadest and most integrated public cloud. It offers best-in-class services across software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS), and even lets you put Oracle Cloud in your own data center. Oracle Cloud helps organizations drive innovation and business transformation by increasing business agility, lowering costs, and reducing IT complexity.

Our team delivers cross-team visibility and execution on the most challenging reliability issues impacting Oracle's SaaS customers. We engage closely with service owners and stakeholders to deeply understand and address critical issues that impair the service experience, as well as those that create waste.

We devise and execute programs that measure and improve the health of the service. This involves identifying key metrics and thresholds that indicate health or problems, collecting those metrics and threshold breaches and the supporting data, analyzing and classifying those events, assigning them to cases, identifying the product development and operational work necessary to resolve them, and tracking that resolution.

About the Job

A unique opportunity to join a rapidly growing, world-class team to improve the Oracle Fusion applications and technologies that make up the Oracle Cloud solutions. As part of the SRE team, you will be continually challenged and will have an opportunity to contribute to Oracle Cloud's success every day, working closely with our development partners.

As a Site Reliability Engineer, you will discover and analyze patterns of performance and other reliability problems, and propose improvement opportunities and approaches. You will apply your deep experience building and delivering enterprise business applications with Oracle Fusion and RDBMS technologies.

You will work with others to devise and implement programs that develop and improve Service Level Indicators and Objectives (SLIs and SLOs), measure performance against them, and identify and track product and service improvement opportunities, along with their benefits to customers and to Oracle.

What You'll Do

Explore and analyze patterns of performance and other reliability issues in the Fusion Applications SaaS product and service
Engage with Development and Service partners to identify improvement opportunities for SaaS and propose solution approaches in Fusion Applications, Fusion Middleware and Oracle RDBMS
Contribute to the definition of SLIs and SLOs focused on the Fusion Applications product, stack, and service
Acquire and ingest telemetry data
Analyze metrics and SLO breaches
Present data and analysis to various audiences, including executives

Service Accountability - You will be part of the SRE team, whose mission is the shared full-stack reliability of a collection of services and technology areas, with our Development partners.

Broad Interests - The SRE domain is an unusually broad one in terms of skills needed and exercised. SREs are a rare mix of sysadmins and Development Engineers and, as such, can understand and explain how product architecture decisions affect the ability to run as distributed systems. They are driven by professional curiosity and a desire to develop a deep understanding of their services and their dependencies.

Cross-team Collaboration - You will engage with and present to a wide variety of audiences, ranging from individual contributors and teams to executive leadership. The ability to empathize and interact with these personas on their terms is a useful capability.

What You Need to Have

A BS or MS in Computer Science, or equivalent

Knowledge of:

Oracle RDBMS, SQL and PL/SQL development
Systems thinking
Large scale applications and deployments
Excellent written and verbal communication skills
Service Reliability Engineering
Methodical approach to troubleshooting complex problems
Most importantly, the aptitude to be a good team player and the willingness to learn and implement new Cloud technologies as needed

The ideal candidate will understand:

Fusion Applications development, implementation, customization, and their tools and processes
Oracle Fusion Middleware
Oracle SOA, BPEL, ESS
Complex SQL query development and tuning
Oracle Database performance metrics and fluency to understand reports
FMW Administration, including WebLogic and SOA
General web application performance and reliability
Metric data acquisition tools and techniques
Defining and documenting technical architecture of complex and highly scalable products

Career Level - IC5
