Senior Site Reliability Engineer Manager - Bucharest

Bucharest, Bucharest, Romania

Overview

Come build and maintain the world’s computer as a member of the Microsoft Capacity Infrastructure Services team in Azure Core. The team ensures new servers are brought online (capacity buildout/provisioning) to enable Azure customers to leverage the latest offerings, see the illusion of infinite capacity, and grow the Azure business efficiently at hyperscale. You’ll also complete the cycle by safely taking old capacity offline (decommissioning/deprovisioning) and provisioning new capacity in its place, thus ensuring the cloud remains healthy and current.

As a Senior Site Reliability Engineering Manager, you’ll grow your team of site reliability engineers and service engineers to work with a breadth of partners across Microsoft including developers in service teams, hardware engineers, network engineers, datacenter technicians, supply chain managers, and business leaders to rapidly debug and resolve issues delaying the carefully orchestrated buildout and decommissioning sequences. You’ll drive continuous improvements with these teams to prevent repeats and address common classes of issues across the Azure software stack through design reviews and problem management.

This opportunity will enable you to gain unparalleled, wide-ranging knowledge of how the Azure cloud is built and maintained while growing your people management skillset. The contacts you make with experts will enable you to deep dive on services and new technologies and partner for improvements. You’ll be stretched to automate mitigations tactically to cloud scale and strategically analyze data to identify problem areas for driving improvements to meet business needs.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required Qualifications:

Technical experience in software engineering, network engineering, or systems administration
OR Bachelor's Degree in Computer Science, Information Technology, or related field AND technical experience in software engineering, network engineering, or systems administration
OR Master's Degree in Computer Science, Information Technology, or related field AND technical experience in software engineering, network engineering, or systems administration

Other Requirements:

Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

Technical experience in software engineering, network engineering, or systems administration
OR Doctorate Degree in Computer Science, Information Technology, or related field

Technical experience working with large-scale cloud or distributed systems

People management experience


Responsibilities

Demonstrates end-to-end expertise in distributed systems design, interactions between cloud technology layers and components, functions of physical network devices, and dependencies at scale. Drives efforts within an organization to identify and recommend optimal configurations of cloud technology solutions and develops or modifies the code base that defines infrastructures to improve the reliability and operability of supported products.

Develops end-to-end technical expertise in the architecture, code, features, and operations of specific products as required to implement improvements in product availability, reliability, efficiency, observability, and/or performance. Drives code/design reviews with the engineering teams that develop and/or manage those products and shares learnings and recommendations across engineering teams working on related products within their organization.

Researches and maintains deep knowledge of industry trends and advances in large-scale distributed systems and cloud technologies; manages efforts to research, develop, implement, and optimally utilize new tools, technologies, and/or processes to solve ambiguous problems and improve the availability, reliability, efficiency, observability, and/or performance of their team's supported products. Monitors the implementation of new tools, technologies, and processes as well as their impact on reliability, efficiency, observability, and/or performance to make recommendations for broader adoption within an organization.

Manages partnerships between Site Reliability Engineering (SRE) and product engineering teams to identify and implement changes to the code base to improve availability, reliability, efficiency, observability, and performance of related sets of products within an organization. Reviews and provides feedback on recommendations provided by SREs and ensures they have the technical expertise and data to justify and gain buy-in for their recommendations from product teams and owners.

Drives, and contributes to, the development of automation tools to reliably automate moderately complex but repetitive operations processes (e.g., monitoring, alerting, deploying products and updates, debugging) at scale within an organization; reviews existing and newly developed automation tools to evaluate and provide feedback on reusability, extendibility, and scalability. Ensures automation tools and systems developed within an organization are tested and the impact of their deployments is monitored.

Oversees a team of Site Reliability Engineers (SREs) using existing tools and/or models to identify contributing factors and points of failure affecting availability, reliability, performance, and/or efficiency of systems, platforms, and/or products; provides guidance, recommendations, and feedback to SREs to help them troubleshoot problems and to identify and test scalable solutions that can prevent the occurrence of similar issues in related products within their organization.

Participates in on-call rotations and manages teams of Site Reliability Engineers (SREs) responding to incidents during regular on-call rotations to identify the level of impact, troubleshoot issues, and deploy appropriate fixes to resolve root cause(s) and prevent recurrence across related products. Ensures that SREs within an organization have the technical knowledge and resources required to respond to incidents, that relevant engineering teams, stakeholders, and leaders are alerted to customer-impacting issues, that major issues are escalated to other teams as needed, and that key details related to incidents and their resolution are shared through post-mortem reports and during regular review meetings.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
Industry leading healthcare
Educational resources
Discounts on products and services
Savings and investments
Maternity and paternity leave
Generous time away
Giving programs
Opportunities to network and connect

Job details:

Company:	Microsoft
Location:	Bucharest, Bucharest, Romania
Added:	2. 7. 2024 (job posting active)
