Lead Cloud & Release Engineer (Latam only)
Job description
- The Cloud Release Engineer will focus primarily on cloud environment support, build automation, and developer productivity.
- Develop, implement, optimize, and maintain cloud-based solutions.
- Adhere to architecture, design, implementation, and security standards and best practices.
- Implement and manage cloud networks, servers, and interconnections such as IPsec, Site-to-Site VPN, and BGP.
- Implement, maintain, and optimize cloud security capabilities (such as WAF, GuardDuty, Security Groups, and IAM).
- Reduce mean time to identify (MTTI) by helping teams create dependency.
- Reduce mean time to recovery (MTTR) by helping teams troubleshoot, monitor, alert, and automate recovery.
- Improve mean time between failures (MTBF) by helping teams define SLIs/SLOs and prioritize proactive investment tasks.
- Diligently observe and interpret cloud monitoring dashboards and alerts.
- Resolve basic issues; escalate urgent and complex issues. Know the difference.
- Coordinate with internal departments and sometimes with customers.
- Be aware of customer SLAs and escalate issues if cases are taking too long to resolve.
- Document all troubleshooting and issue management actions via the electronic case management system.
- Schedule resources and arrange coverage for weekends and on-call shifts so that someone is available to respond to system-initiated alerts.
- Monitor the case backlog to ensure we meet agreed SLAs with customers and internal KPI targets; share regular status reports with stakeholders.
Minimum Education:
- Bachelor’s degree in Computer Science, Electronic Engineering, Computer Engineering, or related field
Key Skills and Competencies:
- Good practical Linux/Windows systems administration skills in a cloud or virtualized environment.
- Experience with web services, databases, APIs, serverless, and custom application deployments on cloud platforms (AWS, Azure, and/or GCP).
- Strong hands-on experience with network, storage and compute configuration and setup on cloud platforms (AWS and Azure).
- Hands-on implementation experience with Terraform IaC deployments.
- Understanding of the following monitoring concepts: infrastructure, system, and application health; system availability; latency; performance; and end-to-end monitoring.
- Entry-level cloud, network, or security certification from Cisco, Microsoft, AWS, CompTIA, or another well-known vendor preferred.
- Knowledge of and experience with SIEM, ELK, PLG, and container orchestration platforms like Kubernetes preferred.
- Cybersecurity-related certifications such as Security+, SSCP, CCSP, and CEH.
- Knowledge of query and scripting languages, including Python, PowerShell, PromQL, and SQL, and familiarity with REST API calls.
- Experience (1 year) with ITIL processes, including Incident, Problem, Change, Knowledge, and Event Management.
- Minimum of 6 years of experience, including experience working in a lead role.
Personal Attributes:
- Excellent written and verbal communication skills; able to communicate with various levels of the organization.
- Excellent people and stakeholder management skills.
- Analytical, logical-thinking, and problem-solving skills.
- Open to travel between locations.