Job description

  • The Cloud Release Engineer will focus primarily on cloud environment support, build automation, and developer productivity.
  • Develop, implement, optimize, and maintain cloud-based solutions.
  • Adhere to architecture, design, implementation, and security standards and best practices.
  • Implement and manage cloud networks, servers, and interconnections such as IPsec, Site-to-Site VPN, and BGP.
  • Implement, maintain, and optimize cloud security capabilities (such as WAF, GuardDuty, Security Groups, and IAM).
  • Reduce mean time to identify (MTTI) by helping teams create dependency maps.
  • Reduce mean time to recovery (MTTR) by helping teams troubleshoot, monitor, alert, and automate recovery.
  • Improve mean time between failures (MTBF) by helping teams define SLIs/SLOs and prioritize proactive investment tasks.
  • Diligently observe and interpret cloud monitoring dashboards and alerts.
  • Resolve basic issues; escalate urgent and complex issues. Know the difference.
  • Coordinate with internal departments and sometimes with customers.
  • Be aware of customer SLAs and escalate cases that are taking too long to resolve.
  • Document all troubleshooting and issue management actions via the electronic case management system.
  • Schedule resources and arrange work to cover weekend and/or on-call shifts, so that someone is available to respond to system-initiated alerts.
  • Monitor the case backlog to ensure we meet agreed SLAs with customers and internal KPI targets; share regular status reports with stakeholders.

Minimum Education:

  • Bachelor’s degree in Computer Science, Electronic Engineering, Computer Engineering, or a related field.

Key Skills and Competencies:

  • Good practical Linux/Windows systems administration skills in a cloud or virtualized environment.
  • Experience with web services, databases, APIs, serverless, and custom application deployments on cloud platforms (AWS, Azure, and/or GCP).
  • Strong hands-on experience with network, storage and compute configuration and setup on cloud platforms (AWS and Azure).
  • Hands-on experience implementing Terraform IaC deployments.
  • Understanding of the following monitoring concepts: infrastructure, system, and application health; system availability; latency; performance; and end-to-end monitoring.
  • Entry-level cloud, network, or security certification from Cisco, Microsoft, AWS, CompTIA, or another well-known vendor is preferred.
  • Knowledge of and experience with SIEM, ELK, PLG, and container orchestration platforms such as Kubernetes is preferred.
  • Cybersecurity-related certifications such as Security+, SSCP, CCSP, and CEH.
  • Knowledge of query and scripting languages, including Python, PowerShell, PromQL, and SQL, and familiarity with REST API calls.
  • Experience (1 year) with ITIL processes, including Incident, Problem, Change, Knowledge, and Event Management.
  • Minimum of 6 years of experience, including experience working in a lead role.

Personal Attributes:

  • Excellent written and verbal communication skills; able to communicate with various levels of the organization.
  • Excellent people and stakeholder management skills.
  • Analytical, logical-thinking, and problem-solving skills.
  • Open to travel between locations.