Our client company offers a comprehensive suite of advanced solutions designed to empower marketers in the creation and execution of impactful campaigns while fostering customer loyalty. Our product offerings span Messaging, Loyalty, and Experiences, catering specifically to the needs of enterprise-level clients.
Headquartered in Nashville, Tennessee, in the United States, our client operates globally, with offices across the US, Europe, Australia, New Zealand, Malaysia, India, South America, Central America, and Japan.
What You’ll Do
- Help build a Site Reliability Engineering culture by sharing your best practices, approaches, documentation, and code with other engineering teams
- Apply automation and software to any tasks or parts of the system that would benefit from it or are performed manually
- Troubleshoot complex OS, networking, and database issues in cloud-based SaaS and on-premises environments; handle live production incidents; debug application and infrastructure issues; and follow and implement SRE best practices
- Monitor application performance, take steps to improve overall application performance and stability, and follow through with implementation
- Conduct system analysis and configuration management, and develop improvements to system software performance, availability, and reliability
- Work closely with software and QA engineers to ensure the system is responding properly to non-functional requirements such as performance, security, and availability
- Document your system knowledge as you acquire it over time, create runbooks, and ensure critical system information is readily available to those who need it
- Maintain and monitor deployments, orchestration, databases, and general backend infrastructure
- Keep up to date with security and proactively identify, diagnose, and solve complex security issues
- Be part of an on-call rotation to support the global platform and provide an excellent customer experience