Site Reliability Engineer – Kubernetes


Website Application Development

*please send resumes to*


Position: Site Reliability Engineer – Kubernetes

Contract to Hire

Location: Charlotte, NC 28202


Note: Looking for Site Reliability Engineer – Kubernetes, Splunk, Linux Scripting, CI/CD tools. Contract to hire if eligible.




The Site Reliability Engineer – Kubernetes is responsible for all application environments from development to production. The ideal candidate should have hands-on experience learning, triaging (both proactively and reactively), and documenting application stacks using monitoring tools (Splunk, AppDynamics, UI-session replay, Sentry, and/or others), and should have expert-level proficiency in at least one of the following areas: Kubernetes or AWS Serverless. They should understand how web traffic moves through all layers of infrastructure, including HTTP, CDNs, load balancers, and firewalls.

The Site Reliability Engineer will partner with application development and API teams to gain an understanding of the application stacks, triage environment issues, design monitoring methods, and provide reporting to executive leadership. This role is a critical part of an SRE team that will be the single point of contact for our Agile development and product teams regarding all application reliability, performance, and environment issues.


• Administration and tuning of web applications on our private cloud platform [Red Hat OpenShift (Kubernetes)]
• Partner with the Agile development teams to learn and assume responsibility for documentation, logging, and monitoring for various systems
• Partner with DevOps on CI/CD improvements using Bitbucket, Jenkins, OpenShift & AWS
• Implementation of monitoring on various online applications using solutions such as Splunk, UI-session replay, AppDynamics, etc., and the ability to determine the right toolset to accomplish monitoring goals on net-new application stacks
• Strong knowledge of custom alerts and ability to integrate data housed in disparate data sources to create workflow-driven alerting
• Continuously tune and validate the quality of current tools for network and system monitoring, UI-session replay, and log file parsing, and implement a toolkit that works


!!!! MUST HAVE hands-on experience running web applications and/or APIs on Kubernetes OR AWS Serverless !!!!

• Must have expert-level knowledge of:
o Kubernetes
• Must have some experience with:
o Leading triages
o Monitoring tools (Splunk, AppDynamics, and/or others)
o SQL, Linux, scripting, file manipulation, reporting, and Visio
o Big data elements like server logs, user URLs, etc.
o CI/CD tools such as Bitbucket, Jenkins, OpenShift, and AWS CI/CD tools
o Supporting customer facing web applications
o Application Performance Monitoring (APM)
• Ability to work off-hours and/or weekends as needed

Additional Desired Knowledge & Skills:

• Experience with complex multi-system environments
• Working knowledge of Agile methodologies (Scrum, Kanban, Lean, XP)
• Experience supporting hybrid server environments (on-premise, AWS, Azure, etc.)
• Good understanding of financial industry operations metrics and reporting practices is a plus
• Passion, positive attitude, engagement, and desire to take on challenging assignments as part of a team to make things WORK
