Site Reliability Engineer

2024-04-04
Americas
Planet Argon
Planet Argon provides dependable support and maintenance of existing Ruby on Rails apps for a variety of clients in different industries. We take care of small feature updates, bug fixes, and performance improvements.
We are currently looking for an experienced Site Reliability Engineer who can provide part-time (15-20 hours/week) support for our clients and our team on a contract basis. The relationship would start as a three-month contract engagement. If both parties want to continue working together after those 90 days, we can agree to either a longer-term contract or a month-to-month arrangement.
Learn more about Planet Argon here
We're looking for contractors who embody our core values:

PROACTIVE - We actively seek opportunities to improve our clients’ products, our processes, and our abilities.
CURIOUS - A natural curiosity for the undiscovered results in remarkable work for our clients – and stronger connections for our team. We ask questions, learn, and aren't afraid to fail.
DEPENDABLE - We are invested in our work. We manage expectations. We support our clients and teammates. We hold ourselves, our teammates, and our clients accountable.
VERSATILE - We readily adapt to change and encourage innovation because our team and work are transparent and flexible.
DELIGHTFUL - We choose to set a mindful, positive tone that allows everyone to flourish.

Requirements
As our Part-Time Contract Site Reliability Engineer, you will play a pivotal role in maintaining the resilience and optimal performance of our clients’ software systems. Your responsibilities will include responding to and resolving application outages and other incidents, identifying their root causes, and devising robust solutions to prevent their recurrence. You will assist team members in interpreting system monitoring alerts and messages, and actively participate in retrospectives to analyze incidents and develop strategies to avoid future issues. Additionally, you will integrate monitoring tools into client applications, apply patches and system upgrades across various client environments, and debug build failures in CI/CD pipelines.
To excel in this role, you should have experience in identifying and addressing scalability issues related to system architecture and in pinpointing security vulnerabilities in servers. Proficiency in scripting and automation with languages like Python or Bash is essential, as is experience with configuration management tools such as Terraform, Ansible or Chef. A strong understanding of Linux/Unix systems, networking fundamentals, and experience with cloud platforms like AWS or Google Cloud are crucial. You should also be adept at auditing third-party services and tools for efficiency and cost-effectiveness. While not mandatory, familiarity with Ruby on Rails will be considered an advantage.
Role responsibilities will include:

Responding to application outages and other incidents, identifying root causes, and implementing solutions to prevent recurrence
Helping other team members understand system monitoring alerts and messages
Taking part in retrospectives after incidents to document what went wrong and plan how best to avoid those situations in the future
Integrating monitoring tools into client applications
Patching and applying system upgrades across different client systems
Debugging build failures in CI/CD pipelines
Managing database migrations
Managing deployments and CI/CD pipelines across multiple projects for various clients


The ideal contractor has an understanding of and experience with:

identifying and solving scalability issues having to do with poor system architecture
identifying security vulnerabilities in servers
scripting and automation using languages such as Python or Bash
containerization tools like Docker
configuration management tools like Terraform, Ansible or Chef
Linux/Unix systems and networking fundamentals
cloud platforms such as AWS or Google Cloud
auditing third-party services and tools for efficiency and cost-effectiveness
Pingdom, Bugsnag, Rollbar, New Relic, Honeycomb, and other monitoring tools
RSpec, Test::Unit, Cypress, and other testing tools
SQL databases
CI/CD tools such as CircleCI, GitHub Actions, GoCD, Jenkins, and Travis CI
Familiarity with Ruby on Rails is a plus!


Availability and Location
The ideal contractor is comfortable working remotely in a time zone within 1-2 hours of EST.
South and Latin American contractors are highly encouraged to apply.
This is a remote part-time contract position. The ideal candidate will be available for periodic meetings during EST business hours.
Benefits
Our ideal contractor has an hourly rate of $45.00 - $65.00 USD.
We address monthly or bi-weekly invoices within 10 business days.
If you're passionate about ensuring the reliability and availability of high-profile applications and working with a team of skilled developers, we'd love to hear from you!


About the company

Founded in 2002, Planet Argon helps companies with existing Ruby on Rails applications make them better and more maintainable.