49 day(s) ago

Site Reliability Engineer

Negotiable Salary


United States
English: Advanced, Native Speaker
Experience: 3+ years
Employment: Full-time

Scope AR

San Francisco
Scope AR is the pioneer of enterprise-class augmented reality solutions, delivering the industry’s only cross-platform AR tools for getting workers the knowledge they need, when they need it. The company is revolutionizing the way enterprises work and collaborate by offering AR tools that provide more effective and efficient knowledge-sharing to conduct complex remote tasks, employee training, product and equipment assembly, maintenance and repair, field and customer support, and more.

The company’s device-agnostic technology supports smartphones, tablets and wearables, making it easy for leading organizations like Boeing, Toyota, Lockheed Martin, Honeywell, Assa Abloy, GE and others to quickly scale their use of AR to any remote worker.

General overview of the project(s)

As a Site Reliability Engineer, you will be responsible for scaling and securing the server and network infrastructure for our services, both for the Scope AR Cloud and for on-premise installations, often for large, security-conscious customers. You will play a key role in delivering the services our software depends upon, and be essential to our success.

Responsibilities

Work with development teams to automate and streamline releases of our mission-critical distributed systems
Based on our customers’ needs, design infrastructure configurations, both for our hosted services and for on-premise installations
Anticipate changes to infrastructure to meet customers’ changing needs
Write scripts to automate installations, maintenance, migrations, etc.
Maintain, troubleshoot and administer VMs and networks for our hosted infrastructure
Improve our Docker Swarm setup for high availability and scalability to support our products
Improve and execute on security policies, identify and resolve security issues
Run scans and harden server images, configurations, networks and environments
Monitor services infrastructure performance; analyze issues, recommend and implement changes
Automate testing of configuration and scripts
Load/stress test configurations
Create and run tests to determine scaling characteristics
Document DevOps processes: develop standards to guide operations, support and maintenance
Work with our customers to establish suitable network configurations and security policies to run our software
Provide top tier support for on-premise installations
Respond to security and infrastructure questions
Plan and execute on meeting compliance requirements

Requirements

Very good Linux knowledge
Familiarity with CI/CD pipelines using Jenkins/CircleCI
Linux server scripting skills, especially bash and higher level languages, such as Python
Automating installations
Network configuration, setting firewall rules and other security policies
AWS management, especially configuring VPCs, and using IAM
Working with Docker, Docker Compose and Docker Swarm
Database scaling and clustering
Scaling services and network infrastructure in a cloud or data centre environment
Managing microservice-based services at scale
Monitoring service performance and automating scaling up or down
Monitoring for security issues in an AWS environment

Skills considered a plus

Familiarity with red/black deployments
Experience with other AWS services, such as Aurora
Experience with non-AWS cloud services, such as Microsoft Azure
Experience with Kubernetes
Comfortable scripting in Python and/or Ruby
Experience with Chef/Puppet
Experience with tools such as ELK, Graylog or similar
Experience working in SOC 2, ISO 27001, and other compliance environments
