Sr. Full Stack Infrastructure Engineer - Applied AI Platform

Maisa AI

Software Engineering, Other Engineering, Data Science
Remote
Posted on Oct 7, 2024

## About Maisa

Maisa AI is both an AI research lab and a B2B AI SaaS startup. We provide a horizontal platform that enables companies and knowledge workers to create and manage their Digital Workforce of Autonomous AI Workers. These workers operate 24/7, handling end-to-end jobs and mission-critical processes with unparalleled traceability, explainability, and consistent quality. No hallucinations!

We're revolutionizing productivity through cutting-edge AI technology.

We are seeking a Senior Full Stack Infrastructure Engineer to play a crucial role in our Site Reliability Engineering (DevOps/SysOps/SecOps) and platform development efforts. This position offers a unique opportunity to work across our entire stack, ensuring the reliability, scalability, and security of our systems while also contributing directly to our product development.

## Key Responsibilities

- Actively contribute as a hands-on engineer, dedicating 60% of your time to DevOps and SysOps responsibilities and 40% to supporting the engineering team as a regular contributor.

- Implement and maintain robust SysOps, DevOps & SecOps practices, including CI/CD pipelines, infrastructure as code, and automated testing.

- Manage and optimize our cloud infrastructure and on-premises systems, ensuring high availability, performance, and cost-effectiveness.

- Collaborate closely with the development team, providing feedback and implementing best practices to improve the performance and reliability of applications.

- Design, implement, and maintain scalable, high-performance systems capable of handling high-concurrency and real-time processing.

- Assist in implementing and maintaining security best practices across our infrastructure and applications.

- Troubleshoot and resolve complex technical issues across the entire stack.

- Implement and manage monitoring, alerting, and logging systems to ensure proactive problem detection and resolution.

- Participate in on-call rotations to provide 24/7 support for critical systems.

- Continuously evaluate and integrate new technologies to improve our infrastructure and development processes.

## Required Skills and Experience

- Strong proficiency in Golang and Python, with familiarity in Rust for small-scale features.

- Extensive experience with DevOps practices and tools, including CI/CD, configuration management, and infrastructure as code.

- Solid SysOps skills, including system administration, network management, and performance tuning.

- Experience working with AWS services and the ability to handle cloud environments across other providers (GCP, Azure, etc.).

- Solid understanding of event-driven architectures and high-concurrency environments, with hands-on experience using Kafka, EventBridge, or similar event streaming technologies.

- Familiarity with Large Language Models (LLMs), retrieval-augmented generation (RAG), embeddings, and other AI-related services is a major plus.

- Proficiency in containerization technologies (e.g., Docker) and orchestration platforms (e.g., Kubernetes).

- Strong knowledge of networking concepts, security best practices, and database management.

- Experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation).

## Soft Skills and Personal Qualities

- A true team player with excellent collaboration skills, willing to share knowledge generously and help colleagues grow.

- Honest, humble, and modest; a person who values teamwork and puts the needs of the team ahead of personal ambition.

- Willingness to take ownership and responsibility while maintaining open and transparent communication.

- Strong problem-solving skills and the ability to think critically under pressure.

- Excellent written and verbal communication skills, with the ability to explain complex technical concepts to non-technical stakeholders.

- Self-motivated with a passion for continuous learning and staying up-to-date with emerging technologies.

- Fluency in both English and Spanish, written and spoken, is essential.

## Additional Qualifications (Preferred but Not Required)

- Prior experience with high-traffic, event-based systems and knowledge of real-time processing.

- Experience working in fast-paced, high-concurrency environments, with the ability to scale systems to handle high loads.

- Knowledge of SecOps practices and experience in implementing security measures in cloud environments.

- Familiarity with InfoSec principles and experience in conducting security audits or implementing security controls.

- Bonus points for contributions to open-source projects or having built tooling that improves infrastructure or developer efficiency.

- Experience with data processing and analytics platforms (e.g., Apache Spark, Flink).

- Knowledge of machine learning operations (MLOps) and AI infrastructure management.

## What We Offer

- Opportunity to work on cutting-edge AI technologies in a fast-paced startup environment.

- Competitive salary.

- Flexible work arrangements.

- Chance to make a significant impact on the company's technical direction and growth.

If you're passionate about DevOps, SysOps, and building scalable, reliable, and secure systems, and want to be at the forefront of AI technology, we want to hear from you!