SHRUTHI NAIDU Site Reliability Engineer
Profile

A site reliability engineer, looking to make sense out of chaos while driving necessary change to help build a reliable platform.

Education

Master's Degree (MTECH)

09/2014 – 06/2016

Major : Digital communication and Networking

School : The Oxford College of Engineering

Bachelor's Degree (BTECH)

08/2009 – 06/2013

Major : Electronics and Communication

School : New Horizon College of Engineering

Professional Experience

SRE

FDJ
06/2025 – Present | Sydney, Australia
  • Managing and scaling a high-performance on-premise observability platform built on the Grafana stack and OpenTelemetry architecture, deployed as Kubernetes workloads.
  • Partnering with service teams to define alerting strategies and build actionable dashboards that enhance operational visibility and response effectiveness.
  • Improving platform reliability, scalability, and efficiency through continuous optimization, performance tuning, capacity planning, and automation.
  • Applying SRE best practices, including SLI/SLO definition, incident analysis, and proactive monitoring, to increase system resilience and reduce operational toil.
  • Leading a migration initiative to modernize and streamline cluster management, optimizing overall observability operations and scalability.
  • Projects : Service team SLO/SLI engagement, Argo Migration, Cluster Optimization

    Site Reliability Engineer

    Workday
    03/2022 – 06/2025 | Sydney, Australia
  • Building a reliable and highly available product through active engagement in incidents and postmortems, and guiding the team towards an SRE transformation by fostering a mindset focused on triage and root cause analysis.
  • Additionally, worked as an SRE Product Owner within the ANZ region with an aim to infuse agile methodology into daily operations via project work.
    Skills

    Cloud: AWS, GCP, OpenStack
    OS and Software: Linux, Python
    Container Management: Docker, Kubernetes, Helm
    GitOps: GitHub, GHA, Argo
    Observability and Alerting: ELK Stack, Grafana Stack, Honeycomb, BigPanda, OpenTelemetry
    Collaboration Tools: JIRA, Confluence, PagerDuty, Slack, Zendesk
    Other Tools: Kafka

    Achievements

    Recognised for : Customer Support and Employees

    Workday
    09/2024

    Received this recognition from my managers for putting customers first while always looking to improve how our team works.

    Recognised for : Innovation

    Workday
    04/2024

    Issued by the service health manager for my curiosity and zest to learn.

    Employee of the month

    Criterion Networks
    04/2020

    Received this award for working on building the SRE team while maintaining customer success with all clients from onboarding to exit.

  • Managed and prioritized the product backlog, collaborating with engineering teams to define and implement SLOs/SLIs and to onboard alerts onto the existing observability platform, alongside daily operational tasks.
  • Projects : Service team SLO/SLI engagement, Mean Time to Detect reduction via alerting, Team SRE Bot.

    Site Reliability Engineer (Remote)

    BlockFi
    04/2021 – 12/2021 | Bangalore, India

    Being an SRE evangelist, handling incidents as a first-line responder and spearheading blameless incident postmortems. Assisting service teams in adopting best practices for deployments, alerting and metrics while managing tooling and maintenance of the public cloud infrastructure.

    Projects : Service Improvements - Observability setup

    Site Reliability Engineer

    Criterion Networks
    07/2019 – 03/2021 | Bangalore, India

    Ensuring customer success through incident management and maintaining client usage metrics, assisting the QA team in pre- and post-deployment testing, and advocating for adoption of SRE practices in metrics and observability. Maintaining the hybrid cloud environment while developing good processes as well as maintaining product documentation.

    Projects : Client Usage Metrics and cloud costing

    Projects

    Learning Docker

  • Supe-story-project: A simple dockerized Python app that gives you a random superhero story.
  • Bookstack: A wonderful note keeper spun up locally using Docker Compose.
    Warrior Spirit Award

    JCPenney

    Received this award for building the operations (NOC) team from the ground up including setting processes in place as well as training team members.

    Interests
  • Singing
  • Sport enthusiast
  • Amateur guitarist
  • Part of the Toastmasters fraternity
  • Attending tech conferences and webinars
    Courses

    KCNA

    CNCF
    08/2024 – 09/2024

    Terraform Associate

    ACG
    06/2023

    Google Cloud Professional Certificate: Cloud Engineer

    10/2021