My SRE Tasks

Automation

Deployment automation - Jenkins and GitHub Actions

AWS automation - Python boto3

Other API automation - Python requests module (Loki, ClickHouse, Okta)

Ansible for user onboarding and offboarding

Bash scripting for Linux tasks such as backups pushed to S3 and user creation

PowerShell - for Windows automation

Terraform for VPC and EKS

Common tasks

user onboarding, installing software, backup
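As a rough illustration of the "backup and push to S3" item above, here is a minimal Python sketch of the same idea. It assumes boto3 is the upload path; the directory, bucket name, and `backups/` prefix are hypothetical placeholders. The S3 client is passed in rather than created inside the function, so the logic can be exercised without AWS credentials.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def make_backup_archive(source_dir: str, dest_dir: str) -> Path:
    """Create a timestamped .tar.gz of source_dir inside dest_dir."""
    source = Path(source_dir)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = Path(dest_dir) / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the folder name
        tar.add(source, arcname=source.name)
    return archive


def push_backup(s3_client, archive: Path, bucket: str, prefix: str = "backups/") -> str:
    """Upload the archive with an S3 client and return the object key used."""
    key = f"{prefix}{archive.name}"
    # upload_file(Filename, Bucket, Key) is the standard boto3 S3 client call
    s3_client.upload_file(str(archive), bucket, key)
    return key


# Typical use (hypothetical path and bucket; needs boto3 and AWS credentials):
#   import boto3
#   archive = make_backup_archive("/var/www/app", "/tmp")
#   push_backup(boto3.client("s3"), archive, "example-backup-bucket")
```

Keeping archive creation and upload as separate functions makes it easy to reuse the archive step for other destinations.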

Monitoring

Prometheus, Grafana, Loki

system metrics - node exporter

logs - Loki

traces - Python library

Blackbox exporter for SSL, health checks, and other metrics

SLA Dashboard

Understand the ELK Stack flow and system
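For the SSL monitoring item above, the underlying check is simply "how many days until the certificate's notAfter date". The sketch below shows that calculation with Python's standard library only. It is not how the Blackbox exporter is implemented; the function names are made up for illustration.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` (timezone-aware) and a certificate notAfter string."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field from the certificate a server presents.

    With the default context an already-expired or untrusted certificate
    makes the handshake raise ssl.SSLError, which is itself a useful signal.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Typical use (needs network access; the 14-day threshold is an assumption):
#   remaining = days_until_expiry(fetch_not_after("example.com"),
#                                 datetime.now(timezone.utc))
#   if remaining < 14:
#       print("certificate expires soon")
```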

Alerting

CI/CD build alert

Alertmanager alert (SSL, metrics threshold, logs error, health check)

Containerization

Docker

Kubernetes

Data

SQL - PostgreSQL

data warehouse - ClickHouse

data analytics - Apache Superset

data pipeline - Airbyte

Incident handling - debugging, RCA, SOP

Others

AWS, DNS, security, data

REAL TIME TASKS

job failed - create ticket

automated SOP

AI bot - runs scripts and tries to fix the issue; if it is not fixed, the ticket is assigned to an SRE, who analyses, debugs, performs RCA, and fixes the issue
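The flow above (job fails, ticket is created, automated steps are tried, unresolved tickets go to an SRE) can be sketched roughly as follows. Everything here is hypothetical: the `Ticket` fields, the assignee names, and the idea that each remediation is a function returning True when it fixed the job are assumptions for illustration, not the actual bot.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Ticket:
    job: str
    status: str = "open"
    assignee: str = "ai-bot"  # hypothetical name for the automated handler
    notes: List[str] = field(default_factory=list)


def handle_job_failure(job: str, remediations: List[Callable[[str], bool]]) -> Ticket:
    """Open a ticket, try each automated fix in order, escalate if none works."""
    ticket = Ticket(job=job)
    for step in remediations:
        try:
            fixed = step(job)
        except Exception as exc:
            # A crashing remediation is recorded and treated as "not fixed"
            ticket.notes.append(f"{step.__name__} raised {exc!r}")
            continue
        ticket.notes.append(f"{step.__name__}: {'fixed' if fixed else 'no effect'}")
        if fixed:
            ticket.status = "resolved"
            return ticket
    # Nothing worked: hand over to a human for analysis, debugging, and RCA
    ticket.assignee = "sre-on-call"  # hypothetical queue name
    ticket.status = "escalated"
    return ticket
```

The notes collected on the ticket give the SRE a record of what was already tried before the hand-over.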

monitoring alert - Alertmanager (SSL, health check)

job failure alert (Semaphore UI - Teams, Outlook, and ticket)

build failure alert (CI/CD - ticket)

issues/incidents

disk full

high memory usage

SSL expiry

domain down

server down

pipeline failed

job execution failed

new intern onboarding jobs - install software, create user, grant access

new API/web app deployment

migration

exploring new tools

improving system performance

developer issues - access denied, CORS error

request blocked - check WAF events

set up WAF rules

DB backups

DB connection issue - pool timeout

suspicious activity - e.g. rate limiting or blocking bots

setting up a new environment for a client

log rotation

rotate secrets - Vault

Docker image building
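For the disk-full case among the issues listed above, a first triage step is usually to check how full the filesystem actually is. A minimal sketch using only the standard library; the 90% threshold is an assumed value, not one taken from these notes.

```python
import shutil


def disk_usage_percent(path: str = "/") -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some virtual filesystems report zero size
    return usage.used / usage.total * 100


def is_disk_full(path: str = "/", threshold: float = 90.0) -> bool:
    """True when usage is at or above the threshold (in percent)."""
    return disk_usage_percent(path) >= threshold


print(f"usage: {disk_usage_percent('.'):.1f}%")
```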

learning new systems like Kubernetes

SRE principles

focusing on how to make systems better by leveraging Kubernetes concepts, e.g. using pod affinity to reduce latency by scheduling the DB and API pods closer together
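To make the pod affinity idea concrete, the sketch below builds the `affinity` stanza that asks the scheduler to prefer placing the API pod on a node that already runs a DB pod. The label `app: db` and the image name are hypothetical; `kubernetes.io/hostname` as the topology key means "same node", and a zone-level key could be used instead. The output is JSON, which Kubernetes accepts as a manifest format alongside YAML.

```python
import copy
import json


def with_pod_affinity(pod_spec: dict, match_labels: dict,
                      topology_key: str = "kubernetes.io/hostname",
                      weight: int = 100) -> dict:
    """Return a copy of pod_spec that prefers co-location with matching pods."""
    term = {
        "weight": weight,  # Kubernetes allows 1-100 for preferred terms
        "podAffinityTerm": {
            "labelSelector": {"matchLabels": dict(match_labels)},
            "topologyKey": topology_key,
        },
    }
    spec = copy.deepcopy(pod_spec)  # leave the caller's spec untouched
    pod_affinity = spec.setdefault("affinity", {}).setdefault("podAffinity", {})
    pod_affinity.setdefault(
        "preferredDuringSchedulingIgnoredDuringExecution", []
    ).append(term)
    return spec


# Hypothetical API pod spec that should sit next to pods labelled app=db
api_spec = {"containers": [{"name": "api", "image": "example/api:latest"}]}
print(json.dumps(with_pod_affinity(api_spec, {"app": "db"}), indent=2))
```

A "preferred" rule is used here rather than a "required" one so the API pod can still be scheduled when no node with a DB pod has room.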

monitoring systems - metrics, traces, logs, events, auditing

Automation tasks

onboarding user

installing software on Windows using PowerShell - Warp, WireGuard
