My Experience
Tell me about yourself
- I was involved in providing technical support for nearly 4 to 5 years. My day-to-day activities in this role:
1. I was placed in a VMware technical support role, handling vCenter issues such as servers being down and broken links, using Linux skills.
2. Site reliability work: migrations such as lift-and-shift operations. I would like to work on core DevOps technologies like Jenkins, Kubernetes, and application performance.
Major Challenges
- Any major challenges you have faced as a DevOps engineer?
- We face challenges every day. Some are fixed then and there on a regular basis, but others take time due to their complexity; sometimes finding the root cause takes time because issues are linked with or integrated into other applications and dependencies. I can recall one recent incident related to an RDS instance on AWS. Storage was almost full and the alert was not checked in time by a team member. We increased the size, but it filled up again, so we increased it again, yet usage kept growing as fast as we added space. The root cause was the antivirus running on the servers: during patching it occupied disk space, and after patching completed it did not release the space it had occupied. Many servers were impacted by this bug.
Updates with New Tools
- How do you keep yourself updated with the latest tools and technologies that come onto the market, or when a tool releases new versions or features?
- I get updates through various sources: I have subscribed to several communities and groups that cover the latest tools, changes in the IT sector, and upcoming tools on the market, and I also follow my team's recommendations. I learn during free or personal time to stay up to date.
Day to Day Activities
- I have more than 20 years of experience working in the IT sector. In the past 5 years I have worked on cloud and DevOps technologies, including Jenkins for CI/CD, GitHub for repositories, Terraform for automation, and containerisation with Docker and its management with Kubernetes, including writing Dockerfiles in conjunction with developers.
- I get task tickets on Jira or by email, and some tasks via a WhatsApp group. Some users face connectivity issues to servers, as we have a hybrid network with some servers on-prem and some on the cloud. Some users have O365 application access issues due to failed two-factor authentication.
Tasks include setting up infrastructure as per requirements and creating pipelines for new or change requests.
Azure DevOps Realtime
AWS DevOps Realtime
Linux
CI/CD - Jenkins, GitLab
IaC - Terraform
Containers - Docker, Kubernetes
Database
Azure DevOps Mock Test
AWS DevOps Mock Test
Linux
Jenkins
- What is CI and CD? What do you understand by continuous integration, continuous deployment, and continuous delivery in DevOps?
- The process runs from developers writing code and storing it in a repository to creating a pipeline, so that any change committed by developers triggers the pipeline and runs the build.
After testing, vulnerability scanning, and code checks, once a package is built it is deployed to the development, testing, pre-production, or production environment for the application to use.
- CI → Automatically build/test code whenever developers push changes
Continuous Delivery → Automatically prepare deployments, production release requires approval
Continuous Deployment → Fully automatic deployment to production after successful tests.
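As a rough, tool-agnostic sketch (the app name, registry URL, and commands below are placeholders, not from any specific project), these are the kinds of shell steps such a pipeline automates on each commit:
# CI: fetch and verify the change (hypothetical Node.js app)
git clone https://github.com/example/my-app.git && cd my-app
npm ci        # install dependencies
npm test      # run unit tests; a failure stops the pipeline
# Build and publish an artifact (Continuous Delivery would pause here for approval)
docker build -t registry.example.com/my-app:1.0.0 .
docker push registry.example.com/my-app:1.0.0
# Continuous Deployment: roll the new image out automatically after tests pass
kubectl set image deployment/my-app my-app=registry.example.com/my-app:1.0.0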
- I have to provision two infrastructures, primary and secondary, using a Jenkins pipeline. How will you ensure that the secondary infra pipeline runs only after the primary infra passes?
- Configure the secondary job as a post-build (downstream) step of the primary job, so it is triggered only when the primary build succeeds.
- How do you take care of cleanup of temporary files in workspace in Jenkins?
- We manage cleanup using the Workspace Cleanup Plugin and automation in Jenkins pipelines through cleanWs() or deleteDir(), ensuring temporary files are removed either after every build or based on retention policies, preventing workspace clutter and disk usage issues.
Terraform
- You have to create 10 identical EC2 instances using Terraform. What approach will you follow?
- Create a Terraform configuration (.tf) file for the EC2 machines: define the provider and the EC2 instance details along with VPC, security group, and other settings, and set the number of instances to 10 using the count argument.
- A VPC is created manually. You have to create an EC2 instance using Terraform; how will you pass the VPC ID to Terraform?
- If you have access you can log in to AWS and copy the VPC ID; if you do not have it but know the VPC name or tag, define it in a data block. With a data block we can fetch the details of existing resources.
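If you prefer to look the ID up outside Terraform (for example to pass it in as a variable), a quick AWS CLI query by tag also works; the tag value my-vpc below is just an example:
# Fetch the VPC ID of an existing VPC by its Name tag (placeholder tag value)
aws ec2 describe-vpcs --filters "Name=tag:Name,Values=my-vpc" \
    --query "Vpcs[0].VpcId" --output text
# The returned ID can then be passed in, e.g. terraform apply -var "vpc_id=vpc-0abc123"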
- You have provisioned a VPC using Terraform and someone deleted it manually. Now if you run terraform apply, how will your pipeline behave?
- Terraform compares the resources in its state and configuration with what actually exists. If a resource exists and matches, it takes no action; if changes were made, it applies those changes; and if the resource no longer exists, it recreates it.
- How will you ensure in your Terraform configuration that the VPC ID is fetched and passed to the EC2 module only after the VPC is created?
- In Terraform, dependencies are managed automatically when we reference one resource in another. By passing aws_vpc.main.id as an input variable to the EC2 module (or exposing it via output), Terraform ensures the VPC is created first. We can also enforce ordering using depends_on if needed.
Ansible
Docker
- I can coordinate with developers to write a Dockerfile with their input.
- We have an nginx pod which we want to expose to the outside world. When I hit abc.com, how does my request reach that pod?
- For deploying the application as a pod, we create a Dockerfile, define the image to be used, deploy it with Kubernetes, and define a load-balancer Service which distributes the traffic to the pods.
In Route 53 we set up the custom domain and put the load balancer's public address in the custom domain's DNS records.
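A minimal sketch of the Kubernetes side, assuming an existing nginx Deployment (the Service name and domain are placeholders):
# Expose the nginx Deployment through a cloud load balancer
kubectl expose deployment nginx --type=LoadBalancer --port=80 --target-port=80
# Get the load balancer DNS name to use in the Route 53 record for abc.com
kubectl get service nginx -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'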
- Can you explain the steps you take to create a Docker image for a Node.js application and store it in ECR?
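- A hedged outline of the usual commands, assuming the repository name my-node-app, region us-east-1, and account ID 123456789012 as placeholders (the Dockerfile for the Node.js app already exists):
# One-time: create the ECR repository (placeholder name)
aws ecr create-repository --repository-name my-node-app
# Authenticate Docker to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
# Build, tag, and push the image
docker build -t my-node-app:latest .
docker tag my-node-app:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-node-app:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-node-app:latest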
Kubernetes
- Explain the architecture of Kubernetes.
- Kubernetes consists of a control plane and worker nodes. The control plane runs the kube-apiserver (front end for all requests), etcd (cluster state store), kube-scheduler (assigns pods to nodes), and the controller manager (reconciles desired state). Each worker node runs the kubelet (manages pods on that node), kube-proxy (service networking), and a container runtime.
- How do you upgrade a Kubernetes cluster?
- How do you distribute 10 pods equally over 10 nodes in Kubernetes?
- Use pod anti-affinity or topology spread constraints to spread the pods evenly across nodes; a DaemonSet can also be used, since it runs exactly one pod on every node.
- When you hit the URL you are getting a 403 error. How do you troubleshoot?
- 403 means forbidden access: the request is blocked by permissions before reaching the application. Check the rules defined for the load balancer and other services, check DNS name resolution, check where the traffic is dropping, check SSL certificates, check the pod logs, and check the port rules in the security group.
$kubectl get pods -n <namespace> -o wide
$kubectl get services -n <namespace>
$kubectl describe service <service-name>
$kubectl logs <pod-name>
$kubectl get events -n <namespace>
- Which AWS services and Kubernetes concepts will you use for provisioning a dynamic website?
- AWS services: EKS, networking (VPC, subnets, security groups), load balancer, database (Amazon RDS), storage for app files (EBS/EFS/S3), container registry (ECR), Route 53 (DNS), autoscaling.
Kubernetes concepts: Pods, Deployments, ReplicaSets, Services (load balancer), Secrets (sensitive data), PV and PVC, autoscaling, RBAC.
- Difference between a DaemonSet and a Deployment in Kubernetes?
- DaemonSet ensures a Pod runs on every node (or selected nodes)
- Deployment ensures desired number of Pods regardless of number of nodes
- Which component of K8s schedules pods onto nodes?
- The Kubernetes Scheduler (kube-scheduler) is the component responsible for scheduling Pods onto the appropriate Nodes.
- Suppose I have an application whose pod is running on an EKS cluster. Now whenever I hit abc.com in the browser, the request should get routed up to that pod. What steps will you take?
- What do you know about the Kubernetes API gateway?
- How do you scale Kubernetes nodes?
- Difference between a ConfigMap and a Secret?
- A Secret is used to store sensitive data, while a ConfigMap is used to store non-sensitive configuration data.
Database
Git
Scenario-based
End of Interview
Interest in working at your organization
- I am looking forward to getting the opportunity to work on all of those skills.
Azure Cloud
AWS Cloud
Personal experience in AWS Cloud
- I have worked with and gained knowledge of many cloud services in both AWS and Azure, such as VPC, EC2, RDS, managing domains on the cloud, cloud storage, etc.
VPC
- How do you connect your on-premise data center to AWS?
- Using the AWS Site-to-Site VPN service (or AWS Direct Connect for a dedicated private link).
- You have 2 VPCs with an EC2 instance running in each; the first one runs the frontend and the second the backend. These 2 apps need to communicate but they cannot. How do you resolve this?
- They are on two different networks, so establish connectivity between the two VPCs by enabling VPC peering (and update the route tables accordingly).
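A rough CLI sketch of enabling peering; the VPC IDs, route table IDs, and CIDR blocks are placeholders, and security groups still need to allow the traffic:
# Request and accept the peering connection between the two VPCs (placeholder IDs)
aws ec2 create-vpc-peering-connection --vpc-id vpc-0frontend --peer-vpc-id vpc-0backend
aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-0abc123
# Add a route in each VPC's route table pointing to the other VPC's CIDR via the peering connection
aws ec2 create-route --route-table-id rtb-0frontend --destination-cidr-block 10.1.0.0/16 --vpc-peering-connection-id pcx-0abc123
aws ec2 create-route --route-table-id rtb-0backend --destination-cidr-block 10.0.0.0/16 --vpc-peering-connection-id pcx-0abc123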
- Your app runs on EC2, and product icons that need to be shown on the web home page are stored in an S3 bucket. When the application tries to fetch them it results in an error. How do you resolve this and establish communication between the app and the S3 bucket?
- Create an IAM role with the required access to the S3 bucket and attach this role to the EC2 instance.
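A sketch of the same idea with the AWS CLI, using the AWS-managed read-only S3 policy and placeholder names; in practice the policy would be scoped to the specific bucket:
# Create a role EC2 can assume, attach S3 read access, and wrap it in an instance profile (placeholder names)
aws iam create-role --role-name ec2-s3-read --assume-role-policy-document file://ec2-trust-policy.json
aws iam attach-role-policy --role-name ec2-s3-read --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
aws iam create-instance-profile --instance-profile-name ec2-s3-read
aws iam add-role-to-instance-profile --instance-profile-name ec2-s3-read --role-name ec2-s3-read
# Attach the profile to the running instance (placeholder instance ID)
aws ec2 associate-iam-instance-profile --instance-id i-0abc1234567890def --iam-instance-profile Name=ec2-s3-read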
- Difference between a route table and a security group?
EC2
- Your application on EC2 experiences traffic spikes only during business hours. How would you optimize cost and performance?
- Configure auto scaling and a load balancer. Define threshold values in the scaling configuration; when usage gets high, the defined number of additional EC2 instances is launched and the load balancer distributes traffic across them (round robin by default). When usage gets low, the extra EC2 machines are removed.
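Since the spikes follow business hours, scheduled scaling can complement the metric-based policy; a sketch with placeholder ASG name, sizes, and UTC cron expressions:
# Scale out every weekday morning and back in every evening (placeholder values)
aws autoscaling put-scheduled-update-group-action --auto-scaling-group-name web-asg \
    --scheduled-action-name business-hours-up --recurrence "0 8 * * 1-5" \
    --min-size 2 --max-size 6 --desired-capacity 4
aws autoscaling put-scheduled-update-group-action --auto-scaling-group-name web-asg \
    --scheduled-action-name business-hours-down --recurrence "0 19 * * 1-5" \
    --min-size 1 --max-size 2 --desired-capacity 1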
- Your EC2 instance crashed unexpectedly. What steps would you take to identify the issue and restore service quickly? How do you troubleshoot?
- There could be multiple scenarios: the application consuming too many hardware resources, the operating system getting corrupted, updates being required, or recent updates causing the problem.
Plan of action: first check the EC2 machine's health/status checks in the AWS/Azure console; if all checks are OK, log in to the EC2 machine. If we are unable to log in, try rebooting the EC2 machine.
Check CPU and memory usage, check which application is consuming a lot of resources, and restart that application.
Check the logs of the EC2 machine.
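A few of the commands I would reach for during this plan of action; the instance ID is a placeholder:
# From outside the instance: status checks, console output, and a reboot if login fails
aws ec2 describe-instance-status --instance-ids i-0abc1234567890def
aws ec2 get-console-output --instance-id i-0abc1234567890def --output text
aws ec2 reboot-instances --instance-ids i-0abc1234567890def
# From inside the instance: resource usage and recent system errors
top
df -h
journalctl -p err --since "1 hour ago"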
- You have deployed an app on an EC2 instance; the app has to fetch some images from the internet but is not able to. What might be the possible issues?
- Check internet connectivity; if the EC2 instance is in a private subnet, check the NAT gateway configuration; check the security group rules; and check DNS name resolution.
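Some quick checks covering those possibilities, with a placeholder subnet ID and test URL:
# From the instance: DNS resolution and outbound HTTPS
dig example.com
curl -I https://example.com
# From the AWS CLI: confirm the private subnet's route table points at a NAT gateway
aws ec2 describe-route-tables --filters "Name=association.subnet-id,Values=subnet-0abc123"
aws ec2 describe-nat-gateways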
- Suppose you have an EC2 machine and you performed some task today. Now you have to perform the same task tomorrow and so on, but on a new EC2 machine. What will be the best way to do the task here?
- The best approach is to automate the setup and configuration so each new EC2 instance is ready without manual work (a short CLI sketch follows this list).
a) Create an AMI (Amazon Machine Image): after setting up the first EC2 instance with all software, configs, and scripts, create an AMI and launch new EC2 instances from that AMI so you do not need to repeat the task.
b) Use a user data script: write a script that automatically installs packages and performs the tasks on startup, add it under EC2 --> Advanced details --> User data, and every new EC2 instance will run the script on launch.
c) Use AWS automation/DevOps tools: CloudFormation, Terraform, Ansible/Chef/Puppet, AWS Systems Manager.
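A minimal CLI sketch of options (a) and (b); the instance ID, AMI ID, instance type, and setup.sh are placeholders:
# (a) Bake an AMI from the configured instance, then launch tomorrow's instance from it
aws ec2 create-image --instance-id i-0abc1234567890def --name "baseline-$(date +%F)"
aws ec2 run-instances --image-id ami-0abc1234567890def --instance-type t3.micro
# (b) Or launch a plain AMI and let a user-data script do the setup on first boot
aws ec2 run-instances --image-id ami-0abc1234567890def --instance-type t3.micro --user-data file://setup.sh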
- Your auto scaling group is implemented. Now suppose CPU utilisation reaches 80%; will it scale immediately or wait for some time and then scale?
- When CPU utilization exceeds 80%, the Auto Scaling Group does not scale instantly. It waits for CloudWatch to evaluate the metric, triggers the scaling action, and then enters a cooldown period to allow the new instances to stabilize before considering additional scaling.
- Can we change the private IP address of a running EC2 instance? Can you change it after stopping the instance?
- No in both cases: the primary private IP address stays with the instance's primary network interface for its lifetime and cannot be changed, even when the instance is stopped. You can, however, assign secondary private IPs or attach another network interface; it is the public (non-Elastic) IP that changes on stop/start.
RDS
- Which AWS services can be used to trigger a notification whenever your database storage consumption goes above 80%?
- Amazon CloudWatch: monitors database storage metrics (e.g., RDS FreeStorageSpace) and raises an alarm when the 80% threshold is crossed (i.e., free storage drops below 20% of allocated storage).
Amazon SNS (Simple Notification Service): sends notifications (Email/SMS/HTTP) when the CloudWatch alarm is triggered.
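A sketch of wiring this up for an RDS instance with 100 GB of storage (so less than 20 GB free means more than 80% consumed); the topic name, email, and DB identifier are placeholders:
# SNS topic and email subscription for the alert (placeholder values)
aws sns create-topic --name rds-storage-alerts
aws sns subscribe --topic-arn arn:aws:sns:us-east-1:123456789012:rds-storage-alerts --protocol email --notification-endpoint ops@example.com
# CloudWatch alarm on FreeStorageSpace (threshold in bytes, here 20 GB)
aws cloudwatch put-metric-alarm --alarm-name rds-low-storage \
    --namespace AWS/RDS --metric-name FreeStorageSpace \
    --dimensions Name=DBInstanceIdentifier,Value=mydb \
    --statistic Average --period 300 --evaluation-periods 1 \
    --threshold 20000000000 --comparison-operator LessThanThreshold \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:rds-storage-alerts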
- Difference between RDS and DynamoDB?
Domain
- Deploying a new website: the developers have written the code; what tasks does the DevOps engineer perform?
- If the requirement is to containerize the application, we can start building the infrastructure with the required instances and software; for managing the containers we can suggest setting up EKS, AKS, or an on-prem Kubernetes cluster.
We can set up CI/CD for automation and to handle change requests, set up monitoring of the application using Grafana and Prometheus (or any tool the client already uses), and take care of domain registration and custom domain setup.
- You have to create your own static website. Which AWS services would you consider?
- a) Amazon S3 to store the static files, b) Route 53 for domain registration and DNS routing, c) AWS Certificate Manager for SSL certificates.
For static website hosting on AWS, the ideal combination is S3 + CloudFront + Route 53 + ACM.
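The S3 part of that setup, sketched with a placeholder bucket name and local site folder (CloudFront, ACM, and Route 53 are configured separately):
# Create the bucket, enable static website hosting, and upload the site (placeholder names)
aws s3 mb s3://my-static-site-bucket
aws s3 website s3://my-static-site-bucket --index-document index.html --error-document error.html
aws s3 sync ./site s3://my-static-site-bucket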
- How will you map abc.com to your application running on EC2 so that whenever you hit abc.com in the browser you see your running website?
- Assign a public IP to the EC2 instance using an Elastic IP, create a hosted zone in Route 53 for abc.com (it gives you NS records), and update those records with the domain registrar. Ensure ports 80 and 443 are allowed in the security group.
Short answer: use Route 53 to create an A record that points abc.com to the EC2's Elastic IP or the load balancer's DNS name, update the nameservers at the domain registrar, and allow inbound traffic via the security group.
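The record-creation step as a CLI sketch; the hosted zone ID and Elastic IP are placeholders (for a load balancer you would use an alias record instead):
# Upsert an A record pointing abc.com at the instance's Elastic IP (placeholder values)
aws route53 change-resource-record-sets --hosted-zone-id Z0ABC123EXAMPLE \
    --change-batch '{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"abc.com","Type":"A","TTL":300,"ResourceRecords":[{"Value":"203.0.113.10"}]}}]}'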
- My frontend application is hosted on EC2 in a public subnet, and there are traffic spikes sometimes. The server is slow; how do you handle this problem?
- Use auto scaling and a load balancer, consider horizontal scaling, and add monitoring and alerts.
To handle slowness and traffic spikes on a single EC2 frontend, I would implement an Auto Scaling Group to automatically add/remove instances based on load, put an Application Load Balancer in front for even traffic distribution, offload static assets to S3 + CloudFront, and monitor metrics via CloudWatch to ensure responsiveness under spikes.
- What are the different scaling policies that we can implement with respect to an Auto Scaling group?
- Auto Scaling Groups (ASG) can automatically adjust the number of EC2 instances based on traffic or performance. There are several scaling policies you can implement depending on the use case:
AWS Auto Scaling supports Target Tracking, Step Scaling, Simple Scaling, Predictive Scaling, and Scheduled Scaling.
Target tracking automatically maintains a desired metric, step scaling adjusts in multiple steps based on thresholds, simple scaling reacts to a single threshold, predictive scaling anticipates future traffic using historical trends, and scheduled scaling adds/removes instances at specific times.
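For example, a target tracking policy that keeps average CPU around 50% could be attached like this (the ASG name, policy name, and target value are placeholders):
# Target tracking: the ASG adds/removes instances to hold average CPU near 50%
aws autoscaling put-scaling-policy --auto-scaling-group-name web-asg \
    --policy-name cpu-target-50 --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":50.0}'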
Storage
- You accidentally deleted data from an S3 bucket. How do you recover it?
- It can be recovered if S3 Versioning was enabled,
if S3 Replication was enabled,
if backup solutions were configured,
or if Object Lock was enabled.
If none of the above four were configured, the data cannot be recovered.
Recommended best practice:
Enable S3 Versioning,
Enable MFA delete (prevent unauthorized deletion)
Use S3 Lifecycle rules + Object Lock for critical data
Set up Cross-Region Replication (CRR) for disaster recovery
Use AWS Backup for automated protection.
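When versioning is enabled, recovery usually means removing the delete marker; a sketch with a placeholder bucket, key, and version ID:
# Find the delete marker's version ID, then delete the marker to restore the object
aws s3api list-object-versions --bucket my-bucket --prefix images/logo.png
aws s3api delete-object --bucket my-bucket --key images/logo.png --version-id <delete-marker-version-id>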
- In an S3 bucket, I want to keep my bucket private but I want one of the objects to be fetchable by one user for a specific 15 minutes. How do I achieve this?
- Use a pre-signed URL for temporary access. A pre-signed URL allows secure, temporary access to an object in a private S3 bucket without changing bucket permissions. $aws s3 presign s3://my-private-bucket/my-object.jpg --expires-in 900
- You have to restrict access to a specific object for a specific user in an S3 bucket. How will you do it?
- I will keep the S3 bucket private and create a fine-grained IAM or bucket policy that grants s3:GetObject permission only to the specified IAM user for that particular object. Thus, only that user can access that single object and nothing else.
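A sketch of such a bucket policy applied via the CLI; the account ID, user name, bucket, and object key are placeholders:
# Allow only one IAM user to GET one specific object; the bucket stays private otherwise
aws s3api put-bucket-policy --bucket my-private-bucket --policy '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789012:user/alice"},
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-private-bucket/reports/q1.pdf"
    }]
  }'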
- You have an app running on EC2 in AWS account A. The product images are stored in an S3 bucket in AWS account B. What kind of setup will you do to fetch the images?
Ans: use cross-account access with IAM roles and S3 bucket policies.
- I will use cross-account IAM Role + S3 Bucket Policy.
EC2 in Account A will assume a role in Account B using STS, and that role will have permission to access only the required S3 objects. This ensures secure, least-privilege cross-account access without exposing the bucket publicly.
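Roughly how the application (or a quick test from the EC2 shell) would assume the Account B role and fetch an image; the role ARN, bucket name, and the use of jq are assumptions:
# Assume the cross-account role in Account B and export the temporary credentials (placeholder ARN)
CREDS=$(aws sts assume-role --role-arn arn:aws:iam::222222222222:role/s3-image-reader --role-session-name app-a)
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r .Credentials.AccessKeyId)
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r .Credentials.SecretAccessKey)
export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r .Credentials.SessionToken)
# Fetch an image from the Account B bucket using the temporary credentials
aws s3 cp s3://account-b-product-images/product1.png .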
- I have two AWS S3 buckets in 2 different regions. A static website is hosted in one bucket and images are stored in the second. The static website has to fetch the images. How do you achieve this?
- To enable a static website hosted in an S3 bucket to fetch images stored in another S3 bucket in a different region, configure CORS on the image bucket and grant S3 GetObject permissions (via bucket policy or CloudFront OAC). Optionally use CloudFront as a CDN for optimized, secure delivery across regions.
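The CORS part could be applied like this; the website origin and image bucket name are placeholders:
# Allow GET requests from the website's origin to the image bucket
aws s3api put-bucket-cors --bucket my-image-bucket --cors-configuration '{
    "CORSRules": [{
      "AllowedOrigins": ["https://www.example.com"],
      "AllowedMethods": ["GET"],
      "AllowedHeaders": ["*"],
      "MaxAgeSeconds": 3000
    }]
  }'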
- How will you establish cross-account communication between S3 buckets?
- If you want to connect EC2 to an S3 bucket but don't want to use an internet gateway or NAT gateway, how will you establish communication between EC2 and S3 without the internet?
Other
- In what scenario would you use an EC2 instance versus an AWS Lambda function?
- It depends on the requirement. If the application runs only for a short time, we recommend Lambda; Lambda is a compute service that runs code without the need to manage servers, which also suits developers who want to test code without keeping servers around for future use. An EC2 instance is useful when you need specific tools installed that the code requires, or for long-running workloads.
- What is your understanding of DevOps and SRE?
- DevOps is mainly about automating day-to-day tasks with less manual intervention, integrating different tools and technologies so tasks are performed automatically; it connects the different teams and keeps a record of their progress. SRE applies software engineering practices to operations, focusing on reliability, SLOs and error budgets, and reducing toil.