Ananya Cheripally

Data Science Engineer • Python & SQL Developer • ML Analytics

Master's student at the University of Sydney specializing in Data Science, transforming complex datasets into actionable insights through predictive analytics and data-driven storytelling.

About Me

I'm a passionate Data Science Engineer with a strong foundation in Computer Science and growing expertise in SQL Server, Python, predictive analytics, and data-driven storytelling. Currently pursuing my Master's in Data Science at the University of Sydney, I thrive on transforming complex datasets into actionable insights that drive smarter decisions.

Data Engineering

Building scalable data pipelines, ETL processes, and data warehouses

Python • SQL • Spark • Airflow • Kafka • Snowflake

Automation & Workflows

Designing intelligent automation systems that optimize business processes

Python • APIs • CI/CD • Docker • Kubernetes • Terraform

AI/ML Engineering

Developing and deploying machine learning models at scale

TensorFlow • PyTorch • Scikit-learn • MLflow • Hugging Face

My Journey

2025

Master's in Data Science

University of Sydney

Pursuing advanced studies in data science, machine learning, and predictive analytics

2023

Data Engineer

SAAC IT Solutions

Built relational databases, designed robust triggers, and worked with real-world datasets

2022

Secretary, IEEE GRIET SB

GRIET

Led technical projects and organized events, honing leadership and organizational skills

Technologies & Tools

Python • SQL • R • C++ • MySQL • SQL Server • Machine Learning • Predictive Analytics • Data Visualization • Tableau • Data Modeling • Random Forest • SVM • TensorFlow • Scikit-learn • Pandas • tidyverse • Query Optimization • ER/RM Diagrams • Git • Data Storytelling

Featured Projects

A selection of projects showcasing my expertise in data engineering, automation, and AI/ML

AI/ML

Predictive Crime Analytics for NSW

Forecasting high-priority emerging threats using 30 years of crime data

R • Random Forest • SVM • Predictive Analytics

AI/ML

ML Model Deployment Platform

Created an end-to-end ML platform for model training, versioning, and deployment

Python • FastAPI • Docker • Kubernetes

Automation

Intelligent Workflow Automation

Automated complex business workflows saving 100+ hours per week

Python • Airflow • APIs • PostgreSQL

Data Engineering

Data Warehouse Modernization

Migrated legacy data warehouse to modern cloud architecture

SQL • Snowflake • dbt • Python

AI/ML

NLP-Powered Analytics

Built NLP models for automated text analysis and insights extraction

Python • Transformers • PyTorch • FastAPI

Automation

Infrastructure as Code Platform

Created IaC framework for automated infrastructure provisioning

Terraform • AWS • Python • CI/CD

Dashboards & Visualizations

Interactive dashboards and data visualizations that drive decision-making

Sales Analytics Dashboard

Real-time sales performance tracking with predictive analytics

Power BI
Real-time • 50+ KPIs • Predictive Models
Power BI • SQL • Python

These dashboards are built using modern BI and visualization tools

Power BI • Tableau • Streamlit • Grafana • Plotly • D3.js

AI/ML Experiments

Research projects, Kaggle competitions, and experimental ML models

Trained a custom YOLOv8 model on a dataset of 50,000+ images. Implemented data augmentation, transfer learning, and model optimization techniques. Deployed as a FastAPI service handling 100+ FPS on GPU.

PyTorch • YOLOv8 • OpenCV • FastAPI

Implemented a custom transformer architecture for time series forecasting. Compared performance against LSTM, GRU, and Prophet models. Achieved 20% improvement in RMSE over baseline models. Used for demand forecasting in production.

TensorFlow • Transformers • Pandas • Plotly
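
For readers unfamiliar with the metric, here is a minimal sketch of how a relative RMSE improvement over a baseline can be computed. The function names and the toy numbers are hypothetical and chosen only for illustration; they are not the project's data or code.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmse, model_rmse):
    """Fractional reduction in RMSE relative to the baseline (0.2 means 20% lower)."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Hypothetical toy series, for illustration only.
actual = [100.0, 110.0, 120.0, 130.0]
baseline_forecast = [90.0, 120.0, 110.0, 140.0]  # every error has magnitude 10
model_forecast = [92.0, 118.0, 112.0, 138.0]     # every error has magnitude 8

baseline = rmse(actual, baseline_forecast)
model = rmse(actual, model_forecast)
improvement = relative_improvement(baseline, model)
```

With these toy values the baseline RMSE is 10 and the model RMSE is 8, so the relative improvement works out to 20%.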

Fine-tuned a BERT-base model on a custom dataset of 100K+ reviews. Implemented an efficient inference pipeline using ONNX Runtime. Achieved a 92% F1-score across 5 sentiment classes. Processes 1000+ documents per second.

Hugging Face • BERT • ONNX • Python

Built a hybrid recommendation system combining collaborative filtering and content-based approaches. Used neural networks to learn user and item embeddings. Improved click-through rate by 35% in A/B tests.

PyTorch • Embeddings • MLflow • Redis
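
One common way to combine the two signals is a weighted blend of a collaborative score and a content-based score. The sketch below assumes the embeddings and feature vectors are already available (in the project they were learned by neural networks, which is not shown here). The `alpha` weight, the function names, and the toy vectors are all hypothetical, not the project's actual implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hybrid_score(user_emb, item_emb, user_profile, item_features, alpha=0.5):
    """Blend a collaborative score (embedding similarity) with a content score
    (profile vs. item-feature similarity). alpha is an assumed tuning weight."""
    collaborative = cosine(user_emb, item_emb)
    content = cosine(user_profile, item_features)
    return alpha * collaborative + (1 - alpha) * content


def rank_items(user_emb, user_profile, items, alpha=0.5):
    """Return (item_name, score) pairs sorted from best to worst."""
    scored = [
        (name, hybrid_score(user_emb, emb, user_profile, feats, alpha))
        for name, (emb, feats) in items.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors: each item maps to (embedding, content features).
user_emb = [1.0, 0.0]
user_profile = [0.0, 1.0]
items = {
    "a": ([1.0, 0.0], [0.0, 1.0]),  # matches on both signals
    "b": ([0.0, 1.0], [1.0, 0.0]),  # matches on neither
    "c": ([1.0, 0.0], [1.0, 0.0]),  # matches on the collaborative signal only
}
ranking = rank_items(user_emb, user_profile, items)
```

Setting `alpha` closer to 1 leans on behaviour-based embeddings, while values closer to 0 favour item content, which can help for new items with little interaction history.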

Implemented multiple anomaly detection algorithms, including Isolation Forest, Autoencoder, and LSTM-based approaches. Compared their performance and selected the best model for production. Reduced false positives by 60%.

Scikit-learn • TensorFlow • Kafka • Docker

Fine-tuned a Stable Diffusion model on a custom dataset to generate domain-specific images. Implemented LoRA for efficient fine-tuning. Created a web interface for easy experimentation. Generated 10,000+ high-quality images.

Stable Diffusion • LoRA • Gradio • PyTorch

Competitions & Achievements

Top 5%
Kaggle Competition
15+
Published Notebooks
Expert
Kaggle Rank

Blog & Articles

Sharing insights, tutorials, and lessons learned from building data systems

Data Engineering

Building Scalable Data Pipelines with Apache Airflow

A comprehensive guide to designing and implementing production-ready data pipelines that scale to millions of records.

Jan 15, 2024
8 min read
Airflow • Python • Data Engineering

AI/ML

MLOps Best Practices: From Notebook to Production

Learn how to take your ML models from Jupyter notebooks to production-ready systems with proper monitoring and versioning.

Jan 10, 2024
12 min read
MLOps • Python • Docker

Automation

Automating Cloud Infrastructure with Terraform

How to implement Infrastructure as Code to automate and manage your cloud resources efficiently.

Jan 5, 2024
10 min read
Terraform • AWS • DevOps

Data Engineering

Real-Time Stream Processing with Apache Kafka

Deep dive into building real-time data streaming applications using Kafka and Spark Streaming.

Dec 28, 2023
15 min read
Kafka • Spark • Streaming

AI/ML

Fine-Tuning Large Language Models for Domain Tasks

A practical guide to fine-tuning LLMs like BERT and GPT for specific business use cases.

Dec 20, 2023
14 min read
LLM • NLP • Transformers

Data Engineering

Data Quality: The Foundation of Reliable Analytics

Implementing data quality checks and monitoring to ensure your analytics are built on solid foundations.

Dec 15, 2023
9 min read
Data Quality • Testing • dbt

Automation

CI/CD for Data Pipelines: A Modern Approach

How to implement continuous integration and deployment for your data engineering workflows.

Dec 10, 2023
11 min read
CI/CD • GitHub Actions • Testing

AI/ML

Computer Vision at Scale: Lessons Learned

Practical insights from deploying computer vision models in production serving millions of requests.

Dec 5, 2023
13 min read
Computer Vision • PyTorch • Production

Infrastructure

Cost Optimization Strategies for Cloud Data Warehouses

Proven techniques to reduce your cloud data warehouse costs without sacrificing performance.

Nov 28, 2023
10 min read
Snowflake • Cost Optimization • Cloud

Let's Work Together

Have a project in mind or want to discuss data engineering, automation, or AI/ML? I'd love to hear from you!

Get in Touch

I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision.

Connect on Social

5+
Years Experience
50+
Projects Completed
100%
Client Satisfaction

Ready to Build Something Amazing?

Let's turn your data challenges into opportunities for growth and innovation.