Available for opportunities
Hello, I'm

Krishna Kumar Yadav

Senior Data Engineer

Architecting data infrastructure that processes 10M+ transactions daily, delivering $2M+ annual savings, and building the backbone of fraud detection systems in banking.

10+ Years Exp
$2M+ Savings/yr
50% Query Boost
10M+ Txns/Day
01

About Me

I'm a Senior Data Engineer based in Bengaluru & Gurgaon, India, with a decade of experience architecting data solutions that power critical banking operations.

I specialize in building end-to-end data pipelines, real-time streaming architectures, and feature engineering systems serving ML teams at scale. My work spans fraud detection, risk management, and regulatory compliance in financial services.
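Feature engineering systems that serve ML teams hinge on point-in-time correctness: a training row may only see feature values as they existed at the event's timestamp, never later, or the model leaks future information. A minimal lookup sketch (pure Python; the schema and values are hypothetical, not from any actual system):

```python
import bisect


def feature_as_of(history, ts):
    """Return the latest feature value effective at or before ts.

    history: list of (effective_ts, value), sorted by effective_ts.
    Searching for (ts, +inf) makes a value stamped exactly at ts eligible.
    """
    i = bisect.bisect_right(history, (ts, float("inf"))) - 1
    return history[i][1] if i >= 0 else None


# Hypothetical rolling-average-transaction feature, keyed by effective time
avg_txn = [(100, 40.0), (200, 55.0), (300, 61.0)]
```

A training example with event time 250 would be joined against the value effective at 200 (55.0), while an event before the first snapshot gets no value rather than a leaked future one.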

Currently expanding into Generative AI, LLMs, MLOps, and cloud-native architectures to push the boundaries of what data engineering can achieve.

profile.py
class DataEngineer:
  def __init__(self):
    self.name = "Krishna Kumar Yadav"
    self.role = "Senior Data Engineer"
    self.experience = "10+ years"
    self.impact = {
      "savings": "$2M+ annual",
      "optimization": "50% faster",
      "scale": "10M+ txns/day"
    }
Location
Bengaluru & Gurgaon
Domain
Banking & FinServ
Education
M.Sc. CS (AI & ML)
Data Pipeline Architecture
02

Technical Arsenal

Big Data

Apache Spark · PySpark · Kafka · Spark Streaming · Hive · HDFS · Hadoop · Kinesis · MapReduce · Parquet · Avro · ORC

Pipeline & Orchestration

Apache Airflow · dbt · ETL / ELT Pipeline Dev · Workflow Automation · Batch Processing · Stream Processing

Feature Engineering & ML

Feature Store · Feature Pipelines · ML Data Pipelines · Great Expectations · Data Quality · Data Lineage

Warehouse & Lakehouse

Snowflake · Delta Lake · Apache Iceberg · Snowpipe · Data Modeling · Star Schema · Data Warehouse · Lakehouse

Cloud Platforms

Azure Databricks · Azure Data Factory · Azure Synapse · ADLS · AWS S3 · AWS Glue · AWS EMR · AWS Lambda · AWS Athena

Languages & Tools

Python · SQL · Scala · Bash · TigerGraph · PostgreSQL · Apache Impala · HBase · Git · Docker

Currently Learning & Exploring

Expanding Horizons
Generative AI · Large Language Models · LangChain · RAG Pipelines · MLOps · Kubernetes · Terraform · Apache Flink · Microsoft Fabric · Vector Databases · Data Mesh · Prompt Engineering
03

Work Experience

CURRENT
2023 — Present

Senior Data Engineer

Bank of America
  • Architected data pipelines processing 10M+ transactions daily with 15% performance improvement
  • Designed real-time streaming with sub-second latency for fraud alerts
  • Deployed TigerGraph modeling 50M+ relationships for fraud ring detection
  • Built feature engineering pipelines and Feature Store for ML model training
  • Reduced query time by 40% on 5TB+ data with Delta Lake Z-ordering
PySpark · Kafka · TigerGraph · dbt · Airflow · Delta Lake
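The Z-ordering gain above comes from how Delta Lake co-locates related records: the clustering columns' values are bit-interleaved into a single Morton code, and files are laid out in that order so a query filtering on either column can skip most files. A toy illustration of the interleaving itself (pure Python, not the Delta implementation):

```python
def morton_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x fills the even bit positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y fills the odd bit positions
    return code


# Sorting by Morton code keeps 2-D neighbours close together in 1-D file order
points = [(3, 1), (0, 0), (1, 3), (1, 0)]
z_sorted = sorted(points, key=lambda p: morton_2d(*p))
```

In Delta Lake proper this is a single `OPTIMIZE ... ZORDER BY (col_a, col_b)` statement; the sketch only shows why an interleaved key improves data skipping for filters on either column, where a plain sort helps only the leading one.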
2021 — 2023

Big Data Engineer

Standard Chartered Bank
  • Engineered big data architecture processing 5TB+ daily, reducing query time by 50%
  • Launched pipelines achieving 30% efficiency improvement
  • Created orchestration with 50+ DAGs and SLA monitoring
  • Enhanced Spark jobs reducing costs by 20% through optimization
Hadoop · Spark · Databricks · Snowflake · Airflow · dbt
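Orchestrating 50+ DAGs with SLA monitoring rests on two primitives: a topological order over task dependencies, and a per-run check that flags tasks exceeding their SLA. A minimal scheduler sketch (task names hypothetical; this is the idea, not Airflow's API):

```python
from collections import deque


def topo_order(deps: dict[str, set[str]]) -> list[str]:
    """Return tasks in dependency order; deps maps task -> upstream tasks."""
    indegree = {t: len(ups) for t, ups in deps.items()}
    downstream: dict[str, list[str]] = {t: [] for t in deps}
    for task, ups in deps.items():
        for up in ups:
            downstream[up].append(task)
    ready = deque(sorted(t for t, d in indegree.items() if d == 0))
    order: list[str] = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for down in downstream[task]:
            indegree[down] -= 1
            if indegree[down] == 0:
                ready.append(down)
    if len(order) != len(deps):
        raise ValueError("cycle in DAG")
    return order


def sla_misses(runtimes: dict[str, float], sla_s: float) -> list[str]:
    """Flag tasks whose runtime exceeded the SLA, for alerting."""
    return [t for t, r in runtimes.items() if r > sla_s]


dag = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
order = topo_order(dag)
late = sla_misses({"extract": 40.0, "transform": 310.0, "load": 55.0}, sla_s=300)
```

Airflow handles both concerns declaratively (`>>` dependencies and the `sla=` task argument); the sketch just makes the underlying mechanics concrete.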
2015 — 2021

System Operations Senior Analyst

Wells Fargo
  • Developed end-to-end data pipelines serving 500+ users
  • Improved workflow efficiency by 30% with Azure Databricks
  • Crafted Snowflake data models and star schema for regulatory reporting
  • Reduced data incidents by 45% with quality monitoring dashboards
PySpark · Snowflake · Airflow · Kafka · Databricks
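A star schema like the regulatory-reporting model above separates facts (transactions) from conformed dimensions, so reports reduce to a join plus an aggregation. A self-contained illustration with SQLite standing in for Snowflake (table and column names are hypothetical, not from the actual model):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
-- One dimension table and one fact table: the smallest possible star
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    segment      TEXT
);
CREATE TABLE fact_txn (
    txn_id       INTEGER PRIMARY KEY,
    customer_key INTEGER REFERENCES dim_customer,
    amount       REAL
);
INSERT INTO dim_customer VALUES (1, 'retail'), (2, 'corporate');
INSERT INTO fact_txn VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 900.0);
""")

# A typical reporting query: aggregate facts by a dimension attribute
rows = cur.execute("""
    SELECT d.segment, SUM(f.amount)
    FROM fact_txn f
    JOIN dim_customer d USING (customer_key)
    GROUP BY d.segment
    ORDER BY d.segment
""").fetchall()
```

The same shape scales to many dimensions (date, product, branch) radiating from the fact table, which is what makes the layout a "star".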
04

Key Projects

02

Real-Time Transaction Monitoring

Standard Chartered Bank

Orchestrated a Kafka streaming pipeline processing 1M+ events/hour with Spark Structured Streaming, Delta Lake, and Databricks.

1M+ Events / Hour
Kafka · Spark Streaming · Delta Lake · Databricks · Airflow
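Event-rate metrics like the 1M+ events/hour figure come from windowed aggregation; in Spark Structured Streaming that is `groupBy(window(...))`, which reduces in miniature to bucketing each event into a tumbling window. A pure-Python sketch (event times and keys hypothetical):

```python
from collections import defaultdict


def tumbling_window_counts(events, window_s=60):
    """Count events per (window_start, key) bucket.

    events: iterable of (epoch_seconds, key); each event lands in exactly
    one non-overlapping window of width window_s.
    """
    counts: dict = defaultdict(int)
    for ts, key in events:
        window_start = (ts // window_s) * window_s
        counts[(window_start, key)] += 1
    return dict(counts)


events = [(5, "card_a"), (30, "card_a"), (61, "card_a"), (90, "card_b")]
counts = tumbling_window_counts(events, window_s=60)
```

The real pipeline adds what this sketch omits: event-time watermarks for late data and incremental state kept between micro-batches, which is exactly what Structured Streaming manages for you.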
03

Data Warehouse Modernization

Wells Fargo

Spearheaded migration to a Snowflake data warehouse using dbt, Airflow, and PySpark, transforming legacy systems into a modern, scalable architecture.

60% Faster Queries
35% Cost Reduction
Snowflake · dbt · Airflow · PySpark · SQL
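Migrations like this typically lean on dbt incremental models, so each run reprocesses only new data instead of rebuilding the warehouse. A hypothetical model sketch (source, table, and column names assumed, not taken from the actual project):

```sql
-- models/fct_transactions.sql (hypothetical dbt incremental model)
{{ config(materialized='incremental', unique_key='txn_id') }}

select txn_id, customer_id, amount, txn_ts
from {{ source('core_banking', 'raw_transactions') }}
{% if is_incremental() %}
  -- on incremental runs, only pull rows newer than the target's high-water mark
  where txn_ts > (select max(txn_ts) from {{ this }})
{% endif %}
```

On the first run dbt builds the full table; afterwards the `is_incremental()` branch keeps each run proportional to new data, which is a large part of where query-time and cost wins in such migrations come from.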
05

Education & Certifications

Education

Master of Science in Computer Science
AI & Machine Learning
Woolf University, Malta
2024
Bachelor of Engineering in Computer Science
VTU, Bengaluru
2014

Certifications

DP-700
Microsoft Fabric Data Engineer
2025
SA
AWS Solutions Architect
2018
CP
AWS Cloud Practitioner
2023
06

Get In Touch

Have a project? Need data engineering expertise? Let's connect.

Location
Bengaluru & Gurgaon, India
Open to consulting and collaboration opportunities