Sneha Susan Shaju

Data Engineer · Python · PostgreSQL · Elasticsearch · Machine Learning · NLP

Data Engineer focused on building reliable data pipelines, analytics workflows, and ML-ready systems using Python and SQL.

A bit about me

I have hands-on experience building Python-based data pipelines and analytical workflows for real-world systems.

In my current role, I work on automated data ingestion and processing pipelines using PostgreSQL and Elasticsearch, focusing on data reliability, validation, and downstream usability.

I’m particularly interested in working with imperfect data (missing values, delayed inputs, and evolving schemas) and in designing pipelines that remain stable as requirements grow.

Experience

Data Engineer — Digital University Kerala

Oct 2023 – Present

Data Analyst Intern — Digital University Kerala

Feb 2023 – Jun 2023

Projects

AI-Powered Document QA System

Tech: Python, Mistral 7B, ChromaDB, LangChain, Streamlit

  • Built a document-based QA chatbot using LLMs and vector search
  • Implemented parsing, chunking, embeddings, and retrieval pipelines
  • Designed an interactive Streamlit interface

Revenue Analytics – Hospitality Domain

Tech: Power BI, SQL, DAX, ETL

  • Designed ETL pipelines and analytical data models
  • Built dashboards for revenue and occupancy insights
  • Implemented complex DAX calculations

Skills

Languages: Python, SQL

Databases: PostgreSQL, Elasticsearch

Data & ML: Pandas, NumPy, scikit-learn, TensorFlow, NLP, LLMs

Tools: FastAPI, LangChain, ChromaDB, Streamlit, Git, Jira, Power BI

Contact