sieduck

Data Science student @ Unimelb building toward industry-standard AI engineering — currently deep in RAG evaluation, with agentic systems next on the roadmap.

Projects

Selected work

Pinned projects from my GitHub.

RAG / LLM (live demo)

Financial Report RAG

Interactive question-answering system that lets non-expert investors ask plain-English questions about corporate 10-K filings, powered by retrieval-augmented generation.

  • 69.5% accuracy across 20 questions on four 10-K filings
  • ChromaDB vector search + FlashRank re-ranking
  • Deployed live on Streamlit Cloud
ChromaDB, Gemini 2.5 Flash, FlashRank, Streamlit, PyMuPDF

NLP

Sports Arbitrage with Embeddings

Detects guaranteed-profit betting arbitrage by scraping Sportsbet and PointsBet APIs, then matching semantically similar markets across bookmakers with NLP embeddings.

  • Sentence-BERT embeddings (stsb-distilroberta-base-v2)
  • Cosine similarity for cross-bookmaker market matching
  • Implied-probability arbitrage detection
Sentence Transformers, Python, NLP, Web Scraping
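
The matching and detection steps listed above can be illustrated with a small standard-library sketch. It is not the project's code: the function names are hypothetical, the short toy vectors stand in for real Sentence-BERT embeddings, and it assumes decimal odds. The idea is that a market is an arbitrage candidate when the implied probabilities (1 / decimal odds) of the best price for each outcome sum to less than 1.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def arbitrage_margin(best_odds):
    """1 minus the summed implied probabilities of the best decimal odds.

    A positive value means the outcomes are collectively underpriced,
    so backing all of them can lock in a profit.
    """
    return 1.0 - sum(1.0 / odds for odds in best_odds)


def stake_split(best_odds, bankroll):
    """Split a bankroll so the payout is equal whichever outcome wins."""
    total = sum(1.0 / odds for odds in best_odds)
    return [bankroll * (1.0 / odds) / total for odds in best_odds]


# Toy vectors standing in for embeddings of two market names (assumption).
market_a = [0.9, 0.1, 0.3]
market_b = [0.8, 0.2, 0.3]
similarity = cosine_similarity(market_a, market_b)

# Best price per outcome across two bookmakers (made-up numbers).
odds = [2.10, 2.05]
margin = arbitrage_margin(odds)
stakes = stake_split(odds, bankroll=100.0)
payouts = [stake * o for stake, o in zip(stakes, odds)]
```

With these made-up odds the margin is positive and both payouts come out equal, which is the defining property of an arbitrage stake split. Any similarity threshold for deciding that two markets are the same would need tuning on real data, and practical limits such as stake caps or odds moving between bets are outside this sketch.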

Data Analysis

Vehicle Accident Analysis

Statistical analysis of Victorian road crash data investigating whether vehicle-related factors correlate with accident severity, using ML and hypothesis testing.

  • Decision tree regressors with cross-validation
  • Findings driven by feature importance and R² scores
  • Surfaced brand-level safety patterns in the data
scikit-learn, pandas, Seaborn, Jupyter

Contact

Available for Machine Learning / AI internships

Remote or on-site — happy to talk projects, research, or roles.