Daniel Guo clides

Daniel Guo 👋

Computer Science w/ AI Specialization @ University of Waterloo

Strong interest in AI/ML systems, with hands-on experience in Deep Learning, Natural Language Processing, Information Retrieval, Computer Vision, and MLOps
Passionate about developing and using open-source tools
Average nvim enjoyer

RankLLM (Python toolkit for reproducible information retrieval research using rerankers):

One of the top contributors to this toolkit
Designed and implemented customizable prompt template feature, replacing hardcoded prompts and response analysis with dynamic configurations to improve extensibility and maintainability while ensuring backward compatibility
Designed and implemented an optimized multi-tier caching system for first-stage retrieval results, combining local file caching, HuggingFace Hub fallback, and on-demand Pyserini retrieval to minimize redundant computation
Developed other features, including few-shot example injection and vLLM integration for multi-GPU support
Helped write unit tests and ran regression tests to update regression scores
Updated RankLLM implementation and usage in other popular repos such as LangChain, rerankers, and LlamaIndex
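
The tiered lookup described above (local file, then a remote fallback, then on-demand retrieval) can be sketched roughly as follows. This is a minimal illustration of the idea, not RankLLM's actual code: `get_results`, `hub_tier`, `retrieve_tier`, and the cache key are hypothetical names, and the two remote tiers are stand-in functions rather than real HuggingFace Hub or Pyserini calls.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Optional

# Hypothetical types: a tier is any callable that returns results or None.
Results = List[Dict]
Tier = Callable[[str], Optional[Results]]


def get_results(key: str, cache_dir: Path, fallbacks: List[Tier]) -> Results:
    """Try the local file cache first, then each fallback tier in order.

    Whatever a fallback returns is written to the local cache, so the next
    call for the same key is served from disk.
    """
    path = cache_dir / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())
    for tier in fallbacks:
        results = tier(key)
        if results is not None:
            cache_dir.mkdir(parents=True, exist_ok=True)
            path.write_text(json.dumps(results))
            return results
    raise LookupError(f"no tier could produce results for {key!r}")


# Stand-ins for the remote tiers (assumptions, not real API calls).
calls: List[str] = []


def hub_tier(key: str) -> Optional[Results]:
    calls.append("hub")
    return None  # pretend the file is not available remotely


def retrieve_tier(key: str) -> Optional[Results]:
    calls.append("retrieve")
    return [{"docid": "d1", "score": 1.0}]  # pretend this ran a search


cache_dir = Path(tempfile.mkdtemp())
first = get_results("example-key", cache_dir, [hub_tier, retrieve_tier])
second = get_results("example-key", cache_dir, [hub_tier, retrieve_tier])
```

In this sketch the second call never reaches the fallback tiers, because the first call wrote its results to the local cache. Ordering the tiers from cheapest to most expensive is what avoids the redundant computation.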

Pyserini (Python toolkit for reproducible information retrieval research with sparse and dense representations):

Integrated the M-BEIR dataset and UniIR models into the Pyserini pipeline for multimodal retrieval
Added support for sparse vector encoding with SPLADE models to the pipeline
Wrote documentation for various regression tests and computed their scores

[UniIR-for-Pyserini](https://github.com/clides/UniIR-for-Pyserini) (Fork of the original UniIR repo for easy Pyserini integration):

Created and released the uniir-for-pyserini package on PyPI, a fork of the original repo modified for easy Pyserini integration

Anserini (Lucene toolkit for reproducible information retrieval research):

Wrote documentation for various regression tests and computed their scores
Built indexes and uploaded them to HuggingFace datasets for easy retrieval

Email: daniel168.guo@gmail.com
Resume: here
LinkedIn: linkedin/daniel-guo
Open to: ML/SWE internship and co-op opportunities, research collaborations, and open-source contributions