👋 Welcome to Banghao’s Blog

Hi! This is Banghao Chi, an M.S. CS student at the Siebel School of Computing and Data Science, part of the Grainger College of Engineering at the University of Illinois at Urbana-Champaign, advised by Prof. Minjia Zhang.

· I’m documenting my learning notes in this blog 😄
· This is a space where I will mostly be sharing notes on Computer Vision & NLP 🙂
· I also work on full-stack development with React, Golang and Spring Boot (^▽^)

My macOS Development Environment: A Comprehensive Dotfiles Guide

After countless hours of tweaking and optimizing, I’ve finally crafted a macOS development environment that perfectly suits my workflow. This blog post takes you through my complete dotfiles setup, explaining each component and how they work together to create a productive, efficient development experience.
🎯 Philosophy
My dotfiles are built around three core principles:
· Efficiency: Everything should be accessible with minimal keystrokes
· Aesthetics: A beautiful terminal environment that inspires creativity
· Automation: Reduce repetitive tasks through smart automation
🛠️ Core Tools Overview
Here’s the complete arsenal of tools that power my development environment:...

LLMarking: Adaptive Automatic Short-Answer Grading Using Large Language Models

This is the official repo for LLMarking, an Automatic Short Answer Grading (ASAG) project from Xi’an Jiaotong-Liverpool University (XJTLU). Using vLLM as the Large Language Model (LLM) inference framework and FastAPI as the HTTP service framework, this project can achieve high throughput in both LLM token generation and request handling.
Feature
This project aims to build a high-concurrency automatic short answer grading (ASAG) system and implement the construction of service....

Let's build GPT from scratch with BPE!

1. Workshop Description
Quick question: Have you ever wondered how a string gets transformed into a word vector so that it can be fed into a machine learning algorithm? In this workshop, we are going to dive into the fascinating world of Natural Language Processing (NLP), focusing on the Byte Pair Encoding (BPE) algorithm. We will discover how this powerful technique segments text into subword units, enabling efficient representation of words as vectors....
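To make the idea of segmenting text into subword units concrete, here is a minimal sketch of byte-level BPE training. It is not the workshop's own code: the function names (count_pairs, merge_pair, train_bpe) and the choice to start from raw UTF-8 bytes with new ids beginning at 256 are assumptions made for illustration.

```python
from collections import Counter


def count_pairs(ids):
    """Count how often each adjacent pair of token ids occurs."""
    return Counter(zip(ids, ids[1:]))


def merge_pair(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    merged, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            merged.append(new_id)
            i += 2
        else:
            merged.append(ids[i])
            i += 1
    return merged


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges, starting from UTF-8 bytes (ids 0..255).

    Assumption for this sketch: new tokens get ids 256, 257, ... in merge order.
    """
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        pairs = count_pairs(ids)
        if not pairs:
            break  # fewer than two tokens left, nothing to merge
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        new_id = 256 + step
        ids = merge_pair(ids, best, new_id)
        merges[best] = new_id
    return ids, merges


ids, merges = train_bpe("aaabdaaabac", num_merges=3)
print(len("aaabdaaabac"), "bytes ->", len(ids), "tokens")
```

Each merge replaces the most frequent adjacent pair with a single new token, so frequent character sequences gradually become subword units while rare ones stay split into bytes.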

IoT-Enabled Home Security Camera

Video GitHub
1. Motivation
Facial recognition technology has become prevalent in all areas of life. Whether you work in security or law enforcement, or manufacture personal devices, the presence of facial recognition for various purposes is evident. Our project seeks to dive into this increasingly common technology and apply it to a place that needs upgrades, such as banks. Many banks run old applications or use outdated technology, limiting the effectiveness of their work....

Quantization on CenterPoint

Take mmdetection as an example. First, find the Runner class, which is where the model is built:
    class Runner:
        def __init__(...):
            ... ...
            self.model = self.build_model(model)
            # wrap model
            self.model = self.wrap_model(
                self.cfg.get('model_wrapper_cfg'), self.model)
            # get model name from the model class
            if hasattr(self.model, 'module'):
                self._model_name = self.model.module.__class__.__name__
            else:
                self._model_name = self.model.__class__.__name__
            ... ...
Learn how pytorch-quantization works by diving into its source code.
Code for the quantization function that takes a specific PyTorch model as input: quant_utils....
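The model-name lookup in the Runner excerpt relies on the fact that wrappers such as DistributedDataParallel keep the real model in a .module attribute. The sketch below isolates that pattern with plain Python stand-ins; ToyDetector, ToyWrapper and get_model_name are hypothetical names for illustration and are not part of mmdetection or pytorch-quantization.

```python
class ToyDetector:
    """Hypothetical stand-in for a real detection model."""


class ToyWrapper:
    """Hypothetical wrapper that, like DistributedDataParallel,
    stores the wrapped model in a `module` attribute."""

    def __init__(self, module):
        self.module = module


def get_model_name(model):
    """Return the class name of the underlying model, unwrapping one
    level if the object exposes a `module` attribute."""
    inner = model.module if hasattr(model, "module") else model
    return inner.__class__.__name__


bare = ToyDetector()
wrapped = ToyWrapper(ToyDetector())
print(get_model_name(bare), get_model_name(wrapped))
```

The same unwrapping step matters when modifying a model in place (for example, swapping layers for quantized ones): the changes have to be applied to the inner model rather than to the wrapper.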

Daily Log

3.12
Managed to understand the whole code base of the CLIP repo from OpenAI. Planned to take a look at CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation, to understand how to implement Open-Vocabulary Segmentation (OVS) using CLIP.
3.13
1. DETR
Got a basic understanding of DETR, an awesome end-to-end 2D object detection architecture. Its downsides are:
· Long training period
· Difficulty detecting small objects
but it has advantages in:...

CATseg: A complete walk through of the model architecture

1. Model architecture setup and evaluation data flow (for ade150k)
CATSeg setup:
· backbone: D2SwinTransformer -> Swintransformer -> BasicLayer(2) -> SwinTransformerBlock -> WindowAttention
· sem_seg_head: CATSegHead.from_config -> CATSegPredictor -> Load CLIP model -> Load text templates -> class_embeddings(self.class_texts, prompt_templates, clip_model) -> for each class: BPE-encode the class name in the different templates and save the results in the variable texts (80 (number of templates), 77 (sentence length)).
CLIP encodes texts:
· texts go through token_embedding (nn.Embedding): (80, 77, 768 (hidden_dim))
· texts go through 12 layers of ResidualAttentionBlock: (80, 77, 768)
· take the features of texts at the eot_token: (80, 768)
· do the above for all classes: (150 (number of test classes), 80, 768)...
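The step that reduces (80, 77, 768) to (80, 768) picks, for each template, the feature vector at the end-of-text position. In CLIP the EOT token has the highest id in each sequence, so that position is the argmax of the token ids. Below is a toy pure-Python sketch of this pooling with tiny dimensions (2 templates, 4 positions, 3 features); pool_at_eot and the sample data are hypothetical and only illustrate the indexing.

```python
def pool_at_eot(token_ids, features):
    """Select one feature vector per sequence, at the EOT position.

    token_ids: [num_templates][seq_len] integer ids
    features:  [num_templates][seq_len][hidden_dim] per-position features
    Returns:   [num_templates][hidden_dim]
    """
    pooled = []
    for ids, feats in zip(token_ids, features):
        # argmax over the sequence: EOT has the highest id, padding is 0
        eot_pos = max(range(len(ids)), key=ids.__getitem__)
        pooled.append(feats[eot_pos])
    return pooled


# Toy data: 49406 / 49407 are CLIP's start / end-of-text ids;
# the ids in between are arbitrary placeholders.
token_ids = [
    [49406, 320, 49407, 0],     # EOT at position 2, then padding
    [49406, 320, 1125, 49407],  # EOT at position 3
]
# Feature at template t, position p is [t, p, 0] so the pick is easy to read.
features = [[[t, p, 0] for p in range(4)] for t in range(2)]

pooled = pool_at_eot(token_ids, features)
print(pooled)
```

Repeating this for every class and stacking the results gives the (number of classes, number of templates, hidden_dim) tensor described above.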

Argparse: A User-friendly Tool to Write CLI Interface

1. Introduction
Hello fellows! Today I’m excited to share insights about the argparse module, a robust and intuitive tool for creating command-line interfaces in Python. What makes argparse particularly fascinating to me is that it lets users quickly run Python scripts with custom configurations and functionalities, without the need to dive into the underlying source code. This feature has captured my interest, and it showcases the module's value in making Python files reusable and accessible for diverse applications....
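As a quick illustration of configuring a script without touching its source, here is a minimal argparse sketch. The script's purpose and the argument names (data, --epochs, --lr, --verbose) are made up for this example; only the argparse calls themselves come from the standard library.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical training script."""
    parser = argparse.ArgumentParser(
        description="Toy training script (hypothetical example)."
    )
    # Positional argument: required, identified by position.
    parser.add_argument("data", help="path to the input data")
    # Optional arguments: typed, with defaults.
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of training epochs")
    parser.add_argument("--lr", type=float, default=1e-3,
                        help="learning rate")
    # Boolean flag: False unless --verbose is passed.
    parser.add_argument("--verbose", action="store_true",
                        help="print extra information")
    return parser


# Passing an explicit list parses it instead of the real command line,
# which is handy for demos and tests.
args = build_parser().parse_args(["train.csv", "--epochs", "3"])
print(args.data, args.epochs, args.lr, args.verbose)
```

In a real script you would call parse_args() with no arguments so that it reads sys.argv, and users could run the script with --help to see the generated usage message.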

Real-time Object Recognition in Chess: Personalized Tuning and Hardware Acceleration

1. Selected and customized the YOLOv5 model for Chinese chess annotation data.
2. Conducted testing and analysis of the model. The results indicated exceptional accuracy in recognition capabilities. However, a significant shortfall was identified in efficiency: the model took approximately 6 seconds to process a single image.
3. Implemented model optimization. We substituted the YOLOv5 model with a more lightweight variant, YOLOv5-lite, and converted the model into the ONNX format to leverage hardware acceleration, thereby enhancing computational efficiency....