• Dissecting Batching Effects in GPT Inference

    Machine learning models relying on batching to improve inference throughput, especially for smaller computer vision models such as ResNet and DenseNet. GPT, as well as other large language models (LLMs), is the hottest model these days. Does batching still apply to GPT and LLMs? Let’s find out.

    Read on →

  • How we discovered why C++ exceptions disappear in stack trace

    Every time my code crashes, I rely on core dump files to find out where the crash happened. (See my previous post on how to produce and use a core dump file.) My debugging life has been happy thanks to this approach, until this time. When I loaded GDB with the core dump, I’m so disappointed that all the stack trace is about some system library, none about mine.

    TLDR: Check this patch.

    Let’s start the journey.

    Read on →

  • Coq Tricks for Beginners with Too Many Examples

    The first time I heard Coq was in college. Since then I always wanted to learn Coq but didn’t know how to. Lots ofhypotheses the tutorials are about logic, which doesn’t seem interesting to me.

    I took UW CSE505 last quarter. Prof. Zach Tatlock and the TA Talia Ringer made some wonderful homework to help us learn Coq. While in this course, we proved more practical things, like a programming language interpreter and a regular expression matcher. I actually found myself feeling quite good about proving things in Coq (writing specifications is a completely different other thing), at least when doing homework.

    If you are teaching yourself Coq, I’d suggest the following resources:

    • Homework of UW CSE505
    • Formal Reasoning About Programs is the textbook we used in class. It’s a short textbook with big margins and lots of whitespaces. It comes with detailed Coq code for each chapter and a very powerful Coq library frap.
    • Software Foundations is the material for you if you are really into verification.

    I’m quite happy that I completed Coq in my wish list. But I’m not a huge believer of machine-checked proof or verification. Before I fully forgot Coq, let me write down some beginner-level tricks I learned when doing homework, just in case you might find it useful or I come back to Coq some decades later.

    Read on →

  • Damn Thread, Don't be such a dog in the manger

    Take a close look at the following code:

    Read on →

  • Sharding: A General Method to Improve Lock Granularity in a Brute-force Way

    Example: Concurrent Hashmap

    Suppose we need a concurrent hashmap. I know there are lots of libraries, but let’s assume we need to write it by ourselves. For simplicity, we only support three kinds of operations: get, put and remove.

    Trivial Solution

    We can use std::unordered_map, but it is not thread-safe. So, the most straightforward solution would be to have a big lock to protect it. It’s simple, just like this:

    Read on →