Commit 3b924de

[doc] Add the blog of wpoe

1 parent 1d16e47 commit 3b924de

File tree

2 files changed: +26 -1 lines changed


_posts/2025-10-11-max-ent-rl.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ---
 layout: distill
-title: Why the Exponential? From Max‑Entropy RL to the Boltzmann Distribution
+title: Why the Exponential? From Max‑Entropy RL to the Boltzmann Distribution
 description: This blog post explores why the exponential function appears ubiquitously across modern RL, energy-based modeling, and statistical mechanics. We examine the connection between max-entropy reinforcement learning and the Boltzmann distribution, uncovering the fundamental principles that make the exponential form inevitable and explaining what "temperature" actually does in these frameworks.
 tags: reinforcement-learning information-theory boltzmann-distribution
 giscus_comments: true

_posts/2025-11-09-weighted-poe.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+---
+layout: distill
+title: Test-Time Steering for Lossless Text Compression via Weighted Product of Experts
+description: >
+  When I was a child, I always wondered: if I keep compressing the same file, will it eventually shrink to nothing? Of course, the answer is no—once a file is optimally compressed by a lossless compressor, compressing it again with the same method gives a file of exactly the same size. Today I know this comes from the fundamental limits of lossless compression in information theory. But what if we use multiple compressors instead of one? If we combine them, can each remove a different part of the data’s redundancy—and how should such a combination be designed? In this blog we discuss the above questions and propose a method called Weighted Product of Experts.
+tags: large-language-models lossless-compression mixture-of-experts information-theory
+giscus_comments: true
+date: 2025-11-09
+featured: true
+redirect: https://qihang-zhang.com/Learning-Sys-Blog/2025/10/15/weighted-product-of-experts.html
+
+authors:
+  - name: Qihang Zhang
+    url: "https://qihang-zhang.com/"
+    affiliations:
+      name: UBC
+
+---
+
+<script>
+window.location.replace("https://qihang-zhang.com/Learning-Sys-Blog/2025/10/15/weighted-product-of-experts.html");
+</script>
+
+If you are not redirected automatically, you can read the full post here:
+[Test-Time Steering for Lossless Text Compression via Weighted Product of Experts](https://qihang-zhang.com/Learning-Sys-Blog/2025/10/15/weighted-product-of-experts.html).
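The front matter above only points at the full post, but the idea named in the title can be sketched briefly: a weighted product of experts combines each expert's next-symbol predictive distribution geometrically, p(x) ∝ ∏ᵢ pᵢ(x)^wᵢ, then renormalizes so the result is again a valid distribution (e.g. one an arithmetic coder could consume). A minimal NumPy sketch; the function name, weights, and toy distributions are my own illustration, not code from the linked post:

```python
import numpy as np

def weighted_product_of_experts(expert_probs, weights):
    """Combine next-symbol distributions via a weighted product of experts:
    p(x) proportional to prod_i p_i(x) ** w_i, renormalized to sum to 1.

    expert_probs: shape (n_experts, vocab_size), each row a distribution.
    weights: shape (n_experts,), the per-expert steering weights w_i.
    """
    log_p = np.log(np.asarray(expert_probs, dtype=float))
    # Weighted sum of log-probs == log of the weighted geometric mixture.
    combined = np.exp(np.tensordot(np.asarray(weights, dtype=float), log_p, axes=1))
    return combined / combined.sum()

# Two toy "experts" over a 3-symbol alphabet, mixed with equal weights:
p1 = [0.7, 0.2, 0.1]
p2 = [0.2, 0.6, 0.2]
mix = weighted_product_of_experts([p1, p2], weights=[0.5, 0.5])
```

Symbols on which both experts agree keep high mass, while mass the experts disagree on is suppressed, which is why such a combination can remove different parts of the redundancy than any single expert alone.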
