I’m Alexis Huet. I received a PhD in mathematics in 2014 from the University of Lyon, France (after passing the agrégation de mathématiques in 2009). I work in data science and have been at Huawei since 2018, after three years in Nanjing. I primarily code in Python and R. On this site, I write detailed posts about maths, machine learning, and some of my own projects.
Contact me
alexis.huet.phd@gmail.com
Profiles
Google Scholar • LinkedIn • GitHub • X
Reviewing
Conferences: KDD (2024–2026, Excellence Award), ICLR (2025–2026), ACML (2024–2025), CIKM (2025)
Journals: IEEE TPAMI, ACM TKDD, IEEE TNSM, Springer Machine Learning
Invited talks
- Keynote, XTempLLMs Workshop @ COLM 2025
Patents
- 15+ patents filed
Languages
Besides French and English, I am also learning Chinese, and I passed HSK level 5. Here is a recent presentation I gave in an INALCO class.
Selected publications
These are some papers I enjoyed writing. I like it when other researchers use the algorithms, benchmarks, or metrics we built and take them further. I care most about methodological aspects, metrics, and evaluation.
Episodic memory generation and evaluation benchmark for large language models
ICLR 2025 • [paper] • [poster] • [code]
A benchmark for evaluating episodic memory in long-context LLMs. Questions require retrieving multiple events across chapters, for example, all entities seen at a given location. I enjoyed modeling events as (time, space, entity, content) tuples and designing the metric. For the broader vision connecting episodic memory to computational neuroscience, see Zied’s home page.
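As a toy illustration of that event model (names and fields here are my own simplification, not the benchmark’s code), events can be represented as tuples and a location cue answered by a simple scan:

```python
# Illustrative sketch of (time, space, entity, content) events and a
# retrieval query of the kind "all entities seen at a given location".
from collections import namedtuple

Event = namedtuple("Event", ["time", "space", "entity", "content"])

events = [
    Event("day 1", "library", "Alice", "borrowed a book"),
    Event("day 1", "library", "Bob", "read the news"),
    Event("day 2", "market", "Alice", "bought apples"),
]

def entities_seen_at(events, location):
    """All entities involved in at least one event at the given location."""
    return sorted({e.entity for e in events if e.space == location})

print(entities_seen_at(events, "library"))  # ['Alice', 'Bob']
```

The benchmark questions are harder because the relevant events are scattered across long chapters, but the target answer has this aggregated form.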
Changepoint detection via subset chains
PAKDD 2025 • [paper] • [presentation] • [code]
A method for detecting change points in time series at multiple levels of granularity. Existing methods require setting a threshold or penalty that is hard to choose well. The idea here is to split the problem in two: first score every point by how important it is as a potential change, then threshold recursively to reveal changes level by level, from major to minor. This also matches how humans see it: different annotators label different levels of detail, and each finds a level that fits.
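As a toy sketch of the second step (my own simplification, not the paper’s algorithm): given an importance score per time point, a decreasing sequence of thresholds reveals change points level by level, each level a superset of the previous one.

```python
# Toy sketch: recursive thresholding of per-point importance scores,
# revealing changes from major to minor.
import numpy as np

def changes_by_level(scores, thresholds):
    """For each threshold (taken in decreasing order), return the indices
    whose score exceeds it: major changes first, minor ones later."""
    return [np.flatnonzero(scores > th) for th in sorted(thresholds, reverse=True)]

scores = np.array([0.1, 0.9, 0.2, 0.5, 0.1, 0.7, 0.3])
for level, idx in enumerate(changes_by_level(scores, [0.8, 0.6, 0.4])):
    print(f"level {level}: change points at {idx}")
```

Each annotator can then stop at the level matching the granularity they care about.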
Local evaluation of time series anomaly detection algorithms
KDD 2022 • [paper] • [poster] • [code]
A parameter-free metric for evaluating time series anomaly detection algorithms. Each ground-truth event gets its own precision and recall with interpretable distances, which makes it easy to visualize what the detector actually got right or wrong. The metric satisfies theoretical properties that ensure a discriminative ranking between algorithms (which can be tested, e.g., with autorank).
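A toy, per-event view of the idea (a deliberate simplification of mine, not the paper’s distance-based metric): each ground-truth event is affiliated with the predictions closest to it and gets its own local precision and recall.

```python
# Toy per-event evaluation: each ground-truth interval is scored from the
# predicted points falling near it (simplified; not the paper's metric).
def local_scores(gt_events, predictions, horizon):
    scores = []
    for start, end in gt_events:
        nearby = [p for p in predictions if start - horizon <= p <= end + horizon]
        inside = [p for p in nearby if start <= p <= end]
        precision = len(inside) / len(nearby) if nearby else 0.0
        recall = min(1.0, len(set(inside)) / (end - start + 1))  # crude coverage
        scores.append((precision, recall))
    return scores

gt = [(10, 14), (40, 42)]
pred = [11, 12, 16, 41, 60]
print(local_scores(gt, pred, horizon=5))  # [(0.67, 0.4), (1.0, 0.33)]
```

The per-event breakdown is what makes errors visualizable: one can point at exactly which event was missed or over-predicted.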
Web QoE from encrypted traffic
IFIP Networking 2020, IEEE TNSM 2021 • [paper] • [poster] • [demo]
A way to estimate web browsing quality from encrypted traffic, building on the Byte Index metric introduced in Bocchi et al., 2016. An interesting feature is that the session-level Byte Index decomposes exactly into per-flow contributions (proof in the journal version), enabling routers to compute it online from simple operations.
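To see why such a decomposition can hold, here is a numeric check under the assumption (my reading of the metric) that the Byte Index takes the discretized integral form BI = Σ_t (1 − B(t)/B_total): since B(t) sums the per-flow cumulative bytes, the sum splits exactly into per-flow terms.

```python
# Numeric sketch: the session-level Byte Index equals the sum of
# per-flow contributions (assuming the discretized integral form above).
import numpy as np

# Cumulative bytes received per flow at each time step (toy values).
flows = np.array([
    [100, 400, 400, 400],   # flow 1
    [  0, 200, 500, 600],   # flow 2
], dtype=float)

total = flows.sum(axis=0)          # session-level cumulative bytes B(t)
b_tot = total[-1]                  # total bytes of the session

session_bi = np.sum(1 - total / b_tot)
per_flow = [np.sum((f[-1] - f) / b_tot) for f in flows]

print(session_bi, sum(per_flow))   # both 1.4: exact linear decomposition
```

The per-flow terms involve only each flow’s own byte counts and the session total, which is what makes an online, in-router computation plausible.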
Posts about deep learning
- RNN with Keras: Predicting time series. A complete introduction to time series prediction with RNNs. This tutorial was written to answer a Stack Overflow post and was later used in a real-world context.
- RNN with Keras: Understanding computations. Highlights the structure of common RNN architectures by following the computations carried out by each model, with a clear summary of code, math equations, and diagrams. A sketch illustrating both posts follows this list.
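Here is a minimal sketch in the spirit of both posts (toy data and sizes of my choosing, not the tutorials’ code): train a SimpleRNN to predict the next point of a sine wave, then redo the recurrence h_t = tanh(x_t W + h_{t-1} U + b) by hand.

```python
# Train a small next-step predictor, then follow its computations manually.
import numpy as np
import tensorflow as tf

x = np.sin(np.arange(0, 100, 0.1)).astype("float32")
window = 10
X = np.stack([x[i:i + window] for i in range(len(x) - window)])[..., None]
y = x[window:]

rnn = tf.keras.layers.SimpleRNN(8)
dense = tf.keras.layers.Dense(1)
model = tf.keras.Sequential([rnn, dense])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)

# By hand: h_t = tanh(x_t @ W + h_{t-1} @ U + b), the SimpleRNN recurrence.
W, U, b = rnn.get_weights()                 # input, recurrent, bias weights
h = np.zeros((1, 8), dtype="float32")
for t in range(window):
    h = np.tanh(X[:1, t] @ W + h @ U + b)
kernel, bias = dense.get_weights()
print(h @ kernel + bias, model.predict(X[:1], verbose=0))  # match up to rounding
```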
Posts about maths in data science
- Optimizing GMM parameters using EM. A description of GMMs, how to update the parameters using EM, and an illustration on a simple example. Unlike many other sources, I fully detail the parameter updates using the gradient and Hessian.
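For reference, a minimal EM loop for a two-component 1-D GMM (an illustrative sketch; the post derives these closed-form updates in full):

```python
# E-step: responsibilities; M-step: closed-form weight/mean/variance updates.
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 200)])

w = np.array([0.5, 0.5])            # mixture weights
mu = np.array([-1.0, 1.0])          # component means
var = np.array([1.0, 1.0])          # component variances

for _ in range(50):
    # E-step: r[i, k] proportional to w_k * N(x_i; mu_k, var_k).
    dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    r = w * dens
    r /= r.sum(axis=1, keepdims=True)
    # M-step: closed-form updates.
    nk = r.sum(axis=0)
    w = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print(w, mu, var)                   # approximately [0.6 0.4], [-2 3], [1 1]
```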
- Rediscover the EM algorithm from scratch. Many introductions to EM exist on the web; this one starts from the likelihood computation problem and uses inductive reasoning to bring out EM.
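The identity at the heart of that rediscovery can be stated in one line (the standard decomposition, in my notation, for any distribution q over the latent variable z):

```latex
\log p(x;\theta)
  = \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x,z;\theta)}{q(z)}\right]}_{\text{lower bound } \mathcal{L}(q,\theta)}
  + \underbrace{\mathrm{KL}\!\left(q(z)\,\|\,p(z \mid x;\theta)\right)}_{\ge 0}
```

The E-step makes the KL term vanish by choosing q as the posterior; the M-step maximizes the lower bound over θ.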
- Computation of the gradient for SNE. A fully detailed derivation of the gradient of the SNE algorithm.
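For reference, the SNE gradient (as in Hinton and Roweis, with conditional similarities p_{j|i} in the input space and q_{j|i} in the embedding) takes the compact form:

```latex
\frac{\partial C}{\partial y_i}
  = 2 \sum_{j} \left( p_{j\mid i} - q_{j\mid i} + p_{i\mid j} - q_{i\mid j} \right) (y_i - y_j)
```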
- An illustration of the Metropolis–Hastings algorithm. A toy example for understanding the Metropolis–Hastings algorithm.
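A minimal random-walk Metropolis–Hastings sketch on a made-up unnormalized density (illustrative, not the post’s example):

```python
# Random-walk MH: propose, accept with probability min(1, ratio), repeat.
import numpy as np

rng = np.random.default_rng(0)
target = lambda x: np.exp(-0.5 * x**2) * (1 + np.sin(3 * x) ** 2)  # unnormalized

x, samples = 0.0, []
for _ in range(10_000):
    proposal = x + rng.normal(0, 1)                   # symmetric proposal
    if rng.uniform() < target(proposal) / target(x):  # acceptance test
        x = proposal                                  # accept; else keep x
    samples.append(x)

print(np.mean(samples), np.std(samples))
```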
- Maximizing likelihood is equivalent to minimizing KL divergence. Restating this classic equivalence in my own words.
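The equivalence in one line, writing p̂ for the empirical distribution of the sample:

```latex
\arg\max_{\theta} \frac{1}{n}\sum_{i=1}^{n} \log p_\theta(x_i)
  = \arg\max_{\theta} \mathbb{E}_{\hat{p}}\left[\log p_\theta(x)\right]
  = \arg\min_{\theta} \mathrm{KL}\!\left(\hat{p} \,\|\, p_\theta\right)
```

It holds because KL(p̂ ∥ p_θ) = E_p̂[log p̂] − E_p̂[log p_θ], and the entropy term does not depend on θ.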
- Introduction to particle filters. An introduction to particle filters, with a homemade example of trajectory tracking.
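A minimal bootstrap particle filter on a toy 1-D tracking problem (illustrative; the post uses its own trajectory example):

```python
# Bootstrap particle filter: propagate, weight by likelihood, resample.
import numpy as np

rng = np.random.default_rng(0)
T, N = 50, 1000                      # time steps, number of particles

# Hidden random walk observed with Gaussian noise (sd = 2).
true_x = np.cumsum(rng.normal(0, 1, T))
obs = true_x + rng.normal(0, 2, T)

particles = np.zeros(N)
estimates = []
for t in range(T):
    particles += rng.normal(0, 1, N)                  # propagate (state model)
    w = np.exp(-0.5 * (obs[t] - particles) ** 2 / 4)  # likelihood weights
    w /= w.sum()
    estimates.append(w @ particles)                   # weighted posterior mean
    particles = particles[rng.choice(N, size=N, p=w)] # resample

print(np.mean(np.abs(np.array(estimates) - true_x)))  # mean tracking error
```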
- Introduction to hidden Markov models. An introduction to hidden Markov models on finite state spaces, following the tutorial of L. R. Rabiner.
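As a taste, a sketch of the forward algorithm, the first of Rabiner’s three problems (toy matrices of my choosing):

```python
# Forward algorithm: likelihood of an observation sequence under a discrete HMM.
import numpy as np

A = np.array([[0.7, 0.3],      # state transition matrix a_ij
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],      # emission matrix b_j(k), rows are states
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])      # initial state distribution
obs = [0, 1, 1, 0]             # observed symbol indices

alpha = pi * B[:, obs[0]]      # initialization: alpha_1(i) = pi_i b_i(o_1)
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]  # induction over time
print(alpha.sum())             # P(observations | model)
```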
Own projects
- Langton’s ant extended to Voronoi tessellations. A program extending Langton’s ant to any Voronoi tessellation of the plane. Simulations show interesting walks for some partitions of the plane, including chaotic structures, highway patterns, and even bounded evolutions.
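For context, the classic square-grid rule that the project generalizes (this sketch is the standard ant, not the Voronoi version): turn right on a white cell, left on a black one, flip the cell, step forward.

```python
# Classic Langton's ant on a sparse square grid.
from collections import defaultdict

black = defaultdict(bool)            # cell -> is black (default white)
pos, direction = (0, 0), (0, 1)      # start at origin, heading "north"

for _ in range(11_000):              # after ~10k steps the highway emerges
    if black[pos]:
        direction = (-direction[1], direction[0])   # turn left on black
    else:
        direction = (direction[1], -direction[0])   # turn right on white
    black[pos] = not black[pos]                     # flip the cell's color
    pos = (pos[0] + direction[0], pos[1] + direction[1])

print(sum(black.values()), "black cells")
```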
- Anabasis webapp. A webapp where players can draw collaborative paintings. It was built with Node.js and MongoDB, and hosted on Heroku and mLab. An analysis of the collected data is also available in this post.
- Trigger snake. A challenging snake game built in C++/Qt4.
- Nim function for the take-a-prime game. Simulation of a recursive math sequence with interesting patterns, accelerated using C++.
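Assuming (my reading of the title) the take-a-prime game is the subtraction game where each move removes a prime number of tokens, the Nim function is its Sprague–Grundy sequence, computable directly:

```python
# Grundy values of the subtraction game with prime moves (plain Python;
# the project accelerates this kind of computation in C++).
def primes_up_to(n):
    sieve = [False, False] + [True] * (n - 1)
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def grundy(n_max):
    primes = primes_up_to(n_max)
    g = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        reachable = {g[n - p] for p in primes if p <= n}
        m = 0
        while m in reachable:        # mex: minimum excluded value
            m += 1
        g[n] = m
    return g

print(grundy(30))
```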
- Coal: Composition of Linear Functions. A program automating the composition of linear functions. The gmp package is used to keep exact results for large rational numbers.
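The core operation, sketched with Python’s exact Fraction standing in for gmp: composing f(x) = ax + b with g(x) = cx + d gives (ac)x + (ad + b).

```python
# Exact composition of linear functions, each stored as a (slope, intercept) pair.
from fractions import Fraction

def compose(f, g):
    """f o g: if f = (a, b) and g = (c, d), then f(g(x)) = (a*c)*x + (a*d + b)."""
    a, b = f
    c, d = g
    return (a * c, a * d + b)

f = (Fraction(1, 3), Fraction(2))     # x/3 + 2
g = (Fraction(5, 7), Fraction(-1, 4)) # 5x/7 - 1/4
print(compose(f, g))                  # exact: (5/21) x + 23/12
```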
- Triangle pursuit. A program computing recurrent sequences, with generalizations to different rules, different norms, larger numbers of initial points, and higher dimensions.
- Gender of French nouns. Check out the distribution of the gender of French nouns across the letters.
- Description and modeling of FlapMMO score data. FlapMMO is an online game similar to Flappy Bird. This post explores a collected dataset of scores, using descriptive statistics and testing probabilistic models.
- Watering and draining planets. What would the Moon, Mars, and Venus look like with as much water, in proportion, as on the Earth?
- An enigma. Could you find the missing symbol?