Commit f7c92ed

Add virtual trees article and tests
1 parent 82a06ff commit f7c92ed

3 files changed

+324
-0
lines changed

src/graph/virtual_trees.md

Lines changed: 299 additions & 0 deletions
@@ -0,0 +1,299 @@
---
title: Virtual Trees
tags:
  - Original
---

This article introduces *virtual trees*, proves their key properties, describes an efficient algorithm for constructing them, and then applies the technique to a problem solved with dynamic programming (DP) on trees.

## Introduction

Many tree problems require working with a subset of vertices while preserving the tree structure induced by their pairwise lowest common ancestors (LCA). A *virtual tree* lets us replace the original tree $T$ with $N$ vertices by a compressed structure whose size is linear in the size of the selected set $X$. This significantly speeds up algorithms, particularly DP computations.

## Definition of a Virtual Tree

Let $T$ be a given rooted tree, and let $X$ be a subset of its vertices. The vertex set of the **virtual tree** $A(X)$ is defined as

$$ A(X) = \{ \operatorname{lca}(x,y) \mid x,y \in X \},$$

where $\operatorname{lca}(x,y)$ denotes the lowest common ancestor of vertices $x$ and $y$ in tree $T$. Since $\operatorname{lca}(x,x) = x$, we have $X \subseteq A(X)$. Each vertex of $A(X)$ other than the topmost one is connected by an edge to its nearest proper ancestor in $T$ that also belongs to $A(X)$.

This definition ensures that the structure includes all vertices "important" for analyzing $X$ (i.e., all LCAs of pairs of vertices from $X$), and that the ancestor relation is inherited from the original tree.

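As a small illustration of the definition (the tree here is a hypothetical example, not taken from the problem discussed later), let vertex $1$ be the root with children $2$ and $3$, let $2$ have children $4$ and $5$, let $3$ have children $6$ and $7$, and take $X = \{4, 5, 6\}$:

$$
\begin{aligned}
\operatorname{lca}(4,5) &= 2, \qquad \operatorname{lca}(4,6) = \operatorname{lca}(5,6) = 1, \\
A(X) &= \{1, 2, 4, 5, 6\}.
\end{aligned}
$$

The virtual edges are $1$-$2$, $2$-$4$, $2$-$5$ and $1$-$6$. Vertex $3$ does not belong to $A(X)$, so the edge $1$-$6$ stands for the whole path $1, 3, 6$ of $T$.
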
## Key Properties and Their Proofs

### Property 1: Size Estimation

**Statement.**
For any non-empty set $X$ of vertices of tree $T$, the following inequality holds:

$$
\vert A(X) \vert \le 2 \vert X \vert - 1.
$$

**Proof.**
Fix a depth-first search (DFS) traversal of $T$ and list the vertices of $X$ in order of entry time:

$$ x_1, x_2, \dots, x_m. $$

Denote

$$ l = \operatorname{lca}(x_1, x_2). $$

We first prove the following lemma.

!!! note "Lemma"
    For any integer $n \ge 3$, if $\operatorname{lca}(x_1,x_n) \neq l$, then $\operatorname{lca}(x_1,x_n) = \operatorname{lca}(x_2,x_n).$

??? hint "Proof"
    Suppose that for some $n \ge 3$, both $\operatorname{lca}(x_1,x_n) \neq l$ and $\operatorname{lca}(x_1,x_n) \neq \operatorname{lca}(x_2,x_n)$ hold. Let $k = \operatorname{lca}(x_1,x_n)$. Both $k$ and $l$ are ancestors of $x_1$, so one of them is an ancestor of the other.
    If $k$ is a proper ancestor of $l$, then $x_1$ and $x_2$ lie in the same child subtree of $k$ while $x_n$ does not, so $\operatorname{lca}(x_2,x_n) = k$ as well, which contradicts our assumption. Therefore, $k$ must be a proper descendant of $l$. But then $x_1$ and $x_n$ lie in the subtree of $k$ while $x_2$ lies outside it, and since a subtree occupies a contiguous segment of the DFS order, we would get the sequence $x_1, x_n, x_2$, which contradicts our chosen traversal order.
    This contradiction completes the proof of the lemma.

Returning to the proof of the property, the lemma implies that every LCA involving $x_1$, that is, $\operatorname{lca}(x_1,x_n)$ for $1 \le n \le m$, equals one of:

1. the vertex $x_1$ itself,
2. $\operatorname{lca}(x_1,x_2)$, or
3. $\operatorname{lca}(x_2,x_n)$ for some $n \ge 3$.

Vertices of the third kind already belong to $A(\{x_2,\dots,x_m\})$. Thus, we can recursively write:

$$
A(\{x_1,\dots,x_m\}) = A(\{x_2,\dots,x_m\}) \cup \{ x_1, \operatorname{lca}(x_1,x_2) \}.
$$

From this, it follows that

$$
\vert A(X) \vert \le \vert A(\{x_2,\dots,x_m\}) \vert + 2.
$$

Applying induction on the size of set $X$, with the base case $\vert A(\{x_m\}) \vert = 1$, we obtain the final estimate:

$$
\vert A(X) \vert \le 1 + 2(\vert X \vert - 1) = 2 \vert X \vert - 1.
$$

Moreover, the bound is tight: if tree $T$ is a perfect binary tree and $X$ consists of its leaves, equality holds.

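The perfect binary tree case can be checked with a short count. Assume the tree has height $h$, so that $X$ consists of $m = 2^h$ leaves ($h$ and $m$ are notation introduced only for this check). Every internal vertex is the LCA of one leaf from its left subtree and one leaf from its right subtree, so $A(X)$ is the whole tree:

$$
\vert A(X) \vert = \sum_{d=0}^{h} 2^d = 2^{h+1} - 1 = 2m - 1 = 2 \vert X \vert - 1.
$$
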
### Property 2: Representation through Sequential LCAs

**Statement.**
Let the vertices of $X$ be ordered by DFS entry time: $x_1, x_2, \dots, x_m$. Then

$$
A(X) = X \cup \{ \operatorname{lca}(x_i, x_{i+1}) \mid 1 \le i < m \}.
$$

**Proof.**
We start with the recursive representation obtained in the proof of Property 1:

$$
A(\{x_1,\dots,x_m\}) = A(\{x_2,\dots,x_m\}) \cup \{ x_1, \operatorname{lca}(x_1,x_2) \}.
$$

Expanding it recursively, with $A(\{x_m\}) = \{x_m\}$ at the last step, gives:

$$
\begin{aligned}
A(\{x_1,\dots,x_m\}) &= A(\{x_2,\dots,x_m\}) \cup \{ x_1, \operatorname{lca}(x_1,x_2) \} \\
&= A(\{x_3,\dots,x_m\}) \cup \{ x_1, \operatorname{lca}(x_1,x_2), x_2, \operatorname{lca}(x_2,x_3) \} \\
&\quad \vdots \\
&= \{ x_1, \operatorname{lca}(x_1,x_2), x_2, \dots, x_{m-1}, \operatorname{lca}(x_{m-1},x_m), x_m \}.
\end{aligned}
$$

Thus, we obtain the required representation.

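Property 2 can be sanity-checked on a small tree by comparing the pairwise definition with the consecutive-LCA formula. The sketch below is only an illustration: `SmallTree`, `closure_pairwise` and `closure_consecutive` are hypothetical names, the tree is passed as a parent array with `par[1] == 0` marking the root, and the LCA is found naively by climbing parents rather than with the fast methods discussed later.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper: rooted tree on vertices 1..n given by a parent array.
struct SmallTree {
    std::vector<int> parent, depth, tin;

    explicit SmallTree(const std::vector<int>& par)
        : parent(par), depth(par.size(), 0), tin(par.size(), 0) {
        assert(par.size() >= 2);  // index 0 is unused, vertex 1 is the root
        int n = (int)par.size() - 1;
        std::vector<std::vector<int>> children(n + 1);
        for (int v = 2; v <= n; ++v) children[par[v]].push_back(v);
        int timer = 0;
        std::vector<int> stack = {1};  // iterative preorder DFS
        while (!stack.empty()) {
            int v = stack.back();
            stack.pop_back();
            tin[v] = ++timer;
            // Push children in reverse so that they are entered in stored order.
            for (auto it = children[v].rbegin(); it != children[v].rend(); ++it) {
                depth[*it] = depth[v] + 1;
                stack.push_back(*it);
            }
        }
    }

    // Naive LCA: repeatedly lift the deeper vertex, O(depth) per query.
    int lca(int x, int y) const {
        while (x != y) {
            if (depth[x] < depth[y]) std::swap(x, y);
            x = parent[x];
        }
        return x;
    }
};

// A(X) straight from the definition: LCAs of all pairs.
std::set<int> closure_pairwise(const SmallTree& t, const std::vector<int>& xs) {
    std::set<int> result;
    for (int x : xs)
        for (int y : xs) result.insert(t.lca(x, y));
    return result;
}

// A(X) via Property 2: X plus LCAs of neighbours in entry-time order.
std::set<int> closure_consecutive(const SmallTree& t, std::vector<int> xs) {
    std::sort(xs.begin(), xs.end(), [&](int a, int b) { return t.tin[a] < t.tin[b]; });
    std::set<int> result(xs.begin(), xs.end());
    for (std::size_t i = 0; i + 1 < xs.size(); ++i)
        result.insert(t.lca(xs[i], xs[i + 1]));
    return result;
}
```

For the seven-vertex tree with parent array `{0, 0, 1, 1, 2, 2, 3, 3}` and $X = \{6, 4, 5\}$, both functions return $\{1, 2, 4, 5, 6\}$, using $|X|^2$ and $|X| - 1$ LCA queries respectively.
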
## Construction of a Virtual Tree

Given a set of vertices $X$, the virtual tree $A(X)$ can be constructed in time

$$
O\left(|X| (\log |X| + \log N)\right),
$$

after preprocessing the original tree $T$ in $O(N \log N)$ to support fast LCA queries.

**Main construction steps:**

1. **Preprocessing.**
   Run a depth-first search (DFS) on $T$ to compute the entry time of every vertex, and prepare a structure for LCA queries, such as binary lifting or heavy-light decomposition (HLD).

2. **Vertex Sorting.**
   Sort the vertices of set $X$ by entry time, i.e. in the order in which the DFS visits them.

3. **Adding LCAs.**
   For each pair of consecutive vertices $x_i$ and $x_{i+1}$, compute $\operatorname{lca}(x_i,x_{i+1})$. By Property 2, the union of $X$ and these LCAs is the vertex set $A(X)$.

4. **Tree Construction.**
   Sort $A(X)$ by entry time and scan it while maintaining a stack that holds the path from the virtual root to the most recently processed vertex. For each new vertex, pop elements until the top of the stack is an ancestor of the new vertex, connect the new vertex to that ancestor, and push it onto the stack.

The resulting virtual tree contains $O(|X|)$ vertices (at most $2|X| - 1$ by Property 1), which allows DP to be applied efficiently.

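The four steps above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the implementation used later in the article: `RootedTree` and `build_virtual_tree` are hypothetical names, the tree is given as a parent array with `par[1] == 0`, the ancestor test uses entry and exit times, and the LCA is computed naively where a real solution would use binary lifting or HLD.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper: rooted tree on vertices 1..n with entry/exit times.
struct RootedTree {
    std::vector<std::vector<int>> children;
    std::vector<int> parent, depth, tin, tout;
    int timer = 0;

    explicit RootedTree(const std::vector<int>& par)
        : children(par.size()), parent(par), depth(par.size(), 0),
          tin(par.size(), 0), tout(par.size(), 0) {
        assert(par.size() >= 2);  // index 0 is unused, vertex 1 is the root
        for (int v = 2; v < (int)par.size(); ++v) children[par[v]].push_back(v);
        dfs(1);
    }

    void dfs(int v) {  // step 1: entry and exit times
        tin[v] = ++timer;
        for (int to : children[v]) {
            depth[to] = depth[v] + 1;
            dfs(to);
        }
        tout[v] = timer;
    }

    bool is_ancestor(int a, int b) const {  // true when a == b as well
        return tin[a] <= tin[b] && tout[b] <= tout[a];
    }

    int lca(int x, int y) const {  // naive stand-in for a fast LCA structure
        while (x != y) {
            if (depth[x] < depth[y]) std::swap(x, y);
            x = parent[x];
        }
        return x;
    }
};

// Maps every vertex of A(X) to its parent in the virtual tree (0 for the virtual root).
std::map<int, int> build_virtual_tree(const RootedTree& t, std::vector<int> xs) {
    assert(!xs.empty());
    auto by_tin = [&](int a, int b) { return t.tin[a] < t.tin[b]; };
    std::sort(xs.begin(), xs.end(), by_tin);                      // step 2
    std::size_t m = xs.size();
    for (std::size_t i = 0; i + 1 < m; ++i)                       // step 3
        xs.push_back(t.lca(xs[i], xs[i + 1]));
    std::sort(xs.begin(), xs.end(), by_tin);
    xs.erase(std::unique(xs.begin(), xs.end()), xs.end());

    std::map<int, int> virtual_parent;
    std::vector<int> stack;                                       // step 4
    for (int v : xs) {
        while (!stack.empty() && !t.is_ancestor(stack.back(), v)) stack.pop_back();
        virtual_parent[v] = stack.empty() ? 0 : stack.back();
        stack.push_back(v);
    }
    return virtual_parent;
}
```

On the tree with parent array `{0, 0, 1, 1, 2, 2, 3, 3}` and $X = \{6, 4, 5\}$, the result maps $1 \to 0$, $2 \to 1$, $4 \to 2$, $5 \to 2$ and $6 \to 1$: vertex $3$ is skipped, so $6$ hangs directly under $1$. Returning a parent map instead of adjacency lists is a design choice made here for brevity; the solution further below fills adjacency lists during a single stack pass instead.
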
## Application: Counting Subtrees in a Colored Tree

### Example [Problem](https://atcoder.jp/contests/abc340/tasks/abc340_g) Statement

You are given a tree $T$ with $N$ vertices numbered from $1$ to $N$. Edge $i$ connects vertices $u[i]$ and $v[i]$, and vertex $i$ is colored with color $A[i]$.
Find the number, modulo $998244353$, of non-empty subsets $S$ of vertices of tree $T$ that satisfy both conditions:

- The induced graph $G[S]$ is a tree.
- All vertices of $G[S]$ with degree 1 have the same color.

### Main Solution Idea

The main idea is to split the problem by color. For each color $c$, we take the set $X_c$ of vertices colored $c$ and build the virtual tree for $X_c$. On the resulting tree, we run a DP that counts the valid subtrees in which every vertex of degree 1 has color $c$. The final answer is the sum of the results over all colors.

Although the original tree contains $N$ vertices, the virtual tree for a specific color $c$ has size $O(|X_c|)$, so the DP for that color takes time proportional to $|X_c|$ rather than to $N$.

The total complexity of preprocessing and building all virtual trees is $O\left(N \log N + \sum_{c \in C} |X_c| (\log |X_c| + \log N)\right) = O(N \log N)$, where $C$ is the set of distinct vertex colors and $X_c$ is the set of vertices with color $c$.

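The bound on the sum uses only two facts: the color classes partition the vertex set, so $\sum_{c \in C} |X_c| = N$, and each class satisfies $|X_c| \le N$. Spelled out:

$$
\sum_{c \in C} |X_c| \left( \log |X_c| + \log N \right) \le \sum_{c \in C} |X_c| \cdot 2 \log N = 2 N \log N.
$$
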
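### Solution Implementation

Below is the implementation, with comments describing the main components of the algorithm. The constants `MAXN` and `LOGN` are expected to be defined before the snippet is included, as the accompanying test does.
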
```{.cpp file=virtual_trees}
const int MOD = 998244353;

vector<int> g[MAXN];                       // Adjacency list of the given tree
vector<int> vertex_sets[MAXN];             // For each color c, vertex_sets[c] stores vertices with color c
int tmr, n;                                // Global time counter and number of vertices
int up[LOGN][MAXN], dep[MAXN], tin[MAXN];  // For computing LCA and entry time
int col[MAXN];                             // Vertex colors array
vector<int> virtual_g[MAXN];               // Adjacency list for the virtual tree

// DP arrays for dynamic programming on the virtual tree
int dp[MAXN][2], sum[MAXN];

// Preprocessing function: DFS to compute tin, up array, and dep
void dfs_precalc(int v, int p) {
    tin[v] = ++tmr;
    up[0][v] = p;
    dep[v] = dep[p] + 1;
    for (int i = 1; i < LOGN; ++i)
        up[i][v] = up[i - 1][up[i - 1][v]];
    for (auto to : g[v])
        if (to != p)
            dfs_precalc(to, v);
}

// Function to compute LCA using binary lifting
int getlca(int x, int y) {
    if (dep[x] < dep[y]) swap(x, y);
    // Lift x to the depth of y.
    for (int i = LOGN - 1; i >= 0; --i)
        if (dep[up[i][x]] >= dep[y])
            x = up[i][x];
    if (x == y) return x;
    // Lift both vertices to just below their LCA.
    for (int i = LOGN - 1; i >= 0; --i)
        if (up[i][x] != up[i][y]) {
            x = up[i][x];
            y = up[i][y];
        }
    return up[0][x];
}

// DFS on the virtual tree to perform DP.
// Parameter c: target color for which counting is performed.
void dfs_calc(int v, int p, int c, int &ans) {
    dp[v][0] = dp[v][1] = 0;
    sum[v] = 0;
    for (auto to : virtual_g[v]) {
        if (to == p) continue;
        dfs_calc(to, v, c, ans);
        // DP transitions: combining current state with result from subtree.
        int nxt0 = (dp[v][0] + sum[to]) % MOD;
        int nxt1 = ((dp[v][0] + dp[v][1]) * 1ll * sum[to] % MOD + dp[v][1]) % MOD;
        dp[v][0] = nxt0;
        dp[v][1] = nxt1;
    }
    sum[v] = (dp[v][0] + dp[v][1]) % MOD;
    if (col[v] == c) {
        // If the vertex has the target color, it can participate in a valid subtree.
        sum[v] = (sum[v] + 1) % MOD;
        ans = (ans + sum[v]) % MOD;
    } else {
        ans = (ans + dp[v][1]) % MOD;
    }
}

// Function to build a virtual tree for color c and perform DP.
void calc_virtual(int c, int &ans) {
    auto p = vertex_sets[c];  // Copy, because the vertices are sorted below.
    if (p.empty()) return;
    // Sort vertices by entry time (tin), i.e. in DFS preorder.
    sort(p.begin(), p.end(), [&](const int a, const int b) { return tin[a] < tin[b]; });
    vector<int> stack = {1};  // Initialize stack with the root of tree T (vertex 1).
    virtual_g[1].clear();
    auto add = [&](int u, int v) {
        virtual_g[u].push_back(v);
        virtual_g[v].push_back(u);
    };
    // Process each vertex from set p, maintaining a stack to build the virtual tree.
    for (auto u : p) {
        if (u == 1) continue;
        int lca = getlca(u, stack.back());
        if (lca != stack.back()) {
            // Close the branches that end strictly below the LCA.
            while (stack.size() >= 2 && tin[lca] < tin[stack[stack.size() - 2]]) {
                add(stack.back(), stack[stack.size() - 2]);
                stack.pop_back();
            }
            if (stack.size() >= 2 && tin[lca] != tin[stack[stack.size() - 2]]) {
                // The LCA is a new vertex: it replaces the top of the stack.
                virtual_g[lca].clear();
                add(stack.back(), lca);
                stack.back() = lca;
            } else {
                // The LCA is already on the stack, directly below the top.
                add(stack.back(), lca);
                stack.pop_back();
            }
        }
        virtual_g[u].clear();
        stack.push_back(u);
    }
    while (stack.size() > 1) {
        add(stack.back(), stack[stack.size() - 2]);
        stack.pop_back();
    }
    // Perform DP on the virtual tree, starting from root 1.
    dfs_calc(1, 0, c, ans);
}

// Entry point: receives the tree and the colors, returns the total answer.
int solve(int N, const vector<pair<int, int>> &edges, const vector<int> &colors) {
    n = N;
    for (auto [x, y] : edges) g[x].push_back(y), g[y].push_back(x);
    copy(colors.begin(), colors.end(), col + 1);
    // Group vertices by color.
    for (int i = 1; i <= n; ++i) vertex_sets[col[i]].push_back(i);
    dfs_precalc(1, 0);
    int ans = 0;
    // Process the corresponding virtual tree for each possible color.
    for (int i = 1; i <= n; ++i) calc_virtual(i, ans);
    return ans;
}
```

### Implementation Explanation

- **Preprocessing.**
  The `dfs_precalc` function performs a depth-first traversal of tree $T$, computing the entry time (`tin`) and depth (`dep`) of every vertex and filling the binary lifting table `up` for fast LCA queries.

- **Computing LCA.**
  The `getlca` function uses binary lifting to find the lowest common ancestor of two vertices in $O(\log N)$.

- **Building the Virtual Tree.**
  The `calc_virtual` function takes a color $c$, copies the set of vertices `vertex_sets[c]`, sorts it by `tin`, and uses a stack to build the virtual tree. For each vertex, the LCA with the current top of the stack is computed, which plays the role of the consecutive LCAs from Property 2. The root of $T$ (vertex 1) is always placed on the stack, so it is part of every virtual tree built here.

- **Dynamic Programming.**
  The `dfs_calc` function traverses the virtual tree and combines the results from subtrees according to the DP transitions. In the code, `dp[v][0]` and `dp[v][1]` accumulate the configurations that use exactly one and at least two child branches of `v`, respectively. A vertex of the target color can additionally stand alone or be a leaf, while a vertex of another color contributes to the answer only when at least two branches meet in it, so only subtrees whose leaves all have color $c$ are counted.

- **Collecting the Result.**
  The `solve` function receives the tree and the colors as arguments, performs preprocessing, and then sums up the results for each color, returning the final answer modulo $998244353$.

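When experimenting with the solution, it can help to compare it against an exhaustive check on tiny inputs. The following sketch is not part of the article's solution: `brute_force_count` is a hypothetical helper that enumerates all non-empty vertex subsets, so it is usable only for roughly $N \le 20$. It relies on the fact that an induced subgraph of a tree is a forest, and is therefore a tree exactly when it has one edge fewer than vertices. Colors are assumed to be positive, so `-1` can mark "no leaf seen yet".

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Counts subsets S such that G[S] is a tree and all degree-1 vertices of G[S]
// share one color. Vertices are 1-indexed; colors[i - 1] is the color of vertex i.
long long brute_force_count(int n,
                            const std::vector<std::pair<int, int>>& edges,
                            const std::vector<int>& colors) {
    assert(n >= 1 && n <= 20);
    assert((int)colors.size() == n);
    long long count = 0;
    for (unsigned mask = 1; mask < (1u << n); ++mask) {
        auto in_set = [&](int v) { return ((mask >> (v - 1)) & 1u) != 0; };

        int vertex_count = 0;
        for (int v = 1; v <= n; ++v)
            if (in_set(v)) ++vertex_count;

        // Degrees and edge count inside the induced subgraph.
        std::vector<int> deg(n + 1, 0);
        int edge_count = 0;
        for (const auto& e : edges) {
            if (in_set(e.first) && in_set(e.second)) {
                ++deg[e.first];
                ++deg[e.second];
                ++edge_count;
            }
        }
        // A forest with k vertices is connected iff it has k - 1 edges.
        if (edge_count != vertex_count - 1) continue;

        int leaf_color = -1;  // assumes colors are positive
        bool ok = true;
        for (int v = 1; v <= n && ok; ++v) {
            if (!in_set(v) || deg[v] != 1) continue;
            if (leaf_color == -1) leaf_color = colors[v - 1];
            else if (leaf_color != colors[v - 1]) ok = false;
        }
        if (ok) ++count;
    }
    return count;
}
```

For the path $1$-$2$-$3$ with colors $1, 2, 1$, the valid subsets are the three single vertices and the whole path, so the function returns $4$. The result is not reduced modulo $998244353$, which is harmless at this input size.
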
## Practice Problems

1. [Leaf Color](https://atcoder.jp/contests/abc340/tasks/abc340_g) (problem from the article)
2. [Unique Occurrences](https://codeforces.com/contest/1681/problem/F)
3. [Yet Another Tree Problem](https://www.codechef.com/DEC21A/problems/YATP)

src/navigation.md

Lines changed: 1 addition & 0 deletions
@@ -200,6 +200,7 @@ search:
 - [Tree painting](graph/tree_painting.md)
 - [2-SAT](graph/2SAT.md)
 - [Heavy-light decomposition](graph/hld.md)
+- [Virtual Trees](graph/virtual_trees.md)
 - Miscellaneous
 - Sequences
 - [RMQ task (Range Minimum Query - the smallest element in an interval)](sequences/rmq.md)

test/test_virtual_trees.cpp

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
#include <cassert>
#include <algorithm>
#include <vector>
using namespace std;

namespace Implementation {
    const int MAXN = 30;
    const int LOGN = 10;
    #include "virtual_trees.h"
}

int main() {
    int n = 20;
    vector<pair<int,int>> edges = {
        {1,2}, {1,4}, {1,5}, {1,6}, {1,10},
        {1,16}, {2,3}, {2,15}, {2,19},
        {3,11}, {4,7}, {4,9}, {6,12},
        {6,13}, {6,14}, {6,20}, {7,8},
        {8,17}, {13,18}
    };
    vector<int> colors = {2, 10, 6, 3, 16, 20, 6, 17, 8, 13, 9, 11, 7, 12, 5, 13, 7, 18, 3, 18};
    assert(Implementation::solve(n, edges, colors) == 25);
    return 0;
}
