# Black-box group algorithm for finding the subgroup generated by a subset

## Summary

Item Value
Problem being solved a multitude of problems:
generating set to membership test for black-box groups: Given a generating set for a subgroup of a black-box group, find a membership test for the subgroup. In this particular case, the membership test is constructed by explicitly enumerating all the elements of the subgroup.
generating set-checking problem: testing whether a subset of a group is a generating set for the whole group.
given a generating set for a subgroup of a black-box group, finding a shortest word in terms of that generating set that can be used to describe a particular element of the group.
Input format an encoding of a group $G$ where $G$ is a finite group (i.e., $G$ is to be treated as a black-box group) and a subset $S$ of $G$.
Output format the most detailed output format is as follows: for every element of $G$, information on whether or not it is generated by $S$, and, if it is generated by $S$, a word of minimum length for it where the letters of the word are drawn from $S$. We allow inverses in our words.
Theoretical significance --
Mode of input processing requires dealing with the entire input
Running time If the order of the group is $N$, and the size of the generating set is $s$, then the worst case running time is $O(Ns)$ multiplied by the time taken for the black box group operations. With a polylog assumption, this is $O(Ns)$ times a polynomial in $\log N$.

## Idea and outline

Before beginning, we replace $S$ by $S \cup S^{-1}$. Note that this does not affect the subgroup generated, but does simplify some of our searches for cycles in the Cayley graph.

The idea is to construct the Cayley graph for the subgroup generated by $S$. Each iteration of our process goes one layer further out from the part of the graph that has been constructed so far.

At each iteration, we keep track of which elements of the group correspond to the boundary of the Cayley graph so far, i.e., which are the elements that have just been added in the final iteration. For the next iteration, it suffices to look only at the edges emanating from these. We also keep track of which elements have already been included in the graph, so that we can identify repetitions as we move through the graph.

### Outer loop of algorithm

Each step of the outer loop corresponds to expanding one step more in the whole Cayley graph.

Aspect of outer loop Details
Initialization step Set $H_0$ to be the one-element subset comprising the identity element.
Set $F_0$ to be the identity element.
Set $H_{-1}, F_{-1}$ to both be empty.
The $i^{th}$ step At the end of this step, $H_i$ is the collection of all elements of $G$ that can be expressed using words of length at most $i$ in terms of $S$. $F_i$ is the collection of all elements of $G$ that can be expressed using words of length exactly $i$ and cannot be expressed using words of smaller length.
Note that the $H_{i-1} \subseteq H_i$ and $F_i = H_i \setminus H_{i-1}$, so $F_i$s are pairwise disjoint.
Condition for quitting the outer loop We quit the outer loop after the $i^{th}$ step if $F_i$ is empty.
In this case, $H_i$ is the subgroup generated by $S$.

From a storage viewpoint, it is necessary only to store, for the $i^{th}$ stage, the values $H_{i-1},F_{i-1}, F_{i-2}$ and none of the smaller $F_j$s or $H_j$s.

### The process within each step

We now detail the $i^{th}$ step in detail, for $i \ge 1$.

Step no. Details
(before we begin) The incoming $H_{i-1}$ is the set of all elements expressible as a product of length at most $i - 1$ from $S$ and the incoming $F_{i-1}$ is the set of all elements expressible as a product of length precisely $i - 1$ and not expressible as a product of smaller length. Similarly for $F_{i-2}$.
1 Let $K$ be the set $\{ fs \in G \mid f \in F_{i-1}, s \in S \}$
2 Remove duplicates from within $K$, and also remove from $K$ all the elements that are already within $F_{i-1}$ or $F_{i-2}$ (see below for why it suffices to check duplicates with just $F_{i-1}$ and $F_{i-2}$ rather than against all of $H_{i-1}$).
3 Set $F_i$ to be the trimmed set $K$ and set $H_i = H_{i-1} \cup F_i$.

### Why this works

It remains to prove that the process within each step has the desired properties, i.e., if the input from the $(i-1)^{th}$ stage is correct, then the output at the $i^{th}$ stage is correct.

First, note that the original, untrimmed set $K$ contains all the possibilities elements that have minimum word length precisely $i$. We need to show that our trimming process brings us down to precisely the elements that have minimum word length precisely $i$. In other words, we need to show that if an element of $K$ has minimum word length less than $i$, that minimum word length must be either $i - 1$ or $i - 2$.

To see this, suppose the minimum word length is $j \le i - 3$. Then, we get an equality of the form: $\! fs = h$

where $f \in F_{i-1}, s\in S, h \in H_{i-3}$.

Rearranging, we get: $\! f = hs^{-1}$

Thus, we have rewritten $f \in F_{i-1}$ as an element of $H_{i-2}$ (this is crucially where we use that our generating set is a symmetric subset, hence closed under taking inverses), a contradiction to the definition.

### Analysis of running time

The running time of each step is $O(|(F_{i-1}||S|) + O(|F_{i-2}|)$ times the costs of the black box group operations (performing multiplication, inverting elements). Since the $F_j$s are all pairwise disjoint and their union is bounded in size by the size of the whole group, the running time is $O(|G||S|)$ times the cost of the group multiplication and equality checking. If we make the log-size assumption, then this is $O(Ns)$ times a polynomial in $\log N$.