前缀树进阶-经典案例详解

- 一、前缀树基础内容回顾
- 二、单词搜索建议系统
- - 2.1 问题描述
  - 2.2 解题思路
  - 2.3 Java代码实现
  - 2.4 复杂度分析
- 三、单词编码
- - 3.1 问题描述
  - 3.2 解题思路
  - 3.3 Java代码实现
  - 3.4 复杂度分析
- 四、最长单词
- - 4.1 问题描述
  - 4.2 解题思路
  - 4.3 Java代码实现
  - 4.4 复杂度分析

我在上篇博客介绍了前缀树（Trie树）这种高效的数据结构，以独特的树形结构和前缀匹配特性，在字符串处理、搜索建议、词频统计等众多场景中发挥着关键作用。本文我就将通过几个经典案例，深入剖析前缀树在实际问题中的应用，结合Java代码实现，帮你掌握前缀树的使用技巧和解题思路。

一、前缀树基础内容回顾

前缀树是一种多叉树结构，其每个节点代表一个字符，从根节点到某一节点的路径上的字符连接起来，形成的字符串即为该节点对应的字符串。前缀树的核心优势在于高效的前缀匹配和字符串查找，能够在 $O (m)$ 的时间复杂度内完成长度为 $m$ 的字符串的插入、查询操作，其中 $m$ 为字符串的长度。在Java中，前缀树节点可以用如下类表示：

class TrieNode {TrieNode[] children;boolean isEndOfWord;TrieNode() {children = new TrieNode[26];isEndOfWord = false;}
}

前缀树的基本操作包括插入、查询和删除：

class Trie {private TrieNode root;public Trie() {root = new TrieNode();}// 插入字符串public void insert(String word) {TrieNode node = root;for (char c : word.toCharArray()) {int index = c - 'a';if (node.children[index] == null) {node.children[index] = new TrieNode();}node = node.children[index];}node.isEndOfWord = true;}// 查询字符串是否存在public boolean search(String word) {TrieNode node = root;for (char c : word.toCharArray()) {int index = c - 'a';if (node.children[index] == null) {return false;}node = node.children[index];}return node.isEndOfWord;}// 查询是否存在以prefix为前缀的字符串public boolean startsWith(String prefix) {TrieNode node = root;for (char c : prefix.toCharArray()) {int index = c - 'a';if (node.children[index] == null) {return false;}node = node.children[index];}return true;}
}

二、单词搜索建议系统

2.1 问题描述

给你一个产品列表 products 和一个搜索词 searchWord 。请你设计一个推荐系统，在依次输入单词 searchWord 的每一个字母后，推荐 products 数组中前缀与 searchWord 相同的最多三个产品。如果前缀相同的可推荐产品超过三个，请按字典序返回最小的三个。

2.2 解题思路

构建前缀树：将 products 数组中的每个单词插入前缀树中，在插入过程中，记录每个节点对应的所有单词（可使用列表存储），并按字典序排序。
遍历搜索词：依次遍历 searchWord 的每个字符，在前缀树中查找对应的节点。
获取推荐列表：若找到对应节点，则从该节点记录的单词列表中取出字典序最小的最多三个单词作为推荐；若未找到对应节点，则返回空列表。

2.3 Java代码实现

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;class TrieNode {TrieNode[] children;List<String> words;TrieNode() {children = new TrieNode[26];words = new ArrayList<>();}
}class Trie {private TrieNode root;public Trie() {root = new TrieNode();}public void insert(String word) {TrieNode node = root;for (char c : word.toCharArray()) {int index = c - 'a';if (node.children[index] == null) {node.children[index] = new TrieNode();}node = node.children[index];node.words.add(word);}}public List<List<String>> search(String searchWord) {List<List<String>> result = new ArrayList<>();TrieNode node = root;for (char c : searchWord.toCharArray()) {int index = c - 'a';if (node.children[index] == null) {while (result.size() < searchWord.length()) {result.add(new ArrayList<>());}return result;}node = node.children[index];Collections.sort(node.words);int count = Math.min(3, node.words.size());List<String> subList = node.words.subList(0, count);result.add(subList);}return result;}
}public class SearchSuggestionsSystem {public static List<List<String>> suggestedProducts(String[] products, String searchWord) {Trie trie = new Trie();for (String product : products) {trie.insert(product);}return trie.search(searchWord);}public static void main(String[] args) {String[] products = {"mobile", "mouse", "moneypot", "monitor", "mousepad"};String searchWord = "mouse";List<List<String>> result = suggestedProducts(products, searchWord);for (List<String> subList : result) {System.out.println(subList);}}
}

2.4 复杂度分析

时间复杂度：构建前缀树的时间复杂度为 $\times m)$ ，其中 $n$ 是 products 数组的长度， $m$ 是字符串的平均长度；每次搜索的时间复杂度为 $O (m)$ ，遍历 searchWord 的每个字符进行查找，所以总的时间复杂度为 $\times m + m \times k)$ ， $k$ 为 searchWord 的长度。
空间复杂度：前缀树占用的空间与 products 数组中的单词数量和长度有关，最坏情况下空间复杂度为 $\times m)$ ；存储结果列表也需要一定空间，总体空间复杂度为 $\times m)$ 。

三、单词编码

3.1 问题描述

请你设计一个数据结构，支持添加新单词和查找字符串是否与任何先前添加的字符串匹配。

addWord(word)：添加一个单词到数据结构中。
search(word)：如果数据结构中存在字符串与 word 匹配，则返回 true ；否则，返回 false 。word 中可能包含一些 '.' ，每个 '.' 都可以表示任何一个字母。

3.2 解题思路

构建前缀树：与普通前缀树构建方式相同，将单词插入前缀树中。
搜索匹配：在搜索时，若遇到字符 '.' ，则递归遍历当前节点的所有子节点；若遇到普通字符，则按正常前缀树查找方式进行匹配。当遍历到单词末尾且对应节点标记为单词结束时，返回 true ，否则返回 false 。

3.3 Java代码实现

class TrieNode {TrieNode[] children;boolean isEndOfWord;TrieNode() {children = new TrieNode[26];isEndOfWord = false;}
}class WordDictionary {private TrieNode root;public WordDictionary() {root = new TrieNode();}public void addWord(String word) {TrieNode node = root;for (char c : word.toCharArray()) {int index = c - 'a';if (node.children[index] == null) {node.children[index] = new TrieNode();}node = node.children[index];}node.isEndOfWord = true;}public boolean search(String word) {return searchHelper(word, 0, root);}private boolean searchHelper(String word, int index, TrieNode node) {if (index == word.length()) {return node.isEndOfWord;}char c = word.charAt(index);if (c != '.') {int childIndex = c - 'a';if (node.children[childIndex] == null) {return false;}return searchHelper(word, index + 1, node.children[childIndex]);} else {for (TrieNode child : node.children) {if (child != null && searchHelper(word, index + 1, child)) {return true;}}return false;}}
}public class WordDictionaryExample {public static void main(String[] args) {WordDictionary wordDictionary = new WordDictionary();wordDictionary.addWord("bad");wordDictionary.addWord("dad");wordDictionary.addWord("mad");System.out.println(wordDictionary.search("pad"));  // 输出 falseSystem.out.println(wordDictionary.search("bad"));  // 输出 trueSystem.out.println(wordDictionary.search(".ad"));  // 输出 trueSystem.out.println(wordDictionary.search("b.."));  // 输出 true}
}

3.4 复杂度分析

时间复杂度：插入操作的时间复杂度为 $O (m)$ ， $m$ 为单词长度；搜索操作在最坏情况下，对于包含 '.' 的单词，需要遍历当前节点的所有子节点，时间复杂度为 $O(26^m)$ ，但平均情况下，若 '.' 出现较少，时间复杂度接近 $O (m)$ 。
空间复杂度：前缀树占用空间为 $\times m)$ ， $n$ 为单词数量， $m$ 为单词平均长度。

四、最长单词

4.1 问题描述

给出一个字符串数组 words ，找出数组中最长的单词，该单词是由 words 词典中其他单词逐步添加一个字母组成。若其中有多个可行的答案，则返回答案中字典序最小的单词。若无答案，则返回空字符串。

4.2 解题思路

构建前缀树：将 words 数组中的每个单词插入前缀树中，同时记录每个单词的长度。
遍历单词：遍历 words 数组，对于每个单词，检查其前缀是否都在前缀树中且前缀对应的节点标记为单词结束，若满足条件且单词长度更长或长度相同但字典序更小，则更新结果。

4.3 Java代码实现

class TrieNode {TrieNode[] children;boolean isEndOfWord;int length;TrieNode() {children = new TrieNode[26];isEndOfWord = false;length = 0;}
}class Trie {private TrieNode root;public Trie() {root = new TrieNode();}public void insert(String word) {TrieNode node = root;for (char c : word.toCharArray()) {int index = c - 'a';if (node.children[index] == null) {node.children[index] = new TrieNode();}node = node.children[index];}node.isEndOfWord = true;node.length = word.length();}public boolean checkPrefixes(String word) {TrieNode node = root;for (char c : word.toCharArray()) {int index = c - 'a';if (node.children[index] == null ||!node.children[index].isEndOfWord) {return false;}node = node.children[index];}return true;}
}public class LongestWordInDictionary {public static String longestWord(String[] words) {Trie trie = new Trie();for (String word : words) {trie.insert(word);}String result = "";for (String word : words) {if (trie.checkPrefixes(word)) {if (word.length() > result.length() || (word.length() == result.length() && word.compareTo(result) < 0)) {result = word;}}}return result;}public static void main(String[] args) {String[] words = {"w", "wo", "wor", "worl", "world"};System.out.println(longestWord(words));  // 输出 world}
}

4.4 复杂度分析

时间复杂度：构建前缀树的时间复杂度为 $\times m)$ ， $n$ 为单词数量， $m$ 为单词平均长度；检查每个单词前缀的时间复杂度为 $O (m)$ ，遍历所有单词检查前缀，总的时间复杂度为 $\times m)$ 。
空间复杂度：前缀树占用空间为 $\times m)$ 。