Redis Caching in Practice - 18

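Caching with Redis

Redis is often thought of as just a data store, but its speed and in-memory design make it an excellent choice for caching. Caching is a technique that improves application performance by keeping frequently accessed data in a fast, temporary storage location. Using Redis as a cache can significantly reduce the load on your primary database and shorten response times for users. This lesson explores how to use Redis effectively for caching, covering key concepts, strategies, and best practices.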

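Understanding Caching Concepts

Caching is a fundamental optimization technique in software development. It involves storing copies of data in a cache, a high-speed data storage layer, so that future requests for that data can be served faster. When a request arrives, the cache is checked first. If the data is found in the cache (a "cache hit"), it is served directly from the cache. If it is not found (a "cache miss"), the data is retrieved from the original source (such as a database), stored in the cache, and then returned to the user.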

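Benefits of Caching

Improved performance: Caching reduces latency and shortens response times by serving data from a faster storage layer.
Reduced database load: Serving frequently accessed data from the cache reduces the load on your primary database, freeing it to handle more complex queries and operations.
Improved scalability: By reducing the load on backend systems, caching lets your application handle more concurrent users and requests.
Cost savings: Lower database load and better resource utilization can reduce infrastructure and operating costs.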
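
To make the performance benefit concrete, the expected read latency can be estimated as a weighted average of the hit and miss paths. The sketch below is a back-of-the-envelope model only: the function name and the example figures (1 ms for a cache lookup, 50 ms for a database query) are illustrative assumptions, not measurements.

```python
def expected_latency_ms(hit_rate: float, cache_ms: float, db_ms: float) -> float:
    """Average read latency for a given cache hit rate.

    A hit costs one cache lookup; a miss costs the cache lookup
    plus the database query.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return hit_rate * cache_ms + miss_rate * (cache_ms + db_ms)


# Hypothetical figures: 1 ms per cache lookup, 50 ms per database query.
for rate in (0.0, 0.5, 0.9, 0.99):
    print(f"hit rate {rate:.0%}: {expected_latency_ms(rate, 1.0, 50.0):.1f} ms")
```

Under these assumed figures, raising the hit rate from 50% to 90% cuts the average latency from about 26 ms to about 6 ms, which is why the hit rate is usually the first number to watch when tuning a cache.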

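Caching Strategies

Several caching strategies can be used with Redis, each with its own advantages and drawbacks:

Cache-Aside (Lazy Loading): The application checks the cache first. If the data is found, it is returned directly. If not, the application retrieves the data from the database, stores it in the cache, and then returns it. This strategy is easy to implement and ensures the cache contains only data that has actually been requested.
Write-Through: The application writes data to the cache and the database at the same time. This keeps the cache up to date but increases write latency.
Write-Behind (Write-Back): The application writes data to the cache, and the cache writes it to the database asynchronously. This strategy offers the lowest write latency, but data can be lost if the cache fails before the data is written to the database.
Read-Through: The application interacts only with the cache, which in turn interacts with the database. When data is requested, the cache checks whether it holds the data. If not, the cache retrieves it from the database, stores it, and returns it to the application.

For most use cases, Cache-Aside is the most practical and most commonly used strategy with Redis because it is simple and efficient.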
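
The difference between the write-through and write-behind write paths can be sketched as follows. This is a minimal illustration, assuming plain dictionaries as stand-ins for both the cache and the database; the class names are hypothetical and no Redis calls are involved.

```python
from collections import deque


class WriteThroughStore:
    """Write-through: every write updates the database and the cache together."""

    def __init__(self):
        self.cache = {}      # stand-in for Redis
        self.database = {}   # stand-in for the primary database

    def write(self, key, value):
        self.database[key] = value   # synchronous database write (adds latency)
        self.cache[key] = value      # cache is updated in the same call

    def read(self, key):
        return self.cache.get(key)


class WriteBehindStore:
    """Write-behind: writes land in the cache first and reach the database later."""

    def __init__(self):
        self.cache = {}
        self.database = {}
        self.pending = deque()       # writes not yet persisted

    def write(self, key, value):
        self.cache[key] = value      # fast path: cache only
        self.pending.append((key, value))

    def flush(self):
        """Persist queued writes; anything still queued is lost if the cache dies."""
        while self.pending:
            key, value = self.pending.popleft()
            self.database[key] = value


wt = WriteThroughStore()
wt.write("user:123", "John Doe")
print(wt.database)   # the database already holds the value

wb = WriteBehindStore()
wb.write("user:123", "John Doe")
print(wb.database)   # still empty: the write is only queued
wb.flush()
print(wb.database)   # persisted after the flush
```

In a real deployment the flush would typically run in a background worker, and the window between `write` and `flush` is exactly where the data-loss risk of write-behind comes from.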

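Using Redis as a Cache

Redis is well suited to caching because of its speed, in-memory data storage, and support for a variety of data structures. Here is how to use Redis as a cache:

Setting and Retrieving Data

You can store and retrieve data in Redis with the SET and GET commands. For example: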

SET user:123 '{"id": 123, "name": "John Doe", "email": "john.doe@example.com"}'
GET user:123

In this example, we store a JSON string representing a user object in Redis under the key user:123. When the data is retrieved with GET user:123, Redis returns the JSON string.

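Setting an Expiration Time (TTL)

To prevent the cache from growing without bound, you can set an expiration time (TTL, time to live) on cached data using the EXPIRE command or the EX option of the SET command: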

SET user:123 '{"id": 123, "name": "John Doe", "email": "john.doe@example.com"}' EX 3600  # Expires in 3600 seconds (1 hour)
EXPIRE user:123 3600 # Expires in 3600 seconds (1 hour)
TTL user:123 # Check the remaining time to live

This ensures that cached data is removed automatically after the specified period, which limits how long stale data can be served.

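Data Serialization

When caching complex data structures, you need to serialize the data before storing it in Redis and deserialize it after retrieving it. Common serialization formats include JSON and Protocol Buffers.

Here is an example using JSON in Python: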

import redis
import json

# Connect to Redis
redis_client = redis.Redis(host='localhost', port=6379, db=0)

# Data to cache
user_data = {"id": 123, "name": "John Doe", "email": "john.doe@example.com"}

# Serialize the data to JSON
user_data_json = json.dumps(user_data)

# Store the JSON string in Redis with an expiration time
redis_client.set('user:123', user_data_json, ex=3600)

# Retrieve the data from Redis
cached_user_data_json = redis_client.get('user:123')

# Deserialize the JSON string back to a Python dictionary
if cached_user_data_json:
    cached_user_data = json.loads(cached_user_data_json)
    print(cached_user_data)
else:
    print("Data not found in cache")
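
When the cached value maps onto a typed object rather than a bare dictionary, the serialize and deserialize steps can be wrapped in a pair of helpers. The following is a minimal sketch, assuming a hypothetical `User` dataclass and helper names that are not part of the original example. It runs without a Redis server; the bytes value simulates what redis-py returns by default when `decode_responses` is not enabled.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class User:
    id: int
    name: str
    email: str


def serialize_user(user: User) -> str:
    """Convert a User into a JSON string suitable for storing with SET."""
    return json.dumps(asdict(user))


def deserialize_user(raw) -> User:
    """Rebuild a User from a cached value (json.loads accepts str or bytes)."""
    return User(**json.loads(raw))


original = User(id=123, name="John Doe", email="john.doe@example.com")
payload = serialize_user(original)

# Simulate the bytes reply a Redis client would hand back.
restored = deserialize_user(payload.encode("utf-8"))
print(restored == original)
```

Keeping both directions in one place makes it easier to change the format later (for example, to Protocol Buffers) without touching the caching logic.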

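Cache Invalidation Strategies

Cache invalidation is the process of removing or updating cached data when the underlying data changes. There are several strategies:

TTL-based invalidation: Data is removed from the cache automatically once its TTL elapses. This is the simplest strategy, but it can serve stale data if the underlying data changes before the TTL expires.
Event-based invalidation: The cache is invalidated when a specific event occurs, such as a database update. This strategy keeps the cache closely in sync with the data source, but it requires tighter integration with the data source.
Manual invalidation: The cache is invalidated explicitly by an administrator or by application code. This strategy gives the most control over invalidation, but it requires careful monitoring and management.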
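
Event-based invalidation is often implemented by deleting the cache key in the same code path that updates the database. The sketch below illustrates the idea under stated assumptions: `InMemoryCache` is a hypothetical stand-in that mimics the `get`, `set`, and `delete` methods of a Redis client so the example runs without a server, and `database` and `update_user` are illustrative names.

```python
import json


class InMemoryCache:
    """Tiny stand-in exposing the get/set/delete subset of a Redis client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ex=None):
        # The ex (TTL) argument is accepted but ignored in this stand-in.
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


database = {123: {"id": 123, "name": "John Doe"}}  # hypothetical data source


def get_user(user_id, cache):
    """Cache-aside read: try the cache, fall back to the database."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    user = database[user_id]
    cache.set(key, json.dumps(user), ex=3600)
    return user


def update_user(user_id, fields, cache):
    """Write to the database first, then invalidate the cached copy."""
    database[user_id] = {**database[user_id], **fields}
    cache.delete(f"user:{user_id}")  # the next read repopulates the cache


cache = InMemoryCache()
get_user(123, cache)                           # miss: populates the cache
update_user(123, {"name": "Jane Doe"}, cache)  # the "event" that invalidates
print(get_user(123, cache)["name"])            # reads the updated value
```

Deleting the key rather than rewriting it keeps the update path simple: the next reader repopulates the entry through the normal cache-aside flow.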

Example: Caching Database Queries

Suppose you have a function that retrieves user data from a database:

import redis
import json
import time


# Assume this function fetches data from a database
def get_user_from_db(user_id):
    # Simulate a database query with a delay
    time.sleep(1)
    user_data = {"id": user_id, "name": f"User {user_id}", "email": f"user{user_id}@example.com"}
    return user_data


def get_user(user_id, redis_client):
    """
    Retrieves user data from cache if available, otherwise fetches from the database,
    caches it, and returns it.
    """
    cache_key = f'user:{user_id}'
    cached_user_data = redis_client.get(cache_key)

    if cached_user_data:
        # Cache hit
        print(f"Cache hit for user {user_id}")
        user_data = json.loads(cached_user_data)
    else:
        # Cache miss
        print(f"Cache miss for user {user_id}. Fetching from DB.")
        user_data = get_user_from_db(user_id)
        user_data_json = json.dumps(user_data)
        redis_client.set(cache_key, user_data_json, ex=3600)  # Cache for 1 hour

    return user_data


# Example usage
redis_client = redis.Redis(host='localhost', port=6379, db=0)

user1 = get_user(123, redis_client)
print(user1)

user1_cached = get_user(123, redis_client)  # This time it will be a cache hit
print(user1_cached)

user2 = get_user(456, redis_client)
print(user2)

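In this example, the get_user function first checks whether the user data is available in the Redis cache. If it is, the data is read from the cache and returned. Otherwise, the data is fetched from the database, stored in the cache with an expiration time of 1 hour, and then returned.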

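Advanced Caching Techniques

Cache Stampede Prevention

A cache stampede occurs when a large number of requests arrive at the same time for data whose cache entry has expired or is missing. This can overload the database, because all of the requests try to retrieve the data simultaneously.

To prevent cache stampedes, you can use the following techniques: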

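Probabilistic Early Expiration: Instead of giving every cache entry the same fixed expiration time, add a small random offset to each TTL (often called jitter), so that entries written together do not all expire at the same moment. This spreads the resulting load on the database over time.
Locking: When a cache miss occurs, acquire a lock so that other requests cannot query the database for the same data at the same time. Only the first request retrieves the data, stores it in the cache, and then releases the lock.
Background refresh: Refresh cache entries in the background before they expire. This keeps frequently used entries populated and fresh, and reduces the likelihood of a stampede.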
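
The first two techniques can be combined in a small helper. The sketch below is one possible shape, not a definitive implementation: `ttl_with_jitter` and `get_with_lock` are hypothetical names, the `lock:` key prefix and the polling interval are assumptions, and `InMemoryCache` is a stand-in so the example runs without a server. With redis-py, the same `set(..., nx=True, ex=...)` call maps to SET with the NX and EX options. The lock release is deliberately simplified: it does not check that the caller still owns the lock, which a production version would need to do.

```python
import random
import time


class InMemoryCache:
    """Stand-in for the get/set/delete subset of a Redis client (TTLs ignored)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ex=None, nx=False):
        if nx and key in self._data:
            return None  # mirrors SET ... NX failing when the key exists
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)


def ttl_with_jitter(base_ttl: int, spread: float = 0.1) -> int:
    """Return base_ttl shifted by a random offset of up to +/- spread."""
    offset = random.uniform(-spread, spread) * base_ttl
    return max(1, int(base_ttl + offset))


def get_with_lock(client, key, loader, ttl=3600, lock_ttl=10, retries=50, wait=0.1):
    """Cache-aside read where only the lock holder calls the loader."""
    lock_key = f"lock:{key}"
    for _ in range(retries):
        value = client.get(key)
        if value is not None:
            return value
        # Succeeds only if no other caller currently holds the lock.
        if client.set(lock_key, "1", nx=True, ex=lock_ttl):
            try:
                value = loader()
                client.set(key, value, ex=ttl_with_jitter(ttl))
                return value
            finally:
                client.delete(lock_key)
        time.sleep(wait)  # another caller is loading; poll the cache again
    return loader()  # fallback if the cache never fills


calls = []


def load_report():
    calls.append(1)  # count how often the "database" is hit
    return "expensive result"


cache = InMemoryCache()
first = get_with_lock(cache, "report:daily", load_report)
second = get_with_lock(cache, "report:daily", load_report)
print(first, second, len(calls))
```

The `lock_ttl` on the lock key matters: if the process holding the lock crashes, the lock expires on its own instead of blocking every later request.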

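Using Redis Data Structures for Caching

Redis provides a variety of data structures that can be used to cache different kinds of data:

Strings: For caching simple key-value pairs, such as a user ID and name.
Hashes: For caching objects with multiple fields, such as user profiles.
Lists: For caching ordered data, such as a recent activity feed.
Sets: For caching unique items, such as user roles.
Sorted Sets: For caching ranked data, such as leaderboards.

Choosing the right data structure can improve the efficiency and performance of your cache.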
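
As an illustration of the hash option, a user profile can be stored field by field instead of as one serialized string, which allows individual fields to be read or updated. The sketch below makes several assumptions: the key pattern `user:<id>:profile` and the function names are hypothetical, and `InMemoryHashes` is a stand-in that accepts the same `hset(name, mapping=...)`, `hgetall(name)`, and `expire(name, seconds)` calls as redis-py so the example runs without a server. A real client returns bytes for field names and values unless it is created with `decode_responses=True`.

```python
class InMemoryHashes:
    """Stand-in for the hset/hgetall/expire subset of a Redis client."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, mapping):
        # Redis stores hash values as strings, so convert them here too.
        fields = {field: str(value) for field, value in mapping.items()}
        self._hashes.setdefault(name, {}).update(fields)

    def hgetall(self, name):
        return dict(self._hashes.get(name, {}))

    def expire(self, name, seconds):
        pass  # TTL handling is omitted in this stand-in


def cache_profile(client, user_id, profile, ttl=3600):
    """Store each profile field in a hash and give the whole key a TTL."""
    key = f"user:{user_id}:profile"
    client.hset(key, mapping=profile)
    client.expire(key, ttl)


def get_profile(client, user_id):
    """Return the cached profile, or None when the hash does not exist."""
    profile = client.hgetall(f"user:{user_id}:profile")
    return profile or None  # HGETALL yields an empty result for a missing key


client = InMemoryHashes()
cache_profile(client, 123, {"name": "John Doe", "email": "john.doe@example.com"})
print(get_profile(client, 123))
print(get_profile(client, 999))
```

Note that the TTL applies to the hash key as a whole, so all fields of a profile expire together.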

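Example: Caching a List of Blog Posts

Suppose you want to cache a list of recent blog posts. You can use a Redis list to store the post IDs and then retrieve the full post data from the database when needed.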

import redis
import json

# Connect to Redis
redis_client = redis.Redis(host='localhost', port=6379, db=0)


# Assume this function fetches blog posts from a database
def get_blog_posts_from_db():
    # Simulate a database query
    blog_posts = [
        {"id": 1, "title": "Redis Caching", "content": "..."},
        {"id": 2, "title": "NoSQL Databases", "content": "..."},
        {"id": 3, "title": "Python Programming", "content": "..."}
    ]
    return blog_posts


def get_recent_blog_posts(redis_client, limit=10):
    """
    Retrieves recent blog posts from cache if available, otherwise fetches from the database,
    caches it, and returns it.
    """
    cache_key = 'recent_blog_posts'
    cached_post_ids = redis_client.lrange(cache_key, 0, limit - 1)

    if cached_post_ids:
        # Cache hit
        print("Cache hit for recent blog posts")
        post_ids = [int(post_id) for post_id in cached_post_ids]
        # In a real application, you would fetch the full post data from the database
        # based on these IDs.  Here, we just return the IDs.
        return post_ids
    else:
        # Cache miss
        print("Cache miss for recent blog posts. Fetching from DB.")
        blog_posts = get_blog_posts_from_db()
        post_ids = [post['id'] for post in blog_posts]
        # Store the post IDs in Redis
        for post_id in reversed(post_ids):  # Add in reverse order to maintain order
            redis_client.lpush(cache_key, post_id)
        redis_client.expire(cache_key, 3600)  # Cache for 1 hour
        return post_ids[:limit]


# Example usage
recent_posts = get_recent_blog_posts(redis_client)
print(recent_posts)

recent_posts_cached = get_recent_blog_posts(redis_client)  # This time it will be a cache hit
print(recent_posts_cached)

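Practice Exercises

Implement the Cache-Aside strategy: Create a function that uses Redis to cache the results of an API call. The function should first check whether the data is available in the cache. If it is, return the cached data. If not, make the API call, store the result in the cache with an expiration time, and return the result.
Implement cache invalidation: Modify the previous function so that the cache is invalidated when the underlying data changes. You can simulate a data change by updating a value in the database or by calling a different API endpoint.
Cache objects with Redis hashes: Create a function that caches user profiles using Redis hashes. The function should store each user profile as a separate hash with fields for name, email, and other relevant information.
Implement cache stampede protection: Modify the API caching function to prevent cache stampedes using probabilistic early expiration or locking.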