在 MySQL 8.0 及以后版本中,窗口函数(Window Functions)为数据分析和处理提供了强大的工具。窗口函数允许在查询结果集上执行计算,而不必使用子查询或连接,这使得某些类型的计算更加高效和简洁。
语法结构
function_name() OVER ([PARTITION BY expression_list][ORDER BY expression_list][frame_clause]
)
- PARTITION BY:将查询结果集分成多个分区,类似于 GROUP BY,但在窗口函数中是用于定义数据的分组逻辑。例如,按照部门分区,然后在每个部门内进行操作。
- ORDER BY:对每个分区内的行进行排序,这是可选的。窗口函数的计算可能会依赖于行的顺序,如计算累计和等。
创建测试表
CREATE TABLE `employee` (`employee_id` int NOT NULL AUTO_INCREMENT,`department` varchar(45) DEFAULT NULL,`salary` varchar(45) DEFAULT NULL,PRIMARY KEY (`employee_id`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
写入测试数据
INSERT INTO `test`.`employee` (`employee_id`, `department`, `salary`) VALUES (1, 'Sales', '6000');
INSERT INTO `test`.`employee` (`employee_id`, `department`, `salary`) VALUES (2, 'Sales', '6000');
INSERT INTO `test`.`employee` (`employee_id`, `department`, `salary`) VALUES (3, 'Sales', '5000');
INSERT INTO `test`.`employee` (`employee_id`, `department`, `salary`) VALUES (4, 'IT', '7000');
INSERT INTO `test`.`employee` (`employee_id`, `department`, `salary`) VALUES (5, 'IT', '9000');
INSERT INTO `test`.`employee` (`employee_id`, `department`, `salary`) VALUES (6, 'Marketing', '5500');
INSERT INTO `test`.`employee` (`employee_id`, `department`, `salary`) VALUES (7, 'Sales', '5000');
使用窗口函数
SELECT employee_id, department, salary,
RANK() OVER (PARTITION BY department ORDER BY salary desc) AS ran,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS d_ran,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employee;
RANK() 函数计算排名,如果有并列情况,排名会出现跳跃。
DENSE_RANK() 函数在并列情况下不会跳跃。
ROW_NUMBER() 函数为每一行分配一个唯一的行号,即使有并列情况也是如此。
实际使用时,可以通过结合cte表达式,指定row_num值,来实现数据去重的效果。
例如:
with distinct_tbl as (select *,ROW_NUMBER() OVER (PARTITION BY department,salary ORDER BY salary DESC) AS row_numfrom employee
)
select * from distinct_tbl
where row_num = 1
执行结果