* Why
Recently MariaDB 10.0 include LIMIT ROWS EXAMINED features.I think this features will prevent our services from abnormal behavior(A sudden change of QEP or Operation miss, ... something else..) of MariaDB.
I got new idea from LIMIT ROWS EXAMINED features.
We often use "SELECT COUNT(*) .." query for counting rows matched condition.
We can limit row count with LIMIT clause when we SELECT row itself, But we can't during counting rows.
So, I added small features on MariaDB 10.0.10.
* What : LIMIT ROWS MATCHED
This new features works before "ORDER BY" and "GROUP BY" and some aggregate function like "COUNT()" and "SUM()".So we can count rows matched WHERE condition with LIMIT ROWS MATCHED syntax.
Below example is simple usage of LIMIT ROWS MATCHED syntax. (in this example LIMIT ROWS MATCHED works same as just LIMIT clause)
MariaDB [test]> CREATE TABLE test (fd_pk INT NOT NULL auto_increment, fd1 INT, fd2 INT, PRIMARY KEY(fd_pk));
Query OK, 0 rows affected (0.04 sec)
MariaDB [test]> INSERT INTO test VALUES (1,1,1), (2,2,2), (3,3,3), (4,4,4), (5,5,5);
Query OK, 5 rows affected (0.02 sec)
Records: 5 Duplicates: 0 Warnings: 0
MariaDB [test]> SELECT * FROM test;
+-------+------+------+
| fd_pk | fd1 | fd2 |
+-------+------+------+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |
| 4 | 4 | 4 |
| 5 | 5 | 5 |
+-------+------+------+
5 rows IN SET (0.00 sec)
MariaDB [test]> SELECT * FROM test LIMIT ROWS MATCHED 2;
+-------+------+------+
| fd_pk | fd1 | fd2 |
+-------+------+------+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
+-------+------+------+
2 rows IN SET (0.00 sec)
But if you use COUNT(*) aggregate function, the result looks different.
MariaDB [test]> SELECT COUNT(*) FROM test LIMIT 3;
+----------+
| COUNT(*) |
+----------+
| 5 |
+----------+
1 row IN SET (0.01 sec)
MariaDB [test]> SELECT COUNT(*) FROM test LIMIT ROWS MATCHED 3;
+----------+
| COUNT(*) |
+----------+
| 3 |
+----------+
1 row IN SET (0.00 sec)
And LIMIT ROWS MATCHED feature is processed before LIMIT. So sometimes, it looks like LIMIT is not working when you use both of LIMIT n and ROWS MATCHED n.
MariaDB [test]> SELECT * FROM test LIMIT 3 ROWS MATCHED 2;
+-------+------+------+
| fd_pk | fd1 | fd2 |
+-------+------+------+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
+-------+------+------+
2 rows IN SET (0.00 sec)
We can use LIMITED ROWS MATCHED syntax nested fashion.
MariaDB [test]> SELECT * FROM (
SELECT * FROM test WHERE fd2 BETWEEN 2 AND 4 LIMIT ROWS MATCHED 2) x
LIMIT ROWS MATCHED 1;
+-------+------+------+
| fd_pk | fd1 | fd2 |
+-------+------+------+
| 2 | 2 | 2 |
+-------+------+------+
1 row IN SET (0.01 sec)
MariaDB [test]> SELECT * FROM (SELECT * FROM test WHERE fd2 BETWEEN 2 AND 4 LIMIT ROWS MATCHED 2) x LIMIT ROWS MATCHED 2;
+-------+------+------+
| fd_pk | fd1 | fd2 |
+-------+------+------+
| 2 | 2 | 2 |
| 3 | 3 | 3 |
+-------+------+------+
2 rows IN SET (0.00 sec)
MariaDB [test]> SELECT * FROM (
SELECT * FROM test WHERE fd2 BETWEEN 2 AND 4 LIMIT ROWS MATCHED 2) x
LIMIT ROWS MATCHED 3;
+-------+------+------+
| fd_pk | fd1 | fd2 |
+-------+------+------+
| 2 | 2 | 2 |
| 3 | 3 | 3 |
+-------+------+------+
2 rows IN SET (0.01 sec)
In above example, derived table(subquery of FROM clause) will cut before third matched row, so outer LIMIT ROWS MATCHED 2 and LIMIT ROWS MATCHED 3 don't change the final result.
But first query of above example, outer LIMIT ROWS MATCHED is 1 (less than inner LIMIT ROWS MATCHED 2) so the result set has just 1 row.
And LIMIT ROWS MATCHED syntax could be used in join also.
MariaDB [test]> SELECT * FROM test t1, test t2, test t3 WHERE t1.fd_pk=t2.fd_pk AND t2.fd_pk=t3.fd_pk LIMIT ROWS MATCHED 3;
+-------+------+------+-------+------+------+-------+------+------+
| fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 |
+-------+------+------+-------+------+------+-------+------+------+
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
+-------+------+------+-------+------+------+-------+------+------+
3 rows IN SET (0.02 sec)
MariaDB [test]> SELECT * FROM test t1, test t2, test t3 WHERE t1.fd_pk=t2.fd_pk AND t2.fd_pk=t3.fd_pk AND t3.fd1<>1 LIMIT ROWS MATCHED 3;
+-------+------+------+-------+------+------+-------+------+------+
| fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 |
+-------+------+------+-------+------+------+-------+------+------+
| 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
+-------+------+------+-------+------+------+-------+------+------+
3 rows IN SET (0.01 sec)
Result set are little weird or different from what you expected.
-- // duplicate test table's row for ORDER BY .. LIMIT ROWS MATCHED n query
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;
Query OK, 5 rows affected (0.02 sec)
Records: 5 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;
Query OK, 10 rows affected (0.01 sec)
Records: 10 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;
Query OK, 20 rows affected (0.01 sec)
Records: 20 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;
Query OK, 40 rows affected (0.01 sec)
Records: 40 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;
Query OK, 80 rows affected (0.03 sec)
Records: 80 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;
Query OK, 160 rows affected (0.05 sec)
Records: 160 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;
Query OK, 320 rows affected (0.12 sec)
Records: 320 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;
Query OK, 640 rows affected (0.20 sec)
Records: 640 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;Query OK, 1280 rows affected (0.38 sec)
Records: 1280 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;Query OK, 2560 rows affected (0.74 sec)
Records: 2560 Duplicates: 0 Warnings: 0
MariaDB [test]> INSERT INTO test SELECT NULL, fd1, fd2 FROM test;
Query OK, 5120 rows affected (1.47 sec)
Records: 5120 Duplicates: 0 Warnings: 0
MariaDB [test]> SELECT *
FROM test t1, test t2, test t3
WHERE t1.fd_pk=t2.fd_pk AND t2.fd_pk=t3.fd_pk
ORDER BY t3.fd2 DESC, t2.fd2 DESC, t1.fd2 DESC
LIMIT 3;
+-------+------+------+-------+------+------+-------+------+------+
| fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 |
+-------+------+------+-------+------+------+-------+------+------+
| 578 | 5 | 5 | 578 | 5 | 5 | 578 | 5 | 5 |
| 2792 | 5 | 5 | 2792 | 5 | 5 | 2792 | 5 | 5 |
| 8694 | 5 | 5 | 8694 | 5 | 5 | 8694 | 5 | 5 |
+-------+------+------+-------+------+------+-------+------+------+
3 rows IN SET (2.27 sec)
MariaDB [test]> SELECT *
FROM test t1, test t2, test t3
WHERE t1.fd_pk=t2.fd_pk AND t2.fd_pk=t3.fd_pk
ORDER BY t3.fd2 DESC, t2.fd2 DESC, t1.fd2 DESC
LIMIT ROWS MATCHED 3;
+-------+------+------+-------+------+------+-------+------+------+
| fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 |
+-------+------+------+-------+------+------+-------+------+------+
| 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
+-------+------+------+-------+------+------+-------+------+------+
3 rows IN SET (0.01 sec)
Both query of above example have "ORDER BY t3.fd2 DESC, t2.fd2 DESC, t1.fd2" clause and same WHERE conditions.
First query use "LIMIT ROWS MATCHED 3" on the other hand second query use just "LIMIT 3".
"LIMIT 3" is processed after sort but "LIMIT ROWS MATCHED 3" is processed before sort operation. So first query need to sort all joined rows, but second query only sort 3 rows.
(Of course, the result set of second query is not what you want, so be cautious)
And LIMIT ROWS MATCHED syntax can be used on outer join query.
MariaDB [test]> SELECT *
FROM test t1
INNER JOIN test t2 ON t2.fd_pk=t1.fd_pk
LEFT JOIN test t3 ON t3.fd_pk=t1.fd_pk-1
ORDER BY t3.fd2, t2.fd2, t1.fd2
LIMIT ROWS MATCHED 3;
+-------+------+------+-------+------+------+-------+------+------+
| fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 | fd_pk | fd1 | fd2 |
+-------+------+------+-------+------+------+-------+------+------+
| 1 | 1 | 1 | 1 | 1 | 1 | NULL | NULL | NULL |
| 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 |
| 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 |
+-------+------+------+-------+------+------+-------+------+------+
3 rows IN SET (0.02 sec)
* Performance (Appendded)
I ran a simple row counting test.LIMIT ROWS MATCHED faster about 20~30% than SELECT COUNT(*) .. FROM (SELECT .. LIMIT).
-- // -----------------------------------------------------------------
-- // Counting 10000 rows
-- // -----------------------------------------------------------------
MariaDB [test]> SELECT COUNT(*) FROM test limit rows matched 10000;
+----------+
| COUNT(*) |
+----------+
| 10000 |
+----------+
1 row IN SET (0.05 sec)
MariaDB [test]> SELECT COUNT(*) FROM (SELECT 1 FROM test limit 10000) x;
+----------+
| COUNT(*) |
+----------+
| 10000 |
+----------+
1 row IN SET (0.06 sec)
-- // -----------------------------------------------------------------
-- // Counting 30000 rows
-- // -----------------------------------------------------------------
MariaDB [test]> SELECT COUNT(*) FROM test limit rows matched 30000;
+----------+
| COUNT(*) |
+----------+
| 30000 |
+----------+
1 row IN SET (0.14 sec)
MariaDB [test]> SELECT COUNT(*) FROM (SELECT 1 FROM test limit 30000) x;
+----------+
| COUNT(*) |
+----------+
| 30000 |
+----------+
1 row IN SET (0.19 sec)
-- // -----------------------------------------------------------------
-- // Counting 80000 rows
-- // -----------------------------------------------------------------
MariaDB [test]> SELECT COUNT(*) FROM test limit rows matched 80000;
+----------+
| COUNT(*) |
+----------+
| 80000 |
+----------+
1 row IN SET (0.36 sec)
MariaDB [test]> SELECT COUNT(*) FROM (SELECT 1 FROM test limit 80000) x;
+----------+
| COUNT(*) |
+----------+
| 80000 |
+----------+
1 row IN SET (0.47 sec)
* Finally
written on top of this blog, the priamry purpose of LIMIT ROW MATCHED syntax is limiting rows during count.In message service or some other services, usally we need to count rows matched where condition.
But almost case we don't need to count all matched rows. and counting all matched rows need more cpu cycle and disk reads.
Yes, we can limit counting rows with sub-query on FROM clause, but this solution need memory or disk internal temporary table.
Even I saw some people use "SELECT 1 FROM ... LIMIT n" and counting rows on client side. This solution also need additional network bandwidth.
But we can do limited counting job with minimal computing power. Because LIMIT ROWS MATCHED features never use additional network bandwidth and internal temporary table.
* Limitations
1. "LIMIT ROWS MATCHED" syntax can not be used with "LIMIT ROWS EXAMINED" together.2. "LIMIT ROWS MATCHED n" will push "LIMIT n" automatically. So MariaDB can not employ some semi-join optimizations when query contains "LIMIT ROWS MATCHED n" (This limitation is applied "LIMIT n" query too).
* Download
https://github.com/SunguckLee/MariaDB** This feature may have some bug, so please be cautious.