I have two tables, users and points. Currently users has 84,263 rows, while points has 1,636,119 rows. Each user can have 0 or multiple points and I need to extract which point was created last.
show create table users
CREATE TABLE `users` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`email` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`password` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`remember_token` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`role` varchar(15) COLLATE utf8_unicode_ci DEFAULT 'consument',
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`updated_at` timestamp NOT NULL DEFAULT current_timestamp(),
`deleted_at` timestamp NULL DEFAULT NULL,
`email_verified_at` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`email_verify_token` text COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `users_email_unique` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=84345 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
show create table points
CREATE TABLE `points` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL,
`tablet_id` int(10) unsigned DEFAULT NULL,
`parent_company` int(10) unsigned NOT NULL,
`company_id` int(10) unsigned NOT NULL,
`points` int(10) unsigned NOT NULL,
`mutation_type` tinyint(3) unsigned NOT NULL,
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`updated_at` timestamp NOT NULL DEFAULT current_timestamp(),
PRIMARY KEY (`id`),
KEY `points_user_id_foreign` (`user_id`),
KEY `points_company_id_foreign` (`company_id`),
KEY `points_parent_company_index` (`parent_company`),
KEY `points_tablet_id_index` (`tablet_id`),
KEY `points_mutation_type_company_id_created_at_index` (`mutation_type`,`company_id`,`created_at`),
KEY `created_at_user_id` (`created_at`,`user_id`),
CONSTRAINT `points_company_id_foreign` FOREIGN KEY (`company_id`) REFERENCES `companies` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `points_parent_company_foreign` FOREIGN KEY (`parent_company`) REFERENCES `parent_company` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `points_tablet_id_foreign` FOREIGN KEY (`tablet_id`) REFERENCES `tablets` (`id`) ON DELETE SET NULL ON UPDATE CASCADE,
CONSTRAINT `points_user_id_foreign` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1798627 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Queries I tried, but are taking too long (we're talking in minutes, not seconds):
select
`users`.`id`,
`users`.`email`,
`users`.`role`,
`users`.`created_at`,
`users`.`updated_at`,
max(pt.created_at) as `last_transaction`
from `users`
left join points as pt on pt.user_id = users.id
where `users`.`role` = 'consument' and `users`.`deleted_at` is null
group by users.id
select
`users`.`id`,
`users`.`email`,
`users`.`role`,
`users`.`created_at`,
`users`.`updated_at`,
pt.created_at as `last_transaction`
from `users`
left join (select points.user_id, points.created_at from points order by points.created_at desc) as pt on pt.user_id = users.id
where `users`.`role` = 'consument' and `users`.`deleted_at` is null
group by users.id
Why am I not limiting the results and returning only 100 at a time? Because I am using Yajra DataTables for Laravel and when limiting results, it only returns limited results and it does not recognize that there are more. So instead of 84,263 rows, I only get 100 rows and that's it.
Basically your "users" table has a "role" column. It is not indexed. So your queries are doing full table scan on "users" table which has 84263 rows. One way to to optimize it would be to have an index on "role" column. But I can see "consument" is the default value & you are querying by that value. Now suppose 95% of users are having "consument" role. Then even adding index on "role" won't help much. You would have to add more condition to filter out the query & have an index for that condition.
Your first query is better as it would avoid unnecessary inner query of second one.
If you need to return 84263 rows, then that is a sperate issue. Somehow you would have to introduce pagination. You would have to break your queries to multiple queries. Suppose in each call you return 500 users data. You can sort it by id. And in subsequent call, you can ask for next 500 where id is greater than the last id returned in the previous query (for the very first call last id value would be 0). Then the queries can use "id" as index.
You can check the query plan using "explain" keyword & can have better understanding.