linked lists: or how to store a live qued-list properly

35 Views Asked by At

I'm working on a scheduler, for our internal production. the story of the problem is this: we have incoming 'jobs', that get sent to various stations 'machines'. we are storing the list of jobs in a mysql database. I want to make an interface for the production coordinator: the person responsible for getting jobs onto the correct machine - which would consist of a list of incoming jobs, and lists for each machine representing the job que for that machine.

some of the actions we need to be able to do: -reorder the list without a massive write to a sql DB - such as placing a job on a machine at the top of the que not at the end of a que, having the rest of the que update properly(a cascade of renumbering is what I'm trying to avoid).

things i'm worried about are scalability, if we get 100000's of jobs, or if we massively increase the number of our machines.

some of the things we have considered - and are still debating: creating a table with the following structure: jobkey, machineid, status, queposition. with a constraint of queposition being unique in respect to machineid. this would be good for any number of machines but quickly updating the que would require a search through all the db for each machine id and re-ordering would be less that clean. another thought was to have a prekey and postkey field to simulate a linked list, where the prekey is the table key to the last que entry, and post would point to the next. which would solve some problems, but kick the troubles can down to displaying the que.

neither of these seem to solve the problem completely.

this seems like a general enough problem I'm sure there's a good solution to que's - and modifying ordering of said list without conflict. working in javascript, php and mysql.

2

There are 2 best solutions below

0
On

The idea of having a "queue position" column makes good sense to me. (Go ahead and index on that column combined with machine-id.) Then, when querying the database, ORDER BY ... DESC this queue-position column and some other column that makes sense ... a timestamp, say.

Queue-position values don't have to be consecutive, nor unique, and a simple select.. MAX() query can tell you the highest value that's currently out there (for a particular machine). Just update the one row that you want to bump to the top.

Another simple trick, if you do anticipate a slight amount of re-ordering, is to borrow from the old BASIC-programming days and use queue-position values that are incremented, say, by 10. This gives you unused values that you can use if you'd like your positioning to be fairly accurate. (As I said, the values don't have to be consecutive and they don't have to be unique.)

0
On

So you have 100000 jobs, and you're worried about the work it will take to update jobs? 100000 isn't much really. A standard mysql database should be able to process these queries pretty fast. This let's the database resort all relevant items.

Creating the schema on mysql fiddle takes 187 ms.

Running it on my local dev machine, the insertion of the 100.000 dummy rows took 650ms. The actual updates of sort order? Not measurable.

17:39:40 CREATE TABLE test ( id INT(11) NOT NULL AUTO_INCREMENT, machine_id INT NULL, job_id INT NULL, sort_order INT NULL, PRIMARY KEY (id),INDEX machine (machine_id ASC, sort_order ASC)) 0 row(s) affected 0.016 sec

17:39:40 insert into test (machine_id, job_id, sort_order) values (1,1,1) 1 row(s) affected 0.000 sec

17:39:40 insert into test (machine_id, job_id, sort_order) select t.machine_id, t.job_id, t.sort_order from test t 1 row(s) affected Records: 1 Duplicates: 0 Warnings: 0 0.000 sec

17:39:40 insert into test (machine_id, job_id, sort_order) select t.machine_id, t.job_id, t.sort_order from test t, test t2, test t3, test t4 16 row(s) affected Records: 16 Duplicates: 0 Warnings: 0 0.000 sec

17:39:40 insert into test (machine_id, job_id, sort_order) select t.machine_id, t.job_id, t.sort_order from test t, test t2, test t3, test t4 104976 row(s) affected Records: 104976 Duplicates: 0 Warnings: 0 0.657 sec

17:39:41 update test set machine_id = 50 where id > 49555 and id < 49999 443 row(s) affected Rows matched: 443 Changed: 443 Warnings: 0 0.000 sec

17:39:41 UPDATE test MAIN INNER JOIN ( SELECT id, machine_id, @rowNumber := @rowNumber + 10 AS rn FROM test , (SELECT @rowNumber := 0) var where machine_id=50 ORDER BY sort_order ASC ) AS t ON MAIN.id = t.id SET MAIN.sort_order = t.rn where MAIN.machine_id = 50 443 row(s) affected Rows matched: 443 Changed: 443 Warnings: 0 0.000 sec

My suggestion is: Make sure your indexes are okay, and let the database do the heavy lifting. Make a few tests in your code that utilize the database to max potential, put the database server under strain. Have it do a days work in 5 minutes. My bet is, it will hold up.

Make sure that your sort_order always has a difference of 10 steps between sorting. That way you can always insert "9" jobs before having to resort the relevant entries. Also have a ligthweight relational table, that links jobs to machines, with only the sort order as extra data, and possibly a primary key added to be able to easily directly update a job relation.

Take for example this demo database: http://sqlfiddle.com/#!9/3cf11e/1

CREATE TABLE `test` (
  `id` INT(11) NOT NULL AUTO_INCREMENT,
  `machine_id` INT NULL,
  `job_id` INT NULL,
  `sort_order` INT NULL,
  PRIMARY KEY (`id`),INDEX `machine` (`machine_id` ASC, `sort_order` ASC));
  
insert into `test` (machine_id, job_id, sort_order) values (1,1,1);
insert into `test` (machine_id, job_id, sort_order) select t.machine_id, t.job_id, t.sort_order from `test` t;
insert into `test` (machine_id, job_id, sort_order) select t.machine_id, t.job_id, t.sort_order from `test` t, `test` t2, `test` t3, `test` t4; 
insert into `test` (machine_id, job_id, sort_order) select t.machine_id, t.job_id, t.sort_order from `test` t, `test` t2, `test` t3, `test` t4; 
update test set machine_id = 50 where id > 49555 and id < 49999;


UPDATE 
test MAIN 
INNER JOIN 
(
    SELECT 
        id,
        machine_id,
        @rowNumber := @rowNumber + 10 AS rn
    FROM test , (SELECT @rowNumber := 0) var
  where machine_id=50
    ORDER BY sort_order ASC
) AS t
ON MAIN.id = t.id
SET MAIN.sort_order = t.rn
where MAIN.machine_id = 50;

select * from test where machine_id = 50;

Record Count: 443; Execution Time: 11ms