INSERTing into a tag_map in MySQL

107 Views Asked by At

I have a simple article and tag_map tables as

CREATE TABLE Articles
(
  ArticleID int(11) unsigned NOT NULL AUTO_INCREMENT,
  Title varchar(255),
  PRIMARY KEY(ArticleID)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci

CREATE TABLE Tags
(
  TagID int(11) unsigned NOT NULL AUTO_INCREMENT,
  Tag varchar(255),
  UNIQUE INDEX(Tag),
  PRIMARY KEY(TagID)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci

CREATE TABLE TagMap
(
  ArticleID int(11) unsigned NOT NULL,
  TagID int(11) unsigned NOT NULL,
  INDEX(TagID),
  PRIMARY KEY(ArticleID,TagID)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci

I add tags via PHP

$result = $mysqli->query("SELECT TagID FROM Tags WHERE Tag='$tag'");

if($result->num_rows == 1) {
    $row = $result->fetch_assoc();
    $tag_id = $row['TagID'];
}
else {
    $mysqli->query("INSERT INTO Tags (Tag) VALUES ('$tag')");
    $tag_id = $mysqli->insert_id;
}

$mysqli->query("INSERT INTO TagMap (ArticleID,TagID) VALUES ($article_id,$tag_id)");

I wonder if there is a faster way to do this in one query within MySQL.

Here, I need 2 or 3 queries for adding each tag.

Additionally, I hope to find a way for batch INSERT (possibly via LOAD DATA LOCAL INFILE) when we have a list of tags as

ArticleID,Tag
1,tag2
2,tag11
4,tag3
2

There are 2 best solutions below

8
On BEST ANSWER

A pattern:

CREATE PROCEDURE load_to_TagMap ()
BEGIN
-- create table for loading data
CREATE TABLE tmp_TagMap ( ArticleID INT, Tag VARCHAR(255) ) ENGINE = Memory;
-- load data from file
LOAD DATA INFILE '/directory/filename.ext'
    INTO TABLE tmp_TagMap
    SKIP 1 LINES;
-- add absent tags into Tags table
INSERT INTO Tags (Tag)
    SELECT tmp_TagMap.Tag
    FROM tmp_TagMap 
    LEFT JOIN Tags USING (Tag)
    WHERE Tags.Tag IS NULL;
-- insert loaded data into TagMap table with lookup
INSERT INTO TagMap
    SELECT ArticleID, TagID
    FROM Tag
    JOIN tmp_TagMap USING (Tag);
-- remove loaded data table
DROP TABLE tmp_TagMap;
END

From PHP simply execute CALL load_to_TagMap;.

2
On

Overnormalization.

"Tags" tend to be short strings, correct? The overhead of creating an INT for each and doing a secondary lookup is not worth it. Replace Tags and TagMap with

CREATE TABLE Tags
(
  ArticleID int(11) unsigned NOT NULL,
  Tag VARCHAR(255) NOT NULL,
  PRIMARY KEY(ArticleID,Tag)
  INDEX(Tag, ArticleID)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci

This may be of interest: http://mysql.rjweb.org/doc.php/lists

More

SELECT COUNT(*) FROM Tags WHERE Tag = '...';

is very efficient, even with Tag being VARCHAR. This also simplifies your code -- you don't need extra code to bump a counter; also it is easy to decrement the counters when an Article is removed:

DELETE FROM Tags WHERE ArticleID = ...;

If you expect to have 100K articles per tag, then there could be a performance problem. How many articles and tags are you expecting?

If the bigger picture is "Display the 'latest' 10 Articles for Tag='...', then the performance problem will be in the ORDER BY date DESC LIMIT 10. Currently that involves a join to the Article table, check for not 'deleted', sort, etc. But I have a solution for that: http://mysql.rjweb.org/doc.php/lists