Wednesday, 28 August 2013

Selecting for the absence of value in a one-to-many table design

Selecting for the absence of value in a one-to-many table design

I've been working all day on a problem that I expected to be simple, but
is proving incredibly elusive. I suspect I may not be asking the right
questions, so please bear with me.
I have a table with a bunch of newspaper articles for a research project.
The idea is that researchers can tag individual articles. These tags are
stored in a second table. An article can have any number of tags.
I can then select articles with a certain tag by using;
SELECT *,
GROUP_CONCAT(`TAGS`.`TAG`) AS tags
FROM ARTICLES
LEFT JOIN TAGS
ON TAGS.ID = ARTICLES.ID
WHERE TAGS.TAG = 'search term'
GROUP BY ARTICLES.ID;
My problems start when I want to select articles based on the absence of a
particular tag. If an article has only one tag, the result is as expected,
but if there is more then one tag associated with an article, the tag is
simply ommitted.
SELECT *,
GROUP_CONCAT(`TAGS`.`TAG`) AS tags
FROM ARTICLES
LEFT JOIN TAGS
ON TAGS.ID = ARTICLES.ID
WHERE TAGS.TAG != 'search term'
OR TAGS.TAG IS NULL
GROUP BY ARTICLES.ID;
if the original tables where as follows;
ID Name
1 Article #1
2 Article #2
and;
ArticleID Tag
1 New
1 Long
1 Boring
2 Old
2 Long
2 Interesting
Then if I use the above query to select articles where tag != Boring, the
results would be;
ArticleID Name Tags
1 Article #1 New, Long
2 Article #2 Old, Long, Interesting
How can I make it exclude the first article altogether, rather then just
excluding that tag? Keeping in mind that there are over a hundred thousand
articles in the database, what is the most efficient way to do this? I've
looked at dozens of other questions and google searches, but selecting for
the absence of a tag like this is something I could not find advice on.
On a sidenote, I am currently using a one-to-many table, as each tag
appears once for each article it is linked to. I noticed that a lot of
people in similar scenarios use a many-to-many design. Is this that much
faster then having just a foreign key in the tags table referencing the
article table?
Thank you for helping out an SQL noob :).

No comments:

Post a Comment