I've been struggling lately with a complex SQL query.
I have the following tables:
[dbo].[User] ~ {ID,nickname}
[dbo].[Property] ~ {ID,title}
[dbo].[Property_Values] ~ [ID,propertyID,title}
[dbo].[Property_Values_User_Linkage] ~ {UserID,PropertyID,valueID}
It's basically a project in which a user chooses values for each property. Each property can be single-value or multi-value. For instance, user might choose multiple values for Property {ID=1,title=Hobbies} but must choose a single value for Property {ID=2,title=HairColor}.
Using another table - [dbo].[Search_Property_Values_User_Linkage] - {UserID,PropertyID,valueID} I'm choosing what properties I want and I expect to find matching users. BUT, If I haven't chosen a value (or multi-value) for, let's say HairColor, I should get all users (since I don't want to filter by HairColor).
So far it's easy, but the problem I can't seems to solve is the case where there are multi-values ahead or no user defined values. For example, I want all users with HairColor=Brown and Hobbies IN(basketball,football).
I can retrieve all users who match one of the terms (and those who have other properties, since I haven't chosen to filter them) but I can't get only the users who absolutely match my criteria.
To put code into words, let's just say I need all users who:
- Match ALL property-value that I've chosen
- Might have other properties, such as EyesColor, but since I haven't chosen a filtering value, they might be retrieved as well.
- Might not have any property set at all, but since I haven't chosen a value for this property, they are VALID!
- Match all selected properties as group, not just one property that I've chosen (users who likes basketball but have "red" for their HairColor are INVALID!
I came across with a solution in which I create a virtual table which "completes" non-chosen values by bits. For instance (not actual code):
DECLARE @isMatches bit
SET @isMatches=0
if [propertyIsChosen]=1
{
if [userInProperty]=1 SET @isMatches=1
}
else SET isMatches=1
I basically CROSS-JOIN [Property] WITH [User]
and LEFT-OUTER-JOIN the rest of the tables to match selections.
I get all users and their match to a property. This is not good enough since I get users who have Brown hair, those who like basketball/football, but not those who matches to both (and of course any other undefined property by me).
This is heavy, but it's the what I got so far to analyze the problem.
I would appreciate any help. I think I'm missing something from my math classes, 10 years ago...
EDIT: db pic: http://i51.tinypic.com/2n1cfwg.png
Data Model
From you comment, I see that you have some other property-related tables that you have not mentioned in your question. I'm gonna assume we can ignore these for now and just concentrate on representing property values, which can be done by a simplified model below:
DDL SQL (MS SQL Server):
A multi-value is represented by several rows in USER_PROPERTY_VALUE, sharing the same PROPERTY_NAME but each having a distinct PROPERTY_VALUE_NO.
The following example data...
...is represented in the database like this:
USER:
USER_PROPERTY_VALUE:
Example Queries
Select users with brown hair color:
Result:
Select users whose hobbies include both basketball and football:
Result:
Select users whose hair color is brown and hobbies include both basketball and football.
Result:
Etc, etc...
Querying Based on the Contents of Another Table
Let's say you have a table that contains the filter criteria:
The following query will return only users that fulfill the criteria currently contained in that table:
In plain English: if there is a filter property that is not also users's property, ignore that user.
BTW, this will not work in concurrent multi-user environment - you'll need to introduce an additional field in the PROPERTY_FILTER table to identify the "session", or use temporary tables (if you don't require persistence).