The relevance model just estimates relevance feedback based on the feedback documents. In this case, the relevance model would have a higher probability of picking common words as its feedback terms. Thus, I assumed the performance of the relevance model wouldn't be as good as that of the other two models. However, I learned that all of these models perform quite well. What would be the reason for that?
"In contrast, the relevance model just estimates the relevance feedback based on feedback documents. In this case, the relevance model would have a higher probability of getting common words as its feedbacks"That's a common perception which isn't necessarily true. To be more specific, recall that the estimation equation of relevance model looks like:
P(w|R) = \sum_{D \in \text{Top-}K} P(w|D) \prod_{q \in Q} P(q|D)

which in simple English means the following.
To compute the weight of a term w over the set of top-K documents, you iterate over each document D in the top-K and multiply P(w|D) by the similarity score of Q with D (this is the value \prod_{q \in Q} P(q|D)). Now, the idf factor is hidden inside the expression P(w|D). Following the standard language modeling paradigm (Jelinek-Mercer or Dirichlet smoothing), this isn't just a simple maximum-likelihood estimate but rather a collection-smoothed version; e.g., for Jelinek-Mercer, this is:
P(w|D) = \log\left(1 + \frac{\lambda}{1-\lambda} \cdot \frac{\mathrm{count}(w,D)}{|D|} \cdot \frac{|C|}{\mathrm{cf}(w)}\right)

which is nothing but a linear-combination-based generalization of tf-idf: the second component, |C|/\mathrm{cf}(w) (collection size over collection frequency), specifically denotes the inverse collection frequency. So this expression of P(w|D) ensures that terms with higher idf values tend to get higher weights in the relevance model estimation. In addition to having high idf weights, such terms must also co-occur strongly with the query terms, due to the product of P(w|D) with \prod_{q \in Q} P(q|D).
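To make the estimation concrete, here is a minimal sketch of this RM1-style term weighting with Jelinek-Mercer smoothing. Everything in it (the toy documents, the LAMBDA value, the helper names) is an illustrative assumption, not code from any particular IR library:

```python
# Sketch of relevance model (RM1) term weighting with Jelinek-Mercer
# smoothing over a toy in-memory corpus. All names and values here
# (docs, LAMBDA, the query) are illustrative assumptions.
from collections import Counter

LAMBDA = 0.6  # JM smoothing weight (assumed value)

# Toy top-K feedback documents, pre-tokenized
docs = [
    "information retrieval relevance feedback model".split(),
    "relevance model estimates term weights from feedback documents".split(),
    "language model smoothing jelinek mercer dirichlet".split(),
]

# Collection statistics for smoothing (here pooled over the toy docs)
collection = Counter(tok for d in docs for tok in d)
collection_size = sum(collection.values())

def p_w_given_d(w, doc):
    """JM-smoothed P(w|D): lambda * ML estimate + (1 - lambda) * collection model."""
    ml = doc.count(w) / len(doc)
    coll = collection[w] / collection_size
    return LAMBDA * ml + (1 - LAMBDA) * coll

def query_likelihood(query, doc):
    """P(Q|D) = product over query terms q of smoothed P(q|D)."""
    score = 1.0
    for q in query:
        score *= p_w_given_d(q, doc)
    return score

def relevance_model(query):
    """RM1: P(w|R) proportional to sum over top-K docs of P(w|D) * P(Q|D)."""
    weights = Counter()
    for doc in docs:
        qlik = query_likelihood(query, doc)
        for w in set(doc):
            weights[w] += p_w_given_d(w, doc) * qlik
    # Normalize so the weights form a distribution over terms
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}

query = "relevance feedback".split()
for w, p in sorted(relevance_model(query).items(), key=lambda x: -x[1])[:5]:
    print(f"{w:12s} {p:.4f}")
```

Running this, the top-weighted expansion terms are those that both appear in documents scoring high on P(Q|D) and carry the idf-like boost baked into the smoothed P(w|D), which is exactly why the relevance model does not simply drift toward common words.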