Exam (elaborations)

ISBI - 1 Practice Question Guide With Accurate Answers.

Pages
4
Grade
A+
Uploaded on
13-06-2024
Written in
2023/2024

What is the main advantage of cosine similarity?
- correct answer The length of a document does not influence the similarity score.

What does retrieval precision measure?
- correct answer Of the retrieved documents, how many are relevant to the query? In other words, how clean the search result is.

What does recall measure?
- correct answer Of all the relevant documents in the collection, how many were retrieved?

Why do search engines do link analysis, e.g. PageRank?
- correct answer To determine the connections and relations between pages and sites on the web.

What is pooling in IR? Why is it needed? How does it work?
- correct answer Pooling is used to estimate the performance of retrieval systems when the target document collection is very large. A subset of the collection (typically the top-ranked results from several systems) is selected and used to evaluate precision and recall. This gives a manageable amount to evaluate, since the relevance judging is done manually by humans and not by computers.

What is search engine spam?
- correct answer Various methods of cheating the standard principles for earning a good ranking in search engines, instead of playing fair with white-hat SEO. Example: link spam makes a page appear more interesting and popular than it is.

What is word2vec? What does it do?
- correct answer It was developed at Google and uses distributional semantics to produce word space models. It can distinguish unknown entities and their relations by analyzing the context of the targeted words.

What is the main data source for Web Usage Mining?
- correct answer Web server or application server log files.

What is Named Entity Recognition?
- correct answer It is a technique that can identify entities such as names, people, companies and products in text documents.

Why is the Boolean retrieval model still being used?
- correct answer Because it is easy to understand, and because the user has a lot of control over the search, which the newer models lack.

What is the principle of cosine similarity?
- correct answer It is used to calculate the similarity between two documents. This is done by taking the two document vectors in an n-dimensional vector space and calculating the cosine of the angle between them.

What is relevance feedback?
- correct answer The idea is to take the results returned for a given query, gather user feedback on whether those results are relevant, and use that feedback to perform a new query.

What is an XML sitemap?
- correct answer It is a document that the owner of a website fills out in order to be indexed by the search engine. It tells search engines which pages should be crawled and listed in search results.

Name 3 features Google uses to judge how trustworthy a website is
- correct answer Deep linking; domain diversity; incoming links from other sites.

What is a no-follow link?
- correct answer It tells the crawler not to follow the link or pass ranking credit through it. It was introduced to combat link spam, for example in blog comments.

What does an entry in a server log correspond to and what type of information does it contain?
- correct answer It corresponds to a user's contact with the server and the request it carries, e.g. the time and date of the request, the IP address, and the browser the user is on.

What are stop words? And why can it be good to remove them when opinion mining?
- correct answer Stop words are common words that don't really give us any information about a text. Removing them can increase accuracy, since it decreases the amount of data to go through and leaves more relevant data as well.

What is competitive intelligence?
- correct answer It is a part of business intelligence that mostly focuses on what is happening outside the company rather than inside it.

What is query expansion? What is its purpose?
- correct answer Adding relevant terms to the query to make the results more accurate. Its purpose is to increase the recall of the query.

What potential problem can dynamic content pages cause to crawlers?
- correct answer Spider traps. When dynamic content is created, new links are created, and these links can lead to more dynamic pages, which can put the crawler in a loop.

What is a link anchor text? Why is it a good place for page-relevant keywords?
- correct answer It is the visible, clickable text of a link, e.g. "press here to see info". Search engines treat it as a description of the linked page, so keywords placed there help signal what that page is about.

What is evergreen content? Why is it good for SEO?
- correct answer It is content that stays relevant and keeps the website seeming up to date, e.g. forum posts, FAQs, etc. By keeping the page fresh, the crawlers reindex it, which shows the search engines that the website is still in use and active.

What is the main source of data for analyzing user behavior on the web?
- correct answer Web server logs; application server logs; generated usage data.

What is Named Entity Recognition? What can it be used for?
- correct answer It is used to identify names, companies, cities, etc. for business intelligence. Companies can use it to monitor their competitors, for example.

What is the difference between syntagmatic and paradigmatic relations in semantics?
- correct answer Syntagmatic relations hold between words that co-occur in the same context, while paradigmatic relations hold between words that occur in similar contexts and can substitute for each other.

What are long-tail keywords? Why are they good?
- correct answer Keywords or key phrases that are more specific and usually longer than more commonly used keywords. They are good because they raise the chances of more accurate search results, as they make the query more specific.
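
The cosine similarity answers above can be illustrated with a small sketch. It assumes raw term-frequency vectors and a whitespace tokenizer, both chosen here only for illustration (real systems usually apply TF-IDF weighting and better tokenization). The function name `cosine_similarity` is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine of the angle between two raw term-frequency vectors."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    vec_a = Counter(doc_a.lower().split())
    vec_b = Counter(doc_b.lower().split())

    # Dot product: only terms present in both documents contribute.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())

    # Euclidean length of each vector.
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))

    # An empty document has no direction, so treat its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


short_doc = "search engine ranking"
long_doc = " ".join([short_doc] * 3)  # same content, three times as long
unrelated_doc = "named entity recognition"

same_direction = cosine_similarity(short_doc, long_doc)
no_overlap = cosine_similarity(short_doc, unrelated_doc)
```

Here `same_direction` comes out very close to 1.0 even though one document is three times longer, which is the length-independence advantage named in the first answer, while `no_overlap` is 0.0 because the two documents share no terms.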

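Precision and recall, as defined in the answers above, can be computed from two sets of document IDs. This is a minimal sketch: the function name `precision_recall` and the document IDs are made up for illustration, and the relevant set is assumed to come from human judgments (for example via pooling).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: IDs the system returned for the query.
    relevant:  IDs judged relevant in the whole collection.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # relevant documents that were retrieved

    # Precision: share of the retrieved documents that are relevant.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    # Recall: share of all relevant documents that were retrieved.
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query result: 4 documents returned, 5 relevant in the collection.
retrieved_ids = ["d1", "d2", "d3", "d4"]
relevant_ids = ["d1", "d3", "d5", "d6", "d7"]

precision, recall = precision_recall(retrieved_ids, relevant_ids)
# Two hits (d1, d3): precision = 2/4 = 0.5, recall = 2/5 = 0.4
```

Query expansion typically raises recall by retrieving more of the relevant set, often at some cost to precision.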
Institution
CSBI.
Course
CSBI.








Contains
Questions & answers

Get to know the seller

RealGrades Nursing