What Is an Excerpt?

An excerpt is made of one or more text segments from an indexed document. The segments generally include occurrences of the searched terms. The purpose of the excerpt is to show key parts of the document and therefore help you determine whether the document contains the information you are looking for. In the excerpt, the searched terms are highlighted.

Tip: In .NET search interfaces featuring the Preferences link, you can configure the excerpt to include one to four lines of text (see Modifying .NET Search Interface Preferences).

Example: When you search for speed SSD, the two-line excerpt for the following search result contains highlights of the searched terms.


Note: No excerpt appears in the search result for a copy-protected file (such as a PDF), to avoid showing its content in a context where users can make a copy. When the excerpt is missing or empty for specific documents, your gimpppa.org administrator can verify whether the files are identified as copy protected in the index (see Reviewing Document Details from the Index Browser).

Example: In Adobe Acrobat, in the Password Protection Settings dialog box, when you clear the Enable copying of text, images, and other content check box, the file becomes copy protected, and no excerpt appears in search results for this document.


Why Do Keywords Not Appear in Some Excerpts?

By default, gimpppa.org search interfaces only return search results containing all searched keywords, so you can legitimately expect search result excerpts to include your search keywords. However, in some cases like the ones listed below, one or more of your keywords may not appear in the search result excerpts.

Your query contains many keywords

When a query contains many keywords (such as a long sentence), but the keyword occurrences are scattered in the document, it may be difficult to assemble a few segments (within 200 characters) that include all the keywords.
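To make the 200-character constraint concrete, here is a minimal Python sketch that estimates how many characters are needed to show every keyword once with a little context. The function name min_excerpt_length, the context parameter, and the windowing approach are assumptions made for illustration; they are not the product's actual logic.

```python
def min_excerpt_length(text, keywords, context=20):
    """Rough estimate of the characters needed to show every keyword once.

    Takes the first occurrence of each keyword, pads it with `context`
    characters on each side, merges overlapping windows, and sums the
    window lengths. Returns None if any keyword is absent from the text.
    """
    lowered = text.lower()
    windows = []
    for kw in keywords:
        pos = lowered.find(kw.lower())
        if pos < 0:
            return None
        start = max(0, pos - context)
        end = min(len(text), pos + len(kw) + context)
        windows.append((start, end))
    if not windows:
        return 0

    # Merge windows that overlap so shared context is counted once.
    windows.sort()
    merged = [windows[0]]
    for start, end in windows[1:]:
        last_start, last_end = merged[-1]
        if start <= last_end:
            merged[-1] = (last_start, max(last_end, end))
        else:
            merged.append((start, end))
    return sum(end - start for start, end in merged)
```

When the keywords sit close together, the windows merge and the estimate stays small. When they are scattered, each keyword needs its own window, and a query with many keywords can exceed a 200-character budget.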

How Are Excerpts Generated?

At indexing time, the cleaned text of each item's content is saved in the index. At query time, relevant segments that include the keywords are extracted from the saved cleaned text to build the excerpt.
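The two phases can be pictured with a short Python sketch. Everything in it is hypothetical: the in-memory index dictionary, the clean_text step (here, dropping markup-like tags and collapsing whitespace), and the fixed context window are assumptions chosen to illustrate the idea, not a description of the real index.

```python
import re

index = {}  # hypothetical in-memory index: item id -> cleaned text


def clean_text(raw):
    """Assumed cleaning step: drop markup-like tags, collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", no_tags).strip()


def index_item(item_id, raw_content):
    """Indexing time: save the cleaned text of the item."""
    index[item_id] = clean_text(raw_content)


def segment_for(item_id, keyword, context=30):
    """Query time: extract a segment around the first keyword occurrence."""
    text = index[item_id]
    match = re.search(re.escape(keyword), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - context)
    end = min(len(text), match.end() + context)
    return text[start:end]
```

For example, after index_item("doc1", "<p>Our new <b>SSD</b> doubles the write speed.</p>"), calling segment_for("doc1", "ssd") returns a slice of the saved cleaned text, not of the original markup.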

Note: The compressed cleaned text saved for each item is limited in size (about 32 KB) to optimize the index size and query performance. For large documents (such as PDF documents with several hundred pages), only the content at the beginning of the file can therefore appear in excerpts.
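The effect of this size limit can be illustrated with a simplified Python sketch. It assumes the limit is applied to uncompressed UTF-8 text, which is a simplification (the note above refers to the compressed size), and the names STORED_LIMIT_BYTES, stored_text, and can_appear_in_excerpt are hypothetical.

```python
# Assumption: "about 32 KB", applied here to uncompressed UTF-8 text.
STORED_LIMIT_BYTES = 32 * 1024


def stored_text(cleaned_text, limit=STORED_LIMIT_BYTES):
    """Keep only the beginning of the cleaned text, up to `limit` bytes."""
    data = cleaned_text.encode("utf-8")[:limit]
    # A multi-byte character cut at the boundary is dropped, not an error.
    return data.decode("utf-8", errors="ignore")


def can_appear_in_excerpt(cleaned_text, keyword, limit=STORED_LIMIT_BYTES):
    """True only if the keyword occurs in the part that was kept."""
    return keyword.lower() in stored_text(cleaned_text, limit).lower()
```

In this sketch, a keyword that occurs only near the end of a very long document matches the document itself but can never be shown in its excerpt, because the text that contains it was not kept.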

The algorithm used to generate the excerpts is quite complex. The goal is to extract the most relevant segments around keywords and fit the result in 2 or 3 lines (typically 200 characters).

To help you understand how the excerpt is assembled, here are some of the criteria on which the algorithm is based:

Create a contextual segment around each highlighted keyword:

Ideally a complete sentence

Segment centered on highlighted keywords

Grow small sentences with content from surrounding sentences

Evaluate each segment ranking score based on:

Keyword proximity and completeness

Number of non-stop-word keywords in the segment

Grammatical quality

Segment position in the document (better at beginning)

Average word length

Sentence length

Keep the best segments:

Skip segments not bringing new keywords to maximize completeness

Merge overlapping segments

If not enough characters:

Grow best segments first

Merge small nearby sentences
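A few of the criteria above can be sketched in Python. This is a heavily simplified illustration, not the actual algorithm: it uses whole sentences as segments, scores them only by the number of distinct non-stop-word keywords and by position in the document, skips segments that bring no new keyword, and stops adding segments once the character budget is used. Growing and merging segments, grammatical quality, and word or sentence length are left out. The stop-word list, the bracket highlighting, and all function names are assumptions.

```python
import re

STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}  # illustrative


def split_sentences(text):
    """Return (start, end) spans of sentences, using naive punctuation rules."""
    spans = []
    for m in re.finditer(r"[^.!?]+[.!?]?", text):
        raw = m.group()
        stripped = raw.strip()
        if stripped:
            start = m.start() + (len(raw) - len(raw.lstrip()))
            spans.append((start, start + len(stripped)))
    return spans


def highlight(segment, keywords):
    """Mark keyword occurrences with square brackets."""
    def mark(m):
        word = m.group()
        return "[" + word + "]" if word.lower() in keywords else word
    return re.sub(r"\w+", mark, segment)


def build_excerpt(text, query, max_chars=200):
    keywords = {w for w in re.findall(r"\w+", query.lower())
                if w not in STOP_WORDS}
    candidates = []
    for start, end in split_sentences(text):
        words = set(re.findall(r"\w+", text[start:end].lower()))
        found = keywords & words
        if found:
            candidates.append((start, end, found))
    # Rank: more distinct keywords first, then earlier position.
    candidates.sort(key=lambda c: (-len(c[2]), c[0]))

    kept, covered, used = [], set(), 0
    for start, end, found in candidates:
        if found <= covered:
            continue  # brings no new keyword
        if used + (end - start) > max_chars:
            continue  # budget counts segment text only, not separators
        kept.append((start, end))
        covered |= found
        used += end - start
    kept.sort()  # restore document order
    return " ... ".join(highlight(text[s:e], keywords) for s, e in kept)
```

With the text "The new drive is an SSD. It was released last year. Write speed is very high. The SSD is cheap." and the query speed SSD, this sketch keeps the first and third sentences and skips the last one, because it brings no new keyword.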

