You can search for any word or phrase on a Web site by typing it into a query form and clicking the button that executes the query (for example, the Execute Query button on the sample query form).
Searches produce a list of files that contain the word or phrase, no matter where it appears in the text.
Boolean and proximity operators can create a more precise query.
To Search For | Example | Results |
---|---|---|
Both terms in the same page | access and basic -or- access & basic | Pages with both the words “access” and “basic” |
Either term in a page | cgi or isapi -or- cgi \| isapi | Pages with the words “cgi” or “isapi” |
The first term without the second term | access and not basic -or- access & ! basic | Pages with the word “access” but not “basic” |
Pages not matching a property value | not @size = 100 -or- ! @size = 100 | Pages that are not 100 bytes |
Both terms in the same page, close together | excel near project -or- excel ~ project | Pages with the word “excel” near the word “project” |
Hints:
Note The symbols (&, |, !, ~) and the English keywords AND, OR, NOT, and NEAR work the same way in all languages supported by Index Server. Localized keywords are also available when the browser locale is set to one of the following six languages:
Language | Keywords |
---|---|
German | UND, ODER, NICHT, NAH |
French | ET, OU, SANS, PRES |
Spanish | Y, O, NO, CERCA |
Dutch | EN, OF, NIET, NABIJ |
Swedish | OCH, ELLER, INTE, NÄRA |
Italian | E, O, NO, VICINO |
Note The NEAR operator can be applied only to words or phrases.
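As a minimal sketch of how the operator forms above relate, the following Python helper joins two terms with either the English keyword, the symbol, or a localized keyword. The function names are hypothetical, and the mapping assumes that each row of the language table lists its keywords in the order AND, OR, NOT, NEAR.

```python
# Symbol forms of the operators, as listed in the operator table.
SYMBOLS = {"AND": "&", "OR": "|", "NOT": "!", "NEAR": "~"}

# Localized keywords from the language table.
# Assumption: each row is ordered AND, OR, NOT, NEAR.
LOCALIZED = {
    "German": {"AND": "UND", "OR": "ODER", "NOT": "NICHT", "NEAR": "NAH"},
    "French": {"AND": "ET", "OR": "OU", "NOT": "SANS", "NEAR": "PRES"},
    "Spanish": {"AND": "Y", "OR": "O", "NOT": "NO", "NEAR": "CERCA"},
    "Dutch": {"AND": "EN", "OR": "OF", "NOT": "NIET", "NEAR": "NABIJ"},
    "Swedish": {"AND": "OCH", "OR": "ELLER", "NOT": "INTE", "NEAR": "NÄRA"},
    "Italian": {"AND": "E", "OR": "O", "NOT": "NO", "NEAR": "VICINO"},
}


def join_terms(operator, left, right, language=None):
    """Join two terms with a keyword operator (English unless a language is given)."""
    if operator not in SYMBOLS:
        raise ValueError(f"unknown operator: {operator}")
    keyword = operator.lower() if language is None else LOCALIZED[language][operator]
    return f"{left} {keyword} {right}"


def join_terms_symbolic(operator, left, right):
    """Join two terms with the symbol form, which is the same in every language."""
    return f"{left} {SYMBOLS[operator]} {right}"


print(join_terms("AND", "access", "basic"))             # access and basic
print(join_terms_symbolic("NEAR", "excel", "project"))  # excel ~ project
print(join_terms("OR", "cgi", "isapi", "German"))       # cgi ODER isapi
```

The symbol form is the safer choice when the browser locale is not known in advance, because it does not depend on the language.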
Wildcard operators help you find pages containing words similar to a given word.
To Search For | Example | Results |
---|---|---|
Files that match free-text | $contents how do I print in Microsoft Excel? | Pages that mention printing and Microsoft Excel. |
The query engine supports vector space queries. Vector queries return pages that match a list of words and phrases. The rank of each page indicates how well the page matched the query.
To Search For | Example | Results |
---|---|---|
Pages that contain specific words | light, bulb | Files with words that best match the words being searched for |
Pages that contain weighted prefixes, words, and phrases | invent*, light[50], bulb[10], "light bulb"[400] | Files that contain words prefixed by “invent,” the words “light,” “bulb,” and the phrase “light bulb” (the terms are weighted) |
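The weighted vector query in the table can be assembled programmatically. The sketch below is illustrative only: the helper names are hypothetical, and it simply reproduces the comma-separated, bracket-weighted form shown in the example.

```python
def vector_term(term, weight=None, phrase=False):
    """Format one component of a vector query, optionally quoted and weighted."""
    text = f'"{term}"' if phrase else term
    return text if weight is None else f"{text}[{weight}]"


def vector_query(components):
    """Join already formatted components with commas."""
    return ", ".join(components)


query = vector_query([
    vector_term("invent*"),                       # prefix match, no explicit weight
    vector_term("light", 50),
    vector_term("bulb", 10),
    vector_term("light bulb", 400, phrase=True),  # phrase, highest weight
])
print(query)  # invent*, light[50], bulb[10], "light bulb"[400]
```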
With property value queries, you can find files whose property values match given criteria. The properties you can query include basic file information, such as file name and file size, and ActiveX properties, including the document summary information stored in files created by ActiveX-aware applications.
There are two types of property queries: relational property queries, which use the @ prefix, and regular expression property queries, which use the # prefix.
Property names are preceded by either the “at” (@) or number sign (#) character. Use @ for relational queries, and # for regular expression queries.
If no property name is specified, @contents is assumed.
Properties available for all files include:
Property Name | Description |
---|---|
All | Matches words, phrases, and any property |
Contents | Words and phrases in the file |
Filename | Name of the file |
Size | File size |
Write | Last time the file was modified |
ActiveX property values can also be used in queries. Web sites with files created by most ActiveX-aware applications can be queried for these properties:
Property Name | Description |
---|---|
DocTitle | Title of the document |
DocSubject | Subject of the document |
DocAuthor | The document’s author |
DocKeywords | Keywords for the document |
DocComments | Comments about the document |
For a complete list of property names, see the List of Property Names later on this page.
Relational operators are used in relational property queries.
To Search For | Example | Results |
---|---|---|
Property values in relation to a fixed value | @size < 100 | Files whose size matches the query |
Property values with all of a set of bits on | @attrib ^a 0x820 | Compressed files with the archive bit on |
Property values with some of a set of bits on | @attrib ^s 0x20 | Files with the archive bit on |
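The two bitwise operators differ in whether every bit of the mask must be set (^a) or only some of them (^s). The following sketch shows how the masks in the table are composed and what each test means; the function names are hypothetical, and the constants are the Win32 file attribute values for the archive (0x20) and compressed (0x800) bits.

```python
# Win32 file attribute bits used in the table examples.
FILE_ATTRIBUTE_ARCHIVE = 0x20
FILE_ATTRIBUTE_COMPRESSED = 0x800


def attrib_query(mask, require_all):
    """Build an @attrib query: ^a requires all mask bits, ^s requires some."""
    operator = "^a" if require_all else "^s"
    return f"@attrib {operator} {mask:#x}"


def attrib_matches(attrib, mask, require_all):
    """Model of the test the query expresses, applied to one attribute value."""
    if require_all:
        return (attrib & mask) == mask
    return (attrib & mask) != 0


both = FILE_ATTRIBUTE_COMPRESSED | FILE_ATTRIBUTE_ARCHIVE
print(attrib_query(both, require_all=True))                     # @attrib ^a 0x820
print(attrib_query(FILE_ATTRIBUTE_ARCHIVE, require_all=False))  # @attrib ^s 0x20
```

A file that has only the archive bit set satisfies the ^s test against 0x820 but not the ^a test, which is why the first example in the table returns only compressed files.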
To Search For | Example | Results |
---|---|---|
A specific value | @DocAuthor = Bill Barnes | Files authored by “Bill Barnes” |
Values beginning with a prefix | #DocAuthor George* | Files whose author property begins with “George” |
Files with any of a set of extensions | #filename *.\|(exe\|,dll\|,sys\|) | Files with .exe, .dll, or .sys extensions |
Files modified after a certain date | @write > 96/2/14 10:00:00 | Files modified after February 14, 1996 at 10:00 GMT |
Files modified after a relative date | @write > -1d2h | Files modified in the last 26 hours |
Vectors matching a vector | @vectorprop = { 10, 15, 20 } | ActiveX documents with a vectorprop value of { 10, 15, 20 } |
Vectors where each value matches a criterion | @vectorprop >^a 15 | ActiveX documents with a vectorprop value in which all values in the vector are greater than 15 |
Vectors where at least one value matches a criterion | @vectorprop =^s 15 | ActiveX documents with a vectorprop value in which at least one value is 15 |
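The relative date example expresses 26 hours as one day plus two hours (-1d2h). A small helper can derive that offset from a time span; this is a sketch with a hypothetical function name, and it assumes that an offset consisting of only the day part or only the hour part is also accepted, which the table does not show.

```python
from datetime import timedelta


def modified_within(age):
    """Build an @write query for files modified within the given time span."""
    total_hours = int(age.total_seconds() // 3600)
    days, hours = divmod(total_hours, 24)
    offset = ""
    if days:
        offset += f"{days}d"
    if hours:
        offset += f"{hours}h"
    if not offset:
        raise ValueError("time span must be at least one hour")
    return f"@write > -{offset}"


print(modified_within(timedelta(hours=26)))  # @write > -1d2h
```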
Regular expressions in property queries are defined as follows:
( opens a group. Must be followed by a matching ).
) closes a group. Must be preceded by a matching (.
[ opens a character class. Must be followed by a matching (un-escaped) ].
{ opens a counted match. Must be followed by a matching }.
} closes a counted match. Must be preceded by a matching {.
, separates OR clauses.
* matches zero or more occurrences of the preceding expression.
? matches zero or one occurrences of the preceding expression.
+ matches one or more occurrences of the preceding expression.
Anything else, including |, matches itself.
Inside a character class (between [ and ]), the following characters have special meaning:
^ matches everything but following classes. Must be the first character.
] matches ]. May only be preceded by ^, otherwise it closes the class.
- range operator. Preceded and followed by normal characters.
Anything else matches itself (or begins or ends a range at itself).
|{m|} matches exactly m occurrences of the preceding expression. (0 < m < 256).
|{m,|} matches at least m occurrences of the preceding expression. (1 < m < 256).
|{m,n|} matches between m and n occurrences of the preceding expression, inclusive. (0 < m < 256, 0 < n < 256).
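To illustrate the group and OR-clause syntax, the sketch below builds a #filename query that matches any of several extensions. It follows the shape of the .exe, .dll, and .sys example given earlier, in which the group delimiters and the OR separator are each written with a preceding | character. The function name is hypothetical.

```python
def extension_query(extensions):
    """Build a #filename regular expression query for a set of file extensions."""
    if not extensions:
        raise ValueError("at least one extension is required")
    # "|," separates OR clauses; "|(" and "|)" open and close the group.
    alternatives = "|,".join(extensions)
    return f"#filename *.|({alternatives}|)"


print(extension_query(["exe", "dll", "sys"]))  # #filename *.|(exe|,dll|,sys|)
```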
Example | Results |
---|---|
@size > 1000000 | Pages larger than one million bytes |
@write > 95/12/23 | Pages modified after the date |
Apple tree | Pages with the phrase “apple tree” |
"apple tree" | Same as above |
@contents apple tree | Same as above |
Microsoft and @size > 1000000 | Pages with the word “Microsoft” that are larger than one million bytes |
"microsoft and @size > 1000000" | Pages with the phrase specified (not the same as above) |
#filename *.avi | Video files (the # prefix is used because the query contains a regular expression) |
@attrib ^s 32 | Pages with the archive attribute bit on |
@docauthor = John Smith | Pages with the given author |
$contents why is the sky blue? | Pages that match the query |
@size < 100 & #filename *.gif | Graphics Interchange Format (GIF) files less than 100 bytes in size |
These properties are always available for queries. Additional properties may also be available depending on the configuration of the Web server.
Friendly Name | Datatype | Description |
---|---|---|
A_HRef | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML HREF. This property name was created for Microsoft® Site Server and corresponds with the Index Server property name HtmlHRef. Can be queried but not retrieved. |
Access | VT_FILETIME | Last time file was accessed. |
All | (not applicable) | Searches every property for a string. Can be queried but not retrieved. |
AllocSize | DBTYPE_I8 | Size of disk allocation for file. |
Attrib | DBTYPE_UI4 | File attributes. Documented in Win32 SDK. |
ClassId | DBTYPE_GUID | Class ID of object, for example, WordPerfect, Word, and so on. |
Characterization | DBTYPE_WSTR \| DBTYPE_BYREF | Characterization, or abstract, of document. Computed by Index Server. |
Contents | (not applicable) | Main contents of file. Can be queried but not retrieved. |
Create | VT_FILETIME | Time file was created. |
Directory | DBTYPE_WSTR \| DBTYPE_BYREF | Physical path to the file, not including the file name. |
DocAppName | DBTYPE_WSTR \| DBTYPE_BYREF | Name of application that created the file. |
DocAuthor | DBTYPE_WSTR \| DBTYPE_BYREF | Author of document. |
DocByteCount | DBTYPE_I4 | Number of bytes in a document. |
DocCategory | DBTYPE_STR \| DBTYPE_BYREF | Type of document such as a memo, schedule, or whitepaper. |
DocCharCount | DBTYPE_I4 | Number of characters in document. |
DocComments | DBTYPE_WSTR \| DBTYPE_BYREF | Comments about document. |
DocCompany | DBTYPE_STR \| DBTYPE_BYREF | Name of the company for which the document was written. |
DocCreatedTm | VT_FILETIME | Time document was created. |
DocEditTime | VT_FILETIME | Total time spent editing document. |
DocHiddenCount | DBTYPE_I4 | Number of hidden slides in a Microsoft® PowerPoint document. |
DocKeywords | DBTYPE_WSTR \| DBTYPE_BYREF | Document keywords. |
DocLastAuthor | DBTYPE_WSTR \| DBTYPE_BYREF | Most recent user who edited document. |
DocLastPrinted | VT_FILETIME | Time document was last printed. |
DocLastSavedTm | VT_FILETIME | Time document was last saved. |
DocLineCount | DBTYPE_I4 | Number of lines contained in a document. |
DocManager | DBTYPE_STR \| DBTYPE_BYREF | Name of the manager of the document’s author. |
DocNoteCount | DBTYPE_I4 | Number of pages with notes in a PowerPoint document. |
DocPageCount | DBTYPE_I4 | Number of pages in document. |
DocParaCount | DBTYPE_I4 | Number of paragraphs in a document. |
DocPartTitles | DBTYPE_STR \| DBTYPE_VECTOR | Names of document parts. For example, in Excel part titles are the names of spreadsheets, in PowerPoint slide titles, and in Word for Windows the names of the documents in the master document. |
DocPresentationTarget | DBTYPE_STR \| DBTYPE_BYREF | Target format (35mm, printer, video, and so on) for a presentation in PowerPoint. |
DocRevNumber | DBTYPE_WSTR \| DBTYPE_BYREF | Current version number of document. |
DocSlideCount | DBTYPE_I4 | Number of slides in a PowerPoint document. |
DocSubject | DBTYPE_WSTR \| DBTYPE_BYREF | Subject of document. |
DocTemplate | DBTYPE_WSTR \| DBTYPE_BYREF | Name of template for document. |
DocTitle | DBTYPE_WSTR \| DBTYPE_BYREF | Title of document. |
DocWordCount | DBTYPE_I4 | Number of words in document. |
FileIndex | DBTYPE_I8 | Unique ID of file. |
FileName | DBTYPE_WSTR \| DBTYPE_BYREF | Name of file. |
HitCount | DBTYPE_I4 | Number of hits (words matching query) in file. |
HtmlHRef | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML HREF. Can be queried but not retrieved. |
HtmlHeading1 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H1. Can be queried but not retrieved. |
HtmlHeading2 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H2. Can be queried but not retrieved. |
HtmlHeading3 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H3. Can be queried but not retrieved. |
HtmlHeading4 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H4. Can be queried but not retrieved. |
HtmlHeading5 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H5. Can be queried but not retrieved. |
HtmlHeading6 | DBTYPE_WSTR \| DBTYPE_BYREF | Text of HTML document in style H6. Can be queried but not retrieved. |
Img_Alt | DBTYPE_WSTR \| DBTYPE_BYREF | Alternate text for <IMG> tags. Can be queried but not retrieved. |
Path | DBTYPE_WSTR \| DBTYPE_BYREF | Full physical path to file, including file name. |
Rank | DBTYPE_I4 | Rank of row. Ranges from 0 to 1000. Larger numbers indicate better matches. |
RankVector | DBTYPE_I4 \| DBTYPE_VECTOR | Ranks of individual components of a vector query. |
ShortFileName | DBTYPE_WSTR \| DBTYPE_BYREF | Short (8.3) file name. |
Size | DBTYPE_I8 | Size of file, in bytes. |
USN | DBTYPE_I8 | Update Sequence Number. NTFS drives only. |
VPath | DBTYPE_WSTR \| DBTYPE_BYREF | Full virtual path to file, including file name. If more than one possible path, then the best match for the specific query is chosen. |
WorkId | DBTYPE_I4 | Internal ID for file. Used within Index Server. |
Write | VT_FILETIME | Last time file was written. |
Properties that are not in the previous list must be defined in a [Names] section in the .idq file before you can use them in a restriction, in a sort specification, or as a retrieved column. Use the following format:
[Names]
#Properties that are not in the standard list
Propertyname ( Datatype ) = GUID ["Name" | propid]
In the syntax, "Name" is the property name ("Sales" in the following example), and propid is the property ID in hexadecimal. Note that you need to surround the friendly name with quotation marks, but the property ID does not take quotation marks.
For example, suppose you want to define an HTML meta tag as a property name that somebody can search for. The property you want to define is Sales.
To define the Sales property, add the following line to the [Names] section:
MetaDescription(DBTYPE_WSTR) = d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 "Sales"
The GUID number comes from the MetaTagClsid parameter in the registry, at the following location:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\HtmlFilter\MetaTagClsid
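A [Names] entry can be generated from its parts. The following sketch uses a hypothetical helper, names_entry, to format one line according to the syntax shown earlier; it quotes a property name, leaves a property ID unquoted, and uses Python's uuid module only to check that the GUID is well formed. The property ID value in the usage below is purely illustrative.

```python
import uuid


def names_entry(property_name, datatype, guid, name=None, propid=None):
    """Format one [Names] line: Propertyname ( Datatype ) = GUID "Name" or propid."""
    if (name is None) == (propid is None):
        raise ValueError("give exactly one of name or propid")
    guid_text = str(uuid.UUID(guid))  # raises ValueError if the GUID is malformed
    # A name is surrounded by quotation marks; a property ID is not.
    identifier = f'"{name}"' if name is not None else str(propid)
    return f"{property_name}({datatype}) = {guid_text} {identifier}"


line = names_entry(
    "MetaDescription",
    "DBTYPE_WSTR",
    "d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1",
    name="Sales",
)
print(line)
# MetaDescription(DBTYPE_WSTR) = d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 "Sales"
```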
For example, say you want to search for all files that give sales projections for the future:
In File1.htm:
<META NAME="Sales" CONTENT="Projections for 1998">
In File2.htm:
<META NAME="Sales" CONTENT="Projections for 1999">
In File3.htm:
<META NAME="Sales" CONTENT="Sales in 1997">
Note Be sure to add your META NAME tags between the <head> and </head> HTML tags at the beginning of the file.
You can now search for all files that show sales projections. Send the following query:
@metadescription projections
This query returns all the files with the word projections in the CONTENT field of the meta tag. In this example, File1.htm and File2.htm are returned.
Now suppose you want to search for sales by year, for example, a list of sales in 1997. Send the following query:
@metadescription 1997
File3.htm is returned.
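The behavior of the two queries can be approximated outside Index Server. The sketch below is not the Index Server engine: it parses the three sample files with Python's standard html.parser module and performs a simple case-insensitive word match on the CONTENT value of the Sales meta tag. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# The three sample files from the example, reduced to their meta tags.
FILES = {
    "File1.htm": '<head><META NAME="Sales" CONTENT="Projections for 1998"></head>',
    "File2.htm": '<head><META NAME="Sales" CONTENT="Projections for 1999"></head>',
    "File3.htm": '<head><META NAME="Sales" CONTENT="Sales in 1997"></head>',
}


class MetaWords(HTMLParser):
    """Collect the lowercased words in the CONTENT of one named meta tag."""

    def __init__(self, meta_name):
        super().__init__()
        self.meta_name = meta_name.lower()
        self.words = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == self.meta_name:
            self.words.update((attributes.get("content") or "").lower().split())


def search(meta_name, word):
    """Return the files whose named meta tag contains the given word."""
    hits = []
    for filename, html in FILES.items():
        parser = MetaWords(meta_name)
        parser.feed(html)
        parser.close()
        if word.lower() in parser.words:
            hits.append(filename)
    return hits


print(search("Sales", "projections"))  # ['File1.htm', 'File2.htm']
print(search("Sales", "1997"))         # ['File3.htm']
```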