Hosting Solutions
WINDOWS SERVER HOST

CGI How-To
Site-Search Script

Download SiteSearch.zip (which contains a sample form and readme)

Description

This script lets visitors search for keywords in the documents on your Web site. It searches the basic text of each document, as well as ALT text and (optionally) any information contained in META "keywords" and "description" tags. If you don't like the search results text shown by default, you can specify your own description for each found page with the <searchrsltdesc> option, or you can turn off descriptions completely. The script scores the matching URLs by how frequently the requested key terms appear in each document, and (optionally) lists the date on which each file was last modified. You can easily configure the number of matches displayed on each search results page.
The script, of course, must be called from a search form on a Web page. The exact structure of the form is not too important, so you're welcome to modify the code provided to suit your own purposes, so long as the correct fields and options exist. If you leave out the "boolean" and "case" fields, the script will default to a case-insensitive boolean "OR" ("any terms") search. If you leave out the "hits" field, the script will default to showing 25 matches per output page. The "terms" field, of course, is essential, since it's in that field that your visitors will input their keywords!

Please read the caveat below before setting up this search for your web site.

Form Configuration

The action of your form needs to point to /scripts/sitesearch.pl, and the method must be POST in capital letters.
The script offers many ways to code your form, to tailor the resulting HTML page, and to change the way the script performs.
Below is a list of form fields you can use and how to implement them.

Necessary Form Fields

There are only two form fields that you must have in your form for the site search script to work correctly: the 'terms' and 'domain' fields.

Field: terms
Description: This form field allows your visitors to specify what to search for in your web site.
Syntax: <input type=text name="terms" size=60 value="">

Field: domain
Description: This form field allows you to specify your domain name.
Syntax: <input type=hidden name="domain" value="commkal.com">
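
Putting the two required fields together with the action and method described under Form Configuration, a minimal search form might look like the sketch below. The submit button and its label are illustrative additions, and the domain value is the sample used above; substitute your own domain name.

```html
<!-- Minimal search form: only the two required fields. -->
<form action="/scripts/sitesearch.pl" method="POST">
  <!-- Keywords typed by the visitor -->
  <input type=text name="terms" size=60 value="">
  <!-- Replace with your own domain name -->
  <input type=hidden name="domain" value="commkal.com">
  <!-- Submit button (the label is an illustrative choice) -->
  <input type=submit value="Search">
</form>
```

With no other fields present, the script falls back to its defaults, for example a case-insensitive "any terms" search showing 25 matches per page.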

Optional Form Fields

Note that for some options to work, some field variables must be specified in conjunction with other field variables.

Field: avoid
Description: By default, the script will search all directories in your web site. If there are particular files or entire directories that you *don't* want included in the search, define them here.
If you do not include this field, the script defaults to not searching:
*.backup, *.cgi, *.pl, and *.txt
If you include this field you must include each directory, file type, and/or specific file that you do not want searched.
Syntax: <input type=hidden name="avoid" value="(memberinfo.html|\.backup|\.cgi|\.pl|\.txt|dir1|dir2/subdir)">

The backslashes ( \ ) before the periods above mark entries that match any filename ending in the specified suffix. Each entry is separated by a vertical bar ( | ).
Be sure to enclose the entire list in parentheses () as shown above.
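
Because including the "avoid" field replaces the default exclusions rather than adding to them, one way to keep the defaults and exclude one more directory is sketched below. The directory name "private" is a hypothetical placeholder, not something the script requires.

```html
<!-- Keeps the four default exclusions and adds one directory.
     "private" is a hypothetical directory name; use your own. -->
<input type=hidden name="avoid" value="(\.backup|\.cgi|\.pl|\.txt|private)">
```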


Field: subdirs
Description: Alternatively, rather than defining what directories to avoid in the "avoid" variable above, you can specify that the search script only search a specific directory (and its sub-directories) in the site.
Note: This option can also be used in conjunction with the "otherurls" variable discussed below.
Syntax: <input type=hidden name="subdirs" value="public/livesite">

Field: allowdatesearch
Description: If this variable is set to Yes, visitors will have the ability to specify dates when searching your site, so that, for example, their results listing can include only files updated during the month of August.
Note: If you choose to not allow date search, you will need to remove the code for this option as specified in the search form sample.
Syntax: <input type=hidden name="allowdatesearch" value="Yes">

Field: displaybydate
Description: By default, the script will display search results in order by the number of keyword matches on each page. If, instead, you'd like the results displayed in order by date, with the most recently modified files listed first, then set this variable to Yes.
This variable can be used regardless of the 'allowdatesearch' variable mentioned above.
Syntax: <input type=hidden name="displaybydate" value="Yes">

Field: extrachars
Description: By default, the script will allow search terms to include any alphanumeric characters (A-Z and 0-9), as well as dashes, underscores, apostrophes and periods. If there are other characters you'd like to allow in search terms, such as "foreign" characters or particular symbols, define them in this variable as paired with any relevant HTML "encoding" that may be used to display them.
Syntax: <input type=hidden name="extrachars" value="'&copy;','©',">

Field: metasearch
Description: If you want to include META tag "description" and "keyword" information in the searches, set the metasearch variable to Yes.
If you don't want this done -- if for example, all your pages contain the *same* META information -- then set this variable to No, or do not specify it in the search form.
Syntax: <input type=hidden name="metasearch" value="Yes">

Field: descriptions
Description: If for some reason you don't want *any* text descriptions shown for the found pages in the results list, set this variable to No.
Syntax: <input type=hidden name="descriptions" value="No">

Optionally, if you want descriptions shown but don't like the results (for example, because all of your META tag information is the same), you can add the following tag to each of your pages, and its text will be displayed as the search results description: <searchrsltdesc>Text to be displayed in search results for this page.</searchrsltdesc>
Note that this tag should be added to your web pages, not the search form. You must add it to each page for which you want to specify a search results description. In order for the search script to find your description, it must be added right after the closing TITLE tag and somewhere before the closing HEAD tag:
</title><searchrsltdesc>Your description for this page.</searchrsltdesc></head>
Note also that the "metasearch" option must be enabled if you wish to specify the search results descriptions. Otherwise, if you have "descriptions" enabled and "metasearch" disabled, the search results descriptions will be the first few lines from each found page.
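
As a sketch of how the two pieces fit together, the first fragment below belongs in a web page and the second in the search form. The page title and description text are made-up examples.

```html
<!-- In each web page (not the search form): the custom description
     goes right after the closing TITLE tag, before the closing HEAD tag.
     Title and description text here are illustrative only. -->
<head>
<title>Widget Catalog</title><searchrsltdesc>Browse our full catalog of widgets.</searchrsltdesc>
</head>

<!-- In the search form: metasearch must be enabled
     for the custom descriptions to be used. -->
<input type=hidden name="metasearch" value="Yes">
```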


Field: desclength
Description: If you have "descriptions" enabled and "metasearch" disabled, you can specify the number of characters of text from the found pages that will be shown as the search results descriptions. You can set this to any positive number.
If this option is not specified and the above conditions exist, the script defaults to showing the first 320 characters of text.
Syntax: <input type=hidden name="desclength" value="320">
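
For instance, to show a shorter excerpt of page text than the default, the form could combine the fields as sketched here. The value 160 is an arbitrary example.

```html
<!-- Descriptions are taken from the page text (metasearch disabled)
     and trimmed to 160 characters; 160 is an arbitrary example value. -->
<input type=hidden name="metasearch" value="No">
<input type=hidden name="desclength" value="160">
```

The "descriptions" field is left out here on the assumption that descriptions are shown unless it is set to No.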

Field: splitnames
Description: If your web pages use internal location tags (<A NAME="x"> tags), and you want the "pieces" of the pages to show separately in the search results listing, set this variable to Yes. If you set it to No, such internal tags will be ignored, and the script will view all documents as single entities.
Syntax: <input type=hidden name="splitnames" value="Yes">

Field: bodyspec
Description: This variable can be defined as illustrated with any attributes (BACKGROUND, BGCOLOR, TEXT, etc.) which you want to have assigned to the <BODY> tag on pages created by the script.
This option can be used in conjunction with the "metafile", "headerfile", "footerfile", and "ssirootdir" variables discussed below.
Syntax: <input type=hidden name="bodyspec" value="BGCOLOR=#FFFFFF TEXT=#000000">

Field: metafile
Description: The path to a text file containing any HTML code (META tags, etc.) to be inserted within the <HEAD> section of the pages produced by the script. (This file, of course, is optional.)
Syntax: <input type=hidden name="metafile" value="metacode.txt">

For this (and any) file path that is specified in the search form, you can specify directories if needed. For example, if the file is not located in the root directory of the site being searched, simply specify "directory/metacode.txt".
This allows you to use multiple search forms within a site, and specify different options for the search results pages, depending on which search form is used.


Field: headerfile
Description: This allows you to include certain "standard" information on all of the search results pages.
For example, if you want the search results pages to look like other pages in your site, you could use this file for the section of the page that should appear above the search results.
Syntax: <input type=hidden name="headerfile" value="searchheader.txt">

Field: footerfile
Description: Same as the "headerfile" option, except the code in this file appears after the search listings in the search results page.
Syntax: <input type=hidden name="footerfile" value="searchfooter.txt">

Field: ssirootdir
Description: Your header and footer files can contain SSI "include" tags. If they do, you need to specify the path which precedes the "include" definition, in order to define the file's full path and thus allow the script to find it.
If your "include" file is not in the web site root directory, specify the directory location as in the example below.
Syntax: <input type=hidden name="ssirootdir" value="includes/searchfiles">
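
The page-layout fields above are typically used together. The sketch below shows one possible combination, followed by a sample header file. The include file name "nav.html" is hypothetical, and the assumption that the script looks for it under the "ssirootdir" path is one reading of the description above, not confirmed behavior.

```html
<!-- In the search form: layout options for the results pages -->
<input type=hidden name="bodyspec" value="BGCOLOR=#FFFFFF TEXT=#000000">
<input type=hidden name="metafile" value="metacode.txt">
<input type=hidden name="headerfile" value="searchheader.txt">
<input type=hidden name="footerfile" value="searchfooter.txt">
<input type=hidden name="ssirootdir" value="includes/searchfiles">

<!-- Possible contents of searchheader.txt. "nav.html" is a hypothetical
     file, assumed to live in includes/searchfiles/ -->
<h1>Search Results</h1>
<!--#include file="nav.html" -->
```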

Field: printsearchform
Description: By default, the script will include a new search form at the bottom of all search results pages. If you don't want it to do so -- if, for example, you have a search form in one frame with the results appearing in another -- then set this variable to No.
Syntax: <input type=hidden name="printsearchform" value="No">

Field: usesearchhelp
Description: If you want to include any short "explanatory" help text in your search forms, to assist your visitors in using them, set this variable to Yes.
Note: If you do not use this option, you will want to remove the help text that appears on the sample search form. Otherwise the default help text will appear in the first search form, but not in the search forms that are included with the search results pages.
Syntax: <input type=hidden name="usesearchhelp" value="Yes">

Field: searchhelptext
Description: The help text to be included in the search forms.
Note: If you have "usesearchhelp" set to Yes but do not specify this variable, the script will include the default search help text, which is included in the sample search form.
Syntax: <input type=hidden name="searchhelptext" value="Your search form help text here.">

Field: linkstyle
Description: By default, the script will display the page TITLE as the link to the found page. You have 2 options to change the link style. You might want to do this if, for example, all the pages in your site have the same TITLE tag. Or you may wish to hide the URL to the found pages until the viewer clicks on the link.
Option 1: Display the URL to the found pages.
Syntax: <input type=hidden name="linkstyle" value="pageurl">
Option 2: Hide the URL to the found pages, and enter your own display text.
Syntax: <input type=hidden name="linkstyle" value="javahide">

Field: javahidedisplay
Description: If you use the "javahide" option above, you can use this variable to define what shows as the link text in the search results list.
Otherwise the script will show the default link text:
"Click here to view this search results page..."
Syntax: <input type=hidden name="javahidedisplay" value="Search Results">

Field: javahovertext
Description: If you use the "javahide" option above, you can use this variable to define the hover text (this shows in the browser's status bar where the URL normally shows when you hold your mouse over a link).
Otherwise the script will show the default hover text:
"Click to view page containing your search terms."
Syntax: <input type=hidden name="javahovertext" value="Click the link to view the found page.">
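
Taken together, the three "javahide" settings might appear in a form as sketched below, using the sample values from the syntax lines above.

```html
<!-- Hide the found page's URL and supply custom link and hover text -->
<input type=hidden name="linkstyle" value="javahide">
<input type=hidden name="javahidedisplay" value="Search Results">
<input type=hidden name="javahovertext" value="Click the link to view the found page.">
```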

The rest of the variables defined below allow you to set up search forms for non-standard web site configurations.


Field: otherurls
Description: You can use this variable for one of two things.
1. If you have a parked or vanity domain name, and want all search results directed to that domain instead of your main domain name, enter the domain in this field, and set the "overlay" option below to Yes.
2. If you have a domain name or vanity domain name that points to a sub directory in your main web site, you can use this variable in conjunction with the other variables defined below to set up a search form for that sub site.
Syntax: <input type=hidden name="otherurls" value="www.myotherdomain.com">
or
<input type=hidden name="otherurls" value="subsite.mydomain.com">

NOTE: You should only use these non-standard setup variables if the web site directory is inside your main web site root directory. If, for example, you instead have a sub domain account that has its own directory outside of your main web site root directory, then simply create a new search form and enter the sub domain site's domain name (or vanity domain name) in the required "domain" field of that form.


Field: overlay
Description: If you are using option 1 in the "otherurls" variable above (i.e., you have a second domain name or a vanity domain name that is pointed to your main web site root directory), set this variable to Yes.
If you are using option 2 in the "otherurls" variable above (i.e., you have a second domain name or a vanity domain name that is pointed to a sub directory in your web site), set this variable to No.
Syntax: <input type=hidden name="overlay" value="Yes">
or
<input type=hidden name="overlay" value="No">
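
For option 1 (a parked or vanity domain pointed at your main web site root), the relevant fields could be combined as in this sketch. Both domain values are placeholders, and it is assumed that the required "domain" field still holds your main domain name.

```html
<!-- Option 1: send all search results to the parked/vanity domain.
     Both domain names below are placeholders. -->
<input type=hidden name="domain" value="mydomain.com">
<input type=hidden name="otherurls" value="www.myotherdomain.com">
<input type=hidden name="overlay" value="Yes">
```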

Field: otherdirs
Description: If you are using option 2 in the "otherurls" variable above, specify the directory name of that sub site's root directory.
Syntax: <input type=hidden name="otherdirs" value="subsite">

Field: subdirs
Description: If you want to limit the search in this sub site to just one directory (and its sub-directories), enter that directory name in this variable.
Syntax: <input type=hidden name="subdirs" value="public/livesite">

Using the examples shown in the last two variables, the search script will only search files in the following directory (and its sub-directories):
mainwebsiteroot/subsiteroot/public/livesite
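
A complete form for option 2 (a sub site that lives in a sub directory of the main site) might be assembled as sketched below. The domain names are placeholders, the submit button is an illustrative addition, and the "domain" value is assumed to be your main domain name.

```html
<!-- Option 2: search form for a sub site inside the main site.
     Domain names are placeholders; use your own. -->
<form action="/scripts/sitesearch.pl" method="POST">
  <input type=text name="terms" size=60 value="">
  <input type=hidden name="domain" value="mydomain.com">
  <!-- Domain that points at the sub site -->
  <input type=hidden name="otherurls" value="subsite.mydomain.com">
  <input type=hidden name="overlay" value="No">
  <!-- Sub site's root directory within the main site -->
  <input type=hidden name="otherdirs" value="subsite">
  <!-- Optional: limit the search to one directory of the sub site -->
  <input type=hidden name="subdirs" value="public/livesite">
  <input type=submit value="Search">
</form>
```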


Caveat

This search script works best for web sites that have fewer than 100 pages. If your web site has more pages, searches may take longer than most visitors would find acceptable. In this situation, you may wish to request our Index Server add-on rather than using this search script. This add-on carries a monthly fee depending on your account plan, but greatly increases the speed of site searches.
Here is an estimated comparison of search times using this script versus the Index Server. The left column shows the number of pages in the site, and the other two columns show the respective search times in seconds.

Pages in site    Search Script    Index Server
50               3                1
100              8                1
250              20               1
500              60               2
1000+            timeout          3

For the sake of both your site visitors and your web site resource usage, please do not use this search script to search more than (approximately) 250 pages.
Thank you.


Copyright 1998-2025 Communiqué Kaleidoscope, Inc.