Access restrictions, robots and black lists


In addition to the wizard and friend password mechanism, which controls modification rights and the visibility of private data, several other mechanisms restrict global access to a base or to the gwd service. Some features limit access by robots, and black lists block all access from specified Internet nodes.

These additional features apply only in server mode, and not in CGI mode. In the latter case, you should use the access restrictions provided by your HTTP server.

Global access restriction to a base

Warning: this functionality is not available in CGI mode.

If you want to restrict access to a base to a limited set of people, create a global-access.auth file and give its filename in the auth_file variable of the configuration file for the base. The global-access.auth file should be located in the bases folder.

This file has the same structure as the friend and wizard password files, with one username:password pair per line:

dupont:ex23zuu
martin:2wxuz4

When accessing the site, visitors will be asked for a username and a password, which must match one of the pairs stored in bases/global-access.auth.
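The check described above can be sketched in a few lines of Python. This is only an illustration, not gwd's actual code: the function names parse_auth_lines and is_authorized are hypothetical, and the sketch assumes the plain two-field username:password form shown in the example.

```python
def parse_auth_lines(lines):
    """Parse 'username:password' lines into a dict, skipping blank lines."""
    pairs = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only (assumption: usernames contain no colon).
        user, sep, password = line.partition(":")
        if sep:  # ignore malformed lines that have no colon at all
            pairs[user] = password
    return pairs


def is_authorized(pairs, user, password):
    """Return True when (user, password) matches one of the stored pairs."""
    return pairs.get(user) == password


# The two example entries from the file above.
pairs = parse_auth_lines(["dupont:ex23zuu", "martin:2wxuz4", ""])
```

With these entries, is_authorized(pairs, "dupont", "ex23zuu") succeeds, while a wrong password or an unknown username is rejected.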

Global restrictions to gwd service

Warning: this functionality is not available in CGI mode.

The previous access restriction applies to a single base. You can apply a similar restriction to all the bases managed by your site by providing gwd with a global authorisation file of the same structure as above. This filename is supplied to gwd at launch time through the option -auth filename.auth. If both a restriction on the gwd service and a restriction on a base exist, the base-level restriction takes priority over the gwd-level one.
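The priority rule can be illustrated with a small Python sketch. The function name effective_auth_file and the use of None for "no file configured" are assumptions made for this example, not part of gwd.

```python
def effective_auth_file(gwd_auth_file, base_auth_file):
    """Pick the authorisation file that applies to a request for one base.

    gwd_auth_file:  file given to gwd with -auth, or None if absent.
    base_auth_file: file named by the base's auth_file variable, or None.
    """
    # The base-level restriction, when present, takes priority.
    if base_auth_file is not None:
        return base_auth_file
    # Otherwise fall back to the gwd-wide file (which may also be None).
    return gwd_auth_file
```

A base with its own auth_file is therefore checked against that file only, while the other bases fall back to the file given with -auth.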

Robots management

Some robots regularly visit most web sites, exploring their content to build up their search indexes. To do this, they download a page, follow every valid HTTP link within it, and repeat the process. With bases such as the ones managed by GeneWeb, this is an endless process with little value and possibly bad consequences:

  • It slows down your server.
  • It impacts other visitors.
  • It distorts the access statistics of your site.
  • It leaves an unpleasant impression of spying or theft.

Although GeneWeb’s HTTP server and the generated HTML pages explicitly adhere to the robots exclusion standard, which most “big name” search engines follow, some robots ignore it and keep exploring your site in spite of this “soft” prohibition. To alleviate this problem, GeneWeb monitors how often a remote site visits and puts it on a black list if that frequency exceeds a threshold. The threshold is supplied to gwd at launch time with the -robot_xcl xx yy option, which means “put this remote site on the black list if it performs more than xx visits in less than yy seconds”. Further requests from that site receive a warning message stating that it has been put on the site’s black list.
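The rule “more than xx visits in less than yy seconds” can be sketched as a sliding-window counter. The Python below is a minimal illustration under stated assumptions, not gwd's implementation: the RobotFilter class and its method names are hypothetical, timestamps are passed in explicitly, and the black list is kept in memory rather than in a file.

```python
from collections import defaultdict, deque


class RobotFilter:
    """Black-list an address making more than max_visits in under window_seconds."""

    def __init__(self, max_visits, window_seconds):
        self.max_visits = max_visits          # the "xx" value
        self.window_seconds = window_seconds  # the "yy" value
        self.visits = defaultdict(deque)      # address -> recent visit times
        self.blacklist = set()

    def allow(self, address, now):
        """Record a visit at time `now` (seconds); return False if refused."""
        if address in self.blacklist:
            return False
        times = self.visits[address]
        times.append(now)
        # Keep only visits made less than window_seconds before `now`.
        while times and now - times[0] >= self.window_seconds:
            times.popleft()
        if len(times) > self.max_visits:
            self.blacklist.add(address)
            return False
        return True


# Hypothetical setting: more than 3 visits in less than 10 seconds.
robots = RobotFilter(max_visits=3, window_seconds=10)
```

In this sketch, an address that visits at seconds 0, 1, 2 and 3 is refused on the fourth visit and stays refused afterwards, whereas an address visiting once every 10 seconds is never black-listed.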

The black list is a file called robot stored in the bases/cnt distribution folder.

To restore access, remove this file (as explained in the gwd log file).

Black list

Another black list lets you refuse access to selected individual Internet sites or groups of sites. A file named gw/gwd.xcl contains the list of excluded sites, one per line, with * standing for any sequence of characters.

grand-mechant@loup.bois
fournisseur-*@d.acces

These two lines prevent access from “grand-mechant@loup.bois”, “fournisseur-22@d.acces”, “fournisseur-xx@d.acces”, etc. A line containing only * blocks all access, including from your own address.
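The wildcard rule can be illustrated with a short Python sketch that turns each line into a regular expression. This is an assumption-laden illustration rather than gwd's code: the helper names are hypothetical, and the sketch assumes that a pattern must match the whole address and that matching is case-sensitive.

```python
import re


def pattern_to_regex(pattern):
    """Compile an exclusion line, where * stands for any character sequence."""
    # Escape the literal parts, then join them with ".*" in place of each *.
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(parts))


def is_excluded(address, patterns):
    """Return True when the whole address matches at least one pattern."""
    return any(pattern_to_regex(p).fullmatch(address) for p in patterns)


# The two example lines from the file above.
patterns = ["grand-mechant@loup.bois", "fournisseur-*@d.acces"]
```

Here is_excluded("fournisseur-22@d.acces", patterns) is true, an unrelated address is not excluded, and a pattern list containing a lone * excludes every address.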

Managing the GeneWeb server

GeneWeb can be managed through gwsetup. For security reasons, administration can be performed only from the IP address given in the only.txt file (see your distribution for its location). The default value is 127.0.0.1; it can be changed to any valid IP address, but only.txt stores a single address.

