Difference between revisions of "CGI"

From GeneWeb

Revision as of 16:16, 24 November 2016


When the daemon mode of GeneWeb is forbidden or cannot be activated, you must use CGI mode. This is mostly the case on a shared hosting account.

When running in CGI mode, GeneWeb sits behind a general purpose HTTP server such as Apache, and is launched by the server as a CGI command. As such, the first step in installing GeneWeb consists in verifying the operation of the HTTP server and of its CGI calling function.

The second step is to install GeneWeb itself in a folder organisation adapted to your environment. You must download the GeneWeb version corresponding to your machine architecture (processor, 32/64 bits, OS).

Verifying Web and CGI service

Web service is verified by displaying from a web browser the content of index.html:

(server):~#ls -al index.html
-rwxr-xr-x  1 username usergroup    290 Nov 22 22:29 index.html
(server):~#cat index.html
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
This is a minimal index.html page
</body>
</html>
(server):~#

Typing the following in your browser window:

http://my_server_address/index.html

should display

This is a minimal index.html page

CGI service is verified by executing the file test-cgi.sh

(server):~#cd cgi-bin
(server):~/cgi-bin#ls -al test-cgi.sh
-rwxr-xr-x  1 username usergroup    290 Nov 22 22:29 test-cgi.sh
(server):~/cgi-bin#cat test-cgi.sh
#!/bin/sh
echo 'Content-type: text/html'
echo  
echo '<!DOCTYPE html>'
echo '<html xmlns="http://www.w3.org/1999/xhtml">'
echo '<body>'
echo 'This is a test for cgi commands'
echo '</body>'
echo '</html>'
(server):~/cgi-bin#

Typing the following in your browser window:

 http://my_server_address/cgi-bin/test-cgi.sh

should display

This is a test for cgi commands

Note the ownership and read/write/execute characteristics of these files. See #Access rights and protection for a more detailed analysis.
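
The header produced by a CGI script can also be checked from a terminal before involving the HTTP server. The sketch below is only an illustration: the function name check_cgi_header and the throwaway script are hypothetical. It runs a script through sh (so the check does not depend on the execute bit) and verifies that the first output line is a Content-type header and the second one is blank.

```shell
#!/bin/sh
# Hypothetical helper: run a CGI script locally and inspect its first two lines.
check_cgi_header() {
  script=$1
  out=$(sh "$script") || return 1
  first=$(printf '%s\n' "$out" | sed -n '1p')
  second=$(printf '%s\n' "$out" | sed -n '2p')
  case $first in
    [Cc]ontent-[Tt]ype:*) ;;                      # header line is present
    *) echo "bad header: $first"; return 1 ;;
  esac
  [ -z "$second" ] || { echo "second line is not blank"; return 1; }
  echo "header ok"
}

# Demonstration on a throwaway copy of a minimal test script.
demo=$(mktemp)
cat > "$demo" <<'EOF'
echo 'Content-type: text/html'
echo
echo '<html><body>This is a test for cgi commands</body></html>'
EOF
check_cgi_header "$demo"
rm -f "$demo"
```

A script that passes this check can still fail behind the server because of permissions or ownership, so this only rules out one class of problems.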

Service specific comments:

Depending on your hosting service, the location of the system root (where you end up if you execute cd), web root and cgi-bin folders may vary!

  • OVH-Webm: (Web hosting) The server and web roots are at www and the cgi-bin folder is at www/cgi-bin.
  • OVH-VPS: (Cloud service) The server root is at /root, the web root at /var/www and the cgi-bin folder is at /usr/lib/cgi-bin
  • 1&1: The server and web roots are at ~ and the cgi-bin folder is at ~/cgi-bin

(to be continued)

The cgi-bin folder

The cgi-bin folder contains the cgi commands launching the gwd server software for each request from a browser client.

(server):~ > cd cgi-bin/
(server):~/cgi-bin > ls -al
total 144
drwxr-xr-x  3 username usergroup  4096 Oct  7 06:58 .
drwx---r-t 23 username usergroup  4096 Oct  7 06:58 ..
-rwx---r-x  1 username usergroup   183 Jan 12  2015 gwd-7.00.sh
-rwx---r-x  1 username usergroup   183 Jan 15  2015 test-cgi.sh
(server):~/cgi-bin > cat gwd-7.00.sh
#!/bin/sh
BIN_DIR=~/gw-7.00-alpha-linux/gw
BASE_DIR=~/bases
OPTIONS="-robot_xcl 19,60 -allowed_tags ./tags.txt -hd ./"
$BIN_DIR/gwd -cgi  $OPTIONS   -bd $BASE_DIR   > ./gwd.log 2>&1

The gwd-7.00.sh file is a shell script that launches gwd with the appropriate parameters. It contains the following definitions:

  • BIN_DIR points to the gw folder holding the various executable files of GeneWeb (see below).
  • BASE_DIR points to the folder holding your genealogy bases (see below).
  • OPTIONS holds the set of start parameters for gwd, see gwd for details.

You may maintain in this folder several gwd-xx.sh files, each pointing to a BIN_DIR location holding a different version of GeneWeb, or starting gwd with a different set of OPTIONS parameters.

As discussed in ???, the set of gwd start parameters can also be provided through a file named gwd.arg sitting in the gw folder. Note that in this file, a parameter and its value are on two separate lines.

(server):~ > cat ./gw-7.00-alpha-linux/gw/gwd.arg
-cgi
-bd
~/bases
-hd
./
-robot_xcl 
19,60
-allowed_tags
./tags.txt
(server):~ >
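
The one-word-per-line layout of gwd.arg can be derived from an option string such as the OPTIONS variable of the launch script. The sketch below is a minimal illustration, not a GeneWeb tool: the function name options_to_arg is hypothetical, and it assumes that no option value contains spaces.

```shell
#!/bin/sh
# Hypothetical helper: print each word of an option string on its own line,
# which is the layout used by gwd.arg.
options_to_arg() {
  set -f              # disable globbing while the string is split into words
  # $1 is deliberately unquoted so that the shell splits it on whitespace
  printf '%s\n' $1
  set +f
}

OPTIONS="-robot_xcl 19,60 -allowed_tags ./tags.txt -hd ./"
options_to_arg "-cgi -bd ~/bases $OPTIONS"
```

Redirecting the output to a file named gwd.arg in the gw folder would give a file with the same shape as the listing above.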

The gw folder

Download the appropriate version of GeneWeb to your root folder. In this folder, expand the file into a GeneWeb folder:

(server):~#cd
(server):~#ls -al gw-7.00-alpha-linux.tgz
-rw-r--r-- 1 username usergroup 5616960 Nov 24 14:21 gw-7.00-alpha-linux.tgz
(server):~#tar xzvf gw-7.00-alpha-linux.tgz
gw-7.00-alpha-linux/
gw-7.00-alpha-linux/gw/
gw-7.00-alpha-linux/gw/gwd.arg
gw-7.00-alpha-linux/gw/convert_hist
....
gw-7.00-alpha-linux/gwsetup
(server):~#ls -al gw-7.00-alpha-linux
drwxr-xr-x  3 username usergroup   4096 Sep  4  2014 .
drwx---r-x 10 username usergroup   4096 Nov 24 14:21 ..
-rw-r--r--  1 username usergroup 160302 Sep  4  2014 CHANGES.txt
drwxr-xr-x  8 username usergroup   4096 Sep  4  2014 gw
-rwxr-xr-x  1 username usergroup     64 Sep  4  2014 gwd
-rwxr-xr-x  1 username usergroup     68 Sep  4  2014 gwsetup
-rw-r--r--  1 username usergroup  18007 Sep  4  2014 LICENSE.txt
-rw-r--r--  1 username usergroup  10345 Sep  4  2014 START.htm

You may rename the gw-7.00-alpha-linux folder to any name of your choice. You may also maintain several versions of GeneWeb in different folders.

Connecting style files

In a hosted CGI server environment, the style (.css) and JavaScript (.js) files are sent to your browser by the HTTP server (Apache) rather than by gwd. You therefore have to install at the web_root a link to the repository holding the style files.

The example below is taken from an OVH-VPS hosted server.

root@vps265730:/var/www# ls -al
total 660
drwxr-xr-x  7 root root   4096 Sep 25 18:40 .
drwxr-xr-x 12 root root   4096 Apr 16  2016 ..
drwxr-xr-x  4 root root   4096 Nov 23 15:41 gw -> /root/GW/GeneWeb-7.00/gw
-rw-r--r--  1 root root   3514 Jun 23 13:19 index.html
drwxr-xr-x  2 root root   4096 Apr 16  2016 private
drwxr-xr-x  2 root root   4096 Apr 16  2016 public
-rw-r--r--  1 root root     27 Apr 28  2016 robot.txt
root@vps265730:/var/www#

In order to instruct gwd to use this path, you must add in your .gwf configuration file the following variable:

 static_path=../gw/etc

Note the leading .. in this path. It is needed because the script launching gwd executes in cgi-bin, one level below the web_root of your server:

http://server_address/cgi-bin/gwd-7.00.sh
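
The link and the relative path can be tried out in a scratch directory before touching a real server. The sketch below is an illustration under assumptions: WEB_ROOT and GW_DIR are placeholders, and cgi-bin is assumed to sit directly under the web root, which is not the case on every hosting service.

```shell
#!/bin/sh
# Sketch: expose the gw folder under the web root through a symbolic link.
# All paths live in a scratch directory created for this demonstration.
SCRATCH=$(mktemp -d)
WEB_ROOT=$SCRATCH/var/www
GW_DIR=$SCRATCH/root/GW/GeneWeb-7.00/gw
mkdir -p "$WEB_ROOT/cgi-bin" "$GW_DIR/etc"

# Same shape as the "gw -> /root/GW/GeneWeb-7.00/gw" entry in the listing.
ln -s "$GW_DIR" "$WEB_ROOT/gw"

# Seen from cgi-bin, ../gw/etc resolves to the etc folder of the distribution.
cd "$WEB_ROOT/cgi-bin" && ls -d ../gw/etc
```

On a real server the HTTP server must also be allowed to follow symbolic links for this to work.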

The bases folder

A typical installation will position the folder holding your base at the root of your hosting service.

Folder structure for GeneWeb bases (Genealogy data).

The bases folder should contain the same information structure as in the case of installation on a personal computer:

  • one or several base-i.gwb folders.
  • one or several base-i.gwf files.
  • cnt containing access count data.
  • etc containing template text files used in priority over the generic gw/etc files (see header).
  • images containing for each base photos associated with each person (first_name.occ.last_name.jpg).
  • lang containing some base-specific, language-related template text files (lexicon for instance).
  • src containing for each base text files and image files inserted in notes with the m=SRC command.
  • wiz.auth: as many wizards and friends authorization files as are specified for each base in its .gwf parameter file.

In each of the etc, images, lang and src folders, sub-folders with names base-i hold the specific data for each base.
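
An empty skeleton matching this layout could be created as sketched below. This is only an illustration: the function name make_bases_skeleton and the base name my_base are hypothetical, and the .gwb folder is left out because it is produced by the GeneWeb tools themselves.

```shell
#!/bin/sh
# Sketch: create the per-base sub-folders of a bases folder for one base.
make_bases_skeleton() {
  bases=$1
  base=$2
  mkdir -p "$bases/cnt" \
           "$bases/etc/$base" \
           "$bases/images/$base" \
           "$bases/lang/$base" \
           "$bases/src/$base"
  # Create an empty .gwf parameter file only if none exists yet.
  [ -f "$bases/$base.gwf" ] || : > "$bases/$base.gwf"
}

BASES=$(mktemp -d)/bases      # scratch location used for the demonstration
make_bases_skeleton "$BASES" my_base
ls "$BASES"
```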

(Note that the picture on the right has been taken from a personal computer where the bases are stored in a folder named GeneWeb-Bases, as opposed to bases in the remote server structure proposed here).

1&1 hosting, GeneWeb 5 and 6

(Kept here for historical reasons)

The following applies to 1&1 hosting, and may differ on other hosting services. On this web site, we can easily switch between two versions of GeneWeb: 5.00 and 6.00.

Directories and files

Files and directories on 1&1.
    • directory mybases: The bases directory, beside the CGI-root directory.
      • directory mybases.gwb
      • directory cnt
      • directory images
    • directory root: The CGI-root directory.
      • directory basesxg: An alternative bases directory.
      • directory css: A copy of the gw/css directory. This directory is used by the Apache server.
      • directory gw: The gw directory of the GeneWeb distribution.
      • directory gwenv: The gw directory of the GeneWeb-5 distribution.
      • directory igw: The images directory for GeneWeb-5.
      • directory images: A directory with copies of gw/images/gwback.jpg and gw/images/gwlogo_bas.png
      • directory pub: A directory holding readable copies of the CGI scripts (this website is a demo site).
      • file gw6.cgi: The CGI script which launches GeneWeb-6.
      • file gw5.cgi: The CGI script which launches GeneWeb-5.
      • file issue6.cgi: A test script which displays information about the environment of the server and checks size and md5sum for the gwd binary file.
      • file issue5.cgi: The same script, but for the GeneWeb-5 version.

You need the “exec” permission on the files gw/gwd, gw/gwsetup, and the files used by gwsetup (gw/gwc*, gw/gwu, gw/consang, gw/update_nldb). The gwd.arg file is empty.

The databases folder bases must be protected either by a .htaccess file or by a location outside the HTTP server scope. It does not need to be accessible by the HTTP server.
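
A minimal sketch of the .htaccess option follows. It assumes Apache 2.4, whose "Require all denied" directive blocks all HTTP access to a folder; earlier Apache versions use different directives. The function name protect_folder and the scratch folder are hypothetical.

```shell
#!/bin/sh
# Sketch: deny all HTTP access to a folder with a one-line .htaccess file.
# "Require all denied" is Apache 2.4 syntax (assumption about the server).
protect_folder() {
  printf '%s\n' 'Require all denied' > "$1/.htaccess"
}

BASES_DIR=$(mktemp -d)     # placeholder standing in for the real bases folder
protect_folder "$BASES_DIR"
cat "$BASES_DIR/.htaccess"
```

Whether the file is honoured depends on the hosting service allowing .htaccess overrides.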

Description of the CGI script

Main parameters:

PWD=$(pwd)                    # The CGI working directory.
LNG="fr"                      # The language for the user interface.
GENEWEBSHARE=$PWD/gw          # The programs folder.
GENEWEBDOC=$PWD/gw/doc        # The documentation folder (obsolete).
GENEWEBDB=$PWD/../bases       # The databases folder.
DAEMON=$GENEWEBSHARE/gwd      # The program gwd itself.
LOGFILE=$GENEWEBDB/gw.log     # The gwd log file; it helps to solve problems.

Note that although called DAEMON, gwd does not run in -daemon mode but in -cgi mode.

Be careful when using a log file: its size can increase quickly, so don’t forget to delete it from time to time.

OPTIONS="-blang -robot_xcl 40,70 -max_clients 15 -conn_tmout 120 -min_disp_req 30 -images_url http://myserver.net/gw/images"
# -allowed_tags $GENEWEBDB/tags.txt
cd $GENEWEBSHARE
$DAEMON -hd$GENEWEBSHARE -dd$GENEWEBDOC -bd$GENEWEBDB -lang$LNG -log$LOGFILE -cgi $OPTIONS  2>/dev/null

Misc. options:

  • robot_xcl: To protect your data from HTTrack or WebSite Extractor.
  • conn_tmout: For statistics on the bottom line.
  • images_url: Icons and images are not sent by GeneWeb, but by your HTTP server (not CGI).
  • allowed_tags: Useful option if you use HTML tags not in default_good_tag_list.
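
Since the gwd log file can grow quickly, its clean-up can be scripted. The sketch below is one possible approach, not part of GeneWeb: the function name trim_log is hypothetical and the 1 MB default limit is an arbitrary value chosen for illustration.

```shell
#!/bin/sh
# Sketch: empty a log file once it exceeds a size limit given in bytes.
trim_log() {
  log=$1
  limit=${2:-1048576}                 # arbitrary default of 1 MB
  [ -f "$log" ] || return 0
  size=$(( $(wc -c < "$log") ))       # arithmetic strips any padding from wc
  if [ "$size" -gt "$limit" ]; then
    : > "$log"                        # truncate in place, keeping the same file
    echo "truncated $log"
  fi
}

# Demonstration on a scratch file with a deliberately tiny limit.
LOGFILE=$(mktemp)
printf '%s\n' "some log line" > "$LOGFILE"
trim_log "$LOGFILE" 5
```

Truncating in place rather than deleting keeps the file's ownership and permissions, which matters when gwd runs under the HTTP server's user.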

Windows

Under Windows, calling a CGI script using batch and cmd.exe can be tricky. An alternative is to directly call a copy of gwd.exe, together with its arguments file gwd.arg, placed in the Apache /cgi-bin/ directory.

Gwd will work behind Apache calling http://localhost/cgi-bin/gwd.exe.

  • directory cgi-bin
    • file gwd.exe
    • file gwd.arg

Edit gwd.arg to point to your local GeneWeb installation, for example if it is C:\Program Files (x86)\geneweb\:

-hd
C:\Program Files (x86)\geneweb\gw
-bd
C:\Program Files (x86)\geneweb\bases
-log
C:\Program Files (x86)\geneweb\geneweb.log
-images_dir
C:\Program Files (x86)\geneweb\gw\images
-cgi

Problems to solve:

  • Portraits (all images with m=IMH) are not shown or are corrupted (#103).
  • Minor problems in 7.00 with loading of new .css/.js files, especially for templm (#356).

Images missing

The -images_dir parameter creates links to local image files, such as file:///c:\path\to\myimage.jpg, that some browsers (Chrome, for example) refuse to display for security reasons. An alternative to the previous configuration is to switch to the -images_url parameter, so that gwd uses relative paths for images, and to create a virtual directory in your httpd server.

In gwd.arg, modify:

-images_url
/images

If you use Apache, edit httpd.conf so that it contains the following lines, then restart httpd.exe:

LoadModule access_compat_module modules/mod_access_compat.so
LoadModule alias_module modules/mod_alias.so

Alias /images "C:\Program Files (x86)\geneweb\gw\images"

<Directory "C:\Program Files (x86)\geneweb\gw\images">
  Options None
  AllowOverride All
  Order allow,deny
  Allow from all
</Directory>

Script to upload a copy of a base

The files below are supplied for experienced users familiar with shell scripts and the subtleties of Linux. "Use at your own risk" is the traditional caveat! Reports of errors and suggestions for improvement are welcome.

The script base-upload.sh (download it) takes a base_name as an argument, and optionally images or src. It uploads a fresh base on your server, or a fresh copy of the images/base or src/base folders. It performs the following steps:

  • creates a fresh base.gw file from the base on your personal computer.
  • extracts a listing of the content of bases/images/base, bases/src/base and bases/src/base/images in three ls-xxx.txt files.
  • creates a remote.sh file (see below).
  • creates a tar file containing $BASE.gw, history, remote.sh and the three ls-*.txt files above.
  • sends the tar file to the remote server.
  • triggers on the remote server the unpacking of the tar file and the execution of remote.sh. The result of this remote execution should be a mail containing remote.log and three diffs between images folders.

This procedure saves the previous folder of the base or its images and src folders with a "yyyy-mm-dd-hh:mm:ss" date tag. As time progresses, you may want to clean-up this accumulation of saves.
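
The clean-up of accumulated saves can be prepared with a small helper that lists the oldest ones. The sketch below is an illustration, not part of base-upload.sh: the function name old_saves and the base name smith are hypothetical. It relies on the fact that the date tag sorts lexicographically, so a plain sort puts the oldest saves first.

```shell
#!/bin/sh
# Sketch: list the dated saves of a base that could be removed,
# keeping the newest ones.
old_saves() {
  dir=$1
  base=$2
  keep=$3
  total=$(( $(ls -d "$dir/$base"-????-??-??-*.gwb 2>/dev/null | wc -l) ))
  drop=$((total - keep))
  [ "$drop" -gt 0 ] || return 0       # nothing to remove
  ls -d "$dir/$base"-????-??-??-*.gwb | sort | head -n "$drop"
}

# Demonstration in a scratch folder holding three saves and the live base.
BASES=$(mktemp -d)
mkdir "$BASES/smith-2016-01-01-10:00:00.gwb" \
      "$BASES/smith-2016-06-01-10:00:00.gwb" \
      "$BASES/smith-2016-11-24-16:16:00.gwb" \
      "$BASES/smith.gwb"
old_saves "$BASES" smith 1            # lists the two oldest saves
```

The helper only prints candidates; reviewing the list before deleting anything is the safer course.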

remote.sh for base update

The shell script below is executed on your remote server, and installs your base at the proper location (according to the overall set-up described here).

#!/bin/sh
# BASE BASES_SERVER BIN_DIR_SERVER and ADDRESS will be replaced by the appropriate values by sed
cd
DATE=$(date +"%Y-%m-%d-%T")

echo 'Error log of update for base: BASE' > ~/remote.log
echo $DATE >> ~/remote.log

# Save current version 
cd BASES_SERVER 2>> ~/remote.log
if [ -d ./BASE.gwb ]  # do it only if folder exists
then
  mv ./BASE.gwb  ./BASE-$DATE.gwb 2>> ~/remote.log
fi

# Extract new base from .tar file
tar xf ~/BASE.tar 2>> ~/remote.log
# rebuild BASE from .gw file
BIN_DIR_SERVER/gwc1 -f -o BASE BASE.gw 2>> ~/remote.log
BIN_DIR_SERVER/updnldb BASE 2>> ~/remote.log

if [ -f ./BASE-$DATE.gwb/history ] # do it only if file exists
then
  cp -f ./BASE-$DATE.gwb/history ./BASE.gwb 2>> ~/remote.log
fi
if [ -f ./BASE-$DATE.gwb/forum ]  # do it only if file exists
then
  cp -f ./BASE-$DATE.gwb/forum   ./BASE.gwb 2>> ~/remote.log 
fi
# done
cd >> ~/remote.log

# Create a link to css.txt in the bases/src folder
ln -s -f ~/GeneWeb/geneweb-$VERS/gw/etc/css.txt BASES_SERVER/src/BASE/css.txt 2>> ~/remote.log

# compute diffs between server and personal computer for image folders
echo '' >> ~/remote.log
if [ -d BASES_SERVER/images/BASE ]     # do it only if folder exists
then
  ls BASES_SERVER/images/BASE          > ./ls-personnes-serveur.txt
  echo 'Personnes diff serveur vs local' >> ~/remote.log
  diff ./ls-personnes-serveur.txt ./ls-personnes.txt >> ~/remote.log
  # If >> ~/remote.log does not work on your server,
  # or if you do not have a /usr/bin/mail capability, globally replace it by ####
  # If the diff above fails, replace it by the wc command below, which
  # provides a first-level indication of discrepancies between the two folders
  #wc HOME_S/ls-personnes*  ####
  # Do the same operation for the other diff occurrences.

else
  echo "No folder BASES_SERVER/images/BASE" >> ~/remote.log
fi
echo '' >> ~/remote.log
if [ -d BASES_SERVER/src/BASE/ ]       # do it only if folder exists
then
  ls BASES_SERVER/src/BASE             > ./ls-src-files-serveur.txt
  echo 'Src files diff serveur vs local' >> ~/remote.log
  diff ./ls-src-files-serveur.txt ./ls-src-files.txt >> ~/remote.log
else
  echo "No folder BASES_SERVER/src/BASE" >> ~/remote.log
fi
echo '' >> ~/remote.log
if [ -d BASES_SERVER/src/BASE/images ] # do it only if folder exists
then
  ls BASES_SERVER/src/BASE/images      > ./ls-images-serveur.txt
  echo 'Images diff serveur vs local' >> ~/remote.log
  diff ./ls-images-serveur.txt ./ls-images.txt >> ~/remote.log
else
  echo "No folder BASES_SERVER/src/BASE/images" >> ~/remote.log
fi

rm ./ls-*.txt

/usr/bin/mail -s 'Update error log' 'ADDRESS' < ~/remote.log
rm -f ~/remote.*
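
The placeholder substitution mentioned in the script's first comment could look like the sketch below. It is an assumption about how sed might be used, not an extract of base-upload.sh: the function name fill_template and the sample values are hypothetical. BASES_SERVER is substituted before BASE because it contains the string BASE; the values themselves must not contain |, & or an upper-case BASE.

```shell
#!/bin/sh
# Sketch: fill the placeholders of a remote.sh template read from stdin.
# Arguments: base name, bases folder, binaries folder, mail address.
fill_template() {
  sed -e "s|BASES_SERVER|$2|g" \
      -e "s|BIN_DIR_SERVER|$3|g" \
      -e "s|ADDRESS|$4|g" \
      -e "s|BASE|$1|g"
}

# Demonstration on two template lines.
printf '%s\n' 'cd BASES_SERVER' 'mv ./BASE.gwb ./BASE-old.gwb' |
  fill_template smith /home/user/bases /home/user/gw me@example.org
```

The | delimiter is used instead of / so that folder paths can be passed without escaping.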

remote.sh for images or src update

The shell script below is executed on your remote server, and installs the images or src/images folders for your base at the proper location (according to the overall set-up described here).

Depending on the size of your images folders, and on the number of new images requiring upload, you may prefer to upload images individually with your preferred ftp client rather than doing the bulk upload proposed here. Remember that the base upload script performs a comparison between the images folders on your personal computer and your server.

#!/bin/sh
# BASE BASES_SERVER and ADDRESS will be replaced by the appropriate values by sed

DATE=$(date +"%Y-%m-%d-%T")

if [ -e ./images.tmp ]
then
  FILES="images"
  rm -f ./images.tmp
fi
if [ -e ./src.tmp ]
then
  FILES="src"
  rm -f ./src.tmp
fi

echo "Error log of update for $FILES of base: BASE" > ~/remote.log
echo "`date`" >> ~/remote.log 

# Save current version then move new version
if [ -d BASES_SERVER/$FILES/BASE ]  # do it only if folder exists
then
  mv BASES_SERVER/$FILES/BASE  BASES_SERVER/$FILES/BASE-$DATE 2>> ~/remote.log 
fi
mv ./BASE BASES_SERVER/$FILES 2>> ~/remote.log 

/usr/bin/mail -s 'Update error log' 'ADDRESS' < ~/remote.log
rm -f ~/remote.*

Debugging your remote server

Debugging your remote server may be tricky, but a systematic approach and several tools will help.

  • Verify first that your HTTP server works properly. To do so, type your_server_name in the URL field of your browser; it should return the content of index.html.
  • Verify that the CGI mechanism works. To do so, type your_server_name/cgi-bin/test-cgi.sh as seen above (see #Folders and files).
If your server returns an "Internal server error", several hypotheses should be examined:
  • Your script test-cgi.sh does not work properly. Run it directly in a terminal window by typing ./test-cgi.sh (see #Folders and files).
  • Your server accepts only files with extension .cgi. Rename test-cgi.sh into test-cgi.cgi and try again.
  • The first two lines returned by your test-cgi.sh script are not exactly as shown (the second one should be a blank line; note also that "Content-type text/html\n\n" did not work in my tests).
  • You may also want to verify that your end-of-line characters are correct for your server environment. Remember that there are three different EOL encodings, for Windows, Mac, and Linux!!
  • Examine the HTTP server log file on your server by typing:
tail ~/logs/access.logs.current
(your environment may store access logs somewhere else, but this is the most likely place)
Examination of the log file may give you a hint as to the nature of your problem.
  • Another typical issue is that of ownership of the various files associated with GeneWeb. In the case described here, ownership is user and group ownership is ftpusers. Depending on the specific method you have used to install GeneWeb, this may vary. Some discussions on the Yahoo! group forum about GeneWeb mention geneweb for group ownership!
  • You may also want to look at the gwd log file if your GeneWeb server works only partially (the welcome page is ok, but other pages do not work properly). This log file is specified in the gwd launch command. In our example, it sits at $DIR/gwd.log where $DIR depends on the specific version of GeneWeb you are currently running (see details in #The cgi folder section above).
  • You may end up in a situation where GeneWeb works only partially; in particular, it may miss some css style sheets or JavaScript files. One way of debugging this kind of problem consists in exploring the HTML source file produced by GeneWeb (your browser offers this capability, aimed at "developers"). In such source files, you will find references to css and JavaScript files whose source should appear in your browser if you click on the link. In case of bad configuration, you will obtain an error message such as "File not found" or "You do not have permission"…
Remember that some of the files needed for proper display are fetched by the HTTP server rather than by GeneWeb.
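
The end-of-line check in the list above can be automated. The sketch below is an illustration with hypothetical function names (has_crlf, strip_crlf): it detects Windows-style CRLF endings in a script and removes the carriage returns while keeping the file's permissions.

```shell
#!/bin/sh
# Sketch: detect and remove Windows (CRLF) line endings in a script.
has_crlf() {
  cr=$(printf '\r')
  grep -q "$cr" "$1"
}

strip_crlf() {
  # Write back into the original file so that its mode (exec bit) is kept.
  tr -d '\r' < "$1" > "$1.tmp" && cat "$1.tmp" > "$1" && rm -f "$1.tmp"
}

# Demonstration on a scratch file written with CRLF endings.
f=$(mktemp)
printf '#!/bin/sh\r\necho hello\r\n' > "$f"
has_crlf "$f" && echo "CRLF found"
strip_crlf "$f"
has_crlf "$f" || echo "clean"
```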

Access rights and protections

In CGI mode, gwd runs behind a standard HTTP server, typically Apache. As such, the owner of the gwd process is the owner of the HTTP server. For instance, with Apache, this owner is defined in the httpd.conf file and its default value is set to user: _www and group: _www. When running within a hosting environment, such as 1&1, you typically do not have access to this parameter. You must therefore organise the protection level of the various folders involved with GeneWeb appropriately:

  • read access is usually allowed by default
  • for wizards, gwd needs write access to bases/cnt/actlog, bases/cnt/robot (if you have activated the -robot_xcl parameter)
  • and when modifications are performed, gwd needs write access to bases/basename.lck, bases/basename.gwb/notes_link and to bases/basename.gwb/patches

Creating those files, and doing a sudo chown _www filename seems to be sufficient.
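
Before changing ownership, it can help to see which of these files are missing or read-only. The sketch below is only an illustration: the function name check_writable is hypothetical, and the answer is meaningful only when the script runs as the HTTP server's user (root, for instance, can write everywhere).

```shell
#!/bin/sh
# Sketch: report files that are missing or not writable for the current user.
check_writable() {
  status=0
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path"
      status=1
    elif [ ! -w "$path" ]; then
      echo "not writable: $path"
      status=1
    fi
  done
  return $status
}

# Demonstration in a scratch folder where only cnt/actlog exists.
BASES=$(mktemp -d)
mkdir "$BASES/cnt"
: > "$BASES/cnt/actlog"
check_writable "$BASES/cnt/actlog" "$BASES/cnt/robot" || echo "fix the items listed above"
```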

Access problems are reported in the HTTP error log file, which is unfortunately not always available in the case of shared hosting services!!

Access to various files through your hosted HTTP server is also controlled by your hosting service and through .htaccess files spread across your folders. Managing a coherent set of .htaccess files is not trivial and is error-prone!! The examples below apply to Apache version 2.4 only (earlier versions have different directives!!).

Options +FollowSymLinks should allow following symbolic links

Require all denied prevents access to this folder to all users. Require all granted allows access to this folder to all users.

AllowOverride AuthConfig
AuthType Basic
AuthName "Username/Password required"
AuthUserFile /Users/Name/SomeFolder/htfriends.auth
Require valid-user

should restrict access to this folder to users supplying a valid password as defined by the htfriends.auth file.

htpasswd /Users/Name/SomeFolder/htfriends.auth UserName will trigger the process to add a new user to the list.


