Voicemail On The Web

Why have an answering machine and CLID box when your computer can do it all? Here's how I set up a Unix box (Linux 2.4) to be an answering machine that's also a web interface to voicemail messages. No more arcane DTMF codes to get your messages when you're away! Unlimited message storage space! Backups!

I've tried to make everything simple and functional: easy to understand rather than shiny and multi-coloured. You might want a web page with Javascript (I mean Ajax™. Or do I mean Web2.0™?), or a fancy voicemail system with lots of menus, multiple inboxes, extensions, whatever. Go ahead and add them: it's just a shell script.

Cave! You must have basic system administration skills to make any sense of this document. You should know how to install hardware and set up web servers (possibly with security). You should be able to patch, compile and install software. If you don't have these skills, find someone who does; then make her read this page and help you. Nothing here is Linux-specific: any system that recognises the modem should be fine.


Here's a sample of the voicemail web page:


Delete  Time                  Duration  Name  Number  Subject
        2005-02-14  14:50:25  0:29
        2005-03-21  15:33:27  0:28

The system fills in the "Number" and "Name" fields from the caller-ID info (if present). The subject starts out blank. Note that all of name, number, and subject are editable. (The "Go!" button commits any changes.) The timestamp is a link to the WAV file that's the message. Access can be controlled by whatever means you use for HTTP. (I use good ol' "Basic" authentication over SSL.)


I use a US Robotics/3Com modem that includes voice and CLID features, the "56K FaxModem Model 5610". I had a hell of a time getting one; sellers don't seem to distinguish between the voice-capable and the fax/data only versions. (Anyone want a no-voice Model 5610 fax modem? I have one I'm not using.) It's a PCI card that shows up as ttyS4 on my system. The choice of modem is the most important part of this whole endeavour since the market is flooded with those crappy WinModems that will only work under Windows, or do voice only on Windows, or not return CLID info, or.... The modem landscape is a vast and messy quagmire. Check the vgetty modem database before you buy.

The modem-handling software is mgetty+sendfax and vgetty. These are the version strings:

vgetty: experimental test release 0.9.32 / 24Dec01
mgetty: experimental test release 1.1.30-Dec16

Vgetty is started at system boot time by putting it in charge of ttyS4 in inittab:

# For modem
SX:345:respawn:/usr/local/sbin/vgetty ttyS4

I did have to patch vgetty to exec the "incoming call" callback (to deliver CLID) as soon as the information is available and not wait till the modem goes off-hook. This allows CLID-based call filtering, as you would expect from a CLID-equipped phone.

The config file /etc/mgetty+sendfax/voice.conf has these lines (among others) to configure vgetty:

answer_mode voice
call_program /usr/local/sbin/answering-machine.sh

By setting "call_program" we bypass vgetty's normal call handling and substitute the shell-script that's our answering machine.

The Answering Machine

This is the script that is the "answering machine" -- menus, file playback and recording etc. The shell script communicates with the voice library (vgetty) for DTMF tones, playback/record of files etc. via the file descriptors $VOICE_INPUT and $VOICE_OUTPUT. Here's /usr/local/sbin/answering-machine.sh:


#!/bin/bash
exec 2> /var/tmp/answering-machine.out

# Answering machine script for vgetty
# Also see http://alpha.greenie.net/vgetty/readme.voice_shell.html



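# This is the format for filenames. NAME (the message path without its
# extension), OUT_MSG (the greeting, as an .rmd file) and PROGNAME must
# already be set at this point.
FNAME=$(echo "$NAME" | tr ' ' '+')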

export PATH="/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin"

# the function to receive an answer from the voice library
function receive {
     read -r INPUT <&$VOICE_INPUT;
     echo "$INPUT";
}

# the function to send a command to the voice library
function send {
     echo "$1" >&$VOICE_OUTPUT;
     kill -PIPE $VOICE_PID
}

# read the next answer; log $2 and bail out if it isn't $1
function expect {
    if [ "$1" != "$(receive)" ]; then
        logger -p user.err -t vmail "${PROGNAME}: $2"
        exit 0
    fi
}

# send command $1, then expect answer $2 (error message in $3)
function exch {
    send "$1"
    expect "$2" "$3"
}

function log {
    logger -p user.info -t vmail "$*"
}


# Perform handshake
expect "HELLO SHELL" "voice library not answering"
exch "HELLO VOICE PROGRAM" "READY" "initialization failed"

# output device
exch "DEVICE DIALUP_LINE" "READY" "could not set output device"

# This is the answering machine call flow

if [ -f "$OUT_MSG" ]; then
    exch "PLAY $OUT_MSG" "PLAYING" "could not start playing \"${OUT_MSG}\""
    expect "READY" "something went wrong!"
else
    log "Couldn't find $OUT_MSG"
fi

exch "BEEP" "BEEPING" "could not send a beep"
expect "READY" "couldn't send a beep"

exch "RECORD ${FNAME}.rmd" "RECORDING" "couldn't record ${FNAME}.rmd"
expect "READY" "error while recording"

exch "GOODBYE" "GOODBYE SHELL" "couldn't say goodbye"

# Format it
if rmdtopvf "${FNAME}.rmd" | pvftowav > "${NAME}.wav"; then
    rm "${FNAME}.rmd"
fi

# Change the owner of the voicemail file so a CGI can manipulate it
chown apache "${NAME}.wav" || exit 1

exit 0

Incoming messages are saved in /var/www/html/vmail as "${time}.${number}.${name}.${subject}" so they can be manipulated by the CGI script. (No database required to store call metadata.) Special characters (special to either the filesystem like '/', or others like space and '<') are URL-quoted. Also note that only ".rmd" files (basically the already encoded form ready for the DSPs on the modem card) are read and written by vgetty; subsidiary programs have to be used to convert to/from standard formats.
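
To make the naming scheme concrete, here is a minimal bash sketch of the URL-quoting and of splitting a stored filename back into its fields. The function names (vm_urlencode, vm_urldecode, vm_parse) and the sample filename are made up for illustration and are not part of the scripts on this page; the set of characters left unquoted is the one the CGI further down uses. It assumes ASCII input and the date.time.number.name.subject.wav layout.

```shell
#!/bin/bash
# Illustrative helpers only; the names are hypothetical.

# Percent-encode every character outside letters, digits and "-,:!~".
# Note that "." and "/" are encoded too, so "." can separate fields.
vm_urlencode() {
    local LC_ALL=C s=$1 out= c i
    for ((i = 0; i < ${#s}; i++)); do
        c=${s:i:1}
        case $c in
            [a-zA-Z0-9,:!~-]) out+=$c ;;
            *) out+=$(printf '%%%02X' "'$c") ;;
        esac
    done
    printf '%s\n' "$out"
}

# Reverse of the above. Assumes the input came from vm_urlencode, so it
# contains no literal backslashes and every "%" starts a %HH escape.
vm_urldecode() {
    local s=$1
    printf '%b\n' "${s//%/\\x}"
}

# Split "date.time.number.name.subject.wav" into decoded fields,
# printed as date|time|number|name|subject.
vm_parse() {
    local base=${1%.wav} date time number name subject
    IFS=. read -r date time number name subject <<< "$base"
    printf '%s|%s|%s|%s|%s\n' "$date" "$time" \
        "$(vm_urldecode "$number")" \
        "$(vm_urldecode "$name")" \
        "$(vm_urldecode "$subject")"
}
```

For example, vm_parse '2005-02-14.14:50:25.4155551234.JOHN%20DOE..wav' yields the date, the time, the number, "JOHN DOE" and an empty subject. Because "." never appears unquoted inside a field, a plain split on "." is enough.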

The outgoing message has to be recorded, converted, and placed into $OUT_MSG manually. A fancy interface, possibly via the web page, would be nice.

Modifications to vgetty for CLID callback

Vgetty normally waits till the modem picks up (usually 4 rings, set in the config file) before reporting the caller-ID it received. We want CLID displayed as soon as we have it, so with the patch below vgetty calls an external program as soon as a call with CLID comes in.

The callback program is called with the tty the call came in on, the name, and number. Since that program is arbitrary, you can make it do whatever you want. I use a little script that checks to see if tvtime is running and if so, puts up an OSD display; if no tvtime then it puts up a little window. Awfully convenient to have CLID show up on the screen when you're watching a movie and trying to decide if you should answer the phone. (The new window and OSD go away on their own -- no need for an annoying "Press OK" to dismiss.)

*** mgetty-1.1.30/ring.c        2002-12-05 12:29:10.000000000 -0800
--- mgetty-1.1.30-phliar/ring.c        2006-05-01 12:21:42.000000000 -0700
*** 28,33 ****
--- 28,35 ----
  #include "tio.h"
  #include "fax_lib.h"

+ #define CND "/usr/local/sbin/log-cnd.sh"
  /* strdup variant that returns "" in case of out-of-memory */
  static char * safedup( char * in )
*** 221,226 ****
--- 223,231 ----
  int   rc = SUCCESS;
  boolean       got_dle;                /* for  events (voice mode) */

+     char caller_id[30], caller_name[100];
+     caller_id[0] = caller_name[0] = 0;
      lprintf( L_MESG, "wfr: waiting for ``RING''" );
      lprintf( L_NOISE, "got: ");

*** 316,321 ****
--- 321,339 ----
             strncmp( buf, "TO:", 3 ) == 0 )
            { *dist_ring_number = ring_handle_ZyXEL( buf, msn_list ); break; }

+       if ( strncmp( buf, "NMBR", 4 ) == 0)
+           snprintf(caller_id, sizeof(caller_id), "%s", buf+7);
+       else if ( strncmp( buf, "NAME", 4 ) == 0)
+           snprintf(caller_name, sizeof(caller_name), "%s", buf+7);
+       if (*caller_id && *caller_name) {
+           char buf[BUFSIZ];
+           snprintf(buf, sizeof(buf),
+                    "sh %s ttyS4 \"%s\" \"%s\"",
+                    CND, caller_name, caller_id);
+           lprintf(L_MESG, "%s", buf);
+           system(buf);
+       }
        /* Rockwell (et al) caller ID - handled by cndfind(), but
         * we count it as "RING" to be able to pick up immediately
         * instead of waiting for the next "real" RING

To do: make the CLID callback command configurable in voice.conf instead of hard-coded to /usr/local/sbin/log-cnd.sh! The same goes for the tty parameter handed to the script, which the patch fixes at ttyS4.
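
The callback itself is not shown on this page, so here is a minimal bash sketch of what a script in the place of log-cnd.sh could look like. Everything in it (the sanitize and format_cnd helpers, the message wording, the set of allowed characters) is an assumption for illustration; only the argument order (tty, name, number) comes from the patch above.

```shell
#!/bin/bash
# Hypothetical stand-in for /usr/local/sbin/log-cnd.sh.
# Arguments, in the order the patched vgetty passes them:
#   $1 = tty, $2 = caller name, $3 = caller number

# CLID text comes straight off the phone line, so treat it as untrusted:
# keep only letters, digits, space and a little punctuation.
sanitize() {
    printf '%s' "$1" | tr -cd 'A-Za-z0-9 .,+-'
}

# Build the one-line message to log or display.
format_cnd() {
    local tty name number
    tty=$(sanitize "$1")
    name=$(sanitize "$2")
    number=$(sanitize "$3")
    printf 'Call on %s from %s (%s)\n' \
        "$tty" "${name:-unknown}" "${number:-unknown}"
}

# In a real callback the message would go to syslog or an on-screen
# display, for example:
#   format_cnd "$@" | logger -p user.info -t vmail
```

Cleaning the strings inside the script only protects whatever the script does with them. The patch still builds its system() command line from the raw name and number, so that call would need the same care.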

The Web-Page

The web page is constructed by a CGI. Here's an example (in Unicon) that understands the filename encoding that answering-machine.sh uses.

# Controlling directories full of vmail
# Filename is
#     date.time.number.name.subject.wav
# but since name, number, and subject are user-editable,
# they're url-encoded. Hence a file's unique identifier is
#     date.time.*
# (makes the assumption there can never be two messages in the
# same second)
# $Id: web-vmail.html,v 1.10 2006/05/02 03:11:05 shamim Exp $

link cgi

$define file_encode urlencode
$define file_decode urldecode

# A few procedures that do the HTML

procedure cgiheaders()
    writes("Content-type: text/html\r\n\r\n")
    write("<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">")
    write("  <!-- Date: ", &dateline, "  -->")
    write("  <HEAD>")
    write("    <TITLE>Messages</TITLE>")
    write("    <STYLE type=\"text/css\">")
    write("    <!--")
    write("      TD { text-align: center }")
    write("    -->")
    write("    </STYLE>")
    write("  </HEAD>")
end

procedure cgititle()
end

procedure write_header()
    write("  <BODY>")
    write("    <CENTER>")
    write("      <H1>Messages</H1>")
    write("    </CENTER>")
end

procedure start_form()
    write("    <FORM method=\"POST\" action=\"/cgi-bin/vmail\">")
    write("      <TABLE>")
    write("        <TR>")
    write("          <TH>Delete</TH><TH>Time</TH><TH>Duration</TH>")
    write("          <TH>Name</TH><TH>Number</TH><TH>Subject</TH>")
    write("        </TR>")
end

procedure end_form()
    write("      </TABLE>")
    write("      <BR>&nbsp;<BR>")
    write("      <CENTER><INPUT type=\"submit\" value=\" Go! \"></CENTER>")
    write("    </FORM>")
end

procedure write_footer()
    write("  </BODY>")
end


# This is it: the main handler.

# The HTML table we use for the messages: in each row, the delete
# button is called DEL-$uid; the name textedit field is NAME-$uid,
# number is NUM-$uid, and subject is SUBJ-$uid. The DEL action takes
# precedence over any text edit fields.

procedure cgimain()
    dir := "/var/www/html/vmail/"

    # Look through cgi[] for keys DEL-*, NAME-*, or NUM-*

    names := table()
    numbers := table()
    subjects := table()
    deletes := list()

    # Collect all the attributes from the URL

    L := sort(cgi, 1)
    every l := !L do {
	k := urldecode(\l[1])
	v := l[2]

        \k ? {
	    if ="DEL-" then
		push(deletes, tab(0))

	    else if ="NAME-" then
		names[tab(0)] := v

	    else if ="NUM-" then
		numbers[tab(0)] := v

	    else if ="SUBJ-" then
		subjects[tab(0)] := v
        }
    }
    deletes := set(deletes)    

    # Read the vmail directory into a list called "files"

    files := list()
    f := open(dir)
    every fname := !f do {
	if fname[-4:0] == ".wav" then
	    push(files, fname)
    }
    files := sort(files)

    # Generate HTML


    if *files = 0 then {
	write("No messages.")
    }
    else {
	every handle(dir, names, numbers, subjects, deletes, !files)
    }
end


# This procedure is called for each file in the directory
#     dir = "/var/www/html/vmail/"
#     fn = filename (no leading dir)
procedure handle(dir, names, numbers, subjects, deletes, fname)
    # newline
    nl := "\n        "

    # Get file metadata

    ffullpath := dir || fname
    r := stat(ffullpath)
    sec := (r.size / 8000)

    mins := sec/60
    sec -:= mins * 60
    if sec < 10 then sec := "0" || sec
    m_duration := mins || ":" || sec

    # Parse the filename to extract timestamp, subject, and sender

    m_date := m_time := m_name := m_number := m_subject := &null

    fname ? {
	m_date := tab(upto('.'))
	move(1)
	m_time := tab(upto('.'))
	move(1)
	m_number := cnum(file_decode(tab(upto('.'))))
	move(1)
	# User-entered fields are encoded to get the filename
	m_name := cname(file_decode(tab(upto('.'))))
	move(1)
	m_subject := file_decode(tab(upto('.')))
	move(1)
	m_suffix := tab(many(~'.'))
	pos(0) | (err() & continue)
    }

    # Construct ID for this message
    msg_id := m_date || "." || m_time
    writes(nl, "<!-- file ", image(ffullpath),
	   " msg_id ", image(msg_id), " -->")

    # What action do we need to take?

    if member(deletes, msg_id) then {
	# delete the file
	rm(ffullpath)
	write(nl, "<!-- Deleted ", msg_id, " -->")
	return
    }


    # Any changes to subject or sender?

    m_name := \names[msg_id]
    m_number := \numbers[msg_id]
    m_subject := \subjects[msg_id]
    writes(nl, "<!-- num/name/subj ", image(m_number), " ", image(m_name))
    writes(" ", image(m_subject), " -->")

    newfilename := msg_id || "." || file_encode(m_number)
    newfilename ||:= "." || file_encode(m_name)
    newfilename ||:= "." || file_encode(m_subject)
    newfilename ||:= ".wav"
    newfilepath := dir || newfilename

    if ffullpath ~== newfilepath then {
        # Yes, the user has changed something
	writes(nl, "<!-- rename ", image(ffullpath))
	write(nl, "          to ", image(newfilepath), " -->")
	rename(ffullpath, newfilepath) | write("<!-- Rename failed!!! -->")
    }
    fname := newfilename

    emit(msg_id, fname, m_name, m_number, m_subject, m_duration, nl)
end

# Construct a table row for the given vmail
procedure emit(msg_id, fname, m_name, m_number, m_subject, m_duration, nl)
    writes(nl, "<TR>")
    nl ||:= "  "
    writes(nl, "<TD><INPUT type=\"checkbox\" name=")
    writes(image("DEL-" || msg_id), "></TD>")

    # The timestamp links to the WAV file itself
    tstamp := (msg_id ? tab(upto('.'))||(move(1) & "&nbsp;&nbsp;")||tab(0))
    writes(nl, "<TD>")
    writes("<A href=\"/vmail/", urlencode(fname), "\">", tstamp, "</A></TD>")

    writes(nl, "<TD>", m_duration, "</TD>");

    # Editable fields, pre-filled with the current values
    writes(nl, "<TD><INPUT type=\"text\" size=\"15\" value=")
    writes(image(m_name))
    writes(" name=", image("NAME-" || msg_id), "></TD>")

    writes(nl, "<TD><INPUT type=\"text\" size=\"15\" value=")
    writes(image(m_number))
    writes(" name=", image("NUM-" || msg_id), "></TD>")

    writes(nl, "<TD><INPUT type=\"text\" size=\"20\" value=")
    writes(image(m_subject))
    writes(" name=", image("SUBJ-" || msg_id))
    writes("></TD>")

    nl := nl[1:-2]
    write(nl, "</TR>")
end


# A few utilities

# Canonicalise name
procedure cname(s)
    static ws
    initial ws := ' \t\r\n'
    # Don't re-canonicalise once the user has entered something
    if s ? upto(&lcase) then return s

    n := ""
    s ? repeat {
	#write("<!-- cname: ", prenv(), " -->")
        n ||:= capitalise(tab(many(~ws))) || " "
	#write("<!-- cname: ", prenv(), " ", image(n), " -->")
        tab(many(ws))                   # skip the whitespace between words
        if pos(0) then break
    }
    n := n[1:-1]
    write("<!-- cname: ", image(s), " = ", image(n),  " -->")
    return n
end

# Canonicalize number
procedure cnum(s)
    # Don't re-canonicalise once the user has entered something. We
    # never use spaces, and users usually do.
    if s ? find(" ") then return s

    n := ""
    s ? {
        if ="011" then n := "+"
        n ||:= tab(0)
    }
    n ? {
        tab(0)                          # group the digits from the right
        num := ""
        num ||:= move(-4)
        while s := move(-3) do
            num := s || " " || num

        pos(1) | (num := tab(1) || " " || num)
    }
    return num
end

# Canonicalize a path. Remove multiple / and /../ and /./ etc. The
# returned value will be absolute, i.e. begin with a "/"

procedure canon(s)

    path := []
    s ? until pos(0) do {
	tab(many('/'))                  # skip the separator(s)
	comp := tab(upto('/') | 0)
	if comp == ".." then
	    pull(path)                  # drop the last component, if any
	else if *comp > 0 & comp ~== "." then
	    put(path, comp)
    }

    r := "/"
    every r ||:= (!path || "/")
    return if *path = 0 then r else r[1:-1]
end

procedure basename(s)
    s? {
	while tab(upto('/')) do
	    move(1)
	return tab(upto('.') | 0)
    }
end

procedure rm(f)
    write(&errout, "Removing ", image(f))
    return remove(f)
end

procedure err()
    write("<!-- Parse unsuccessful: ", prenv(), "-->")
end

procedure urlencode(s)
    static special
    initial special := ~(&letters ++ &digits ++ '-,:!~')   # No / or .
    return  s? (tab(upto(special)) ||
		(c := move(1) & quote(c)) ||
		urlencode(tab(0))) |
	s
end


procedure urldecode(s)
    static hexen
    initial hexen := &digits ++ 'ABCDEF'
    return s? (tab(find("%")) || 
	       (move(1) &
		(c1 := tab(any(hexen))) &
		(c2 := tab(any(hexen))) & 
		hexchar(c1,c2)) || urldecode(tab(0)))  | 
	s
end

procedure toupper(s)
    return map(s, &lcase, &ucase)
end

procedure tolower(s)
    return map(s, &ucase, &lcase)
end

procedure hexval(c)
    if any(&digits, c) then return integer(c)
    if any('ABCDEF', c) then return ord(c) - ord("A") + 10
end

procedure hexchar(c1,c2)
     return char(hexval(c1) * 16 + hexval(c2))
end

procedure quote(c)
    static hexes
    initial hexes := &digits || "ABCDEF"
    i := ord(c)
    q := "%" || hexes[1 + i/16] || hexes[1 + i%16]
    # write("<!-- ", image(c), " = ", i, " 0x", q, " -->")
    return q
end

procedure capitalise(s)
    return toupper(s[1]) || tolower(s[2:0])
end

# For debugging
procedure prenv(sep)
    return &subject[1:&pos] || (\sep | "|") || &subject[&pos:0]
end
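
The CGI estimates a message's length from its file size by assuming 8000 bytes of audio per second. The same arithmetic in bash, as a small sketch: the function name vm_duration is made up for illustration, and the 8000 figure is simply the one the handle procedure above uses.

```shell
#!/bin/bash
# Hypothetical helper: message length as M:SS from a size in bytes,
# assuming 8000 bytes per second (the figure the CGI uses).
vm_duration() {
    local bytes=$1 sec mins
    sec=$(( bytes / 8000 ))
    mins=$(( sec / 60 ))
    sec=$(( sec % 60 ))
    printf '%d:%02d\n' "$mins" "$sec"
}
```

It could be fed from the shell with something like vm_duration "$(wc -c < message.wav)". Like the CGI, it ignores the few bytes of WAV header, which hardly matters at one-second resolution.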



Copyright © 2006 Shamim Mohamed
This document is under the Creative Commons Attribution-ShareAlike License.
$Date: 2006/05/02 03:11:05 $ Last modified: Tue May 2 12:50:02 PDT 2006