Programmers’ Patch: 2015

Thursday, December 17, 2015

Tomcat 7 error

A few people seem to be bitten by an error in the Ubuntu distribution of Tomcat7. Basically when you shut down the service it gives the following messages:

INFO: Destroying ProtocolHandler ["http-bio-8080"]
Dec 18, 2015 1:46:15 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
WARNING: Problem with directory [/usr/share/tomcat7/common/classes], exists: [false], isDirectory: [false], canRead: [false]
Dec 18, 2015 1:46:15 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
WARNING: Problem with directory [/usr/share/tomcat7/common], exists: [false], isDirectory: [false], canRead: [false]
Dec 18, 2015 1:46:15 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
WARNING: Problem with directory [/usr/share/tomcat7/server/classes], exists: [false], isDirectory: [false], canRead: [false]
Dec 18, 2015 1:46:15 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
WARNING: Problem with directory [/usr/share/tomcat7/server], exists: [false], isDirectory: [false], canRead: [false]
Dec 18, 2015 1:46:15 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
WARNING: Problem with directory [/usr/share/tomcat7/shared/classes], exists: [false], isDirectory: [false], canRead: [false]
Dec 18, 2015 1:46:15 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
WARNING: Problem with directory [/usr/share/tomcat7/shared], exists: [false], isDirectory: [false], canRead: [false]

Opinions differ on what causes it, but it is pretty clear: Tomcat is assuming that the shared, server and common directories, which are in /var/lib/tomcat7, are in /usr/share/tomcat7. In other words, the configuration has mixed up catalina.home with catalina.base: catalina.base is supposed to be /var/lib/tomcat7 and catalina.home is /usr/share/tomcat7. All you have to do is edit the file /var/lib/tomcat7/conf/catalina.properties so that all references to those directories have the correct prefix. I left the "common.loader" line alone, as changing it created some weird effects, but the entries lower down the file are the ones that are mixed up. Restart Tomcat (or reboot) and you should be good.

Wednesday, November 4, 2015

Allow local File access via jQuery.ajax in Chrome/Chromium

It is often useful to package a website onto a local file system without using a web-server. I wanted to create a web-archive of a site and then substitute the jQuery get calls with local file reads. That way I would not need to access the Internet for the web-archive to work, and I could give it to other people on a USB stick, and they wouldn't have to install a webserver to run it. So I decided to use jQuery.get or jQuery.ajax to read the local file. They would all be JSON files, since that is what my server returns, but you can tweak it for other formats. After a bit of fiddling I got it right:

<!doctype HTML>
<html><head>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
<link id="link" rel="external" href="data.json">
<script src="jquery-1.11.3.min.js"></script>
<script>
$(document).ready(function(){
  $.ajax({
    url: "data.json",
    beforeSend: function(xhr){xhr.overrideMimeType("application/json");},
    dataType: "json",
    success: function(data){$("#result").html(data.text)}
  });
});
</script>
</head>
<body>
<p id="result"></p>
</body>
</html>

This reads the local file data.json and sets the contents of the #result element to the value of the "text" property. Here's my data.json just to be clear:

{"text": "Oh my word!"}

You also need a local copy of jQuery. Now this works just dandy on Safari and Firefox and I'm told in IE, but not in Opera or Chromium. Chromium says that this is a cross-origin request. I don't see why. I opened my HTML file using the file:// protocol and that tried to open a file in the same directory using the file:// protocol. How is that cross-origin? Because it made me cross? But the workarounds in Chromium/Chrome are dire: they suggest installing a local webserver – ridiculous. The only reason I am doing this is to avoid that. Or they say use the --allow-file-access-from-files option when launching chromium. But it doesn't work, at least not on Linux. However, I discovered that if you add --allow-file-access as a second option it works, though neither works on its own. I saved the options in /usr/share/applications/chromium.desktop under the Exec= property. It's a bit of a pain to ask people to do that but it is far better than installing a webserver.

Here's my Exec line from chromium.desktop:

Exec=chromium-browser --allow-file-access-from-files --allow-file-access %U

The fix for Opera is more sensible: you just set Allow File XMLHttpRequest in the User Prefs section of opera:config.

Saturday, July 4, 2015

Swapping suffixes on file names in bash

A common problem when writing shell scripts is to swap suffixes for file names. For example, I wanted to translate a batch of markdown files to html but also to apply a sed script so I could get curly quotes and long dashes etc. To do that I needed to create a temporary file, so I had to go from file.md to file.tmp to file.html. Each time I needed to swap suffixes. Having looked around I couldn't find a neat way to do that: most of the solutions I found used expr, which starts a new process. I wanted to do it natively in bash or even dash (the default Ubuntu shell), although the substring expansion used here is a bash feature, so it will not run under dash as written. So I wrote a trivial but neat function and a test, which can be stripped out. The function is all you need:

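#!/bin/bash
# swap NAME OLD NEW: print NAME with the suffix OLD replaced by NEW.
# ${var:offset:length} is a bash extension, so this needs bash, not dash.
function swap {
    echo "${1:0:(${#1}-(${#2}+1))}.$3"
}
# test: prints banana.tmp
swap "banana.md" "md" "tmp"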

To use it in a real script just use backticks thus:

...
function swap {
    echo "${1:0:(${#1}-(${#2}+1))}.$3"
}
markdown myfile.md > `swap "myfile.md" "md" "html"`
...

Sunday, June 14, 2015

Synchro-scrolling three or more columns

I wanted to make a display that had three parallel windows. The left one would show a succession of page-images of some source document; the middle one an editable transcription of the document's entire text content in a Markdown-like language; the third a rendition of that text as HTML. This gave the user the same information in three forms that were intrinsically out of sync with one another, with each column having a different height and layout of information. As you scroll down one column you would naturally like the other columns to scroll in sync, so that some point on the page, say the middle, would show the same content in all three and the user would not lose their way. One attempt can be seen at the ecdosis Web site. Keeping track of how far down the textarea each page-number is, and of the corresponding pixel positions in the other two views, is an implementation detail I'll leave to the reader, although my code is available at that site. More than likely, however, you'd want to do that your own way.

The feedback problem

The key problem with all such displays is this: if I scroll column 2, and then set the scrollTop attribute of the other two columns, this will generate new secondary scroll events for columns 1 and 3 that are indistinguishable from the original event. In jQuery you can test the event.originalEvent field of the scroll event, but it is mostly set even when it isn't an original event. The result is uncontrollable feedback. The display can freeze as the columns talk to one another: one scrolls it down, the other slightly up, setting it vibrating. You can use the jQuery.scroll method but you have to surrender control of the event feedback again. The result is choppy and not at all smooth.

My solution is simple. All you do is set some global flag to the name of the currently scrolling column. Initially this is undefined, but on first scrolling, say, column 2 (the "textarea"), the global variable scroller is set to "textarea". Now in the scroll handler for each column all you do is test whether scroller is undefined or already names that column; if it names a different column, the event is a secondary one and the handler does nothing. (Of course your code will be different. This is just an example):

// scroll the textarea
    $("#"+opts.source).scroll(function(e) {
        // prevent feedback
        if ( self.scroller==undefined||self.scroller=="textarea" )
        {
            self.scroller = "textarea";
            var loc = self.getSourcePage($(this));
            // scroll HTML "target"
            self.scrollTo(loc,self.html_lines,$("#"+self.opts.target),1.0);
            // scroll "images"
            self.scrollTo(loc,self.image_lines,$("#"+self.opts.images),1.0);
            self.setScrollTimeout();
        }
    });
    // scroll the preview
    $("#"+opts.target).scroll(function(e) {
        if ( self.scroller==undefined||self.scroller=="target" )
        {
            self.scroller = "target";
            var lineHeight = $("#"+self.opts.source).prop("scrollHeight")
                /self.formatter.num_lines;
            var loc = self.getPixelPage($(this),self.html_lines);
            // scroll "textarea"
            self.scrollTo(loc,self.text_lines,
                $("#"+self.opts.source),lineHeight);
            //scroll "images"
            self.scrollTo(loc,self.image_lines,$("#"+self.opts.images),1.0);
            self.setScrollTimeout();
        }
    });
    // scroll the images
    $("#"+opts.images).scroll(function(e) {
        if ( self.scroller==undefined||self.scroller=="images" )
        {
            self.scroller = "images";
            var lineHeight = $("#"+self.opts.source).prop("scrollHeight")
                /self.formatter.num_lines;
            var loc = self.getPixelPage($(this),self.image_lines);
            // scroll "textarea"
            self.scrollTo(loc,self.text_lines,
                $("#"+self.opts.source),lineHeight);
            // scroll HTML "target"
            self.scrollTo(loc,self.html_lines,$("#"+self.opts.target),1.0);
            self.setScrollTimeout();
        }
    });

The view clicked on will always scroll by itself, and feedback is prevented by blocking the secondary scrolls (the calls to the specialised self.scrollTo method in the code above) when the scroll did not originate there. At the completion of scrolling the global (actually self.scroller, a variable in the containing object) is set back to undefined after a 200 millisecond delay. The reason for the delay is that scroll events are delivered asynchronously: we cannot assume that when the current scroll handler has finished the other scrolls have finished as well. So self.setScrollTimeout sets a timeout function that delays the reset until all of the current scrolling is complete. Any more than 200 milliseconds and the user may have tried to click on another panel and found it blocked.

The timeout id is reset when the timeout has completed. It is also used to prevent timeouts accumulating as the user scrolls continuously.
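The flag-plus-timeout idea can be isolated from the scrolling code. What follows is only a minimal sketch, not the code from the ecdosis site: the names makeScrollGuard, claim, release and current are made up for illustration, and restarting the timer on every event (rather than skipping it while one is pending) is my own assumption about how to stop timeouts accumulating.

```javascript
// Hypothetical helper: none of these names come from the original code.
function makeScrollGuard(delayMs) {
    var scroller;          // name of the column that currently owns the scroll
    var timeoutId = null;  // pending reset timer, if any
    return {
        // Returns true if `name` may drive the scroll, and (re)starts the reset timer.
        claim: function(name) {
            if (scroller !== undefined && scroller !== name)
                return false;              // secondary event from another column: ignore
            scroller = name;
            if (timeoutId !== null)
                clearTimeout(timeoutId);   // replace the old timer so they don't accumulate
            timeoutId = setTimeout(function() {
                scroller = undefined;      // scrolling has settled: unblock every column
                timeoutId = null;
            }, delayMs);
            return true;
        },
        // Cancel any pending timer and unblock immediately.
        release: function() {
            if (timeoutId !== null)
                clearTimeout(timeoutId);
            timeoutId = null;
            scroller = undefined;
        },
        // Name of the column that currently owns the scroll, or undefined.
        current: function() { return scroller; }
    };
}
```

With a guard like this, the test on self.scroller and the call to self.setScrollTimeout() in each handler would collapse into a single check such as if ( guard.claim("textarea") ) { ... }.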

Monday, May 25, 2015

The dreaded ssh "Roaming not allowed by server" problem

Passwordless login via ssh is a great technique. It allows you to turn off password authentication altogether, and so shut out those robotic hackers who try to guess your user name and then try every password until they break in. With passwordless login they have to forge a long cryptographic RSA key which, given the number of possibilities and the latency on the line, is practically impossible. So when I went about restoring a server that had been hacked I put back my old ssh key, and tried to log in. No joy. ssh -v mysite.com produced a mysterious ssh message: "Roaming not allowed by server". What does this mean, and how do you fix it? Googling the answer didn't help. No one seemed to know the answer. They were all fixated on file permissions, which may be an issue, but do they cause this error? Without reading through the OpenSSH code, here's what I found: delete .ssh/known_hosts on the client connecting to the server and all will be well. (If that file lists other hosts you want to keep, ssh-keygen -R mysite.com removes just the entries for that one host.) As far as I can tell, if the server's domain-name is dynamic, or has been altered (as in my case), then that counts as "roaming". When your IP address changes ssh will complain that a key in known_hosts has offended it. But when the server's address changes it will give you this "Roaming not allowed by server" message.

Friday, May 22, 2015

Next TEXT node in a HTML DOM

The Rangy tool is great for making cross-browser HTML selections. It has this useful function "surroundContents", which pastes in a span around the text of the selection. But it refuses to work if the selection crosses an element boundary. It's so stupid, because there is a perfectly sound way to wrap all the text nodes in one selection, which is what I wanted. I was trying to implement a commenting tool, which adds a comment to an arbitrary selection in HTML. So the range needs to be coloured and have something to activate it when the user clicks on it. To do that I can't accept Rangy's restriction on surroundContents. So I wondered if I could write a simple function that would wrap ANY text nodes in the current selection with <span class="someclass">...</span>. The key step is finding the next text node after a given one, and that turned out to be pretty easy, although I couldn't find anything on the Web by searching. Here's my test code. If you check the HTML you'll see that it works when going up, down or across the DOM tree. According to the specs this works on any browser. This is an HTML DOM thing, so there is not much point converting it to jQuery. It should ideally be part of Rangy or jQuery, or converted to a jQuery plugin.

<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8"/>
<script type="text/javascript">
/**
 * Get the first text-node of any DOM element
 * @param elem the element to search for text
 * @return a text node or null
 */
function firstTextNode( elem ) {
    if ( elem.nodeType==3 && elem.nodeValue.length!=0 )
        return elem;
    else if ( elem.firstChild != null 
        && firstTextNode(elem.firstChild) != null )
        return firstTextNode(elem.firstChild);
    else if ( elem.nextSibling != null )
        return firstTextNode(elem.nextSibling);
    else
        return null;
}
/**
 * Find the next text node
 * @param elem the current text element
 * @return another text element or null
 */
function nextTextNode(elem) {
    if ( elem.nextSibling != null 
        && firstTextNode(elem.nextSibling) != null )
        return firstTextNode(elem.nextSibling);
    else
    {
        var parent = elem.parentNode;
        while ( parent != null )
        {
            var sibling = parent.nextSibling;
            while ( sibling != null )
            {
                if ( firstTextNode(sibling) != null )
                    return firstTextNode(sibling);
                else
                    sibling = sibling.nextSibling;
            }
            // no more text nodes on this level
            parent = parent.parentNode;
        }
    }
    return null;
}
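/**
 * Test routine: alert the text node that follows the first one in an element
 * @param id the id of the element to start from
 */
function findTextNode(id)
{
    var elem = document.getElementById(id);
    var textNode = firstTextNode(elem);
    // guard against starting points that have no text at all
    var next = (textNode != null) ? nextTextNode(textNode) : null;
    if ( next != null )
        alert( next.nodeValue );
}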
</script>
</head>
<body>
<div>
<p id="para">Hello world. <i id="italics">This is italics </i><span id="not">and this is not. </span><table border="1"><tr><td id="firstTD">really inset</td><td id="nextTD">and another</td></tr></table>Oh la la!</p>
<button onclick="findTextNode('para')">para</button>
<button onclick="findTextNode('italics')">italics</button>
<button onclick="findTextNode('not')">this is not</button>
<button onclick="findTextNode('firstTD')">first TD</button>
<button onclick="findTextNode('nextTD')">next TD</button>
</div>
</body>
</html>
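The wrapping step itself is not shown in the test page, so here is a rough sketch of how it might go. It is an illustration only: textNodesBetween and wrapTextNode are hypothetical names, the traversal is a plain document-order walk rather than the functions above, and it assumes the selection starts and ends on whole text nodes.

```javascript
// Sketch only: textNodesBetween and wrapTextNode are hypothetical names.

// Successor of a node in document order: first child, else next sibling,
// else the next sibling of the nearest ancestor that has one.
function nextNode(node) {
    if (node.firstChild) return node.firstChild;
    while (node) {
        if (node.nextSibling) return node.nextSibling;
        node = node.parentNode;
    }
    return null;
}

// Collect every non-empty text node from start to end inclusive.
// Both arguments are text nodes, and start precedes end in document order.
function textNodesBetween(start, end) {
    var found = [];
    for (var n = start; n != null; n = nextNode(n)) {
        if (n.nodeType == 3 && n.nodeValue.length != 0)
            found.push(n);
        if (n === end) break;
    }
    return found;
}

// Wrap one text node in <span class="className">. Needs a browser DOM.
function wrapTextNode(textNode, className) {
    var span = document.createElement("span");
    span.className = className;
    textNode.parentNode.insertBefore(span, textNode);
    span.appendChild(textNode);   // moves the text node inside the new span
    return span;
}
```

Collecting the nodes first and wrapping them afterwards avoids walking a tree that is changing underfoot. A selection that begins or ends partway through a text node would need that node split first, for example with the DOM's Text.splitText.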

Sunday, March 22, 2015

LSB init scripts with a java daemon in Ubuntu

I have several java-based services that someone wanted to run as daemons. I would have preferred running them in Tomcat, but they didn't want to administer that. So I thought I would create an init script in /etc/init.d, and use that to stop/start the service and to check its status via service --status-all. If you follow the documentation on this the Debian people recommend using their init functions in /lib/lsb/init-functions. The ones I don't like are start-stop-daemon and the use of pid files in /var/run. Firstly, for a java service run as java -jar MyProg.jar ...options... & this doesn't fit into the mould of an executable daemon with arguments, since the program name is just "java" and the actual "daemon" is a long path name with even longer classpath variables etc. Also I don't see the point of the pid file. Its only purpose is to test if the daemon is running. But you can do that with:

PID=`ps aux | grep $DAEMON | grep -v grep | awk '{print $2}'`

That gives you the process ID so long as you have a long enough name for the daemon. So if $PID is empty the daemon is not running, and according to the standard the status action is supposed to exit with 3 in that case, or with 0 if the daemon is running. However service --status-all ignores this return code. Oh yes. After hours of hacking I found a script that did work because it called log_success_msg in each of the two cases (see below). And with that, it works, presumably because of the way service --status-all reads the script's status. Without those lines service --status-all reports that the service "isn't running" no matter what.

In the following sample code just replace my daemon, and all the paths etc with yours. The meat of the script is generic, although it doesn't support reload.

#!/bin/sh
#
# /etc/init.d/bhlpages -- startup script for the BHL pages service
#
#

### BEGIN INIT INFO
# Provides:          bhlpages
# Required-Start:    mongodb $remote_fs $syslog
# Required-Stop:     mongodb $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start BHL Pages.
# Description:       Start the BHL page description service for TILT
### END INIT INFO

# Exit if the package is not installed

[ -e "/usr/local/bin/bhlpages/BHLPages.jar" ] || exit 5
PATH=/sbin:/usr/sbin:/bin:/usr/bin
DESC="BHL page description service for TILT"
NAME=bhlpages
DAEMON=/usr/local/bin/bhlpages/BHLPages.jar
# LibPath is a simple java program that prints the java runtime libpath
# which is only needed so we can add our install location to it for JNI
LIBPATH=`java -cp /usr/local/bin/bhlpages/ LibPath`:/usr/local/lib
# Load the standard init functions
. /lib/lsb/init-functions

# make a classpath out of a directory of jars
getjarpaths()
{
  JARPATH=""
  for f in $1/*.jar
  do
    JARPATH="$JARPATH:$f"
  done
  echo $JARPATH
  return
}
do_start() {
  JARPATHS=`getjarpaths /usr/local/bin/bhlpages/lib`
  pkill -f $DAEMON
  java -Djava.library.path=$LIBPATH -cp .$JARPATHS:/usr/local/bin/bhlpages/BHLPages.jar bhl.pages.JettyServer -w 8084 >> /var/log/bhlpages.log 2>&1 &
}
do_stop() {
  pkill -f $DAEMON
}
### main logic ###
case "$1" in
  start)
        log_daemon_msg "Starting $NAME"
        do_start
        log_end_msg 0
        ;;

  stop)
        PID=`ps aux | grep $DAEMON | grep -v grep | awk '{print $2}'`
        if [ -n "$PID" ]; then
          log_daemon_msg "Stopping $NAME"
          do_stop
        else
          log_daemon_msg "$NAME not running"
        fi
        log_end_msg 0
        ;;

  restart)
        # Restart the daemon.
        $0 stop && sleep 2 && $0 start
        ;;

  status)
        PID=`ps aux | grep $DAEMON | grep -v grep | awk '{print $2}'`
        if [ -z "$PID" ]; then
          log_success_msg "$NAME is not running"
          exit 3
        else
          log_success_msg "$NAME is running"
          exit 0
        fi
        ;;

  *)
        log_action_msg "Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
esac
exit 0

Wednesday, January 21, 2015

Reading an AJAX response gradually

I wanted to display a progress bar on a Webpage using Javascript/jQuery while a time-consuming process on the server was taking place. It wasn't uploading or downloading significant amounts of data, but it did take time. For this reason I couldn't use the progress events in Ajax, or their jQuery implementation. People said it couldn't be done, that it exposed "the limitations of the HTTP protocol itself". In fact it has nothing to do with HTTP, but with TCP. When I make an Ajax call the client first establishes a TCP connection using the SYN, SYN+ACK, ACK exchange. Then the server sends data back to the client until it is finished and then sends a FIN packet, which the client acknowledges, to signify "end of flow". So if we provide a callback that gets called on "success" it will wait for the end of flow and not report any data meanwhile. But that doesn't mean that data is not available. At the socket level in Java I can call something like "myStream.available()" to see if there is data to be read, and in Ajax we can test the ready state to see if it is 3 (data is arriving) rather than 4 (finished). If the server is writing data out gradually, in my case the percentage of process completion, and flushing at the ends of lines, then data will be available to the onreadystatechange handler. Here's an example. I provided a button to make the Ajax call, whose id is "rebuild". My service is at "/search/build":

jQuery("#rebuild").click( function() {
    var readSoFar = 0;
    var client = new XMLHttpRequest();
    client.open("GET", "http://"+window.location.hostname+"/search/build");
    // Track the state changes of the request
    client.onreadystatechange = function(){
        // Ready state 3 means that data is ready;
        // state 4 (done) may still deliver a final chunk
        if(client.readyState >= 3 && client.status == 200){
            // print only the text that has arrived since the last call
            console.log(client.responseText.substring(readSoFar));
            readSoFar = client.responseText.length;
        }
    };
    // send only after the handler has been registered
    client.send();
});

This prints out the text received from the server at the same rate that it was sent. It doesn't appear to be possible to do this in jQuery, because there is no "onreadystatechange" field in jQuery's jqXHR object, so I have used raw Javascript instead. This text can then be used to implement a progress bar.
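Turning that text into a progress value is then a matter of picking out the last complete line. A minimal sketch, assuming the server writes one integer percentage per line (for example "10\n20\n30\n"); the function name latestPercent and that line format are my assumptions, not part of the service above.

```javascript
// Hypothetical helper: assumes one integer percentage per line from the server.
// Returns the newest complete percentage (or null) and the new read offset.
function latestPercent(responseText, readSoFar) {
    var chunk = responseText.substring(readSoFar);
    var lastNewline = chunk.lastIndexOf("\n");
    if (lastNewline < 0)
        return { percent: null, readSoFar: readSoFar };  // no complete line yet
    // Only look at complete lines; a partial trailing line is left for next time
    var lines = chunk.substring(0, lastNewline).split("\n");
    var percent = null;
    for (var i = lines.length - 1; i >= 0; i--) {
        var value = parseInt(lines[i], 10);
        if (!isNaN(value)) {
            percent = Math.max(0, Math.min(100, value));  // clamp to 0..100
            break;
        }
    }
    return { percent: percent, readSoFar: readSoFar + lastNewline + 1 };
}
```

Inside the state-change handler you would call latestPercent(client.responseText, readSoFar), store the returned readSoFar, and, when percent is not null, assign it to whatever displays progress, such as the value of a progress element.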