
Simple Bash FTP Pull and Delete Script

A simple set of bash scripts to pull files down from a remote FTP server into a local directory and delete them from the server afterwards

Recently we needed a temporary (weeks) solution to sync a local directory of XML files with a remote FTP server while we moved some functionality around, coordinated with external agencies, etc. Rather than pay for some ridiculous GUI-based FTP program, I whipped up this set of bash FTP scripts. It works great!

This bit here, named ./loop, infinitely loops calling the ./sync script shown further below. Keeping ./sync separate is handy for debugging. The loop sleeps 60 seconds between iterations and prints the date as a primitive timestamp in the logs. If your goal is something more permanent, don't use it; instead use cron or some other task-scheduling daemon to invoke ./sync (a sample crontab line follows the loop script).


#!/bin/bash

# Loop forever: run ./sync, then sleep 60 seconds.
while true
do
    echo "Start loop..."
    date
    ./sync
    echo "Sleep..."
    sleep 60
done
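
If you go the cron route instead, a single crontab entry does the same job without ./loop. This is only a sketch: it assumes the scripts live in /Users/stu/ftp-sync (as configured below) and the log file name is made up.

# run ./sync once a minute (paths and log file name are examples)
*/1 * * * *   cd /Users/stu/ftp-sync && ./sync >> /Users/stu/ftp-sync/sync-cron.log 2>&1
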
Here we have ./sync, which is the core of the synchronization process:

#!/bin/bash

#uncomment the below to debug
#set -x

#use case specific stuff
HOST=ftp.my.target.server.com
FTPUSER=myuser
FTPPASS=secret
HOMEDIR=/Users/stu/ftp-sync
TARGETDIR=/Users/stu/ftp-sync
SOURCEDIR=remoteDir

#downloaded files land in the target directory
cd $TARGETDIR
pwd

#create ftp commands to save the remote file list to $HOMEDIR/files
echo "user $FTPUSER $FTPPASS" > $HOMEDIR/getFiles
echo "ascii"                 >> $HOMEDIR/getFiles
echo "cd $SOURCEDIR"         >> $HOMEDIR/getFiles
echo "dir . $HOMEDIR/files"  >> $HOMEDIR/getFiles
echo "bye"                   >> $HOMEDIR/getFiles

#get the list of files to pull down
ftp -inv $HOST < $HOMEDIR/getFiles

#write volatile batch script to get the files and remove them from the remote server
echo "user $FTPUSER $FTPPASS" > $HOMEDIR/batch
echo "cd $SOURCEDIR"         >> $HOMEDIR/batch
#the file name is the last field of each listing line
awk '{print $NF}' $HOMEDIR/files | while read file; do
    echo "get $file"    >> $HOMEDIR/batch
    echo "delete $file" >> $HOMEDIR/batch
done
echo "bye" >> $HOMEDIR/batch

#run the volatile batch script
ftp -inv $HOST < $HOMEDIR/batch

#clean up temp files
rm $HOMEDIR/files
rm $HOMEDIR/batch
rm $HOMEDIR/getFiles
That's it, and it works like a charm. A few notes if you use these scripts:

  • Don't forget to make them executable.
  • If you are working with binary files, add a binary command when the batch file is created.
  • There is no error handling: if the download of a file fails, the script will still try to delete it afterwards.
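
For completeness, making the scripts executable and kicking off the loop in the background looks something like this (the log file name here is just an example):

chmod +x loop sync
nohup ./loop >> ftp-sync.log 2>&1 &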

6-Port LACP Cacti Template

Or, 'How to monitor LACP device participation levels'

With our HP 2910al, we have a few streaming machines wired with multiport LACP ethernet bonds. Early on with our LACP setup we realized that we needed to monitor the traffic distribution of the individual members in the trunk. Why?

  • Configuration of LACP, especially with older Linux distributions, can be difficult, and misconfigurations don't generate errors.
  • Some hardware, both NICs and switches, does not support the more advanced LACP modes.
  • To ensure no single member of the trunk was being maxed out. An aggregate graph showing spare capacity could hide exactly that kind of performance problem from us. (A quick command-line check follows this list.)
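
For the Linux end of a trunk, the bonding driver exposes enough to do a quick sanity check from the shell, independent of Cacti. A rough sketch, assuming an 802.3ad bond named bond0 with members eth0 and eth1 (your interface names will differ):

# Per-member LACP state as reported by the bonding driver
grep -E 'Slave Interface|MII Status|Aggregator ID' /proc/net/bonding/bond0

# Crude per-member byte counters to eyeball the traffic split
grep -E 'eth0|eth1' /proc/net/dev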

To that end we created a template in Cacti. ("We" actually means my colleague Roland did the real work, figuring out how to create custom Cacti graphing templates!) It very clearly shows us the distribution of traffic across all members of the trunk.

Kinda groovy, eh?

Here are the graph template files for Cacti. They were saved from v0.8.8a:

Bon ap'!

TCP client hangs intermittently in SYN_SENT for 10 seconds

Disabling tcp_tw_reuse and tcp_tw_recycle solved the problem

We have some high volume media servers that have benefited from some selective TCP parameter tuning. These Debian Linux machines do stateful firewalling on the box (netfilter connection tracking, i.e. conntrack), handle a moderate number of actively communicating connections (hundreds to thousands of clients) and push high bandwidth (hundreds of megabits per second to multiple gigabits per second).

Some years ago we ran into a problem with the number of TCP connections consuming too much memory, both for the TCP stack and for the conntrack table. Some of the suggested solutions we implemented included...

  • Enabling tcp_tw_recycle: "It enables fast recycling of TIME_WAIT sockets. The default value is 0 (disabled). The sysctl documentation incorrectly states the default as enabled. It can be changed to 1 (enabled) in many cases. Known to cause some issues with hoststated (load balancing and fail over) if enabled, should be used with caution." *
  • Enabling tcp_tw_reuse: "This allows reusing sockets in TIME_WAIT state for new connections when it is safe from protocol viewpoint. Default value is 0 (disabled). It is generally a safer alternative to tcp_tw_recycle" *
(* from SpeedGuide.net's Linux Tweaking page.)

While we were loath to tune the TCP kernel parameters (modern kernels tune themselves very well, and non-TCP-experts tend to screw it up), there was a notable improvement in the systems. Later on, when we had problems with the conntrack table filling up, we tweaked two more parameters (a sysctl sketch covering all four settings follows this list):

  • Raising ip_conntrack_max from 64k to 128k
  • Setting ip_conntrack_tcp_timeout_established to 5 minutes, from the default of 5 days
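
Pulled together, those four knobs amount to something like the sysctl calls below. This is only a sketch: the key names are for a Debian kernel of that era using the old ip_conntrack module; on newer kernels the connection-tracking settings live under net.netfilter.nf_conntrack_* instead.

# The two TIME_WAIT tweaks (the two we later turned back off, see below)
sysctl -w net.ipv4.tcp_tw_recycle=1
sysctl -w net.ipv4.tcp_tw_reuse=1

# Bigger connection-tracking table and a 5-minute established-connection timeout
sysctl -w net.ipv4.netfilter.ip_conntrack_max=131072
sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_timeout_established=300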

Things have been working very well for some time. Until this week, that is. On Monday I started getting reports of sluggishness in our application. As the app is under heavy development, our first round of debugging focused there. Once I realized the app was fine, I had a look at the TCP connections with a simple script checking the connection states on both the client and the server.

while true
do
  date
  netstat -n | grep 195.49.105.158
  echo
  sleep 1 
done

Below is the output from the client. Immediately the problem jumped out: the client TCP connection was hanging in the SYN_SENT state for ~10s!

Thu May  3 15:58:14 CEST 2012

Thu May  3 15:58:15 CEST 2012
tcp4       0      0  192.168.1.69.63583     195.49.105.158.443     SYN_SENT
tcp4       0      0  192.168.1.69.63582     195.49.105.158.443     SYN_SENT

Thu May  3 15:58:16 CEST 2012
tcp4       0      0  192.168.1.69.63583     195.49.105.158.443     SYN_SENT
tcp4       0      0  192.168.1.69.63582     195.49.105.158.443     SYN_SENT

  -- snip -- 
Thu May  3 15:58:25 CEST 2012
tcp4       0      0  192.168.1.69.63583     195.49.105.158.443     SYN_SENT
tcp4       0      0  192.168.1.69.63582     195.49.105.158.443     SYN_SENT

Thu May  3 15:58:26 CEST 2012
tcp4       0      0  192.168.1.69.63583     195.49.105.158.443     ESTABLISHED
tcp4       0      0  192.168.1.69.63582     195.49.105.158.443     ESTABLISHED

  -- snip -- 

Thu May  3 15:58:36 CEST 2012
tcp4       0      0  192.168.1.69.63583     195.49.105.158.443     ESTABLISHED
tcp4       0      0  192.168.1.69.63582     195.49.105.158.443     ESTABLISHED

Note that this problem was occurring not only with HTTPS (as in the above output) but also with HTTP and SSH/SCP. The problem was definitely in the TCP layer.

Some research around the Interwebs led me nowhere. The best suggestion was to reboot and hope. That's not the kind of engineered remedy I was in search of. Instead we walked back the first two TCP stack tuning parameters and turned off tcp_tw_recycle and tcp_tw_reuse.

That did the trick! I'm fairly certain that things will be OK now, and that the later tuning parameters will protect us from the problems the tcp_tw_recycle and tcp_tw_reuse tweaks were intended to help with. If problems do crop up again with the volume of unused sockets, we'll try tweaking only tcp_tw_reuse, as that is deemed safer.
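
For reference, the revert itself was just turning those two sysctls back off; if they are also set in /etc/sysctl.conf, they need to be removed there too or they will come back at the next boot:

sysctl -w net.ipv4.tcp_tw_recycle=0
sysctl -w net.ipv4.tcp_tw_reuse=0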

How to write a custom Tomcat logging valve

Or, the joy and pain of subclassing org.apache.catalina.valves.AccessLogValve

In an attempt to aid a regular log analysis task of mine, I decided to write my own custom Tomcat valve. The idea was to pre-filter what I was after and log that to a file.

The result of my custom valve was an access log, rolled over monthly, that contained:

  • Only PUT requests
  • A segment of the URI
  • The User-Agent of the client

Unfortunately, the task of coding a custom valve is not as simple as implementing a Filter or extending ServletRequestWrapper. Fortunately, the scope of what I was trying to accomplish was limited. Below is my code for extending AccessLogValve and a couple of tips for anyone venturing into the world of custom logging valves.

Notes on extending AccessLogValve:

  • AccessLogValve.invoke(request, response) always needs to be called. If you override it and do not call super.invoke() then the request stops there.
  • AccessLogValve is not scoped to the request and is not thread safe. Do not attempt to keep state unless you have a specific reason to have one request impact another request's valve behavior.
  • Make the log/no-log decision in log(Request request, Response response, long time).
  • To trim the URI, and to filter the method and protocol out of the log, I extended org.apache.catalina.connector.Request. This was a little trickier than expected because, like AccessLogValve, there is no corresponding wrapper class. In the extension of Request below, the code overrides only the methods that are called by the logging pattern. If the logging pattern contains anything that might call a property that is not set, an NPE may very well be thrown.

The code: LogAbreviatedPutValve.java


package ch.geekomatic.catalina;

import java.util.Enumeration;

import org.apache.catalina.connector.Request;
import org.apache.catalina.connector.Response;
import org.apache.catalina.valves.AccessLogValve;

/**
 * This log valve filters out non-PUT requests and trims the logged URI. This is only useful for providing an easy to parse access log 
 * to generate statistics on the clients that upload via PUT. The
 * correct way to use this valve is to set the logger pattern to:
 * 
 * %r %{User-Agent}i
 * 
 * Other patterns either will not work or may generate an exception.
 * 
 * @author Stu Thompson
 * 
 */
public class LogAbreviatedPutValve extends AccessLogValve {

    @Override
    public void log(Request request, Response response, long time) {
        // Only PUT requests make it into this log; everything else is silently skipped.
        if (request.getMethod() != null && request.getMethod().equals("PUT")) {
            super.log(new AbreviatedRequest(request), response, time);
        }
    }

}

/**
 * This class provides a trimmed down request URI which is only useful for providing an easy to parse access log
 *  to generate statistics on the clients that upload via PUT
 * 
 * @author Stu Thompson
 * @since v2.12.5
 */
class AbreviatedRequest extends Request {
    private final Request _request;

    AbreviatedRequest(Request request) {
        _request = request;
    }

    /**
     * Returns a trimmed RequestURI that does not contain any characters from the position of the last '/' onward, and hence drops the file name.
     * 
     * @see org.apache.catalina.connector.Request#getRequestURI()
     */
    @Override
    public String getRequestURI() {
        String uri = _request.getRequestURI();
        if (uri != null && uri.contains("/")) {
            // remove the file name portion
            int pos = uri.lastIndexOf("/");
            return uri.substring(0, pos);
        }
        return _request.getRequestURI();
    }

    /**
     * The logged URI includes the method. Since we are already filtering only for PUT, return an empty string.
     * 
     * @see org.apache.catalina.connector.Request#getMethod()
     * @return an empty string
     */
    @Override
    public String getMethod() {
        return "";
    }

    /**
     * The logged URI includes the HTTP protocol. We have no interest in this for our special log. Return an empty string.
     * 
     * @see org.apache.catalina.connector.Request#getProtocol()
     * @return an empty string
     */
    @Override
    public String getProtocol() {
        return "";
    }

    /**
     * The logged URI includes the query string. We have no interest in this for our special log. Return null no matter what.
     * 
     * @see org.apache.catalina.connector.Request#getQueryString()
     * @return null
     */
    @Override
    public String getQueryString() {
        return null;
    }

    @Override
    public String getHeader(String name) {
        return _request.getHeader(name);
    }

    @Override
    public Enumeration getHeaderNames() {
        return _request.getHeaderNames();
    }

    @Override
    public Enumeration getHeaders(String name) {
        return _request.getHeaders(name);
    }
}

The code: server.xml snippet

<Valve className="ch.geekomatic.catalina.LogAbreviatedPutValve"
   directory="logs/access"
   prefix="put_log." suffix=".txt" pattern="%r %{User-Agent}i"
   fileDateFormat="yyyy-MM" resolveHosts="false"/>

Duplicate Session Errors on BlazeDS

Solution #2: Get off of Tomcat 7.0.0

A while back we discovered a reason behind "duplicate session errors" in BlazeDS, and I posted what fixed it for us. But the pesky little problem remained on a single server. Until recently, the cause eluded us. What, pray tell, was the culprit?

The problem server had only one significant difference from the happily running servers: Apache Tomcat 7.0.0, from two years ago. Once we rolled the Tomcat version back to the most recent Tomcat 6 version, 6.0.35, everything worked perfectly.

That's not the first bizarre problem I've had with Tomcat 7 that was solved by staying with Tomcat 6. It might be a while longer before trying Tomcat 7 again.