Thursday, July 21, 2011

SmartClient and Jersey/JAX-RS

Here's a good tutorial on using JAX-RS to return SmartClient-compatible data sources.

Thursday, July 14, 2011

Fun with DC's gis data, part 1

It looks like DC has kindly released quite a bit of GIS data for public consumption. One of the more interesting sets is the regularly updated Owner Polygon dataset available from data.dc.gov. This is a shapefile containing current property records for everything in the District. Unfortunately, it's not available as KML for easy display in Google's tools; however, the 70MB ESRI shapefile is. Using OpenLayers, PostGIS, and GeoServer, we could start displaying everything, but what if we want to use Google Maps and do things the hard way? There are a few simple steps to allow polygon querying, selection, and display on Google Maps:
  1. Import data into PostGIS
  2. Create GIS servlet
  3. Draw the data on google maps
  4. Query PostGIS for google's lat/long
  5. Select Properties from the map
We're going to work on step one today: importing the data into PostGIS.
Prepare PostGIS
I'm running Ubuntu 11.04 and PostgreSQL 8.4 with PostGIS 1.5.1 installed from the default software repo. Useful references:
  1. PostGIS 1.5 manual
  2. NAD 83, Maryland State Plane projection
psql (8.4.8)
Type "help" for help.

postgres=# create database propertymap;
CREATE DATABASE
postgres=# \q
~$ createlang plpgsql propertymap
~$ cd /usr/share/postgresql/8.4/contrib/postgis-1.5/
postgis-1.5$ psql -f postgis.sql propertymap
postgis-1.5$ psql -f ../postgis_comments.sql propertymap
postgis-1.5$ psql -f spatial_ref_sys.sql propertymap
Convert Shapefile
Create a ton of insert statements using shp2pgsql:
poly$ shp2pgsql -s 926985 OwnerPly.shp ownertable > inserts.sql
Shapefile type: Polygon
Postgis type: MULTIPOLYGON[2]
If we look at the included .prj file, we see that the projection for the data is NAD_1983_StatePlane_Maryland_FIPS_1900. We need to add this projection, from spatialreference.org, into our database:
propertymap=# INSERT into spatial_ref_sys (srid, auth_name, .......66666666],UNIT["Meter",1.0]]');
INSERT 0 1
propertymap=# \i inserts.sql
Run your first query
propertymap=# select ownername,square,lot,premiseadd from ownertable where premiseadd like '%1600 PENNSYLVANIA%';
        ownername         | square | lot  |       premiseadd        
--------------------------+--------+------+-------------------------
 UNITED STATES OF AMERICA | 0187   | 0800 | 1600 PENNSYLVANIA AV NW
 UNITED STATES OF AMERICA | 0187   | 0802 | 1600 PENNSYLVANIA AV NW
 UNITED STATES OF AMERICA | 0187   | 0801 | 1600 PENNSYLVANIA AV NW
First, a little background on what we asked for. DC property records are keyed by square, suffix, and lot. Square generally refers to a city block and goes all the way back to the original city planning in the old part of the city. Lot is a lot within a square/suffix. For the most part, you can ignore suffix, as it's rarely used. Next time, we'll create a simple servlet to expose all of this.
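As a head start on that servlet, here's a small sketch of a query helper. It only builds the parameterized SQL for the lookup shown above; the function name is mine, and actually executing it through a driver like psycopg2 or JDBC is left out:

```python
def property_query(address_fragment):
    """Return (sql, params) for a property lookup by address substring.

    Uses a parameterized LIKE so the fragment is never interpolated
    directly into the SQL string.
    """
    sql = ("SELECT ownername, square, lot, premiseadd "
           "FROM ownertable WHERE premiseadd LIKE %s")
    return sql, ('%' + address_fragment + '%',)
```

A driver would run it as `cursor.execute(*property_query('1600 PENNSYLVANIA'))`.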

Monday, July 11, 2011

Isolating Big Blue Button Video

This is a quick how-to on manually connecting to a BBB video stream. Before we begin, here's a very, very quick background.
  • Video streams are grouped under a conference-room-specific URL that has the format rtmp://host/video/roomID
  • Each streaming component under BBB is available as a separate stream (i.e., video, desktop, sip/audio, etc.)
  • BBB uses red5 under the hood to manage these streams
  • Grab flowplayer here and the flowplayer rtmp client here
  1. Connect to your room and start your webcam.
  2. Tail /usr/share/red5/log/bigbluebutton.log and you should see the following log lines:
    2011-07-11 18:14:54,871 [NioProcessor-1] DEBUG o.b.c.s.p.ParticipantsEventRecorder - A participant's status has changed 141 streamName 640x480141
    2011-07-11 18:14:54,919 [NioProcessor-1] DEBUG o.b.c.s.p.ParticipantsService - Setting participant status ec0449a0-b5d1-4ca5-bfdf-d118d8bc2299 141 hasStream true
    • ec0449a0-b5d1-4ca5-bfdf-d118d8bc2299 or similar is the room id
    • 640x480141 is the stream id you need
  3. Download and place flowplayer-...swf, flowplayer.rtmp-...swf, and flowplayer-...min.js into a directory.
  4. Create a web page as follows:
     <html>
     <head>
       <meta http-equiv="content-type" content="text/html; charset=UTF-8">
       <script type="text/javascript" src="flowplayer-3.2.6.min.js"></script>
       <title>Minimal Flowplayer setup</title>
     </head>
     <body>
       <div id="player"></div>

       <script language="javascript">
       $f("player", "flowplayer-3.2.7.swf",
       {
           clip: {
               url: '640x480141',
               live: true,
               autoBuffering: false,
               bufferLength: 1,
               provider: 'rtmp'
           },

           plugins: {
               // here is our rtmp plugin configuration
               rtmp: {
                   url: 'flowplayer.rtmp-3.2.3.swf',
                   netConnectionUrl: 'rtmp://your_server/video/ec0449a0-b5d1-4ca5-bfdf-d118d8bc2299'
               }
           }
       });
       </script>
     </body>
     </html>
  5. Load up the web page and you should see the streaming video.

Friday, May 6, 2011

log4j and Pivot

Here's a simple way to consume log4j messages in Pivot for use in a log console or similar. First, create a custom appender which sends each log message to the Pivot message bus.
public class MessageBusAppender extends AppenderSkeleton {
    
    @Override
    protected void append(LoggingEvent event) {
        MessageBus.sendMessage(new LogMessage(layout, event));
    }

    @Override
    public boolean requiresLayout() {
        return true;
    }

    @Override
    public void close() {
       //nop
    }

}

public class LogMessage {

    private Layout logLayout;
    private LoggingEvent event;

    public LoggingEvent getEvent() {
        return event;
    }

    public Layout getLogLayout() {
        return logLayout;
    }

    public LogMessage(Layout logLayout, LoggingEvent event) {
        this.logLayout = logLayout;
        this.event = event;
    }
    
}
Any component that needs to display log messages can just listen for them and update as appropriate. Here's an example updating a TextPane:
public class LogPane extends Border {

    @BXML
    private TextPane logTxt;
    @BXML
    private PushButton clearBtn;
...
...
        logTxt.setDocument(new Document());

        MessageBus.subscribe(LogMessage.class, new MessageBusListener<LogMessage>() {

            public void messageSent(final LogMessage message) {
                ApplicationContext.queueCallback(new Runnable() {

                    @Override
                    public void run() {
                        String text = message.getLogLayout().format(message.getEvent());
                        logTxt.getDocument().add(new Paragraph(text));
                        if (message.getEvent().getThrowableInformation() != null)
                        {
                            StringBuilder sb = new StringBuilder();
                            for (String s : message.getEvent().getThrowableInformation().getThrowableStrRep())
                            {
                                sb.append("  ");
                                sb.append(s);
                                sb.append("\n");
                            }
                            logTxt.getDocument().add(new Paragraph(sb.toString()));
                        }
                    }
                });
            }
        });

        clearBtn.getButtonPressListeners().add(new ButtonPressListener() {

            public void buttonPressed(Button button) {
                logTxt.setDocument(new Document());

            }
        });
 ...
 ...
}
Now tie it together in your log4j config:
log4j.rootLogger=ERROR, Pivot

log4j.appender.Pivot=sample.MessageBusAppender
log4j.appender.Pivot.layout=org.apache.log4j.PatternLayout
log4j.appender.Pivot.layout.ConversionPattern=%-6r [%15.15t] %-5p %30.30c %x - %m%n

Monday, March 14, 2011

Counting bits

Here's a simple input stream that will tell you the current position. Place it after (i.e., wrapping) any buffered input streams if you want an accurate position after a read.
public class CountingInputStream extends PushbackInputStream {

    private long bytesRead;

    public CountingInputStream(InputStream is) {
        super(is);
    }
    public CountingInputStream(InputStream is,int len) {
        super(is,len);
    }

    public long getPosition() {
        return bytesRead;
    }

    @Override
    public void unread(byte[] b, int off, int len) throws IOException {
        bytesRead -= len;
        super.unread(b, off, len);
    }

    @Override
    public void unread(int b) throws IOException {
        bytesRead--;
        super.unread(b);
    }

    @Override
    public void unread(byte[] b) throws IOException {
        bytesRead -= b.length;
        super.unread(b);
    }

    @Override
    public boolean markSupported() {
        return false;
    }

    @Override
    public synchronized void reset() throws IOException {
        throw new IOException("Mark not supported");
    }

    @Override
    public int read() throws IOException {
        int rd = super.read();
        if (rd != -1) {
            bytesRead++;
        }
        return rd;
    }

    @Override
    public int read(byte[] b) throws IOException {
        int read = super.read(b);
        if (read != -1) {
            bytesRead += read;
        }
        return read;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int read = super.read(b, off, len);
        if (read != -1) {
            bytesRead += read;
        }
        return read;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        bytesRead += skipped;
        return skipped;
    }
}

Wednesday, March 9, 2011

GZip pushback input stream

Java's GZIPInputStream has an annoying habit of eating bytes past the end of the gzip trailer. Normally, when reading a single file, this isn't a problem, but when reading concatenated GZIP'd files or gzip data embedded in other input streams, you would really like to have those overread bytes back. Using this input stream, you can.
public class GzipPushbackStream extends GZIPInputStream {

    private boolean pushed = false;
    private static final int TRAILER_LEN = 8; // 8-byte trailer

    public GzipPushbackStream(PushbackInputStream pis) throws IOException {
        super(pis);
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int read = super.read(buf, off, len);
        if (eos && !pushed) {
            int n = inf.getRemaining();
            if (n > TRAILER_LEN) {
                int offset = super.len - (n - TRAILER_LEN);
                ((PushbackInputStream) in).unread(super.buf, offset, n - TRAILER_LEN);
            }
            pushed = true;
        }
        return read;
    }
}

Wednesday, February 9, 2011

Java digesting performance

For the ACE audit manager, reading files and generating digests is the slowest part of auditing. In previous versions of ACE (1.6 and lower), digesting was done in a simple read/update-digest loop using Java's DigestInputStream. This appeared to work well enough; however, I wanted to see what effect large blocks have on this model. When reading from remote resources, we up the block size from a standard 4-32KB to 1MB.

Using that model, here's how performance looked for a variety of block sizes, both aligned and unaligned. The following table compares reading only against reading then updating. The test data was ~16GB of large (27MB) files on a SATA disk, running on a machine with 8GB RAM and an Intel Q9550 @ 2.83GHz. The digest algorithm was SHA-256, using Sun's provider.

Block size | read      | digest only | read+digest
4096       | 96.5 MB/s | 69.6 MB/s   | 53.1 MB/s
8192       | 95.9 MB/s | 69.6 MB/s   | 54.8 MB/s
10000      | 94.6 MB/s | 69.2 MB/s   | 54.0 MB/s
1048576    | 97.8 MB/s | 72.1 MB/s   | 34.3 MB/s


The interesting take-away is how much worse the large block size performs when you follow the read-then-process model.

The next step is to change the model and split the digesting and reading into separate threads. The new model uses two threads and two queues. The two threads exchange a pre-allocated set of byte arrays using an empty-array queue and a filled-array queue. These queues were Java LinkedBlockingQueues.

Read Thread:
  1. Pop byte buffer from empty-array queue
  2. Read into the byte array.
  3. Push buffer into filled-array queue.


Digest Thread:
  1. Pop byte buffer from filled-array queue
  2. Update digest using byte array.
  3. Push buffer into empty-array queue.


For the following test runs, I used 5 byte buffers of each of the above sizes to see how performance would vary.

Block size | read      | digest only | read+digest
4096       | 92.2 MB/s | 65.63 MB/s  | 54.7 MB/s
8192       | 92.4 MB/s | 67.5 MB/s   | 57.0 MB/s
10000      | 87.8 MB/s | 68.0 MB/s   | 57.2 MB/s
1048576    | 97.4 MB/s | 72.2 MB/s   | 64.7 MB/s


Small block performance is pretty much unchanged; however, with large blocks on large files there is a substantial speedup, running at almost 90% of the possible digest speed vs 48% previously. The next version of ACE will switch to this method of data reading.

Friday, February 4, 2011

ACE 1.6 released

ACE 1.6 has been released. Among its notable features are:
  • Support for exporting token stores.
  • New Maven-based build system, for easier 3rd-party collaboration.
  • Checkm manifest exporting.
  • Improved database storage of tokens.
  • Open Subversion repository and nightly builds.
Download here: ACE Download

A complete list of all improvements and bug fixes is available at: Changelog

When upgrading to 1.6, ACE will attempt to migrate all existing tokens from the older table layout to the newer one. This means the first time you run 1.6, it may take several minutes to perform this migration, and the web interface will be unavailable while it runs. You can follow the status of the migration in the ACE log file, located at /tmp/aceam.log.

Friday, January 14, 2011

Let netbeans edit bxml files

In NB 6.8, here's how to tell NetBeans to treat bxml files as XML: go to Tools -> Options -> Miscellaneous -> Files. Click 'New'. Set bxml as the file extension. Under 'Associated File Type:', select 'XML Files (text/xml)'. Click OK.

Monday, January 10, 2011

Accessing the ACE IMS via python

Using hashlib and binascii, we can use Python to both generate a digest and grab an ACE token for that digest. While this example only sends one request per call, you should batch your requests prior to requesting tokens: just send a list of tokenRequest objects to requestTokensImmediate. The IMS will support up to 10,000 requests per call.
import hashlib
import binascii
from suds.client import Client

filename='test2.py'

digFile = open(filename,'rb')
hashAlg = hashlib.sha256()
hashAlg.update(digFile.read())
filedigest = binascii.b2a_hex(hashAlg.digest())

url='http://ims.umiacs.umd.edu:8080/ace-ims/IMSWebService?wsdl'
client = Client(url)

print  filename, ' ', filedigest

request = client.factory.create('tokenRequest')
request.hashValue = filedigest
request.name = filename

result = client.service.requestTokensImmediate('SHA-256-0',request)
print result
The result spits back a token that you can use to validate a file.
[toaster@loach ace-cli]$ python test2.py
test2.py   164182eef9792e2e1c5005cd9240ff508aef042b8fa344597431eae39370c784
[(tokenResponse){
  digestService = "SHA-256"
  name = "test2.py"
  proofElements[] =
     (proofElement){
        hashes[] =
           "c5e82872eeee3dfa539202a9757f8a5364b6fded4dfcb40b66084158f2b5c627",
        index = 0
     },
     (proofElement){
        hashes[] =
           "6e16a71847403f4e586625463160993bfab189c0bba771d81354c03d9c3591fd",
        index = 0
     },
     (proofElement){
        hashes[] =
           "0879b385c366d07142446a18dfb6d19c468a733991e9685fc75ce6f4b929b659",
        index = 0
     },
     (proofElement){
        hashes[] =
           "e19dd18bd9eabf79a074d72231a7117bd2319a859d31a429575b4657e85d0c95",
        index = 1
     },
  roundId = 2893078
  statusCode = 100
  timestamp = 2011-01-07 13:08:27.000253
  tokenClassName = "SHA-256-0"
}]