Tom Haynes
Tuesday October 14, 2008
I've been meaning to learn more about RBAC

If you also have been meaning to learn more about RBAC, a good start would be: Introducing pfexec, a Convenient Utility in the OpenSolaris OS By Joerg Moellenkamp, with contributions from Marina Sum, October 13, 2008.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

Monday October 13, 2008
ds_addr_t is now da_addrlist_t

As a group, we decided to change ds_addr_t to ds_addrlist_t to avoid confusion with struct ds_addr. The OpenSolaris gate has those changes already.



Sunday October 12, 2008
Restarting with mds_gather_devs

Time to pick back up on that analysis, but remembering that ds_addr is different than ds_addr_t.

mds_gather_devs

Note, we are in usr/src/uts/common/fs/nfs/nfs41_state.c...

So mds_gather_devs does the work of stuffing the layout. It gets called for every entry found in the instp->ds_addr_tab:

    968 	ds_addr_t	*dp = (ds_addr_t *)entry;
...
    974 	if (gap->dex < gap->max_devs_needed) {
    975 		gap->lo_arg.lo_devs[gap->dex] = rfs4_dbe_getid(dp->dbe);
    976 		gap->dev_ptr[gap->dex] = dp;
    977 		gap->dex++;
    978 	}

So we keep on reading ds_addr_t data structures until we have enough.

Now, how is that table populated? We are looping over these entries in the NFSv4 state tables:

   1060 	rw_enter(&instp->ds_addr_lock, RW_READER);
   1061 	rfs4_dbe_walk(instp->ds_addr_tab, mds_gather_devs, &args);
   1062 	rw_exit(&instp->ds_addr_lock);

So we need to look for instp->ds_addr_tab or instp->ds_addr_idx. And in usr/src/uts/common/fs/nfs/ds_srv.c, we find mds_ds_addr_update which does:

    616 ds_status
    617 mds_ds_addr_update(ds_owner_t *dop, struct ds_addr *dap)
    618 {
    619 	struct mds_adddev_args darg;
    620 	bool_t create = FALSE;
    621 	ds_addr_t *devp;
...
    626 	if ((devp = (ds_addr_t *)rfs4_dbsearch(mds_server->ds_addr_uaddr_idx,
    627 	    (void *)dap->addr.na_r_addr,
    628 	    &create, NULL, RFS4_DBS_VALID)) != NULL) {
    629 		MDS_SET_DS_FLAGS(devp->dev_flags, dap->validuse);
    630 		rw_exit(&mds_server->ds_addr_lock);
    631 		return (stat);
    632 	}

Note how we are calling the ds_addr_t a devp; perhaps a better structure name would be ds_dev_addr_t.

So, if we find one in mds_server->ds_addr_tab (via the mds_server->ds_addr_uaddr_idx which is a secondary index to ds_addr_idx), then we return. Else:

    636 	darg.dev_netid = kstrdup(dap->addr.na_r_netid);
    637 	darg.dev_addr  = kstrdup(dap->addr.na_r_addr);
    638 
    639 	/* make it */
    640 	devp = (ds_addr_t *)rfs4_dbcreate(mds_server->ds_addr_idx,
    641 	    (void *)&darg);
    642 
    643 	if (devp) {
    644 		devp->ds_owner = dop;
    645 		MDS_SET_DS_FLAGS(devp->dev_flags, dap->validuse);
    646 		list_insert_tail(&dop->ds_addr_list, devp);
    647 	} else
    648 		stat = DSERR_INVAL;

we grab the info out of the ds_addr and create a new entry. Note that it is devp->ds_owner which is likely to have the addressing info I am interested in.

     98 typedef struct {
     99 	rfs4_dbe_t	*dbe;
    100 	time_t		last_access;
    101 	char		*identity;
    102 	ds_id		ds_id;
    103 	ds_verifier	verifier;
    104 	uint32_t	dsi_flags;
    105 	list_t		ds_addr_list;
    106 	list_t		ds_guid_list;
    107 } ds_owner_t;

So we have lists of ds_addr and ds_guid. But that ds_guid_list is currently only created and never populated.

Time to digress and attack this from a different angle.

Looking at the NFSv4.1 pNFS Devices and File Layout Structures

This may no longer be accurate, but Robert Gordon, before he passed on (to another company), left us with this image (from the Server Design Document):

[Image: pNFS layout and device structures - http://opensolaris.org/os/project/nfsv41/documentation/nfsv41_server/d13_layout_devices.jpg]

This says quite clearly that while it may be the spe's job to generate layouts, in order to do so you need to construct a device list. Up until now, I've been working from a months-old statement that I need to "just generate the stripe width, stripe unit size, and an array of guids". Implicit in that was that someone else would handle the logic, trivial as it was, to morph that into a layout.

And you know, I keep on looking for an explicit mapping to occur between the selection of the layout and the device list - it is the title of this series of blog articles. It may not be occurring because of the maturity of the code. I.e., everything up to now is predicated on there being a fixed number of DSes and a fixed amount of data server storage. And the relationships just work: if you only have 1 entry in a list because there is only 1 data store, then all of the other associated lists will also only have 1 entry.

There is still a lot of work to do to make this implementation a product.

Anyway, the picture spells out a lot of what is in the spec. The other way to attack this would be to look at a snoop trace during a create.

But anyway you slice it, there is no magic happening to tie a guid to a device list.

I'm going to have to expand the scope of my project.



Saturday October 11, 2008
Understanding layout creation to understand what spe will have to do

In my task list for spe, a large item has been how to tie it into the current code base - you might have seen me reference it as translating data path to guid. To do that, I've had to understand what the current code is doing and the limitations in that code. I've also had to question exactly what it is we want done.

Quick overview of spe

The Simple Policy Engine (spe) tells the pNFS MetaData Server (MDS) how to layout the stripes on the data servers (DS) at file creation time. If you think of RAID, a file is striped across disks and we need to know how many disks it is striped across and what is the width of the stripe. Then to determine which disk a particular piece of data is on, we can divide the file offset by the stripe width to get the disk.

This is simplistic, but it is also the basic concept behind layout creation in pNFS. A huge difference is that we need to tell the client not only the stripe count and width, but the machine addresses of the DSes. It is a little bit more complex than that as each DS might have several data stores associated with it, a data store might be moved to a different DS, etc. We capture that complexity in the globally unique id (guid) assigned to each data store. But conceptually, let's consider just the base case of each DS having only one data store and it is always on that DS.
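The RAID analogy above can be sketched in a few lines, with the modulo step made explicit (Python here, and the function name is my own, not anything in the gate):

```python
def locate(offset, stripe_unit, stripe_count):
    """Map a byte offset to the data server holding it.

    stripe_unit:  width of one stripe unit in bytes
    stripe_count: number of DSes the file is striped across
    Returns (ds_index, stripe_unit_number).
    """
    unit = offset // stripe_unit   # which stripe unit the offset falls in
    ds = unit % stripe_count       # round-robin: which DS holds that unit
    return ds, unit

# e.g. with a 32 KB stripe unit across 4 data servers, byte 100000
# falls in stripe unit 3, which round-robins onto DS 3
```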

Overview of Current Layout Generation

So the NFSv4.1 protocol defines an OPEN operation and a LAYOUTGET operation. It doesn't define how an implementation will determine which data stores are put into the layout.

In the current OpenSolaris implementation, these two operations result in the following call chains:

"OPEN" -> mds_op_open -> mds_do_opennull -> mds_create_file
"LAYOUTGET" -> mds_op_layout_get -> mds_fetch_layout -> mds_get_flo -> fake_spe

In my development gate, a call to spe_allocate currently occurs in mds_create_file.

The relevant files to look at are: usr/src/uts/common/fs/nfs/nfs41_srv.c and usr/src/uts/common/fs/nfs/nfs41_state.c.

Note: I will be quoting routines in the above two files. Over time, those files will change and will no longer match what I quote.

mds_fetch_layout

The interesting stuff in layout creation occurs in mds_fetch_layout:

Note that we are starting with nfs41_srv.c.

   8320 	if (mds_get_flo(cs, &lp) != NFS4_OK)
   8321 		return (NFS4ERR_LAYOUTUNAVAILABLE);

mds_get_flo

And in mds_get_flo:

   8269 	mutex_enter(&cs->vp->v_lock);
   8270 	fp = (rfs4_file_t *)vsd_get(cs->vp, cs->instp->vkey);
   8271 	mutex_exit(&cs->vp->v_lock);
   8272 
   8273 	/* Odd.. no rfs4_file_t for the vnode.. */
   8274 	if (fp == NULL)
   8275 		return (NFS4ERR_LAYOUTUNAVAILABLE);

Which basically states that the file must have been created and be in memory. This is not a panic for at least the following reasons:

  1. Client may have sent the LAYOUTGET before the OPEN. A crappy thing to do, but not a reason for a panic.
  2. The server may have rebooted since the client sent the OPEN. Even if the file is on disk on the MDS, it is not incore. Clue the client in that they may need to reissue the OPEN.
   8277 	/* do we have a odl already ? */
   8278 	if (fp->flp == NULL) {
   8279 		/* Nope, read from disk */
   8280 		if (mds_get_odl(cs->vp, &fp->flp) != NFS4_OK) {
   8281 			/*
   8282 			 * XXXXX:
   8283 			 * XXXXX: No ODL, so lets go query PE
   8284 			 * XXXXX:
   8285 			 */
   8286 			fake_spe(cs->instp, &fp->flp);
   8287 
   8288 			if (fp->flp == NULL)
   8289 				return (NFS4ERR_LAYOUTUNAVAILABLE);
   8290 		}
   8291 	}

Note that an odl is an on-disk layout. And the statement on 8278 is how I will tie the spe in with this code. During an OPEN, I can simply set fp->flp and bypass this logic. If there is any error, then this field will be NULL and we can grab a simple default layout here. So I'll probably rename fake_spe to be mds_generate_default_flo.

fake_spe

So understanding what fake_spe does will help me understand what the real spe will have to do:

   8236 	int key = 1;
...
   8241 	*flp = NULL;
   8242 
   8243 	rw_enter(&instp->mds_layout_lock, RW_READER);
   8244 	lp = (mds_layout_t *)rfs4_dbsearch(instp->mds_layout_idx,
   8245 	    (void *)(uintptr_t)key, &create, NULL, RFS4_DBS_VALID);
   8246 	rw_exit(&instp->mds_layout_lock);
   8247 
   8248 	if (lp == NULL)
   8249 		lp = mds_gen_default_layout(instp, mds_max_lo_devs);
   8250 
   8251 	if (lp != NULL)
   8252 		*flp = lp;

The current code only ever has 1 layout in memory. Hence, the key is 1. We'll need to see how that layout is generated. And that occurs in mds_gen_default_layout. Note how simplistic this code is - if for any reason the layout is deleted from the table, it is simply added back in here. Right now, the only reason the layout would be deleted is if a DS reboots (look at ds_exchange in ds_srv.c).

mds_gen_default_layout

This is the code that builds up the layout and stuffs it in memory:

Note that we have switched into nfs41_state.c.

   1046 int mds_default_stripe = 32;
   1047 int mds_max_lo_devs = 20;
...
   1052 	struct mds_gather_args args;
   1053 	mds_layout_t *lop;
   1054 
   1055 	bzero(&args, sizeof (args));
   1056 
   1057 	args.max_devs_needed = MIN(max_devs_needed,
   1058 	    MIN(mds_max_lo_devs, 99));
   1059 
   1060 	rw_enter(&instp->ds_addr_lock, RW_READER);
   1061 	rfs4_dbe_walk(instp->ds_addr_tab, mds_gather_devs, &args);
   1062 	rw_exit(&instp->ds_addr_lock);
   1063 
   1064 	/*
   1065 	 * if we didn't find any devices then we do no service
   1066 	 */
   1067 	if (args.dex == 0)
   1068 		return (NULL);
   1069 
   1070 	args.lo_arg.loid = 1;
   1071 	args.lo_arg.lo_stripe_unit = mds_default_stripe * 1024;
   1072 
   1073 	rw_enter(&instp->mds_layout_lock, RW_WRITER);
   1074 	lop = (mds_layout_t *)rfs4_dbcreate(instp->mds_layout_idx,
   1075 	    (void *)&args);
   1076 	rw_exit(&instp->mds_layout_lock);

We first walk across the instp->ds_addr_tab and look for effectively 20 entries. Note that max_devs_needed is always 20 for this code and so will be args.max_devs_needed.

I think the check on 1067 is incorrect, a result of the current implementation normally running against a community with 1 DS. It should be the case that args.dex is greater than or equal to max_devs_needed. Actually, we need to be passing in how many devices we will have, D (the ones assigned to a policy), and how many we need to use, S, with S <= D. The args.dex will have to be >= S.
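The check I have in mind looks something like this (Python for brevity; the function and its argument names are mine, not anything in the gate):

```python
def enough_devs(found, stripe_width, assigned):
    """Proposed replacement for the 'args.dex == 0' test on 1067.

    found:        devices the table walk actually gathered (args.dex)
    stripe_width: S, how many devices the layout will stripe across
    assigned:     D, devices assigned to the policy; we require S <= D
    """
    if stripe_width > assigned:
        raise ValueError("policy cannot stripe wider than its device set")
    return found >= stripe_width   # otherwise, generate no layout

# enough_devs(1, 4, 20) is False: one registered data store cannot
# back a 4-wide stripe, so we should fail rather than hand out a layout.
```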

Note that on 1070, we assign it the only layout id which will ever be generated. And if we play things right, we could store this layout id back in the policy and avoid regenerating the layout if at all possible.

Finally we stuff the newly created layout into the table.

mds_gather_devs

So mds_gather_devs does the work of stuffing the layout. It gets called for every entry found in the instp->ds_addr_tab:

    974 	if (gap->dex < gap->max_devs_needed) {
    975 		gap->lo_arg.lo_devs[gap->dex] = rfs4_dbe_getid(dp->dbe);
    976 		gap->dev_ptr[gap->dex] = dp;
    977 		gap->dex++;
    978 	}

So we keep on reading ds_addr_t data structures until we have enough.

Now, how is that table populated? You can look for ds_addr_idx over in usr/src/uts/common/fs/nfs/ds_srv.c, but basically, for each data store that a DS registers, one of these is created.

The upshot of all this is that if a pNFS community has N data stores, then the layout generated for the current implementation will have a stripe count of N.

Back to mds_fetch_layout

Note that we have switched back into nfs41_srv.c.

Okay, we've generated the layout and start to generate the otw (over the wire) layout:

   8332 
   8333 	mds_set_deviceid(lp->dev_id, &otw_flo.nfl_deviceid);
   8334 

Crap, it is sending the device id across the wire! I'm going to have to rethink my approach. Instead of storing a policy as a device list and picking which devices I want out of that list (i.e., a Round Robin (RR) scheduler), I'm going to have to store each generated set as a new device list.

I don't understand the process like I thought I did.

Going back to mds_gather_devs, it is not stuffing data stores into a table as I thought. Instead, it is stuffing DS network addresses into a table.

Missing link

What I'm missing is how the ds_addr entries map back to data stores. Okay, this code in mds_gen_default_layout does it:

   1073 	rw_enter(&instp->mds_layout_lock, RW_WRITER);
   1074 	lop = (mds_layout_t *)rfs4_dbcreate(instp->mds_layout_idx,
   1075 	    (void *)&args);
   1076 	rw_exit(&instp->mds_layout_lock);

We have just gotten the device list via the walk over mds_gather_devs. And now we effectively call mds_layout_create on 1074.

nfs41_state.c:

   1104 	ds_addr_t *dp;
   1105 	struct mds_gather_args *gap = (struct mds_gather_args *)arg;
   1106 	struct mds_addlo_args *alop = &gap->lo_arg;
...
   1119 	lp->layout_type = LAYOUT4_NFSV4_1_FILES;
   1120 	lp->stripe_unit = alop->lo_stripe_unit;
   1121 
   1122 	for (i = 0; alop->lo_devs[i] && i < 100; i++) {
   1123 		lp->devs[i] = alop->lo_devs[i];
   1124 		dp = mds_find_ds_addr(instp, alop->lo_devs[i]);
   1125 		/* lets hope this doesn't occur */
   1126 		if (dp == NULL)
   1127 			return (FALSE);
   1128 		gap->dev_ptr[i] = dp;
   1129 	}

Okay, alop->lo_devs is the array we built in mds_gather_devs. Yes, yes, that is true.

I just figured out where all of my confusion is coming from - the code has struct ds_addr and ds_addr_t. In the xdr code, struct ds_addr is just an address (usr/src/head/rpcsvc/ds_prot.x):

    338 /*
    339  * ds_addr -
    340  *
    341  * A structure that is used to specify an address and
    342  * its usage.
    343  *
    344  *    addr:
    345  *
    346  *    The specific address on the DS.
    347  *
    348  *    validuse:
    349  *
    350  *    Bitmap associating the netaddr defined in "addr"
    351  *    to the protocols that are valid for that interface.
    352  */
    353 struct ds_addr {
    354 	struct netaddr4     addr;
    355 	ds_addruse          validuse;
    356 };

But in the code I've been looking at, ds_addr_t is a different structure (see usr/src/uts/common/nfs/mds_state.h):

    133 /*
    134  * ds_addr:
    135  *
    136  * This list is updated via the control-protocol
    137  * message DS_REPORTAVAIL.
    138  *
    139  * FOR NOW: We scan this list to automatically build the default
    140  * layout and the multipath device struct (mds_mpd)
    141  */
    142 typedef struct {
    143 	rfs4_dbe_t		*dbe;
    144 	netaddr4		dev_addr;
    145 	struct knetconfig	*dev_knc;
    146 	struct netbuf		*dev_nb;
    147 	uint_t			dev_flags;
    148 	ds_owner_t		*ds_owner;
    149 	list_node_t		ds_addr_next;
    150 } ds_addr_t;

This is pure evil because we typically equate foo_t as being typedef struct foo foo_t. As you can see, I've been fighting that in the above analysis.

I'm going to file an issue on this naming convention and leave the analysis here. I'll come back to it and rewrite it as if I knew all along that I was using a ds_addr_t and not a struct ds_addr.



Thursday October 09, 2008
nfs41-gate is branch merged with snv_100

I just merged the nfs41-gate with the snv_100 tagged onnv-gate. This caused me to bump the closedv tag to 2 in the nfs41-gate.

You can refresh your copies of our closed-bins at http://www.opensolaris.org/os/project/nfsv41/downloads/.

BTW: While the pushes are automatic, I'm still trying to get the notification to be automatic.


Pushed out the changes for 6751438 mirror mounted mountpoints panic when umounted

6751438 (mirror mounted mountpoints panic when umounted) finally got through the code review process, the RTI process, etc. I wanted to get this into snv_101 because that is the candidate to become OpenSolaris 2008.11. I want a stable mirror mount experience out there for users.

The other bug fix I have in the works has stalled, which I'm really okay with at this point. This is the 6738223 Can not share a single IP address bug. Anyway, I've gotten two positive reviews and one reviewer asking questions. If I were tricky, I could try to submit an RTI, but the fact of the matter is that the third reviewer can still be satisfied.

Also, there is no way I want to integrate this into snv_101. While the fix is trivial, it can wait for snv_102. I.e., I don't see a business need to put it back right now.

Some of the key differences between the bugs and why one has to go in:

  1. Mirror mount bug is a panic, single addr one has a valid and documented work around
  2. Mirror mount bug is impacting regression testing, single addr one does not have a regression test case
  3. Mirror mounts are a hot request from some customers, single addr is being driven by my getting frustrated by unwieldy notation

So I feel one is a strong must and the other is a nice to have. And I don't think 'nice to have' meets the entry criteria for snv_101.


A better fix for that non-responsive tool repository

I ran into another machine which got stuck on that 'hg style' issue I reported in Getting around a tool repository which is not updating and I got mad enough to start looking at it. We've still got path dependencies in BFU to stuff inside SWAN and I was hoping we could get away from it with the new tools.

I started trying to figure out where the Python script was that was being run. I wanted to find the hard-coded reference to /ws/onnv-tools/onbld/etc/hgstyle. Knowing that I had just seen a Flag Day on this, I looked in Flag day: new default output style for Mercurial. And the solution was staring right at me!

Mark had coded it correctly, no path hardcoding! I could simply change:

	[ui]
	username=Mark J. Nelson 
	style=/ws/onnv-tools/onbld/etc/hgstyle

to be:

	[ui]
	username=Mark J. Nelson 
	style=/opt/onbld/etc/hgstyle

Sweet, great to have my faith restored and an effective solution.

But now I need to really fix the automount maps in the lab; we keep on tripping over this issue with a stale repository when we have a working one we should be using.


Another reader suggestion for the Python script

So Justin suggested:

try

template =  """"%(started)s - %(ended)s: %(title)s for      
%(company)s
    %(jobdesc)s"""


print template % row

I'm trying to capture what he put in; the line wrap on the first 'template' line is probably from the comment system.

So I entered this, making a reformat tweak, as:


template = """"%(started)s - %(ended)s: %(title)s for %(company)s %(description)s"""

if __name__ == "__main__":
        for row in main(sys.argv[1],'!'):
                print template % row

And I got the wrong output. The error is of course mine:

> ./justin.py r4.txt
"1/05 - present: Staff Engineer Software for Sun Microsystems NFS development
"6/01 - 12/05: File System Engineer for Network Appliance WAFL and NFS development
"4/01 - 6/01: Manager for Network Appliance Manager of Engineering Internal Test
"10/99 - 4/01: System Administrator for Network Appliance Perl hacker and filer administrator

I shouldn't have joined the lines in the 'template' format. But why the extra '"'?

Ahh, Justin had one in his entry.

Okay, so the new changes work, here is the full program:

#!/usr/bin/env python

import csv, sys

def main(dfile,format,delimiter=","):
        db=open(dfile,'U')
        start=0
        for line in db:
                if line.startswith(format):
                        db.seek(start+len(format))
                        return csv.DictReader(db,delimiter=delimiter)
                else:
                        start+=len(line)+(len(db.newlines)==2) #windows hackery
        raise "There is no %s header line in %s" % (format,dfile)

template = """%(started)s - %(ended)s: %(title)s for %(company)s
        %(description)s"""

if __name__ == "__main__":
        for row in main(sys.argv[1],'!'):
                print template % row

And now I need to go figure out what Justin did...

Okay, the '%(name)s' has to be a formatting option. Can I duplicate it?

>>> for row in csv.DictReader(file("r4.txt")):
...     print "%(title)s" % row
...
Staff Engineer Software
File System Engineer
Manager
System Administrator

And now I do understand it - 3.6.2 String Formatting Operations:

A conversion specifier contains two or more characters and has the following components, which must occur in this order:

   1. The "%" character, which marks the start of the specifier.
   2. Mapping key (optional), consisting of a parenthesised sequence of characters (for example, (somename)). 

So, this code would be good for printing, but not necessarily for doing more complex manipulations.
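To convince myself of how the mapping key composes with the rest of the specifier, a couple more interactive pokes (the dict here is made up):

```python
row = {'started': '4/01', 'ended': '6/01', 'title': 'Manager'}
# the (name) mapping key selects the dict entry; flags, width, and the
# conversion type then apply as usual
print("%(started)s - %(ended)s: %(title)s" % row)   # -> 4/01 - 6/01: Manager
print("%(n)03d widgets" % {'n': 7})                 # -> 007 widgets
```

So non-string conversions and padding work fine with mapping keys; it is only positional mixing that the dict form gives up.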

By the way, I'm having fun figuring this stuff out...



Wednesday October 08, 2008
A reader suggestion on how to solve the Perl script

Neil doesn't like that our comment section wipes out whitespace. His concern is certainly valid when it comes to the way Python uses indentation.

He suggested the following implementation:

#!/usr/bin/env python

import csv

def main(dfile,format,delimiter=","):
        db=open(dfile,'U')
        start=0
        for line in db:
                if line.startswith(format):
                        db.seek(start+len(format))
                        return csv.DictReader(db,delimiter=delimiter)
                else:
                        start+=len(line)+(len(db.newlines)==2) #windows hackery
        raise "There is no %s header line in %s" % (format,dfile)


if __name__ == "__main__":
        for row in main('data.txt','!'):
                print "%s - %s: %s for %s\n\t%s\n\n" % \
                                tuple([row[column] for column in ['started','ended','title','company','jobdesc']])

And he provided the following note:

So what about something like this?

The csv module should take care of delimiters within columns. Simplification is possible if you don't need to deal with windows or unix style line terminators. Changing delimiters is easy too.

I like that he caught on to making the separator an argument. It makes the code much more portable. I'm not sure it is as robust with respect to error handling, but in all fairness that could easily be handled and I did add those after posting the Perl script. Oh, and it does easily handle the addition of a new column in the data file.

I like the use of raise; I'm certainly not used to exception handlers any more.

I can see part of what is going on here:

>>> for row in neil.main("r4.txt",'!'):
...     print row
...
{'description': 'NFS development', 'title': 'Staff Engineer Software', 'started': '1/05', 'company': 'Sun Microsystems', 'ended': 'present', 'mad': 'money'}
{'description': 'WAFL and NFS development', 'title': 'File System Engineer', 'started': '6/01', 'company': 'Network Appliance', 'ended': '12/05', 'mad': 'honey'}
{'description': 'Manager of Engineering Internal Test', 'title': 'Manager', 'started': '4/01', 'company': 'Network Appliance', 'ended': '6/01', 'mad': 'scot'}
{'description': 'Perl hacker and filer administrator', 'title': 'System Administrator', 'started': '10/99', 'company': 'Network Appliance', 'ended': '4/01', 'mad': 'dam'}

And I think the stuff with 'start' is what gets over the '!' in the first line???

>>> import csv
>>> help(csv.DictReader)

>>> for row in csv.DictReader(file("r4.txt")):
...     print row
...
{'!started': '1/05', 'description': 'NFS development', 'title': 'Staff Engineer Software', 'company': 'Sun Microsystems', 'ended': 'present', 'mad': 'money'}
{'!started': '6/01', 'description': 'WAFL and NFS development', 'title': 'File System Engineer', 'company': 'Network Appliance', 'ended': '12/05', 'mad': 'honey'}
{'!started': '4/01', 'description': 'Manager of Engineering Internal Test', 'title': 'Manager', 'company': 'Network Appliance', 'ended': '6/01', 'mad': 'scot'}
{'!started': '10/99', 'description': 'Perl hacker and filer administrator', 'title': 'System Administrator', 'company': 'Network Appliance', 'ended': '4/01', 'mad': 'dam'}

But I haven't figured out yet how the result is being built up. Okay, yes I have. I was fixated on the 'if' and 'else', thinking they were handling the header line versus the data lines. But no, all it does is get you to the header line (i.e., there are comments in the file), and then the 'db.seek' gets you to the start of the header line plus 1 (via 'len(format)') for the format character. Then, just as in my interactive example, 'csv.DictReader' does the magic for you!
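To check my reading of it, here is the same seek-past-the-junk mechanism in miniature, run against an in-memory file (io.StringIO and the sample data are mine, and I've dropped the windows newline hackery since StringIO has no '\r\n'):

```python
import csv
import io

SAMPLE = ("# a comment line before the header\n"
          "!started,ended,title\n"
          "1/05,present,Staff Engineer Software\n")

def find_header(f, marker="!"):
    # walk forward counting characters until we hit the marker line, then
    # seek to just past the marker so DictReader sees a clean header
    start = 0
    for line in f:
        if line.startswith(marker):
            f.seek(start + len(marker))
            return csv.DictReader(f)
        start += len(line)
    raise ValueError("no %s header line" % marker)

rows = list(find_header(io.StringIO(SAMPLE)))
# rows[0] is {'started': '1/05', 'ended': 'present',
#             'title': 'Staff Engineer Software'} - the '!' never
# reaches DictReader, so the first key is 'started', not '!started'
```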

Sweet, Neil's code does what I had the Perl script doing!

It also shows I'm not used to all of the Python way of doing things. But it was fun to figure out what his script was doing!


Branch merge to a specific tagged revision

I need to do a branch merge between nfs41-gate and onnv-clone. And specifically, I want to not get the 'tip', but rather the tag for release 100. I found a good reference - Chapter 8 Managing releases and branchy development.

So I'll follow along with it. I need the tag:

[th199096@jhereg onnv-play]> hg tags
tip                             7782:716c23b2ce2e
onnv_100                        7757:bf4a45ecb669
onnv_99                         7613:e49de7ec7617
onnv_98                         7473:fad192e9bc57

It turns out I don't need much more:

[th199096@jhereg nfs41-100]> hg reparent ssh://onnv.eng//export/onnv-clone
[th199096@jhereg nfs41-100]> hg tags | more
tip                             7744:763bfa203d1a
closedv1                        7742:9fab48a31a4a
onnv_99                         7652:e49de7ec7617

So I haven't merged yet:

[th199096@jhereg nfs41-100]> hg pull -u -r onnv_100
pulling from ssh://onnv.eng//export/onnv-clone
searching for changes
adding changesets
adding manifests
adding file changes
added 64 changesets with 475 changes to 462 files (+1 heads)
not updating, since new heads added
(run 'hg heads' to see heads, 'hg merge' to merge)

The tag can be used as a revision!


Getting around a tool repository which is not updating

With the introduction of Mercurial, we have a need to keep our tools directory up to date. We could simply NFS mount the one in Menlo Park, but for WAN and build performance, that sucks. So, the Austin Labs have a local copy. And it is not being kept up to date. We've all been bitten by an old copy of the BFU script.

To get around this, we've built our own local repository and made sure that our paths all take this into account. Well, that just failed for me:

[th199096@jhereg mms]> hg outgoing -v
running ssh onnv.eng "hg -R /export/onnv-clone serve --stdio"
comparing with ssh://onnv.eng//export/onnv-clone
searching for changes
abort: style not found: /ws/onnv-tools/onbld/etc/hgstyle

I know the 'hgstyle' stuff is new, I saw Flag Day info on it. And sure enough:

[th199096@jhereg mms]> df -k /ws/onnv-tools/onbld/etc
Filesystem            kbytes    used   avail capacity  Mounted on
mool-ha1-nfs.central:/export/ds01/d531/tools/01/elpaso.eng/opt/onbld
                     140454588 109105801 29944242    79%    /ws/onnv-tools/onbld

I don't want to hack on the script, which I think shouldn't be using the full path. So I'll have to change where I'm getting my copy of the tools in /ws.

Okay, I don't have permissions on the NIS server, but I can get the map:

[th199096@jhereg ~]> ypcat -k auto.ws | grep onnv-tool
onnv-tools /SUNWspro   -ro mool-ha1-nfs.central:/export/ds01/d531/tools/01/slug-17.eng/export/$CPU/opt/SUNWspro    /teamware   -ro mool-ha1-nfs.central:/export/ds01/d531/tools/01/slug-17.eng/export/$CPU/opt/SUNWspro/SOS8    /onbld      -ro mool-ha1-nfs.central:/export/ds01/d531/tools/01/elpaso.eng/opt/onbld

And I can add it to my local /etc/auto_ws:

#
# Local copies of /ws workspaces
#
# For /ws/on10-clone use:
# /ws/on10-patch-clone-auspen or on10-feature-clone-auspen
#
on10-clone-aus          iquad:/pool/ws/on10-clone
on10-patch-clone-aus    iquad:/pool/ws/on10-patch-clone
onnv-clone-aus          iquad:/pool/ws/onnv-clone
on10-test-aus           iquad:/pool/ws/on10-test
onnv-test-aus           iquad:/pool/ws/onnv-test
onnv-stc2-aus           iquad:/pool/ws/onnv-stc2
on10-tools-aus  -ro     iquad:/pool/ws/on10-tools-$CPU
onnv-tools-aus  -ro     aus1500-home:/pool/ws/onnv-tools-$CPU
onnv-tools      /SUNWspro       -ro     /opt/SUNWspro /teamware       -ro     /opt/SUNWspro/SOS8    /onbld      -ro     /opt/onbld

And no go:

[th199096@jhereg /etc]> sudo svcadm restart autofs
...
[th199096@jhereg th199096]> ls -la /ws/onnv-tools
/ws/onnv-tools: Permission denied
total 1
[th199096@jhereg th199096]> dmesg
...
Oct  8 16:54:23 jhereg automountd[883428]: [ID 406441 daemon.error] parse_entry: mapentry parse error: map=auto_ws key=onnv-tools
Oct  8 16:55:55 jhereg automountd[883477]: [ID 406441 daemon.error] parse_entry: mapentry parse error: map=auto_ws key=onnv-tools

I turn spaces into tabs, no luck. I check other machines and they do the hierarchy locally for other things. Well, I then convert the pathnames from /opt/SUNWspro to localhost:/opt/SUNWspro. And that turns the trick:

[th199096@jhereg th199096]> ls -la /ws/onnv-tools
total 5
dr-xr-xr-x   4 root     root           4 Oct  8 17:04 .
dr-xr-xr-x   2 root     root           2 Oct  8 17:04 ..
dr-xr-xr-x   1 root     root           1 Oct  8 17:04 SUNWspro
dr-xr-xr-x   1 root     root           1 Oct  8 17:04 onbld
dr-xr-xr-x   1 root     root           1 Oct  8 17:04 teamware

I probably need to put a real fix into our jumpstart servers and make the path depend on $CPU, but I think I was in the middle of something else when this came up.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
pdf to jpg via ImageMagick

I'm the volunteer webmaster for my son's soccer club: Blitz United Soccer Club. We occasionally get logos and such from sponsors. We want JPEG images for the website, and they want high-quality PDFs for printing. Until now, I've simply asked them for the images in a format we can handle.

I got tired of doing that and googled 'pdf to jpg'. There were a lot of hits for sites that either wanted to install something on my Windows box or collect an email address. I added 'linux' to my search and found a nice hit: Batch converting PDF to JPG/JPEG using free software.

Having vaguely heard of ImageMagick in the past, and since it had many download sites, I installed it on my WinXP desktop. It didn't convert for me:

C:\Documents and Settings\thud\Desktop\Downloads\97red>convert cooper.pdf cooper.jpg
convert: `%s': %s "gswin32c.exe" -q -dQUIET -dPARANOIDSAFER -dBATCH -dNOPAUSE -d NOPROMPT -dMaxBitmap=500000000 -dEPSCrop -dAlignToPixels=0 -dGridFitTT=0 "-sDEVICE=pnmraw" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72"  "-sOutputFile=C:/DOCUME~1/thud/LOCALS~1/Temp/magick-UtqkGDcw" "-fC:/DOCUME~1/thud/LOCALS~1/Temp/magick-MpE4YxWI" "-fC:/DOCUME~1/thud/LOCALS~1/Temp/magick-z6ByBicB".
convert: Postscript delegate failed `cooper.pdf': No such file or directory.
convert: missing an image filename `cooper.jpg'.

That 'Postscript delegate failed' error means ImageMagick couldn't find its Ghostscript delegate (gswin32c.exe). Rather than install Ghostscript on Windows, I solved it fairly quickly on my Fedora box:

[thud@adept ~/tmp]> sudo yum install ImageMagick
Setting up Install Process
Parsing package install arguments
Package ImageMagick-6.3.5.9-1.fc8.i386 already installed and latest version
Nothing to do
[thud@adept ~/tmp]> convert -density 600 cooper.pdf cooper.jpg

Which is probably what I should have tried in the first place.
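For a whole folder of sponsor PDFs, the same convert invocation can be scripted. A sketch in Python (the file names are made up; it only shells out if ImageMagick's convert is actually on the PATH):

```python
import glob
import shutil
import subprocess

def build_cmd(pdf, density=600):
    """Equivalent command line: convert -density 600 foo.pdf foo.jpg"""
    jpg = pdf.rsplit(".", 1)[0] + ".jpg"
    return ["convert", "-density", str(density), pdf, jpg]

if __name__ == "__main__":
    for pdf in glob.glob("*.pdf"):
        cmd = build_cmd(pdf)
        if shutil.which("convert"):   # only run if ImageMagick is installed
            subprocess.run(cmd, check=True)
        else:
            print("would run:", " ".join(cmd))
```

The -density flag matters here: it sets the rasterization resolution before conversion, which is why the plain `convert cooper.pdf cooper.jpg` form gives a much blurrier result.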


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Not able to mount from Fedora Core 9

Helen Chao, a colleague who had never really used Linux, asked me to help configure a kernel. I asked why, and she said she needed to test RDMA over NFSv4. It turns out that the stock 2.6.25 kernel in Fedora Core 9 already had the support in it. We followed the directions in nfs-rdma.txt but were not able to get it running.

Helen (a great test engineer) proceeded to investigate from there and couldn't get a simple loopback or NFS mount to succeed.

So I exported the root to all hosts and went to work debugging this issue. A 'rpcinfo -p' on the server showed the expected registered services. The same call from a client failed, but a ping worked:

[th199096@jhereg ~]> rpcinfo -p pnfs-9-30
^C
[th199096@jhereg ~]> rpcinfo -p pnfs-9-30
^C
[th199096@jhereg ~]> sudo mount -o vers=3 pnfs-9-30:/ /mnt
^C
[th199096@jhereg ~]> sudo mount -o vers=3 pnfs-9-30:/ /mnt
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
nfs mount: retrying: /mnt
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
^C
[th199096@jhereg ~]> ping pnfs-9-30
pnfs-9-30 is alive

I thought that perhaps it was a firewall issue and disabled iptables.

No luck, and I knew the mount should succeed: I tried it with my home Core 8 box and an OpenSolaris server, and it worked. Then again, that Linux box has been configured for ages. Long story short, I asked Chuck Lever for help.

His only suggestion was to turn off SELinux, or as he put it:

Also disable selinux, just so your systems behave like normal Unix.

So I followed the directions I found in How to Disable SELinux, and now the mount works:

# mount -o vers=3 pnfs-9-30:/ /mnt
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
nfs mount: retrying: /mnt
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
nfs mount: /mnt: mounted OK
# 

Most of the help I found with Google on the RPC messages wasn't informative: either the suggestion was to turn off iptables, or there was no reply.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Finally, the Python version of the old Perl script

I played around in the interactive Python shell, trying to understand the data and how to tie it together. I learned about the difference between exec and eval in Python. I learned about capturing stdin and stdout for exec, but I couldn't figure out a way to automatically create variables in the proper scope in Python.

I even finally found a good quote on this at http://mail.python.org/pipermail/tutor/2005-January/035253.html:

> This is something I've been trying to figure out for some time.  Is
> there a way in Python to take a string [say something from a
> raw_input] and make that string a variable name?  I want to to this so
> that I can create class instances on-the-fly, using a user-entered
> string as the instance name.

This comes up regularly from beginners and is nearly always a bad
idea!

The easy solution is to use a dictionary to store the instances.
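The dictionary advice from that thread looks like this in practice (a sketch; the Player class and the names are made up for illustration):

```python
class Player:
    def __init__(self, name):
        self.name = name

# Instead of conjuring variables named "alice" and "bob" with exec,
# key the instances by the user-entered string.
instances = {}
for name in ["alice", "bob"]:
    instances[name] = Player(name)

print(instances["alice"].name)  # → alice
```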

Nice to know I'm not the first to want to do this. But it did get me thinking: I have been calling this set of Perl scripts 'data dictionaries' for longer than I care to remember, and the code is not very legible at times. So I decided to redo the script as:

#!/usr/bin/python

import sys

first_line = True

lang = []
iCounter = 0
for line in open(sys.argv[1]):
        line2 = line.lstrip()
        iCounter += 1

        if line2.startswith("!") or line2.startswith("#"):
                if first_line:
                        lang = line2[1:].split(",")
                        first_line = False
                continue
        splity = line2.split(",")
        dtemp = {}

        if len(splity) != len(lang):
                print "Error - args do not match header on line %d" % (iCounter)
                continue

        for i in range(len(splity)):
                dtemp[lang[i]] = splity[i]

        print "%s - %s: %s for %s\n\t%s\n" % (
                dtemp['started'],
                dtemp['ended'],
                dtemp['title'],
                dtemp['company'],
                dtemp['description'])

dtemp['started'] is more verbose than $started, but it is clearer how I am generating the data. And I have more error checking (which I have yet to sanity check :->).

Anyway, this fails and I knew why almost right off the bat:

> ./r3.py r2.txt
Traceback (most recent call last):
  File "./r3.py", line 33, in <module>
    dtemp['description'])
KeyError: 'description'

I was suspicious about that extra newline I mentioned way back in The simple version of the old perl script. I suspected that each line still carried a trailing newline that I needed to remove; i.e., the data dictionary ends up with a key of 'description\n' rather than 'description'.
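A quick check in the interactive shell makes the stray newline visible (a sketch with a made-up record):

```python
# lstrip() only trims the left side, so the trailing '\n' rides along
# into the last field produced by split(",").
line = "4/01,6/01,Manager,Network Appliance,Manager of Engineering Internal Test\n"
fields = line.lstrip().split(",")
print(repr(fields[-1]))  # the last field still ends in '\n'
```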

The following change proved that:

for line in open(sys.argv[1]):
        line1 = line.lstrip()
        line2 = line1.rstrip()
        iCounter += 1

And some quick sanity checking of removing a column in one row and adding one in another row shows that my error checking works:

> ./r3.py r3.txt
Error - args do not match header on line 2
Error - args do not match header on line 3
4/01 - 6/01: Manager for Network Appliance
        Manager of Engineering Internal Test

10/99 - 4/01: System Administrator for Network Appliance
        Perl hacker and filer administrator

So I learned what I set out to do. I may never use this script, but it helped me learn some things the hard way. I didn't show all of the little syntax errors I had to fix (forgetting the ':', not indenting in the interactive shell, etc). But hopefully, I'll remember them.

I'll also claim that the script does meet my needs as did the old one. If I add a new field to the flat file, I won't have to change the script to get the current output! And yes, I just tried that and I didn't have a problem.

I could do some more error checking (i.e., don't access an entry unless it is set), but I've already gone above the error checking in the Perl script.

Final Copy

#!/usr/bin/python

import sys

first_line = True

lang = []
iCounter = 0
for line in open(sys.argv[1]):
        line1 = line.lstrip()
        line2 = line1.rstrip()
        iCounter += 1

        if line2.startswith("!") or line2.startswith("#"):
                if first_line:
                        lang = line2[1:].split(",")
                        first_line = False
                continue
        splity = line2.split(",")
        dtemp = {}

        if len(splity) != len(lang):
                print "Error - args do not match header on line %d" % (iCounter)
                continue

        for i in range(len(splity)):
                dtemp[lang[i]] = splity[i]

        print "%s - %s: %s for %s\n\t%s\n" % (
                dtemp['started'],
                dtemp['ended'],
                dtemp['title'],
                dtemp['company'],
                dtemp['description'])
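For what it's worth, the column-name-to-value mapping the script builds by hand is exactly what the stdlib csv module's DictReader provides. A Python 3 sketch, with inline rows echoing the post's data rather than reading from a file:

```python
import csv
import io

# The header row plays the role of the '!'-prefixed line in the flat file.
data = io.StringIO(
    "started,ended,title,company,description\n"
    "10/99,4/01,System Administrator,Network Appliance,Perl hacker and filer administrator\n"
    "4/01,6/01,Manager,Network Appliance,Manager of Engineering Internal Test\n"
)

for row in csv.DictReader(data):
    print("%s - %s: %s for %s\n\t%s\n" % (
        row["started"], row["ended"], row["title"],
        row["company"], row["description"]))
```

DictReader also sidesteps the newline bug above, since the csv module strips line terminators before splitting.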

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Analyzing that old perl script

Guess I have to understand that script to rewrite it in Python. :->

First, gethead.pl reads through the file until it finds a line which starts with a '!', at which point it creates a list of names of the form '$' followed by the column name:

        $format = '$' . join(', $', split(/,/, $first_line));
        print $format . "\n";

Yields:

> ./r.pl r.txt
$started, $ended, $title, $company, $description

The magic really occurs in the main processing loop:

do main'read_txtfile_format(*LNG_FILE, *languages);

lang: while (<LNG_FILE>) {
        next lang if (/^#/ || /^!/);
        eval "($languages) = split(/[,\n]/)";

        print "$started - $ended: $title for $company\n\t$description\n\n";
}

The first line sets '$languages' up as the list of 'variable' names. Each time through the while loop, the eval associates the columns with those variable names.
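In Python terms, that eval-driven assignment collapses to zipping the header names with the split fields (a sketch with made-up data, not code from either script):

```python
header = "started,ended,title,company,description"
line = "10/99,4/01,System Administrator,Network Appliance,Perl hacker and filer administrator"

# One dict per record replaces the eval-generated $started, $ended, ... variables.
record = dict(zip(header.split(","), line.split(",")))
print(record["title"])  # → System Administrator
```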


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
