Tristan Horn KG6JTW
Boulder Creek, CA 95006
tristan+r@ethereal.net
+1 415 508-6270
http://tris.net/

Objective

Position in network engineering or management, allowing me to apply and expand upon my prior technical experience -- possibly with some focus on monitoring or deployment automation.

Skills

* BGPv4, OSPFv2, HSRP, VPNs (IPsec, CET, PPTP, OpenVPN), ACLs, AAA (TACACS+, Kerberos 5, RADIUS), L2TPv3, NetFlow, rate-limiting, traffic shaping/QoS (CBWFQ, HTB), load balancing (L2-L7), anycast
* DS1, E1, DS3, OC3, Gigabit Ethernet, SONET, ATM, Frame Relay, HDLC, PPP
* Cisco 1751, 2501, 2514, 2811, 3640, 3825, 3845, 7206VXR, 7505, 7507 routers; Cisco 1900, 2900, 3750, 5500 and 6500 series switches; Cisco Aironet AP1231, Cisco VPN 3015, Cisco ASA 5510, Juniper SRX220, Foundry NetIron 400, 800, BigIron 4000, FastIronII+, ServerIron; Tiara/Tasman 1450, 4100; Alteon ACEswitch 180e; NetScreen-25, 100, 500; Livingston PortMaster 2, 3; Check Point FireWall-1; NetScaler MPX 17000
* ShoreTel; Cisco CallManager 4.1, Unity 4.0 (with Exchange & Active Directory integration), Attendant Console, IPCC Express, IP Communicator, 7920, 7936, 7940, 7960 IP Phones (SCCP), ATA186, CCME, SRST; Tandberg 770 MXP video phones
* RANCID, rtrmon, NOCOL, tkined, flow-tools, cflowd, FlowScan, RRDtool, gnuplot, SmokePing, Cricket, MRTG, Visio 2000-2003, graphviz, dia
* tac_plus, gated, zebra/quagga
* Solaris 2.5-9; FreeBSD 2.0-5.2; NetBSD 1.2.1, 1.5.x; System 6.0-7.6, Mac OS 8-X; Gentoo Linux 1.4-; SunOS 4.1.x; Red Hat Linux 6.2-; Slackware Linux 2.x; IRIX; ULTRIX; SCO UNIX; XENIX; ProDOS; MS-DOS; OS/2; Windows 3.1-XP, various NT through 2003
* Network Appliance filers & filer clusters (various)
* Oracle, MySQL, PostgreSQL
* BIND, NIS, OpenLDAP
* JumpStart, Kickstart, FAI
* NFSv2-4
* sendmail, qmail, Postfix
* INN, Diablo, NNTPRelay
* Bourne shell, perl, awk, sed, Tcl/Tk, C, C++, Ruby, Blankenship BASIC, Python
* HTML, CSS, JavaScript, mod_perl, mod_ruby, SpeedyCGI, FastCGI, CGI

Employment

2011 - Present  Sr. Network & Systems Engineer  Color Labs, Inc.
Co-developed the automation and monitoring that orchestrated the bringup of our two large production environments and several development and staging clusters (more than 1600 instance launches). We use Amazon's Java APIs for instance management, Puppet for configuration management, and Icinga, Ganglia, SmokePing, Cacti, Graphite and Splunk for monitoring.

Coordinated the installation of dark fiber between Color's Palo Alto HQ and Equinix SV8 to enable an eventual hybrid public/private cloud configuration.

2009 - 2011  Software Engineer  Apple, Inc.
Founding member of the iTunes Site Reliability Engineering team. Created a weekly on-call rotation for the team to handle and triage issues requiring the attention of iTunes Store Engineering.

Developed software to identify performance issues in the iTS applications and network infrastructure. Findings from these new monitoring systems resulted in immediate widespread rehoming of servers, network capacity augmentation, and later the complete redesign of the network in the primary iTunes datacenter, altogether saving the Store from congestion collapse.

Held and attended weekly and monthly meetings to ensure network continuity for other teams, including MobileMe (now iCloud), Video Island and others.

The monitoring system now runs over 2400 SNMP queries/second and sends over 2000 ICMP echo requests/second to maintain fine-grained visibility into network health. Interface counters are monitored at the practical limits of the hardware -- typically 15-second intervals.

2009  Sr. Network Architect  la la media, inc.
Lala was the leading web-based social music player, and iTunes' strongest competitor. We enjoyed rapid growth owing to deals with Facebook (with the first non-virtual good purchasable through the Facebook Gift Shop), Google (as the first & only full song player embedded in Google SERPs), and through our eventual acquisition by Apple.

I executed an in-place redesign of the network, achieving these goals while maintaining full uptime:
- Foundry SI4G to F5 BIG-IP LTM load balancer cluster upgrade
- 100M to 1G upgrade for all 120 server downlinks
- Full network redundancy down to each server (with Linux bonding driver, in active-backup mode)
- Removal of dependence on NAT for all critical high-bandwidth flows
- 1G to 8x1G upgrade for all backbone links
- 3G to 61G transit capacity upgrade

Actual network usage scaled over 100x up to a peak of 17.6Gbps and 100,000 concurrent audio streams.

2008 - 2009  Sr. Network Engineer  AOL LLC

2006 - 2008  Sr. Network Engineer  Bebo, Inc.
Bebo was a leading social network and top 100 website with over 50 million users and 5Gbps of user-facing traffic. I joined as the 2nd member of Ops (making it a team), and helped grow it to an international team of 13, while focusing primarily on (and solely responsible for) design, capacity planning, evaluation, purchasing, deployment and ongoing management and monitoring of the network infrastructure (primarily routers, switches and load balancers) and services (including DNS, content delivery, transit, partial transit, paid peering & settlement-free peering).

Notable projects include:
- flattening out of two firewall layers, quadrupling effective throughput
- renumbering of live interdependent systems out from behind NAT
- migration from Cisco ACE to NetScaler MPX series
- development of abuse (spam, DoS, phishing) detection and prevention systems
- upgrading from 1G to 10G infrastructure
- migration of static UGC delivery to CDN -- including creating cost-benefit analyses, vendor selection & negotiation, running A/B tests utilizing success criteria, internal education & documentation
- migrating dynamic content delivery to Akamai DSA
- migrating DNS hosting to Akamai eDNS (but staying as stealth master)
- initial IPsec-tunnelled corporate network setup for three offices, and a few moves & upgrades; later integration with AOL
- integration of production network with AOL (ATDN)

2005 - 2006  Principal Network Engineer  CollabNet, Inc.
2002 - 2005  Sr. Network Engineer  CollabNet, Inc.
Led a worldwide group of three, responsible for all aspects of networking, information security and telephony at the ~230-employee ASP.

Designed and implemented a Cisco-centric data and video telephony architecture to replace Foundry, NetScreen, Tiara, HP, Nortel, Bay and ShoreTel equipment. The 137-device network supports 3458 endpoints (of which 190 are phones) at 8 locations.

Set up and created tools to simplify network management. The unified JumpStart and Kickstart system now reconfigures switchports pre- and post-build, and each machine has been receiving its own portable /30 and layer 4 ACL. (ACL maintenance also being automated to an extent.)

Brought up 5 new locations and completed 5 moves. Acquired initial two /22 allocations from ARIN. Acted as an escalation point as needed for systems issues.

2000 - Present  Playa Technical Operations  Black Rock City LLC
Engineering the network backbone at the annual Burning Man festival, used primarily for critical functions of the LLC (such as ticket validation and emergency services), but also by a portion of the 50,000 participants. Part of a three-member core team. Solely responsible for the automation, monitoring and auditing tools, which help us rapidly and safely execute the network bringup (and ensure it all stays up). Also maintaining the office network at the San Francisco HQ.

1999 - 2001  Sr. Network Engineer  Critical Path, Inc.
Responsible for the ongoing design and implementation of a rapidly evolving network during acquisitions and sudden growth periods. Expanded the network from 2 to 24 routers, 10 to 800Mbps in external traffic, and 1 to 9 datacenters. The network then supported 31 million hosted mailboxes on 1 million domains.

Eliminated single points of failure by configuring HSRP, migrating from static routing to OSPF and iBGP, and implementing a hierarchical LAN fabric with a switch failure affecting, at worst, one half rack of front-end hosts.

Managed ARIN address requests, later documenting the process and developing tools to automate it. Obtained two /14s and a /15.

Implemented and maintained default-deny layer 4 ACLs for corporate headquarters and all datacenters. Conducted extensive security audits.

Assisted large customers in defining connectivity requirements and implementing VPNs and leased lines.

Interviewed, trained and supervised new staff.

Installed BSCW to allow cooperative maintenance of Ops documentation. Created and maintained IP allocation maps, Visio diagrams, vendor contacts, peer contacts, circuit information, policies, processes & procedures, on-call schedules, etc.

Measured flow statistics with cflowd and FlowScan to aid in capacity planning, peering development, and tracking of DoS attacks.

1998 - 1999  Systems Administrator  Critical Path, Inc.
Primarily responsible for developing, implementing, documenting and maintaining monitoring software and systems -- basically from scratch. Among the tools created is a TCP port health checker -- still in use today, testing at least 2500 ports on 1500 hosts every 10 seconds. I also extended snmpmon and reach (earlier projects of mine) for use at Critical Path.

Ran (and acted upon) reports for capacity planning. Held classes for the NOC. Maintained production services on Solaris and FreeBSD servers and took part in the weekly on-call rotation.

1997 - 1998  Network Technician  Priori Networks, Inc.
Worked with telcos and colocation vendors to provision frame relay, DS-1, DS-3 and ATM DS-3 circuits. Turned up peers, customers, transit; diagnosed and fixed routing problems on Cisco 2500 and 7500 series routers.

Built and maintained production Solaris 2.5.1, FreeBSD 2.2.7, Windows 95 and Windows NT Server 4.0 systems (mail server, name servers, news server [both Diablo and NNTPRelay]; monitoring, NOC machines).

Developed and maintained various, mostly home-grown real-time monitoring systems -- tracking packet loss, traffic (including network-, ASN- and port-specific reports & graphs from NetFlow statistics), BGP sessions, OSPF neighbor adjacencies, as well as router configuration diffs, CPU utilization and uptime.

Created a system to synchronize BIND configuration files; was responsible for further development and maintenance of DNS.

1996 - Present  Proprietor  Ethereal Networking
Provide Internet access (shell accounts, FTP/WWW/e-mail virtual hosting) and consulting services (C, perl & Tcl programming). Notable past projects include an IRC to web gateway for Real Networks and CGI/database work for the Special Needs Project (www.specialneeds.com).

1995 - 1996  Technical Support  DiaCom Technologies
Maintained Windows 95, Windows NT and Mac OS systems. Set up and tested presentations for prospective clients and investors. Managed Ethernet, LocalTalk network and file servers.
Researched cost-effective solutions for part-time and dedicated Internet access. Set up remote dial-in access for telecommuting. Responsible for quality assurance, technical support and documentation (for employee/customer use) of custom database software. Administered the automated backup system; created and documented scripts and DOS batch files.

1994 - 1995  Webmaster  Marathon Central
Coordinated with several Marathon-related web page maintainers to form Marathon Central (www.marathon.org), which quickly became known as the leading source of information for the then-popular Macintosh game, serving over 2,500 software downloads per day.

Education

2006: Completed UC Berkeley Extension course in Project Management
1997: Completed UC Berkeley Extension course in C++ at Menlo College
1995: Graduated from Beach High School, Santa Cruz, California