Do It With Drupal: New York Senate

Wednesday, December 9th, 2009

Background

  • Transforming an anachronistic organization with Drupal
  • Under Republican Party control for 44 years
  • Never had a CIO before January 2009 – focused on internal enterprise IT before
  • People were cutting out and pasting articles from papers, scanning them, printing them, and distributing these reams of paper to offices every day – 1.5 million/year
  • CRM (constituent relationship management) – command-line type system
  • Intranet 1.0 – publishing info, no collaboration
  • Desktop PCs
  • Email 1.0 – intranet only, can’t work from home
  • Managing our own data center – not a core competency, but we do a reasonable job

NYSenate CIO Mission

  • Transparency
  • Efficiency – more effective, less cost
  • Participation – give people a participatory role in government
  • Modeling ‘best tech practices’ for legislative bodies
  • Organize/share data internally/externally, improve internal/external communications

Site dissection

  • No staff with web development experience in January; started out w/ consulting firm
  • Built by April, launched in May
  • Had to train hundreds of staff people to use it as content creators
  • RSS feeds, Twitter, Facebook
  • Popular/e-mailed/commented content, events, press releases/blogs/news clips
  • Almost 100 sites in one: 62 mini-sites for senators, 40-ish mini-sites for committees, issues/initiatives, legislation, open senate, about, photos & videos, newsroom
  • Previously, used proprietary CMS and external vendor – one party got better sites than the other, even with taxpayer dollars covering everything
  • Senator directory – shows RSS/Twitter/Facebook (when available – been actively promoting this)
  • Senator pages: they stand on their own, all the info about the senator, he can post news releases/blog, news clips related to him, videos, RSS/Twitter/Facebook
  • Senators can create stories with visuals for their pages
  • Committees – each has its own stand-alone mini-site, with chairs, sign-up for newsletters, updates, video archive of meetings (will be live streams in January)
  • Submitting testimony on-line available in January
  • Issues & initiatives – marriage equality (aggregated all content from site), PSA (information about the census)
  • OpenLegislation: information should be freely available, searchable, sortable, permalinks
  • Open Senate initiative: OpenData (administrative info, how much who gets paid, what gets spent on what, etc.)
  • Data available in different formats – PDF, CSV, TXT, XLS, DOC
  • Contact forms for senators individually and for the site in general (press inquiry, webmaster)
  • Photos and videos – recording and, soon, livestreaming everything
  • Also available on YouTube; audio available on iTunes
  • Working on adding automated transcription
  • Blogger who works in the “newsroom” to create web-friendly content/press releases for the site

Modules

  • 131 modules + core required: activism, petition, administration, gmap/location modules, content templates, interrelated date & calendar, imageAPI/imagecache, and more!
  • Views: home page image carousel, event calendars, video/photo galleries, press releases, petitions, senators’ pages
  • CCK: constituent stories, senate districts, events, expenditure reports, photos, polls, press releases, video, senator, committee
  • 19 custom modules – custom views/blocks for the most part, permissioning system for Office and Web Editors
  • Upcoming: distributed authentication, ideas crowdsourcing, unified commenting
  • Working on implementing SOLR search – Acquia is now hosting our site as of today, we’ve so far been using native Drupal search
  • Embedded Media Field for video

Integration with other applications, social web

  • 15,000 viewers on livestream.com for marriage equality debate
  • Social bookmarking for all content on the site
  • Some senators are using Facebook well and having open discussions with their constituents
  • nysenate.gov was a re-branding; we now use “nysenate” for everything
  • API so developers can take any of our open data and do things with it
  • Haven’t made a final call about whether to keep using Disqus (external product) for commenting, or use Drupal’s native commenting (there’s a lot of configuring to do to get the seamless experience we want)
  • Sign up for updates about anything on the website; integrating w/ Bronto for e-mail blasts
  • Voting content up and down – needs to be elegant and incredibly easy, using a 3rd party solution right now and themed it like the main site

Everything else

  • New hosting – don’t have the resources to host something like this; now moved to Acquia
  • New domain name – wanted .gov to force the issue of what you can/can’t say (previously, it’d been used to say partisan, sometimes nasty things)
  • New policies (content creation, copyright, privacy, TOS, release of data, permissions)
  • New processes (requirements gathering, quality assurance – people who had previously done phone service or legacy systems, content creation workflows)
  • New talent (previously didn’t have any web developers in-house, consulting contracts, staff)
  • New tools (videoconferencing, IRC Chat, Central Desktop- lightweight project management, Redmine- bug/feature tracking, ticketing tasks)
  • New training materials
  • New communications/PR

Guidelines & miscellanea

  • No political or campaign information – conveniently, with .gov we’re not allowed to
  • Copyright policy – states can assert copyright if they want, but we went for CC BY-NC-ND for most things
  • Privacy policy – mirrored White House
  • Terms of participation – also mirrored White House
  • Post all code to Github
  • Use Daylife.com for replacement to paper clipping system
  • Hope that other legislative bodies will be able to reuse code
  • Had an Unconference (CapitolCamp) to hear what people think – some people were excited to pitch in, do things with API

Questions & feedback

  • Node Bulk Operations could be helpful
  • Had to take screenshots for a while to allow very non-tech-savvy senior people to see private things without the risk of them doing anything wrong with it (finding a better way for this)
  • Feedback from senators has been all over the map – actually the inverse of expected, where more Republicans were early adopters even when they weren’t saying nice things about it in public
  • More Republicans were effective using Twitter and Facebook, more internally organized to identify opportunities and make the most of them collectively
  • Senators are learning that by making content easy for others to see and share, related content gets more views too
  • Google Analytics stats available for all senators; special reports around particular events
  • 1.5 million page views a month; on a big day, 50,000 unique views (marriage equality)
  • 40-50 comments on a hot bill
  • Not massive, shouldn’t cause major performance headaches, but we had to do this in such a rush that we have a lot of refactoring to do to make sure it holds up okay under stress
  • If there’s something broken, blogs publish screenshots – we have to be very vigilant
  • Want to make custom modules available; just haven’t had the bandwidth, just have a code drop on github for now
  • Building relationships with CIOs of various state agencies – some of them have a lot more developers
  • PDFs have been the traditional publication format, including scanned documents; we’ve maintained that format for most data to accommodate the “I want to download and print” crowd – only last week got wifi in capitol building
  • For born-digital content, making it available as feeds in ways that will make it easier for people to use
  • More and more federal work being done in Drupal (whitehouse.gov); a couple state entities have put up rudimentary sites (liquor authority for state of New York)
  • Contacted mostly about policy issues for other states – comment moderating, copyright
  • Big national open data initiatives – community of practice around government transparency
  • Haven’t sat down with whitehouse.gov Drupal developers to talk about roadmaps yet – we feel overwhelmingly busy right now
  • Third party to compare roadmaps, sort out implications for working together? It’s a major undertaking
  • Sunlight Foundation – encourages getting data out in mashable form; they give us feedback
  • Some senators have gamed the system by getting people to e-mail things they post so it gets on the “most e-mailed” list – this upsets other senators

@ahoppin
@NYSenateCIO
NYSenate.gov/department/cio
Hoppin – at – Senate.State.NY.US

Do It With Drupal: Anatomy of a Distribution: Open Atrium

Wednesday, December 9th, 2009

  • Open Atrium is a “team portal in a box” (AKA Basecamp alternative)
  • Can be behind a firewall, is free, openatrium.com
  • Putting people in different groups
  • Comes with six features:
    • Blog: turned on/off on a group-by-group basis
    • Wiki
    • Calendar- iCal feeds too
    • Shoutbox – like private Twitter
    • Case Tracker – ticketing system
    • Group dashboard
  • 75,000 downloads since July 17
  • translate.openatrium.com – 31+ languages translated to various extents; get updates that don’t overwrite your custom updates

What are people doing with it

  • Basic project management tool set
  • Sprite-based theme (5.5 kb, 13.7 kb)
  • Tailoring the system to your own needs
  • Drupal Core, modules, plus Features module power Open Atrium
  • People can customize their own dashboard
  • Cross-posting to different groups disabled; also, Organic Group configuration much more simple (clear distinction between public and private)

Migrating into Open Atrium

  • It’s just a Drupal site, so in theory you can turn on the Open Atrium modules around your existing site (but this isn’t suggested) – use some other way (Feeds module?) to aggregate existing content and put it into the new framework
  • Migration is a solvable problem, but probably not in a generic way useful for the core project

Extended features

  • Project status – time tracking and approval flow for a web shop
  • World Bank did a highly customized version; integration with Lotus Notes – their own intranet behind a firewall; faceted search across their pre-existing staff directory; extended events system to help with scheduling
  • Some custom coding went into the World Bank site, but a lot of what goes into it comes from configuring existing modules

How we use it

  • Over 50% tickets
  • Use blog instead of e-mail for the most part

Atrium’s rules

  • Works out of the box
  • At least as simple as running straight from drupal.org
  • Once you install it, it’s clear what the next step is – unlike Drupal, where you install it and wonder “what now?”
  • Works with Aegir
  • Doesn’t hack core or contrib (except occasionally- there’s a hack to Views that makes it translatable)
  • Doesn’t do everything – does a few things that are widely useful for intranets, and you can extend it

Things we’ll never do

  • Add a WYSIWYG; BUT, you can do that
  • Add CVS integration (but see features.blackstormsstudios.com)
  • Add Alfresco integration – but someone else has tried this
  • Investing some time in Google Docs integration
  • Won’t ever clone Basecamp – but someone wrote a theme that looks a lot like it (drupal.org/project/atrium_simple)
  • Add SharePoint integration to base package

Things we will do

  • Clearer branding- Drupalisms & Atriumisms beware!
  • Drag and drop dashboards (vimeo.com/7643255)
  • Better admin experience (drupal.org/project/admin)
  • Pluggable search
  • Improved l10n support- Drupal only supports one language at a time, we want to fix this
  • Rewriting core functionality – upgrading to Context and Spaces, when we say “beta”, we mean it
  • Rework the “user space”
  • A calendar with a user story
  • Rewrite Case Tracker – this powers the to-do system, people want to customize the states cases can be in, kinds of cases, etc. (github.com/miccolis/casetracker)
  • This is going to be painful, we’ll provide upgrade paths
  • Move to drush make (drupal.org/project/drush_make)
  • New on drupal.org: install profiles: lists of things that, all together, make a site

Do It With Drupal: The Economist

Wednesday, December 9th, 2009

Rob robpurdie@economist.com – Scrum Practice Leader
twitter.com/robpurdie
facebook.com/robpurdie

Overview

  • Moving incrementally and iteratively to Drupal- making improvements as you move bit by bit
  • User comments and recommendations served from Drupal, along with comment history pages, article comments pages
  • Syncing data to Drupal every 5 minutes– all content and comments
  • Soon, article pages served from Drupal– running into a few performance problems
  • Next: channel pages served from Drupal, third-party services, registration
  • We benefit from Drupal sooner by taking this approach; rather than building the whole site in the background and not benefitting until the end, this way we benefit from improved functionality sooner
  • “The Economist is so old that the guy who started it had to be painted rather than photographed”

The old way

  • 20-30 mil page views, 3-4 million unique visitors per month – lots of performance and scalability issues
  • Want to build the foremost destination online for analyzing and debating global agenda; want to bring visitors into that debate; current system isn’t enough to support this vision, that’s why they moved to Drupal particularly for comments
  • Increase publishing volume with user-generated content (more content w/o more costs)
  • The old way: custom CMS built on proprietary stack (MS, ColdFusion, Oracle)
  • Blogs were originally MovableType, now are all Drupal
  • Broken waterfall processes meant frequent fire-fighting
  • Needed to be more responsive to change, deliver business value sooner (projects take a long time to deliver value to organization), more sustainable, happier
  • Making these changes incrementally and iteratively; “perfect is the enemy of better”

Why Drupal?

  • Looked at OpenCMS, Alfresco, Joomla, met with other newspapers, considered building a custom system, buying a proprietary system, or going open source
  • Drupal as strategic fit: community and content publishing, robust development framework, development language, free software
  • Strength of Drupal community
  • Selling Drupal internally was a challenge: no suit-wearing Drupal sales force
  • Attended DrupalCon Boston 2008, networking within community, engaging w/ Lullabot for workshops and training
  • Proof-of-concept to reproduce article page in Drupal; how to use CCK fields to make a rich article content type

Using Scrum

  • 3 million registered users, articles – data migration is daunting
  • Manage the move using Scrum – selling it was easy with charts (developing business value sooner and throughout, management can see progress throughout, shining a spotlight on issues/dysfunction and attacking them along the way – risk decreases a lot faster)
  • Take requirements, prioritize based on business value: which are the most important to organization, do those first
  • Trained management team in Scrum, development team in Drupal, then started sprinting with help from consultants (2-week sprints, delivering something of value at the end)
  • “Maybe not the largest Drupal project, but the most expensive” – lots of consultants

Integrating CMSs

  • Proxy approach: Drupal sends JSON over HTTP back and forth with the existing ColdFusion system
  • Using native Drupal comments; comments have to be attached to nodes – there has to be a node for every piece of content on the legacy system
  • Create nodes on the fly for every ColdFusion request that comes in
  • Notion of proxy nodes is a pattern that comes up during integration of Drupal with other systems
  • Voting API votes used for recommends; these are also attached to proxy nodes
  • Started with proxy approach only; then moved to doing some with subdomain approach – hope to be doing neither soon after moving entirely to Drupal
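
The proxy-node idea above can be sketched in a few lines. This is only an in-memory illustration in Python, not The Economist's code and not Drupal's API: `ProxyNodeStore`, `get_or_create`, `add_comment`, `recommend`, `handle_request` and the `legacy_id` field are all hypothetical names. The point is that a node is created the first time a legacy item is referenced, and that comments and recommends hang off that node.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ProxyNode:
    """Stand-in for a node that mirrors one legacy content item."""
    nid: int
    legacy_id: str
    comments: list = field(default_factory=list)
    recommends: int = 0


class ProxyNodeStore:
    """Hypothetical store that creates a proxy node on first sight of a legacy ID."""

    def __init__(self):
        self._by_legacy_id = {}
        self._next_nid = 1

    def get_or_create(self, legacy_id):
        # Reuse the proxy node if this legacy item has been requested before.
        node = self._by_legacy_id.get(legacy_id)
        if node is None:
            node = ProxyNode(nid=self._next_nid, legacy_id=legacy_id)
            self._by_legacy_id[legacy_id] = node
            self._next_nid += 1
        return node

    def add_comment(self, legacy_id, text):
        # Comments attach to the proxy node, not to the legacy record.
        node = self.get_or_create(legacy_id)
        node.comments.append(text)
        return node

    def recommend(self, legacy_id):
        # Recommends (votes) are counted on the same proxy node.
        node = self.get_or_create(legacy_id)
        node.recommends += 1
        return node


def handle_request(store, payload):
    """Answer a legacy-system request (a JSON string) with comment data as JSON."""
    request = json.loads(payload)
    node = store.get_or_create(request["legacy_id"])
    return json.dumps(
        {"nid": node.nid, "comments": node.comments, "recommends": node.recommends}
    )


if __name__ == "__main__":
    store = ProxyNodeStore()
    store.add_comment("article-1001", "Great piece.")
    store.recommend("article-1001")
    print(handle_request(store, json.dumps({"legacy_id": "article-1001"})))
```

In a real integration the store would be the Drupal database and the request would arrive over HTTP; the sketch only shows why every legacy item ends up needing a node.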

Migrating data

  • Migrating and syncing data every 5 minutes – don’t wait until the end to figure out that piece
  • Table Wizard and Migrate modules
  • Table Wizard writes Views integration for MySQL tables
  • Migrate lets you migrate certain views, push into Drupal as nodes/users/taxonomy terms/etc
  • Client is involved in how legacy data gets organized in Drupal
  • Sat down with client to browse through content and decide what data needs to be moved and what it means
  • Migrate keeps track of everything you’ve done, gives you a dashboard, tells you how far along you are – keeps a mapping table, legacy ID, you can check and see what came across and fix things; does your bookkeeping for you
  • Drupal expects to have all the info it needs in its database; something getting published in Oracle needs to be in Drupal promptly – synchronization
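
The bookkeeping described above (a mapping table keyed by legacy ID, plus a repeatable sync) might look roughly like this. It is a hedged sketch in Python, not the Migrate module's API: `LegacyRow`, `Migrator`, `sync` and the `updated` timestamp field are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LegacyRow:
    legacy_id: int
    title: str
    updated: int  # hypothetical "last modified" counter on the legacy side


class Migrator:
    """Keeps a legacy-ID mapping so the same sync can be re-run safely."""

    def __init__(self):
        self.mapping = {}      # legacy_id -> destination id
        self.destination = {}  # destination id -> stored record
        self._next_id = 1

    def sync(self, rows):
        """Import new rows, refresh changed ones, and report what happened."""
        created = updated = 0
        for row in rows:
            dest_id = self.mapping.get(row.legacy_id)
            if dest_id is None:
                # First time this legacy item is seen: allocate an ID and map it.
                dest_id = self._next_id
                self._next_id += 1
                self.mapping[row.legacy_id] = dest_id
                created += 1
            elif self.destination[dest_id]["updated"] >= row.updated:
                continue  # already up to date, nothing to do
            else:
                updated += 1
            self.destination[dest_id] = {"title": row.title, "updated": row.updated}
        return {"created": created, "updated": updated, "total_mapped": len(self.mapping)}


if __name__ == "__main__":
    migrator = Migrator()
    print(migrator.sync([LegacyRow(10, "First article", 1), LegacyRow(11, "Second", 1)]))
    # A later run only touches the row that changed on the legacy side.
    print(migrator.sync([LegacyRow(10, "First article (edited)", 2), LegacyRow(11, "Second", 1)]))
```

Because each run is idempotent, running it on a schedule (every 5 minutes in the talk) keeps the destination close to the legacy system without re-importing everything.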

Questions

  • How did you decide what to put into Drupal first?
    • Business value: comments, user profiles, recommends
  • How many Drupal servers does it take to scale that big?
    • Not entirely sure how many servers we have; let’s say +/- 12
    • Master MySQL server, a few slave MySQL servers – more important aspects have to do with Pressflow
    • Pressflow = high performance variant of Drupal 6, completely API compatible with Drupal, but it takes some patches that are in Drupal 7 and moves them in to Drupal 6
    • Use Varnish’s full capability; Varnish = reverse proxy server, takes load off Drupal/PHP/MySQL
  • How do you stop people from trying to shove their emergencies into Scrum process?
    • Don’t want people going directly to the team like they traditionally do
    • Team, Scrum Master, product owner – customer, person who represents the client, has to have power to make decisions on behalf of organization, responsible for managing stakeholders
    • Product owner comes to team w/ prioritized list of features for next sprint
    • Had two teams in New York and one team in London all doing 3-week iterations in parallel
    • Split up site into component parts: profiles, article pages, channel pages, had three product owners who had to manage stakeholders
    • Works reasonably well; now we’re doing two teams, one system that shows what all teams will do; someone has to keep “product backlog” in order, stopping people from shoving in their “one little thing”
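
On the scaling answer earlier in this list: the reason a reverse proxy takes load off Drupal/PHP/MySQL is that repeat requests are answered from a cache and never reach the backend. The toy Python sketch below illustrates that idea only; it is not Varnish configuration, and `CachingProxy`, `backend`, `ttl_seconds` and `clock` are hypothetical names (the `clock` parameter exists so the expiry logic can be tested without waiting).

```python
import time


class CachingProxy:
    """Toy reverse proxy: serve cached pages until they expire."""

    def __init__(self, backend, ttl_seconds=60, clock=time.monotonic):
        self.backend = backend      # callable that renders a page for a path
        self.ttl = ttl_seconds
        self.clock = clock
        self._cache = {}            # path -> (expiry time, body)
        self.backend_hits = 0

    def get(self, path):
        now = self.clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]         # cache hit: backend is not touched
        # Cache miss or expired entry: ask the backend and remember the answer.
        self.backend_hits += 1
        body = self.backend(path)
        self._cache[path] = (now + self.ttl, body)
        return body


if __name__ == "__main__":
    proxy = CachingProxy(lambda path: "rendered " + path, ttl_seconds=30)
    for _ in range(1000):
        proxy.get("/article/1")
    print("backend hits:", proxy.backend_hits)
```

A real proxy also has to decide what must never be cached, such as pages for logged-in users, which this sketch ignores.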

Features

  • Base theme is 960 px grid – laying out themes as a series of columns, all sections have to fit into the grid
  • Selenium for “user journey” testing; building environments to help manage configurations
  • Continuous integration using Hudson – needed a shared place where user tests could run
  • Set of servers running on Amazon; Hudson sets off user tests every time there’s a commit to the SVN repository
  • Apache SOLR search hosted by Acquia- 100,000 articles that have to be available through site search
  • People were unhappy with relevance of matches in old site search
  • Acquia’s hosted search service: really fast, good results
  • Apache SOLR: can start filtering results further and further – faceting
  • “How do I get SOLR running on my website?” – can self-host, but we went with Acquia
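
Faceting, as mentioned above, means showing counts per field value for the current result set and letting the reader narrow the results step by step. The Python sketch below shows the concept on plain dictionaries; it does not use Solr or its API, and `facet_counts`, `apply_filters` and the `section`/`year` fields are made up for illustration.

```python
from collections import Counter


def facet_counts(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(doc[field] for doc in docs if field in doc)


def apply_filters(docs, filters):
    """Keep only documents that match every selected facet value."""
    return [
        doc for doc in docs
        if all(doc.get(key) == value for key, value in filters.items())
    ]


if __name__ == "__main__":
    articles = [
        {"title": "Banks", "section": "Finance", "year": 2009},
        {"title": "Bonds", "section": "Finance", "year": 2008},
        {"title": "Genes", "section": "Science", "year": 2009},
    ]
    print(facet_counts(articles, "section"))
    # Narrow to one section, then facet the remaining results by year.
    finance = apply_filters(articles, {"section": "Finance"})
    print(facet_counts(finance, "year"))
```

A search server computes these counts alongside the full-text query, which is what makes it practical at the scale of a large archive.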

Questions

  • Other tools for managing people/process?
    • In Scrum, less about resource management – we just want dedicated co-located teams, don’t worry about availability because of multiple projects: single focus
    • Redundancy of function – generalizing specialists, specialists can create bottlenecks/risks
    • “How many people need to be hit by a bus before your project fails?”
    • agilemanifesto.org
    • Use Google Docs a lot – project backlogs are all spreadsheets, a big wiki, project dashboards that “radiate information to the rest of the organization”
    • Focus is on people, not tools
    • Test-driven development, writing tests first can sometimes be hard with Drupal

Impediments to progress

  • Previous processes/structure/culture: command and control – hard habit to break
  • Project manager telling people what to do and when to do it by – this is bad management; it has an impact on people
  • We want self-organizing teams
  • Previously, black box development: low visibility during the project process
  • For Scrum, everything needs to be transparent, frequently inspect outcomes, adapt as we go – can’t have a postmortem after everything’s done, need to do that every day
  • Hero developers who go off and solve problems heroically aren’t compatible with Scrum
  • Previously, developmental silos – departments based on function, these have been removed, but people still want to exist within their old silos
  • People want to work on multiple projects like they used to, rather than working on a single project in a dedicated manner
  • Previously, traditional line management: where you stack up in the line doesn’t matter now, this was a big change
  • Engineering practices (specifically quality) – big issue; Scrum is a wrapper for your existing engineering practices, doesn’t say anything about testing
  • Scrum assumes your engineering practices are great, or you’ll make them great quickly
  • You can say “we’re going to do Scrum” but old habits die hard – focusing on what “done” means and providing a deliverable at the end of each sprint, have to deliver quality too– have to go live successfully
  • Want to deliver “potentially shippable code” at the end of each session – have to have a testing environment that’s representative of live environment; been bitten by differences in configuration
  • Everything has to be identical in the test environment (just with a scaled down number of servers) – same data center, same network issues, etc
  • Hard to bite the bullet on the costs involved in building a testing environment, but it’s important
  • Hard to simulate kinds of traffic you get in production – plus, have to keep track of session cookies
  • Form fields can hurt you – replaying post requests
  • Cron jobs that run all the time – cron jobs can stack up and site starts to decay
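
One common guard against cron runs stacking up, as described in the last point, is to skip a run while the previous one is still going. The Python sketch below uses a lock file for that; it is an illustration under assumptions, not what The Economist did, and `run_exclusively` and the lock path are hypothetical.

```python
import os
import tempfile


def run_exclusively(lock_path, job):
    """Run job() only if no other run holds the lock file; return True if it ran."""
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # previous run still in progress: skip rather than pile up
    try:
        job()
        return True
    finally:
        # Release the lock even if the job raised an exception.
        os.close(fd)
        os.remove(lock_path)


if __name__ == "__main__":
    lock = os.path.join(tempfile.mkdtemp(), "cron.lock")
    ran = run_exclusively(lock, lambda: print("doing periodic work"))
    print("ran:", ran)
```

A lock file left behind by a crashed process would block every later run, so a production version would also need a way to detect and clear stale locks.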

Questions

  • Migration of real-time data: code changes are easier to migrate than content changes, what’s the process for moving bits of content from development to production?
    • When there’s content you need to work on for a while before it goes live, work on the live servers but make sure end-users can’t see it
    • Can use the unpublished flag on a Drupal node to do that; use “views” to see everything unpublished in sports category
    • For a small team, that’s a reasonable solution
    • For bigger organizations with a lot of people working together, use “Workflow” module – nodes step through a series of states
    • If it’s a business requirement that content has to start off on staging servers and only then push to live, use module “Deploy” – push-button way to push nodes and their dependencies– users, taxonomy terms, etc– to another environment
  • Technical reason for using external searching – why use SOLR at all? What about Drupal search?
    • Drupal 6 is better than previous search mechanisms, but falls apart at a certain scale
    • Slow queries, sub-optimal results
    • A lot of non-Drupal people have worked on Apache SOLR, Drupal has integrated it well
    • Self-hosting, or with Acquia – if you have the talent to run Java apps in your data center and keep it running, self-hosting is a great idea; will reduce latency
    • Most of us are struggling to keep PHP/MySQL up as it is, this is where Acquia comes in
    • Acquia service is pretty much plug-and-play
    • Built-in search doesn’t come with facets; can add on facets with the “Faceted Search” module
    • SOLR is an enterprise search system; used by Netflix, Expedia, etc.
  • Could you use Views instead of facets?
    • There’s a lot of overlap there, and different possible approaches.
    • Full-text searches need SOLR rather than Views
  • Some of the wins you’ve had with Scrum/Drupal, and some weaknesses
    • Wins by development teams – prefer this way of working, where business people are only concerned with relative priority of requirements, have no say in how long it takes to implement
    • Product owners prioritize “stories”, developers size those stories relative to each other, rather than in hours of effort
    • Stops the cycle of cutting corners on quality in order to get it done in a shorter timeframe
    • Can’t get productivity gains w/o changing the way you work
    • Product owners need to be involved, can’t change requirements mid-sprint
    • Have “working agreements” – a kind of social contract
    • Scrum isn’t a prescription – you can pick and choose the parts that you want that meet your organization’s needs
    • Specific processes layered on top of simple framework of transparency, working together, and adapting to testing results, can vary
  • When will the Economist be fully on Drupal?
    • Description says “this month” – that was the plan
    • People paying the bills get to make decisions; is it most important for us to go all-Drupal ASAP, or extend functionality of site to be competitive?
    • Recent decision was for the latter
    • Don’t know when
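
Returning to the earlier answer about content that steps through a series of states before going live: the idea can be sketched as a small state machine. This Python example is illustrative only and is not the Workflow module's API; the state names, `ALLOWED`, `Node`, `move_to` and `visible_to_visitors` are assumptions.

```python
# Hypothetical editorial states and the transitions allowed between them.
ALLOWED = {
    "draft": {"review"},
    "review": {"draft", "published"},
    "published": {"draft"},
}


class Node:
    """Piece of content that starts unpublished, in the draft state."""

    def __init__(self, title):
        self.title = title
        self.state = "draft"

    def move_to(self, new_state):
        # Refuse jumps that skip a step, e.g. draft straight to published.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot move from {self.state} to {new_state}")
        self.state = new_state


def visible_to_visitors(nodes):
    """Only published nodes are shown to end users."""
    return [node for node in nodes if node.state == "published"]


if __name__ == "__main__":
    story = Node("Election night preview")
    print(len(visible_to_visitors([story])))  # still hidden while in draft
    story.move_to("review")
    story.move_to("published")
    print([node.title for node in visible_to_visitors([story])])
```

The simpler unpublished-flag approach from the same answer is the two-state special case of this: hidden or visible.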