Open Standards, Open Source, and the Public Good

Richard P. Gabriel & Ron Goldman

Executive Summary

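The move to e-government presents a unique set of challenges. Governments and public-sector organizations have a responsibility to the public to permanently maintain public records, continually provide essential services, and guarantee the security, accuracy, and auditability of recorded information. [1]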

Although computers and digital media provide flexibility, speed, and accuracy, they don’t support these responsibilities well—at the moment. While people can read paper records dating back hundreds of years, information stored by one computer program often cannot be read by another. Software providing essential services can stop working when the computer hardware it was designed for is replaced, and the company or organization that originally wrote that software is no longer available to replace it. New security holes in major software packages are discovered regularly, forcing customers to install updated versions or be vulnerable to outside attacks.

Governments and public-sector organizations also need to be concerned with total cost of ownership (TCO), compatibility and integration with other applications, and ease of replacement when procuring computer hardware and software.

To meet these concerns and best support the responsibilities mentioned above, governments and public-sector organizations should require software that is built upon open standards. When a software application stores information using a standard format, other programs will be able to access that information. Open standards make it possible to replace one piece of software with another when making procurement decisions.

The challenge in picking computer systems is far more complex than picking a consumable: a decision today about a type of software may well entail a commitment to that type for many decades to come. An apt comparison is the difference between choosing a railcar to purchase and determining the railroad gauge width for the tracks. The choice of railcar is primarily a short-term, economic decision, while the choice of railroad gauge has repercussions for decades or centuries.

Although open-source software seems to offer initial-cost benefits for TCO concerns—especially computing platforms like the Linux operating system and the Apache web server—open source does not necessarily address the primary responsibilities of governments and public-sector organizations; further, support, training, and maintenance costs can offset any initial-cost savings. Open-source software is software that is developed using a particular (open) process and licensed under specific (open) terms. How software is developed and licensed is of minor importance when choosing software. The major determining factors should be whether the software provides the required functionality, has a reasonable TCO, and supports open standards.

The Needs of Government

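As governments make more and more of their services available online, they face a unique set of challenges. Governments and public-sector organizations have the following responsibilities to the public: to permanently maintain public records, to continually provide essential services, and to guarantee the security, accuracy, and auditability of recorded information.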

The use of computers both facilitates and complicates this charter, as discussed below. All organizations that procure software and computer services have similar concerns, which include total cost of ownership, compatibility and integration with other applications, and ease of replacement.

These concerns are further discussed below.

Government Responsibilities

Although individuals and private organizations share with governments and public-sector organizations the same short-term concerns for cost, compatibility, and replacement, the responsibilities described here set governments apart. Those responsibilities, combined with the confusion surrounding software licensing and development techniques, make it hard to see what the best decisions are for governments and public-sector organizations.

Permanently Maintain Public Records

Longevity is a hallmark of public and societal infrastructure. Roads, bridges, and buildings are designed to be used and maintained for years, decades, and sometimes centuries. Public records are kept for at least this long. A local city hall holds records going back to the founding of the city. People can read newspapers from 150 years ago on paper and 200 years ago on microfilm. Family records go back generations. Details of ownership and taxation go back over 200 years.

Computing infrastructure is taking over many governmental and public-sector record-keeping operations. As the mass of information and data grows, computer-based infrastructure is not merely an alternative form of record-keeping, but often the only form. Unfortunately, the first 60 years of computing history do not bode well for the permanence of computer-based public information. It is generally not possible to read magnetic tapes, disks, paper tapes, and punch cards from 20 or even 10 years ago, because the industry keeps moving forward, changing the physical media, changing formats, and constantly making technology obsolete. In fact, important data is being lost every day: data formats designed and used as recently as 25 years ago can no longer be read, the software that created the data is itself stored in those formats, and even where the software code is available, it cannot run on any existing computer. To add to the dilemma, in many cases the companies that created all the pieces are long gone. One solution would be to migrate data onto new hardware, from tapes to CDs and now to DVDs. However, data migration requires knowing how the data was stored, and this may not be possible if a proprietary or obscure data format was used.
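The dependence of migration on knowing the storage format can be sketched in a few lines of Python. The record layout used here (a 4-byte parcel number, a 2-byte year, and a 20-byte owner field) is purely hypothetical, invented for illustration. The point is only that the same bytes are recoverable when the layout is published and are opaque when it is not.

```python
import struct

# Hypothetical published layout for one property record (an assumption made
# for illustration): big-endian, 4-byte parcel number, 2-byte year,
# 20-byte owner name padded with zero bytes.
RECORD_LAYOUT = ">IH20s"


def write_record(parcel: int, year: int, owner: str) -> bytes:
    """Encode one record according to the published layout."""
    return struct.pack(RECORD_LAYOUT, parcel, year, owner.encode("ascii"))


def read_record(raw: bytes) -> dict:
    """Decode one record; this works only because the layout is known."""
    parcel, year, name = struct.unpack(RECORD_LAYOUT, raw)
    return {
        "parcel": parcel,
        "year": year,
        "owner": name.rstrip(b"\x00").decode("ascii"),
    }


raw = write_record(1042, 1987, "A. Example")  # 26 bytes on the old medium
record = read_record(raw)                     # recovered on the new system
```

Without the layout string, a migration tool would see only 26 undifferentiated bytes and would have to guess at field boundaries, byte order, and text encoding.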

Continually Provide Essential Services

The services that governmental organizations currently provide, both to the public and to other organizations, will continue to need to be provided in the future. Whether it is a county keeping a roll of registered voters, an agency providing information to the public via web pages, government employees sending and receiving email, staff members writing reports and publications, or any of the myriad other essential day-to-day activities, such activities will be just as necessary in 5, 10, or 50 years. Computers play an important role in delivering these services. Unfortunately, the lifetime of a piece of software is typically only 5 to 10 years, because the hardware it runs on becomes obsolete or the requirements for the software change. This makes it necessary to plan for periodically replacing or upgrading software to take advantage of new technology and to better meet the changing needs of government.

Guarantee the Security, Accuracy, and Auditability of Recorded Information

Like any large organization, governments and public-sector organizations have a vital interest in maintaining the security of their records. Governments have a special responsibility to do so. As we are learning from email spam, computer viruses, denial-of-service attacks, and other security lapses, security is not one of the towering strengths of computing infrastructure. All the major computer operating systems (Windows XP, Mac OS X, Solaris, HP-UX, IBM AIX, and Linux) [2] received security patches in 2004. Moreover, security and other patches are routinely being made to database software, web server software, application server software, office and productivity software, and just about any software being actively developed. Security lapses routinely occur because the nature of the tools and practices with which software is created makes it inevitable that there will be errors, including errors that jeopardize security. Governments and public-sector organizations have no choice but to plan to live with and accommodate error-filled software using the tools of acquisition policy, vigilance, conservative decision making, and fallback strategies for crucial information and systems.

One way to address the fragility of software systems for governments and public-sector organizations is to require transparency of data formats and source code. Further, software systems used by governments should, in the ideal case, be accountable: an audit trail or other log of important actions and transactions performed by the software should be kept.
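As one illustration of what an audit trail might look like in practice, the following minimal Python sketch keeps a log in which each entry carries a hash of its own content and of the previous entry, so that later alteration of an entry can be detected. The hash-chaining technique, the function names, and the sample actors and actions are all assumptions introduced here for illustration; the text above does not prescribe any particular mechanism.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON rendering of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, actor: str, action: str) -> dict:
    """Append an entry whose hash covers its content and the previous hash."""
    prev_hash = log[-1]["hash"] if log else ""
    body = {"actor": actor, "action": action, "prev": prev_hash}
    entry = dict(body, hash=_digest(body))
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """True if every entry matches its hash and links to its predecessor."""
    prev_hash = ""
    for entry in log:
        body = {"actor": entry["actor"], "action": entry["action"], "prev": entry["prev"]}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical actors and actions, for illustration only.
log = []
append_entry(log, "clerk-17", "updated voter record 5531")
append_entry(log, "clerk-04", "issued permit 88")
```

Calling verify(log) on the untouched log returns True; editing the action of an earlier entry makes it return False. A production audit trail would also need timestamps, durable storage, and protection of the most recent hash, none of which this sketch attempts.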

Government Concerns

Governments and public-sector organizations share concerns with any other organization that procures software—cost being one of the most prominent today.

Total Cost of Ownership

Computers are complex, and expenses do not end when the hardware box or the software package is purchased. Total cost of ownership (TCO) is a model developed to analyze the direct and indirect costs of owning and using hardware and software. Total cost of ownership includes the initial purchase price along with the ongoing costs of upgrades, maintenance, technical support, and training.

Many factors must be taken into account when purchasing a computer system, and basing a choice only on initial investment may prove more costly in the long run. Upgrades, maintenance, technical support, and training can have direct costs, and upgrades and maintenance can be disruptive, causing indirect costs.
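The arithmetic behind this point can be sketched as follows. The model is deliberately simplified (initial price plus a fixed yearly cost, with no indirect costs or discounting), and every figure is hypothetical, chosen only to show how a lower initial price can lead to a higher total.

```python
def total_cost_of_ownership(initial: float, yearly_costs: dict, years: int) -> float:
    """Initial price plus recurring yearly costs summed over the period."""
    return initial + years * sum(yearly_costs.values())


# Hypothetical figures, for illustration only.
option_a = total_cost_of_ownership(
    initial=0,  # no purchase price
    yearly_costs={"support": 30_000, "training": 15_000, "maintenance": 20_000},
    years=5,
)
option_b = total_cost_of_ownership(
    initial=50_000,
    yearly_costs={"support": 20_000, "training": 10_000, "maintenance": 15_000},
    years=5,
)
```

With these made-up numbers, option A costs 325,000 over five years and option B costs 275,000, even though option A has no purchase price.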

Compatibility and Integration with Other Applications

Two applications or software tools are compatible if they are able to interoperate. For example, a simple text editor can be used to produce an HTML (HyperText Markup Language) file, which is then provided to the public on a web site. In this case, the text editor is interoperating with the web server. This is the simplest form of compatibility: compatibility based on using a common data format. The same format provides compatibility on the reading side as well: any web browser can display a page whose format is described in HTML. By contrast, Adobe Photoshop is not compatible with Microsoft Word: Word cannot edit a Photoshop file.
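Compatibility through a common data format can be illustrated with a small Python sketch. The "producer" below is nothing more than a string of HTML, as any text editor could write; the "consumer" is the HTML parser from Python's standard library, which knows nothing about how the page was made. The page content and the class name TextExtractor are invented for illustration.

```python
from html.parser import HTMLParser

# Producer: any tool that can write text can emit HTML.
page = "<html><body><h1>Office Hours</h1><p>Open 9 to 5.</p></body></html>"


class TextExtractor(HTMLParser):
    """Consumer: collects the text content found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


extractor = TextExtractor()
extractor.feed(page)
extractor.close()
visible_text = extractor.chunks
```

The two sides share only the format. Either one could be replaced by a different tool without the other needing to change.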

Other, deeper forms of compatibility come from two software applications being designed to work together.

For example, Windows XP has a suite of software that enables applications to create and manipulate the windows that people use to interact with the application. In this case, the application and Windows XP interoperate by design.

Choosing software and platforms that are compatible with many applications and other platforms is a benefit to governments and public-sector organizations because doing so provides greater value to the public.

Ease of Replacement

As innovation occurs, new tools, applications, hardware, and capabilities come onto the scene. Whether an organization is able to take advantage of them depends on previous choices: Perhaps the new tool will not run on the selected operating system, or the application is unable to operate on the data formats used by existing tools, or possibly the existing software has not been ported to the new hardware platform. For governments and public-sector organizations, the ability to provide new capabilities and take advantage of new innovations is crucial. In some cases it is a matter of providing what the public expects, and in others it might mean achieving significant cost savings.

Understanding Software Development

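With software playing such an important role in governmental operations, it is important to understand how it is developed and made available. Three important dimensions are whether the software is prepackaged or custom, whether it is developed in an open or closed manner, and whether it is proprietary or in the public domain.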

These three dimensions are independent of one another.

Prepackaged versus Custom Software

Software can be produced as a package to be used by many people and organizations without any (or much) change, or it can be designed and implemented to suit the needs and requirements of a particular person or organization. These are two ends of a spectrum: off-the-shelf, shrink-wrapped software at one end and custom software at the other. Common, prepackaged software applications such as Microsoft Office, Adobe Acrobat, and Mozilla Firefox (a free web browser) are purchased or downloaded, and can be installed easily, usually without requiring expertise. Custom software is written for a specific purpose either by an in-house group of software developers or a consulting firm. Examples include Amazon.com, Google.com, custom databases, custom billing systems, and custom design systems for things like aircraft engines, bridges, and computer hardware. Typically custom software requires some expertise to set up and sometimes to use. Approximately 90% of the money spent globally on software is for custom software.

Much software is a blend of prepackaged and custom software. For example, a typical database system is a combination of an off-the-shelf database customized to a particular organization and combined with a set of prepackaged and custom applications.

The advantage of customized software is that it can meet the specific needs of an organization, something that generic, prepackaged software cannot do. However, customization comes at a price: the need to pay the upfront development costs to write the software. This can entail major risk because a large number of big software projects fail outright and many more are delivered late and over budget.

Open versus Closed Development

Software development involves programmers writing source code, the instructions that the computer carries out to do what the user (a person using the software) requests. For most commercial software, this source code is available only to the employees of the company developing it; the customers purchasing the resulting program never see the source code. For open-source development, a group of developers, working as individuals or for different companies, collaborate to develop the source code, sharing both the resulting program and the source code with end users. [3] There are also intermediate approaches where the source code is made available to a limited number of individuals or organizations, for example, consortia.

Because open-source projects are typically volunteer efforts, the software is produced on loose schedules, and the only way for an organization to ensure that a change will be made to the software is to make the change itself.

For most people and organizations, having access to the source code is not important; however, for some, being able to examine and modify the source code (or to hire programmers to do so for them) means the ability to customize the software for their specific needs.

Proprietary versus Public Domain Software

The final dimension concerns who owns the software and what rights are granted to the organizations that purchase or receive it. This is a matter of licensing. At one end is proprietary software that is owned and controlled by the person, company, or organization that created it. Users of proprietary software have very limited rights as to what they can do with it. At the other end is software put into the public domain, where anyone can do anything they want with it. Open-source licenses tend toward the public domain end, but many of the most common ones, such as the GNU GPL, include important restrictions to make sure that the source code always stays available to all. Note that open-source software is owned by its authors.

Open Standards

For software to be at all useful, it must be interoperable. To print a document, a text editor program must be able to talk to the printer; to display a page, a web browser must be able to communicate with the web server. The rail gauge metaphor comes in handy when understanding the idea of open standards. Rail gauge is the distance between the rails in a railway. If every town or state had its own rail gauge, or their rail gauges were secret, it would be tough for a train to pass from town to town: the undercarriages would have to be changed, or there would need to be a complicated multi-carriage system to enable a switch from one width to another. Alternatively, moving goods from place to place would require unloading rail cars at borders and reloading the contents onto another train.

An open standard for software refers to a publicly available specification for achieving a specific task. It is a set of agreements about some aspect of a software system that concerns interoperability or compatibility. Some open standards are enforced by legal means through a standards body. For instance, the American National Standards Institute (ANSI) maintains a set of specifications for nuts and bolts. By doing this, ANSI ensures that a nut from one manufacturer fits a bolt from another.

The most important open standards are data formats. The Web depends on ISO standards for character sets, and without such a standard the Web might be sensible only within small regions that had agreed on a character set. Servers and browsers would be usable only in particular locales.
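The effect of disagreeing about a character set is easy to reproduce. In this Python sketch a sender encodes a word using UTF-8; one receiver decodes it with the same standard, and another assumes ISO 8859-1 (called "latin-1" in Python). The sample word and variable names are chosen here purely for illustration.

```python
text = "caf\u00e9"                  # the word "café"
wire_bytes = text.encode("utf-8")   # the sender writes UTF-8

agreed = wire_bytes.decode("utf-8")         # receiver uses the same standard
mismatched = wire_bytes.decode("latin-1")   # receiver assumes ISO 8859-1
```

The first receiver recovers the original word. The second gets "cafÃ©", because the two bytes that UTF-8 uses for "é" are each read as a separate ISO 8859-1 character. No error is raised, so the corruption goes unnoticed unless someone reads the text.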

The Value of Open Source and Open Standards

Open-source software offers some important advantages to the public good. It is often of high quality, it is fairly secure, and it can be customized by software developers. Because the source code is available, open source has other not so obvious advantages as well, such as providing a literature that can be used to teach students learning to develop software.

Similarly, because the source code is available, there is the possibility that some of the software developers in the open-source community will experiment with new features, new designs, and new implementation techniques. This improves the overall quality of software, the state of the art, and the public good through experimentation and discovery, much as the free market fuels innovation by encouraging diversity and choice. Because of the open-source licenses, the fruits of these experimental and derived systems are generally available under the same liberal terms, and thereby aid the public good.

Open and free licenses benefit the developer, but even more, they benefit the user, by creating choice.

The initial cost of ownership for open-source software is often lower because the software is generally free, as are upgrades. However, the total cost of ownership (TCO) can be similar to other kinds of software because support, training, and maintenance are often the bulk of TCO.

Open standards provide an essential public good. Open standards enable anyone to write code to fit into and operate with existing code. Open standards are at the heart of interoperability. With open standards, a person or organization can mix and match components within a system. Without open standards—even when the system is based on open source—a person or organization may be unable to transform data stored in old documents or databases into the formats used by other or new applications. Open standards are a way of providing full choice to the user, and thereby efficiently and effectively lowering TCO.

Finally, supporting a healthy free-market system is a public good. Open standards, by providing a venue for small entrepreneurs, are naturally a potential source for new jobs. By leveling the playing field, open standards enable innovation and encourage competition. Open standards work to the user’s benefit, and are a win/win for governments, both as users and purchasers, and as promoters of economic growth.

Software Requirements for Governments and Public-Sector Organizations

With this background, we can now provide some recommendations for how governments and public-sector organizations might approach defining their software requirements.

Prepackaged Software

Prepackaged software should adhere to open standards. Neither how the software is developed nor the licensing terms are important. Companies that compete on open standards create value without locking in a decision to use proprietary software.

Data formats should be open and standard. To preserve the longevity of public information and data, a government or public-sector organization should not be locked into a standard that no one else can use (a proprietary standard) or one that no one actually uses (open but nonstandard). Data formats for numbers, text files, and especially for applications with common functionality are crucial for governments and public-sector organizations. When data formats are open standards, documents, databases, and files can be manipulated by a variety of applications and tools from a variety of sources, both from commercial sources and other places, making it easier to switch if that becomes necessary or desired. In other words, the procuring organization would have a choice. And 5 or 10 years down the road, the procuring organization would continue to have choices over new hardware and software by not being locked in to a particular manufacturer or system.

Note that staying with a single vendor as a means to achieve the benefits of open standards can be problematic if that vendor does not actually use open standards. If such a vendor releases a new version of an application that uses a data format incompatible with older versions of the application, important data can become inaccessible.

For security, transparency, and accountability, the source code should be available for inspection. This can be achieved through licensing regardless of the means of production, and does not require an open-source license nor does it preclude a commercial supplier. There are several common existing licenses for making source code available to organizations like governments or public sector groups, and special licensing terms can be created.

Custom Software

Following open standards and using standard data formats will gain the same benefits for custom software as for prepackaged software. There is more potential choice, interchange is possible, and longevity of public data is improved.

The tools and infrastructure to support custom software should conform to open standards so that components of the infrastructure or tools can be replaced without affecting the other components or tools. Code will be more portable over different systems and over time, and the code will be able to interoperate with other code through standard interfaces.

Because many different government and public-sector organizations provide similar services—for example, each state keeps records of motor vehicles and each city keeps records on property ownership—there are intriguing possibilities for agencies collaborating to create custom software. If enough commonality of function can be agreed upon, then different organizations can jointly develop new software, thereby sharing the costs and risks.

An example of a set of government projects operating under an open-source development model is the Government Open Code Collaborative. [4]

What the Development of the Internet Tells Us about Open Standards

The Internet began in the late 1960s as a project of ARPA (the Advanced Research Projects Agency, an agency within the Department of Defense) to share computer resources over telephone circuits and switching nodes. Over a 10-year period the work, subsidized by the Department of Defense, went on, culminating in a set of protocols (open standards) for computers to communicate with each other reliably. Eventually the ARPAnet became the Internet, and now it is indispensable.

The Internet has enabled open source, and many of the tools and infrastructure behind the Internet are open source. There is every likelihood that open-source software will form a major part of the public infrastructure going forward, but the goals of the amorphous open-source community are not necessarily the goals of the long-term public good.

The key to the Internet was open standards, and so should it be for governments and public-sector organizations.

References

  1. These 3 points are basically the 3 points mentioned in a Bill passed by the Peruvian government in 2002. For more details see: http://www.opensource.org/docs/peru_and_ms.php Dan Bricklin also touches on these in his “Software That Lasts 200 Years” essay: http://www.bricklin.com/200yearsoftware.htm
  2. http://www-03.ibm.com/servers/eserver/support/unixservers/aixfixes.html, http://www.openwall.com/linux/, http://patches.sun.com/clusters/9_SunAlert_Patch_Cluster.README, http://www.securitytracker.com/alerts/2004/Jul/1010759.html, http://docs.info.apple.com/article.html?artnum=61798, http://www.microsoft.com/athome/security/update/default.mspx
  3. http://opensource.org/docs/definition.php
  4. http://www.gocc.gov/