© Patricia Seybold Group's Strategic Research Service : July 3, 2003
Go Back for the Future for Content Management
Content Management Will Evolve Like Data Management
By Mitchell I. Kramer, Sr. VP and Sr. Consultant, Patricia Seybold Group
Netting It Out
We have reached a crossroads in the evolution of content management. Customers and users are feeling serious pain in trying to cope with ever-increasing volumes of content. At the same time, content management software is proliferating with separate systems for each type of content and no system with the capabilities to manage all of an organization's content. Increasingly, we're beginning to see content silos just the way that we had seen application silos and data silos in the past.
In fact, it's exactly the past that provides the prescription for content management pain. The evolution of content management is following the evolution of data management exactly, just offset by 25 years or so. We can apply the lessons that we learned in data management to content management. We can predict the direction of content management based on those lessons. In data management, growth in data volumes and proliferation of data management systems were addressed with relational technology and its standards for data storage, structure, and access. In content management, we anticipate similar standards, although their details have not yet emerged. Our bet is that they'll leverage the capabilities of relational database management systems for storage, access, and infrastructure as well as XML for structure.
The Lessons of History
You know the clichés:
- If we don't remember history, we're doomed to repeat it.
- Whatever goes around comes around.
- Hold onto it long enough and it's bound to be back in style.
- There's nothing new under the sun.
These old saws apply to information technology as well as they apply to other facets of our personal and business lives. Every type of IT, whether it's hardware or software, moves through a predictable sequence of phases, from innovative, raw, and unstructured technology to commoditized, standardized, packaged products.
Content management is in an early phase. Our customers have been telling us for the past six months or so that content management has become a major problem for them. Part of the problem is the ever-increasing volume of content that they have to manage. Part of it is that today's content management systems can't manage all their content. Part of it is that they're beginning to have content management silos.
We know where content management is going because we remember our history. Today's content management issues are remarkably similar to the data management issues of the early and mid 1980s. And today's content management systems are remarkably similar to the data management systems of the 1980s. What happened next in data management will happen next in content management: standardization and then commoditization. Trust us and get ready for the changes. We know our history. In fact, we lived it.
The Evolution of Data Management
Data: 1964 to 2003
Data management has evolved significantly over the past forty or so years. We make this statement from experience. Your writer has witnessed and experienced this evolution. Its key phases are listed below and shown in the timeline in Illustration 1.
- Data integrated in applications
- Data in application-dependent files
- Data in application-independent, proprietary databases
- Data in application-independent, standard databases
- Data in application-independent, commoditized databases
The Evolution of Data Management

Illustration 1. This illustration shows the evolution of data management from 1964 to the present.
DATA INTEGRATED IN APPLICATIONS.
The earliest data management combined data and applications. Every application included an internal data definition and performed its own management functions. The data for one application was independent of the data for all other applications. Data wasn't shared between applications.
Between executions of an application, data was first stored on paper tape and punched cards then on magnetic tape and disk. Early usage of disk storage required that applications allocate and manage their own disk space and access their data using the commands defined within the disk hardware.
DATA IN APPLICATION-DEPENDENT FILES.
As disk technology and operating systems matured, access to external storage became much simpler. The advent of file systems was a huge advance. Applications and their data remained tightly intertwined, but files began a separation of applications and data. By sharing file definitions, applications could begin to share data. This sharing of data reduced data duplication and simplified application development. However, synchronizing data updates became a new issue.
DATA IN APPLICATION-INDEPENDENT, PROPRIETARY DATABASES.
Sometime around the late 1960s and early 1970s, the first database management systems emerged. Database management systems completed the separation between applications and data, making data application-independent. They also eliminated data redundancy. One database provided the data requirements for multiple applications. Databases also supported the concept of transactions and implemented the transactional data properties of atomicity, consistency, isolation, and durability (ACID).
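To make the transaction idea concrete, here is a minimal sketch of atomicity. It uses Python's built-in sqlite3 module purely as a convenient stand-in for a DBMS (none of the products discussed here are involved), and the account table, the transfer function, and the overdraft rule are all hypothetical names chosen for illustration.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("checking", 100), ("savings", 0)]
    )
    conn.commit()  # make the starting balances permanent
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money between accounts as one all-or-nothing unit of work."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (left,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if left < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False  # both updates were undone together


def balances(conn: sqlite3.Connection) -> dict:
    """Return the current balance of every account, keyed by name."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

A transfer that would overdraw the source account leaves both balances exactly as they were, which is the behavior that applications of the file-system era had to code for themselves.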
Two database structures predominated: hierarchical and network. Structure described the types of relationships that the DBMS supported between its data elements. Hierarchical DBMSs implemented a tree structure of parents, children, and siblings. Network structures were more flexible. They could support hierarchical relationships, but they also supported more general relationships.
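The difference between the two structures can be sketched in a few lines of Python. This is an illustration only, not a model of how IMS, IDMS, or any other product stored records; the class names, the connect helper, and the sample record names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class TreeNode:
    """Hierarchical structure: every record hangs off exactly one parent."""
    name: str
    children: list = field(default_factory=list)

    def add_child(self, name: str) -> "TreeNode":
        child = TreeNode(name)
        self.children.append(child)
        return child


def paths(node: TreeNode, prefix: str = "") -> list:
    """List the single access path from the root to each record."""
    here = f"{prefix}/{node.name}" if prefix else node.name
    result = [here]
    for child in node.children:
        result.extend(paths(child, here))
    return result


@dataclass(eq=False)
class NetworkRecord:
    """Network structure: a record may have several owners."""
    name: str
    owners: list = field(default_factory=list)
    members: list = field(default_factory=list)


def connect(owner: NetworkRecord, member: NetworkRecord) -> None:
    """Link a member record to an owner; repeat to give it more owners."""
    owner.members.append(member)
    member.owners.append(owner)
```

In the tree, an order line can be reached only through its one parent. In the network, the same part record can be reached from a supplier and from a project, which is the kind of more general relationship the network structures allowed.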
These early DBMSs were proprietary because each implemented its own data structure, approach to database design, and language/interface for data access. There were no standards for data structure or data access. Arguably, there were not even any best practices. Each vendor took its best shot at addressing its customers' requirements using experience in file systems and the data structures theory of the computer science of the day. In those days, much information technology used commercially was driven by the research in universities rather than by vendors, as is increasingly the case today.
Examples of these early DBMSs were IBM IMS, Software AG Adabas, Cincom Supra, and Cullinet's IDMS. IMS offered a hierarchical data structure; Adabas and IDMS were network structures. Many of these DBMSs are still in use today, supporting operational applications in finance and telecommunications. For example, many of the operational systems of the old Bell System were built on IMS, and many of the current regional telecommunications companies still use them.
DATA IN APPLICATION-INDEPENDENT, STANDARD DATABASES.
In the late 1970s, the first database management systems built on relational technology began to appear. Relational technology held the potential for a standard approach to designing data structures and for accessing the data stored within those structures. 1970s-vintage relational technology was not commercial grade. DBMSs built on early relational technology were proofs of concept and prototypes, but they really did prove the concept. By the mid-1980s, there were somewhere around a dozen viable RDBMSs from then-new companies like Digital Equipment Corporation, Informix, Ingres, Oracle, and Sybase. IBM moved into the relational era with DB2, but most of the other network and hierarchical database suppliers did not make this move and began a slow fade into oblivion or, at least, near oblivion.
The most significant aspects of RDBMSs were the complete separation of applications and data begun with the proprietary DBMSs and their use of common interfaces and languages for both their design (data definition languages or DDL) and their access (data manipulation language or DML). SQL is the DML for RDBMSs. In case you forgot, SQL stands for structured query language.
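For readers who want to see the DDL/DML split, here is a minimal sketch. It runs against SQLite through Python's standard sqlite3 module, chosen only because it needs no setup; the customer table and its columns are hypothetical. The first statement defines structure (DDL); the remaining statements read and change the data inside that structure (DML).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describe the structure once, independently of any application.
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# DML: any application can now insert, update, and query through SQL.
conn.execute("INSERT INTO customer (name) VALUES (?)", ("Acme",))
conn.execute("UPDATE customer SET name = ? WHERE id = ?", ("Acme Corp", 1))
rows = conn.execute("SELECT id, name FROM customer").fetchall()

print(rows)  # [(1, 'Acme Corp')]
```

The same statements would run, with at most minor dialect changes, against other relational products, which is the practical payoff of a common language.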
RDBMSs were standardized beginning in the late 1980s. SQL89 was the specification for standardizing RDBMS DML. SQL92 was an update. Most products adhere to all of SQL89 and parts of SQL92. However, all of the RDBMSs have proprietary extensions and their own SQL "dialects."
DATA IN APPLICATION-INDEPENDENT, COMMODITIZED DATABASES.
Microsoft's RDBMS offering was SQL Server, OEMed from Sybase and rebranded with little change to the Sybase offering. In 1999, Microsoft introduced SQL Server 7, a version that was built by Microsoft R&D. In addition to being built in-house, SQL Server 7 was offered with Microsoft pricing — less than $10,000 a copy. Microsoft's entry commoditized the market. Competing vendors found it increasingly difficult to continue to get prices that had been an order of magnitude higher for anything but their top-end offerings. In addition, as another consequence of Microsoft's presence, the RDBMS market began a consolidation that continued through last year. Today, there are three major database vendors — IBM, Microsoft, and Oracle. In addition, Teradata plays a significant role in the data warehousing market. And Sybase has survived as a secondary database supplier.
THE EVOLUTION OF CONTENT MANAGEMENT
Content: 1975 to 2003
Content management has followed the same path as data management, but content management technology got off to a later start and has not yet reached the level of maturity of data management. Tracing the evolution of content management, the phases and events are very similar to the evolution of data. We can identify the phases listed below and shown in Illustration 2:
- Content integrated in applications
- Content in application-dependent files
- Content in application-independent, proprietary databases
The Evolution of Content Management

Illustration 2. This illustration shows the evolution of content management from the 1970s to the present.
CONTENT INTEGRATED IN APPLICATIONS.
The first content management systems were mainframe-based document management systems. Our first experience with them was in the mid 1970s (Do any of you remember Script/370?). These multi-user systems were designed to format relatively large documents and to control their versions and life cycle phases. They contained their own authoring tools, their own markup languages (Runoff and TeX were early standard efforts, which evolved into SGML), and their own management services. You couldn't share documents between systems.
CONTENT IN APPLICATION-DEPENDENT FILES.
Through the early 1990s, content management systems managed only documents. They were better known as document management systems. They supported document authoring with the popular word processing systems of their day, but they continued to have their own formatting languages and document management services. Their output was stored in files with formats accessible only by the systems themselves.
With the rise of the Internet in the mid 1990s, content management expanded to Web content, and its management took a step backward. Every Web application managed its own content, some internally integrated with application logic and some in external files. For example, in our coverage of e-commerce systems in the mid and late 1990s, we found that every e-commerce product combined customer, product, and order information that was stored in an RDBMS with Web page templates stored in local files to produce HTML pages. The templates were unique in format and structure to the e-commerce product. This rather crude approach to content management was replaced by a more systems-oriented approach that is consistent with the next phase of content management.
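The pattern described above, in which product data from an RDBMS is merged with an application-specific page template, might have looked roughly like the following sketch. It is a modern Python approximation, not a reconstruction of any particular e-commerce product: the product table, the PRODUCT_PAGE template, and the render_product function are all hypothetical, and SQLite stands in for the RDBMS.

```python
import html
import sqlite3
from string import Template

# The template format belongs to the application; no other system can use it.
PRODUCT_PAGE = Template(
    "<html><body><h1>$name</h1><p>Price: $$${price}</p></body></html>"
)


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog holding one hypothetical product."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
    )
    conn.execute("INSERT INTO product (name, price) VALUES (?, ?)", ("Widget", 19.99))
    conn.commit()
    return conn


def render_product(conn: sqlite3.Connection, product_id: int) -> str:
    """Merge one product row with the page template to produce HTML."""
    name, price = conn.execute(
        "SELECT name, price FROM product WHERE id = ?", (product_id,)
    ).fetchone()
    return PRODUCT_PAGE.substitute(name=html.escape(name), price=f"{price:.2f}")
```

The structured data lives in a standard store, but the content, in this case the template, is locked to the application that understands its format. That is the silo problem in miniature.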
CONTENT IN APPLICATION-INDEPENDENT, PROPRIETARY DATABASES.
The next phase of content management begins in the mid to late 1990s and continues today as the state of the market. Similar to the hierarchical and network DBMSs of the 1980s, the content management systems of today make the clean separation between content and applications, store and manage content in hierarchical and network structures (the similarities are incredibly striking), and provide content management system-specific language/interfaces for defining and accessing the structures. There are no standards for the structures or their management and manipulation interfaces.
Another striking similarity between the 1980s DBMSs and today's CMSs is their proliferation. There are around a dozen content management systems. Geoffrey Bock has just completed a framework-based report series on seven of them. In addition, the current versions of many e-commerce systems now also package content management functionality. Most recently, campaign management products have taken the same approach. For example, both Aprimo Marketing and Unica Affinium have add-on features that provide content management functionality. Many of the portal platforms now offer content management services, too.
BACK TO DATA FOR THE FUTURE OF CONTENT
Three Key Drivers
If we go back to the evolution of data management, we see that its next phase was the standardization of structure and of design and access interfaces. Will standardization be the next phase of content management? Yes, for three key reasons:
- First, and most significantly, customers of content management systems — both users and the IT staff that supports these systems — are feeling pain. They're faced with managing rapidly growing volumes of content with content management systems that are content-type specific, that are application-specific, that are different from each other, and that don't integrate. Each stores content in its own way. Each provides a distinctive set of content management services. Each provides a distinctive interface to those services.
- Second, information technology moves in cycles. The next phase in the cycle of content management is standardization. It's inevitable.
- Third, there are early signs of standardization. XML offers the potential for standardizing the structure of content. RDBMS vendors like IBM and Oracle have begun to integrate content management services into their databases. They've long supported the storage of content through text management capabilities or through object/relational technology.
The confluence of these three factors will drive toward standardized content management. It is still early in this evolution. Only today's best content management systems — Documentum 5 and FileNet P8, according to Geoffrey Bock — support multiple content types, rich content models, and a broad range of content management services. Also, XML is only a start toward content standards.
Content management within RDBMSs will be a key driver. As simply a database feature, RDBMS-based content management will lower the price for content management, especially so considering that one of the RDBMSs is Microsoft SQL Server. It will also add a layer of services lacking in current content management systems. These services — security, recovery, replication, versioning, load balancing, and failover, to name but a few — have long been integral to RDBMSs. RDBMSs will also provide the interfaces for content DDL and DML. These won't be standard interfaces, but they'll most likely be extensions to the administrative toolsets of the RDBMSs.
RDBMS-based content management has additional significant advantages. These systems enable the consolidated storage and management of all information — structured data and unstructured content. They support content object models naturally within their standard features. RDBMSs are already used by many of today's content management systems for storing content and/or content metadata. In addition, you already know how to implement them and how to use their services.
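As a rough illustration of content and its metadata living side by side in a relational store, with XML carrying the content's structure, consider the sketch below. It is a sketch under stated assumptions rather than a description of any vendor's offering: SQLite stands in for the RDBMS, and the content_item table, its columns, and both functions are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_repository() -> sqlite3.Connection:
    """Create an in-memory table holding metadata columns and a content body."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE content_item (
               id INTEGER PRIMARY KEY,
               content_type TEXT NOT NULL,
               title TEXT NOT NULL,
               version INTEGER NOT NULL,
               body BLOB NOT NULL)"""
    )
    return conn


def store_xml(conn: sqlite3.Connection, title: str, xml_text: str) -> int:
    """Store a well-formed XML document and return its row id."""
    ET.fromstring(xml_text)  # raises ParseError if the XML is not well-formed
    cur = conn.execute(
        "INSERT INTO content_item (content_type, title, version, body) "
        "VALUES (?, ?, ?, ?)",
        ("text/xml", title, 1, xml_text.encode("utf-8")),
    )
    conn.commit()
    return cur.lastrowid


def section_titles(conn: sqlite3.Connection, item_id: int) -> list:
    """Use the XML structure to pull section titles out of stored content."""
    (body,) = conn.execute(
        "SELECT body FROM content_item WHERE id = ?", (item_id,)
    ).fetchone()
    return [s.get("title") for s in ET.fromstring(body).iter("section")]
```

Metadata such as type, title, and version is queried with ordinary SQL, while the body keeps its own structure in XML. The database's existing security, recovery, and replication services then cover both.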
GETTING READY FOR THE NEXT PHASE
Customers
Organizations that have implemented content management systems will ultimately have their current pain salved, but we don't think that it will be a smooth path from today's content management systems to those of the future. There's no easy migration or conversion from the current hodgepodge of content management infrastructures to a coherent RDBMS-based infrastructure. Fortunately, you won't have to mess with your content. Your content, itself, can naturally be stored in RDBMSs. RDBMSs can store anything. The management part is a lot harder. The structure and the services that you currently use to manage that content will not convert easily.
We urge the RDBMS suppliers to provide conversion and migration tools. We further urge them to develop automated tools and not force you to rely on expensive consulting services packages.
Suppliers
For today's content management suppliers, the message is simple and straightforward: get out of the way. Abandon your proprietary content management infrastructures. Forget about low-level content services like recovery, replication, versioning, globalization, application integration, and publishing. You'll never compete effectively in these areas. RDBMSs have been doing this stuff for 20 years. Focus instead on higher-level services and applications such as categorization, taxonomies, and process management. Consider moving into content analytics. Plan how you are going to evolve your current platforms to support these higher-level capabilities. Plan your exit strategy.
When?
It will be three to five years before content management is standardized, and one-and-a-half to two years before RDBMS-based content management systems begin to offer the complete range of the services you need for the types of content that you manage.
Think twice before investing in a new content management system, especially one that purportedly addresses enterprise-wide requirements. Tactical systems with smaller scopes and lower costs are better choices for now. IBM's and Oracle's offerings should become more attractive to you, especially for the long term, even if they don't have all the higher-level services of traditional content management suppliers. And keep your eye on Microsoft. Ask your suppliers for their long-term product plans. Ask your suppliers what RDBMS facilities they currently use beyond content and metadata storage.
© 2003 by Patricia Seybold Group, 210 Commercial Street, Boston Massachusetts 02109-3504. Telephone 617.742.5200, Fax 617.742.1028, Internet: http://www.psgroup.com
Reproduction in whole or part is prohibited. For reprint information, call 617.742.5200.
Customer Scenario and Customers.com are registered trademarks of the Patricia Seybold Group, Inc. Customer Flight Deck and Quality of Customer Experience (QCE) are service marks of the Patricia Seybold Group, Inc.
All material distributed through our Web site, by e-mail, and in print belongs to the Patricia Seybold Group. If available, you may retrieve and display this material on a computer screen, print one copy on paper (but not photocopy), and store in electronic form (but not on any server or other storage device connected to a network) for your personal, non-commercial use. You may not — without written permission — reproduce, retransmit, redistribute, disseminate, sell, publish, broadcast, circulate, or in any way commercially exploit any of this material. In addition, you cannot remove the copyright or trademark notice.
Permission requests should be addressed to feedback@psgroup.com or 617.742.5200.