This article is from the July 2000 issue of Software Development magazine.
Applications used to run on centralized mainframes that also hosted data. Things aren't so simple anymore.

By Scott Ambler
You have just won the lottery. Now you're out on the street, entrusting your big check to the local bank machine. You press the deposit button, select your savings account, input the amount that you have won, stick the check in an envelope as requested, feed the envelope into the machine, and it suddenly shuts down. The bank machine has taken your check but has not updated the balance in your account. Will you get your money or not? If the system is built to be transactional, you'll be able to buy that big house and car; if not, you still have to report for work on Monday.

A transaction is a unit of work that either completely succeeds or completely fails (the terminology of transactions is summarized in the sidebar). In the lottery check example, the transaction is to accept the deposit, update the balance in the appropriate bank account and post a financial record documenting the deposit. Either all three of these steps should occur or none of them; there is no middle ground.

This example actually describes a distributed transaction, one whose steps occur in several different software nodes. The action of taking the deposit occurs at the bank machine itself, the updating of the account would occur on an application server somewhere within the bank's system, and the recording of the transaction would occur within the bank's database. The update of the account balance is an interesting step because it occurs at two nodes: within a bank account object running on an application server and within the corporate database used to persist the bank account object.

The need for transaction control has existed since at least the 1960s, when information technology was first deployed for business applications. At that time, the system architecture was very simple: an organization ran software applications on a centralized mainframe computer that also hosted the data source(s) for the applications.
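The all-or-nothing behavior of the lottery deposit can be illustrated with a minimal, single-process sketch. Everything here is hypothetical and for illustration only: the `Account` and `Ledger` classes, the `deposit` method and the `failOnPost` switch are assumptions, not part of any banking system or transaction service. Accepting the deposit is modeled by the method call itself; the sketch covers only the balance update and the posting of the record, and it undoes the first if the second fails.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory sketch of an all-or-nothing deposit. */
public class AtomicDepositSketch {

    /** Hypothetical account object holding a balance in cents. */
    static class Account {
        long balanceCents;
    }

    /** Hypothetical ledger standing in for the posted financial records. */
    static class Ledger {
        final List<String> records = new ArrayList<>();
        boolean failOnPost = false; // illustration-only switch that simulates a failure

        void post(String record) {
            if (failOnPost) {
                throw new IllegalStateException("ledger unavailable");
            }
            records.add(record);
        }
    }

    /**
     * Treats the balance update and the record posting as one unit of work.
     * Returns true if both steps took effect, false if both were undone.
     */
    static boolean deposit(Account account, Ledger ledger, long amountCents) {
        long balanceBefore = account.balanceCents; // snapshot used for rollback
        int recordsBefore = ledger.records.size();
        try {
            account.balanceCents += amountCents;   // update the balance
            ledger.post("DEPOSIT " + amountCents); // post the financial record
            return true;                           // both steps succeeded: commit
        } catch (RuntimeException e) {
            account.balanceCents = balanceBefore;  // undo the balance update
            while (ledger.records.size() > recordsBefore) {
                ledger.records.remove(ledger.records.size() - 1);
            }
            return false;                          // nothing took effect: rollback
        }
    }

    public static void main(String[] args) {
        Account account = new Account();
        Ledger ledger = new Ledger();

        System.out.println(deposit(account, ledger, 1000) + " " + account.balanceCents);

        ledger.failOnPost = true; // simulate the machine failing partway through
        System.out.println(deposit(account, ledger, 500) + " " + account.balanceCents);
    }
}
```

The snapshot-and-restore approach only works here because one thread owns both objects in one process. Once the steps run on different nodes, as in the bank example, undoing work requires coordination between those nodes.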
The application itself typically acted as the transaction coordinator and focused on ensuring that data was consistently read from and written to a single data source. As complexity increased, it became common for an application to access several data sources simultaneously, heralding the introduction of sophisticated transaction coordinators called transaction-processing (TP) monitors (for example, IBM's CICS in the mid-to-late 1970s). Over time, the types of transactions that applications required also became more complex, requiring distributed architectures as well as object-oriented and complex database technologies as implementation platforms. Figure 1 depicts the evolution from a centralized mainframe application to the distributed, n-tier architecture employed by leading-edge applications today.
Complex Architecture

Transaction control is very straightforward when you have a single program manipulating the data of a single data source; it can often be managed by your data source itself. In fact, sophisticated transaction control is a common feature in most database technology (Oracle, Sybase, DB2 and Versant). For simple applications, this approach often proves sufficient. Unfortunately, few things are simple these days. Modern applications are written using distributed object technology such as Enterprise JavaBeans, Common Object Request Broker Architecture (CORBA)-based technology, and, ostensibly, Microsoft COM+-based technology. These technologies typically have clients, such as personal computers or Internet browsers, interacting with application servers using object-oriented or object-based technology that in turn interact with databases on the back end. While the implementation architecture may have changed, the need to support ACID (atomic, consistent, isolated and durable) transactions has not. Your distributed object applications therefore need distributed object transaction control.

Consider an example: Figure 2 depicts a simple business schema, using the Unified Modeling Language class diagram notation, that is to be implemented within a three-tier client/server environment. An object-oriented version of the business schema implemented in Java will be deployed to the application server, which in turn will be mapped one-to-one to an identical relational schema in the database. I have kept the example simple so that I can focus on the issues involved with distributed object transactions; I could have used another language such as C++ or Visual Basic, or implemented this on an n-tier architecture involving many application servers and/or many data sources. The environment may become more complex, but the fundamental issues remain the same.
Figure 3 depicts a UML sequence diagram for successfully depositing $10 into
a bank account. Sequence diagrams are used to show the dynamic interactions
of objects fulfilling a usage scenario, a topic that I cover in detail in
the newly released The Object Primer, 2nd Edition (Cambridge University
Press, 2000). The user interacts with the banking system via a bank machine
(the client of Figure 3) and requests to deposit $10. The bank machine interacts
with an instance of the Deposit class that controls the overall process.
It interacts with the
The Good Old Database Days

There are several interesting points to be made here. First, transactions must be managed throughout your entire system; they are neither simply an application server issue nor simply a database issue. I have run across several organizations that are struggling to adopt new technology because they are still mired in "old-school" thinking from their centralized-database days about the transaction management features of their relational database. Second, distributed object transaction control is very hard. My example was simple; in actuality, you would need to implement this in an environment where there are many application servers accessing several databases and thousands of concurrent users.
Third, locking is a key enabler of transaction control. As each transaction
resource attempts an operation, it must obtain a lock on itself, either
for read or for write, to ensure the ACID characteristics of the transaction.
When the deposit is being made to the account object, a write lock would
be obtained, whereas when the deposit record object is being posted, a write
lock would be obtained on it and a read lock on the appropriate instance
of

Fourth, referential integrity, like transactions, is a system-wide issue and not simply a database issue. Referential integrity, the act of ensuring that associations between objects are always valid, is a business logic issue. It is a fundamental mistake to assume that your database is sufficient to maintain referential integrity within a distributed object application, a mistake commonly made by "old-school" thinkers. It is just as important that the association between a customer object and an account object on your application server is valid as it is that the association between the customer row and the account row in your relational database is valid.

Finally, transaction control is a common requirement for most of today's business applications and is therefore a topic with which all developers should be familiar.

When do you need to apply your newly gained knowledge about distributed object transactions? Whenever you are building systems using common technologies such as the Java Transaction Service (JTS), the Common Object Request Broker Architecture (CORBA) Object Transaction Service (OTS) or Microsoft Transaction Server (MTS). More information about each of these technologies can be found at java.sun.com/products/jts/, http://www.omg.org/ and http://www.microsoft.com/, respectively. Distributed object transaction management is hard, but it is critical to your success as a software developer.

For further reading about transaction management I recommend the books An Introduction to Database Systems (Chris J. Date, Addison-Wesley, 1995), The Essential Distributed Objects Survival Guide (Robert Orfali, Dan Harkey and Jeri Edwards, John Wiley & Sons, 1996) and Distributed Object Architectures with CORBA (Henry Balen, Cambridge University Press, 2000). I also recommend the Web site http://www.opengroup.org/ for information about Extended Architecture (XA), an industry standard for distributed transaction processing.
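The read and write locks described in the third point above can be illustrated with a small in-process sketch. It uses the standard `java.util.concurrent.locks.ReentrantReadWriteLock` class; the `Account` class and the `runConcurrentDeposits` helper are hypothetical names introduced for illustration. This is an assumption-laden stand-in: a real distributed transaction service would manage locks across nodes, which a single-JVM lock does not do.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch: an account object that locks itself for read or for write. */
public class LockedAccountSketch {

    static class Account {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private long balanceCents;

        /** A deposit changes state, so it takes the exclusive write lock. */
        void deposit(long amountCents) {
            lock.writeLock().lock();
            try {
                balanceCents += amountCents;
            } finally {
                lock.writeLock().unlock();
            }
        }

        /** Reading the balance takes the shared read lock. */
        long balance() {
            lock.readLock().lock();
            try {
                return balanceCents;
            } finally {
                lock.readLock().unlock();
            }
        }
    }

    /** Runs concurrent deposits against one account and returns the final balance. */
    static long runConcurrentDeposits(int threadCount, int depositsPerThread, long amountCents) {
        Account account = new Account();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < depositsPerThread; n++) {
                    account.deposit(amountCents);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for deposits", e);
            }
        }
        return account.balance();
    }

    public static void main(String[] args) {
        // Four threads, 1000 deposits each, 10 cents per deposit.
        System.out.println(runConcurrentDeposits(4, 1000, 10)); // prints 40000
    }
}
```

Because every deposit holds the write lock while it updates the balance, no update is lost even when several threads deposit at once.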
The secret to success is to recognize that your transaction must work end-to-end, from your client through your various application servers all the way down to your data source(s).
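One common way to make a transaction hold end-to-end across several resources is a two-phase commit, in which a coordinator first asks every participant whether it can commit and only then tells them all to commit or all to roll back. The article does not name this technique, so treat the following as an illustrative sketch under that assumption. The `Participant` interface and `InMemoryResource` class are hypothetical and are not the JTS, OTS, MTS or XA APIs.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a two-phase commit coordinator. */
public class TwoPhaseCommitSketch {

    enum State { ACTIVE, PREPARED, COMMITTED, ROLLED_BACK }

    /** Hypothetical participant contract; not taken from any real transaction API. */
    interface Participant {
        boolean prepare(); // phase 1: vote yes (true) or no (false)
        void commit();     // phase 2 when every participant voted yes
        void rollback();   // phase 2 when any participant voted no
    }

    /** In-memory stand-in for an application server object or a database. */
    static class InMemoryResource implements Participant {
        private final boolean ableToPrepare;
        State state = State.ACTIVE;

        InMemoryResource(boolean ableToPrepare) {
            this.ableToPrepare = ableToPrepare;
        }

        @Override
        public boolean prepare() {
            if (ableToPrepare) {
                state = State.PREPARED;
            }
            return ableToPrepare;
        }

        @Override
        public void commit() {
            state = State.COMMITTED;
        }

        @Override
        public void rollback() {
            state = State.ROLLED_BACK;
        }
    }

    /** Returns true if all participants committed, false if all were rolled back. */
    static boolean run(List<? extends Participant> participants) {
        // Phase 1: collect votes; a single "no" aborts the whole transaction.
        for (Participant participant : participants) {
            if (!participant.prepare()) {
                for (Participant other : participants) {
                    other.rollback();
                }
                return false;
            }
        }
        // Phase 2: every participant voted yes, so all of them commit.
        for (Participant participant : participants) {
            participant.commit();
        }
        return true;
    }

    public static void main(String[] args) {
        InMemoryResource accountObject = new InMemoryResource(true);
        InMemoryResource database = new InMemoryResource(false); // simulated failure
        boolean committed = run(Arrays.asList(accountObject, database));
        System.out.println(committed + " " + accountObject.state + " " + database.state);
    }
}
```

The sketch ignores the hard parts of a production coordinator, such as logging decisions durably and recovering when a node fails between the two phases.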
The Terminology of Distributed Object Transactions

Transaction. A transaction is a unit of work, perhaps a collection of updates on several objects, that either completely succeeds or completely fails. A nested transaction includes subtransactions that may succeed even though the parent transaction does not. A flat transaction has no subtransactions.
Copyright © 2000 Software Development magazine. All rights reserved.