Tuesday, November 25, 2008

EA is doomed to fail!

Enterprise Architecture (EA) in any organization is bound to fail because it doesn't do a good job of capturing the true state of an enterprise. When people talk about EA, they talk about the systems, applications, and assets within the enterprise; however, the organization's structure, an understanding of the business itself, and its processes are not captured properly.

Anyway, that's a quick rant!

Sunday, October 26, 2008

It's cool like that!

For the last few days, I have been playing with Ruby and Groovy, and I have to say it is very exciting to play with these technologies. As with typical Java applications, once you figure out how to configure the application, development is very straightforward. I don't have to worry about compiling the files, generating a Web Application Archive (WAR), and deploying the WAR file on the servlet container; Ruby and Groovy for the most part take care of these chores. From a software architecture perspective, these two technologies promote software best practices through built-in support for design patterns.

The core philosophy is that it saves time to customize a baselined web application rather than build each class, JSP, and configuration file by hand. I was amazed when I built a simple prototype in Ruby and Groovy. It also takes less time to learn the technology, because folks learn a lot more when they play with the code rather than reading an O'Reilly book or a recipe book. It's exciting!

The downside is that it takes a lot more time to understand how the code actually works, since the inner workings have been abstracted away. I also noticed that I had to use a standardized approach in naming methods, pages, and classes.

To increase my learning, I have been doing Groovy in JetBrains IntelliJ. It's great, and it's better than Eclipse. Enjoy!

Sunday, October 5, 2008

XML Driven Software Architecture

As technologies evolve from procedural to object-oriented, and architectural principles hold that loosely coupled, flexible software is the ideal, developers are constantly thinking, or are taught to think, that abstraction is good and brittle code is bad. This idea has also evolved in Information Technology (IT) enterprises, where the notion of a service has caught on. Since IT enterprises involve more than hard-core java-drinking code junkies, services have evolved into business services, where services abstract business processes from their supporting IT systems. As business processes are changing with respec... WAIT!!! This is a techie blog entry, so let's stay in the realm of software, hardware, and whatever makes a java-drinking, Red Bull-chugging, dry-cereal-snacking techie happy.

Software architects and developers praise the likes of Gavin King, who developed the Object Relational Mapping (ORM) technology Hibernate, because such technology shields developers from knowing how the data is stored in a relational database. ORM technologies allow programmers to think in objects rather than in normalized relational tables with primary and foreign keys. The common argument is that programmers won't get bogged down writing SQL queries; they can simply think of their code as objects. The problem with this notion is that ORM queries can still be poorly written.

As I was pondering this issue today, I realized that ORM still couples the software code, such as Java, to the structure of the database. From my experience, when new tables or fields were created in the database, the ORM code had to be modified, and this in turn could affect the actual business-logic code. This gets painful when new requirements arrive or the Database Administrator (DBA) provides a new design to optimize the database.

One way to get around this issue is to design the business data model as an XML Schema (XSD). After working with XSDs, I believe they do a great job of capturing the object model. After creating the XSD, generate XMLBeans or JAXB classes and code to these classes. The JDBC or ORM code can then communicate with the business-logic tier via the XML binding classes. This way, the XML binding classes provide a layer of abstraction between the data tier and the business-logic tier. The catch with this approach is that XSDs can be hairy; however, if you understand XML and XML technologies, then this may be the way to go. The other issue is that there could be problems in the generated classes or in how you created the XSD. Once again, if you are comfortable with XML, this may be the way to go, since I think it could be better than ORM.
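Here is a minimal sketch of the idea. In practice the `Customer` class would be generated from the XSD by JAXB or XMLBeans; here it is hand-written as a stand-in, and all names are made up for illustration. The point is that the business tier depends only on the XSD-derived type, so a schema change in the database touches only the mapping code:

```java
// Sketch: XML-binding classes as the contract between the data tier and the
// business-logic tier. Customer stands in for a class generated from an XSD.
public class BindingLayerSketch {

    // Stand-in for a JAXB/XMLBeans-generated class (hypothetical XSD type "Customer").
    public static class Customer {
        private String id;
        private String name;
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Data tier: maps a result row (faked here) into the binding class.
    // If the DBA renames columns, only this method changes.
    public static Customer loadCustomer(String rawId, String rawName) {
        Customer c = new Customer();
        c.setId(rawId);
        c.setName(rawName);
        return c;
    }

    // Business tier: never sees tables, keys, or ORM entities.
    public static String greeting(Customer c) {
        return "Hello, " + c.getName() + " (" + c.getId() + ")";
    }

    public static void main(String[] args) {
        System.out.println(greeting(loadCustomer("42", "Ada")));
    }
}
```

The mapping method is the only place the two tiers meet, which is exactly the abstraction layer described above.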

Please let me know if you disagree with me.

Wednesday, October 1, 2008

Practical Benefits of Architectural Views and Models

I have been assigned to lead a Service-Oriented Architecture (SOA) prototype effort for the organization I work for. Since this project involves building an SOA solution with canned data sources (the easy part) and then formulating best practices in SOA governance (the hard part), I have begun to appreciate the skill set of an architect. IT architecture helps the folks in the architect role create artifacts which communicate the core IT system architecture, process architecture, data architecture, and so on to various audience groups. Creating conceptual and logical models provides a great mechanism for communicating complex or abstract ideas in a straightforward manner.

IT architecture doesn't simply involve drawing pretty pictures to show off your Visio skills; it is more than that. It is the ability to take a complex system design and simplify it into a diagram which conveys the necessary details to the appropriate audience. Being an architect forces you to improve your communication skills and to adapt your message to various audiences, in appropriate terms and at an appropriate level of detail. It is important to know that even when you are simplifying a concept for a technically novice audience, you cannot misrepresent the underlying physical architecture of the system. The architect has to know the whole design and be astute enough to decipher what is and isn't needed when the design is communicated to a given audience.

Anyway, today I managed to convince key stakeholders in the project of how the system should be designed. This was a big accomplishment for the project. For any aspiring architects: you will miss developing code, you will miss updating your spend plan, but in the end you will be responsible for marketing the system, improving the system, and, most importantly, communicating the system to the appropriate audience.

Saturday, September 27, 2008

Oracle Open World 2008

Last week, I was in San Francisco, California, attending Oracle OpenWorld 2008, an event organized by Oracle to showcase its latest suite of Information Technology tools. The Oracle tool set ranged from existing open-source technologies with new functionality to the uber-fast Relational Database Management System (RDBMS) machine called Exadata. Exadata is a Hewlett-Packard (HP) and Oracle product which couples Oracle's RDBMS with some heavy-duty hardware from HP. The sessions I attended and enjoyed most were around the topic of Service-Oriented Architecture (SOA) governance. Here are some cool things I saw:

  • Using SQL Developer, you can migrate any major database to Oracle.

  • RDFS/OWL Semantic Database - Cool stuff; we are getting closer to Web 3.0

  • Oracle TimesTen - An in-memory database used for caching; it can be accessed via SQL and PL/SQL

  • Oracle Coherence - Used for caching in a grid-architected system. It can interact with multiple Java Virtual Machines (JVMs). This is cool stuff.

No question Oracle has acquired or created some new tools; however, it remains to be seen how they will maintain these products. Overall, kudos to Oracle for setting up this event. It was a success!

Monday, August 18, 2008

That was Suite! Not really

This morning I was asked to attend an all-morning presentation by Oracle. The presentation explained Oracle's vision for its products and for the products of its newly acquired company, BEA. The core presentation was given by Thomas Kurian, Senior Vice President for Oracle Fusion Middleware. The presentation was insightful, and it showed that Oracle has a defined and definite vision. Being a skeptic, however, I will accept Oracle's vision when I see results. Other than enjoying Oracle's flagship relational database product, I have not been impressed with Oracle products. That being said, I do believe that Oracle is going to keep the WebLogic Application Server as well as other quality products. It is going to be interesting to see how other vendors compete with Oracle in the application and middleware space.

Like Oracle, IBM has been buying strategic companies, but they are NOT buying monster-size companies like BEA or PeopleSoft. IBM products work well; however, they are very expensive, and their support is expensive as well. Oracle is following the Sun and Red Hat approach, which is to embrace the open-source paradigm. Thomas Kurian stated that Oracle is going to streamline the WebLogic IDE, which is based on the Eclipse framework, and make it open source. It will be interesting to see where Oracle Fusion Middleware will be in the next two to five years. During the presentation, Thomas Kurian talked about the WebLogic Suite, the SOA Suite, the BPM Suite, and every other Suite possible. So in closing I say, "That's suite!" Actually, I say this to Larry Ellison: show me good software and I will say, "Suite software." If the software is not good, then I will call Oracle Fusion Middleware law-suit software and not suite software.

Monday, August 4, 2008


Last week, I attended a ZapThink Service-Oriented Architecture (SOA) bootcamp. It was a decent class, since they derive their material from actual SOA implementations at large corporations. They presented high-level principles and "gotchas," but I felt it lacked something, though I cannot tell what. I thought the ITIL v3 Foundations course was more intense, and it actually presented some excellent concepts. I highly recommend taking the ITIL v3 Foundations course, and the ZapThink bootcamp does a good job of reinforcing the same overall concepts, which are:
  • How does IT solve a business problem? This is key, since vendors and developers tend to buy a tool or solution and then retrofit it to a business problem. I have been involved in a couple of projects that spelled D-O-O-M-E-D before they were cancelled.
  • Capture business processes - This is another key, since requirements are derived by analysts and in turn passed down to the architects and engineers, who design solutions to those requirements. If they had insight into the business processes, the solutions could be better and cheaper.
  • Business agility - Flexible IT systems give businesses more room to be agile. Services support this paradigm; however, defined Configuration Management processes are key as well.
  • Architecture - Enterprise, data, process, and technical architectures are key, since they provide a defined approach to designing a system or an enterprise. No more ad-hoc development projects. As a developer and as an architect, this is key, since developers love to know their constraints. We, as developers, have all been burnt by poorly written code or missing documentation.
Overall, good IT architecture, with or without SOA, involves a lot of common sense, an understanding of the business, and knowing your constraints. Ideas are great, but we all have to be realistic.

Friday, July 18, 2008

Technology Injection is bad!

You must have heard of anti-patterns: practices which cause more problems than they solve in any Information Technology organization. Well, here is one that I see at my current employer, and I call it Technology Injection. At my current employer, an average software development project has a duration of two to five years. The common problem with projects of this length is that new technologies are introduced, or injected, in the middle of the project cycle to address certain requirements. Is this a good approach? No! The new technologies are not researched well, and they usually cause more headaches, since they introduce new infrastructure and new skill sets, and they have a lasting impact on the project and eventually on the system.

Then the question comes up, "How do we address the problem of Technology Injection?"

The answer is that the system has to be well architected, which means promoting loose coupling, well-defined functional requirements, and no deviation from the non-functional capabilities. Each architecture should be flexible enough for change, but change should be done in a phased approach. If the architecture is not flexible, then be sure to have those big yellow stickers which say, in menacing black print, "LEGACY SYSTEM," since you might need them in five to six years. As with any IT approach, research and development (R&D) and thorough technology analysis need to be done to identify which technologies can be safely injected into the system.
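To make the loose-coupling point concrete, here is a tiny sketch in Java. Everything here is hypothetical (the interface and class names are made up): the idea is simply that a candidate technology is hidden behind a seam, so injecting it, or backing it out, is a phased, local change rather than a project-wide one.

```java
// Sketch: isolating a to-be-injected technology behind an interface,
// so the rest of the system never names the concrete technology.
public class TechnologySeam {

    // The seam the rest of the system codes against.
    public interface MessageSender {
        String send(String payload);
    }

    // Existing, proven implementation.
    public static class LegacySender implements MessageSender {
        public String send(String payload) { return "legacy:" + payload; }
    }

    // Candidate new technology, adopted behind the same seam after R&D.
    public static class ShinyNewSender implements MessageSender {
        public String send(String payload) { return "shiny:" + payload; }
    }

    // Business code depends only on the interface.
    public static String process(MessageSender sender, String payload) {
        return sender.send(payload);
    }

    public static void main(String[] args) {
        System.out.println(process(new LegacySender(), "order-1"));
        System.out.println(process(new ShinyNewSender(), "order-1"));
    }
}
```

Swapping `LegacySender` for `ShinyNewSender` is a one-line change at the wiring point, which is the phased approach described above.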

Thursday, July 17, 2008

IT Architecture = Communications

For the last six months or so, I have been working for the US Government as a Service-Oriented Architecture (SOA) Technical Lead. Even though that is my functional title, I act more as an Information Technology (IT) architect. It is my first job as an IT architect. At my previous job, I was a part-time XML data architect: I did a lot of data modeling and data analysis, and then defended why I modeled certain data structures a certain way. When I was not being an XML data architect, I was a JEE developer. Even though developers aspire to be architects because they expect to design the systems and leave the grunt work to other developers, this may not hold true once the developer becomes an architect.

A couple of summers ago, I took a class to become a Java Enterprise Architect, and it was there that I learned that an architect doesn't have a set role. As a developer or a project manager, your role is pretty well defined, but an architect's role is ambiguous. In the class, I was taught that one of the primary responsibilities of an architect is to see his system or project succeed. In some projects he may be coding and developing reference architectures, and in others he is basically working with the project manager on refining requirements. After being an IT architect for the last six months, I have to say that the instructor was absolutely correct. An architect's role is ambiguous within a project, but every project must have an architect. An architect is a rare breed because:
  1. he understands what is being built and why it is being built
  2. he can look at code and hold code-level discussions about the software.
  3. he understands the business problem and why the system is of value to the customer
  4. he can work with business analysts in refining requirements or make them more abstract which might give the development team more flexibility
  5. he can sit with the project manager and hash out a realistic Work Breakdown Structure (WBS)
  6. he can recommend modifications to the project scope if the project is running behind schedule
  7. he doesn't need to be the best programmer in the team
  8. he doesn't need to design the system, but rather review its designs
  9. he doesn't need to write a test plan but he can evaluate it
  10. he doesn't need to be a guy. Women can be architects too.
And the biggest trait an architect should have is that he or she should be able to communicate with various stakeholders, and should be able to discuss the system in technical, non-technical, and business terms. Architects should be willing to exchange ideas with other architects, and they should be humble enough to accept mistakes when they make them.

The role of the IT architect will expand as more businesses move into collaborative modes of information exchange, since architects need to be aware of standards and understand the newer technologies. So, for all aspiring programmers in the world: it's time to shelve the pocket protector, clean up the stacks of burnt or blank CDs, get a haircut, stop living on Red Bull, and be more social. Try to interact with your teams and peers, since a great idea is a bad idea if you cannot communicate it, and a better idea is to have your good idea validated by others.

Saturday, July 12, 2008

A breath of fresh air. I think Spring is in the air

For the last month or so, I have been working with the Spring framework to develop Java web applications, and let me tell you, "It's been a pleasant and refreshing experience." Rod Johnson and gang have finally presented the Java development community with a framework that few people will complain about. As for my background in Java, I have been working with it for about five years.

I remember the days when I looked at Java classes with HTML code buried in them. Talk about brittle, hard-to-read code. I have also looked at existing production code which consists of one JSP with eight thousand lines of heterogeneous code. By heterogeneous code, I mean that the JSP has HTML with scriptlets and JavaScript with scriptlets. The JSP had multiple views, JDBC calls, and security logic. This JSP is still in production, and it is used by our beloved government. When I first looked at that messy JSP, I asked the programmers and management who conceived it why the JSP was created this way and why no one had refactored it. The answer was, "A few years ago, the code was based on a Sybase application server which had to be restarted whenever new code was added." Since JSP pages don't require the application server to be restarted, they went that route. After being on that project for a few months, I realized that the underlying problem which produced the hideous JSP was poor project management, including poor schedules, an inability to manage customer expectations, etc. Nevertheless, the project is slowly moving in the right direction. They have started using Spring, introduced a build process, and they actually sit down to design their application, introducing design patterns and promoting code reuse, loose coupling, a flexible code base, and so on.

If used properly, Spring promotes sound architectural practices. The concept of Inversion of Control (IoC) is very powerful and very simple: properties can be edited in an XML file and injected into classes, rather than being tracked down in specific classes. I have seen folks create a Properties interface holding public static final variables for the properties; this interface would then be implemented by every class that needed it. I thought it was a neat concept, but it irked some of the developers. A text properties file is the ideal way, though sometimes it can be hard to find the appropriate property for a specific class without reading through the class.
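In Spring itself the wiring is read from XML bean definitions; below is a minimal hand-rolled sketch of the same setter-injection idea, with no Spring dependency. The class and property names (`MailService`, `mail.smtp.host`) are made up for illustration:

```java
import java.util.Properties;

// Sketch of what setter injection buys you: the bean declares what it needs,
// and configuration supplies the value from outside the class.
public class InjectionSketch {

    // The bean knows nothing about where its settings come from.
    public static class MailService {
        private String smtpHost;
        public void setSmtpHost(String smtpHost) { this.smtpHost = smtpHost; }
        public String describe() { return "MailService -> " + smtpHost; }
    }

    // Stand-in for the container: Spring would read these values from the
    // XML bean definitions; here they come from a Properties object.
    public static MailService wire(Properties config) {
        MailService svc = new MailService();
        svc.setSmtpHost(config.getProperty("mail.smtp.host"));
        return svc;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("mail.smtp.host", "smtp.example.org");
        System.out.println(wire(config).describe());
    }
}
```

Compare this with the constants-interface pattern: here, changing the host means editing configuration, not recompiling every class that implemented the interface.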

After burying many days, months, and years in trying to decipher a COTS API with very little documentation that was not easily implementable, it is truly a breath of fresh air to use an API like Spring, which is built on solid principles and has incorporated ideas and work from previous frameworks like Struts. Kudos to Rod Johnson and gang!

Friday, July 11, 2008

"We won't mention their name..."

Three weeks ago, I went to the Microsoft Technology Center in Reston, Virginia. I was part of a Federal Aviation Administration (FAA) team which listened to Microsoft's overall strategy for the next three years, learned how Microsoft licenses are distributed across the FAA, heard product experts present their new products, and got a tour of the facility. It was a long session, and some parts were interesting. I was amused when their product experts tried to tout Web 2.0 capabilities like creating wikis and blogs on the new SharePoint product; I asked them whether they had done any research into why they decided to include Web 2.0 capabilities in their products. They had good reasons for the wikis, since wikis have proven to be an excellent source of knowledge via collaborative means. However, they didn't give me a good reason for including blogging software, and I realized that big companies like Microsoft provide "new" capabilities even though they don't know whether their customers will ever use them. I found that very interesting.

The other thing I thought was hilarious was that their whole staff was taking stabs at Google. Even though Google's and Microsoft's products and services only slightly overlap, I didn't think Microsoft and its employees would truly hate Google. For instance, one presenter talked about email services and used Hotmail as an example. Then he stated that their products work well with Yahoo and other email providers, including "the other email provider who is still in beta." Then he interjected, "How long are they going to be in beta, anyway?" If you don't know what I am talking about, check out Google Mail, or Gmail, which is still in beta.

Microsoft also stated they are rewriting their whole stack of products to introduce cloud computing, since "the other company is planning to expand into this realm." Amazon.com currently provides this service, and I know one person who uses their backup service and is happy with it.

Anyway, it looks like Bill Gates's new target is "the company" whose name neither he nor Microsoft will mention. "Hey Bill, Google is not a buzzword; it is simply a company you are jealous of."

I til, you til, we all til for ITIL

For the last three days, I was taking the Information Technology Infrastructure Library (ITIL) version 3 Foundations for Service Management course. Even though the course was three days long, with a one-hour lunch break, two working lunches, and nine breaks lasting anywhere from ten to fifteen minutes, I enjoyed it. I think ITIL has figured out how to address the Service-Oriented Architecture (SOA) lifecycle. ITIL is not prescriptive but rather descriptive: it doesn't promote a methodology like the Rational Unified Process (RUP) or the Project Management Body of Knowledge (PMBOK), but rather provides very high-level principles which can be used in your organization's methodology.

I was familiar with ITIL version 2, and I thought it did a good job of addressing some of the high-level processes, but it did not tackle the concept of services well. Version 2 stated that there is a notion of a service, but it didn't talk about how the concept should be addressed. In my previous projects, we built our custom methodologies on top of the "concept of a service"; however, our methodologies were not aligned with other organizations', since they had different methodologies for addressing services. Version 3 addresses this issue: it goes into specifics on what a service is, does a good job of defining it, and builds the whole ITIL lifecycle on Service Management. It also addresses Knowledge Management and how it fits into an enterprise. I have been dealing with SOA for the last three years, and for SOA to be successful, a knowledge repository is essential.

I highly recommend this course for IT architects (such as system architects and data architects), project managers, program managers, and folks who deal with licensing. In a world of high-level concepts, buzzwords, and theoretical exercises, I think ITIL offers the right ingredients for implementing a sound SOA in any enterprise. One of the reasons ITIL makes sense is that its authors went to several successful enterprises, asked them how they managed their IT, then processed and analyzed the information, and finally presented their work to the world as version 3. I believe ITIL followed its own 7-Step Improvement Process.

I also give kudos to my instructor David Moskowitz who did a great job in covering this material.

Wednesday, June 11, 2008

Google Driven Architecture (GDA)

A few days ago, I decided to try something different, and it worked wonderfully. For the last couple of years or so, I have been spending some time building a website for the New Market Antique Dealers Association (NMADA). I charged them a very low amount since I was helping them out. The user community for this website is not at all computer savvy. Here were their requirements:
  1. NMADA shall have an updated website
  2. NMADA members shall be able to write entries and upload pictures about their products (antiques)
  3. NMADA would like a calendar on which to update their entries
Traditionally, I would create a typical website with a MySQL database in the back end and a PHP site in front. Unfortunately, NMADA had an existing website: a static site with lots of outdated pictures. It was just an old website. I took on the brave task of updating it; unfortunately, after a couple of months, I realized that I didn't have any time. I asked a dear friend of mine to help me with the site. He finished it, and the website can now be found at http://www.newmarkettoday.com. Recently the NMADA members came to me and asked for my help in fulfilling the second requirement with a content management system. Fortunately, before the members could spring the second requirement on me as an actionable task, I had moved their hosting to a provider with database and PHP support. Even so, time was still against me. I therefore decided to implement the second requirement using Google technologies. Rather than create a database schema and then research and implement a content management system, I decided to use Google's Blogger website as the content management system. Now NMADA users can log in to the Blogger website with the Gmail accounts I created for them, write their content, upload pictures, and publish the content to the newmarkettoday website. I know, I know. What's so special about that? Developers have been using this practice for ages. Well, let me tell you: I built the functionality in fifteen minutes, plus another five minutes to customize the site. Google's Blogger website has well-defined access controls which can be customized for the various NMADA members.

Okay let me get to the point (I know I am dragging but it's good to prolong sometimes...okay).
It is possible to create a professional-level website with just Google technologies. Hence the title: I use Google Driven Architecture for building websites.

Blogger.com - You can use it as a WYSIWYG editor and to upload pictures and videos into your website.

Google Page Creator - You can publish static pages to the website. I currently use it to store attachments which are linked to my blog entries.

Google Calendar - You can publish events to Google Calendar and invite guests

Google Docs - You can write and publish documents on your website

Google Checkout - You can use this when you want to enable e-commerce functionality to your website.

Google Mashup Editor - You can use this when you want to create custom views with your data.

Google Sites - You can use this as your integrated Google tool to build and manage your website.

As you can see, it may not be the most straightforward approach, but it is possible. The bottom line is that you don't need to spend tons of money on folks like me to build you a PHP- or Java-enabled website with all the snazzy bells and whistles. If you really want to build a serious website, then you should hire someone; however, if you have a mom-and-pop store, then it is time to use GDA - the next paradigm in the land of many architectural paradigms.

A Google Sites YouTube video

Sunday, June 8, 2008

...You can negotiate with a terrorist...

Yesterday I saw this line on Wikipedia's entry for Data Architect. The line is:
"Q: What is the difference between a Data Architect and a terrorist? A: You can negotiate with a terrorist." I chuckled after reading this line, since it is so true. I consider myself a "data" guy who enjoys talking about data and its significance in the Information Technology domain. Being a "data" guy, I have to agree with the line from Wikipedia, because data architecture is the origin of any system architecture. If the data architecture is incorrect, or even partially incorrect, the downstream impact on the system is significant. That is one of the reasons data folks are such sticklers for details and don't easily deviate from their data policies or decisions.

One of the reasons I really like working with data is that it is the smallest unit of information in an information system. Even in college, I enjoyed the physical sciences, where we discussed quantum mechanics and the smallest units in nature. Fortunately, in information systems, data architects don't deal with probabilities to determine whether the data went into the system or not. In quantum mechanics, there is a mathematical probability (a very small one) that part of a tennis ball will go through a wall, while there is a very high mathematical probability that the ball will bounce off the wall. Even though data is so binary, people don't understand how to work with it. Hopefully there will be a day when everyone appreciates data in information systems and there are no terrorists in the world. Will that day ever come?

EnochScript - do I really need it?

I spent last Friday reviewing slides from this year's JavaOne conference. Unfortunately, I wasn't lucky enough to attend, but a colleague was kind enough to bring copies of various presentation slide decks. They were all quite interesting, ranging from how Java and SOA are meant to partner to writing your own scripting language which can be read by Java. Chaur Wu wrote an article on JavaWorld called "Build your own scripting language for Java: An introduction to JSR 223" which explains how developers can write their own scripting language. This is a neat technology, and as a technologist I want to create one, but the bigger question to ask is, "Do I really need it?" We can now create scripting languages in which we write our business logic and functions for a certain domain. My question is where I would need one. I could use a Business Rules Engine (BRE), or write the logic in a Java application, in a database with stored procedures and triggers, or in a BPEL process. Therefore, before I start writing EnochScript, I need to ask why, and what the benefits and drawbacks of EnochScript would be. I would also ask how it benefits my project, my company, or my business needs to write a script rather than putting the logic in a Java application, in a database (if the logic belongs in the data tier), or in a service orchestration (if the data lives in services and the logic involves extracting and aggregating data in the service tier).

Saturday, June 7, 2008

Buzz words 2.0

In the last year or two, I have noticed that buzzwords have a version number next to them. I am sure you have heard words like Web 2.0, Enterprise 2.0, eGov 2.0, Web 3.0, and so on. Here is my theory on this recent phenomenon: companies trying to sell new functionality added to old technologies have been tagging tech words with version or sequel numbers. By adding a version or sequel number, the marketing folks imply that the technology has been reborn and is cooler and faster. To which I say, "eh...okay?" Is Web 2.0 or Web 3.0 really that different from Web 1.0? Maybe the back-end processing or user interaction is different; however, like any technology, there are benefits and drawbacks.

Okay, I am going to write a blurb about the latest and neatest technologies. Here it goes,

"Last week, I went to a conference in Boulder, Colorado where major vendors came to promote their new products. Here are couple of products, I really liked:
  1. Collaborative Databases - Oracle has recently released their new collaborative database which they are promoting as RDBMS 2.0. This technologies lets users use Web 2.0 and SQL 3.5 to generate reports in format Report 4.0 and the reports follow the Business Intelligence 2.0 terminology. The front-end is in Flex and Oracle is promoting the Collaborative databases as part of their SOA 2.0 paradigm.
  2. Java 9 - Sun just released Java 9, which follows the new trend in programming, called predictive programming. The new JVM, which is based on Linux 3.0, can predict how a Java programmer will program the application before s/he starts programming. Sun also said they are currently working on JUnit 3.41, which will generate the unit tests before the program is created using predictive programming."

Do you see how ridiculous my blurb sounds? Unfortunately, it could trick a non-technical decision maker into believing that the blurb is in fact true. I am waiting for Blog 2.0 or Blog 3.0. Here is an IBM ad which makes me chuckle every time I see it.

Monday, May 19, 2008

Thanks CSS

Dear CSS,

I have been working with you for many years now, and using you improves the quality of my web pages. You also make the web page code less verbose. Back in the day (the early 21st century), web pages were composed of Hypertext Markup Language (HTML) and JavaScript. To customize a page, HTML coders would use the <font/> tag and color attributes on <img> tags. To get roll-over images or blinking text, developers would use JavaScript. The problem with JavaScript is that each browser renders it differently.

Thanks to you, developers like myself can add more functionality to web pages. We can now easily use Asynchronous JavaScript and XML (AJAX) techniques and make cooler pages with the style attribute available on virtually any HTML tag.

You also have various editors which let developers interface with you. My personal favorite is a free tool called TopStyle Lite, created by Bradbury Software.

Anyway, I am excited about your future and hope you can continue to provide the web community with new innovations. Good luck!


Enoch Moses

Saturday, May 17, 2008

Is Open Source bad?

Don't get me wrong. I love working with open source software. I love working with Java, PHP, the Spring framework, the Struts framework, the Eclipse Integrated Development Environment (IDE), Spring WS, JUnit, Hibernate, FileZilla, Mozilla's Firefox, log4j, Axis, JBoss middleware, and so on. I am currently looking at Grails, JRuby, Seam, and the list goes on. The journey never seems to stop. As soon as I am familiar with some software, it is either outdated or obliterated by new software which runs on new principles but is faster and addresses the issues with the software I just became familiar with. From a developer's or an application architect's perspective, open source software is the way to go; however, as a decision maker for a company, I would be leery of open source. Okay, okay, let me explain. In my current position, I see resistance from my managers to adopting open source software. Their reasons, which I think are quite valid, are that there is no support for the software, there is no evidence that it is widely used in the IT community, it hasn't been tested well, and our company doesn't have enough people who know how to use certain pieces of software.

As I said earlier, these are valid reasons; however, I believe companies should allocate some resources to track and evaluate open source projects. The committed team should be composed of software engineers, software architects, business architects and analysts. Each open source product should be scored on:
  • Maturity of the software
  • Adoption of the software in the IT community
  • Whether support is offered by companies, and at what level
  • Stability of the software
  • The software's competition in the IT community, and how that competition is doing
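A scoring rubric like the one above is easy to mechanize. Here is a minimal sketch of a weighted scorecard; the weights and the example ratings for Hibernate are my own illustrative assumptions, not figures from any real evaluation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OssScorecard {

    // Criterion weights are illustrative assumptions; tune them per organization
    static final Map<String, Double> WEIGHTS = new LinkedHashMap<>();
    static {
        WEIGHTS.put("maturity", 0.25);
        WEIGHTS.put("adoption", 0.25);
        WEIGHTS.put("support", 0.20);
        WEIGHTS.put("stability", 0.20);
        WEIGHTS.put("competition", 0.10);
    }

    /** Combine 1-5 criterion ratings into a single weighted score. */
    static double score(Map<String, Integer> ratings) {
        double total = 0.0;
        for (Map.Entry<String, Double> w : WEIGHTS.entrySet()) {
            total += w.getValue() * ratings.getOrDefault(w.getKey(), 0);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical ratings for one product
        Map<String, Integer> hibernate = new LinkedHashMap<>();
        hibernate.put("maturity", 5);
        hibernate.put("adoption", 5);
        hibernate.put("support", 4);
        hibernate.put("stability", 4);
        hibernate.put("competition", 3);
        System.out.printf("Hibernate: %.2f / 5%n", score(hibernate));
    }
}
```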

Once again, let me reiterate that I enjoy working with open source software, since I have time to read the source code and make modifications if the software doesn't meet my requirements. This is a lot better than spending thousands or millions of dollars to buy COTS products and then realizing the software doesn't work, or that the company needs to spend another chunk of money hiring high-priced vendors for the product. I have seen this happen with products from Microsoft, Oracle, Convera, Autonomy, IBM, and others.

In closing, open source software isn't a panacea, but it can come close if the company has the right set of resources, a culture of innovation and a willingness to take risks. I look at companies like Google which thrive on individual and group innovation. I don't, however, recommend open source software for companies where policies change quite frequently, where resources cannot be allocated to researching open source products, and which, as a company at large, are financially loaded. If your company is in between the two examples I gave, then I say please do your research, hire the appropriate personnel, and then PICK YOUR POISON!

Tuesday, May 6, 2008

Be the hunted

"Be the hunted" is the slogan for a new job website, JobFox.com. I decided to surf to the site after seeing JobFox ads plastered all over the Metro trains. After putting in my profile and seeing its functionality, I was impressed. It is a marriage of a typical job site like Monster.com or Yahoo! HotJobs with Web 2.0 capabilities.
Here are its features:

  • It offers a web page for every job seeker registered on their site.

  • Offers an "Experience Map", which is a tree graph of your skills, created from a questionnaire filled out by the job seeker.

  • Maps your skills and experience against jobs in its database, and offers a match score

  • Offers an alerting capability which notifies you when your resume is viewed by a potential employer

One of the biggest features it doesn't offer is search. Users cannot search for jobs; instead they must trust the JobFox system to show them the jobs it considers appropriate.
Every mapped position shows a pie graph of how the position matches your requirements as a job seeker, such as location, salary and stability. Anyway, try it out; it might give you insight into your market value. Have fun job hunting.

Saturday, May 3, 2008

Is Microsoft really the dark side?

Unlike some of my colleagues and friends, I have worked mostly with open source products, and I like the open source world. Most of the technology is well built, and there is a community of developers, testers and business sponsors who spend a lot of time and money getting the product right. There seems to be a common understanding that open source technology is flexible enough to be customized to a given requirement, and that its cost is minimal compared with Microsoft's.

On the other hand, Microsoft tends to dictate when its new products and technologies are going to be released. They also dictate how their products should be licensed, and flexibility is minimal. To a technologist, they seem to black-box all of their technology.

Here are some typical complaints regarding Microsoft:
Not best of breed
I have heard people complain left and right that their products are not best of breed. For example, their SQL Server product line is not the best relational database and is only suited for small to medium implementations.

They tie their products to the Windows operating system

A few years ago, the technology community complained that Microsoft tied their internet browser, Internet Explorer, to the operating system, and that this is how they beat their competition in the browser space.

They eye-gouge their customers with licenses and other products
More and more end users are complaining that Microsoft is clamping down on how licenses are tracked in their software. It's now even harder to steal their software.

I admit that these are valid complaints; yet lately I have realized their products aren't that bad.

Things I like about Microsoft products:
Their products integrate well together. I used to take this for granted until I worked with the Oracle suite of products. Not only do Oracle's products not talk to other vendors' products, they don't even talk to each other.

Their products are easy to use.

Beginning with their Windows product line, Microsoft products are easy to use and visually appealing to their end users. Until three or four years ago, Java didn't have robust debugging in its various IDEs. I remember working with Visual Studio many years ago, and I still remember that its debugging was the best. SQL Server has had a great Graphical User Interface (GUI) for many years; during that time, Oracle only provided SQL*Net.

Even though we complain that Microsoft is selfish and opportunistic as a business, I have to say that they have never sacrificed their core capabilities: their products are easy to use, and their products integrate well with each other. It is true that their products aren't the best of breed, but at least they work.

In the last year or two, I have been extremely disappointed in how Oracle products are built and how they work. All of Oracle's products are tied to the database product line: look at their JMS implementation, their PL/SQL-based web services, their ESB which is tied to their database, and so on. Oracle is doing the same thing Microsoft did. As Microsoft tied their products to their operating system, Oracle is now tying their products to the database. If I were a CIO or CTO, I wouldn't buy anything from Oracle other than their relational database products.

Unlike Microsoft, IBM sells their products at an extremely high price. Their products work well; unfortunately, I wouldn't trust IBM, since their business practices are questionable.

Microsoft's biggest competition is Google, and I am still wondering what Google is. Google sells online advertising, has a search appliance, and has a bunch of software products which are good but don't address any core requirement. They seem to create buzz through how they market their products. For instance, remember how they created buzz for their email capability: they gave out email accounts via email invites. They are geniuses at viral marketing, but they really don't have a product. What I mean by a product is this: if I were to liquidate Google, what would I sell? Their search technology, their online software, their advertising business, and now their mobile software. According to my sources in the US Department of Justice, Google's Android software is currently used by a lot of hackers to hack into mobile phones. Things like this make me wonder: is Google really out there to improve our lives, or just to market themselves for a higher profit margin?

Yahoo! - I like Yahoo!. I have been a loyal customer of Yahoo! since 1996. They categorize and sell information, information products and more. I really like their stuff, and their products might suffer a little if Microsoft buys them, but I would rather see Microsoft buy them than Google beat them.

In conclusion, I still prefer open source technologies, but I now also appreciate the value of Microsoft. They are not that bad after all.

Thursday, April 24, 2008

I am going coo coo for Popfly!!!

Tonight I went to my Popfly account and was very impressed. Microsoft has added features which make it the best online Web 2.0 editor. They have added ways to add data sources: a pseudo database where you add a table using comma-delimited syntax. Popfly has also added the capability of creating "blocks". Blocks are components of software logic which can easily integrate with other blocks or services. You can attach a block to a web service by simply pointing at its Web Service Description Language (WSDL) file, and you can write the block's code in C# or in an XML-structured format. You can create mashups by simple drag and drop or by writing .NET-centric code. Like Google web pages, Popfly also lets you create web pages by simple drag and drop. I have to say that Microsoft has put a lot of effort into this product compared to Google Mashup Editor or Yahoo! Pipes. Microsoft is probably eyeing this as a possible collaborative programming IDE for the Visual Studio family. It is not a bad idea.

FYI - A couple of weeks ago, Microsoft contacted me to see if I was interested in working on their Visual Studio line. With products like this, and Google's snobby attitude towards folks with non-IT degrees, I might have to reconsider. I enjoy programming, architecting systems and just being a technology geek. Is something wrong with me? Anyway, in the next few entries, I might explore Popfly's capabilities in more depth.

Wednesday, April 23, 2008

The Ultimate Data Mining Sport

A few days ago, I came across a show on ESPN about the concept of Sabermetrics. According to Wikipedia, "Sabermetrics is the analysis of baseball through objective evidence, especially baseball statistics." Baseball is a unique sport because it is dominated by statistics. As many of you know, hitting a ball the size of a tennis ball with a stick is not an easy task. Now imagine the ball coming at you at eighty to ninety miles per hour. What makes it even harder is that the velocity, location and trajectory of the ball may differ from one pitch to the next. The hitter has to decide to swing if the ball goes through or touches the strike zone, and he has to guess the pitcher's tendencies: where and what type of pitch the pitcher will throw. As you can see, statistics are at the core of any baseball game. Lineups and pitching match-ups may be altered depending on the pitcher, hitter, weather, ballpark and the time of the game. As an avid fantasy baseball player, I am aware of some of these stats, and I alter my fantasy lineup every day.

What does this all mean? Since baseball is a game of stats, Major League Baseball (MLB) general managers are relying more on analytical approaches like Sabermetrics, which give quantitative data on whether a player is valuable to the team. Sabermetrics captures statistics which are usually overlooked by regular stats. For example, Sabermetrics does not put a lot of value on a player simply because he hits for a .320 average. It does, however, rank a player who hits .320 with the bases loaded in a tie game at the bottom of the ninth higher than a player who hits .098 in that situation. Yes, the Runs Batted In (RBI) stat hints at this, but a player's RBI total can be inflated if he bats in a lot of runs when games are not close, which might indicate that the player does not do well under pressure.
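The situational split idea above is just arithmetic on filtered samples. A minimal sketch, with entirely made-up season numbers chosen to mirror the .320 versus .098 example:

```java
public class ClutchSplit {

    /** Batting average = hits / at-bats. */
    static double avg(int hits, int atBats) {
        return atBats == 0 ? 0.0 : (double) hits / atBats;
    }

    public static void main(String[] args) {
        // Hypothetical season line for one hitter
        int hits = 160, atBats = 500;            // overall: .320
        int clutchHits = 4, clutchAtBats = 41;   // bases loaded, late and close: ~.098
        System.out.printf("Overall AVG: %.3f%n", avg(hits, atBats));
        System.out.printf("Clutch AVG:  %.3f%n", avg(clutchHits, clutchAtBats));
    }
}
```

The sabermetric point is that the two numbers come from the same player, and only the second one tells you anything about pressure situations.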

Even though the baseball folks have figured out how to quantitatively score their assets to put a better product on the field, IT folks have been struggling with this for a number of years. IT folks have a grand vision called Enterprise Architecture (EA), in which they can map every asset, process, system and other IT variable and start seeing the gaps in their IT enterprises. As IT business managers look to EA as their panacea for monitoring their enterprises, computer scientists are looking to fields like ontologies and referential systems to capture "inferred relationships" in the IT domain. I believe IT people, including myself and all of the architects, developers, testers, program managers, project managers, CIOs, etc., should take a step back and learn from the baseball folks before we all strike out and head back to the dugout shaking our heads and pondering, "How did I miss that?"

Tuesday, April 8, 2008

AIXM is promising...

For the last few months, I have been indirectly involved in reviewing and analyzing the Aeronautical Information Exchange Model (AIXM) XML standard. After some initial review and attending the AIXM User Conference 2008, I believe this XML standard has a lot of promise. Here are a couple of reasons:

  1. Built on a mature and stable standard - AIXM is built on the Geography Markup Language (GML), a well-accepted geospatial standard. I believe this is a smart move on AIXM's part: AIXM can be used by any GML-compliant application with minor tweaking. When I asked the AIXM folks why certain elements are modeled a certain way, their answer was that they tried to follow the GML standard. I consider that a valid answer, since they are merely extending the GML standard.

  2. Domain specific - Unlike the National Information Exchange Model (NIEM), which has multiple domains, AIXM only addresses the aeronautical domain. Once again, this is smart from a governance perspective, since its Communities of Interest (COI) are limited. Even though NIEM allows various groups to take ownership of each domain, NIEM is facing a problem with harmonizing data entities across its domains.

Things I would like to see from the next AIXM version:

  1. Polish, polish and polish - Since AIXM is an international standard, I would like to see the standard XML schemas polished, which includes:
    • adding annotations to each datatype and element
    • removing embedded elements from datatypes and elements and exposing each datatype and element individually; this will make the standard more flexible and loosely coupled
    • naming each element uniquely; non-unique names are quite common in the AIXM standard today because of the embedded elements in the datatypes

  2. Publish XSL files - If AIXM published XSL transformation files, then applications could easily migrate to new versions of AIXM. I know this would take a lot of work, but I believe it is worth it.

  3. Extract geospatial data from aeronautical data - I know that AIXM is an extension of GML; however, geospatial data is very bulky and can cause bottlenecks in a system. I would like to see pilot projects where non-geospatial AIXM data is served through an XML-based service, with each element in the non-geospatial message mapped to a geospatial AIXM element or message.
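The version-migration idea in item 2 can be sketched with the JDK's built-in XSLT support. The stylesheet below is hypothetical: the Runway and RunwayElement names are invented for illustration and are not actual AIXM elements. The pattern, though, is the standard identity transform plus one renaming rule, which is exactly what a published upgrade stylesheet would look like:

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

public class VersionMigration {

    // Hypothetical upgrade stylesheet: rename an old <Runway> element to a
    // new <RunwayElement>, copying everything else through unchanged
    static final String UPGRADE_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "  <xsl:template match='Runway'>" +
        "    <RunwayElement><xsl:apply-templates select='@*|node()'/></RunwayElement>" +
        "  </xsl:template>" +
        "  <xsl:template match='@*|node()'>" +  // identity transform
        "    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>" +
        "  </xsl:template>" +
        "</xsl:stylesheet>";

    /** Run the upgrade stylesheet over an old-version document. */
    static String migrate(String oldDocument) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(UPGRADE_XSLT)));
        t.setOutputProperty("omit-xml-declaration", "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(oldDocument)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(migrate("<Airport><Runway designator='09L'/></Airport>"));
    }
}
```

If the standards body shipped one such stylesheet per version hop, applications could chain them instead of hand-porting their documents.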

I think AIXM can become a cutting-edge standard. Since AIXM has just introduced the concept of time, or temporality, into the standard, I believe this offers the biggest promise. With temporality, I envision AIXM capturing some aeronautical business processes, and eventually aviation systems could use AIXM instead of BPEL and BPM. The problem with BPEL and BPM is that they are too generic and sometimes cannot be modeled well for a specific domain like aviation. I might be dreaming, but entities like AIXM should consider this option; after all, a BPEL file is nothing more than an XML file.

Thursday, March 27, 2008

Visual Navigation

Yesterday I met an ex-colleague of mine at BullFeathers, on Capitol Hill. I didn't know BullFeathers is a burger joint where politicians meet and discuss politics. My burger was good, but the ambiance was mundane. However, my conversation with Ren Pope, my ex-colleague, was very interesting. Whenever Ren and I have casual discussions, the main focus of our conversation revolves around the data domain. I don't know a lot of people like Ren or myself who really enjoy talking about data and its implications in IT. For example, when we met yesterday, our discussion revolved around concepts like data density, data weights, "What is metadata?", the misuse of data in the public sector, and technologies which deal with data.

During our conversation, Ren mentioned that he built a website [http://rensdomain.net] using TheBrain. After visiting the website, I was initially disappointed, since I didn't see any fancy graphics, embedded YouTube videos, or even much AJAX functionality. Doesn't every website developed in the last year or so use, or abuse, AJAX? However, after spending a few minutes on the site, I really enjoyed it. The beauty of the website is that it visually renders the information flow from one page to another. When you embed TheBrain technology into your website, as Ren did, you don't have to use the "BACK" or "FORWARD" buttons in your browser. Mr. Edward Tufte would be happy with TheBrain, since it does a good job of visualizing information flow channels.

I have personally used TheBrain for a long time. It is a good tool for visually organizing information.

Monday, March 24, 2008

Can EA ever work?

A few months ago, I was sitting with my supervisor, who is an Enterprise Architect (eArchitect), and Gartner consultants, discussing how to sell the concept of Enterprise Architecture (EA) to our organization's senior management and how to create a successful EA in our enterprise.

Before we discuss whether EA can work, let's define the term Enterprise Architecture:
"the description of the current and/or future structure and behavior of an organization's processes, information systems, personnel and organizational sub-units, aligned with the organization's core goals and strategic direction"
- Wikipedia

To capture the true essence of EA, Microsoft.com states the following: "For the first time there is a methodology to encompass all of the various IT aspects and processes into a single practice."

As you can see, capturing every process and nuance in an enterprise is a daunting task. An enterprise can have mature, documented processes which are known and recognized by the enterprise's resources; these are easy to document. There are other processes which are well known but which no one has documented. There are processes which exist but are known only by a select few. And lastly, there are processes that no one is aware of, but which do exist.

So who do you work for? What do you do? What projects are you working on? What is your skillset? These are the typical questions an eArchitect asks when tackling the resource view of the overall EA. The questions might be simple to answer, but they are quite hard to ask. eArchitects don't want to invade someone's space; however, they need to collect the information to complete their model and render different views of the EA for their management. It would be a lot easier if the eArchitects went to the enterprise's Human Resources (HR) manager. The HR manager might even be cooperative and share his or her information; however, that information might not tell the actual picture. For example, my official title in the enterprise I work for is "Computer Specialist", but my functional title is "SOA Technical Lead". Our enterprise's HR manager does not capture functional titles. Therefore, just getting information from the HR manager does not give the real picture. There need to be actual discussions between the EA team and low-level managers.

What operating system is running on your machine? Do we have licenses for this software? How come we have four versions of Microsoft Word? As you can see, assets are a big deal in an EA. Capturing assets is important since it gives an enterprise view of what assets exist, where they are, and how the enterprise obtained them. This information may be great for asset inventories; however, eArchitects need to make data calls to various groups to capture it. And without understanding an enterprise's business processes, eArchitects will not be able to infer why certain assets are valued more highly than others.

eArchitects also have to spend time with enterprise applications to determine if the applications meet critical requirements in the enterprise. They have to work with application Subject Matter Experts (SMEs) to obtain this information. (Once again, the eArchitects are doing a data call.)

As you can see, for a successful EA, eArchitects have to capture lots of information and then synthesize it. This becomes extremely hard when other groups don't see immediate rewards in sharing their information. If there is no information, there is no EA; and when there is no EA, the enterprise cannot be improved. For organizations to get a better view of their enterprise, they need to mandate that their internal groups share their information. If the internal groups don't share, then the consequences are, frankly speaking, dire.

In summary, EA can work; however, it will not work merely through hiring extremely bright and talented eArchitects, buying state-of-the-art EA software, or creating numerous EA models. EA will work if upper management takes the time to create policy which gives EA teams the power to capture their organization's critical information. Upper management should also investigate ways to automatically capture some of the EA information from their processes.

Saturday, March 22, 2008

SOA from a human perspective

A few months ago, I switched jobs. For my new job, I take the Washington Metropolitan Area Transit Authority train (Metro) to my work location. Every day is a different experience: I get on a different train and a different car, sit in a different seat, and sit next to different people. Sometimes there is an announcement that the train is delayed for some reason; when my train is delayed, I usually email my boss or call my wife to say that I will be late for dinner. I also see Metro law enforcement agents, who look for suspicious activities, people or items, get on and off the trains. Why are my observations about Metro interesting?

Well, let's say that I am an XML message sent as a response from a service-providing application. All I, as the XML message, do is sit still in an ESB process (the train) and ride the ESB (the Metro equivalent in an SOA environment). I might ride the ESB process with other XML entities who join me along the way. A system owner or an enterprise owner would be comfortable if security mechanisms were implemented in the ESB so that my fidelity (as an XML message) is not altered and copies of me are not covertly created and sent to shady destinations. To address this, system owners should enforce policies of having security agents who act as listeners on the ESB and then call other agents who take care of a corrupt XML individual. As an innocent XML message, I don't want to be tampered with, and I would be happy if security controls like agents were implemented on the ESB. I would, however, be upset if my train (the ESB process) were significantly delayed because every security agent was "frisking" every XML message. My owner, the system, would then have to send notices to his customers that I arrived late at my destination because I was frisked too many times. Security in the ESB and in the enterprise has to be appropriately implemented without significantly slowing down the system.
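The "security agents on the train" idea maps onto the interceptor pattern most ESBs support. Below is a toy sketch, not any real ESB's API: MiniBus, SecurityAgent, and the binaryPayload rule are all invented names to show the shape of the idea (agents inspect each message in flight, and too many agents means the frisking delay described above):

```java
import java.util.ArrayList;
import java.util.List;

/** A toy message bus: each registered agent inspects a message in flight
 *  and can pull it off the bus before delivery. */
public class MiniBus {

    interface SecurityAgent {
        boolean allow(String xmlMessage);
    }

    private final List<SecurityAgent> agents = new ArrayList<>();
    private final List<String> delivered = new ArrayList<>();

    void registerAgent(SecurityAgent agent) {
        agents.add(agent);
    }

    /** Deliver the message only if every agent clears it; otherwise quarantine it. */
    boolean publish(String xmlMessage) {
        for (SecurityAgent agent : agents) {
            if (!agent.allow(xmlMessage)) {
                return false; // message "escorted off the train"
            }
        }
        delivered.add(xmlMessage);
        return true;
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        MiniBus bus = new MiniBus();
        // Illustrative rule: refuse messages smuggling an opaque binary blob
        bus.registerAgent(msg -> !msg.contains("binaryPayload"));
        System.out.println(bus.publish("<flightPlan id='UA123'/>"));
        System.out.println(bus.publish("<flightPlan><binaryPayload/></flightPlan>"));
    }
}
```

Each agent adds latency to every message, which is exactly the trade-off the post describes: inspect enough to catch the corrupt message, but not so much that every innocent message arrives late.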

In airports, we are asked to take off our jackets and shoes and surrender our pocket change, cellphones, and so on. This is very inconvenient, but we do it because we want to be sure that there are no bad guys on the plane; even if they do board, they have been stripped of their potential weapons at the airport. This way, we know we can reach our destination without much trouble. Using this example, I would like to argue that binary XML is a bad technology for ESB-centric solutions. SOA systems are based on distributed architectures where trust plays an important role between stakeholders and their systems. If binary XML, or even compressed XML, is sent across an ESB, then rogue agents might send encrypted viruses and other malware through the ESB. If the ESB decrypts the message, it could possibly infect the whole middleware infrastructure; if it doesn't, the service consumer may be "hosed". This could bring down an enterprise if the malware message is sent to several destinations in a publish/subscribe model.

Therefore, binary XML is a dangerous technology and should be used on a minimal basis. If you don't think so, then please comment on my post.

Monday, January 28, 2008

XML Specification and Its Implementations

Since XML is a specification and not a technology, the technologies that work with XML implement it differently. This is most evident in newer technologies like the Semantic Web and Service-Oriented Architecture (SOA), which use XML extensively. For example:

Semantic Web
In November 2007, I attended an ontology conference in Columbia, MD. During the conference, Steven Robertshaw stated that there is no standard ontology editor, since each editor renders and validates an OWL document differently. OWL is the W3C standard for ontologies. I agree with Robertshaw's statement because an OWL document I created in Protege did not validate in Altova's SemanticWorks OWL editor, and when I fixed the OWL document for SemanticWorks, it no longer worked in Protege.

Web Services
A WSDL generated by Altova's XMLSpy does not validate well in Cape Clear's WSDL editor, SOAEditor. WSDL is a W3C-approved XML specification for web service contracts. Sometimes certain editors look for attributes in a different place. I have used Mindreef's SoapScope to test a WSDL so that it is approved by all WSDL editors.

XML Schemas
While I was doing in-depth analysis of the Global Justice XML Data Model (GJXDM) for the Department of Homeland Security, I created XML schemas using XMLSpy. This exposed a major issue in the XMLSpy editor: the graphical view stated that the xsd was valid, while the same xsd was invalid in the text view. After some research, I determined that XMLSpy used multiple xsd validators and that each XMLSpy view used a different one. This is still a problem. Look at this article.

Going back to my last blog entry, XML-based modeling technologies have too many quirks, and this is because of how those technologies have implemented the various XML specifications. A data modeler does NOT need this type of confusion. Before I start any modeling exercise, I ask, "Where is my Protege?"

Thursday, January 17, 2008

Can't live without it

In the world of pure data modeling, where data modelers develop data models specific to a domain and not to a technology, the Protege editor is a must-have. A pure data model is an object model, which is different from a data model in a database. The data model in a database is normalized, with certain object attributes grouped together in a table. For example, the Person object might look like this:

Now instances of this object might look like:
  • Enoch Moses: 6 ft
  • Osama Bin Laden: 6 ft 2 inches
  • Barry Bonds: 6 ft 2 inches

However, the Person object might be "shredded" across multiple tables in a database for performance reasons. In a transactional world, systems might query only certain attributes of various object instances, since that is all the system is interested in. This approach is quick and logical; however, data modelers should not model their domain objects to a database, since they might miss some implicit data relationships.

XML editors like Altova's XMLSpy, Stylus Studio and Oxygen provide graphical representations of XML schemas (xsd) which are great for creating object models; however, it is a bit cumbersome to validate the data model with real data. To do so, an XML instance needs to be created with actual data and then validated against its xsd. This can be a painstaking process if the xsds are quite complex.
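That validate-an-instance-against-its-xsd loop can at least be scripted with the JDK's built-in javax.xml.validation API. Here is a minimal sketch using a toy Person schema; the element names are illustrative, not taken from any standard:

```java
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import java.io.StringReader;

public class InstanceCheck {

    // Toy Person schema with two required string elements
    static final String PERSON_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        " <xs:element name='Person'>" +
        "  <xs:complexType><xs:sequence>" +
        "   <xs:element name='Name' type='xs:string'/>" +
        "   <xs:element name='Height' type='xs:string'/>" +
        "  </xs:sequence></xs:complexType>" +
        " </xs:element>" +
        "</xs:schema>";

    /** Validate an XML instance (as a string) against an xsd (as a string). */
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Validator validator = sf.newSchema(new StreamSource(new StringReader(xsd)))
                                    .newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) { // SAXException when invalid, IOException on read errors
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(PERSON_XSD,
            "<Person><Name>Enoch Moses</Name><Height>6 ft</Height></Person>"));
        System.out.println(isValid(PERSON_XSD,
            "<Person><Height>6 ft</Height></Person>")); // Name missing: invalid
    }
}
```

Scripting the check removes the drudgery, but it still doesn't tell you whether the model makes sense, which is the gap Protege fills below.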

For example, let's look at the Person object in an XSD:

For the person called Enoch Moses, the xml instance should look like this:

For the person called Barry Bonds, the xml instance should look like this:

And lastly the Osama Bin Ladin, the xml instance should look like this:

The XML instances might make sense via their structure and data; however, the instances are separate documents, and it can be a painstaking process to validate numerous instances. It can be done, but it is not an enterprise data modeling solution.

UML is also used to model data; however, UML doesn't provide any way of validating the data model against real data. UML can also be ambiguous.

The class in the image does not have any methods, since we are not discussing how to access the various attributes of a class.

After creating database schemas, XML schemas and UML diagrams, I found an open source tool called Protege, which was developed at Stanford University. Protege is an ontology editor that lets modelers create OWL documents (the W3C-approved XML language for ontologies) or ontologies in Protege frames. Since I think XML is not the best medium for developing ontologies or data models, I use Protege frames to develop data models. Protege lets the modeler create an ontology and its views, where the modeler can input real data and see if the data model makes sense. This is how I created the Person class and the three instances in Protege.

Here is the overall Person class (please click on the image to see the larger version of it):

Here is the Name attribute (please click on the image to see the larger version of it):

Here is the Weight attribute (please click on the image to see the larger version of it):

Here is the Height attribute (please click on the image to see the larger version of it):

Here is the Nationality attribute (please click on the image to see the larger version of it):

Protege allows the modeler to validate the data model with actual instance data. It also allows the modeler to create views of the data model. Here are the three instances we have been working with:

Instance Enoch Moses (please click on the image to see the larger version of it):

Instance Barry Bonds (please click on the image to see the larger version of it):

Instance Osama Bin Laden (please click on the image to see the larger version of it):

Protege is a great tool since it lets you follow the MVV (not MVC) pattern, which is:
  • Model - How the data entities are structured and related to each other. The modeler models the data according to requirements, or according to how he perceives the data (which is an ontology).
  • View - Since data models can be complex, Protege allows the modeler to create various views of the model.
  • Validator - This validates the model with real data. Most tools don't allow you to do this, but it is a critical component of any modeling process. I see this as akin to unit testing in programming.
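The MVV idea above can be sketched roughly in code. The names here are illustrative only, not Protege's actual API:

```python
# Illustrative Model-View-Validator sketch; not Protege's actual API.

# Model: the entities, their attributes, and the attribute types.
model = {"Person": {"name": str, "height": str}}

# View: one projection of the model for a particular audience.
def roster_view(instance):
    return f"{instance['name']} ({instance['height']})"

# Validator: check real instance data against the model,
# akin to unit testing in programming.
def validate(instance, cls="Person"):
    schema = model[cls]
    return set(instance) == set(schema) and all(
        isinstance(instance[attr], typ) for attr, typ in schema.items())

person = {"name": "Enoch Moses", "height": "6 ft"}
```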
In summary,
  • modeling data via a database can lead to an incomplete understanding of the data. It provides only one view (the database view) and an incomplete model, but it does allow the modeler to validate the model in that view.
  • modeling data via XML schemas works well; however, it can be extremely process-heavy because XML is verbose and XSD validators vary: one validator may accept a complex instance while another flags the same instance as invalid.
  • modeling data via UML does not allow the modeler to validate the data or create various views of the data. UML is, however, quite useful when developers want to take a UML class diagram and generate skeleton programming classes.
  • modeling data via Protege is quick, easy, and free. It has a great modeling UI. It allows modelers to create various views and to validate the model and views with data provided by the modeler.
As a modeler, I find it hard to model without Protege since I believe it is a pure data modeling tool.

Saturday, January 12, 2008

If Enterprise Level Federated Query is a hoax then what?

A couple of days ago I received an email with a comment on my previous blog entry, Enterprise Federated Query is a hoax!. The comment stated that enterprise federated query is not impossible. It is not impossible, but rather improbable. In that entry, what I meant by an enterprise federated query is a user querying "n" data sources, where the responses from the "n" data sources are combined and a unique set of results is passed to the UI, which then renders it for the end user. This is not possible since there are performance issues, governance issues, security issues, and other functional and technical barriers. I believe that with current technology and smart people, it is possible to create a functional federated query across twenty to thirty data sources. Twenty or thirty data sources is nowhere close to 500 or 10,000 data sources.
The question to ask then is, "what can be achieved in a SOA enterprise?" I believe that rather than a request/response model, a publish/subscribe model is more robust and extensible. If data sources publish new or updated data to a topic, then clients can subscribe to exactly the data they want. Service Level Agreements (SLAs) should be written and agreed upon covering how the data can be manipulated or stored. Metadata repositories and registries are essential for any SOA enterprise to be successful with the publish/subscribe model.
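A minimal in-process sketch of the publish/subscribe shape follows. A real SOA deployment would of course use a message broker such as a JMS topic; this just shows the pattern:

```python
from collections import defaultdict

# Topic registry: topic name -> list of subscriber callbacks.
subscribers = defaultdict(list)

def subscribe(topic, callback):
    subscribers[topic].append(callback)

def publish(topic, data):
    # A data source pushes new or updated data; every subscribed
    # client receives it without ever issuing a query.
    for callback in subscribers[topic]:
        callback(data)

received = []
subscribe("person-updates", received.append)
publish("person-updates", {"name": "Enoch Moses", "height": "6 ft"})
```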

Wednesday, January 9, 2008

Enterprise Federated Query is a hoax!

Everyone uses the federated query example as a great application of the Service Oriented Architecture (SOA) paradigm. Via the SOA paradigm, application developers can integrate their client application with numerous services, allowing the client application's users to search and browse various data sources. This sounds great in theory; however, an enterprise federated query application is not possible. Before we look into why an enterprise federated query application will never be a reality, we need to understand what a federated query application (fqa) is.

An fqa allows its users to query multiple data sources at the same time and then processes the multiple responses into one standardized result set. To the fqa user, it would appear as if he or she is hitting one data source. An fqa would be a killer app if it:
  • was fast - performance is not an issue
  • was very secure - security is not an issue
  • was reliable
  • always returned great results
Unfortunately, the reality is not so nice. Here are the issues:
  • High or unbounded number of data sources - Middleware like an Enterprise Service Bus (ESB) takes the fqa user's request and replicates it for each data source. The middleware then waits until it gets most, if not all, of the responses. It aggregates the responses by removing duplicate or corrupted results, and then forwards the combined response to the fqa. Imagine the time it takes to replicate the requests, wait for the responses, aggregate them into one response, and forward it to the fqa. Now imagine five thousand users sending requests at the same time, or within a short window. The fqa result screen might take minutes to render. Caching can be used to alleviate this problem to a certain degree; however, it is not possible with realtime or near-realtime data.
  • Security - Now let us imagine that every data source accessed by the fqa requires user credentials. This adds time, since every data source has to authenticate the credentials against a data store, increasing the response time (the amount of time it takes the fqa user to get the response). A trust-based security model would alleviate this issue: the data sources trust that the fqa user has already been authenticated by the fqa.
  • Data source variation - Each data source could be different. It could be a relational database like Oracle, SQL Server, or DB2; or it could be something else, like:
    • Flat file
    • Commercial Off The Shelf (COTS) product which has service interfaces
    • Object Database
    • etc.
    Each data source can have its own query language and its own roles. If the data source teams built services to a common service contract like a Web Services Description Language (WSDL) document, that might help; however, the data sources have to be optimized to work well with service interfaces, and performance could still be an issue.
  • Result set aggregation - After the responses are collected, they need to be processed to remove redundant data. Algorithms need to be developed to aggregate the results and display them.
  • Governance - Each of the data sources needs appropriate agreements in place before it can be integrated with the fqa. The agreements can include a Memorandum of Understanding (MOU), Service Level Agreement (SLA), Operational Level Agreement (OLA), etc.
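The fan-out, wait, and de-duplicate steps described above can be sketched as follows. The data sources here are stand-in functions returning canned results; a real fqa would call remote services over the network:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in "data sources": each returns a list of result records.
def source_a(query):
    return [{"name": "Enoch Moses"}, {"name": "Barry Bonds"}]

def source_b(query):
    return [{"name": "Barry Bonds"}]  # duplicates a result from source_a

def federated_query(query, sources):
    # Replicate the request to every data source in parallel,
    # then wait for all of the responses.
    with ThreadPoolExecutor() as pool:
        responses = list(pool.map(lambda s: s(query), sources))
    # Aggregate the responses, dropping duplicate records.
    seen, merged = set(), []
    for response in responses:
        for record in response:
            key = tuple(sorted(record.items()))
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged

results = federated_query("height > 6 ft", [source_a, source_b])
```

Even this toy version blocks until the slowest source answers, which is exactly the latency problem described above.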
As we can see, an enterprise federated query is a large undertaking; in reality it is not possible because of its various dependencies, and it may not be feasible with respect to performance. I built a simple federated query using Yahoo! Pipes, but it only works with three data sources.