Web 4.0.1
In a recent post, I discussed how a few major themes have shaped the evolution of the Internet over the past 35 years. Communication was a major focus of the early Web, allowing the free flow of information between people and organizations. Then, more recently, as Web usage spread to commercial and entertainment purposes, there was a big push to improve presentation mechanisms and technologies.
Now, I want to mention some trends I see emerging that will likely shape the evolution of the Web over the next several years. This post will address the first major trend I see emerging – that of data-driven web applications and services.
When I discussed the themes of Communication and Presentation, I’m sure everyone knew exactly what I was talking about. But, I’m not going to assume that everyone understands the idea of data-driven web applications and services. So, let me start with some explanations and examples.
Historically, web applications have been singly focused on a specific task or service, with that task or service defined by the data that the website controlled. For example, you go to Amazon.com and you can buy products. But, that’s pretty much all you can do at Amazon. You go to weather.com, and you can get the weather. You go to cnn.com and you get the news. I think you get the point. Basically, with today’s Web, each website/application owns a specific set of data (book inventory, weather information, news articles), and that set of data defines what the website/application will allow you to do.
To represent it with a picture, the model looks something like this:

At this point, you might be asking yourself, “But, why would I ever want to go to one place to buy a book, get the weather and read the news anyway?” Good point…you probably wouldn’t. But, let me give you an example of something you might want to do:
I fly airplanes. And when I decide to take a trip somewhere, I have to visit several different websites to plan my trip. First, I visit the website of my flight school so that I can book an airplane. Next, I go to the National Weather Service website to get weather information to plan my trip. Then I go to the websites of each of the airports I’m planning to stop at along the way to get airport diagrams and service information. Then, if I’m planning on staying over, I need to rent a car and get a hotel room. All in all, I spend an hour going through upwards of a half-dozen websites just to plan my trip. In a perfect world (on a perfect Web), I would be able to go to a single website where I could do all of those things in one place. And because this single application would already know certain details of my trip (my arriving location, for example), it could also provide integrated, value-added services (recommendations for hotels near the airport, suggestions for dinner reservations, etc).
This is an example of a data-driven web application. The application itself is hosted by a single website, but the data is coming from multiple different sources (multiple airport websites, National Weather Service, car rental sites, etc) and is being aggregated in real-time to allow the user to perform multiple related tasks. While we aren’t seeing many of these types of applications popping up on the Web just yet, it’s not too far off.
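To make the idea a little more concrete, here is a rough Python sketch of what the aggregation step might look like. Everything in it is an assumption for illustration: the three fetch functions are hypothetical stand-ins for calls to remote sites and simply return canned data, and "KPAO" is just an example airport code. The point is only that one application asks several independent sources about the same trip and merges the answers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote data sources. In a real application each
# of these would call out to a different website; here they return canned data.
def fetch_weather(airport):
    return {"airport": airport, "forecast": "clear"}

def fetch_airport_info(airport):
    return {"airport": airport, "fuel_available": True}

def fetch_hotels(airport):
    return [{"name": "Example Inn", "near": airport}]

# One application, many data sources.
SOURCES = {
    "weather": fetch_weather,
    "airport_info": fetch_airport_info,
    "hotels": fetch_hotels,
}

def plan_trip(destination):
    """Ask every source about the destination at the same time and merge the answers."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fetch, destination) for name, fetch in SOURCES.items()}
        return {name: future.result() for name, future in futures.items()}

trip = plan_trip("KPAO")
print(sorted(trip))
```

Because the application already knows the destination, adding another related service (car rentals, say) would just mean adding another entry to the table of sources.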
So, what’s the fundamental change that will allow these types of complex, data-driven applications to be deployed? The short answer is that the Web is evolving from a set of disparate applications to an integrated application platform; this application platform will allow more complex and more interactive applications, services, and features to be built on top of it. Before I go any further, perhaps I should define “platform.” While everyone has their own definition of what “platform” means, I’ll stick with something simple – a platform is a set of technology components that allow applications to be run on top of it.
So, now I’ve stated that I believe data-driven web applications are one piece of the next evolutionary phase of the Internet; I’ve stated that these types of applications will be enabled by the Web being transformed into an application platform; and I’ve defined platform as a set of components on which applications run. For those of you who like details, perhaps now would be a good time to define what I believe this platform – this set of technology components – will specifically consist of.
In the spirit of pictures, here’s how I would represent the Web as an application platform in the near future (this is actually highly simplified for this discussion…I’ll have more to say in a future post):

As you can see, the model of mapping a single application to a single set of data has been broken. Applications rely on data that is stored in multiple locations around the Internet, and not only will a single application rely on multiple data stores, but a single data store will serve many applications.
This type of data distribution (and the data-driven applications on top) is made possible by two relatively recent advances in technology:
Better Structured Data (Schema)
The Internet has so much information that sometimes it’s difficult to make sense of it all. Companies like Google and eBay have spent years figuring out how to take all the data that’s provided by web users and organize it (structure it) into a format (a “schema”) that is easy for both a human being and a computer to understand. For example, if I go to eBay and do a search for “New 40GB iPod”, eBay’s search engine needs to be smart enough to figure out that I am looking for an Apple iPod with certain characteristics – namely that its condition is new and that it has 40GB of storage. But, how does the search engine know that “iPod” is the product, and “New” and “40GB” are attributes? Humans can parse this type of data and make sense of it; computers have a difficult time doing so.
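Here is a toy Python sketch of the problem. It is not how eBay (or anyone else) actually does it; the tiny vocabularies and the field names are made up for illustration. It does show the basic idea of mapping free text onto a schema with a product and a set of named attributes.

```python
import re

# Tiny, made-up vocabularies. A real system would need vastly larger ones.
KNOWN_PRODUCTS = {"ipod"}
KNOWN_CONDITIONS = {"new", "used"}
CAPACITY = re.compile(r"^(\d+)gb$")

def parse_query(query):
    """Map a free-text search onto a small schema: product plus attributes."""
    result = {"product": None, "condition": None, "storage_gb": None, "unrecognized": []}
    for word in query.lower().split():
        capacity = CAPACITY.match(word)
        if word in KNOWN_PRODUCTS:
            result["product"] = word
        elif word in KNOWN_CONDITIONS:
            result["condition"] = word
        elif capacity:
            result["storage_gb"] = int(capacity.group(1))
        else:
            # Anything we cannot classify is kept rather than silently dropped.
            result["unrecognized"].append(word)
    return result

parsed = parse_query("New 40GB iPod")
print(parsed)
```

The sketch only works because someone decided ahead of time what the products and attributes are, which is exactly the structuring work described above.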
With the proliferation of search technology, new standards for data representation (XML, specifically), and the ever decreasing cost of storage, the ability for companies like eBay and Google to import massive amounts of data, analyze it (whether by hand or by machine) and then begin to organize it is finally becoming feasible. Google Base is one example of an initiative designed to collect and organize massive amounts of data into a well-defined structure that can be used to disseminate human knowledge in a way that a computer can (in some ways) understand.
In our example above (the trip-planning website), the application will only work well if the airport data for my departing location is formatted and structured in the same way as the airport data for my arriving location. In other words, the “schema” for defining airport information must be consistent across all the databases that store airport data.
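As a sketch of what a consistent schema buys you in the airport example, consider two sources that describe airports differently. The record layouts, field names and airport codes below are all hypothetical; the only point is that once both sources are converted into one agreed-upon structure, the rest of the application no longer cares where a record came from.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Airport:
    """The shared schema every source is converted into (fields are illustrative)."""
    code: str
    name: str
    longest_runway_ft: int

def from_source_a(record):
    # Hypothetical source A already reports runway length in feet.
    return Airport(record["id"], record["airport_name"], record["runway_feet"])

def from_source_b(record):
    # Hypothetical source B uses different field names and reports metres.
    return Airport(record["icao"], record["title"], round(record["runway_m"] * 3.28084))

def describe(airport):
    # Application code is written once, against the shared schema.
    return f"{airport.code}: {airport.name}, longest runway {airport.longest_runway_ft} ft"

departure = from_source_a({"id": "AAAA", "airport_name": "Example Field", "runway_feet": 2400})
arrival = from_source_b({"icao": "BBBB", "title": "Sample Municipal", "runway_m": 1000})

print(describe(departure))
print(describe(arrival))
```

Without an agreed schema, every application would have to carry its own converter for every source it touches.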
Web Services and Data Syndication
Once a computer can understand the logical relationships between disparate pieces of data, we’re well on our way to building applications that can be smarter and more flexible. But, there’s one other piece of the puzzle that’s essential to creating data-driven web apps – moving that data from one place to another in a standard fashion.
Web Services is a technology designed to allow computers to communicate with each other and exchange data in a standard fashion. Using Web Services, one computer can “talk” to another computer, find out what kind of data that other computer has, and then request that data. In our trip-planning website example above, our website could use Web Services to talk to each of the other websites it needs data from, and request that data in real time. This means that our website wouldn’t need to store large amounts of data itself, and also wouldn’t need to worry about that data changing or expiring. Every time it needed data from another computer, it could “talk” to that other computer in a mutually accepted fashion – using Web Services.
In addition to Web Services, data syndication standards such as RSS have started to become extremely popular among content creators on the Web. RSS and other syndication standards provide a mechanism by which content authors and owners send data and content to other computers and applications that request it. Along with Web Services, data syndication standards such as RSS will make sending and receiving data as easy as passing a note.
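For a sense of how little it takes to consume syndicated data, here is a small Python sketch that reads the items out of an RSS 2.0 feed using the standard library. The feed itself is invented for the example and embedded as a string; a real application would download it from the publisher instead.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed, embedded here so the example runs without a network.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Weather Feed</title>
    <link>http://example.com/</link>
    <description>Illustrative feed for this post</description>
    <item>
      <title>Morning forecast</title>
      <link>http://example.com/forecast/morning</link>
    </item>
    <item>
      <title>Evening forecast</title>
      <link>http://example.com/forecast/evening</link>
    </item>
  </channel>
</rss>"""

def read_items(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link")) for item in root.iter("item")]

items = read_items(FEED)
for title, link in items:
    print(title, link)
```

Because every RSS feed uses the same element names, the same few lines work against any publisher’s feed, which is what makes syndication such a convenient way to move data around.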
Well, I’ve now touched on the first major trend I see emerging on the Web – data-driven web applications and services. In an upcoming post, I’ll touch on some other trends I foresee.
In the meantime, I’d love any feedback you might have…
December 9th, 2005 at 10:18 am
Re: Web 4.0.1
My cow-orker Jason Steinhorn has leapt ahead to Web 4.0.1. Here are my initial thoughts on his ideas.
I think a key take-away is that big web companies with large data stores shouldn’t focus on creating one giant web site application.
Instead, th…
December 13th, 2005 at 5:42 pm
[…] overlaying technology’s three titans with steinhorn’s primer on the three core building blocks of web 4.0.1, i see yahoo excelling at apps, microsoft keen on the communication layer (“web feeds”, SSE, open document standard) and google as the mother of all data. too simple, really, but it’s late in the day and I am taxed as it is. […]
December 13th, 2005 at 9:54 pm
[…] This was nice to see after my post about how distributed data providers and data driven applications will form a major part of the next “version” of the Web. I think this is a great example of how large data stores of information will be made available to drive both niche and large-scale applications that form the backbone of future web experiences. […]
December 17th, 2005 at 2:29 pm
[…] In my Web 4.0.1 post (please take these titles for what they’re worth…my disdain for trying to “version” the Web), I discussed “distributed data driven applications and services” as an emerging trend on the Internet. In retrospect, it appears I got tied up in that awful techno-speak that this industry has been propagating for the past twenty years, trying to make something really simple sound really fancy and complicated; I guess our industry is overly concerned that people won’t pay good money for our products if they were to realize that they’re being sold basic tools with lots of fluffy marketing. Anyway, to my point, what I should have said about the first emerging trend is that “it’s all about data.” Nice and simple. It’s about moving around, integrating and then surfacing all the data out on the web in ways that are easily consumable by Web surfers. […]
August 8th, 2007 at 11:04 am
[…] This isn’t a new theme…in fact, I blogged about some very similar ideas a year and a half ago. […]