Tim Berners-Lee's current proposal for enhancing the WWW, the Semantic Web initiative, defines a set of standards that augment the current WWW standards. The aim is to make the Web more useful for humans by making it more understandable for machines.

The W3C says:

The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications. In order to make this vision a reality for the Web, supporting standards, technologies and policies must be designed to enable machines to make more sense of the Web, with the result of making the Web more useful for humans. Facilities and technologies to put machine-understandable data on the Web are rapidly becoming a high priority for many communities. For the Web to scale, programs must be able to share and process data even when these programs have been designed totally independently. The Web can reach its full potential only if it becomes a place where data can be shared and processed by automated tools as well as by people.
The emphasis, as you can see, is on shared, open data models.

The Semantic Web Activity (an "Activity" is what the W3C calls a large coordination effort) uses standards and concepts developed in the W3C Metadata Activity (PICS, DSig, the P3P privacy standard and the CC/PP device capability description standard).
Additionally, extra effort is dedicated to RDF-related standards (note that RDF is usually written in an XML syntax).

What will it do for me?

If the Semantic Web becomes a reality, we will be able to add real-world, machine-readable information to Web pages in a portable format.
For example, a student's Web page might "say", in a standardized way, that she studies at Purdue; advanced search engines could use the information to answer questions such as "Find all the people named Mary Brown who study but do not work at Purdue".
Moreover, the RDF-related standards will allow intelligent search engines to draw inferences like "an acceptable mailing address for Mary Brown is the mailing address of the Department she studies at"; at the same time, the much-talked-about software "agents" will at last have usable information to grind through.
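
To make the Mary Brown example a little more concrete, here is a minimal sketch in plain Python. It is not RDF and uses no W3C tooling; it only borrows the idea of storing facts as (subject, predicate, object) triples. Every identifier in it (person:1, studiesAt, worksAt, studiesInDept, mailingAddress, and the sample data itself) is made up for illustration. It also assumes that a person with no worksAt fact does not work there, which a real agent on an open Web could not safely assume.

```python
# Hypothetical facts stored as (subject, predicate, object) triples.
# All names and predicates below are invented for this sketch.
TRIPLES = {
    ("person:1", "name", "Mary Brown"),
    ("person:1", "studiesAt", "Purdue"),
    ("person:1", "studiesInDept", "dept:cs"),
    ("person:2", "name", "Mary Brown"),
    ("person:2", "studiesAt", "Purdue"),
    ("person:2", "worksAt", "Purdue"),
    ("person:3", "name", "John Smith"),
    ("person:3", "studiesAt", "Purdue"),
    ("dept:cs", "mailingAddress", "CS Dept, Purdue University"),
}


def subjects(triples, predicate, obj):
    """All subjects that have the given predicate pointing at obj."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def objects(triples, subject, predicate):
    """All objects reachable from subject through the given predicate."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def students_not_employees(triples, name, place):
    """People with this name who study at place and have no worksAt fact for it."""
    named = subjects(triples, "name", name)
    studying = subjects(triples, "studiesAt", place)
    working = subjects(triples, "worksAt", place)
    return (named & studying) - working


def acceptable_addresses(triples, person):
    """The person's own addresses plus those of any department they study in."""
    found = set(objects(triples, person, "mailingAddress"))
    for dept in objects(triples, person, "studiesInDept"):
        found |= objects(triples, dept, "mailingAddress")
    return found


if __name__ == "__main__":
    for person in sorted(students_not_employees(TRIPLES, "Mary Brown", "Purdue")):
        print(person, sorted(acceptable_addresses(TRIPLES, person)))
```

The first function answers the search question with ordinary set operations; the second applies the single inference rule quoted above by following the department link one step.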

A good place to start reading about the Semantic Web is http://www.w3.org/2001/sw/Activity.

W3C allows quoting of its documents provided the following is included: Copyright © 2001 W3C® (MIT, INRIA, Keio) All Rights Reserved. http://www.w3.org/Consortium/Legal/

I think the Semantic Web is a grand idea, but I’m not quite so sure it’ll work, for three very important reasons, all, unfortunately, societal:

First, it assumes that all the people providing information in these XML documents understand the information they are posting in the same way. In the previous writeup, the example was given of a student who posted information that he studied at Purdue, and in the example, an agent program could then take this information and deduce that the student studied, not worked, there. But what if the person who wrote the XML document was the student, and the tag he used implied that he worked there? (After all, he does do a lot of academic work there.) And what if he is only a part-time student, or is on the faculty (who sometimes study themselves), etc.? I think people will need help in making their personal Semantic Web documents conform to the assumptions made by the people who write the RDF vocabularies and the agents.

Second, I’m afraid that privacy concerns will prevent people from posting important personal information in Semantic Web documents. Or rather, I’m afraid that posting personal information in Semantic Web documents will be a threat to privacy. Posting information about oneself to the web in a machine-readable form makes it easier for companies to harvest this data in massive quantities. Spambots already do this to ordinary web pages and USENET groups to build lists of e-mail addresses to send to, which often forces posters to spoof their own e-mail addresses, thus defeating the original purpose of these e-mail reply fields.

And this brings me to my third point: the Semantic Web depends on getting the people who post these machine-readable documents not to lie. Not only could it be abused by spambots, but it might also fall victim to corporate interests that, to further their own ends, publish reams of purposely false data to the Internet, or publish true data in such quantity, or in locations so heavily visited by agents, that it is given a false weight. C’mon folks, you just know they’d do it.

I really hope the Semantic Web manages to overcome (or has already found ways around) all these concerns, especially the last two. I really hate it when petty commercial interests stomp all over such noble concepts.

(Written by a person whose concerns, he realizes, may be completely invalid, but who decided to post them anyway. Please, please, correct me if I’m wrong.)

