Archive for the ‘Uncategorized’ Category

Monosyllabic Roots in Hebrew

October 17, 2010

New article “Monosyllabic Roots in Hebrew” at Lingbuzz

Increased Interest in the Semantic Web

August 9, 2010

The Semantic Web has been receiving more and more attention recently, as can be seen from Google's purchase of Metaweb, the company that developed the social semantic database Freebase. It is less well known that Microsoft has recently licensed advanced linguistic technology from Cognition. As a result, Microsoft now has access to the technology of both Powerset and Cognition.

When Microsoft bought Powerset they did not integrate it into their main search product Bing, so one might have concluded that Microsoft was not really that interested in utilizing Natural Language technology, but simply wanted to make use of the brain power of Powerset’s team of computer scientists. But now that Microsoft has also licensed the scientific linguistic parsing from Cognition, it seems clear that they do in fact have plans to utilize Natural Language technology.

At this point, of the three companies that concentrate on computerizing the most recent theoretical linguistics, two (Powerset and Cognition) have been snapped up by Microsoft. This leaves Linguistic Agents’ technology as the only available alternative on the market.

Linguistic Technology

January 10, 2010

Humans have a unique ability (known in linguistics as the Language Faculty) to process complex syntactic structures. The Language Faculty is a subject of study for theoretical linguists.

After many years of intensive research in the field of theoretical linguistics, there has been significant progress in the deciphering of the general properties of human language.  The recent expansion of research beyond English and other familiar European languages has enabled the refinement and verification of the central discoveries of theoretical linguistics. 

People have tried preprocessing text with a linguistic parser that generates a “parse tree”, but working with plain text directly leads to much better results. As a result, the progress achieved by theoretical linguistics in the study of human language is not presently utilized in deep learning.
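To make the term concrete, here is a minimal sketch of a parse tree as a data structure. The example sentence, the node labels (S, NP, VP and so on) and the class name `Node` are all illustrative assumptions; this is not the output or the format of any particular parser.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a (hypothetical) constituency parse tree."""
    label: str                     # phrase category or part of speech
    word: str = ""                 # filled in only for leaf nodes
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the words of the sentence in left-to-right order."""
        if not self.children:
            return [self.word]
        words: List[str] = []
        for child in self.children:
            words.extend(child.leaves())
        return words

    def bracketed(self) -> str:
        """Render the tree in the usual bracketed notation."""
        if not self.children:
            return f"({self.label} {self.word})"
        inner = " ".join(child.bracketed() for child in self.children)
        return f"({self.label} {inner})"


# Hand-built tree for an invented sentence: "Dana read the book".
tree = Node("S", children=[
    Node("NP", children=[Node("N", "Dana")]),
    Node("VP", children=[
        Node("V", "read"),
        Node("NP", children=[Node("D", "the"), Node("N", "book")]),
    ]),
])

print(tree.bracketed())
print(tree.leaves())
```

The plain-text alternative mentioned above would hand a model only the flat word sequence returned by `leaves()`, without the bracketing.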

The Scientific Infrastructure for the Linguistic Web

January 5, 2010

There are two major pre-requisites for the emergence of the Linguistic Web:

1) A solid Linguistic Parser

2) An extensive Lexical Semantic Database

Not only do these two elements need to exist – they must also be generally available to developers worldwide. We will now examine the current status of each of these two crucial components of the Linguistic Web.

Linguistic Parser

Just about everybody has heard about the big buzz generated by Microsoft’s acquisition of Powerset. This is just one example of the magnitude of effort needed to develop realistic, industrial-strength linguistic software.

Scientific linguistic technology requires a very long period of development and significant financial investment. Powerset’s collection of natural language technologies incorporates over 25 years of intense scientific research, which originated at PARC (Palo Alto Research Center).

After all this effort invested in research and development, it is not clear what Microsoft’s policy will be with regard to making their sophisticated linguistic platform generally available.

And yet, there are other players on the block with advanced linguistic parsers, who just may go ahead and make available the scientific technology which can be used in the Linguistic Web for the massive production of language oriented applications.

Lexical Semantic Database

To meet the requirements of the Linguistic Web, any Semantic Ontology must be constructed in terms of the natural semantic concepts used by the Language Faculty (FL), the basis for the inborn human ability to process language.

What is needed is something such as the “Semantic Map” developed by Cognition Technologies.  It took more than 20 years to build, and is probably the largest scientific linguistic database for English in the industry.

Is there a comparable Lexical Semantic Database available for everyone? Not at the moment. Perhaps what is needed is a Wikipedia-type collective effort in order to build a global Lexical Semantic Database. Obviously, this effort must be in sync with the accumulated insights of the last 60 years of intense research in theoretical linguistics. 


Any way you look at it, an infrastructure which contains both a Linguistic Parser and a Lexical Semantic Database will be needed in order to jumpstart the Linguistic Web.

Imagine the economic impact of all these various natural language solutions, in all the major languages, being developed worldwide, all using the same underlying linguistic platform. Of course, a new standard format for the representation of Natural Language Objects will also be necessary, but this is a subject for another posting.

Israel: Leader of Business Innovation

November 8, 2009



Dan Senor, co-author of ‘Start-up Nation: The Story of Israel’s Economic Miracle,’ discusses with CNBC how Israel has managed to become a leader in business innovation.

21 Semantic Roles for Linguistic Web

October 28, 2009

Following is the preliminary list of Semantic Roles (known in linguistics as “thematic roles”) for use in the Linguistic Web. The Roles are part of the intermediate protocol that makes it possible for linguistic processing services to present their results in a simple, unified form.

(1)  AGENT


(3)  CAUSE

(4)  THEME





(9)   GOAL

(10)  SOURCE


(12)  PATH

(13)  MANNER




(17)  RESULT


(19)  TIME
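As a rough illustration of how a linguistic processing service might report its results using these roles, here is a minimal sketch. It includes only the roles that appear in the list above, with their original numbers. The names `Role`, `RoleAssignment` and `to_record`, the record layout, and the hand-annotated example sentence are all assumptions made for illustration, not part of any published protocol.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Role(Enum):
    """Semantic roles from the list above, keeping the original numbering."""
    AGENT = 1
    CAUSE = 3
    THEME = 4
    GOAL = 9
    SOURCE = 10
    PATH = 12
    MANNER = 13
    RESULT = 17
    TIME = 19


@dataclass(frozen=True)
class RoleAssignment:
    """Links one phrase of a sentence to the role it plays (hypothetical type)."""
    role: Role
    phrase: str


def to_record(predicate: str, assignments: List[RoleAssignment]) -> Dict[str, object]:
    """Flatten an analysis into a plain dictionary a service could return."""
    return {
        "predicate": predicate,
        "roles": {a.role.name: a.phrase for a in assignments},
    }


# Hand-annotated analysis of an invented sentence:
# "Dana sent the letter from Haifa yesterday."
record = to_record("send", [
    RoleAssignment(Role.AGENT, "Dana"),
    RoleAssignment(Role.THEME, "the letter"),
    RoleAssignment(Role.SOURCE, "Haifa"),
    RoleAssignment(Role.TIME, "yesterday"),
])

print(record)
```

Keeping the roles in a fixed enumeration means every provider of linguistic analysis would use the same labels, which is the point of a unified form.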






Linguistic Technology Going 2.0.

October 27, 2009

The Linguistic Web envisions high-resolution linguistic intelligence tools freely accessible to the developers of natural language applications, opening the possibility of easy collaboration between providers of linguistic analysis software and developers of natural language applications.

Companies like Powerset and Cognition Technologies already have solid, scientifically based software for linguistic analysis, the fruit of long years of research. Of course, application developers do not presently have free access to anything like these proprietary solutions. Nevertheless, the very existence of such advanced systems proves that it can be done.