Algebraic Structure of IEML Semantic Space
The mathematical foundations of IEML
We propose in this article a formal description of IEML, a novel
language designed to be used within computational intelligence and
collective intelligence domains. Its target applications are:
collaborative semantic tagging (“balisage”) of any idea or concept
available on the Web (blogs, images, software, documents, data in
general);
collaborative semantic search, including comparison, merging and
navigation;
interoperable modeling and simulation in social sciences, management, design, gaming and digital storytelling.
The emphasis is put on the operations that can be performed on
IEML expressions. Combining these operations enables the automatic generation and transformation both of IEML expressions and of semantic networks of IEML
expressions.
IEML has been designed with two goals in mind: to provide a practically unlimited method of semantic expression while remaining well
within the limits of modern computation.
IEML: PURPOSES AND STRUCTURE
By Pierre Lévy
Professor at the University of Ottawa (Department of Communications)
Canada Research Chair in Collective Intelligence
Member of the Academy of Sciences of Canada (RSC)
Translation from French: Michele Healy, PhD
I Introduction
Ieml (Information Economy Meta Language) is an artificial language designed to be simultaneously: a) optimally manipulable by computers; and b) capable of expressing the semantic and pragmatic nuances of natural languages. The design of ieml responds to three interdependent problems: the semantic addressing of cyberspace data; the coordination of research in the humanities and social sciences; and the distributed governance of collective intelligence in the service of human development. Indeed, the semantic addressing of oceans of digital documents and the coordination of the social sciences find their full meaning only as a function of the ultimate goal of ieml, which is to contribute to the well-informed governance of human development.
Because ieml is a metalanguage for describing the information economy, I will begin here by explaining what is understood by “information economy”, followed by the meaning of the term “metalanguage” in this context.
The information economy
The information economy is an inclusive concept that goes considerably beyond the dimensions of a monetary economy. It designates the metastable and evolutionary ecosystem of flows of meaningful data that are produced, maintained, and transformed within a human population. We can use the term collective intelligence to designate this object, as long as we understand intelligence not as the opposite of stupidity, but rather as a self-sustaining and interdependent dynamic of cognitive functions (perception, memory, learning, communication, coordination of action, etc.) at the scale of a community. Because the information produced, stored, exchanged and interpreted by human societies is increasingly being coded in digital form, and the circulation of this information tends to converge in the same interconnected network, it has become possible to observe the information economy in a manner that is much more subtle and integrated than was possible before the advent of computers. Nonetheless, simply taking a quantitative measure of information flows (or even of their monetary value) clearly is an insufficient basis from which to arrive at a thorough understanding of the information economy. Therefore, the research community must have access to observational and analytical instruments so that we can identify: 1) the semantic qualities of the data stored on servers and exchanged in networks - in other words, what the data represents - as well as 2) their pragmatic pertinence - that is to say, the effect and use of the data in context. Ieml has been designed to serve precisely this function of identifying and scientifically analyzing the meaning and contextual effects of information.
Metalanguage
A scientific-type ideographic script
The word “metalanguage” contained in the name “ieml” condenses several meanings. First, ieml belongs to the very general category of systems of cultural signs, or systems of symbols. In other words, we are dealing with a convention - or artefact - and not with a natural object.
Second, it is a meta-language: a language about language. It has been specially designed to index and characterize data and phenomena that are already symbolic in nature.
Further, of all the possible metalanguages, it is a system for the scientific notation of meaning, with a combinatorial structure that authorizes a broad range of automatic manipulations. Although its expressions can be pronounced (because they are expressed using alphabetical characters), ieml is not a natural language, nor is it intended to replace or simulate natural languages such as French, English, Russian, Mandarin, or Arabic. Rather, it is a scientific script, or a reasoned system of notation, designed to maximize the possibilities for computer calculation.
Third, ieml is also an ideographic script within which each symbol represents a concept. It is important to note that, during the 15 years I devoted to this project before publication, I worked by manipulating icons so that I would be influenced as little as possible by the natural languages I know. It was only in the final months of my research that I replaced the icons by letters of the Latin alphabet, to facilitate keyboard entry. Thus, ieml is - in principle - independent of natural languages.
Combinatorial and articulated structure
As with many other systems of signs, ieml is structured along several levels of articulation. For a proper understanding of the system of articulation used in ieml, it can be useful to compare it to the system used in natural languages. Therefore, I will begin with a look at the articulation of natural languages, before turning to the articulation seen in ieml.
The levels of articulation in natural languages
The first level of articulation in natural languages is the phoneme (phonemes are the elementary sounds of a language). Generally, phonemes have no meaning per se.
The second level of articulation is the morpheme (word roots and markers of case, gender, number, etc.). Morphemes are composed of phonemes. They constitute the first meaningful unit of articulation of a language.
The third level of articulation is the word, which is composed of morphemes. Words are not perceptible outside of writing. For cultures without a script, the distinction between word and morpheme would be meaningless.
The fourth level of articulation is the phrase, composed of words. The phrase is the first level of articulation to contain not only a meaning but also a reference. For example, the word “tree” is neither true nor false; rather, it only indicates a concept. Only the phrase “the tree grew”, referring to an actual event, has the capacity to be true or false.
The fifth level of articulation is discourse, or the text, which is composed of phrases, etc.
Now let’s have a look at the successive levels of articulation in ieml.
Levels of articulation and combinatorial structure of ieml
1) Five primitive elements form the first level of articulation. These are: the virtual U, the actual A, the sign S, the being B and the thing T. Their meanings will be explained in greater detail in the latter part of this text. These five elements are organized around two poles:
the pragmatic pole of action, which includes the elements virtual and actual;
the semantic pole of representation, which includes the elements sign, being and thing.
Here, in contrast to the case of natural languages, even the first level of articulation is meaningful.

2) The second level of articulation, the level of events, is formed of 25 (or 5²) directed pairs of elements, or information flows between elements. In contrast to natural languages, all combinations of any two of the first units of articulation are valid and meaningful units of the second level of articulation. As we shall see in greater detail later on, the meaning of a combination of elements results from the combination of the meaning of these elements. For example, the directed relationship U → U (virtual to virtual) means “to reflect”, the directed relationship U → A (virtual to actual) means “to act”, the directed relationship A → U (actual to virtual) means “to perceive”, etc. A complete explanation can be found in the third part of this text.
Figure: ieml events

3) The third level of articulation, the level of relations, is formed of 625 (or 25²) directed pairs of events, or information flows between events. Here again, all combinations are valid and meaningful, and the meaning of a relation is in principle the result of the meaning of the events of which it is composed.
4) The fourth level of articulation, the level of ideas, is formed of simple relations, and by directed pairs or ordered triplets of relations. There are roughly 240 million possible ideas (625 + 625² + 625³ = 244,531,875). Of these, only slightly more than one thousand have been identified (as of 1 May 2006). In other words, the - necessarily collective - worksite is open for construction.
5) Finally, the fifth level of articulation, of phrases, is formed by simple ideas, or by pairs or even ordered triplets of ideas. The number of possible phrases is astronomical, on the order of 10²⁵. Below is a diagram mapping out the structure of an ieml phrase, using the example bo soko, which in ieml means “language of collective intelligence”. In this diagram, the stars * mark the empty roles within a structure that is exactly the same for all ieml phrases. Each phrase fills this structure to a greater or lesser extent depending on its particular composition.
Figure: A phrase in ieml

Thus, the elements (1st level), events (2nd level), relations (3rd level), ideas (4th level) and phrases (5th level) of ieml are ideograms of five levels of nested complexity, all constructed in a regular and combinatorial fashion.
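The sizes of the five levels follow directly from the combinatorial rules stated above, and can be checked with a short Python sketch. The single-letter codes are those given in the text; the variable names are illustrative only and are not part of ieml.

```python
from itertools import product

# The five primitive elements named in the text:
# virtual, actual, sign, being, thing.
ELEMENTS = ("U", "A", "S", "B", "T")

# Level 2: events are directed (ordered) pairs of elements.
events = list(product(ELEMENTS, repeat=2))

n_elements = len(ELEMENTS)
n_events = len(events)

# Level 3: relations are directed pairs of events.
n_relations = n_events ** 2

# Level 4: ideas are single relations, directed pairs of relations,
# or ordered triplets of relations.
n_ideas = n_relations + n_relations**2 + n_relations**3

# Level 5: phrases are built from ideas in the same three ways.
n_phrases = n_ideas + n_ideas**2 + n_ideas**3

print(n_elements, n_events, n_relations)  # 5 25 625
print(n_ideas)                            # 244531875
print(len(str(n_phrases)) - 1)            # order of magnitude: 25
```

Because Python integers have arbitrary precision, the phrase count is computed exactly rather than approximated.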
In order to avoid any confusion with the particular level of ideas, I propose that we call the ideograms in ieml “glyphs”. The word glyph is best known from the hieroglyphs used in ancient Egyptian writing (although these characters were in fact mixed in nature - part ideogram, part phonetic).
As a general rule:
All symbols in ieml are composed using symbols from a lower level of articulation, at least up to the simple, or non-compound, symbols, which are the elements.
The meaning of a combination of symbols results from the combination of the meanings of the combined symbols.
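The first rule can be pictured as a recursive data structure. The following Python sketch is only an illustration: the class Glyph and the helpers level_of and elements_of are hypothetical names, not part of any ieml software, and the sketch models only the nesting of symbols, not their meanings.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# The five elements and their names, as given in the text.
ELEMENTS = {"U": "virtual", "A": "actual", "S": "sign", "B": "being", "T": "thing"}

# A symbol is either an element (a one-letter string) or a composite glyph.
Symbol = Union["Glyph", str]


@dataclass(frozen=True)
class Glyph:
    """A composite symbol made of symbols from a lower level (hypothetical class)."""
    parts: Tuple[Symbol, ...]

    def __post_init__(self) -> None:
        if not self.parts:
            raise ValueError("a glyph needs at least one part")
        for part in self.parts:
            if isinstance(part, str) and part not in ELEMENTS:
                raise ValueError(f"unknown element: {part!r}")


def level_of(symbol: Symbol) -> int:
    """Elements are level 1; a composite sits one level above its parts."""
    if isinstance(symbol, str):
        return 1
    return 1 + max(level_of(part) for part in symbol.parts)


def elements_of(symbol: Symbol) -> List[str]:
    """Decompose a symbol all the way down to its elements, in order."""
    if isinstance(symbol, str):
        return [symbol]
    return [e for part in symbol.parts for e in elements_of(part)]


to_act = Glyph(("U", "A"))           # event: virtual to actual
to_perceive = Glyph(("A", "U"))      # event: actual to virtual
relation = Glyph((to_act, to_perceive))

print(level_of(to_act), level_of(relation))  # 2 3
print(elements_of(relation))                 # ['U', 'A', 'A', 'U']
```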
Digital addressing
Each glyph in ieml is permanently associated with a set of numbers that form the specific digital address for that glyph. The digital address of the glyphs is formed of degrees on scales. The principle of scales in ieml is explained in greater detail in the final part of this text, but it is important that I introduce the idea here so that I can justify the claim that ieml allows automatic semantic analysis, and makes semantic distances calculable. An element, for example, is associated with two degrees, with each of the two degrees marking a position on a different scale. A phrase is associated with a combination of 200 degrees situated on 200 distinct scales distributed over 5 levels. Each scale represents a particular dimension of analysis of the meaning of the glyph.
Figure: Primitives, glyphs and graphs

Finally, the glyphs (or ideograms) in ieml can be assembled into “texts” of innumerable quantity, called graphs. The graphs in ieml can take three main forms, which can in turn be combined: series (linear orders of glyphs), trees (hierarchical or genealogical orders of glyphs) and matrices (Cartesian orders of glyphs that cross rows and columns). The graphs can serve to describe or index documents and phenomena of all sorts, and to express ideas, theories, classifications, and more.
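As a rough illustration of the three forms of graph, the Python sketch below arranges the 25 events as a series, a tree and a matrix. Writing an event as a two-letter string, and grouping the tree by source element, are assumptions made for this example; they are not ieml notation.

```python
ELEMENTS = ["U", "A", "S", "B", "T"]

# Series: a linear order of glyphs (here, three events in a chosen order).
series = ["UU", "UA", "AU"]

# Tree: a hierarchical order of glyphs (here, each element is the parent
# of the five events that start from it).
tree = {src: [src + dst for dst in ELEMENTS] for src in ELEMENTS}

# Matrix: a Cartesian order that crosses rows (source element)
# with columns (destination element).
matrix = [[row + col for col in ELEMENTS] for row in ELEMENTS]

for line in matrix:
    print(" ".join(line))
```

The same 25 glyphs appear in the tree and in the matrix; only the ordering imposed on them differs.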
Dynamic properties
Ieml offers systematic coherence, digital addressing and computational capacities, making it a dynamic script with remarkable properties. Of note, ieml graphs can be used as “building blocks” for the modelling and simulation of information economies according to various rules.
Once certain criteria have been selected and combined, it is simple to automatically generate and re-order complex graphs.
The “semantic distances” between graphs can be calculated automatically from their digital address according to a large palette of criteria and an indefinite quantity of points of view.
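One simple way to picture such a calculation, assuming that a digital address is a fixed-length tuple of numeric degrees (one per scale), is a weighted sum of differences, with the weights standing in for the chosen criteria or point of view. The function semantic_distance and the sample addresses below are hypothetical; the text does not specify the actual metric or the actual degrees.

```python
from typing import Optional, Sequence


def semantic_distance(
    a: Sequence[float],
    b: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted Manhattan distance between two addresses (illustrative metric)."""
    if len(a) != len(b):
        raise ValueError("addresses must use the same number of scales")
    if weights is None:
        weights = [1.0] * len(a)  # every scale counts equally
    if len(weights) != len(a):
        raise ValueError("one weight per scale is required")
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))


# Made-up addresses with two degrees each, as the text says an element has.
addr_1 = (1, 3)
addr_2 = (2, 1)

print(semantic_distance(addr_1, addr_2))              # 3.0
print(semantic_distance(addr_1, addr_2, (1.0, 0.0)))  # 1.0 (second scale ignored)
```

Changing the weights changes the point of view: two glyphs that are far apart on one scale may coincide once that scale is given zero weight.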
Each graph or set of graphs can play any of three distinct roles.
1) First, an ieml graph can play the role of the object to be analyzed, of the text to be read and interpreted.
2) Next, an ieml graph can play the role of a reading grid, a tool for interpretation or analysis. In other words, an ieml graph is capable of displaying data about other graphs, depending on the cognitive perspective it represents. That each graph can be positioned as a centre of reference, and of control, of the semantic space is of cardinal importance. It reflects one of the foundational principles of this metalanguage: at a given level of composition, no concept is more important than any other, and they all can be considered virtual centres.
3) Finally, an ieml graph can also play the role of compositional instrument for other graphs. That is, matrices can serve as keyboards. Ordered lists and trees can serve as dictionaries or classifications that make it possible to select concepts judiciously.
Early intuitions toward ieml.org
Starting in the late 1970s, I began to anticipate that computers would become the medium of intellectual technologies, technologies that would profoundly transform and expand our ways of thinking and communicating. My early background was in philosophy, history and, more generally, in the humanities, which I studied in Paris from 1975 to 1985. I especially felt the influence of the French historical and anthropological schools, just as surely as I was marked by the philosophical excitement bubbling through Paris in the 1970s and 1980s. Still, this did not keep me from quenching my thirst at other sources as well (positivist, analytical, Anglo-Saxon, Oriental, etc.). In parallel explorations, I took an interest in the beginnings of computer science and artificial intelligence, as well as the connections between information theory, the cognitive sciences, and biology. I studied the Macy Conferences, read the works of Turing, Shannon, Wiener, von Neumann, McCulloch and von Foerster. I retraced the path blazed by the pioneers of augmented intelligence - Douglas Engelbart, Joseph Licklider, Theodore Nelson. I was a passionate observer of the birth of personal computing and the Internet. In 1990, three years before the Web went public thanks to the genius of Tim Berners-Lee, I published a book entitled Les Technologies de l’intelligence (Intelligence Technologies), which analyzed the philosophical and cultural meaning of the convergence of networks of computers with hypertextual networks. My work on the hypothesis of a dynamic ideography, published in 1991 as Idéographie Dynamique, and the invention, together with Michel Authier, of a computerized system for the visualization of collective dynamics of knowledge (Trees of Knowledge, or Les Arbres de connaissances, 1992) bear witness to the fundamental intuitions that would eventually lead to the formation of ieml.
By the late 1980s, I was convinced that, in order to take best advantage of the unprecedented possibilities made available by cyberspace for the manipulation of symbols, we needed an intellectual technology that hypertextually links all possible concepts within a calculable network - yet without granting any particular privilege to any of them. In other words, we needed to extend the form “P2P” (which, although not common knowledge at the time, was nonetheless implicit in the structure of the Internet and hypertexts) to include the relationships between concepts. In order to retain this neutrality and equality of design, the generative motor for the new digitally-based thought instrument could be nothing other than the logical analysis of meaning itself. That way, no concept could be excluded or marginalized. It was an article by François Rastier that put me on the path of the semiotic triad (sign S, being B, thing T) as the possible foundation of the metalanguage to come. My subsequent work on collective intelligence (L’intelligence Collective, 1994) and the virtual (Qu’est-ce que le virtuel?, 1995) helped me refine my initial hypotheses and add complexity to the semiotic triad by joining to it the pragmatic dyad (virtual U, actual A). Yet it was not until I was awarded a Canada Research Chair at the University of Ottawa that I was able, from 2002 to 2006, to dedicate my full-time efforts to detailed plans and the formation of ieml.
The site www.ieml.org will publish the various successive and augmented versions of the metalanguage. It will also offer open source downloads that make use of ieml, and publish reports and scientific studies on its use. In time, a community of developers and users could organize their efforts, and pool their various suitable means of collaboration (wikis, real-time P2P data-sharing, etc.).
For now, at the time of its inauguration in May 2006, the ieml language exists only as a core structure. And while its dictionary does make it possible, even now, to describe a broad range of ideas and phenomena, it remains limited to a few hundred lexical units. The editing and automatic indexing tools for using the dictionary are in the prototype or planning stages. So, for the time being, ieml remains a scientific research project. Its growth and future success will depend on the commitment and collaboration of many partners: public and private research laboratories, governments, international agencies, and user companies and communities.
In the next section, I will expand on the reasons that led me to design ieml, after which - in the third section of this document - I will describe in greater detail the fundamental structure first outlined in the introduction.
Please find the complete document below: ieml-purpose-structure