Our work focuses on developing a state-of-the-art AI software agent. We propose a true “thinking computer” that is self-aware, can see and interpret information, communicates with human beings and other computers, and is able to extract and learn from text or voice communication with a human or another computer, from image data, or from its own experience. The computer will become a thinking machine and can form teams with humans or other computers. This software will make a computer a knowledgeable, thinking, and reacting machine, much as a human is. The software can update its knowledge from an operator level up to the knowledge-intensive engineering sphere, and the computer will achieve this by itself, without a programmer’s intervention, as a matter of training. The training can come from interaction with a human “teacher”, or from a properly organized library of books, much like the regular learning process of a human student. What does our system do when it encounters something new? Much like a human student, it tries to find an answer and/or asks the teacher for input.
Our software not only understands voice and text communications; it responds intelligently and can act upon a user’s input or upon texts. It can learn its tasks from user input, accumulating previous experience much like a human student. Our software agent, although written for English-language environments, can easily be adapted to other languages: in the cognition process a language is treated as a set of rules, while human knowledge is universal.
The research will focus on a revolutionary way of acquiring, representing, retrieving, and interpreting information in a computer, much as it is done in a human. Human beings store text, audio, video, and perceptual information in compressed form and interpret it, organizing it in databases inside the brain. Knowledge, language, text interpretation, and learning are not governed by complex mathematical algorithms, even though machine interpretation does require some mathematical abstraction and, at times, probabilistic fitting when a precise sense or piece of knowledge cannot be uncovered. Any new behavior is based on the interpretation of acquired experience data.
We are aware of numerous attempts to produce AI agents. So far, they have had very limited success or none at all. A great deal of research has been done, and ironically it is all that research and those attempts that now make it possible to create our AI agent.
Some researchers gather statistical data from corpora and then write programs that use it. One can combine the language-model probability with the acoustic score in the hope of obtaining the correct interpretation. For many years the speech recognition community has relied on the so-called “trigram” model. Acoustic models are sensitive to the speaker and to the noise environment. A statistical parser, in principle, finds all possible parses of a sentence and adds up their probabilities; the result is the probability of the sentence. In practice these parsers find only a subset of the parses. All statistical models of complicated real-world phenomena must make independence assumptions, and the goal in statistical language work is to make those assumptions as natural as possible. That is not always possible.
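As an illustration of the statistical approach just described, the sketch below builds a trigram language model with add-one smoothing and combines it with a per-hypothesis acoustic score, as a speech recognizer might. The corpus handling, the smoothing choice, and the rescore helper are our own simplified assumptions, not the method of any particular system cited here.

```python
from collections import defaultdict
from math import log

# Minimal trigram language model with add-one smoothing (illustrative sketch).
class TrigramLM:
    def __init__(self, sentences):
        self.trigrams = defaultdict(int)
        self.bigrams = defaultdict(int)
        self.vocab = set()
        for sent in sentences:
            tokens = ["<s>", "<s>"] + sent + ["</s>"]
            self.vocab.update(tokens)
            for i in range(2, len(tokens)):
                self.trigrams[(tokens[i - 2], tokens[i - 1], tokens[i])] += 1
                self.bigrams[(tokens[i - 2], tokens[i - 1])] += 1

    def log_prob(self, sentence):
        tokens = ["<s>", "<s>"] + sentence + ["</s>"]
        v, total = len(self.vocab), 0.0
        for i in range(2, len(tokens)):
            tri = self.trigrams[(tokens[i - 2], tokens[i - 1], tokens[i])]
            bi = self.bigrams[(tokens[i - 2], tokens[i - 1])]
            total += log((tri + 1) / (bi + v))      # add-one smoothing
        return total

# Hypothetical rescoring, as in speech recognition:
# score(hypothesis) = acoustic log-probability + weight * language-model log-probability.
def rescore(hypotheses, lm, lm_weight=1.0):
    """hypotheses: list of (token_list, acoustic_log_prob) pairs."""
    return max(hypotheses, key=lambda h: h[1] + lm_weight * lm.log_prob(h[0]))
```

The point of the example is only the division of labor: the acoustic score ranks what was heard, while the trigram model ranks what is likely to be said.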
Some proposed software agents model semantic relationships between syntactic elements in English sentences. These systems also depend on an outside source of data: a cooperative user. Unlike knowledge-intensive and corpus-based systems, however, such a system does not require a large repository of semantic information or any previously analyzed data: it can start processing a text from scratch. The system inspects the surface syntax of a sentence to make informed decisions about its possible interpretations and then suggests these interpretations to the user. As more text is analyzed, the system learns from previous analyses to make better decisions, reducing its reliance on the user. Evaluation confirms that this semi-automatic acquisition of the model of a text is relatively painless for the user.
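A minimal sketch of such a semi-automatic loop is given below. It only illustrates the control flow described above; the surface_pattern key, the candidate interpretations, and the ask_user callback are hypothetical placeholders rather than parts of any cited system.

```python
# Hedged sketch of a semi-automatic acquisition loop: the system proposes
# interpretations from the surface form of a sentence, asks the user only when
# it has not seen that pattern before, and remembers confirmed analyses so that
# later decisions need less user input. Every name here is a hypothetical
# illustration, not taken from any specific system cited in the text.
def surface_pattern(sentence):
    """Crude surface-syntax key: word suffixes stand in for a real parse."""
    return tuple(word[-2:] for word in sentence)

def analyze_text(sentences, ask_user):
    memory = {}                                  # confirmed pattern -> interpretation
    results = []
    for sentence in sentences:
        key = surface_pattern(sentence)
        if key in memory:                        # learned earlier: no user needed
            choice = memory[key]
        else:                                    # new pattern: suggest and ask
            candidates = [f"interpretation-{i}" for i in range(1, 4)]
            choice = ask_user(sentence, candidates)
            memory[key] = choice                 # learn from this analysis
        results.append((sentence, choice))
    return results

# Usage: a cooperative "user" that always confirms the first suggestion.
print(analyze_text([["the", "dog", "barks"], ["the", "cat", "barks"]],
                   lambda sent, candidates: candidates[0]))
```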
Within the last decade, the availability of robust software agents for language analysis has provided an opportunity to use lexical, syntactic, and semantic information to improve the performance of Machine Translation and of document management applications such as Information Retrieval/Extraction and Summarization.
Significant work has been done in linguistics and in information retrieval and presentation in general. There are also many different types of information classification systems, such as Rapid Knowledge Formation. However, they require significant human intervention in coding them, and some systems are so specialized that they do not permit the development of a generic agent. Our system will start from initial rules for acquiring knowledge, and we can determine the algorithm for these rules. It will then update its knowledge by itself, based on human, graphic, or textual input, and will modify its own knowledge-database structuring rules. There are many theories covering knowledge-based systems, but they belong to different schools of thought and have not been reviewed in their entirety. We are aware of these theories and have our own approach to those ideas.
Existing natural language processing systems perform some limited information retrieval (BADGER, CRYSTAL, DEDAL, etc.) and extraction (TACITUS, Proteus, FASTUS, etc.) for a human analyst, as well as summarization and some limited tutoring. They analyze data using probabilistic approaches, linguistic structures, and knowledge databases coded in advance. Some systems can communicate with human beings (START, CHAT, TIPS, LAMA, etc.), but the scope of their conversations is limited. A number of software agents are available for NLP, from parsers to dictionaries, speech and text corpora, lexical databases, grammars, and other tools for analyzing corpora (Alembic, ATLAS, SRILM, etc.).
However, all the publicly and commercially available tools perform only certain tasks and have very limited accuracy and range of application. They all fail to understand human knowledge and to react to it. They cannot serve as a human companion that understands and communicates with a human being and performs tasks approaching human intelligence. We want to emphasize that we are not even talking about voice communication here.
Over the years it has become possible to develop voice recognition systems, but those systems do not make intelligent judgments; in other words, they do not evaluate the meaning of what has been said.
Multiple information representation and optimization techniques have been developed, such as reinforcement learning, optical-flow-based navigation, and planning techniques for large Markov decision processes. Even though some of those techniques are used primarily for robot navigation, they are often very complex, require significant computational effort, and do not represent real-life human information handling.
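For reference, the sketch below shows value iteration on a toy Markov decision process, the kind of planning technique mentioned above. The two-state transition table, the rewards, and the discount factor are illustrative assumptions chosen only to make the example runnable.

```python
# Minimal value-iteration sketch for a toy Markov decision process.
# States, actions, transitions, rewards, and discount are illustrative assumptions.
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-6):
    """transition[s][a] is a list of (next_state, probability) pairs;
    reward[s][a] is the immediate reward for taking action a in state s."""
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                reward[s][a] + gamma * sum(p * values[s2] for s2, p in transition[s][a])
                for a in actions
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values

# Toy example: two states, two actions.
states = ["A", "B"]
actions = ["stay", "move"]
transition = {
    "A": {"stay": [("A", 1.0)], "move": [("B", 1.0)]},
    "B": {"stay": [("B", 1.0)], "move": [("A", 1.0)]},
}
reward = {"A": {"stay": 0.0, "move": 1.0}, "B": {"stay": 2.0, "move": 0.0}}
print(value_iteration(states, actions, transition, reward))
```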
Significant work has been done on learning, especially supervised learning, which has been presented as a regression problem of sparse-data interpolation. Various aspects of statistical learning theory are being studied and developed, and statistical learning theory has been developed for pattern recognition.
In learning theory, regularization networks are important, while support vector machine theory has produced limited results. Jorma Rissanen introduced the Minimum Description Length (MDL) principle in 1978. The MDL principle compresses data by exploiting any regularity found in a given data set. We believe that raw data representation and querying are the key to a computer’s reaction to particular information.
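As a small worked illustration of the MDL idea, the sketch below compares the two-part description length of a binary sequence under a simple Bernoulli model with the cost of sending the sequence raw; the example data and the crude eight-bit cost for coding the model parameter are our own assumptions, not part of Rissanen’s formulation.

```python
from math import log2

# Two-part MDL sketch: total description length = L(model) + L(data | model).
# The example sequence and the 8-bit cost for coding the model parameter
# are illustrative assumptions.
def code_length_bernoulli(bits, precision_bits=8):
    """Two-part code: send p with 'precision_bits' bits, then the data under p."""
    n, ones = len(bits), sum(bits)
    p = min(max(ones / n, 1e-9), 1 - 1e-9)          # avoid log(0)
    data_bits = -sum(log2(p) if b else log2(1 - p) for b in bits)
    return precision_bits + data_bits

def code_length_raw(bits):
    """Baseline: one bit per symbol, no model at all."""
    return float(len(bits))

biased = [0] * 60 + [1] * 4                         # regularity: mostly zeros
print(round(code_length_bernoulli(biased), 1))      # about 30 bits: regularity exploited
print(code_length_raw(biased))                      # 64.0 bits: raw transmission
```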
When people read a text, they rely on a priori knowledge of the language, common-sense knowledge, and knowledge of the domain. Many natural language processing systems implement this human model of language understanding and are therefore heavily knowledge-dependent. Such systems assume the availability of large amounts of background knowledge coded in advance in a specialized formalism. The problem with that assumption is that building a knowledge base with sufficient and relevant content is labor-intensive and very costly. Often the resulting knowledge is either too specific to be used beyond one very narrow domain or too general to allow subtle analyses of texts.
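To make the notion of background knowledge coded in advance concrete, here is a minimal frame-style sketch; the frames, slot names, and the inheritance helper are hypothetical illustrations, not the formalism of any system discussed in this section.

```python
# Minimal sketch of hand-coded background knowledge in a frame-like formalism.
# The frames, slot names, and query helper below are hypothetical illustrations.
KNOWLEDGE_BASE = {
    "buy": {
        "is_a": "transfer-event",
        "roles": {"agent": "person", "object": "artifact", "source": "seller"},
        "implies": ["agent owns object after event"],
    },
    "book": {
        "is_a": "artifact",
        "properties": {"has_part": ["page", "cover"], "medium": "text"},
    },
}

def inherits(concept, ancestor, kb=KNOWLEDGE_BASE):
    """Follow 'is_a' links to check whether concept inherits from ancestor."""
    while concept in kb:
        if concept == ancestor:
            return True
        concept = kb[concept].get("is_a")
    return concept == ancestor

print(inherits("book", "artifact"))   # True: 'book' is_a 'artifact'
```

Even this toy fragment shows why hand-coding is costly: every concept, slot, and inference rule must be written and maintained by a person.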
In order to avoid the problems of manually encoding background knowledge, many researchers have abandoned symbolic language analysis in favor of statistical methods. The availability of large online corpora and improvements in computing resources have made it possible to make predictions about meaning based on observations of frequencies, contexts, correlations, and other phenomena in a corpus. Systems that use statistical methods have had some successes, notably in part-of-speech tagging, word-class clustering, and word-sense disambiguation. But these systems often require large amounts of analyzed language data to arrive at even shallow interpretations.
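To illustrate that dependence on previously analyzed data, the sketch below trains a most-frequent-tag part-of-speech tagger on a tiny hand-annotated corpus; the corpus, the tag set, and the NOUN fallback are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Minimal most-frequent-tag POS tagger (illustrative sketch).
# The tiny annotated corpus and the NOUN fallback are assumptions for illustration;
# real statistical taggers need far larger analyzed corpora to work well.
TAGGED_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

def train(tagged_sentences):
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    # For each word, keep its most frequent tag in the analyzed data.
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag(sentence, model, fallback="NOUN"):
    return [(word, model.get(word, fallback)) for word in sentence]

model = train(TAGGED_CORPUS)
print(tag(["the", "dog", "sleeps"], model))   # [('the', 'DET'), ('dog', 'NOUN'), ('sleeps', 'VERB')]
```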