Topic: Chatterbots using Fuzzy String Matching

Leibnitz
posted 12/26/2008 08:06
Why do chatterbot developers (at least most of them) always use the same algorithm: keyword matching? Most of the chatterbots we know resemble the original ELIZA program developed by Joseph Weizenbaum far too closely. Keyword matching has its limits: you would need a huge database covering most of the things people might say in a real conversation. Even if we had that, it wouldn't guarantee that the program would behave intelligently, for at least two reasons: chatterbots usually do not learn new keywords automatically during a conversation, and for the most part they can't really follow the context of a conversation. Why don't we have more chatbots that rely on an algorithm such as Levenshtein distance for matching sentences instead?

Gonzales Cenelia.

 A.I and chatterbot website
Last edited by Leibnitz @ 1/14/2009 6:25:00 AM

Od1n
posted 12/31/2008 07:42
I agree to an extent. I believe that the whole concept of "matching" questions to responses is in itself flawed.

Is the goal here to create a new life form? A thinking, intelligent being? Or something that creates seemingly intelligent responses?

Honestly I would rather see an unintelligent response from an intelligent being, than an intelligent response from a fancy talking encyclopedia.

So here's a wild idea, what about getting rid of the entire "turn based" structure. That's not how people talk. The computer should be aware of "dead space" (not talking) as well as questions. Let it respond when it feels like, if it even wants to respond.... or respond multiple times.

Anyway that's what I think.


Leibnitz
posted 1/1/2009 07:58
Everything you've said sounds quite interesting. One thing we should keep in mind, however, is that the way our brain processes a conversation is definitely too complex to be imitated precisely. So, to have something feasible, we need to simplify as much as possible while keeping the essential parts. If we consider our brain as a black box, when it receives a sentence as input, it processes it and returns an answer. It's very clear that the decisions we make in everyday life are not made purely instinctively or mechanically. But it is also true that those same decisions are based on the knowledge we have acquired and on our experiences of the world. As a great oversimplification, we could say that the knowledge and experiences stored in our brain constitute a database that we use to make our decisions. What kind of rules does the brain use to match inputs to existing records inside this database? Certainly not the kind of matching rules that we find in most chatterbots. The ones I have seen so far are of two types. The first is exact sentence matching (the input must be the same as the keyword). Ex: if we have:

input: what is your name

and the database:

WHAT
.....
WHAT IS
......
WHAT IS YOUR NAME
......
SO YOUR NAME IS
......

the only keyword that would match is: WHAT IS YOUR NAME

The other type of sentence matching that I have observed is when the keyword is contained inside the input sentence.

Ex:
input: what is your name again

and the database:

WHAT
.....
WHAT IS
......
WHAT IS YOUR NAME
......
SO YOUR NAME IS
......


the keyword: WHAT IS YOUR NAME would be a good match to the sentence: what is your name again.

These matching rules are, however, too rigid (too restrictive). Suppose we have an input such as:

input: can you please tell me your name

given the previous database, the program wouldn't be able to find a match.
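
The two matching rules described above can be sketched roughly as follows. This is only an illustration, not how any particular chatterbot is implemented: the function names `exact_match` and `containment_match` are made up for this sketch, and returning the longest contained keyword first is an assumption about how a bot might pick among several hits.

```python
# The example database from the post.
KEYWORDS = ["WHAT", "WHAT IS", "WHAT IS YOUR NAME", "SO YOUR NAME IS"]


def normalize(text):
    """Upper-case the text and collapse runs of whitespace."""
    return " ".join(text.upper().split())


def exact_match(sentence, keywords):
    """Rule 1: the input must be identical to the keyword."""
    s = normalize(sentence)
    return [k for k in keywords if k == s]


def containment_match(sentence, keywords):
    """Rule 2: the keyword must appear inside the input as whole words.

    Hits are returned longest first (an assumption of this sketch).
    """
    s = f" {normalize(sentence)} "
    hits = [k for k in keywords if f" {k} " in s]
    return sorted(hits, key=len, reverse=True)


print(exact_match("what is your name", KEYWORDS))
print(containment_match("what is your name again", KEYWORDS))
# Both rules come back empty for the rephrased question:
print(exact_match("can you please tell me your name", KEYWORDS))
print(containment_match("can you please tell me your name", KEYWORDS))
```

Neither rule finds anything for the rephrased question, which is the rigidity being pointed out.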

But by using fuzzy string search techniques (Levenshtein distance), we would quickly find out that the keywords WHAT IS YOUR NAME and SO YOUR NAME IS are both a good match for this input (can you please tell me your name).
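
A minimal sketch of that idea, under one loud assumption: the post does not say whether the distance is taken over characters or over words, so this version compares word sequences. The helper name `rank_keywords` is hypothetical.

```python
def levenshtein(a, b):
    """Levenshtein (edit) distance between two sequences.

    Works on strings (character level) or lists of words (word level).
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x -> y
        prev = curr
    return prev[-1]


def rank_keywords(sentence, keywords):
    """Return (distance, keyword) pairs, closest keyword first.

    Assumption of this sketch: distances are computed over words.
    """
    words = sentence.upper().split()
    return sorted((levenshtein(words, k.split()), k) for k in keywords)


KEYWORDS = ["WHAT", "WHAT IS", "WHAT IS YOUR NAME", "SO YOUR NAME IS"]
for distance, keyword in rank_keywords("can you please tell me your name", KEYWORDS):
    print(distance, keyword)
# 5 WHAT IS YOUR NAME
# 6 SO YOUR NAME IS
# 7 WHAT
# 7 WHAT IS
```

With word-level distances, the two keywords named above come out closest. A real bot would presumably also need a cutoff so that very distant keywords are rejected rather than returned as the "best" match.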

Gonzales Cenelia


Last edited by Leibnitz @ 1/14/2009 6:11:00 AM

Od1n
posted 1/2/2009 11:38
There is no doubt that this would be an improvement on what is being used already. It would let the bot relate input to its database more effectively.

So do I think this would improve the quality of chatter bots? Yes.

But do I think it gets us any closer to achieving true AI? Not really.

I agree that the human brain is very complex, but that complexity is the result of millions of years of trial and error on the part of mother nature.

So perhaps instead of trying to jump right to the end result of advanced language skills, it would be more rewarding to instead aim for the intelligence of a dog or other similar creature.

I talk more about this in my other post (below this one in order) but I believe you cannot achieve true AI without motivation.

Neural networks excel at taking an input and telling you whether that input is similar to other inputs they have seen in the past.

Now imagine that every time a neural network is faced with a decision, the result of that decision either improves its current virtual physical state (i.e. more food, less cold, less pain), makes it worse, or has no effect.

If these networks were trained to always improve their physical state, then when presented with choices like move away from or closer to fire, or move closer to or away from the smell of food, they would become more skilled at surviving.

Now imagine we take this simple AI who's always trying to survive, and gradually increase the complexity of the challenges it faces. Add in social situations with other wolves, add in injury, sickness, so on. Then eventually the need to communicate in advanced language in order to achieve what it wants.

At the very least, even if you skip the more primitive stages, AI needs motivation. What's it getting out of the conversation?

You say: Hi how are you?

It can say: Screw you! | or | Good and you?

If its motivation is to have longer conversations for example, then what choice do you think will make the conversation last longer?
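
As a toy illustration of a bot "motivated" by conversation length, here is a sketch that learns which of the two replies to prefer. It is not a neural network: it swaps in a simple epsilon-greedy value estimate, and the user model `simulated_turns` (rude reply ends the chat after 1 turn, polite reply gives 5) is entirely made up for the example.

```python
import random

REPLIES = ["Screw you!", "Good and you?"]


def simulated_turns(reply):
    """Hypothetical user model: how many turns the chat lasts after a reply."""
    return 1 if reply == "Screw you!" else 5


def learn_preferences(episodes=200, epsilon=0.1, seed=0):
    """Estimate the average conversation length each reply leads to."""
    rng = random.Random(seed)
    totals = {r: 0.0 for r in REPLIES}
    counts = {r: 0 for r in REPLIES}
    for _ in range(episodes):
        untried = [r for r in REPLIES if counts[r] == 0]
        if untried:
            reply = untried[0]            # try every reply at least once
        elif rng.random() < epsilon:
            reply = rng.choice(REPLIES)   # occasionally explore
        else:                             # otherwise exploit the best estimate
            reply = max(REPLIES, key=lambda r: totals[r] / counts[r])
        totals[reply] += simulated_turns(reply)  # reward = turns survived
        counts[reply] += 1
    return {r: totals[r] / counts[r] for r in REPLIES}


values = learn_preferences()
print(max(values, key=values.get))  # Good and you?
```

The point is only that the preference comes from the reward, not from a hand-written rule saying "be polite".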

This is where I think AI needs to go. String matching will make a nice chatter bot, but why kid ourselves? It really has no idea what we are talking about.




Leibnitz
posted 1/6/2009 17:35
I have heard this formula many times before. More often I hear things like: instead of trying to create the intelligence of an adult human, why not try to imitate that of a young child? Or even: instead of trying to imitate the intelligence of a dog, why not try to imitate the intelligence of a simple microbe (it would be easier)? The only problem is that we tend to forget that today's computers are, at the core, only 0's and 1's and have nothing to do with the much more complex biological systems that exist in nature. As things stand, there is no simple way to simulate any of these creatures precisely (even the simplest ones). Just think for a moment of when we were trying to create the first flying machines. None of the prototypes that resembled a human being with artificial wings ever succeeded in flying. Only the Wright brothers, with their airplane, succeeded. Can we really say that a modern airplane resembles a bird in its way of flying? Well, probably not. Even today, the exact mechanism of how birds fly, and how to mimic it, is still mostly unsolved. That only means that if we really want to try to create a true A.I. program, we should keep only the essence (start with the basics). Everything that concerns living creatures is billions of times more complex than what we can do today with computers. We can manipulate strings, save them to files, print them on a computer screen, etc. A sentence or phrase can easily be represented using a string of characters. And that is the main reason why it is definitely more feasible to deal with conversational exchanges than with emotions, consciousness, self-awareness, self-motivation, etc.

Gonzales Cenelia

Last edited by Leibnitz @ 1/14/2009 6:11:00 AM

squarebear
posted 1/12/2009 11:01
 
Leibnitz wrote @ 1/1/2009 7:58:00 AM:
These matching rules are, however, too rigid (too restrictive). Suppose we have an input such as:

input: can you please tell me your name

given the previous database, the program wouldn't be able to find a match.

But by using fuzzy string search techniques (Levenshtein distance), we would quickly find out that the keywords WHAT IS YOUR NAME and SO YOUR NAME IS are both a good match for this input (can you please tell me your name).

 
This is already in use. You've just described the <srai> tag of AIML.
For example:
Hi, Howdy, hallo, hi there would all match the one category "Hello" rather than having to code many categories.


Leibnitz
posted 1/14/2009 06:16
Well, it's not really the same thing:

<category>
<pattern>HI</pattern>
<template>
<srai>HELLO</srai>
</template>
</category>
<category>
<pattern>HOWDY</pattern>
<template>
<srai>HELLO</srai>
</template>
</category>
<category>
<pattern>HALLO</pattern>
<template>
<srai>HELLO</srai>
</template>
</category>
<category>
<pattern>HI THERE</pattern>
<template>
<srai>HELLO</srai>
</template>
</category>
<category>
<pattern>HELLO</pattern>
<template>Hi there!</template>
</category>

All of this refers to synonyms, but I was referring to fuzzy string search, or approximate string search (http://en.wikipedia.org/wiki/Fuzzy_string_searching). In fact, those two are quite different.


Gonzales Cenelia


Last edited by Leibnitz @ 1/14/2009 6:20:00 AM

squarebear
posted 1/16/2009 18:15
<category>
<pattern>WHAT IS YOUR NAME</pattern>
<template>My name is Mitsuku</template>
</category>

<category>
<pattern>SO YOUR NAME IS</pattern>
<template><srai>What is your name</srai></template>
</category>

<category>
<pattern> * WHAT YOUR NAME *</pattern>
<template><srai>What is your name</srai></template>
</category>

<category>
<pattern> WHAT * YOUR NAME</pattern>
<template><srai>What is your name</srai></template>
</category>

<category>
<pattern> WHAT * YOUR NAME *</pattern>
<template><srai>What is your name</srai></template>
</category>

This would cover "What is your name?", "What could your name be?" and even "What is the answer to the question of 'Excuse me robot, whatever could your name possibly be?'".

Looks like fuzzy matching to me. What is different?

 http://www.mitsuku.com
Last edited by squarebear @ 1/16/2009 6:26:00 PM

turing_machine
posted 1/16/2009 18:57
Response to Leibnitz's 12/26 comments:

I think it's important to remember that the average American has a vocabulary of only 14,000 words.

Further, have you ever tried to have a conversation with a fat, dump-truck driver type while he drinks beer and watches TV? He tends to rely disproportionately on the expression, "Huh?"

You can fool some of the people all the time, and all of the people some of the time, but you cannot fool all of the people all the time. - Lincoln


Leibnitz
posted 3/18/2009 15:39
The current chatterbot strategies don't seem to work that well, because humans are very good at detecting patterns and chatterbots usually repeat themselves very often. Whatever trickery you use, it won't take long in a conversation before someone notices that there is nothing intelligent about the chatbot. That's why we need to come up with new chatterbots that belong more to the true A.I. field.

Last edited by Leibnitz @ 3/18/2009 3:53:00 PM

Leibnitz
posted 3/18/2009 15:41
Reply to squarebear: this is wildcard string matching, not fuzzy string matching. It's not the same.
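
To make the difference concrete, here is a rough sketch of wildcard matching. It is not an AIML interpreter: it translates AIML-style patterns to regular expressions, assuming `*` stands for one or more words, and the helper names are invented for this example.

```python
import re

# The five patterns from squarebear's post.
PATTERNS = [
    "WHAT IS YOUR NAME",
    "SO YOUR NAME IS",
    "* WHAT YOUR NAME *",
    "WHAT * YOUR NAME",
    "WHAT * YOUR NAME *",
]


def normalize(text):
    """Strip punctuation, upper-case, and collapse whitespace."""
    return " ".join(re.sub(r"[^A-Za-z0-9 ]", " ", text).upper().split())


def wildcard_to_regex(pattern):
    """Translate an AIML-style pattern; '*' is assumed to mean 1+ words."""
    parts = [r"\S+(?: \S+)*" if tok == "*" else re.escape(tok)
             for tok in pattern.upper().split()]
    return re.compile("^" + " ".join(parts) + "$")


def wildcard_match(pattern, sentence):
    return wildcard_to_regex(pattern).match(normalize(sentence)) is not None


print(wildcard_match("WHAT * YOUR NAME *", "What could your name be?"))  # True
print(any(wildcard_match(p, "can you please tell me your name")
          for p in PATTERNS))                                            # False
```

A wildcard pattern only matches when its fixed words appear literally and in order, so a rephrasing that drops WHAT falls through every pattern. A distance-based match would still rank the nearest keyword.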

 a.i website

Damir Olejar
posted 6/8/2009 03:30
>you would need a very huge dabase that could include most of the things that people might use in a real conversations.

But we do have such a database, and we use Internet search engines to access it. I have just developed a program to aid my research. It uses the old MegaHAL algorithm (what you called matching) to output what is sometimes nonsense. The nonsense is then interpreted with a search engine to try to find the closest human-generated sentence.

So what happened is that I simply entered "Artificial Intelligence" into a program that had no database. It fetched the search results from Google, made a new brain, produced a response to my input, and interpreted that response with Google again! It gave me a link to this page saying "Join our forums etc." and I did, finding your post.

To the question "What is the meaning of Ubuntu?", without any database to begin with, it gives the answer: "Here's what they say: "Ubuntu" is an ancient African word, meaning "humanity to others""
Furthermore, it also says:
-"Ubuntu also means "I am what I am because of who we all are"".
or
-"That said, I'm not sure what I think about their marketing schtick around the use of the word Ubuntu".


The only problem I have is that I am the one who needs to choose among those outputs, since the other choice out of the given four was:
-Ancient? I suppose it depends on what ancient means. (This is a problem because I said nothing about anything ancient, but asked a straight question. It means that it learned beyond the conversation, my knowledge and its database, but has no theory of mind.)
Therefore, these are the problems that such simple programs still have:
-A reliable text-to-sentence parser (this is crucial for developing a new brain).
-Formatting the actual sentence (sometimes references, names, slang, numbers, etc. are simply not needed and they interfere with the learning).
-A way to choose between the possible responses (I am doing this on my own since I do not know how to make a program to 'train' it).
-A way to toggle the learning on and off by itself.
-A way to learn the grammar and output sentences with proper grammar.
-A way to choose which word should have the greater precedence / importance in the sentence.
-Sometimes the chatterbot works better if it trains a small temporary brain and gives a response than it does with the whole brain. A method for doing this should also be worked out.

A database is NOT one of such problems.


Last edited by Damir Olejar @ 6/8/2009 3:45:00 AM

Damir Olejar
posted 6/8/2009 03:54
>can you please tell me your name

I have just tried such an input, and with an empty database to begin with, I got "Naveen May, 22, 2008Naveen How can we handle the situation in the server?"

So I guess his name is Naveen, who wants to know about the situation in the server... or maybe it is asking someone named Naveen?? The bad formatting is due to the lack of a reliable sentence parser :)))

Last edited by Damir Olejar @ 6/8/2009 4:00:00 AM

Smart_Orifice
posted 7/12/2009 02:14
 
squarebear wrote @ 1/16/2009 6:15:00 PM:
<category>
<pattern>WHAT IS YOUR NAME</pattern>
<template>My name is Mitsuku</template>
</category>

<category>
<pattern>WHAT * YOUR NAME</pattern>
<template><srai>What is your name</srai></template>
</category>

 
Besides them being different (although related) I always hated this construct in AIML:

What is your name?
My name is Mitsuku.
That's nice. What does your name mean in your language?

My name is Mitsuku.

Rrrrriiight.


Nerketur
posted 8/11/2009 02:15
 
Od1n wrote @ 1/2/2009 11:38:00 AM:

At the very least, even if you skip the more primitive stages, AI needs motivation. What's it getting out of the conversation?


 
In a way, you're right. That's how humans learn, after all. If we want to learn it, we do. If we HAVE to learn it to survive, and we want to survive, we do. But that's only part of the story, I think. It's more about preference. What does it like? If it WANTS to survive, then it will do everything in its power to do so. If it WANTS to "feel good" then it will continue to do things which make it feel good.

To a certain extent, that's how the HALs work, I think. As long as you don't "correct it", it keeps doing things the way it wants to, the only way it knows how. In my opinion, in order to create AI, it needs bias. And that's it, really. Preferring one object over another.

If you want to read more, then see my post about this.

 Consciousness. Does it exist at all?